[0a9449]: / docs / build / doctrees / notes / casestudy.doctree

ÇĽL;îdocutils.nodesöîdocumentöôö)üö}ö(î	rawsourceöîöîchildrenö]öhîsectionöôö)üö}ö(hhh]ö(hîtitleöôö)üö}ö(hî
Case Studyöh]öhîTextöôöî
Case Studyöůöüö}ö(hhîparentöhhhîsourceöNîlineöNubaî
attributesö}ö(îidsö]öîclassesö]öînamesö]öîdupnamesö]öîbackrefsö]öuîtagnameöhhhhhhîQ/Users/futianfan/Downloads/spring2020/DeepPurpose/docs/source/notes/casestudy.rstöhKubhîbullet_listöôö)üö}ö(hhh]öhî	list_itemöôö)üö}ö(hîJ**1a. Antiviral Drugs Repurposing for SARS-CoV2 3CLPro, using One Line.**
öh]öhî	paragraphöôö)üö}ö(hîI**1a. Antiviral Drugs Repurposing for SARS-CoV2 3CLPro, using One Line.**öh]öhîstrongöôö)üö}ö(hh:h]öhîE1a. Antiviral Drugs Repurposing for SARS-CoV2 3CLPro, using One Line.öůöüö}ö(hhhh>ubah}ö(h]öh!]öh#]öh%]öh']öuh)h<hh8ubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKhh2ubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hh-hhhh*hNubah}ö(h]öh!]öh#]öh%]öh']öîbulletöî*öuh)h+hh*hKhhhhubh7)üö}ö(hî˙Given a new target sequence (e.g. SARS-CoV2 3CL Protease),
retrieve a list of repurposing drugs from a curated drug library of 81 antiviral drugs.
The Binding Score is the Kd values.
Results aggregated from five pretrained model on BindingDB dataset!öh]öhî˙Given a new target sequence (e.g. SARS-CoV2 3CL Protease),
retrieve a list of repurposing drugs from a curated drug library of 81 antiviral drugs.
The Binding Score is the Kd values.
Results aggregated from five pretrained model on BindingDB dataset!öůöüö}ö(hhahh_hhhNhNubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKhhhhubhî
literal_blocköôö)üö}ö(hîlfrom DeepPurpose import oneliner
oneliner.repurpose(*load_SARS_CoV2_Protease_3CL(), *load_antiviral_drugs())öh]öhîlfrom DeepPurpose import oneliner
oneliner.repurpose(*load_SARS_CoV2_Protease_3CL(), *load_antiviral_drugs())öůöüö}ö(hhhhoubah}ö(h]öh!]öh#]öh%]öh']öî	xml:spaceöîpreserveöîforceöëîlanguageöîpythonöîhighlight_argsö}öuh)hmhh*hKhhhhubh,)üö}ö(hhh]öh1)üö}ö(hîQ**1b. New Target Repurposing using Broad Drug Repurposing Hub, with One Line.**

öh]öh7)üö}ö(hîO**1b. New Target Repurposing using Broad Drug Repurposing Hub, with One Line.**öh]öh=)üö}ö(hhŹh]öhîK1b. New Target Repurposing using Broad Drug Repurposing Hub, with One Line.öůöüö}ö(hhhhĆubah}ö(h]öh!]öh#]öh%]öh']öuh)h<hhőubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKhhçubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hhähhhh*hNubah}ö(h]öh!]öh#]öh%]öh']öh]h^uh)h+hh*hKhhhhubh7)üö}ö(hXGiven a new target sequence (e.g. MMP9),
retrieve a list of repurposing drugs from Broad Drug Repurposing Hub,
which is the default.
Results also aggregated from five pretrained model!
Note the drug name here is the Pubchem CID since some drug names in Broad is too long.öh]öhXGiven a new target sequence (e.g. MMP9),
retrieve a list of repurposing drugs from Broad Drug Repurposing Hub,
which is the default.
Results also aggregated from five pretrained model!
Note the drug name here is the Pubchem CID since some drug names in Broad is too long.öůöüö}ö(hh░hh«hhhNhNubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKhhhhubhn)üö}ö(hîAfrom DeepPurpose import oneliner
oneliner.repurpose(*load_MMP9())öh]öhîAfrom DeepPurpose import oneliner
oneliner.repurpose(*load_MMP9())öůöüö}ö(hhhh╝ubah}ö(h]öh!]öh#]öh%]öh']öh}h~hëhÇîpythonöhé}öuh)hmhh*hKhhhhubh,)üö}ö(hhh]öh1)üö}ö(hîC**2. Repurposing using Customized training data, with One Line.**

öh]öh7)üö}ö(hîA**2. Repurposing using Customized training data, with One Line.**öh]öh=)üö}ö(hhŇh]öhî=2. Repurposing using Customized training data, with One Line.öůöüö}ö(hhhhÎubah}ö(h]öh!]öh#]öh%]öh']öuh)h<hhËubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hK%hh¤ubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hh╠hhhh*hNubah}ö(h]öh!]öh#]öh%]öh']öh]h^uh)h+hh*hK%hhhhubh7)üö}ö(hXGiven a new target sequence (e.g. SARS-CoV 3CL Pro),
training on new data (AID1706 Bioassay),
and then retrieve a list of repurposing drugs from a proprietary library (e.g. antiviral drugs).
The model can be trained from scratch or finetuned from the pretraining checkpoint!öh]öhXGiven a new target sequence (e.g. SARS-CoV 3CL Pro),
training on new data (AID1706 Bioassay),
and then retrieve a list of repurposing drugs from a proprietary library (e.g. antiviral drugs).
The model can be trained from scratch or finetuned from the pretraining checkpoint!öůöüö}ö(hh°hh÷hhhNhNubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hK(hhhhubhn)üö}ö(hX from DeepPurpose import oneliner
from DeepPurpose.dataset import *

oneliner.repurpose(*load_SARS_CoV_Protease_3CL(), *load_antiviral_drugs(no_cid = True),  *load_AID1706_SARS_CoV_3CL(), \
        split='HTS', convert_y = False, frac=[0.8,0.1,0.1], pretrained = False, agg = 'max_effect')öh]öhX from DeepPurpose import oneliner
from DeepPurpose.dataset import *

oneliner.repurpose(*load_SARS_CoV_Protease_3CL(), *load_antiviral_drugs(no_cid = True),  *load_AID1706_SARS_CoV_3CL(), \
        split='HTS', convert_y = False, frac=[0.8,0.1,0.1], pretrained = False, agg = 'max_effect')öůöüö}ö(hhhjubah}ö(h]öh!]öh#]öh%]öh']öh}h~hëhÇîpythonöhé}öuh)hmhh*hK/hhhhubh,)üö}ö(hhh]öh1)üö}ö(hî]3. **A Framework for Drug Target Interaction Prediction, with less than 10 lines of codes.**
öh]öhîenumerated_listöôö)üö}ö(hhh]öh1)üö}ö(hîZ**A Framework for Drug Target Interaction Prediction, with less than 10 lines of codes.**
öh]öh7)üö}ö(hîY**A Framework for Drug Target Interaction Prediction, with less than 10 lines of codes.**öh]öh=)üö}ö(hj&h]öhîUA Framework for Drug Target Interaction Prediction, with less than 10 lines of codes.öůöüö}ö(hhhj(ubah}ö(h]öh!]öh#]öh%]öh']öuh)h<hj$ubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hK@hj ubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hjubah}ö(h]öh!]öh#]öh%]öh']öîenumtypeöîarabicöîprefixöhîsuffixöî.öîstartöKuh)jhjubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hjhhhNhNubah}ö(h]öh!]öh#]öh%]öh']öh]h^uh)h+hh*hK@hhhhubh7)üö}ö(hîVUnder the hood of one model from scratch, a flexible framework for method researchers:öh]öhîVUnder the hood of one model from scratch, a flexible framework for method researchers:öůöüö}ö(hj[hjYhhhNhNubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKBhhhhubhn)üö}ö(hXÖ	from DeepPurpose import models
from DeepPurpose.utils import *
from DeepPurpose.dataset import *

# Load Data, an array of SMILES for drug,
# an array of Amino Acid Sequence for Target
# and an array of binding values/0-1 label.
# e.g. ['Cc1ccc(CNS(=O)(=O)c2ccc(s2)S(N)(=O)=O)cc1', ...],
#      ['MSHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTH...', ...],
#      [0.46, 0.49, ...]
# In this example, BindingDB with Kd binding score is used.
X_drug, X_target, y  = process_BindingDB(download_BindingDB(SAVE_PATH),
                                         y = 'Kd',
                                         binary = False,
                                         convert_to_log = True)

# Type in the encoding names for drug/protein.
drug_encoding, target_encoding = 'MPNN', 'Transformer'

# Data processing, here we select cold protein split setup.
train, val, test = data_process(X_drug, X_target, y,
                                drug_encoding, target_encoding,
                                split_method='cold_protein',
                                frac=[0.7,0.1,0.2])

# Generate new model using default parameters;
# also allow model tuning via input parameters.
config = generate_config(drug_encoding, target_encoding, \
                                                 transformer_n_layer_target = 8)
net = models.model_initialize(**config)

# Train the new model.
# Detailed output including a tidy table storing
#    validation loss, metrics, AUC curves figures and etc.
#    are stored in the ./result folder.
net.train(train, val, test)

# or simply load pretrained model from a model directory path
#   or reproduced model name such as DeepDTA
net = models.model_pretrained(MODEL_PATH_DIR or MODEL_NAME)

# Repurpose using the trained model or pre-trained model
# In this example, loading repurposing dataset using
# Broad Repurposing Hub and SARS-CoV 3CL Protease Target.
X_repurpose, drug_name, drug_cid = load_broad_repurposing_hub(SAVE_PATH)
target, target_name = load_SARS_CoV_Protease_3CL()

_ = models.repurpose(X_repurpose, target, net, drug_name, target_name)

# Virtual screening using the trained model or pre-trained model
X_repurpose, drug_name, target, target_name = \
                ['CCCCCCCOc1cccc(c1)C([O-])=O', ...], ['16007391', ...], \
                ['MLARRKPVLPALTINPTIAEGPSPTSEGASEANLVDLQKKLEEL...', ...],\
                ['P36896', 'P00374']

_ = models.virtual_screening(X_repurpose, target, net, drug_name, target_name)öh]öhXÖ	from DeepPurpose import models
from DeepPurpose.utils import *
from DeepPurpose.dataset import *

# Load Data, an array of SMILES for drug,
# an array of Amino Acid Sequence for Target
# and an array of binding values/0-1 label.
# e.g. ['Cc1ccc(CNS(=O)(=O)c2ccc(s2)S(N)(=O)=O)cc1', ...],
#      ['MSHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTH...', ...],
#      [0.46, 0.49, ...]
# In this example, BindingDB with Kd binding score is used.
X_drug, X_target, y  = process_BindingDB(download_BindingDB(SAVE_PATH),
                                         y = 'Kd',
                                         binary = False,
                                         convert_to_log = True)

# Type in the encoding names for drug/protein.
drug_encoding, target_encoding = 'MPNN', 'Transformer'

# Data processing, here we select cold protein split setup.
train, val, test = data_process(X_drug, X_target, y,
                                drug_encoding, target_encoding,
                                split_method='cold_protein',
                                frac=[0.7,0.1,0.2])

# Generate new model using default parameters;
# also allow model tuning via input parameters.
config = generate_config(drug_encoding, target_encoding, \
                                                 transformer_n_layer_target = 8)
net = models.model_initialize(**config)

# Train the new model.
# Detailed output including a tidy table storing
#    validation loss, metrics, AUC curves figures and etc.
#    are stored in the ./result folder.
net.train(train, val, test)

# or simply load pretrained model from a model directory path
#   or reproduced model name such as DeepDTA
net = models.model_pretrained(MODEL_PATH_DIR or MODEL_NAME)

# Repurpose using the trained model or pre-trained model
# In this example, loading repurposing dataset using
# Broad Repurposing Hub and SARS-CoV 3CL Protease Target.
X_repurpose, drug_name, drug_cid = load_broad_repurposing_hub(SAVE_PATH)
target, target_name = load_SARS_CoV_Protease_3CL()

_ = models.repurpose(X_repurpose, target, net, drug_name, target_name)

# Virtual screening using the trained model or pre-trained model
X_repurpose, drug_name, target, target_name = \
                ['CCCCCCCOc1cccc(c1)C([O-])=O', ...], ['16007391', ...], \
                ['MLARRKPVLPALTINPTIAEGPSPTSEGASEANLVDLQKKLEEL...', ...],\
                ['P36896', 'P00374']

_ = models.virtual_screening(X_repurpose, target, net, drug_name, target_name)öůöüö}ö(hhhjgubah}ö(h]öh!]öh#]öh%]öh']öh}h~hëhÇîpythonöhé}öuh)hmhh*hKDhhhhubh,)üö}ö(hhh]öh1)üö}ö(hîE4. **Virtual Screening with Customized Training Data with One Line**
öh]öj)üö}ö(hhh]öh1)üö}ö(hîB**Virtual Screening with Customized Training Data with One Line**
öh]öh7)üö}ö(hîA**Virtual Screening with Customized Training Data with One Line**öh]öh=)üö}ö(hjçh]öhî=Virtual Screening with Customized Training Data with One Lineöůöüö}ö(hhhjëubah}ö(h]öh!]öh#]öh%]öh']öuh)h<hjůubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKůhjüubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hj~ubah}ö(h]öh!]öh#]öh%]öh']öjGjHjIhjJjKjLKuh)jhjzubah}ö(h]öh!]öh#]öh%]öh']öuh)h0hjwhhhNhNubah}ö(h]öh!]öh#]öh%]öh']öh]h^uh)h+hh*hKůhhhhubh7)üö}ö(hî}Given a list of new drug-target pairs to be screened,
retrieve a list of drug-target pairs with top predicted binding scores.öh]öhî}Given a list of new drug-target pairs to be screened,
retrieve a list of drug-target pairs with top predicted binding scores.öůöüö}ö(hjÂhj┤hhhNhNubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hh*hKçhhhhubhn)üö}ö(hîgfrom DeepPurpose import oneliner
oneliner.virtual_screening(['MKK...LIDL', ...], ['CC1=C...C4)N', ...])öh]öhîgfrom DeepPurpose import oneliner
oneliner.virtual_screening(['MKK...LIDL', ...], ['CC1=C...C4)N', ...])öůöüö}ö(hhhj┬ubah}ö(h]öh!]öh#]öh%]öh']öh}h~hëhÇîpythonöhé}öuh)hmhh*hKőhhhhubeh}ö(h]öî
case-studyöah!]öh#]öî
case studyöah%]öh']öuh)h	hhhhhh*hKubah}ö(h]öh!]öh#]öh%]öh']öîsourceöh*uh)hîcurrent_sourceöNîcurrent_lineöNîsettingsöîdocutils.frontendöîValuesöôö)üö}ö(hNî	generatoröNî	datestampöNîsource_linköNî
source_urlöNî
toc_backlinksöîentryöîfootnote_backlinksöKî
sectnum_xformöKîstrip_commentsöNîstrip_elements_with_classesöNî
strip_classesöNîreport_levelöKî
halt_levelöKîexit_status_levelöKîdebugöNîwarning_streamöNî	tracebacköłîinput_encodingöî	utf-8-sigöîinput_encoding_error_handleröîstrictöîoutput_encodingöîutf-8öîoutput_encoding_error_handleröjřîerror_encodingöîUTF-8öîerror_encoding_error_handleröîbackslashreplaceöî
language_codeöîenöîrecord_dependenciesöNîconfigöNî	id_prefixöhîauto_id_prefixöîidöî
dump_settingsöNîdump_internalsöNîdump_transformsöNîdump_pseudo_xmlöNîexpose_internalsöNîstrict_visitoröNî_disable_configöNî_sourceöh*î_destinationöNî
_config_filesö]öîpep_referencesöNîpep_base_urlöî https://www.python.org/dev/peps/öîpep_file_url_templateöîpep-%04döîrfc_referencesöNîrfc_base_urlöîhttps://tools.ietf.org/html/öî	tab_widthöKîtrim_footnote_reference_spaceöëîfile_insertion_enabledöłîraw_enabledöKîsyntax_highlightöîlongöîsmart_quotesöłîsmartquotes_localesö]öîcharacter_level_inline_markupöëîdoctitle_xformöëî
docinfo_xformöKîsectsubtitle_xformöëîembed_stylesheetöëîcloak_email_addressesöłîenvöNubîreporteröNîindirect_targetsö]öîsubstitution_defsö}öîsubstitution_namesö}öîrefnamesö}öîrefidsö}öînameidsö}öjÎjďsî	nametypesö}öjÎNsh}öjďhsî
footnote_refsö}öî
citation_refsö}öî
autofootnotesö]öîautofootnote_refsö]öîsymbol_footnotesö]öîsymbol_footnote_refsö]öî	footnotesö]öî	citationsö]öîautofootnote_startöKîsymbol_footnote_startöKî
id_counteröîcollectionsöîCounteröôö}öůöRöîparse_messagesö]ö(hîsystem_messageöôö)üö}ö(hhh]öh7)üö}ö(hî:Enumerated list start value not ordinal-1: "3" (ordinal 3)öh]öhî>Enumerated list start value not ordinal-1: ÔÇť3ÔÇŁ (ordinal 3)öůöüö}ö(hhhj_ubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hj\ubah}ö(h]öh!]öh#]öh%]öh']öîlevelöKîtypeöîINFOöîsourceöh*îlineöKuh)jZhjubj[)üö}ö(hhh]öh7)üö}ö(hî:Enumerated list start value not ordinal-1: "4" (ordinal 4)öh]öhî>Enumerated list start value not ordinal-1: ÔÇť4ÔÇŁ (ordinal 4)öůöüö}ö(hhhj{ubah}ö(h]öh!]öh#]öh%]öh']öuh)h6hjxubah}ö(h]öh!]öh#]öh%]öh']öîlevelöKîtypeöjuîsourceöh*îlineöKuh)jZhjzubeîtransform_messagesö]öîtransformeröNî
decorationöNhhub.