TY - GEN
T1 - Similarity-based methods to predict drug targets, indications and side-effects
AU - Sharan, Roded
PY - 2011
Y1 - 2011
N2 - Elucidating drug targets, potential indications and side effects are fundamental challenges in drug development. Key to addressing these challenges are methods that can integrate similarity information on drugs, genes, diseases and side effects from multiple sources. We present an array of similarity-based methods to predict drug properties that are extensible to additional emerging similarity measures among drug- and disease-related entities. The first method, Similarity-based Inference of drug-TARgets (SITAR), incorporates multiple drug-drug and gene-gene similarity measures for drug target prediction. SITAR consists of a new scoring scheme for drug-gene associations based on a given pair of drug-drug and gene-gene similarity measures, combined with a logistic regression component that integrates the scores of multiple measures to yield the final association score. We apply SITAR to predict targets for hundreds of drugs using both commonly used and novel drug-drug and gene-gene similarity measures and compare our results to existing state of the art methods, markedly outperforming them. We then employ our framework to make novel target predictions for hundreds of drugs; we validate these predictions via curated databases that were not used in the learning stage. The second method, PREdiction of Drug IndiCaTions (PREDICT), is designed for the large-scale prediction of drug indications and can handle both approved drugs and novel molecules. PREDICT is based on the observation that similar drugs are indicated for similar diseases, and utilizes multiple drug-drug and disease-disease similarity measures for the prediction task. On cross validation, it obtains high specificity and sensitivity (AUC=0.9) in predicting drug indications, surpassing existing methods. We validate our predictions by their overlap with drug indications that are currently under clinical trials, and by their agreement with tissue expression information for the drug targets. 
We further show that disease-specific genetic signatures can be used to accurately predict drug indications for new diseases (AUC=0.92). This lays the computational foundation for future personalized drug treatments, where gene expression signatures from individual patients would replace the disease-specific signatures. Finally, we present a novel approach to predict the side effects of a given drug, taking into consideration information on other drugs and their side effects. Starting from a query drug, a combination of canonical correlation analysis and network-based diffusion is applied to predict its side effects. We evaluate our method by measuring its performance in a cross validation setting using a comprehensive data set of 692 drugs and their known side effects derived from package inserts. For 34% of the drugs, the top-scoring side effect matches a known side effect of the drug. Remarkably, even on unseen data, our method is able to infer side effects that highly match existing knowledge. In addition, we show that our method outperforms a prediction scheme that considers each side effect separately. We believe that these methods represent a promising step toward shortcutting the process and reducing the cost of drug development.
UR - http://www.scopus.com/inward/record.url?scp=79960941185&partnerID=8YFLogxK
U2 - 10.1145/1995412.1995415
DO - 10.1145/1995412.1995415
M3 - Conference contribution
SN - 9781450307956
T3 - Proceedings - 4th International Conference on SImilarity Search and APplications, SISAP 2011
SP - 5
EP - 6
BT - Proceedings - 4th International Conference on SImilarity Search and APplications, SISAP 2011
T2 - 4th International Conference on SImilarity Search and APplications, SISAP 2011
Y2 - 30 June 2011 through 1 July 2011
ER -