AUTHOR=Guin Debleena, Rani Jyoti, Singh Priyanka, Grover Sandeep, Bora Shivangi, Talwar Puneet, Karthikeyan Muthusamy, Satyamoorthy K, Adithan C, Ramachandran S, Saso Luciano, Hasija Yasha, Kukreti Ritushree
TITLE=Global Text Mining and Development of Pharmacogenomic Knowledge Resource for Precision Medicine
JOURNAL=Frontiers in Pharmacology
VOLUME=Volume 10 - 2019
YEAR=2019
URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2019.00839
DOI=10.3389/fphar.2019.00839
ISSN=1663-9812
ABSTRACT=To provide personalized healthcare, it is important to understand patients' genomic variations and the effect these variants have in protecting patients from, or predisposing them to, drug response phenotypes. Several studies have manually curated such genotype-phenotype relationships into organized databases from clinical trial data or published literature. However, there are no text mining tools available for extracting high-accuracy information from existing knowledge. In this work, we used a semi-automated text mining approach to retrieve a complete pharmacogenomic (PGx) resource that integrates disease-drug-gene-polymorphism relationships and provides a global perspective to facilitate therapeutic approaches. We implemented an R package, pubmed.mineR, to automatically retrieve pharmacogenomics-related literature. We identified 1753 disease types and 666 drugs associated with 4132 genes and 33942 polymorphisms, collated from 184560 publications; further manual curation yielded a total of 2304 pharmacogenomic relationships. We evaluated our approach by comparing its performance (precision = 0.806) with the benchmark datasets PharmGKB (0.904), OMIM (0.600), and CTD (0.729). A validation study compared the results of our approach with commercially used, FDA-approved drug labelling biomarkers. Of the 228 FDA-approved pharmacogenomic markers, 127 were also among the 2304 markers obtained with our proposed approach.
Our semi-automated text mining approach may reveal significant pharmacogenomic information, including markers for drug response prediction. In addition, the outcome of the proposed approach represents a scalable, state-of-the-art improvement in curation for pharmacogenomic utility.
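
The two evaluation figures in the abstract can be illustrated with a short sketch. The abstract does not say how precision was computed, so the code assumes the standard definition (correct predictions divided by all predictions). The `precision` helper and the toy drug-gene pairs are hypothetical and are not data from the study; only the counts 228 and 127 come from the abstract.

```python
def precision(predicted, reference):
    """Fraction of predicted items that also appear in the reference set.

    Assumes the standard definition TP / (TP + FP); the abstract does not
    state the exact procedure the authors used.
    """
    predicted, reference = set(predicted), set(reference)
    if not predicted:
        return 0.0
    return len(predicted & reference) / len(predicted)


# Counts reported in the abstract: FDA-approved markers, and how many of
# them were also found among the 2304 curated relationships.
fda_markers_total = 228
fda_markers_recovered = 127
fda_overlap = fda_markers_recovered / fda_markers_total

# Toy, made-up (drug, gene) pairs used only to exercise precision().
toy_predicted = {("drugA", "gene1"), ("drugA", "gene2"),
                 ("drugB", "gene3"), ("drugC", "gene4")}
toy_reference = {("drugA", "gene1"), ("drugB", "gene3"),
                 ("drugC", "gene4"), ("drugD", "gene5")}
toy_precision = precision(toy_predicted, toy_reference)

print(f"FDA marker overlap: {fda_overlap:.3f}")
print(f"Toy precision: {toy_precision:.2f}")
```

On the reported counts, the overlap works out to roughly 56% of the FDA-approved markers. This is a coverage figure relative to the FDA list, not a precision value, so it is not directly comparable to the 0.806 reported above.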