AUTHOR=Hong Zhi, Pauloski J. Gregory, Ward Logan, Chard Kyle, Blaiszik Ben, Foster Ian
TITLE=Models and Processes to Extract Drug-like Molecules From Natural Language Text
JOURNAL=Frontiers in Molecular Biosciences
VOLUME=Volume 8 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/molecular-biosciences/articles/10.3389/fmolb.2021.636077
DOI=10.3389/fmolb.2021.636077
ISSN=2296-889X
ABSTRACT=Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A promising source of candidates for such studies is molecules that have been reported in the scientific literature to be drug-like. We report here on a project that leverages both human and artificial intelligence to detect references to drug-like molecules in free text. We engage non-expert humans to create a corpus of labeled text, use this labeled corpus to train a named entity recognition model, and employ the trained model to extract 10,912 drug-like molecules from the CORD-19 corpus of around 200,000 papers. Performance analyses show that our automated extraction model can achieve performance on par with that of non-expert humans.