AUTHOR=Wang Kangtao, Herr Ingrid TITLE=Machine-Learning-Based Bibliometric Analysis of Pancreatic Cancer Research Over the Past 25 Years JOURNAL=Frontiers in Oncology VOLUME=12 YEAR=2022 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.832385 DOI=10.3389/fonc.2022.832385 ISSN=2234-943X ABSTRACT=

Machine learning and semantic analysis are computer-based methods for evaluating complex relationships and predicting future developments. We used these technologies to define recent, current and future topics in pancreatic cancer research. Publications indexed under the Medical Subject Headings (MeSH) term ‘Pancreatic Neoplasms’ from January 1996 to October 2021 were downloaded from PubMed. Using the statistical computing language R and the general-purpose programming language Python, we extracted publication dates, geographic information, and abstracts from each publication’s metadata for bibliometric analyses. The generative statistical algorithm “latent Dirichlet allocation” (LDA) was applied to identify specific research topics and trends. The unsupervised “Louvain algorithm” was used to build a network and identify relationships between individual topics. A total of 60,296 publications were identified and analyzed. The publications originated from 133 countries, mostly in the Northern Hemisphere. For the term “pancreatic cancer research”, 12,058 MeSH terms appeared 1,395,060 times. Among them, we identified the four main topics “Clinical Manifestation and Diagnosis”, “Review and Management”, “Treatment Studies”, and “Basic Research”. The number of publications has increased rapidly during the past 25 years. Based on the number of publications, the algorithm predicted “Immunotherapy”, “Prognostic research”, “Protein expression”, “Case reports”, “Gemcitabine and mechanism”, “Clinical study of gemcitabine”, “Operation and postoperation”, “Chemotherapy and resection”, and “Review and management” as current research topics. To our knowledge, this is the first study of this kind in pancreatic cancer research, made possible by improvements in algorithms and hardware.
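
The topic-identification step named in the abstract, latent Dirichlet allocation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which LDA implementation or settings were used, so the collapsed Gibbs sampler below, the function name `fit_lda`, the hyperparameters `alpha` and `beta`, and the four toy token lists are all assumptions chosen for illustration only.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with a collapsed Gibbs sampler.

    Returns (theta, top_words): per-document topic proportions and the
    three most frequent words assigned to each topic.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token
                # (unnormalized; rng.choices normalizes the weights).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + beta * n_vocab)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + alpha * n_topics) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    top_words = [
        [vocab[v] for v in sorted(range(n_vocab), key=lambda v: -row[v])[:3]]
        for row in topic_word
    ]
    return theta, top_words


# Made-up token lists standing in for preprocessed abstracts (not real data).
docs = [
    ["gemcitabine", "chemotherapy", "survival", "gemcitabine"],
    ["resection", "surgery", "postoperative", "resection"],
    ["chemotherapy", "gemcitabine", "trial", "survival"],
    ["surgery", "resection", "complication", "postoperative"],
]
theta, top_words = fit_lda(docs, n_topics=2)
for d, proportions in enumerate(theta):
    print(d, [round(p, 2) for p in proportions])
print(top_words)
```

Each row of `theta` gives one document's estimated mix of topics, and `top_words` lists the words most often assigned to each topic; in a bibliometric setting, topic labels such as those quoted in the abstract would be assigned by reading these word lists. On a corpus of tens of thousands of abstracts, an optimized library implementation would be the practical choice over this pure-Python sampler.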