AUTHOR=Yamamoto Shintaro, Lauscher Anne, Ponzetto Simone Paolo, Glavaš Goran, Morishima Shigeo TITLE=Visual Summary Identification From Scientific Publications via Self-Supervised Learning JOURNAL=Frontiers in Research Metrics and Analytics VOLUME=Volume 6 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/research-metrics-and-analytics/articles/10.3389/frma.2021.719004 DOI=10.3389/frma.2021.719004 ISSN=2504-0537 ABSTRACT=The exponential growth of scientific literature creates a need to support users in analyzing and understanding the body of research effectively and efficiently. This can be facilitated by providing graphical abstracts, i.e., visual summaries of scientific publications. Accordingly, previous work presented an initial study on automatically identifying a central figure in a scientific paper that can serve as a visual summary. However, these efforts are currently limited to the biomedical domain. This is primarily because the current state of the art relies on supervised machine learning, which requires large amounts of labeled data, and the only existing annotated data set consists solely of biomedical research papers. To alleviate these issues, we build a novel benchmark data set for visual summary identification from scientific publications, which consists of papers presented at conferences in several domains of computer science. We couple this contribution with a new self-supervised learning approach that learns a heuristic matching of inline references to figures with figure captions; it requires only a collection of scientific papers, thereby reducing the need for large annotated data sets. We evaluate the proposed approach on both the existing biomedical data set and our newly presented computer science data set. The experimental results suggest that the proposed method is able to outperform the previous state of the art without any annotated training data.