AUTHOR=Walker Barnaby E., Tucker Allan, Nicolson Nicky
TITLE=Harnessing Large-Scale Herbarium Image Datasets Through Representation Learning
JOURNAL=Frontiers in Plant Science
VOLUME=Volume 12 - 2021
YEAR=2022
URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2021.806407
DOI=10.3389/fpls.2021.806407
ISSN=1664-462X
ABSTRACT=The mobilisation of large-scale datasets of specimen images and associated metadata generated through herbarium digitisation, along with the expert workflows associated with these data objects, provides a rich environment for the application and development of machine learning techniques. However, limited access to computational resources and uneven progress in digitisation, especially for small herbaria, still present barriers to the wide adoption of these new technologies. Using deep learning to extract representations of herbarium specimens useful for a wide variety of applications could help remove these barriers. Despite its recent popularity for camera trap and natural world images, representation learning is not yet as popular for herbarium specimen images. We investigated the potential of representation learning with specimen images by training three neural networks on a large-scale publicly available dataset. We compared the extracted representations and tested their performance in application tasks relevant to research carried out with herbarium specimens. We found that a triplet network, a type of metric learning, produced representations that transferred the best across all applications investigated. Our results demonstrate that it is possible to learn representations of specimen images useful in different applications, and we identify some further steps that we believe are necessary for representation learning to harness the rich information held in the world's herbaria.