Editorial: Herbarium Collection-Based Plant Evolutionary Genetics and Genomics

OUT OF THE BOX: HERBARIUM COLLECTIONS AS RESEARCH TOOLS
Over the past centuries, and especially in recent decades, herbarium collections have amassed an estimated 350 million specimens of plants and fungi, deposited in 3,400 herbaria worldwide (Soltis, 2017). Many of these specimens belong to as-yet-undescribed species, estimated to number around 70,000 (Bebber et al., 2010), and therefore represent a potentially highly significant contribution to taxonomy and systematics.
These collections are a huge repository of botanical metadata, not only at the level of the specimen itself, with its associated collection locality, population, and pathogen data (Yoshida et al., 2015; James et al., 2018), but also at the level of characters and traits such as leaf morphology (Queenborough, 2017), gene (Bieker and Martin, 2018), and genome (GGBN; Seberg et al., 2016). Analyzing this metadata permits us to look back, and perhaps even forward, in time: testing historical biological hypotheses, identifying extinct genotypes, modeling past ecological processes, and extrapolating future trends. Especially for plants with long generation times, it is often not feasible to observe genetic changes over many generations in greenhouse experiments. Because time series for many plant species can be assembled from global herbaria, these collections allow such changes to be studied directly, for example to estimate mutation rates after introduction to a novel range (Exposito-Alonso et al., 2018). Collectively, herbarium collections can be seen as part of a Global Museum in which new questions can be addressed using novel combinations of disciplines that so far have not typically interacted (Bakker et al., 2020), and in which digitisation of herbarium collections plays an important role (Soltis, 2017).

HERBARIUM DNA
Over the past three decades, a considerable body of literature has accumulated concerning herbarium DNA, mostly focusing on its properties and post-mortem damage (e.g., Savolainen et al., 1995; Staats et al., 2011, 2013; Weiss et al., 2016) and its potential utility for biological inference (e.g., Erkens et al., 2008; Drábková et al., 2002; Telle and Thines, 2008; Särkinen et al., 2012; Bakker et al., 2016; Gutaker et al., 2017; Brewer et al., 2019). The overarching conclusion has been that plant archival DNA is remarkably well-preserved, in spite of the heat often applied to the specimens during collection and preservation. In contrast with animal tissues, the cell wall in plant (and fungal) material probably provides good protection against DNA damage caused by oxidative stress (Mateiu and Rannala, 2008; Roldán-Arjona and Ariza, 2009).
Plant nuclear genomes are generally much larger than animal and fungal genomes (Gregory et al., 2007) and are characterized by abundant repeats, which can hamper the assembly of genome sequences. However, provided a reference genome sequence exists, whole-genome studies including herbarium specimens are possible. For example, Exposito-Alonso et al. (2018) re-sequenced the nuclear genomes of 36 Arabidopsis thaliana herbarium specimens collected between 1863 and 1993. As the number of available reference genomes rapidly increases, in part through initiatives such as the Earth BioGenome Project (Lewin et al., 2018), this approach is becoming available even for non-model plant species.
Plastomes, with their structural conservation across land plants (Wicke and Schneeweiss, 2015), modest length of around 160 kbp, and high copy number in the cell, have proven to be feasible targets for herbarium DNA studies, especially through the application of "genome skimming" (Straub et al., 2012; Bakker, 2017). Remarkable examples of plastome sequencing from herbarium DNA include now-extinct species: for instance, the de novo assembly of the complete plastome (Zedane et al., 2015) and mitogenome (Van de Paer et al., 2018) from a 140-year-old specimen of Hesperelaea (Oleaceae), the reference-guided assembly of the plastome from a 167-year-old specimen of Leptagrostis schimperiana (Arundinoideae, Poaceae) to resolve its taxonomic position (Hardion et al., 2020), and the reconstruction of the complete plastome from the now-extinct, endemic Hawaiian mint Stenogyne haliakalae (Welch et al., 2016). Due to their influence on crop production and their development of herbicide resistance, weeds are also of general interest. As an example, Sablok et al. here contribute the reconstructed plastome sequence of Ambrosia trifida (Asteraceae) from a herbarium specimen collected in 1886, and investigate the plant's resistance to the widely used herbicide glyphosate.
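To see why a plastome of modest length and high copy number is a tractable target for shallow sequencing, a back-of-the-envelope depth calculation can help. The sketch below is illustrative only: the function name `expected_depth`, the read count, the read length, and above all the assumed 5% share of plastid-derived reads are assumptions introduced here, not values taken from the studies cited. The real plastid share varies widely among tissues and taxa.

```python
def expected_depth(n_reads, read_len_bp, target_fraction, target_len_bp):
    """Mean sequencing depth expected on a target of known length.

    n_reads          total number of reads in the library
    read_len_bp      mean read length in base pairs
    target_fraction  share of reads derived from the target (0..1)
    target_len_bp    length of the target in base pairs
    """
    if target_len_bp <= 0:
        raise ValueError("target_len_bp must be positive")
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must lie between 0 and 1")
    # Bases assigned to the target, spread evenly over its length.
    return n_reads * read_len_bp * target_fraction / target_len_bp


if __name__ == "__main__":
    # Hypothetical skimming run: 1 million reads of 100 bp, with an
    # ASSUMED 5% of reads originating from a ~160 kbp plastome.
    depth = expected_depth(1_000_000, 100, 0.05, 160_000)
    print(f"expected plastome depth: {depth:.2f}x")
```

Under these assumed inputs the plastome would be covered roughly 30-fold, whereas the same 100 Mbp of sequence spread over a nuclear genome of several hundred Mbp would give well under 1x, even if every read were nuclear.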

ANCIENT ALLELES
Herbarium specimens contain not only DNA from the specimen itself, but also from associated microbes and pathogens that can be exploited for evolutionary and ecological inference (Bieker et al., 2020). Indeed, examples in which herbarium DNA contributed to evolutionary genetic inference have focused on historical pathogens (Martin et al., 2013; Yoshida et al., 2014, 2015), determining the genotype of the oomycete plant pathogen Phytophthora infestans that caused the nineteenth-century Irish potato famine. Ristaino summarizes emerging patterns in global Phytophthora distribution and how mycological and plant herbaria have played an important role in reconstructing pathways of plant pathogen movement. Ancient alleles in Alopecurus myosuroides Huds., relevant to herbicide resistance but pre-dating human influence, were detected from herbarium DNA by Délye et al. (2013). Likewise, Besnard et al. (2014), using DNA from a 100-year-old Madagascan herbarium specimen, reconstructed the shift in underlying genetics from C3 to C4 in grass photosynthesis. These examples demonstrate that we are currently at the dawn of an era of historical herbarium genomics (Buerki and Baker, 2015; Bieker and Martin, 2018), and it is likely that a large body of plant archival genomic data will be generated in the years to come.
That clade- or phylum-specific challenges remain in sequencing herbarium DNA is illustrated by Forin et al., who describe the case of the Saccardo mycological herbarium and show that ribosomal DNA sequencing of 100-year-old fungal specimens was feasible. Forin et al. point out the potential of mycological herbarium genomics, as only an estimated 1% of total known fungal species currently have associated DNA sequences in public databases.

DNA BARCODING AND METABARCODING
DNA barcoding is a commonly used method for species identification and phylogenetic analysis. It relies on the amplification of short, conserved genomic loci that show enough variation to separate species but have low intraspecific variation. Due to the high levels of DNA fragmentation (Weiss et al., 2016), it is challenging to obtain barcode sequences from herbarium specimens (Prosser et al., 2016). Kistenich et al. explore DNA sequencing of a ca. 900-bp portion of the long mitochondrial ribosomal small subunit (mtSSU) from historical lichen specimens using a two-step PCR approach followed by Ion Torrent sequencing. Their approach demonstrated a high success rate compared to traditional Sanger sequencing, providing enough sequence information for species identification even in a 150-year-old specimen. Moreover, no significant correlation between sequencing success and habitat ecology of the investigated specimens was found, which provides confidence for future lichen herbarium sequencing projects. These and other methodological advancements suggest that barcode information can also be more easily retrieved from herbarium and fungarium type specimens, which are often more than 100 years old.
Barcoding studies, and especially those employing metabarcoding, rely heavily on the completeness of reference databases for species identification. For completing these databases, which currently contain only an estimated 20% of described land plants (Wilkinson et al., 2017), herbarium collections offer a valuable and largely untapped resource. Focusing on Australian plant biodiversity, Dormontt et al. make a strong case that herbarium collections should be systematically analyzed to capitalize on their scientific potential, and that the curation of specimen reference data is paramount in this.
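The barcoding principle outlined above (low variation within species, larger differences between them, and dependence on a reference library) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the workflow of any study cited here: the names `p_distance`, `identify`, and `REFERENCES`, the 20-bp sequences, and the 3% distance threshold are all hypothetical, and real pipelines align sequences and search curated reference libraries rather than comparing pre-aligned strings.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence lacks an unambiguous base (for example
    'N' from a degraded template) are skipped rather than counted.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def identify(query, references, max_distance=0.03):
    """Return (best_name, distance); best_name is None above the threshold.

    max_distance is an ASSUMED cut-off standing in for the gap between
    intraspecific and interspecific variation.
    """
    name, dist = min(
        ((n, p_distance(query, s)) for n, s in references.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance else (None, dist)


# Hypothetical, made-up reference barcodes (already aligned).
REFERENCES = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACCTACGAACGT",
}

if __name__ == "__main__":
    # A query with one missing site still matches its reference.
    print(identify("ACGTACGTNCGTACGTACGT", REFERENCES))
    # A query far from every reference is left unassigned.
    print(identify("TTTTACGTACGTACGTACGT", REFERENCES))
```

The second query shows why database completeness matters: a specimen whose species is absent from the reference set cannot be named, however good its sequence.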

GENOMIC SEQUENCING AND OTHER DATA FROM HERBARIUM DNA
The increasingly common retrieval of genome-wide SNP data from herbarium DNA presents an enormous potential for future phylogenetic, population genomic, and molecular-based ecological studies that include these valuable specimens. For instance, genotyping by sequencing (GBS) of Solidago species from relatively young herbarium specimens collected between 1970 and 2010 was successful in 98% of samples (Beck and Semple, 2015). More recently, Lang et al. (2020) described a very promising reduced-representation sequencing method designed for hybridization-capture of ddRAD loci from historical plant specimens, demonstrating its utility on Arabidopsis thaliana and Cardamine bulbifera. At a lower-throughput genomic scale, targeted enrichment approaches aim to obtain sequence data for hundreds of nuclear-encoded loci from herbarium samples (Hart et al., 2016; Brewer et al., 2019; Viruel et al., 2019). Forrest et al. test the limits of the Hyb-Seq approach for sequencing herbarium specimens, especially with regard to historic and contemporary techniques of specimen preservation. They show that although data could also be obtained from poor-quality samples, preservation methods such as heat and alcohol treatment led to greater DNA degradation and poorer DNA retrieval. Finally, Dodsworth et al. explore the angiosperm non-coding genome, showing that herbarium DNA is virtually indistinguishable from fresh DNA in analyses of nuclear genomic repeat clusters and their abundances. Thus, they argue that herbarium collections can facilitate further genomic exploration of the repetitive content of plant genomes, which yields additional (nuclear) phylogenetic markers useful at the (sub)species level (see also Dodsworth et al., 2015).

SCALING-UP THE USE OF COLLECTIONS
This Research Topic has collected a variety of outstanding recent successes in genetic analysis of herbarium specimens, inspiring our prediction of the great potential of these approaches for evolutionary genomic, population genetic, phylogenetic, and biosystematic discovery in archival plants and their associated microbial communities. For instance, future studies of the evolution of photosynthesis, or the evolutionary ecology of adaptive and functional traits, will undoubtedly benefit from the extensive samples offered by herbarium collections. We posit that we are at the dawn of an era of herbarium genomics in which the traditional phylogenetic studies that have become the standard for herbarium collection-based genetic work will soon be augmented by a flood of ecological and evolutionary investigations utilizing large quantities of genomic and genetic data gleaned directly from voucher specimens.

AUTHOR CONTRIBUTIONS
MM conceived of the Research Topic. FB drafted the editorial with significant contributions from VB and MM. All authors contributed to the article and approved the submitted version.