Integrating global geochemical volcano rock composition with eruption history datasets

The major element composition of volcanic rocks carries important information about the source and differentiation processes affecting the magma, the physical properties that allow it to erupt, and its eruptive style. Although global rock geochemical databases exist, these are not linked to volcanic eruption history which hampers our global understanding of the relationship between magma composition and eruption dynamics. Here, we integrate two global databases, the Geochemistry of Rocks of the Oceans and Continents (GEOROC) and the Holocene volcanoes of the world of the Global Volcanism Program (VOTW-GVP). The integration is based on matching the location name, geographic position and eruption time, which is automated by a tool called DashVolcano. The tool is open-source, accessible at https://github.com/feog/DashVolcano, and gives access to the integrated datasets via an interactive dashboard. DashVolcano is based on more than 138,000 volcanic rock samples and provides the basis for the identification of global relationships between eruption styles, volcano types, and rock composition for more than 700 volcanoes and their eruptions for the last 10,000 years. The combined record of the eruptive history and its corresponding geochemical rock composition that DashVolcano provides can be used for characterizing global geochemical differences between volcanoes, and should also prove useful for improved long-term hazard and risk evaluations.


Introduction
The deposits of volcanic eruptions are a record of magmatic processes that reflect the physical and chemical changes of the erupted magma throughout its lifetime, from the source origin, during residency in the crustal plumbing system, until cooling at the Earth's surface (Schmincke, 2008). The variations in the geochemistry of the erupted magma are key factors for understanding its eruption processes and styles, and are thus important for volcanic hazard mitigation. The major element composition of the magma plays an important role in controlling physical properties such as viscosity and density, which impede or facilitate eruption and influence whether eruptions are explosive or effusive (Gonnermann and Manga, 2012; National Academies of Sciences, Engineering, and Medicine, 2017; Cassidy, 2018). Thus, a comprehensive record of the eruptive histories and their corresponding geochemical rock compositions is critical for long-term evaluations of hazard and risk. The multifaceted and multiparametric nature of volcanic phenomena means that many different databases have been created, but each typically targets only one aspect of volcanology (Andrews et al., 2022). For instance, much of the information about the composition of volcanic rocks can be gathered from the Geochemistry of Rocks of the Oceans and Continents (GEOROC, https://georoc.eu; Sarbas and Nohl, 2008; DIGIS Team, 2022a; DIGIS Team, 2022b; DIGIS Team, 2022c; DIGIS Team, 2022d; DIGIS Team, 2022e; DIGIS Team, 2022f; DIGIS Team, 2022g; DIGIS Team, 2022h; DIGIS Team, 2022i; DIGIS Team, 2022j) of the Digital Geochemistry Infrastructure (DIGIS), and the Volcanoes of the World (VOTW) of the Smithsonian's Global Volcanism Program (GVP, https://volcano.si.edu/; Global Volcanism Program, 2013) provides a wealth of background and eruption chronology data for Holocene events. However, much more could be learnt if the individual global volcanological datasets could be jointly explored, potentially unveiling correlations unseen so far.
The GEOROC database contains more than 380,000 volcanic rock analyses, with the concentration of their major oxides. Linking this compositional information to the corresponding volcanic episodes would enrich the knowledge provided by the GVP database, which includes over 1,400 volcanoes and around 9,800 eruptions.
In this paper, we present a tool that automates the integration of GVP and GEOROC databases. We describe 1) how the geochemical and volcanological data integration was done by matching parameters from both databases, and linking their data contents, 2) the complexity and challenges encountered in processing the available datasets, and in automating the linkage between the two resources, 3) data preparation and preconditioning, and 4) the type of information obtained from this integration and the complementarity that it offers. We created the interactive data exploration dashboard called DashVolcano which provides an intuitive way to visually explore, select, compare, and download datasets. DashVolcano gives access to both GVP and GEOROC datasets, allowing applications for exploring the data available for volcanoes of interest (addition of other datasets is also possible as discussed in Supplementary Appendix II), comparing the rock compositions and eruptive histories of volcanoes, and thus gather evidence of expected behaviors, or form new hypotheses. We also provide a prospective look at potential applications to utilize the integrated datasets, with selected case examples to illustrate how to perform data query, visualization, plot, and download.

Methods
We start by summarizing the features of interest from the GVP and GEOROC databases. The GVP database can be queried for both volcano and eruption data (see https://volcano.si.edu/). We use Holocene volcano datasets with known eruption data downloaded from the GVP volcano search URL (https://volcano.si.edu/search_volcano.cfm), including the volcano name, major and minor rock types, the location, and the tectonic setting (Table 1). We only considered confirmed eruptions of GVP (https://volcano.si.edu/search_eruption.cfm), which typically include the eruption date(s) and the volcanic explosivity index (VEI; Newhall and Self, 1982), among other data (Table 2). A detailed explanation of how eruption dates are handled depending on the different types of available data is provided in Supplementary Appendix I. Overall, GVP contains data for about 1,400 volcanoes and 9,800 confirmed eruptions.
The GEOROC database contains rock compositional data, including major element oxides in mass % (wt%), trace elements, and a range of isotope compositions (https://georoc.eu/georoc/newstart.asp) of about 384,000 volcanic sample analyses. It is possible to query the database, or to download precompiled files by locations, which are categorized by tectonic settings. Hereafter we write GEOROC features in capital letters to make them easier to distinguish from GVP features. The features of a rock sample from a precompiled file can include sample name, rock type, eruption date, sample location, and the geochemical data (only a subset of the available chemicals is reported here; Table 3).
We are only interested in volcanic rocks, so from the precompiled files only samples whose ROCK TYPE is VOL were considered. The precompiled files are grouped in 11 different folders that mainly reflect the tectonic settings: Archean cratons, complex volcanic settings, continental flood basalts, convergent margins, intraplate volcanics, ocean basin flood basalts, ocean island groups, oceanic plateaus, rift volcanics, seamounts and submarine ridges. The reported chemical compositions within those files are obtained from diverse types of analysed material: whole rock, melt inclusion, volcanic glass, and mineral. The melt inclusion analyses are reported in a separate precompiled file. Most mineral compositions are not shown in the plots because they are very different from the rest of the data, but when they are, their symbol is the same as that of whole rock.

Linking GEOROC and GVP
To link the two databases we need to match GEOROC rock samples to GVP volcanoes and eruptions, for which we matched the geographical location and temporal features as described below (Figure 1).

TABLE 1 (fragment)
Major and minor rock types: The most common rock types for the volcano, based on the TAS (Total-Alkali Silica) diagram (Le Bas et al., 1986; Le Maitre et al., 2002) and listed in general order of abundance, are described as "major." A rock type is described as "minor" or "rare" if, once quantified, it consists of less than 10% of the total volume (Siebert et al., 2011).

In the GVP database, volcanoes are identified by their name and a volcano number. The volcano number is necessary because there are different volcanoes with the same name (e.g., Cerro Azul is the name of both a volcano in Chile, and also one in the Galapagos islands in Ecuador). Moreover, there are volcanoes which have several names (synonyms), or several spellings of the same name (e.g., Rinjani/Rindjani in Indonesia). A GVP volcano can contain multiple eruption records, but each eruption belongs to only one volcano.
In the GEOROC database, a sample has a LOCATION, a LOCATION COMMENT, a range of latitudes (LATITUDE MIN, LATITUDE MAX) and a range of longitudes (LONGITUDE MIN, LONGITUDE MAX). The ranges can be reduced to a single point, depending on the precision. The LOCATION is of the form REGION/SUBREGION/SUBSUBREGION; depending on the record, it can have from 1 to 9 (sub) region(s). The tectonic settings use different conventions in the two databases. There is no convention for regions and sub-regions either.
Linking the two databases requires matching the volcano names (or locations) in GVP with names appearing in the LOCATION field (or LOCATION COMMENT) of the GEOROC files. The overall process is summarized in Figure 2. Using only the names appearing in the LOCATION field gave too many errors, due to the same or similar names being present in separate places. We decided to use instead the latitude and longitude data first. We used the ranges provided by the GEOROC files, allowing a margin of error of 0.5°, and then retained only those GEOROC records whose range contains a latitude-longitude pair corresponding to a GVP volcano, and discarded the rest ( Figure 2). We then automatically matched names by finding a GVP volcano name in the LOCATION field (or in the LOCATION COMMENT field) of shortlisted samples. Complications arise from 1) having different spellings for the same volcano [e.g., in the GEOROC records, the names TJERIMAI (TJAREME), CEREME, CIREMAI are found for the GVP name Cereme], 2) having different names used in both databases (e.g., the GEOROC name TERNATE corresponds to the GVP name Gamalama), 3) linking several GEOROC location names to the same GVP volcano, 4) identifying samples belonging to the same volcano even though they may appear in different GEOROC files.
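The two-stage matching described above (position first, then name) can be sketched as follows. This is a minimal illustration and not the DashVolcano code: the `Volcano` record, the function names, the placeholder volcano identifiers and the approximate coordinates are assumptions made for the example, and longitude wrap-around at ±180° is ignored.

```python
from dataclasses import dataclass

MARGIN_DEG = 0.5  # margin of error quoted in the text


@dataclass
class Volcano:
    number: int  # placeholder identifier, not a real GVP volcano number
    name: str
    lat: float
    lon: float


def range_contains(sample, volcano, margin=MARGIN_DEG):
    """True if the sample's lat/lon range, widened by the margin, contains the volcano."""
    return (
        sample["LATITUDE MIN"] - margin <= volcano.lat <= sample["LATITUDE MAX"] + margin
        and sample["LONGITUDE MIN"] - margin <= volcano.lon <= sample["LONGITUDE MAX"] + margin
    )


def match_sample(sample, volcanoes):
    """Shortlist volcanoes by position, then keep those named in the location text."""
    text = (sample.get("LOCATION", "") + " " + sample.get("LOCATION COMMENT", "")).upper()
    return [
        v.number
        for v in volcanoes
        if range_contains(sample, v) and v.name.upper() in text
    ]
```

Filtering by position before comparing names is what lets the sketch separate homonyms such as the two Cerro Azul volcanoes. Alternative spellings and synonyms (e.g., TERNATE for Gamalama) are not handled here and would still need a curated list of names.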
Once the automatic matching was completed, part of the resulting data was checked manually. A completely reliable automatic matching without manual check is not possible in general: there are, for example, samples located in between volcanoes, requiring expert knowledge to accurately assign the sample analysis to a particular volcano. Sometimes, there may also be errors in the records. DashVolcano provides a map of the sample locations to help decide on the attribution of a sample to a volcano when ambiguous, as will be detailed below.

TABLE 2 (fragment)
Start and end dates: The start and end dates of an eruption that occurred in the last 10,000 years, giving the duration of the eruption. For historical eruptions, when known, they are reported as year-month-day. BC dates are represented as negative years.
VEI: The Volcanic Explosivity Index (Newhall and Self, 1982), which categorizes the strength of an eruption.
Event: Record of events for each eruption, illustrating the eruptive types and processes (Siebert et al., 2011). Event types are provided (e.g., earthquake, explosion, pyroclastic flow, ash).

TABLE 3 (fragment)
Major element oxides: The major element oxides of chemical rock compositions, reported in weight percent (wt%), that are used in this paper.
LOI(wt%): Loss-on-ignition, reported in weight percent (wt%).

Frontiers in Earth Science frontiersin.org 03

GVP and GEOROC datasets
The datasets of GVP and GEOROC were downloaded in 2021 (more precisely in June 2021 for GEOROC data). From GVP we accessed ~1,400 Holocene volcanoes and ~9,800 eruption records, with ~860 volcanoes having known eruption date(s); the download also included some eruptions older than the Holocene for Yellowstone caldera. GEOROC precompiled files provided 384,000 volcanic rock analyses, ~138,000 of which were matched to locations around GVP Holocene volcanoes. Following the matching procedure described above, we obtained around 1,000 GVP volcanoes with matched rock samples, around 700 of which have eruption date(s). We expect that most of the GVP Holocene volcanoes are (at least partially) matched.

Matching GEOROC dates with GVP eruptions
Once the GEOROC samples are matched with a GVP volcano, the samples were assigned to eruptions based on temporal data, more precisely on the GEOROC ERUPTION YEAR field and on the GVP Start Year and End Year fields (if the GVP End Year is missing, the GVP Start Year is used). The algorithm worked as follows: 1) if the ERUPTION YEAR is in between the Start Year and End Year, a match is found; 2) otherwise, look for a Start Year before the ERUPTION YEAR, and an End Year which has not been identified in 1). If the Start Year and End Year are the same and match an ERUPTION YEAR, then we searched for a match at the month level. Many eruption dates in the GVP dataset are incomplete and only part of the start/end date is known. Moreover, for eruptions after 1679 (this year is used for programming reasons, not volcanological ones), many records also contain the start and end day, whereas for most older eruptions only the year is known. To address the issue of missing dates, we first discarded all eruptions for which the start date was not known. Then, to be able to plot the data for older eruptions, the start and end dates were interpolated following the existing data. For eruptions older than 1679 and where the end year is missing, we made it the same as the start year. These procedures were used only to visualize eruption periods: the exact content of GVP dates is preserved, and the correct data are available when hovering the mouse above the data points. The exact methodology we used is detailed in Supplementary Appendix I.
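Step 1) of the date matching can be sketched as below. This is a hedged illustration only: the function name and the dictionary layout of an eruption record are assumptions, the example years are invented, and the fallback step 2) and the month-level refinement are deliberately left out because the text does not specify them in enough detail.

```python
def match_eruptions(eruption_year, eruptions):
    """Return the GVP eruptions whose [Start Year, End Year] interval contains the year.

    BC years are negative, as in the GVP data described in the text.
    """
    matches = []
    for eruption in eruptions:
        start = eruption.get("Start Year")
        if start is None:
            continue  # eruptions without a known start date are discarded
        end = eruption.get("End Year")
        if end is None:
            end = start  # a missing End Year falls back to the Start Year
        if start <= eruption_year <= end:
            matches.append(eruption)
    return matches
```

Returning a list rather than a single eruption keeps ambiguous cases visible, which fits the manual checking step described earlier.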

The DashVolcano tool
To make the integration of GVP and GEOROC databases easily accessible and explorable, we designed a dashboard which, given a volcano of interest, comprises five types of visualization of the joint data which are described in sequence below: 1. A map of the world (using ©OpenStreetMap), showing GVP volcanoes, and GEOROC samples around volcanoes, 2. TAS (total-alkalis versus silica) diagrams (Le Bas et al., 1986; Le Bas et al., 1992), 3. Harker diagrams (e.g., Winter, 2001) from the GEOROC oxide compositions, together with, 4. GVP bar plots, summarizing known VEI, and 5. A chronogram with GVP eruptive history and GEOROC silica and alkalis evolution.

Map view of GVP volcanoes and matched GEOROC samples
The location of the samples is shown using a world map (Figure 3), which includes GVP volcanoes and GEOROC samples around the volcanoes. Rock sample locations are given with a range of latitude and longitude (LATITUDE MIN/MAX and LONGITUDE MIN/MAX). Only samples whose latitude-longitude range contains a volcano listed by GVP, within a margin of 0.5°, are shown. We included volcanoes with known and unknown eruption dates. Some data samples have erroneous values for their latitude/longitude ranges, and these were discarded and not displayed. To display a sample, the mean of the latitude range (namely, [LATITUDE MIN + LATITUDE MAX]/2) and that of the longitude range are used. With the map tool, it is possible to zoom into a volcano selected from a drop-down menu, in which case the GEOROC samples automatically attached to this volcano are highlighted in blue (Figure 4A). Two selection tools are available to group GEOROC samples of interest together. The corresponding TAS diagram of either the chosen volcano or the selected samples is also displayed (see also Figures 4B, 5B).

TAS diagrams
The most abundant element oxide in volcanic rocks is silica (SiO2) and, together with sodium (Na2O) and potassium (K2O) oxides, it is used to classify volcanic rocks with the Total-Alkali Silica (TAS) diagram described in detail by Le Bas et al. (1986) and Le Bas et al. (1992). If a sample has no data (or a 0 value) for all three, then it is not displayed. Care should be taken when evaluating or interpreting the iron concentrations, as iron is reported as FeO, Fe2O3, or both. Most researchers only determine the total iron and report it as FeOT (total iron as FeO). GEOROC contains data fields for FE2O3(wt%), FEO(wt%), and FEOT(wt%). For samples where only FEOT(wt%) is reported we used this value. For samples where FE2O3(wt%) and FEO(wt%) are reported, we recalculated Fe2O3 into FeO and added it to the reported FEO(wt%), so that for all samples we report FEOT(wt%) as FE2O3(WT%)/1.111 + FEO(WT%), as in Kress and Carmichael (1991). All analyses are recalculated on a volatile-free basis, that is, excluding the "loss-on-ignition" (LOI), which is a measure of alteration, and with the major elements normalized to a total of 100 wt%. The major element compositions (wt%) used in the calculation are listed in Table 3. Each sample point in GEOROC can be WL (whole rock), GL (volcanic glass), INC (inclusion, which refers to melt inclusions) or MIN (mineral/component incl. groundmass) according to the field MATERIAL (as shown in the legends of Figures 4B, 5B, 6A, 7). We used different symbols for each material in the TAS plots (except for minerals, which typically do not appear in the plots due to their very different compositions), and each material can be deactivated during visualization as needed (Figure 7). A popup window with information about the respective sample appears when hovering over its symbol.
A drop-down menu is available for the known eruption date of the sample ( Figure 6); when selected, the TAS diagram displays only sample data related to a particular date, if dates are available from the GEOROC dataset.
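The iron recalculation and the volatile-free normalization described in this section can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the list of major oxides is a common choice and not necessarily the list in Table 3, and the factor 1.111 is assumed to act as a divisor (FeO = Fe2O3/1.111, consistent with the mass ratio of about 0.8998 between 2 FeO and Fe2O3).

```python
# Column names follow the GEOROC-style spelling used in the text. The exact list
# of major oxides is an assumption; Table 3 of the paper is authoritative.
MAJOR_OXIDES = [
    "SIO2(WT%)", "TIO2(WT%)", "AL2O3(WT%)", "FEOT(WT%)", "MNO(WT%)",
    "MGO(WT%)", "CAO(WT%)", "NA2O(WT%)", "K2O(WT%)", "P2O5(WT%)",
]
FE2O3_PER_FEO = 1.111  # assumed divisor: FeO = Fe2O3 / 1.111


def total_iron_as_feo(sample):
    """Return FeOT: the reported FEOT if present, else Fe2O3/1.111 + FeO."""
    feot = sample.get("FEOT(WT%)")
    if feot:
        return float(feot)
    fe2o3 = float(sample.get("FE2O3(WT%)") or 0.0)
    feo = float(sample.get("FEO(WT%)") or 0.0)
    return fe2o3 / FE2O3_PER_FEO + feo


def normalize_volatile_free(sample):
    """Rescale the major oxides (LOI excluded) so that they sum to 100 wt%."""
    values = dict(sample)
    values["FEOT(WT%)"] = total_iron_as_feo(sample)
    oxides = {name: float(values.get(name) or 0.0) for name in MAJOR_OXIDES}
    total = sum(oxides.values())
    if total <= 0:
        raise ValueError("sample has no major element data")
    return {name: 100.0 * value / total for name, value in oxides.items()}


def tas_point(normalized):
    """Return (SiO2, Na2O + K2O), the coordinates used in a TAS diagram."""
    alkalis = normalized["NA2O(WT%)"] + normalized["K2O(WT%)"]
    return normalized["SIO2(WT%)"], alkalis
```

Because LOI is simply left out of the sum, a sample with a large LOI is rescaled more strongly, which is the intended effect of a volatile-free recalculation.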

Variation (Harker) diagrams
Harker diagrams are classically used to show the variation of the amount of each of the chemical constituents of igneous rocks (e.g.,

FIGURE 2
Flow diagram showing how the data mapping between GVP volcano name and GEOROC location name was done by matching name and geographical location.

GVP bar plots of eruption history with known VEI
We display the TAS diagram of a selected volcano using GEOROC data ( Figure 6A), together with a bar plot of the values of VEI from eruptions in GVP ( Figure 6C). The bar plot illustrates the volcanic explosivity history of a given volcano using the number of eruptions with a given VEI. Each VEI point is displayed with a date if provided by GVP. Unknown values of VEI are also indicated. The list of major and minor rocks (see description in Table 1) from the GVP dataset is given and reflected in the color of the histogram (from warmer red colors for felsic, towards colder blue colors for mafic). When a date is chosen from the GEOROC date menu, the corresponding eruption is reflected in the GVP histogram if an automatic match is found.
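The counting behind such a bar plot can be sketched in a few lines. The record layout and the label used for the unknown bin are assumptions made for illustration, not the DashVolcano implementation.

```python
from collections import Counter


def vei_counts(eruptions):
    """Count eruptions per VEI value, keeping unknown VEI in its own bin."""
    return Counter(
        "unknown" if eruption.get("VEI") is None else int(eruption["VEI"])
        for eruption in eruptions
    )
```

Keeping the unknown values as a separate bin, rather than dropping them, mirrors the statement above that unknown VEI values are also indicated.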

GVP eruption chronogram
A chronogram is the eruption timeline of a volcano extracted from the GVP dataset. We distinguish three time periods: before the common era (BC), CE before 1679, and after 1679 (we note that this year has no geological significance; it is related to the precision of the time representation in Python). This was necessary because the periods have different levels of granularity: recent eruptions are often precise to the day, whereas BC data are generally less precise but may be known at the year level. All known eruptive history is displayed in chronological order; each colored bar represents the duration of an eruption based on the known start and end date, so short and long eruption periods are distinguishable. The actual dates in the GVP dataset are available upon hovering on the chronogram, so the user knows which dates are present in the dataset versus the dates used for plotting purposes.
The color of the bar reflects how many events an eruption had. The list of corresponding events which occurred during an eruption as reported by GVP can be seen when hovering on each bar. The different event types (e.g., explosion, earthquake, ash) are a record of the eruptive types and processes (Siebert et al., 2011). Dots connected with a line are superimposed for better visualizing the fluctuations of VEIs (see Figure 8).
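The split into three periods described in this section can be sketched as below. The paper only says that the year 1679 is tied to how Python records time; one plausible reading, stated here as an assumption, is that nanosecond-resolution timestamps (such as the default pandas `Timestamp`) cannot represent dates before late 1677. The function names, the period labels and the choice to place the year 1679 itself in the most recent period are also assumptions.

```python
CUTOFF_YEAR = 1679  # no geological significance, see the text


def period(start_year):
    """Classify an eruption start year into one of the three plotting periods."""
    if start_year < 0:
        return "BC"  # BC years are stored as negative numbers
    if start_year < CUTOFF_YEAR:
        return "CE before 1679"
    return "1679 and after"


def plotting_interval(start_year, end_year):
    """Return the (start, end) years used for plotting, or None if unusable.

    Eruptions without a known start are discarded. For eruptions older than
    the cutoff, a missing end year is set to the start year, as in the text.
    """
    if start_year is None:
        return None
    if end_year is None and start_year < CUTOFF_YEAR:
        end_year = start_year
    return start_year, end_year
```

These helpers only affect what is drawn. As the text stresses, the original GVP dates stay untouched and are what the hover text should report.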

Integrating the GEOROC data into the chronogram visualization plots
We display two views of the same TAS diagram (Figures 8A, B). The first contains all GEOROC data samples. In the second diagram, only GEOROC samples matching GVP eruptions are shown, and the symbols reflect the VEI. Moreover, the GEOROC data samples displayed in the overall TAS diagram are further superimposed on the chronogram, which can be deactivated if not needed (Figure 8C). The values of SIO2(wt%) and NA2O(wt%) + K2O(wt%) are plotted. The goal of the plot is to show the fluctuations of major chemical compositions over time as well as the ranges within an eruption.

FIGURE 3
A world view of the GVP volcanoes. Red dots are for volcanoes with known dated eruption(s) and black dots for those without known eruption dates, and the matched GEOROC samples appear as yellow dots. Note that a sample location may represent many rock samples.

GEOROC-GVP index files
The DashVolcano tool automates the integration of GEOROC and GVP datasets; it is not a new database that stores these data. It instead generates index files to cross-reference the volcanoes and rock samples. To link a GVP volcano to GEOROC rock samples, an indexing is used which, given as input a GVP volcano name and GEOROC location names of interest, dynamically provides a linking of the respective data so they can be combined, jointly visualized, and possibly jointly analyzed later. This indexing relies on mapping files. Each GEOROC precompiled file corresponds to a mapping file. Each line of a mapping file contains a volcano name from GVP, followed by a list of names that appear in the field LOCATION (or in the first entry of the field LOCATION COMMENT) of GEOROC. This is best illustrated with an example. Suppose we are interested in the MOUNT MAZAMA volcano. It belongs to the Cascade Volcanic Range in western North America, archived in the GEOROC precompiled file PVFZCE_CASCADES.csv, itself part of the Convergent Margin tectonic setting. This file contains the samples from MOUNT MAZAMA under two names: MOUNT MAZAMA and PRE-MAZAMA GROUP. In the GVP database, MOUNT MAZAMA is known as a synonym of Crater Lake. The index file for PVFZCE_CASCADES.csv thus contains one line of the form Crater Lake; MOUNT MAZAMA, PRE-MAZAMA GROUP. We note that if a user only wishes to study GEOROC data which do not correspond to any GVP volcanoes, it is also possible to use the above mechanism to link several GEOROC samples together, by tying them to a "GVP name" that is not present in the GVP volcano list. This can be done by adding a location name in the mapping file. For example, adding "Bayah Dome; BAYAH DOME" as the last line of the Sunda Arc mapping file (/GeorocGVPmapping/Convergent_Margins_comp/SUNDA_ARC.txt) will allow the user to browse the available GEOROC samples belonging to Bayah Dome, located in West Java (Indonesia).

FIGURE 4 (caption fragment)
[...] noting that all data have been recalculated to 100% volatile-free and with total iron oxide as FeO* (see text for details). Different symbols correspond to different material types, which can be deactivated when not needed. The datasets for Campi Flegrei, which consist of around 2,300 analyses, can be downloaded as a CSV file by clicking the "DOWNLOAD" button located between the map and the TAS diagram plot. The downloaded file, which stores all the data being plotted, is saved under the DashVolcano.1.0 folder.

FIGURE 5
(A) The wider Neapolitan Volcanic area can also be selected to include more data around the volcano of interest, or when needed to cover many volcanoes at once using the box or lasso select tools provided. (B) The TAS diagram is automatically updated to plot the chemical compositions of the rock samples within the selected area.

FIGURE 6 (caption fragment)
The GVP information related to the VEI and rocks of matching volcanoes is integrated. Whereas Mt. Mazama has good links between composition and eruptive history, Yellowstone does not, but the older than Holocene eruptions we report were found in the data downloaded in 2021.
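The mapping-file format described in this section (a GVP name, a semicolon, then a comma-separated list of GEOROC location names) can be read with a short parser. This is a sketch under assumptions: the function names are hypothetical, and it assumes that GEOROC location names do not themselves contain commas or semicolons.

```python
def parse_mapping_line(line):
    """Split 'GVP name; GEOROC NAME 1, GEOROC NAME 2' into (name, [locations])."""
    gvp_name, _, rest = line.partition(";")
    locations = [name.strip() for name in rest.split(",") if name.strip()]
    return gvp_name.strip(), locations


def load_mapping(lines):
    """Build a dictionary from GVP volcano name to GEOROC location names."""
    index = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        gvp_name, locations = parse_mapping_line(line)
        index.setdefault(gvp_name, []).extend(locations)
    return index
```

With the examples from the text, `parse_mapping_line("Crater Lake; MOUNT MAZAMA, PRE-MAZAMA GROUP")` yields the GVP name `Crater Lake` and its two GEOROC location names, and a user-defined entry such as `Bayah Dome; BAYAH DOME` is handled in exactly the same way.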

Manual data curation
Some users may want to include their own data collection and combine it with the matched GEOROC datasets, to improve data coverage of a volcano of interest and enable data visualization using the DashVolcano tools. It is possible to do so, but it may be necessary to edit the mapping files accordingly. The format of the manual data must follow the GEOROC precompiled file spreadsheet template, with the compulsory fields SAMPLE NAME, LOCATION (name), geographic location (LATITUDE, LONGITUDE), MATERIAL, TECTONIC SETTING and the major element composition. Manual curation is also a way to check the performance and consistency of the code that links geochemical data (GEOROC or manual curation) with those from GVP. As examples of manual curation, we compiled geochemical analyses of volcanic rock samples (Widiwijayanti, 2022a; Widiwijayanti, 2022b; Widiwijayanti, 2022c) from Mt. Fuji (Takahashi et al., 2003; Yamamoto et al., 2007; Yamamoto et al., 2020; Miyaji et al., 2011; Yamamoto, 2011; Ishizuka et al., 2021), Rinjani (Vidal et al., 2015; Métrich et al., 2017) and Merapi (Costa et al., 2013) volcanoes. These manually curated datasets contain data that are not currently in the GEOROC database. We also provide a detailed explanation of the manual data curation, including step-by-step instructions on how to add data manually and examples of use, in Supplementary Appendix II.
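Before adding a manually curated file, it can help to check that the compulsory fields are filled in. The sketch below is a hypothetical helper and not part of DashVolcano: the exact column headers of the GEOROC template are assumed from the field names listed above, the major element columns are not checked, and the sample rows are invented.

```python
import csv
import io

# Header spellings are assumptions based on the field names given in the text.
COMPULSORY_FIELDS = [
    "SAMPLE NAME", "LOCATION", "LATITUDE", "LONGITUDE",
    "MATERIAL", "TECTONIC SETTING",
]


def missing_fields(row):
    """Return the compulsory fields that are absent or empty in one row."""
    return [name for name in COMPULSORY_FIELDS if not (row.get(name) or "").strip()]


def check_csv(text):
    """Map 1-based data row numbers to their missing compulsory fields."""
    problems = {}
    reader = csv.DictReader(io.StringIO(text))
    for number, row in enumerate(reader, start=1):
        missing = missing_fields(row)
        if missing:
            problems[number] = missing
    return problems
```

An empty result means every row carries the compulsory metadata. It says nothing about whether the values are plausible, so a manual check is still needed.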

Discussion and application examples

Case example of map visualization
The geographic location of GEOROC volcanic samples does not always reflect the volcano they originated from, e.g., when the latitude and longitude ranges are too wide. Therefore, by plotting both GVP volcanoes and GEOROC samples, the Map visualization tool shows the spatial distribution of the GEOROC samples and relates both the unmatched and automatically matched GEOROC samples to their respective GVP volcano, if any. This can be seen when plotting samples from Campi Flegrei (Figure 4) [...] GEOROC database. The map (Figure 5A) also shows two other active GVP volcanoes close by, Vesuvius and Ischia (red dots), and the distribution of the GEOROC rock samples for the three volcanoes in the Neapolitan Volcanic Area. Selecting Campi Flegrei from the drop-down menu in the Map visualization tool generates its TAS diagram, in which all the plotted samples belong to Campi Flegrei and are shown in blue on the map (Figure 4A).
Using the Map visualization tool, a user may select an area of interest interactively from the map using the box or lasso select tools, depending on the desired shape. For example, a user may wish to include more GEOROC samples around Campi Flegrei or to select the wider Neapolitan Volcanic area. A new TAS diagram is then automatically generated to include all the GEOROC samples in the selected area. The TAS diagram shown in Figure 5B can be compared to the Neapolitan Volcanic rock data plotted in Figure 2 of Pignatelli and Piochi (2021) for validation purposes. It is also interesting to compare the rock compositions of a given volcano (here Campi Flegrei volcano, in Figure 4B) with the rock compositions of a larger area to which it belongs (here the Neapolitan Volcanic area, in Figure 5B).

Comparative studies and side-by-side plots
In order to facilitate comparative studies between two volcanoes, or two eruptions, side-by-side plots are provided, comprising TAS and Harker diagrams for analyzing geochemical variability, and bar plots to summarize the VEIs of the historical eruptions. The bar plots show how many eruptions occurred for each VEI, and the eruption date is shown when the user hovers over the dots inside the bar. Below the bar plots, the GVP major and minor rock types are reported (Global Volcanism Program, 2013). Side-by-side plots make it possible to quickly visualize, for example, which volcano has the most continuous range in composition and what the predominant rock types of each volcano are, and more generally provide a comparison between the general trends of the GEOROC datasets. As an example (Figure 6), we compare two large caldera-forming volcanoes in the North American continent, Yellowstone and Crater Lake (Mount Mazama). Both feature a variety of well-studied volcanic rocks that originated from some of the most explosive eruptions to have occurred within the last 2 million years (Bacon and Lowenstern, 2005; Bindeman et al., 2007), but the causes of their eruptive behavior are quite different and are closely linked to the plate tectonic setting: Yellowstone is located on a continental hotspot (Smith and Braile, 1994; Christiansen, 2001; Smith et al., 2009) and Crater Lake on the convergent margin of the Cascadia subduction zone (Bacon, 1983). The volcanism at Mt. Mazama and Yellowstone highlights these differences and forms the basis of comparative studies of the diverse origins of two caldera-forming volcanoes that are being used as exemplary teaching activities to explore compositional variations using GEOROC datasets (Ratajeski, 2004). There are two known large Holocene eruption records for Mt. Mazama, whereas Yellowstone does not have any; the older than Holocene eruptions we report for Yellowstone were found in the data downloaded in 2021.
The interactive visualization tools of DashVolcano make it possible to explore the variability of silica versus other major element oxides in Harker diagrams (Figure 6) and the variability of silica versus alkalis for different types of analyzed material, e.g., whole rock, volcanic glass or melt inclusion, in TAS diagrams (Figure 7).

FIGURE 8 (caption fragment)
(B) The TAS diagram that only plots GEOROC samples that can be matched to GVP eruptions. Here the shapes are used to highlight the VEI of the eruption (unknown or low VEI vs. VEI at least 3); the y-axis remains the total alkali oxides. (C) A chronogram plot with time series variations of the rock composition throughout historical eruptions of Mt. Etna from 252 AD to present, which can be explored interactively. The total alkali oxides (Na2O + K2O) and silica (SiO2) content of samples with known eruption time (blue dots) are plotted together with the GVP historical eruption records (coloured bars; the more eventful the eruption, the darker the bar) and the VEI data (purple dots; a line connects them to highlight the variations in VEI). All known eruptive history is displayed in chronological order; each coloured bar represents the duration of an eruption based on the known start and end date. A list of corresponding events (Global Volcanism Program, 2013) is shown on hover above the data point. We note that the year 1679 has no geological significance; it is related to the precision of the time representation in Python.

Data visualization with an eruption chronogram
The linked information between GEOROC and GVP data allows the composition of volcanic rock samples to be related to their eruption time and VEI, here expressed as a pair of TAS diagrams and a chronogram (Figure 8). The GEOROC rock samples that matched a chosen volcano, or a specific eruption date for this volcano if available, are plotted in two TAS diagrams. The left diagram contains only the GEOROC data, whereas the right diagram combines them with GVP data and shows only eruptions that are matched in both databases (with VEI data if available). The eruption history of this volcano is illustrated in a chronogram. The chronogram summarizes the timeline of the historical eruptions, the events that occurred during each eruption, and the VEI information as reported by GVP. The total alkali oxides (Na2O + K2O) and silica (SiO2) content of samples with known eruption time (blue dots) are plotted together with the GVP historical eruption records (red bars; the more events reported in an eruption, the redder the bar) and the VEI data (purple dots; a line connects them to highlight the variations in VEI). All known eruptive history is displayed in chronological order; each coloured bar represents the duration of an eruption based on the known start and end date. A list of corresponding events (Global Volcanism Program, 2013) is shown when hovering the mouse above the data point. We selected Mount Etna as a comprehensive example (Figure 8) as it erupts frequently (Tanguy et al., 2007), has well documented historical eruptions (with known start and end date, VEI, events), and has many rock sample analyses that illustrate the variation in chemical composition along the eruption timeline.

Conclusion
The open-source DashVolcano app automatically integrates two global databases, the Geochemistry of Rocks of the Oceans and Continents (GEOROC) and the Holocene volcanoes of the world of the Global Volcanism Program (VOTW-GVP), with a graphical user interface (GUI). The app provides the user a dashboard to explore and compare the joint data for volcano(es) and/or eruption(s) in terms of chemical rock composition, analyzed rock's material, eruption timeline, eruption events, and VEI information. DashVolcano can be used to find evidence to support known hypotheses or to articulate new ones based on the combined datasets, which are now widely accessible and available for the first time. This work is a significant contribution to the community, from the point of view of methodology on establishing interoperability between two global Earth science databases, helping the reuse of existing datasets, and providing new tools to explore the integrated datasets. The data and the code are publicly available without restriction. With this paper we provide an example of improving the findability, accessibility, interoperability, and reuse (FAIR) of global volcanological and geochemical data.

Author contributions
FO was the primary author who wrote the manuscript and is responsible for the code, data linking and integration. CW also wrote the manuscript and is responsible for manual data check and manual curation. FC initiated this study and edited the manuscript.