<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Food. Sci. Technol.</journal-id>
<journal-title>Frontiers in Food Science and Technology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Food. Sci. Technol.</abbrev-journal-title>
<issn pub-type="epub">2674-1121</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">979028</article-id>
<article-id pub-id-type="doi">10.3389/frfst.2022.979028</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Food Science and Technology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Data mining for prediction and interpretation of bacterial population behavior in food</article-title>
<alt-title alt-title-type="left-running-head">Hosoe et al.</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/frfst.2022.979028">10.3389/frfst.2022.979028</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Hosoe</surname>
<given-names>Junpei</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="fn" rid="fn1">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1909849/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Sunagawa</surname>
<given-names>Junya</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="fn" rid="fn1">
<sup>&#x2020;</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1885611/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Nakaoka</surname>
<given-names>Shinji</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2014491/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Koseki</surname>
<given-names>Shige</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/665377/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Koyama</surname>
<given-names>Kento</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/682791/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Graduate School of Agricultural Science</institution>, <institution>Hokkaido University Kita-9</institution>, <addr-line>Sapporo</addr-line>, <country>Japan</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Graduate School of Life Science</institution>, <institution>Hokkaido University Kita-10 Nishi-8</institution>, <addr-line>Sapporo</addr-line>, <country>Japan</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1086791/overview">Gopalan Sivaraman</ext-link>, Central Institute of Fisheries Technology (ICAR), India</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/489202/overview">Qingli Dong</ext-link>, University of Shanghai for Science and Technology, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1029807/overview">Donald W. Schaffner</ext-link>, Rutgers, The State University of New Jersey, United States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Kento Koyama, <email>kkoyama@agr.hokudai.ac.jp</email>
</corresp>
<fn fn-type="equal" id="fn1">
<label>
<sup>&#x2020;</sup>
</label>
<p>These authors have contributed equally to this work</p>
</fn>
<fn fn-type="other">
<p>This article was submitted to Food Safety and Quality Control, a section of the journal Frontiers in Food Science and Technology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>15</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>2</volume>
<elocation-id>979028</elocation-id>
<history>
<date date-type="received">
<day>27</day>
<month>06</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>12</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Hosoe, Sunagawa, Nakaoka, Koseki and Koyama.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Hosoe, Sunagawa, Nakaoka, Koseki and Koyama</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Although bacterial population behavior has been investigated in a variety of foods in the past 40&#xa0;years, it is difficult to obtain desired information from the mere juxtaposition of experimental data. We predicted the changes in the number of bacteria and visualized the effects of pH, a<sub>w</sub>, and temperature using a data mining approach. Population growth and inactivation data on eight pathogenic and food spoilage bacteria under 5,025 environmental conditions were obtained from the ComBase database (<ext-link ext-link-type="uri" xlink:href="http://www.combase.cc/">www.combase.cc</ext-link>), including 15 food categories, and temperatures ranging from 0&#xb0;C to 25&#xb0;C. The eXtreme gradient boosting tree was used to predict population behavior. The root mean square error of the observed and predicted values was 1.23 log CFU/g. The data mining model extracted the growth inhibition for the investigated bacteria against a<sub>w</sub>, temperature, and pH using the SHapley Additive eXplanations value. A data mining approach provides information concerning bacterial population behavior and how food ecosystems affect bacterial growth and inactivation.</p>
</abstract>
<kwd-group>
<kwd>databases</kwd>
<kwd>predictive microbiology</kwd>
<kwd>data-driven methods</kwd>
<kwd>SHapley Additive</kwd>
<kwd>population behavior</kwd>
</kwd-group>
<contract-sponsor id="cn001">Kieikai Research Foundation<named-content content-type="fundref-id">10.13039/501100012014</named-content>
</contract-sponsor>
<contract-sponsor id="cn002">TOBE MAKI Scholarship Foundation<named-content content-type="fundref-id">10.13039/501100012631</named-content>
</contract-sponsor>
<contract-sponsor id="cn003">Japan Society for the Promotion of Science<named-content content-type="fundref-id">10.13039/501100001691</named-content>
</contract-sponsor>
<contract-sponsor id="cn004">Japan Society for the Promotion of Science<named-content content-type="fundref-id">10.13039/501100001691</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Different types of microorganisms are present in food. Some of these cause foodborne illness and food spoilage. To control food pathogens and spoilage bacteria, various preservation techniques have been developed to prevent harmful bacteria from growing during processing, distribution, and storage. Many factors influence the microbial response in food ecosystems. For instance, temperature, pH, water activity (a<sub>w</sub>), antimicrobial additives, and gas components can affect bacterial population behavior (<xref ref-type="bibr" rid="B31">Leistner, 2000</xref>; <xref ref-type="bibr" rid="B14">Doyle et al., 2019</xref>), even though the effects on bacterial growth and inactivation vary by bacterial species or genus. Adjusting the various environmental conditions in food enables the suppression of bacterial growth and food spoilage (<xref ref-type="bibr" rid="B16">Gould, 1996</xref>; <xref ref-type="bibr" rid="B31">Leistner, 2000</xref>). Thus, appropriate microbiological control can help prevent food loss and improve food safety.</p>
<p>To quantify and evaluate bacterial growth for control purposes, many studies on microbiological responses in food have been conducted since Roberts &#x26; Jarvis (1983) introduced predictive microbiology, which originated from the research by Bigelow (1921), Bigelow &#x26; Esty (1920), and Esty &#x26; Meyer (1922). Experimental data on bacterial growth and inactivation were obtained by counting the number of colonies on culture plates as viable cell counts or by measuring optical density as cell density over time under controlled conditions, such as temperature, pH, and a<sub>w</sub>. Microbial responses in food have been explained by mathematical models, the main explanatory variables of which are temperature, pH, and a<sub>w</sub> (<xref ref-type="bibr" rid="B47">Ross and McMeekin, 1994</xref>; <xref ref-type="bibr" rid="B21">Jagannath and Tsuchido, 2003</xref>). Over the last 40 years, experimental data on microbial responses to the food environment have been collected by research institutions, universities, and companies according to their objectives. The accumulated data are stored in databases such as the ComBase database (<ext-link ext-link-type="uri" xlink:href="http://www.combase.cc/">www.combase.cc</ext-link>), which was developed to provide easy access to microbiological data in research establishments and publications produced by different laboratories (<xref ref-type="bibr" rid="B4">Baranyi et al., 2004</xref>). Currently, assessing product safety requires considerable effort to collect data on similar species or conditions from the literature or databases. A comprehensive statistical analysis is needed to understand bacterial population behavior across foods and bacteria.</p>
<p>Studies have been conducted to understand the global trends of microbial responses from the accumulated data. In predictive microbiology, studies have used meta-analysis, which is a method to amalgamate, summarize, or review previous quantitative research for identifying trends with a statistical model of specific foods and bacteria [e.g., evaluating inactivation of <italic>Escherichia coli</italic> in fermented meat (<xref ref-type="bibr" rid="B37">McQuestin et al., 2009</xref>), meta-analysis for quantitative microbiological risk assessments and benchmarking data (<xref ref-type="bibr" rid="B12">den Besten and Zwietering, 2012</xref>), growth and inactivation of <italic>Listeria monocytogenes</italic> in milk or non-thermal inactivation of <italic>Listeria monocytogenes</italic> in fermented sausages (<xref ref-type="bibr" rid="B35">Mataragas et al., 2015</xref>)]. Statistical models generally require analysts to specify the functional form between explanatory variables and response variables (<xref ref-type="bibr" rid="B20">Hochachka et al., 2007</xref>). Studies using meta-analysis have provided results that suit some objectives, such as trends in maximum growth rate or bacterial growth/inactivation by foods. However, a meta-analysis with statistical models alone is not necessarily systematic and tends to be fragmentary in terms of cross-food/bacteria analysis. Bacterial growth is affected by not only factors such as temperature, pH, and a<sub>w</sub>, but also cell density (<xref ref-type="bibr" rid="B28">Koutsoumanis and Sofos, 2005</xref>; <xref ref-type="bibr" rid="B49">Skandamis et al., 2007</xref>; <xref ref-type="bibr" rid="B5">Bidlas et al., 2008</xref>), characteristics of each food, and gaseous atmosphere (<xref ref-type="bibr" rid="B14">Doyle et al., 2019</xref>). Identifying complex relationships between food and bacteria requires the development of complex mathematical formulas with high-dimensional variables. 
To predict bacterial population changes and to explore the influence of each factor on microbial response using big data, a non-parametric approach, which does not require a hypothesis to be developed from domain knowledge, is considered useful (<xref ref-type="bibr" rid="B13">Deringer et al., 2021</xref>).</p>
<p>Data mining is an effective method for analyzing large amounts of accumulated data. Data mining is the secondary analysis of a large database to identify and interpret hidden patterns (<xref ref-type="bibr" rid="B18">Hand, 1998</xref>). The recent accumulation of big data has promoted the development of databases in various fields. Consequently, data mining has been employed in many fields such as agriculture (<xref ref-type="bibr" rid="B10">Cortez et al., 2009</xref>; <xref ref-type="bibr" rid="B17">Gulyaeva et al., 2020</xref>), ecology (<xref ref-type="bibr" rid="B20">Hochachka et al., 2007</xref>; <xref ref-type="bibr" rid="B46">Ross et al., 2018</xref>), healthcare and medicine (<xref ref-type="bibr" rid="B8">Cios and William Moore, 2002</xref>; <xref ref-type="bibr" rid="B11">Delen et al., 2005</xref>; <xref ref-type="bibr" rid="B26">Koh and Tan, 2005</xref>; <xref ref-type="bibr" rid="B38">Mohanty et al., 2022</xref>), and food quality (<xref ref-type="bibr" rid="B10">Cortez et al., 2009</xref>; <xref ref-type="bibr" rid="B24">Jim&#xe9;nez-Carvelo et al., 2019</xref>; <xref ref-type="bibr" rid="B42">Nychas et al., 2021</xref>). Using machine learning, the relationship between the response and function can be determined empirically from the data. This approach can discover new knowledge of these patterns. To the best of our knowledge, only one data mining study has been conducted in the field of predicting bacterial population behavior. <xref ref-type="bibr" rid="B19">Hiura et al. (2021)</xref> predicted the bacterial behavior of <italic>Listeria monocytogenes</italic> using the ComBase database of microbial responses to food environments. The ComBase database contains information about bacteria and foods, such as the name of the bacterial genus or species and the category or name of the medium or food. 
Such information enables us to extract the comprehensive characteristics of bacterial responses, such as the association between food ecosystems and bacterial population changes, by analyzing the data reported in previous studies. In this manner, exploring overall trends in how environmental conditions relate to microbial growth and inactivation across the accumulated data would be advantageous for comparing and evaluating population changes in various bacteria and different foods.</p>
<p>In the present study, the objective was not only to develop a single machine learning model for predicting the population behavior of food-related bacteria in various kinds of food, but also to visualize the effects of pH, a<sub>w</sub>, and temperature using a data mining approach. Data regarding the change in viable cell number over time were used for eight foodborne and food spoilage bacteria: &#x201c;<italic>Aeromonas hydrophila</italic>,&#x201d; &#x201c;<italic>Bacillus cereus</italic>,&#x201d; &#x201c;<italic>Escherichia coli</italic>,&#x201d; &#x201c;<italic>Listeria monocytogenes</italic>,&#x201d; &#x201c;<italic>Pseudomonas</italic> spp.,&#x201d; &#x201c;<italic>Staphylococcus aureus</italic>,&#x201d; &#x201c;<italic>Salmonella</italic> spp.,&#x201d; and &#x201c;<italic>Yersinia enterocolitica</italic>.&#x201d; The microbial responses to the food environment were collected from the ComBase database. The collected data included population behavior in 15 food categories (&#x201c;beef,&#x201d; &#x201c;culture medium,&#x201d; &#x201c;pork,&#x201d; &#x201c;poultry,&#x201d; &#x201c;seafood/fish,&#x201d; &#x201c;vegetable or fruit and their products,&#x201d; &#x201c;water,&#x201d; &#x201c;dessert food,&#x201d; &#x201c;milk,&#x201d; &#x201c;sausage,&#x201d; &#x201c;cheese,&#x201d; &#x201c;eggs and egg product,&#x201d; &#x201c;juice and beverage,&#x201d; &#x201c;sauce/dressing,&#x201d; and &#x201c;bread&#x201d;) with temperatures ranging from 0&#xb0;C to 25&#xb0;C. Data mining and machine learning approaches provide information concerning population behavior and how food ecosystems affect it.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>2 Materials and methods</title>
<sec id="s2-1">
<title>2.1 Data selection from ComBase database</title>
<p>The ComBase database contains quantified microbial responses to food with approximately 60,000 records collected from various research establishments and publications. Changes in the bacterial density over time were recorded for each experimental condition. The dataset of a change in bacterial density over time in ComBase contains &#x201c;Record ID,&#x201d; &#x201c;Organism,&#x201d; &#x201c;Food category,&#x201d; &#x201c;Food name,&#x201d; &#x201c;Temperature,&#x201d; &#x201c;pH,&#x201d; &#x201c;a<sub>w</sub>,&#x201d; &#x201c;Conditions,&#x201d; &#x201c;Time,&#x201d; and &#x201c;Viable cell counts&#x201d;. Each dataset of changes in a bacterial population is assigned a &#x201c;Record ID,&#x201d; which allows us to recognize one series of experiments on population behavior.</p>
<p>In this study, we investigated changes in the populations of eight pathogenic and food spoilage bacteria: <italic>A. hydrophila</italic>, <italic>B. cereus</italic>, <italic>E. coli</italic>, <italic>L. monocytogenes</italic>, <italic>Pseudomonas</italic> spp. (<italic>Pseudomonads</italic>), <italic>S. aureus</italic>, <italic>Salmonella</italic> spp. (<italic>Salmonella</italic>), and <italic>Y. enterocolitica</italic>. These bacteria are known to cause food spoilage and foodborne illnesses. Fifteen food categories were included: &#x201c;beef,&#x201d; &#x201c;culture medium,&#x201d; &#x201c;pork,&#x201d; &#x201c;poultry,&#x201d; &#x201c;seafood/fish,&#x201d; &#x201c;vegetable or fruit and their products,&#x201d; &#x201c;water,&#x201d; &#x201c;dessert food,&#x201d; &#x201c;milk,&#x201d; &#x201c;sausage,&#x201d; &#x201c;cheese,&#x201d; &#x201c;eggs and egg product,&#x201d; &#x201c;juice and beverage,&#x201d; &#x201c;sauce/dressing,&#x201d; and &#x201c;bread.&#x201d; The data used for model development and evaluation were those with temperatures ranging from 0&#xb0;C to 25&#xb0;C and containing at least four observed values in each series of experiments on bacterial population behavior. In addition, records for which viable counts at 0&#xa0;h were unavailable were excluded, because the objective values <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> cannot be calculated. Records containing preservatives such as acetic acid, lactic acid, nitrite, and sorbic acid were also excluded. In total, 9,091 records of bacterial population behavior were extracted from ComBase and 101,861 viable count data were used. <xref ref-type="table" rid="T1">Table 1</xref> summarizes the data selected for this study. The entire &#x201c;Record ID&#x201d; list extracted from ComBase is available online in <xref ref-type="sec" rid="s11">Supplementary Data S1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Summary of the extracted data from ComBase.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Microorganisms</th>
<th align="left">Temperature (&#xb0;C)</th>
<th align="left">pH</th>
<th align="left">Water activity (a<sub>w</sub>)</th>
<th align="left">Number of food categories</th>
<th align="left">Number of environmental IDs</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">
<italic>Aeromonas hydrophila</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">4.0&#x2013;8.0</td>
<td align="left">0.957&#x2013;0.997</td>
<td align="left">7</td>
<td align="left">618</td>
</tr>
<tr>
<td align="left">
<italic>Bacillus cereus</italic>
</td>
<td align="left">1&#x2013;25</td>
<td align="left">4.5&#x2013;8.2</td>
<td align="left">0.911&#x2013;0.997</td>
<td align="left">8</td>
<td align="left">620</td>
</tr>
<tr>
<td align="left">
<italic>Escherichia coli</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">3.2&#x2013;8.5</td>
<td align="left">0.190&#x2013;0.999</td>
<td align="left">11</td>
<td align="left">532</td>
</tr>
<tr>
<td align="left">
<italic>Listeria monocytogenes</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">3.5&#x2013;8.0</td>
<td align="left">0.750&#x2013;0.999</td>
<td align="left">13</td>
<td align="left">1,192</td>
</tr>
<tr>
<td align="left">
<italic>Pseudomonads</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">4.0&#x2013;7.4</td>
<td align="left">0.954&#x2013;0.997</td>
<td align="left">7</td>
<td align="left">406</td>
</tr>
<tr>
<td align="left">
<italic>Staphylococcus aureus</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">3.9&#x2013;8.0</td>
<td align="left">0.880&#x2013;0.997</td>
<td align="left">9</td>
<td align="left">228</td>
</tr>
<tr>
<td align="left">
<italic>Salmonella</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">3.2&#x2013;8.9</td>
<td align="left">0.300&#x2013;0.998</td>
<td align="left">14</td>
<td align="left">588</td>
</tr>
<tr>
<td align="left">
<italic>Yersinia enterocolitica</italic>
</td>
<td align="left">0&#x2013;25</td>
<td align="left">3.4&#x2013;10</td>
<td align="left">0.846&#x2013;0.999</td>
<td align="left">6</td>
<td align="left">841</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s2-2">
<title>2.2 Data preprocessing</title>
<p>In the present study, the change ratio of viable counts was set as the objective variable so that both increases and decreases in the bacterial population could be evaluated. For each Record ID, the cell concentration was transformed into the common logarithm of the ratio of viable counts to the initial cell number <inline-formula id="inf2">
<mml:math id="m2">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> defined in Eq. <xref ref-type="disp-formula" rid="e1">1</xref>:<disp-formula id="e1">
<mml:math id="m3">
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>log</mml:mi>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<p>where <inline-formula id="inf3">
<mml:math id="m4">
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf4">
<mml:math id="m5">
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are viable cell concentrations (log colony forming unit (CFU)/g) when the storage time is <inline-formula id="inf5">
<mml:math id="m6">
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> (h) and the logarithm of the initial cell concentration (log CFU/g), respectively. We used <inline-formula id="inf6">
<mml:math id="m7">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> as the objective variable. Eight types of explanatory variables were included: &#x201c;Time (h),&#x201d; &#x201c;Temperature (&#xb0;C),&#x201d; &#x201c;pH,&#x201d; &#x201c;a<sub>w</sub>,&#x201d; &#x201c;Initial cell number (log CFU/g),&#x201d; &#x201c;Food category,&#x201d; &#x201c;Food name,&#x201d; and &#x201c;Organism.&#x201d; The data included both numerical and categorical data. &#x201c;Time,&#x201d; &#x201c;Temperature,&#x201d; &#x201c;pH,&#x201d; &#x201c;a<sub>w</sub>,&#x201d; and &#x201c;Initial cell number&#x201d; were numerical data, which were used without modification for model development. The viable cell concentration at 0&#xa0;h was used as the initial cell number for each record ID. Furthermore, because food category, food name, and organism are categorical variables, they were replaced with integer-coded dummy variables, which is a common technique in models based on decision trees (<xref ref-type="bibr" rid="B19">Hiura et al., 2021</xref>). The 15 food categories were converted to the integers 1&#x2013;15, the 261 food names to 1&#x2013;261, and the eight organisms to 1&#x2013;8. Each series of experimental results in the data acquired from ComBase is identified by its &#x201c;Record ID.&#x201d; In the original dataset, some records have different Record IDs but identical experimental conditions (&#x201c;Temperature,&#x201d; &#x201c;pH,&#x201d; &#x201c;a<sub>w</sub>,&#x201d; &#x201c;Food category,&#x201d; &#x201c;Food name,&#x201d; and &#x201c;Organism&#x201d;). To unify the experimental conditions, we replaced &#x201c;Record ID&#x201d; with &#x201c;Environmental ID,&#x201d; which prevents the same experimental condition from appearing in both the training and test datasets.
The record IDs for which temperature, pH, a<sub>w</sub>, food category, food name, and organism were the same were regarded as the results of experiments conducted through different repetitions under the same conditions, and the same &#x201c;Environmental ID&#x201d; was reassigned as the result of a single experimental condition. Thus, 9,091 record IDs were assigned to 5,025 environmental IDs. In total, 101,861 observed plots were investigated. In the test dataset, the number of environmental conditions used was 542 and the number of observed plots used was 11,106.</p>
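<p>The preprocessing described above can be illustrated with a minimal sketch. It is not the authors&#x2019; code: the dictionary layout, the field names, the helper functions, and all numerical values are assumptions made for illustration only. The sketch assigns a shared environmental ID to records with identical conditions and computes the objective variable of Eq. 1 from log-scale counts.</p>
<preformat preformat-type="code"><![CDATA[
```python
# Condition fields that define one "Environmental ID" (field names are illustrative).
CONDITION_KEYS = ("temperature", "ph", "aw", "food_category", "food_name", "organism")


def assign_environmental_ids(records):
    """Map each record ID to an integer shared by records with identical conditions."""
    env_ids = {}
    record_to_env = {}
    for rec in records:
        key = tuple(rec[name] for name in CONDITION_KEYS)
        if key not in env_ids:
            env_ids[key] = len(env_ids) + 1
        record_to_env[rec["record_id"]] = env_ids[key]
    return record_to_env


def build_rows(records):
    """Return one row per observation with log10(N_t / N_0) as the target."""
    record_to_env = assign_environmental_ids(records)
    rows = []
    for rec in records:
        curve = sorted(rec["curve"])  # (time in h, log10 CFU/g) pairs
        # Exclusions from Section 2.1: fewer than four points, or no count at 0 h.
        if len(curve) < 4 or curve[0][0] != 0.0:
            continue
        log_n0 = curve[0][1]
        env_id = record_to_env[rec["record_id"]]
        for time_h, log_nt in curve:
            rows.append({
                "environmental_id": env_id,
                "time_h": time_h,
                "initial_log_count": log_n0,
                "log_ratio": log_nt - log_n0,  # Eq. 1: log N_t - log N_0
            })
    return rows


def example_record(record_id, temperature, curve):
    """Hypothetical record; every value here is invented for illustration."""
    return {"record_id": record_id, "organism": "Listeria monocytogenes",
            "food_category": "milk", "food_name": "whole milk",
            "temperature": temperature, "ph": 6.5, "aw": 0.997, "curve": curve}


records = [
    example_record("A", 10.0, [(0.0, 3.0), (24.0, 3.5), (48.0, 4.25), (72.0, 5.0)]),
    example_record("B", 10.0, [(0.0, 4.0), (24.0, 4.5), (48.0, 5.0), (72.0, 5.5)]),
    example_record("C", 4.0, [(0.0, 3.0), (48.0, 3.0), (96.0, 2.5), (144.0, 2.0)]),
    example_record("D", 4.0, [(24.0, 3.0), (48.0, 3.0), (96.0, 2.5), (144.0, 2.0)]),
]
rows = build_rows(records)
```
]]></preformat>
<p>In this toy input, records A and B share one environmental ID because their conditions are identical, and record D is dropped because it has no count at 0&#xa0;h.</p>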
</sec>
<sec id="s2-3">
<title>2.3 Model development</title>
<sec id="s2-3-1">
<title>2.3.1 eXtreme gradient-boosting tree (XGBoost) model</title>
<p>XGBoost was first proposed by <xref ref-type="bibr" rid="B7">Chen and Guestrin (2016)</xref>. XGBoost extends the concept of the Gradient Boosting Decision Tree (GBDT). The GBDT is an iterative algorithm that combines multiple decision trees (<xref ref-type="bibr" rid="B15">Friedman, 2001</xref>). It is a tree-based ensemble technique that uses a decision tree as the base model; gradient boosting trains the ensemble sequentially, with each added base model correcting the errors of the preceding trees. The GBDT method has been widely employed in machine learning and data mining studies (<xref ref-type="bibr" rid="B6">Chang et al., 2018</xref>; <xref ref-type="bibr" rid="B41">Nguyen et al., 2019</xref>; <xref ref-type="bibr" rid="B45">Rodrigo et al., 2021</xref>; <xref ref-type="bibr" rid="B48">Shehadeh et al., 2021</xref>). XGBoost was used in the present study because it requires little feature engineering: it handles missing values without specific processing and accepts variables without normalization or scaling (<xref ref-type="bibr" rid="B51">Wang et al., 2020</xref>; <xref ref-type="bibr" rid="B38">Mohanty et al., 2022</xref>). The XGBoost models were built using the XGBoost (Version 1.5.0) Python Package (<ext-link ext-link-type="uri" xlink:href="https://xgboost.readthedocs.io/en/latest/python/index.html">https://xgboost.readthedocs.io/en/latest/python/index.html</ext-link>).</p>
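<p>The sequential error-correcting idea behind the GBDT can be sketched in a few lines. The following is a simplified illustration of plain gradient boosting with one-split trees (stumps) on a single feature under squared-error loss; it is not XGBoost, which additionally uses regularization and second-order gradient information, and the function names and toy data are assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def fit_stump(x, residuals):
    """Fit the best one-split regression tree on a single feature (squared error)."""
    best = None
    values = sorted(set(x))
    for left, right in zip(values, values[1:]):
        threshold = (left + right) / 2
        lo = [r for xi, r in zip(x, residuals) if xi <= threshold]
        hi = [r for xi, r in zip(x, residuals) if xi > threshold]
        lo_mean = sum(lo) / len(lo)
        hi_mean = sum(hi) / len(hi)
        sse = (sum((r - lo_mean) ** 2 for r in lo)
               + sum((r - hi_mean) ** 2 for r in hi))
        if best is None or sse < best[0]:
            best = (sse, threshold, lo_mean, hi_mean)
    _, threshold, lo_mean, hi_mean = best
    return lambda xi: lo_mean if xi <= threshold else hi_mean


def gradient_boost(x, y, n_rounds=50, learning_rate=0.3):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(y) / len(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Toy data (invented): a curved response that a single stump cannot capture.
x = [float(i) for i in range(20)]
y = [xi ** 2 for xi in x]
model = gradient_boost(x, y)
rmse_mean_only = rmse(y, [sum(y) / len(y)] * len(y))
rmse_boosted = rmse(y, [model(xi) for xi in x])
```
]]></preformat>
<p>On the training data, the boosted ensemble reduces the error relative to predicting the mean alone, because each stump removes part of the remaining residual.</p>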
</sec>
<sec id="s2-3-2">
<title>2.3.2 Modeling procedure</title>
<p>We aimed to develop a machine learning model for predicting bacterial responses to various food environments, characterized by controlling factors such as temperature, pH, and a<sub>w</sub>. Eight input variables, comprising five numerical variables (temperature (&#xb0;C), pH, a<sub>w</sub>, time (h), and initial cell number (log CFU/g)) and three categorical variables (food category, food name, and organism), were used to develop a model to predict the change ratio of a bacterial population.</p>
<p>The imbalanced dataset was divided into training and test datasets in several steps. First, the whole dataset was separated by &#x201c;Microorganisms.&#x201d; Second, each subset was further separated by &#x201c;Food category.&#x201d; Third, the data within each combination of &#x201c;Microorganisms&#x201d; and &#x201c;Food category&#x201d; were randomly divided at a ratio of 9:1 such that no experimental condition appeared in both the training and test datasets. The training dataset was used to build a model for predicting bacterial responses to various food environments and to optimize its hyperparameters. The test dataset was used to evaluate the performance of the tuned model.</p>
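<p>The three splitting steps can be sketched as follows. This is a minimal illustration rather than the authors&#x2019; implementation: the function name, the input layout, the random seed, and the rounding rule for small groups (integer division, so groups with fewer than ten environmental IDs contribute no test IDs) are all assumptions.</p>
<preformat preformat-type="code"><![CDATA[
```python
import random
from collections import defaultdict


def split_environmental_ids(env_table, test_one_in=10, seed=0):
    """Split environmental IDs 9:1 within each (organism, food category) group.

    env_table maps environmental_id -> (organism, food_category). Because whole
    environmental IDs are assigned to one side, no condition appears in both sets.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for env_id, stratum in sorted(env_table.items()):
        strata[stratum].append(env_id)

    train_ids, test_ids = set(), set()
    for stratum in sorted(strata):
        ids = strata[stratum]
        rng.shuffle(ids)
        n_test = len(ids) // test_one_in  # assumed rounding rule
        test_ids.update(ids[:n_test])
        train_ids.update(ids[n_test:])
    return train_ids, test_ids


# Invented example: 20 beef and 10 milk conditions for one organism.
env_table = {i: ("Escherichia coli", "beef") for i in range(1, 21)}
env_table.update({i: ("Escherichia coli", "milk") for i in range(21, 31)})
train_ids, test_ids = split_environmental_ids(env_table)
```
]]></preformat>
<p>Rows of the full dataset would then be routed to the training or test set according to their environmental ID.</p>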
<p>Prior to training the prediction model, the hyperparameters of the XGBoost model used in this study were determined by 5-fold cross-validation and a grid search. Cross-validation evaluates the model performance using only the training dataset under a given hyperparameter set. It is intended to avoid overfitting, which deteriorates the performance on unknown data (i.e., the test dataset). In this method, the training dataset was divided into five folds (four folds of training data and one fold of validation data); the training data were used to train a model, and the validation data were used to verify its performance. This validation cycle was repeated, with each fold serving in turn as the validation data, to validate the performance of the model. A grid search was conducted by selecting each hyperparameter value from a pre-defined range, and the highest-performing (i.e., optimal) hyperparameters were thus determined. The XGBoost model hyperparameters were searched over pre-defined ranges (<xref ref-type="sec" rid="s11">Supplementary Table S1</xref>) and optimized as follows: a maximum tree depth of 9, min_child_weight of 1, gamma of 0.3, subsample of 0.6, colsample_bytree of 0.6, and reg_alpha of 100.</p>
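<p>The combination of k-fold cross-validation and grid search can be sketched generically. In the following illustration, a small nearest-neighbour regressor stands in for XGBoost so that the example stays self-contained; the stand-in model, the grid, the fold construction, and the toy data are assumptions and do not reproduce the search ranges in Supplementary Table S1.</p>
<preformat preformat-type="code"><![CDATA[
```python
import itertools
import math
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and deal them into n_folds validation folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validated_rmse(fit, X, y, params, folds):
    """Average validation RMSE over folds for one hyperparameter set."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train], **params)
        squared = [(predict(X[i]) - y[i]) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return sum(scores) / len(scores)


def grid_search(fit, X, y, grid, n_folds=5, seed=0):
    """Try every combination in the grid; return (best RMSE, best parameters)."""
    folds = k_fold_indices(len(X), n_folds, seed)
    best = None
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = cross_validated_rmse(fit, X, y, params, folds)
        if best is None or score < best[0]:
            best = (score, params)
    return best


def fit_knn(X_train, y_train, k):
    """Stand-in model (not XGBoost): mean of the k nearest training targets."""
    def predict(x):
        order = sorted(range(len(X_train)), key=lambda i: abs(X_train[i] - x))
        nearest = order[:k]
        return sum(y_train[i] for i in nearest) / len(nearest)
    return predict


# Toy data (invented): one feature with a linear response.
X = [i / 10 for i in range(50)]
y = [2 * xi for xi in X]
grid = {"k": [1, 3, 5]}
best_score, best_params = grid_search(fit_knn, X, y, grid)
```
]]></preformat>
<p>With a real model, <monospace>fit</monospace> would wrap the model&#x2019;s training call, and the grid would hold one list of candidate values per hyperparameter. Whether folds should also be grouped by environmental ID is a design choice that this sketch leaves open.</p>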
</sec>
</sec>
<sec id="s2-4">
<title>2.4 Evaluation of model accuracy</title>
<p>The prediction accuracy of the developed model was evaluated using the test dataset, which comprised 542 environmental IDs that were not used in model development. The coefficient of determination (<italic>R</italic>
<sup>2</sup>) and root mean square error (RMSE) were calculated for all test data, for each organism, and for each food category as indices of model accuracy. The <italic>R</italic>
<sup>2</sup> and RMSE values are given by Eqs <xref ref-type="disp-formula" rid="e2">2</xref>, <xref ref-type="disp-formula" rid="e3">3</xref>, respectively:<disp-formula id="e2">
<mml:math id="m8">
<mml:mrow>
<mml:msup>
<mml:mi>R</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m9">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:munderover>
<mml:mstyle displaystyle="true">
<mml:mo>&#x2211;</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")" separators="|">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>where <inline-formula id="inf7">
<mml:math id="m10">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, <inline-formula id="inf8">
<mml:math id="m11">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and <inline-formula id="inf9">
<mml:math id="m12">
<mml:mrow>
<mml:msub>
<mml:mi>&#x3bc;</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> are the <inline-formula id="inf10">
<mml:math id="m13">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> th observed <inline-formula id="inf11">
<mml:math id="m14">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, the <inline-formula id="inf12">
<mml:math id="m15">
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> th predicted <inline-formula id="inf13">
<mml:math id="m16">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, and the average observed <inline-formula id="inf14">
<mml:math id="m17">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, respectively. Each evaluation metric was calculated using the Scikit-learn (version 1.0.1) Python package.</p>
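<p>For reference, Eqs 2, 3 can be written directly in a few lines of standard-library Python. The sketch below is an illustration of the two formulas rather than the Scikit-learn calls used in this study, and the observed and predicted values are hypothetical.</p>
<preformat><![CDATA[
```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination, Eq. 2."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, Eq. 3."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n)


# Hypothetical observed and predicted log(Nt/N0) values.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.0, 2.0, 2.5]
r2 = r_squared(observed, predicted)
err = rmse(observed, predicted)
```
]]></preformat>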
</sec>
<sec id="s2-5">
<title>2.5 Two-dimensional (2D) plot visualization of bacterial behaviors</title>
<p>Using the developed model, the <inline-formula id="inf15">
<mml:math id="m18">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> after 10 days in broth was predicted at various pH, a<sub>w</sub>, and temperature values for an initial cell count of 4 log CFU/g. To visualize the microbial response to various environments, the predicted <inline-formula id="inf16">
<mml:math id="m19">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> at 10 days was divided into four levels, &#x201c;strongly increased&#x201d; (change &#x3e; 3-log cycle), &#x201c;increased&#x201d; (change of 2 &#xb1; 1 log cycle), &#x201c;survival&#x201d; (change of 0 &#xb1; 1 log cycle), and &#x201c;decreased&#x201d; (change of &#x2212;2 &#xb1; 1 log cycle). We then plotted the responses as 2D color maps, to obtain three types of maps, pH&#x2013;a<sub>w</sub>, pH&#x2013;temperature, and temperature&#x2013;a<sub>w</sub>. To confirm the validity of this 2D plot, we predicted the <inline-formula id="inf17">
<mml:math id="m20">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> and visualized it under some experimental conditions reported in previous studies.</p>
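<p>The four-level classification used for the color maps can be sketched as follows. The text does not specify how values exactly on a class boundary or below &#x2212;3 log cycles are handled, so the choices in classify_log_change are assumptions; toy_predict is a hypothetical stand-in for the trained model and carries no fitted parameters.</p>
<preformat><![CDATA[
```python
def classify_log_change(log_ratio):
    """Map a predicted log(Nt/N0) onto the four response levels.

    Assumption: class boundaries sit at -1, 1 and 3 log cycles, with
    boundary values assigned to the class nearer zero. Values below
    -3 are not defined in the text and are labeled "decreased" here.
    """
    if log_ratio > 3:
        return "strongly increased"
    if log_ratio > 1:
        return "increased"
    if log_ratio >= -1:
        return "survival"
    return "decreased"


def build_color_map(predict, ph_values, aw_values, temperature):
    """Classify predictions on a pH x aw grid at a fixed temperature."""
    return [[classify_log_change(predict(ph, aw, temperature))
             for ph in ph_values] for aw in aw_values]


def toy_predict(ph, aw, temperature):
    """Hypothetical stand-in for the trained model (not a fitted model)."""
    return 40.0 * (aw - 0.92) - 0.8 * abs(ph - 7.0) + 0.05 * (temperature - 10.0)


ph_grid = [4.0, 5.0, 6.0, 7.0]
aw_grid = [0.90, 0.95, 0.99]
grid = build_color_map(toy_predict, ph_grid, aw_grid, temperature=20.0)
```
]]></preformat>
<p>Each cell of the resulting grid corresponds to one colored square in a pH&#x2013;a<sub>w</sub> map; the other two map types follow by fixing a different variable.</p>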
<p>We then compared our 2D color map with literature data from growth/no-growth experiments that were not recorded in ComBase. The data used for external validation were selected such that the experimental conditions were simple enough for bacterial behavior to be described by the eight explanatory variables. As a representative example, we used a study of growth/no-growth experiments on <italic>L. monocytogenes</italic> in broth (<xref ref-type="bibr" rid="B27">Koutsoumanis et al., 2004</xref>). The growth/no-growth responses of <italic>L. monocytogenes</italic> were experimentally observed in a culture medium after 30 days at 25&#xa0;&#xb0;C (a), at pH 5.47&#x2013;5.58 (b), and at a<sub>w</sub> of 0.965&#x2013;0.967 (c).</p>
</sec>
<sec id="s2-6">
<title>2.6 Interpretation of machine learning model</title>
<sec id="s2-6-1">
<title>2.6.1 Feature importance</title>
<p>The feature importance was calculated to interpret the developed model from the process of model development. This allowed us to understand how each explanatory variable contributed to the predicted performance during the training of the XGBoost algorithm. The importance of the features was evaluated using gain, which is an index that shows the usefulness of a feature in constructing a tree-based model. A higher value indicates that the feature significantly affects the predicted <inline-formula id="inf18">
<mml:math id="m21">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. Feature importance was calculated using the XGBoost Python package (<ext-link ext-link-type="uri" xlink:href="https://xgboost.readthedocs.io/en/latest/python/python_api.html">https://xgboost.readthedocs.io/en/latest/python/python_api.html</ext-link>).</p>
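<p>A minimal sketch of how gain-based importance can be reported on a relative scale is given below, assuming that the raw gain scores are simply rescaled to sum to 1. The raw values are hypothetical and are not taken from the developed model.</p>
<preformat><![CDATA[
```python
def normalize_importance(raw_gain):
    """Rescale raw gain scores so that they sum to 1."""
    total = sum(raw_gain.values())
    return {name: gain / total for name, gain in raw_gain.items()}


# Hypothetical raw gain values, not taken from the article.
raw_gain = {"Time": 30.0, "aw": 30.0, "pH": 20.0, "Temperature": 20.0}
relative = normalize_importance(raw_gain)
# Rank features from most to least important.
ranked = sorted(relative, key=relative.get, reverse=True)
```
]]></preformat>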
</sec>
<sec id="s2-6-2">
<title>2.6.2 SHapley Additive eXplanations (SHAP) value</title>
<p>As another approach for model interpretation, we used the SHAP framework proposed by <xref ref-type="bibr" rid="B33">Lundberg and Lee (2017)</xref>. SHAP is a new and flexible method that addresses the machine learning system as a so-called &#x201c;black-box model&#x201d; by providing an interpretation of how strongly the features affect the predicted outcome. Although the feature importance employed by XGBoost is positioned as the global explanation of the model, SHAP can directly measure a local feature explanation for a single sample, which could otherwise go unnoticed (<xref ref-type="bibr" rid="B39">Moncada-Torres et al., 2021</xref>). Because SHAP approaches are model-agnostic, they are used with various model types and in many fields of study (<xref ref-type="bibr" rid="B33">Lundberg and Lee, 2017</xref>; <xref ref-type="bibr" rid="B2">Agius et al., 2020</xref>; <xref ref-type="bibr" rid="B34">Mangalathu et al., 2020</xref>; <xref ref-type="bibr" rid="B40">Ndraha et al., 2021</xref>; <xref ref-type="bibr" rid="B45">Rodrigo et al., 2021</xref>; <xref ref-type="bibr" rid="B52">Yang and Liu, 2021</xref>; <xref ref-type="bibr" rid="B53">Zoabi et al., 2021</xref>).</p>
<p>The SHAP value for a single feature within a single sample describes the extent to which that feature contributes to the predicted output. A larger absolute SHAP value indicates that a feature has a larger impact on the predicted <inline-formula id="inf19">
<mml:math id="m22">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, whereas a smaller absolute value indicates a smaller impact. A positive SHAP value indicates that a feature makes a positive contribution to the predicted <inline-formula id="inf20">
<mml:math id="m23">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>, whereas a negative value indicates a negative contribution. By computing a SHAP value for each data point, a more detailed explanation of the global feature importance, such as the relationship between the feature and its corresponding effect on the output [e.g., SHAP dependence plot (<xref ref-type="bibr" rid="B32">Lundberg et al., 2018</xref>)], can be obtained.</p>
<p>SHAP values were calculated using TreeSHAP (<xref ref-type="bibr" rid="B32">Lundberg et al., 2018</xref>), a variant of SHAP developed for tree-based machine learning models such as XGBoost, as implemented in the SHAP Python package (version 0.40.0; <ext-link ext-link-type="uri" xlink:href="https://shap.readthedocs.io/en/latest/index.html">https://shap.readthedocs.io/en/latest/index.html</ext-link>). All pre-processing steps, model development, and statistical analyses were performed using Python (version 3.8.12).</p>
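<p>To make the meaning of a per-sample SHAP value concrete, the sketch below computes exact Shapley values by brute-force enumeration for a small hypothetical model. This is not TreeSHAP and not the SHAP package: features absent from a coalition are simply set to a baseline value, which is an assumption of this example, and toy_model is invented for illustration. The example shows the additivity property, namely that the per-feature values sum to the difference between the prediction for the sample and the prediction for the baseline.</p>
<preformat><![CDATA[
```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value, which
    is one common simplification; TreeSHAP handles this differently.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical response with a temperature x aw interaction."""
    temperature, ph, aw = z
    return 0.1 * temperature + 0.5 * ph + 2.0 * temperature * aw


baseline = [0.0, 0.0, 0.0]
sample = [2.0, 1.0, 1.0]
phi = shapley_values(toy_model, sample, baseline)
```
]]></preformat>
<p>In this toy case the interaction term is shared equally between temperature and a<sub>w</sub>, whereas pH receives only its own additive contribution. Enumeration scales exponentially with the number of features, which is the cost that tree-specific algorithms are designed to avoid.</p>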
</sec>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3 Results</title>
<sec id="s3-1">
<title>3.1 Evaluation of model accuracy</title>
<p>A single machine learning model developed with our data-mining approach achieved reasonable prediction accuracy across the various microorganisms and food categories. <xref ref-type="fig" rid="F1">Figure 1</xref> shows the overall prediction results for all microorganisms and food categories, for which the <italic>R</italic>
<sup>2</sup> and RMSE values were 0.76 and 1.23, respectively. This accuracy is comparable to that reported by <xref ref-type="bibr" rid="B19">Hiura et al. (2021)</xref> (0.75 for <italic>R</italic>
<sup>2</sup> and 1.02 for RMSE). The results for each microorganism and each food category are shown in <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>, respectively. For each organism in <xref ref-type="fig" rid="F2">Figure 2</xref>, the RMSE values were 1.35, 1.41, 1.42, 1.20, 1.42, 1.03, 1.07, and 1.08, for <italic>A. hydrophila</italic>, <italic>B. cereus</italic>, <italic>E. coli</italic>, <italic>L. monocytogenes</italic>, <italic>Pseudomonads</italic>, <italic>S. aureus</italic>, <italic>Salmonella</italic>, and <italic>Y. enterocolitica,</italic> respectively. In <xref ref-type="fig" rid="F3">Figure 3</xref>, the RMSE values for the culture medium, beef, pork, poultry, sausage, eggs, seafood, milk, cheese, vegetables or fruits, bread, dessert food, beverage, water, and sauce/dressing were 1.21, 1.24, 1.29, 1.52, 1.43, 1.11, 1.06, 1.17, 1.51, 1.31, 0.60, 0.98, 1.29, 1.33, and 1.89, respectively. These results suggest that the developed model accommodates various environmental conditions despite the differing amounts of data available for each. Note that the <italic>R</italic>
<sup>2</sup> and RMSE for sauce/dressing [<xref ref-type="fig" rid="F3">Figure 3</xref> (O)] were worse than those for the other organisms and food categories.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Comparison of the predicted and observed log change ratio for all the test data. The solid line represents residuals &#x3d; 0.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g001.tif"/>
</fig>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Comparison of the predicted and observed log change ratio for test data of <italic>Aeromonas hydrophila</italic> <bold>(A)</bold>, <italic>Bacillus cereus</italic> <bold>(B)</bold>, <italic>Escherichia coli</italic> <bold>(C)</bold>, <italic>Listeria monocytogenes</italic> <bold>(D)</bold>, <italic>Pseudomonads</italic> <bold>(E)</bold>, <italic>Staphylococcus aureus</italic> <bold>(F)</bold>, <italic>Salmonella</italic> <bold>(G)</bold>, and <italic>Yersinia enterocolitica</italic> <bold>(H)</bold>. The solid line represents residuals &#x3d; 0.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g002.tif"/>
</fig>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Comparison of the predicted and observed log change ratio for test data of culture medium <bold>(A)</bold>, beef <bold>(B)</bold>, pork <bold>(C)</bold>, poultry <bold>(D)</bold>, sausage <bold>(E)</bold>, egg <bold>(F)</bold>, seafood <bold>(G)</bold>, milk <bold>(H)</bold>, cheese <bold>(I)</bold>, vegetable or fruit <bold>(J)</bold>, bread <bold>(K)</bold>, dessert food <bold>(L)</bold>, beverage <bold>(M)</bold>, water <bold>(N)</bold>, and sauce/dressing <bold>(O)</bold>. The solid line represents residuals &#x3d; 0.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g003.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>3.2 2D color plot visualization of bacterial behaviors</title>
<p>We introduced new 2D visualizations to illustrate the bacterial growth/survival ratio using combinations of temperature, pH, and a<sub>w</sub>. <xref ref-type="fig" rid="F4">Figure 4</xref> shows color maps of the behavior of the eight bacteria in broth after 10 days for an initial count of 4 log CFU/g. The limiting a<sub>w</sub> value for growth of <italic>S. aureus</italic> was 0.90, whereas that for most microorganisms was 0.91 or above. The minimum pH value for growth of <italic>B. cereus</italic> was estimated to be 5.0, whereas that of many other organisms was 4.0&#x2013;4.5. All eight bacteria grew when the pH was greater than 5.5. Examples of other 2D color maps, such as temperature&#x2013;a<sub>w</sub> and pH&#x2013;temperature, can be found in the <xref ref-type="sec" rid="s11">Supplementary Information</xref>. We compared the observed growth/no-growth results for <italic>L. monocytogenes</italic> in a culture medium at 25&#xb0;C (<xref ref-type="bibr" rid="B27">Koutsoumanis et al., 2004</xref>) with our 2D map prediction. As shown in <xref ref-type="fig" rid="F5">Figure 5</xref>, the predicted 2D maps were visually consistent with the observed growth/no-growth results.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Change ratio from initial cell counts in broth at 20&#xb0;C with an initial concentration of 4 log CFU/g after 10&#xa0;days for <italic>Aeromonas hydrophila</italic> <bold>(A)</bold>, <italic>Bacillus cereus</italic> <bold>(B)</bold>, <italic>Escherichia coli</italic> <bold>(C)</bold>, <italic>Listeria monocytogenes</italic> <bold>(D)</bold>, <italic>Pseudomonads</italic> <bold>(E)</bold>, <italic>Staphylococcus aureus</italic> <bold>(F)</bold>, <italic>Salmonella</italic> <bold>(G)</bold>, and <italic>Yersinia enterocolitica</italic> <bold>(H)</bold>. Each square plot represents the value of the log change ratio (<inline-formula id="inf21">
<mml:math id="m24">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>). The plotted area for each organism is defined by the range of pH and a<sub>w</sub> in the dataset; blank areas lie outside the range of the training and test data.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g004.tif"/>
</fig>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Comparison between observed growth (<inline-formula id="inf22">
<mml:math id="m25">
<mml:mrow>
<mml:mo>&#x25cb;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>) and no-growth (&#x2613;) and the predicted change ratio from initial cell counts of <italic>Listeria monocytogenes</italic> in broth after 30&#xa0;days at 25&#xb0;C <bold>(A)</bold>, at pH 5.47&#x2013;5.58 <bold>(B)</bold>, and at a<sub>w</sub> of 0.965&#x2013;0.967 <bold>(C)</bold>. Experimental data were taken from <xref ref-type="bibr" rid="B27">Koutsoumanis et al. (2004)</xref>. Each square plot represents the value of the log change ratio (<inline-formula id="inf23">
<mml:math id="m26">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>); dark red is &#x201c;strongly increased&#x201d; (change &#x3e; 3-log cycle), red is &#x201c;increased&#x201d; (change of 2 &#xb1; 1 log cycle), grey is &#x201c;survival&#x201d; (change of 0 &#xb1; 1 log cycle), and blue is &#x201c;decreased&#x201d; (change of &#x2212;2 &#xb1; 1 log cycle).</p>
</caption>
<graphic xlink:href="frfst-02-979028-g005.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>3.3 Interpretation of the model</title>
<sec id="s3-3-1">
<title>3.3.1 Feature importance</title>
<p>We calculated the feature importance to identify which explanatory variables contributed most to prediction performance during model development. <xref ref-type="fig" rid="F6">Figure 6</xref> shows the feature importance of the developed XGBoost model. Importance values are expressed as relative values, normalized so that the sum over all features is 1. &#x201c;Initial cell number,&#x201d; &#x201c;Time,&#x201d; and &#x201c;a<sub>w</sub>&#x201d; contributed the most to model development, to almost the same extent. The categorical variable &#x201c;Organism,&#x201d; which represents the name of the bacterium, contributed to nearly the same extent as the numerical variables &#x201c;pH&#x201d; and &#x201c;Temperature.&#x201d; Information regarding food, such as food category and name, also contributed to model development. Thus, all features contributed to model development to some extent.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Feature importance of the developed XGBoost model. The <italic>x</italic>-axis indicates the relative importance, and the <italic>y</italic>-axis indicates the feature names. Blue and gray bars indicate categorical and numerical variables, respectively.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g006.tif"/>
</fig>
</sec>
<sec id="s3-3-2">
<title>3.3.2 SHAP value</title>
<p>To interpret the model in more depth (i.e., to examine the relationship between each environmental condition and bacterial growth), we applied the SHAP framework. The SHAP values for the three environmental features were calculated to determine the contribution of the environmental factors to bacterial growth. The SHAP value explains the contribution of each variable to the predicted <inline-formula id="inf24">
<mml:math id="m27">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> value of an instance. Positive and high SHAP values indicate that the feature value positively affected the predicted <inline-formula id="inf25">
<mml:math id="m28">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. Conversely, negative and low SHAP values imply that the feature value has a negative effect. The absolute SHAP value indicates the effect size of the environmental factors. <xref ref-type="fig" rid="F7">Figure 7</xref> shows the SHAP-dependence plots for a<sub>w</sub>, pH, and temperature. The higher the a<sub>w</sub>, the higher the SHAP value for a<sub>w</sub> (<xref ref-type="fig" rid="F7">Figure 7A</xref>). The SHAP value for temperature followed a similar relationship as that of a<sub>w</sub> (<xref ref-type="fig" rid="F7">Figure 7C</xref>). However, the SHAP value for pH was the highest when the pH value was approximately 7 (<xref ref-type="fig" rid="F7">Figure 7B</xref>). According to the results of the SHAP dependency for each environmental factor, several trends in bacterial behavior could be suggested. When the value of a<sub>w</sub> was greater than 0.95, the a<sub>w</sub> positively affected bacterial growth. When the pH was approximately 7, it positively influenced bacterial growth. When the pH was less than 5.0, the low pH negatively influenced bacterial growth. When the temperature was 10&#x2013;25&#xb0;C, it positively influenced bacterial growth.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>
<italic>SHapley Additive eXplanations</italic> (SHAP) dependency plots for water activity <bold>(A)</bold>, pH <bold>(B)</bold>, and temperature <bold>(C)</bold>.</p>
</caption>
<graphic xlink:href="frfst-02-979028-g007.tif"/>
</fig>
</sec>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>4 Discussion</title>
<p>In the present study, we demonstrated the application of a data mining approach to predict bacterial population behavior using the ComBase database (<xref ref-type="fig" rid="F1">Figures 1</xref>&#x2013;<xref ref-type="fig" rid="F3">3</xref>) and visualized the predictions as 2D maps (<xref ref-type="fig" rid="F4">Figure 4</xref>). Categorical data such as organism, food category, and food name also contributed to the developed model to some extent (<xref ref-type="fig" rid="F6">Figure 6</xref>). In addition, we demonstrated the environmental effects on the growth of the bacterial population (<xref ref-type="fig" rid="F7">Figure 7</xref>). The data mining approach allowed us to model and reveal the multidimensional relationship between bacterial population behavior and the food environment. We showed that a data-driven approach to analyzing accumulated data could be useful for addressing food safety issues.</p>
<p>Although the dataset used in this study consisted of numerical and categorical variables, a machine learning algorithm enabled us to predict bacterial population behavior with a single predictive model (<xref ref-type="fig" rid="F1">Figure 1</xref>). Unlike numerical variables, categorical variables (e.g., food category, food name, and organism) must be replaced with numerical data, such as dummy variables, for numerical operations (<xref ref-type="bibr" rid="B43">Palaniappan and Awang, 2008</xref>; <xref ref-type="bibr" rid="B25">Kim and Hong, 2017</xref>). Statistical modeling makes it difficult to consider categorical data when multiple conditions exist, such as food names and environmental conditions (<xref ref-type="bibr" rid="B25">Kim and Hong, 2017</xref>; <xref ref-type="bibr" rid="B19">Hiura et al., 2021</xref>). In contrast, the machine learning model enabled the description of the relationship between the food environment and bacteria, which is difficult to achieve with statistical models or manual definition owing to the high dimensionality. Additionally, we succeeded in extending the model proposed by <xref ref-type="bibr" rid="B19">Hiura et al. (2021)</xref> to include eight bacterial species and 15 food categories.</p>
<p>We visualized the bacterial population behavior to provide a panoramic view of the overall trend of the microbial response to various conditions. Our 2D map visualization showed combinations of factors that prevented bacterial growth (<xref ref-type="fig" rid="F4">Figure 4</xref>). Our 2D color map described the trends in the population behavior of <italic>Listeria monocytogenes</italic> to the same extent as the literature data (<xref ref-type="fig" rid="F5">Figure 5</xref>), which supports the validity of our color map. Similar to our study, Ratkowsky &#x26; Ross (1995) proposed a growth/no-growth interface model. Growth/no-growth interface models estimate the probability of bacterial growth and find combinations of factors that prevent growth. The growth/no-growth interface has been widely used in previous studies in predictive microbiology (<xref ref-type="bibr" rid="B50">Tienungoon et al., 2000</xref>; <xref ref-type="bibr" rid="B36">McKellar and Lu, 2001</xref>; <xref ref-type="bibr" rid="B30">Le Marc et al., 2005</xref>; <xref ref-type="bibr" rid="B44">Polese et al., 2011</xref>; <xref ref-type="bibr" rid="B9">Coroller et al., 2012</xref>; <xref ref-type="bibr" rid="B29">Kuroda et al., 2019</xref>). This approach has been used to determine whether bacteria can grow under a wide range of experimental conditions. However, the interface cannot express the details of the bacterial population density. In the present study, we succeeded in evaluating not only whether growth occurred, but also the change in bacterial population density (<xref ref-type="fig" rid="F4">Figure 4</xref>). Our visualization method helps to convey the bacterial concentration under various conditions at a glance, and can be useful for developing processes that provide information for realistic estimations of food safety risks. Thus, our 2D maps may support the communication of food safety regulations.</p>
<p>The SHAP value describes the contribution of each explanatory variable to each predicted <inline-formula id="inf26">
<mml:math id="m29">
<mml:mrow>
<mml:mi mathvariant="italic">log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mi>t</mml:mi>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>N</mml:mi>
<mml:mn>0</mml:mn>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula>. A positive SHAP value indicates that the feature shifts the prediction toward bacterial growth, whereas a negative SHAP value indicates a shift toward a decrease in the bacterial population. We succeeded in mining information regarding the relationship between bacterial growth and environmental conditions from the dataset (<xref ref-type="fig" rid="F7">Figure 7</xref>). These results mostly conform to the general understanding in food microbiology. For many food-spoilage and food-poisoning bacteria such as <italic>E. coli</italic>, <italic>Pseudomonas</italic> spp., and <italic>B. cereus</italic>, minimum a<sub>w</sub> values for growth are approximately 0.95 (<xref ref-type="bibr" rid="B22">Jay et al., 2008a</xref>). Consistent with this, the a<sub>w</sub> value positively affected predicted bacterial growth when it was greater than approximately 0.95 (<xref ref-type="fig" rid="F7">Figure 7A</xref>). In our model, the pH most favorable to predicted growth was approximately 7 (<xref ref-type="fig" rid="F7">Figure 7B</xref>). Most food-spoilage and food-poisoning bacteria grow poorly as the pH decreases, especially below 3.5 (<xref ref-type="bibr" rid="B1">Adams and Nicolaides, 1997</xref>; <xref ref-type="bibr" rid="B22">Jay et al., 2008a</xref>). Similarly, pH contributed positively to predicted bacterial population growth in the range of 6&#x2013;7 and negatively below pH 5 (<xref ref-type="fig" rid="F7">Figure 7B</xref>). In addition, most foodborne microorganisms grow well at 20&#x2013;45&#xb0;C, and many bacterial species, except psychrotrophic bacteria or psychrophiles, cannot grow below 7&#xb0;C (<xref ref-type="bibr" rid="B22">Jay et al., 2008a</xref>; <xref ref-type="bibr" rid="B23">Jay et al., 2008b</xref>). Similarly, temperature contributed positively to predicted bacterial growth above approximately 10&#xb0;C (<xref ref-type="fig" rid="F7">Figure 7C</xref>). By computing SHAP values, we were also able to show negative effects on bacterial growth. The associations detected here replicate the well-known characteristics of bacterial growth in food microbiology, which supports the validity of our results and the possibility of utilizing data mining to extract bacterial population behavior.</p>
<p>Although our data-driven analysis of the experimental data in the ComBase database offers several advantages, it also has limitations. Our model is not designed to predict microbial ecology, because each experiment in the ComBase database addresses the growth or inactivation of a single bacterial species for simplicity. In the future, competitive microbial conditions could be analyzed by data-driven methods using datasets that capture microbial ecology.</p>
</sec>
<sec sec-type="conclusion" id="s5">
<title>5 Conclusion</title>
<p>Data mining predicted the population behavior of eight foodborne pathogens and spoilage bacteria in 15 food categories. In addition, growth inhibition owing to the food environment was quantitatively evaluated using data-driven methods. Our approach enabled us to extract useful information regarding food safety from a large amount of experimental data. The bacterial population behavior predicted by this procedure can provide guidelines for determining food processing and storage conditions. The main findings of this study support the data mining approach as valuable in the field of food microbiology.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author contributions</title>
<p>JH, JS, SN, SK, and KK conceptualized the study. JH, JS, SN, and KK designed the computation. JH and JS analyzed the data. JH wrote the Python script and the first draft of the manuscript. All authors reviewed the manuscript.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This work was supported by Kieikai Research Foundation and Tobe Maki Scholarship Foundation (to KK), the Japan Science and Technology Agency Center of Innovation Program (grant number: JPMJCE1301), CREST (JPMJCR20H4), and JSPS KAKENHI (JP20H00425, JP21K19813) (to SN).</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s11">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/frfst.2022.979028/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/frfst.2022.979028/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="DataSheet2.csv" id="SM2" mimetype="application/csv" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Adams</surname>
<given-names>M. R.</given-names>
</name>
<name>
<surname>Nicolaides</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>1997</year>). <article-title>Review of the sensitivity of different foodborne pathogens to fermentation</article-title>. <source>Food Control</source> <volume>8</volume> (<issue>5&#x2013;6</issue>), <fpage>227</fpage>&#x2013;<lpage>239</lpage>. <pub-id pub-id-type="doi">10.1016/s0956-7135(97)00016-9</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Agius</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Brieghel</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Andersen</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Pearson</surname>
<given-names>A. T.</given-names>
</name>
<name>
<surname>Ledergerber</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Cozzi-Lepri</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Machine learning can identify newly diagnosed patients with CLL at high risk of infection</article-title>. <source>Nat. Commun.</source> <volume>11</volume> (<issue>1</issue>), <fpage>363</fpage>&#x2013;<lpage>417</lpage>. <pub-id pub-id-type="doi">10.1038/s41467-019-14225-8</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baranyi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tamplin</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>The ComBase initiative</article-title>. <source>Microbiol. Aust.</source> <volume>25</volume> (<issue>3</issue>), <fpage>32</fpage>. <pub-id pub-id-type="doi">10.1071/ma04332</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bidlas</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Du</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Lambert</surname>
<given-names>R. J. W.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>An explanation for the effect of inoculum size on MIC and the growth/no growth interface</article-title>. <source>Int. J. Food Microbiol.</source> <volume>126</volume> (<issue>1&#x2013;2</issue>), <fpage>140</fpage>&#x2013;<lpage>152</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijfoodmicro.2008.05.023</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>Y. C.</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>K. H.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>G. J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Application of eXtreme gradient boosting trees in the construction of credit risk assessment models for financial institutions</article-title>. <source>Appl. Soft Comput.</source> <volume>73</volume>, <fpage>914</fpage>&#x2013;<lpage>920</lpage>. <pub-id pub-id-type="doi">10.1016/j.asoc.2018.09.029</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Guestrin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>XGBoost: A scalable tree boosting system</article-title>. <conf-name>Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</conf-name>. <publisher-name>ACM</publisher-name>, <conf-loc>New York, NY, USA</conf-loc>. <conf-date>2016</conf-date>, <fpage>785</fpage>&#x2013;<lpage>794</lpage>. <pub-id pub-id-type="doi">10.1145/2939672.2939785</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cios</surname>
<given-names>K. J.</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>G. W.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Uniqueness of medical data mining</article-title>. <source>Artif. Intell. Med.</source> <volume>26</volume> (<issue>1&#x2013;2</issue>), <fpage>1</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1016/S0933-3657(02)00049-0</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Coroller</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Kan-King-Yu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Leguerinel</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Mafart</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Membre</surname>
<given-names>J. M.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Modelling of growth, growth/no-growth interface and nonthermal inactivation areas of Listeria in foods</article-title>. <source>Int. J. Food Microbiol.</source> <volume>152</volume> (<issue>3</issue>), <fpage>139</fpage>&#x2013;<lpage>152</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijfoodmicro.2011.09.023</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cortez</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Cerdeira</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Almeida</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Matos</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Reis</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Modeling wine preferences by data mining from physicochemical properties</article-title>. <source>Decis. Support Syst.</source> <volume>47</volume> (<issue>4</issue>), <fpage>547</fpage>&#x2013;<lpage>553</lpage>. <pub-id pub-id-type="doi">10.1016/j.dss.2009.05.016</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Delen</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Walker</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kadam</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Predicting breast cancer survivability: A comparison of three data mining methods</article-title>. <source>Artif. Intell. Med.</source> <volume>34</volume> (<issue>2</issue>), <fpage>113</fpage>&#x2013;<lpage>127</lpage>. <pub-id pub-id-type="doi">10.1016/j.artmed.2004.07.002</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>den Besten</surname>
<given-names>H. M. W.</given-names>
</name>
<name>
<surname>Zwietering</surname>
<given-names>M. H.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Meta-analysis for quantitative microbiological risk assessments and benchmarking data</article-title>. <source>Trends Food Sci. Technol.</source>, <fpage>34</fpage>&#x2013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1016/j.tifs.2011.12.004</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deringer</surname>
<given-names>V. L.</given-names>
</name>
<name>
<surname>Bartok</surname>
<given-names>A. P.</given-names>
</name>
<name>
<surname>Bernstein</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Wilkins</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Ceriotti</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Csanyi</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Gaussian process regression for materials and molecules</article-title>. <source>Chem. Rev.</source> <volume>121</volume>, <fpage>10073</fpage>&#x2013;<lpage>10141</lpage>. <pub-id pub-id-type="doi">10.1021/acs.chemrev.1c00022</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Doyle</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Diez-Gonzalez</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Hill</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Food microbiology: Fundamentals and frontiers</article-title>. <source>Food Microbiol. Fundam. Front.</source> <pub-id pub-id-type="doi">10.1002/9781683670476</pub-id>
</citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Friedman</surname>
<given-names>J. H.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>Greedy function approximation: A gradient boosting machine</article-title>. <source>Ann. Stat.</source> <volume>29</volume> (<issue>5</issue>), <fpage>1189</fpage>&#x2013;<lpage>1232</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1013203451</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gould</surname>
<given-names>G. W.</given-names>
</name>
</person-group> (<year>1996</year>). <article-title>Methods for preservation and extension of shelf life</article-title>. <source>Int. J. Food Microbiol.</source> <volume>33</volume> (<issue>1</issue>), <fpage>51</fpage>&#x2013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1016/0168-1605(96)01133-6</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gulyaeva</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Huettmann</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Shestopalov</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Okamatsu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Matsuno</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Chu</surname>
<given-names>D. H.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Data mining and model-predicting a global disease reservoir for low-pathogenic Avian Influenza (A) in the wider Pacific rim using big data sets</article-title>. <source>Sci. Rep.</source> <volume>10</volume> (<issue>1</issue>), <fpage>16817</fpage>&#x2013;<lpage>16911</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-020-73664-2</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hand</surname>
<given-names>D. J.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Data mining: Statistics and more?</article-title> <source>Am. Stat.</source> <volume>52</volume> (<issue>2</issue>), <fpage>112</fpage>&#x2013;<lpage>118</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.1998.10480549</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hiura</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Koseki</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Koyama</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Prediction of population behavior of Listeria monocytogenes in food using machine learning and a microbial growth and survival database</article-title>. <source>Sci. Rep.</source> <volume>11</volume> (<issue>1</issue>), <fpage>10613</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-90164-z</pub-id>
</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hochachka</surname>
<given-names>W. M.</given-names>
</name>
<name>
<surname>Caruana</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Fink</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Munson</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Riedewald</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Sorokina</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Data-mining discovery of pattern and process in ecological systems</article-title>. <source>J. Wildl. Manage.</source> <volume>71</volume> (<issue>7</issue>), <fpage>2427</fpage>. <pub-id pub-id-type="doi">10.2193/2006-503</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jagannath</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Tsuchido</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2003</year>). <article-title>Predictive microbiology: A review</article-title>. <source>Biocontrol Science</source>. <publisher-loc>Japan</publisher-loc>: <publisher-name>The Society for Antibacterial and Antifungal Agents</publisher-name>, <fpage>1</fpage>&#x2013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.4265/bio.8.1</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jay</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Loessner</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Golden</surname>
<given-names>D. A.</given-names>
</name>
</person-group> (<year>2008a</year>). &#x201c;<article-title>Intrinsic and extrinsic parameters of foods that affect microbial growth</article-title>,&#x201d; in <source>Modern food microbiology</source> (<publisher-name>Springer US</publisher-name>), <fpage>39</fpage>&#x2013;<lpage>59</lpage>. <pub-id pub-id-type="doi">10.1007/0-387-23413-6_3</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Jay</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Loessner</surname>
<given-names>M. J.</given-names>
</name>
<name>
<surname>Golden</surname>
<given-names>D. A.</given-names>
</name>
</person-group> (<year>2008b</year>). &#x201c;<article-title>Protection of foods with low-temperatures, and characteristics of psychrotrophic microorganisms</article-title>,&#x201d; in <source>Modern food microbiology</source> (<publisher-name>Springer US</publisher-name>), <fpage>395</fpage>&#x2013;<lpage>413</lpage>. <pub-id pub-id-type="doi">10.1007/0-387-23413-6_16</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jim&#xe9;nez-Carvelo</surname>
<given-names>A. M.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Alternative data mining/machine learning methods for the analytical evaluation of food quality and authenticity &#x2013; a review</article-title>. <source>Food Res. Int.</source>, <fpage>25</fpage>&#x2013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1016/j.foodres.2019.03.063</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Hong</surname>
<given-names>J. S.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>K. Y.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>S. E.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Kwon</surname>
<given-names>B. S.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>Laparoscopically assisted suprapubic surgery for adnexal tumors under epidural anesthesia</article-title>. <source>Minim. Invasive Ther. Allied Technol.</source> <volume>98</volume>, <fpage>39</fpage>&#x2013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1080/13645706.2016.1223695</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koh</surname>
<given-names>H. C.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Data mining applications in healthcare</article-title>. <source>J. Healthc. Inf. Manag.</source> <volume>19</volume> (<issue>2</issue>), <fpage>64</fpage>&#x2013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.4314/ijonas.v5i1.49926</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koutsoumanis</surname>
<given-names>K. P.</given-names>
</name>
<name>
<surname>Kendall</surname>
<given-names>P. A.</given-names>
</name>
<name>
<surname>Sofos</surname>
<given-names>J. N.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>A comparative study on growth limits of Listeria monocytogenes as affected by temperature, pH and aw when grown in suspension or on a solid surface</article-title>. <source>Food Microbiol.</source> <volume>21</volume> (<issue>4</issue>), <fpage>415</fpage>&#x2013;<lpage>422</lpage>. <pub-id pub-id-type="doi">10.1016/j.fm.2003.11.003</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Koutsoumanis</surname>
<given-names>K. P.</given-names>
</name>
<name>
<surname>Sofos</surname>
<given-names>J. N.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Effect of inoculum size on the combined temperature, pH and aw limits for growth of Listeria monocytogenes</article-title>. <source>Int. J. Food Microbiol.</source> <volume>104</volume> (<issue>1</issue>), <fpage>83</fpage>&#x2013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijfoodmicro.2005.01.010</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kuroda</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Okuda</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Ishida</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Koseki</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Modeling growth limits of Bacillus spp. spores by using deep-learning algorithm</article-title>. <source>Food Microbiol.</source> <volume>78</volume>, <fpage>38</fpage>&#x2013;<lpage>45</lpage>. <pub-id pub-id-type="doi">10.1016/j.fm.2018.09.013</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Le Marc</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Pin</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Baranyi</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2005</year>). &#x201c;<article-title>Methods to determine the growth domain in a multidimensional environmental space</article-title>,&#x201d; in <source>International journal of food microbiology</source> (<publisher-name>Elsevier</publisher-name>), <fpage>3</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijfoodmicro.2004.10.003</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Leistner</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2000</year>). &#x201c;<article-title>Basic aspects of food preservation by hurdle technology</article-title>,&#x201d; in <source>International journal of food microbiology</source> (<publisher-name>Elsevier</publisher-name>), <fpage>181</fpage>&#x2013;<lpage>186</lpage>. <pub-id pub-id-type="doi">10.1016/S0168-1605(00)00161-6</pub-id>
</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lundberg</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Erion</surname>
<given-names>G. G.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>S.-I.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Consistent individualized feature attribution for tree ensembles</article-title>. <comment>Available at:</comment>
<ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/1802.03888v3">https://arxiv.org/abs/1802.03888v3</ext-link> (<comment>Accessed: January 21, 2022</comment>).</citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lundberg</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>S. I.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>A unified approach to interpreting model predictions</article-title>,&#x201d; in <source>Advances in neural information processing systems</source>, <fpage>4766</fpage>&#x2013;<lpage>4775</lpage>. <comment>Available at:</comment>
<ext-link ext-link-type="uri" xlink:href="https://github.com/slundberg/shap">https://github.com/slundberg/shap</ext-link> (<comment>Accessed: September 28, 2021</comment>).</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mangalathu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Hwang</surname>
<given-names>S. H.</given-names>
</name>
<name>
<surname>Jeon</surname>
<given-names>J. S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Failure mode and effects analysis of RC members based on machine-learning-based SHapley Additive exPlanations (SHAP) approach</article-title>. <source>Eng. Struct.</source> <volume>219</volume>, <fpage>110927</fpage>. <pub-id pub-id-type="doi">10.1016/j.engstruct.2020.110927</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mataragas</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rantsiou</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Alessandria</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Cocolin</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Estimating the non-thermal inactivation of Listeria monocytogenes in fermented sausages relative to temperature, pH and water activity</article-title>. <source>Meat Sci.</source> <volume>100</volume>, <fpage>171</fpage>&#x2013;<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1016/j.meatsci.2014.10.016</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McKellar</surname>
<given-names>R. C.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2001</year>). <article-title>A probability of growth model for Escherichia coli O157:H7 as a function of temperature, pH, acetic acid, and salt</article-title>. <source>J. Food Prot.</source> <volume>64</volume> (<issue>12</issue>), <fpage>1922</fpage>&#x2013;<lpage>1928</lpage>. <pub-id pub-id-type="doi">10.4315/0362-028X-64.12.1922</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McQuestin</surname>
<given-names>O. J.</given-names>
</name>
<name>
<surname>Shadbolt</surname>
<given-names>C. T.</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Quantification of the relative effects of temperature, pH, and water activity on inactivation of Escherichia coli in fermented meat by meta-analysis</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>75</volume> (<issue>22</issue>), <fpage>6963</fpage>&#x2013;<lpage>6972</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.00291-09</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mohanty</surname>
<given-names>S. D.</given-names>
</name>
<name>
<surname>Lekan</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>McCoy</surname>
<given-names>T. P.</given-names>
</name>
<name>
<surname>Jenkins</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Manda</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Machine learning for predicting readmission risk among the frail: Explainable AI for healthcare</article-title>. <source>Patterns</source> <volume>3</volume> (<issue>1</issue>), <fpage>100395</fpage>. <pub-id pub-id-type="doi">10.1016/j.patter.2021.100395</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Moncada-Torres</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>van Maaren</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Hendriks</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Siesling</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Geleijnse</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival</article-title>. <source>Sci. Rep.</source> <volume>11</volume> (<issue>1</issue>), <fpage>6968</fpage>&#x2013;<lpage>7013</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-86327-7</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ndraha</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Hsiao</surname>
<given-names>H. I.</given-names>
</name>
<name>
<surname>Hsieh</surname>
<given-names>Y. Z.</given-names>
</name>
<name>
<surname>Pradhan</surname>
<given-names>A. K.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Predictive models for the effect of environmental factors on the abundance of Vibrio parahaemolyticus in oyster farms in Taiwan using extreme gradient boosting</article-title>. <source>Food Control</source> <volume>130</volume>, <fpage>108353</fpage>. <pub-id pub-id-type="doi">10.1016/j.foodcont.2021.108353</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Long</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>McDermott</surname>
<given-names>P. F.</given-names>
</name>
<name>
<surname>Olsen</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Olson</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Stevens</surname>
<given-names>R. L.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Using machine learning to predict antimicrobial MICs and associated genomic features for nontyphoidal Salmonella</article-title>. <source>J. Clin. Microbiol.</source> <volume>57</volume> (<issue>2</issue>), <fpage>e01260-18</fpage>. <pub-id pub-id-type="doi">10.1128/JCM.01260-18</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nychas</surname>
<given-names>G.-J.</given-names>
</name>
<name>
<surname>Sims</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Tsakanikas</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Mohareb</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Data science in the food industry</article-title>. <source>Annu. Rev. Biomed. Data Sci.</source> <volume>4</volume> (<issue>1</issue>), <fpage>341</fpage>&#x2013;<lpage>367</lpage>. <pub-id pub-id-type="doi">10.1146/annurev-biodatasci-020221-123602</pub-id>
</citation>
</ref>
<ref id="B43">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Palaniappan</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Awang</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Intelligent heart disease prediction system using data mining techniques</article-title>. <conf-name>AICCSA 08 - 6th IEEE/ACS International Conference on Computer Systems and Applications</conf-name>. <publisher-name>IEEE</publisher-name>, <publisher-loc>Doha, Qatar</publisher-loc> <fpage>108</fpage>&#x2013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.1109/AICCSA.2008.4493524</pub-id>
<comment>31 March 2008 - 04 April 2008</comment>
</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Polese</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Del Torre</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Spaziani</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Stecchini</surname>
<given-names>M. L.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>A simplified approach for modelling the bacterial growth/no growth boundary</article-title>. <source>Food Microbiol.</source> <volume>28</volume> (<issue>3</issue>), <fpage>384</fpage>&#x2013;<lpage>391</lpage>. <pub-id pub-id-type="doi">10.1016/j.fm.2010.09.011</pub-id>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rodrigo</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Beukes</surname>
<given-names>E. W.</given-names>
</name>
<name>
<surname>Andersson</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Manchaiah</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Exploratory data mining techniques (decision tree models) for examining the impact of internet-based cognitive behavioral therapy for tinnitus: Machine learning approach</article-title>. <source>J. Med. Internet Res.</source> <volume>23</volume> (<issue>11</issue>), <fpage>e28999</fpage>. <pub-id pub-id-type="doi">10.2196/28999</pub-id>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ross</surname>
<given-names>S. R. P. J.</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>N. R.</given-names>
</name>
<name>
<surname>Dudley</surname>
<given-names>K. L.</given-names>
</name>
<name>
<surname>Yoshimura</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Yoshida</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Economo</surname>
<given-names>E. P.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Listening to ecosystems: Data-rich acoustic monitoring through landscape-scale sensor networks</article-title>. <source>Ecol. Res.</source> <volume>33</volume> (<issue>1</issue>), <fpage>135</fpage>&#x2013;<lpage>147</lpage>. <pub-id pub-id-type="doi">10.1007/s11284-017-1509-5</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ross</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>McMeekin</surname>
<given-names>T. A.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Predictive microbiology</article-title>. <source>Int. J. Food Microbiol.</source> <volume>23</volume> (<issue>3&#x2013;4</issue>), <fpage>241</fpage>&#x2013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.1016/0168-1605(94)90155-4</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shehadeh</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Alshboul</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Al Mamlook</surname>
<given-names>R. E.</given-names>
</name>
<name>
<surname>Hamedat</surname>
<given-names>O.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Machine learning models for predicting the residual value of heavy construction equipment: An evaluation of modified decision tree, LightGBM, and XGBoost regression</article-title>. <source>Automation Constr.</source> <volume>129</volume>, <fpage>103827</fpage>. <pub-id pub-id-type="doi">10.1016/j.autcon.2021.103827</pub-id>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Skandamis</surname>
<given-names>P. N.</given-names>
</name>
<name>
<surname>Stopforth</surname>
<given-names>J. D.</given-names>
</name>
<name>
<surname>Kendall</surname>
<given-names>P. A.</given-names>
</name>
<name>
<surname>Belk</surname>
<given-names>K. E.</given-names>
</name>
<name>
<surname>Scanga</surname>
<given-names>J. A.</given-names>
</name>
<name>
<surname>Smith</surname>
<given-names>G. C.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Modeling the effect of inoculum size and acid adaptation on growth/no growth interface of Escherichia coli O157:H7</article-title>. <source>Int. J. Food Microbiol.</source> <volume>120</volume> (<issue>3</issue>), <fpage>237</fpage>&#x2013;<lpage>249</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijfoodmicro.2007.08.028</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tienungoon</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Ratkowsky</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>McMeekin</surname>
<given-names>T. A.</given-names>
</name>
<name>
<surname>Ross</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Growth limits of Listeria monocytogenes as a function of temperature, pH, NaCl, and lactic acid</article-title>. <source>Appl. Environ. Microbiol.</source> <volume>66</volume> (<issue>11</issue>), <fpage>4979</fpage>&#x2013;<lpage>4987</lpage>. <pub-id pub-id-type="doi">10.1128/AEM.66.11.4979-4987.2000</pub-id>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Jin</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Che</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Prediction of type 2 diabetes risk and its effect evaluation based on the XGBoost model</article-title>. <source>Healthc. Switz.</source> <volume>8</volume> (<issue>3</issue>), <fpage>247</fpage>. <pub-id pub-id-type="doi">10.3390/healthcare8030247</pub-id>
</citation>
</ref>
<ref id="B52">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Mining meta-indicators of university ranking: A machine learning approach based on SHAP</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2111.12526v1">https://arxiv.org/abs/2111.12526v1</ext-link> (Accessed: January 7, 2022)</comment>.</citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zoabi</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Kehat</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Lahav</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Weiss-Meilik</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Adler</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Shomron</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Predicting bloodstream infection outcome using machine learning</article-title>. <source>Sci. Rep.</source> <volume>11</volume> (<issue>1</issue>), <fpage>20101</fpage>&#x2013;<lpage>20111</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-99105-2</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>