AUTHOR=Rodrigues Leatrice Talita , Goeldner Barbara Sanches Antunes , Mercuri Emílio Graciliano Ferreira , Noe Steffen Manfred TITLE=Tradescantia response to air and soil pollution, stamen hair cells dataset and ANN color classification JOURNAL=Frontiers in Big Data VOLUME=Volume 7 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/big-data/articles/10.3389/fdata.2024.1384240 DOI=10.3389/fdata.2024.1384240 ISSN=2624-909X ABSTRACT=The Tradescantia plant is a complex system that is sensitive to environmental factors such as water supply, pH, temperature, light, radiation, impurities, and nutrient availability. It can be used as a biomonitor for environmental changes; however, the bioassays are time-consuming and depend strongly on the human analyst, so the result may change depending on who performs the analysis. We have developed computer vision models to study colour variations in stamen hair cells of Tradescantia clone 4430, which can be stressed by air pollution and soil contamination. The study introduces a novel dataset, Trad-204, comprising single-cell images from Tradescantia clone 4430, captured during the Tradescantia stamen-hair mutation bioassay (Trad-SHM). The dataset contains images from two experiments, one focusing on air pollution by particulate matter and another based on soil contaminated by diesel oil. Both experiments were carried out in Curitiba, Brazil, between 2020 and 2023. The images represent single cells with different shapes, sizes, and colours, reflecting the plant’s responses to environmental stressors. An automatic classification task was developed to distinguish between blue and pink cells, and the study explores both a baseline model and three artificial neural network (ANN) architectures: TinyVGG, VGG-16, and ResNet34. Tradescantia showed sensitivity to both air particulate matter concentration and diesel oil in soil. 
The results indicate that the Residual Network architecture outperforms the other models in terms of accuracy on both training and testing sets. The dataset and findings contribute to the understanding of plant cell responses to environmental stress and provide valuable resources for further research in automated image analysis of plant cells. The comparison between ANN architectures aligns with previous research, emphasizing the superior performance of ResNet models in image classification tasks. Artificial intelligence identification of pink cells improves counting accuracy, avoiding human errors due to differing colour perception, and speeds up the analysis process. Overall, the study offers insights into plant cell dynamics and provides a foundation for future investigations. Biomonitoring is also an important tool for political discussions, as it is relevant to risk assessment and to the development of new public policies relating to the environment.