AUTHOR=Zuo Zhiyu, Mu Jindong, Li Wenjie, Bu Quan, Mao Hanping, Zhang Xiaodong, Han Lvhua, Ni Jiheng TITLE=Study on the detection of water status of tomato (Solanum lycopersicum L.) by multimodal deep learning JOURNAL=Frontiers in Plant Science VOLUME=Volume 14 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2023.1094142 DOI=10.3389/fpls.2023.1094142 ISSN=1664-462X ABSTRACT=With the rise of precision agriculture, precision irrigation has also been proposed, and optimizing irrigation management is necessary for crop growth and development. In particular, knowing the water status of crops is essential for achieving on-demand irrigation and reducing agricultural water use. This study investigated the detection of tomato (Solanum lycopersicum L.) water status and proposed a detection method based on deep multimodal learning. Tomatoes were cultivated under five irrigation levels, with irrigation amounts of 150%, 125%, 100%, 75%, and 50% of the reference evapotranspiration calculated by a modified Penman-Monteith equation, to produce different water states. The water status of the tomatoes was divided into five categories: severe irrigation deficit, slight irrigation deficit, moderate irrigation, slight over-irrigation, and severe over-irrigation. RGB, depth, and NIR images of the upper part of the tomato plant were collected as data sets, which were used to train and test tomato water status detection models built with single-mode and multimodal deep learning networks. In the single-mode networks, two CNNs, VGG-16 and ResNet-50, were each trained on RGB images, depth images, or NIR images alone, for a total of six cases. In the multimodal networks, two or more of the RGB, depth, and NIR image types were used together, each processed by either VGG-16 or ResNet-50, for a total of 20 combinations. 
The experimental results showed that the accuracy of tomato water status detection based on single-mode deep learning ranged from 88.97% to 93.09%, while that based on multimodal deep learning ranged from 93.09% to 99.18%. Multimodal deep learning thus significantly outperformed single-mode deep learning. The optimal tomato water status detection model was built using a multimodal deep learning network with ResNet-50 for RGB images and VGG-16 for depth and NIR images.
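
The case counts quoted in the abstract (six single-mode cases and 20 multimodal combinations) can be checked by simple enumeration. The sketch below is illustrative only and is not the authors' code. It assumes that, in a multimodal network, each selected image type is independently paired with either VGG-16 or ResNet-50, an interpretation that is consistent with the reported optimum (ResNet-50 for RGB, VGG-16 for depth and NIR). The function names are hypothetical.

```python
from itertools import combinations, product

# Image types and CNN backbones named in the abstract.
MODALITIES = ("RGB", "depth", "NIR")
BACKBONES = ("VGG-16", "ResNet-50")


def single_mode_cases():
    """One backbone trained on one image type (hypothetical helper)."""
    return [{m: b} for m in MODALITIES for b in BACKBONES]


def multimodal_cases():
    """Two or more image types, each assigned its own backbone.

    Assumption: backbones are chosen independently per image type.
    """
    cases = []
    for k in range(2, len(MODALITIES) + 1):
        for subset in combinations(MODALITIES, k):
            for assignment in product(BACKBONES, repeat=k):
                cases.append(dict(zip(subset, assignment)))
    return cases


if __name__ == "__main__":
    print(len(single_mode_cases()))  # 3 image types x 2 backbones
    print(len(multimodal_cases()))   # 3 pairs x 2^2 + 1 triple x 2^3
```

Under this assumption the enumeration yields 3 x 2 = 6 single-mode cases and 3 x 4 + 1 x 8 = 20 multimodal combinations, matching the totals stated in the abstract.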