A review of artificial intelligence in marine science

Utilization and exploitation of marine resources by humans have contributed to the growth of marine research. As technology progresses, artificial intelligence (AI) approaches are progressively being applied to maritime research, complementing traditional marine forecasting models and observation techniques to some degree. This article takes the artificial intelligence algorithmic model as its starting point, references several application trials, and methodically elaborates on the emerging research trend of mixing machine learning and physical modeling concepts. This article discusses the evolution of methodologies for the building of ocean observations, the application of artificial intelligence to remote sensing satellites, smart sensors, and intelligent underwater robots, and the construction of ocean big data. We also cover the method of identifying internal waves (IW), heatwaves, El Niño-Southern Oscillation (ENSO), and sea ice using artificial intelligence algorithms. In addition, we analyze the applications of artificial intelligence models in the prediction of ocean components, including physics-driven numerical models, model-driven statistical models, traditional machine learning models, data-driven deep learning models, and physical models combined with artificial intelligence models. This review shows the growth routes of the application of artificial intelligence in ocean observation, ocean phenomena identification, and ocean elements forecasting, with examples and forecasts of their future development trends from several angles and points of view, by categorizing the various uses of artificial intelligence in the ocean sector.


Introduction
The study and exploration of the oceans began centuries ago, initially for commercial and military purposes. It subsequently developed into a systematic discipline belonging to a large and important branch of Earth sciences. Over the centuries, human beings have exploited and utilized marine resources and studied marine application technologies, providing an effective means for human understanding of the marine world with multiple sources and scales in spatial information. Research in marine science is inseparable from advanced research methods. We review the development and current research of AI techniques in ocean observation, ocean element forecasting, and numerical models. The pitfalls of poor interpretability of AI techniques have led more scholars to study AI techniques combining physical information (Schneider et al., 2022).
Frontiers in Earth Science 01 frontiersin.org
With the advancement of technology and the advent of big data in recent years, the storage volume of Earth system data has far exceeded tens of petabytes (Agapiou, 2017). Massive volumes of ocean observation and simulation data must be analyzed and processed quickly to produce accurate, real-time forecasts in a shorter period. Emerging technologies such as AI are gradually being applied to ocean research and development and can, to a certain extent, complement traditional numerical forecasting models where they are weakest. Some prediction and forecasting problems cannot easily be described accurately by classical mathematical models and traditional ocean theories, especially regional ocean and climate mechanisms that are not yet well understood (Yu, 2021). As the amount of information acquired from the ocean increases, data-driven AI technology may, in turn, become a strong point in these fields. How to use data-driven AI on ocean data has become a vital issue in ocean AI (Chengcheng and Ge, 2018).
Ocean observation is the foundation for the study and use of the ocean and has a long history. In recent decades, remote sensing technology (Boukabara et al., 2019) and sensor network technology (Lu et al., 2019) have developed rapidly. Chengcheng and Ge (2018) describe the development of ocean big data acquisition, analysis, and application. Ocean observation technology, driven by ocean big data, has gradually been combined with big data enabling technology, blockchain platform technology, and AI application technology, leading to the research and development of smart ocean technology. This technology uses AI to combine satellite remote sensing data, sensor networking, and traditional robotics to make intelligent ocean observations. Observational ocean data inevitably contain errors and gaps, and AI algorithms can be used to reconstruct the data.
Many physical phenomena in the ocean have a critical impact on the Earth's climate, marine ecology, and human activities, e.g., internal waves, heatwaves, and eddies. Traditional methods identify them from satellite data on the basis of physical knowledge. The development of AI has brought new ideas for the application of ocean data, and its use in ocean phenomena identification has demonstrated sufficient accuracy and identification speed. In this paper, we mainly sort out the AI identification methods of internal waves, heatwaves, the ENSO phenomenon, and sea ice using ocean remote sensing data.
Numerical and statistical models are the essential traditional tools for forecasting ocean elements. When AI techniques complement traditional ocean theory and numerical forecasting, data-driven prediction can help compensate for the uncertainty of statistical and numerical forecasting methods. Ocean big data AI has become a new bridge between ocean data and technology, playing an essential role in the detection and simulation of ocean currents, sea ice, and seafloor targets. Although deep learning models show great merits, they still have problems such as poor interpretability and predictions that are not constrained by physical laws. Therefore, more and more researchers are now combining physical information constraints with AI techniques to form physics-driven AI methods, which use physical laws in model building and can better solve scientific problems.
AI techniques in the ocean domain have also been analyzed in related reviews. Sonnewald et al. (2021) focus on three branches covering the field: observations, theory, and numerical modeling. They discuss the historical background and the application of ML to oceanographic exploration, model error, and bias correction, with an emphasis on current and potential uses in data assimilation. However, today's machine learning techniques still encounter development bottlenecks. Their interpretability remains insufficient to allow purely data-driven models to be trusted, so hybrid models that combine machine learning with physical modeling may have better scope for future development (Reichstein et al., 2019; Schneider et al., 2022). The research on physically driven machine learning has not yet matured, so the above reviews provide relatively limited descriptions of this aspect.
In this survey, we demonstrate the development route of ocean observation, identification of ocean processes, and ocean element forecasting by merging numerous instances of AI applications in ocean field technology based on the AI algorithm model. Alongside these instances, we concentrate on the specialized application of AI technology to large data sets. In addition, the article illustrates the considerable study effort on AI models in the fields of ocean element forecasting and ocean phenomena identification, as well as the research directions that have evolved in recent years to combine AI with physical modeling concepts. We have provided proper nouns and abbreviations throughout the text in the Supplementary Materials.

The application of AI in ocean observation
Ocean observation is the basis for studying, developing, and using the ocean. Ocean observation technologies such as buoys, remote sensing, and drones have promoted the development of marine science, and ocean observation capability also reflects a country's comprehensive national power. Through the observation and investigation of the ocean, human beings can more fully understand and use the laws of the ocean in support of their productive lives. This chapter is divided into two main parts (see Figure 1). The papers covered in this section are listed in Tables 1, 2.

History of ocean observation
The first global observations of the ocean date back to the Challenger expedition of 1872-1876 (Sonnewald et al., 2021), which first studied the global distribution of ocean depth, temperature, and salinity and revealed the three-dimensional structure of the ocean. This expedition also significantly increased interest in the oceans, and countries worldwide have since sent exploration vessels to conduct regional or global ocean surveys. In the 1940s, radar precision navigation systems were introduced for submarine detection in World War II, which also brought a revolution in ocean observation. In the second half of the 20th century, satellites began to contribute to ocean observation, from the first artificial satellite launched by the Soviet Union in 1957 to the Seasat satellite launched by the United States in 1978 (Stewart, 1988). In 1985, the World Climate Research Program (WCRP) launched its first international cooperative project, the Tropical Ocean and Global Atmosphere (TOGA). This project focuses on sea-air interactions on seasonal to interannual time scales, mainly in the equatorial eastern Pacific Ocean where ENSO occurs. In this project, observations are made by a three-dimensional network of satellite systems, aircraft, survey ships, ground-based sounding stations, moored buoys, and drifting buoys (Lin and Yang, 2020). Around the turn of the 21st century, several atmospheric and oceanic scientists proposed the Argo project, which opened a new era in ocean environment observation by establishing an ocean observation network and placing many automatic profiling buoys into the ocean. In 2002, China officially announced its participation in the Argo program. After nearly 20 years of effort, the Argo Ocean Observing Network, consisting of more than 3,000 Argo profiling buoys, has been built worldwide. It provides continuous measurements of ocean temperature, depth, salinity, and current velocity down to 2,000 m, and the data are freely available to scientists worldwide for research and application.

Studies have shown that surveyors began to observe sea level as early as the 17th century. The first devices used to observe sea level were graduated rods, often called "tide rods," fixed in a position where the observer could read the instantaneous height of the sea level at any time (Wöppelmann et al., 2006). The point tide gauge was invented in the 1970s; it uses a measuring device to measure the tide height and then records the data through computer processing. This instrument has the advantages of high automation, high accuracy, and ease of use, so it has been widely used.

FIGURE 1
The Application of AI in Ocean Observation. First, the history of ocean observation is presented, followed by a discussion of the inadequacies of conventional ocean observation methods, the strategies for merging ocean observation with artificial intelligence, and then the reconstruction of data.

Observation technology | Content or methods | Papers

Methods | Papers
Traditional reconstruction methods
Dynamic optimal interpolation method | Ma and Jing (2004)
Data interpolation empirical orthogonal function | Liu and Wang (2018)
Improved empirical orthogonal function for data interpolation | Ping et al. (2016)
Reconstruction methods combined with AI
Super-resolution convolutional neural network | Ducournau and Fablet (2016)
Machine learning-based IFBN model | Li et al. (2021)
Evolutionary product unit neural network | Durán-Rosal et al. (2016)
A machine learning-based model | Park et al. (2019)
Since the 19th century, many atmospheric pressure sensors have been used for ocean observations. Sea level pressure (SLP) data are used to calculate trends for diagnosing climate models and constructing climate indices. For example, the tropical Pacific SLP field is used to define multivariate ENSO indices (Wolter and Timlin, 2011). Sea level pressure data also contribute directly or indirectly to many reanalysis products used in climate assessments. Climate-related changes in mean atmospheric load or SLP correspond to sea level changes of about 1 cm for a 1 hPa difference, the so-called inverse barometer effect (Wunsch and Stammer, 1997). Sea level pressure observations can also be used to assess changes in the frequency and intensity of temperate storms and to monitor and predict monsoonal changes and trends in extreme weather.
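The inverse barometer effect follows from hydrostatic balance: a surface pressure anomaly Δp displaces sea level by Δη = -Δp/(ρg). The arithmetic can be sketched as follows; the function name and the seawater density are illustrative assumptions, not values taken from the cited work.

```python
RHO_SEAWATER = 1025.0  # kg/m^3, assumed typical surface seawater density
G = 9.81               # m/s^2, gravitational acceleration


def inverse_barometer_cm(delta_p_hpa: float) -> float:
    """Static sea level response (cm) to a sea level pressure anomaly (hPa)."""
    delta_p_pa = delta_p_hpa * 100.0                # 1 hPa = 100 Pa
    eta_m = -delta_p_pa / (RHO_SEAWATER * G)        # hydrostatic balance
    return eta_m * 100.0                            # metres to centimetres


if __name__ == "__main__":
    for dp in (1.0, -10.0):
        print(f"dp = {dp:+.1f} hPa -> sea level change = {inverse_barometer_cm(dp):+.2f} cm")
```

Under these assumed constants, a pressure rise of 1 hPa lowers sea level by roughly 1 cm, and a 10 hPa drop raises it by roughly 10 cm.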

Defects of traditional observation methods
However, traditional ocean observation methods have several limitations. Drifters used for surface observations are costly to deploy and remain unevenly distributed: the Arctic is largely devoid of drifting buoys, and maintaining a global distribution is complex and requires the cooperation of multiple countries (Centurioni et al., 2019). Shipboard observation techniques also have problems, since the structure of the ship itself and the placement of its sensors can introduce measurement errors (Berry et al., 2004; Popinet et al., 2004). Although moored buoys can provide a variety of realistic, high-quality ocean observations (Bourras, 2006), they cannot achieve global coverage due to maintenance costs and deployment problems.
In recent decades, AI has become popular, and more and more fields are using AI technology, and the ocean is no exception.

Technologies of ocean observation
AI was first introduced in 1956 (Spector, 2006). At first, it was only intended to let machines replace humans in some simple and tedious tasks. In the 21st century, with the accumulation of global quasi-real-time 3D ocean observation data, the growth rate of ocean data has accelerated by 40%, giving rise to ocean big data (Overpeck et al., 2011). Countries have increasingly recognized the importance of big data to social development and national core competitiveness (Chengcheng and Ge, 2018). With the rapid development of computer hardware and software, storage capacity and computing power have improved greatly, and costs have fallen significantly. AI-related technologies such as machine learning, deep learning, and pattern recognition have advanced rapidly, bringing a boon to the marine field. The smart ocean combines AI with the ocean and the in-depth development of ocean information, acting as a nervous system for understanding and planning the ocean. Making full use of ocean big data can help research on coping with climate change, protecting the ecological environment, and preventing natural disasters (Rasouli et al., 2012; Kim et al., 2014; Deo and Şahin, 2015; Rosso et al., 2020; Lou et al., 2021; Syeed et al., 2022).

Remote sensing technique
Satellite remote sensing technology is also a way of ocean observation and constitutes an essential innovation in modern ocean technology (Liu et al., 2017). Figure 2 shows the acquisition process of remote sensing data. Traditional processing of satellite remote sensing data is highly dependent on personnel with specialized knowledge. It requires manual processing and interpretation that combines satellite remote sensing data characteristics, actual conditions, and ocean expertise, so it is time-consuming and its accuracy is difficult to guarantee. Deep learning for remote sensing has flourished with the technological development of AI and the open sharing of remote sensing data sets. Various forms of AI, including machine learning, have been used successfully on many satellite remote sensing problems (Haupt et al., 2008; Hsieh, 2009; Krasnopolsky, 2013; Ball et al., 2017; Boukabara et al., 2019).

FIGURE 2
Remote sensing data acquisition process. Ocean data is taken as input, and the remote sensing information is finally acquired through pre-processing, geometric correction, image enhancement, image cropping, image mosaicking, and color grading in turn.

FIGURE 3
The Identification of Ocean Phenomena. This section describes four ocean phenomena (internal waves, heatwaves, ENSO, and sea ice).
Remote sensing data are fundamental to ocean prediction, revealing new phenomena at critical spatial and temporal scales previously unavailable using in situ observational data alone (Liu et al., 2017). These data provide rich fuel for data-driven deep learning, which in turn offers a promising avenue for making full use of ocean remote sensing big data. This win-win situation has led many researchers to conduct exploratory research on AI-based satellite remote sensing applications, proposing new ideas, methods, and techniques. Li et al. (2020) reviewed the application of supervised classification and target detection based on deep learning networks in extracting several typical ocean phenomena from ocean remote sensing images, such as ocean internal wave detection, coastal flood detection, and sea ice detection. Ducournau and Fablet (2016) used a deep learning-based image super-resolution model to address the downscaling of sea surface temperature data derived from ocean remote sensing. Du et al. (2019) developed the DeepEddy deep learning model to solve the problem of ocean eddy detection using synthetic aperture radar (SAR) image data. Yuan et al. (2020) presented the main results and problems of AI techniques in environmental remote sensing, detailed the traditional neural network and deep learning network structures and the applications of both methods in marine and other fields, and pointed out the technical bottlenecks of AI in combining physical model simulation, incorporating geographic laws, and learning from small samples and transfer learning. Ghaffarian et al. (2021) introduced a remote sensing image processing method based on a deep learning attention mechanism, which can improve the overall accuracy when using deep learning methods for remote sensing image classification, image segmentation, change detection, and target detection.
Therefore, combining AI with satellite remote sensing for ocean observation big data is vital for the development and progress of the ocean field, remote sensing technology, and AI field.

Smart sensor networking
Ocean intelligent sensors are sensors to which AI technology has been applied, and they play a tremendous auxiliary role in ocean observation. With advantages such as small size, easy networking, and resistance to electromagnetic interference, they are expected to become an auxiliary measurement tool in ocean observation. Howe et al. (2010) developed an intelligent sensor network that combines many of the fundamental elements of ocean observing systems, embeds a data assimilation framework, and facilitates adaptive sampling and calibration. Cater et al. (2009) define a new generation of smart ocean sensors that can be located and identified on the World Wide Web. They demonstrate "plug-and-play" interoperability in the field and provide data that can be shared, processed, and presented to end users across many disciplines and applications. Sensors are primarily deployed in networks in the ocean to facilitate ocean observation. An ocean sensor network is an underwater wireless sensor network that typically consists of a variety of ocean sensors, autonomous underwater vehicles, surface research vessels, and in some cases, coastal radars and large gliders. Different types of underwater devices in such a network can communicate through underwater communication technologies to form an underwater wireless network.
Marine sensors, in turn, can perform various sensing and detection tasks for marine applications. Luo et al. (2022) proposed a software-defined intelligent wireless sensor network to solve ocean monitoring problems such as sea surface and subsea monitoring. Marine sensor networks have a wide range of applications and a high potential for ocean observation and prediction.

Intelligent robots
Combining traditional robots with AI for ocean observation has given rise to many intelligent robots, such as underwater robotic fish, marine robots, and oil drilling robots. Traditional robots are programmable machines that perform a series of operations and instructions automatically or semi-automatically. With the addition of AI technology, these robots can operate autonomously and learn independently. A gliding robotic fish with multi-linked fins in a body and/or caudal fin (BCF)-like mode has been designed to exploit marine resources and monitor oil rigs. This robotic fish can swim flexibly and glide efficiently in three dimensions, and its gliding motion and fishlike swimming behavior were analyzed. Yoerger et al. (2021) designed Mesobot, an intelligent underwater robot for observation in the ocean twilight zone, capable of detecting, classifying, and tracking underwater animals.

The reconstruction of ocean big data
For scientific research in the ocean, two characteristics of ocean big data are crucial: accuracy and spatiotemporal continuity. In practice, errors in observation equipment, satellites, weather, and other factors leave ocean observation data missing or unusable, which raises the problem of data reconstruction (Sun et al., 2018). Interpolation algorithms are usually used to reconstruct ocean data. Ma and Jing (2004) used a dynamic optimal interpolation method to assimilate sea surface temperature (SST) data from the Bohai Sea in July, combined with the shelf sea model (HAMSOM) for validation. Liu and Wang (2018) used data interpolating empirical orthogonal functions to fill in missing data in the VIIRS ocean color dataset. Ping et al. (2016) proposed an improved data interpolating empirical orthogonal function algorithm to solve the problem of missing values in spatiotemporal sea surface temperature data.
However, the interpolation process can lose important information, resulting in large data reconstruction errors and posing a huge challenge for traditional data reconstruction methods. AI helps alleviate this problem. Among AI methods, deep learning can learn complex models from a large amount of sample data and can control the model's efficiency.
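The EOF-based interpolation mentioned above can be illustrated with a minimal sketch of iterative truncated-SVD gap filling. This is a simplified variant written for illustration only: the function name, the fixed number of modes, the first guess (mean of the observed values), and the stopping tolerance are all assumptions, and the cross-validated mode selection and mean removal used in published algorithms are omitted.

```python
import numpy as np


def eof_gap_fill(field, n_modes=1, n_iter=100, tol=1e-10):
    """Fill NaN gaps in a (time, space) matrix by iterating a truncated-EOF reconstruction.

    Observed values are never modified; only the gaps are updated.
    """
    field = np.asarray(field, dtype=float)
    missing = np.isnan(field)
    # First guess for the gaps: mean of all observed values (an assumed choice).
    filled = np.where(missing, np.nanmean(field), field)
    for _ in range(n_iter):
        # EOF decomposition of the current estimate via SVD.
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        # Reconstruction from the leading n_modes EOFs only.
        recon = (u[:, :n_modes] * s[:n_modes]) @ vt[:n_modes]
        change = np.abs(recon[missing] - filled[missing]).max() if missing.any() else 0.0
        filled[missing] = recon[missing]
        if change < tol:
            break
    return filled


if __name__ == "__main__":
    t = np.linspace(0.0, 2.0 * np.pi, 20)
    x = np.linspace(0.0, np.pi, 15)
    truth = np.outer(1.0 + 0.5 * np.sin(t), 2.0 + np.cos(x))  # synthetic one-mode field
    gappy = truth.copy()
    gappy[7, 7] = np.nan
    print("filled:", eof_gap_fill(gappy)[7, 7], "truth:", truth[7, 7])
```

Choosing too many modes lets the reconstruction reproduce the first guess in the gaps rather than the underlying pattern, which is why the number of retained modes matters in practice.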

The identification of ocean phenomena
This chapter is divided into four main parts (see Figure 3). The significant papers covered in this section are listed in Table 3.

Internal waves
Ocean internal waves are a wave phenomenon that occurs in stably stratified oceans, where density varies with depth because of differences in temperature and salinity (Zheng et al., 2021b). Large amplitudes, long crests, and long propagation distances characterize internal waves (IW). Internal waves occur at all ocean depths and differ from other waves in that they play an essential role in transmitting the energy of mesoscale and large-scale motions (Dong et al., 2022a). In addition, internal waves cause disturbances at the ocean's surface, which can cause problems for maritime transport. Internal waves carry considerable energy, which can modify the atmosphere, and they play a critical role in ocean acoustics, ocean mixing, marine engineering, and submarine navigation (Vasavi et al., 2021). As a result, internal waves have attracted extensive research interest.
Scientists have long recognized the potential of using satellite imagery to study IW. Satellite images can complement in situ observations in studying the generation, propagation, evolution, and dissipation of IW. SAR is an active sensor that measures the roughness of the sea surface. It is unaffected by cloud cover and can image the sea surface at a spatial resolution of 1 m to tens of meters in all-weather, day, and night conditions. In principle, the internal wave-induced flow interacts with the sea surface and modulates the distribution of small surface slopes, so an optical sensor receives the sunlight specularly reflected by those modulated slopes. As a result, internal waves appear as bright and dark bands on optical remote-sensing images, and the parameters of internal waves are indirectly reflected by the stripes on these images (Pan et al., 2018). In layperson's terms, internal oceanic waves appear as irregular streaks of alternating light and dark in SAR images. This feature makes SAR imagery the primary means by which scientists study internal waves.

Ocean phenomena | Methods | Papers
Internal waves | SegNet-based internal wave segmentation algorithm | Zheng et al. (2021a)
Internal waves | DL model with inverting internal wave amplitudes | Pan et al. (2018)
Internal waves | R-CNN model learning from multiple data sources | Zheng et al. (2022)
Heatwaves | Decision trees and random forests | Asadollah et al. (2022)
Heatwaves | Multi-task fully connected neural network |
Nowadays, machine learning is developing rapidly and is also being used to solve some of the problems in the ocean. It shows powerful advantages, for example, in studying internal waves. The general flow is shown in Figure 4. As mentioned above, internal oceanic waves appear as irregular streaks of alternating light and dark in SAR images. However, this feature can easily be confused with other similar oceanic phenomena. In addition, during the polarity transition of internal waves, the separation distance between the light and dark bands becomes wider (Zheng et al., 2021a). Segmenting the internal wave stripes is therefore necessary to determine the position of internal waves in SAR images, and a considerable amount of research has been devoted to this task. Zheng et al. (2021a) proposed a SegNet-based oceanic internal wave stripe segmentation algorithm, which can obtain the positions of internal waves. The results show that the method can identify whether a SAR image contains internal waves, obtain the respective positions of the light and dark stripes in the image, and accurately determine their relative positions. Zheng et al. (2021b) used a support vector machine (SVM) to classify SAR images and select those containing internal waves. Then, the Canny edge detection method was used to detect and identify the oceanic internal wave stripes in the SAR images. These stripes were filtered by three parameters, namely length, area ratio, and direction, and finally, the positions of the oceanic internal wave stripes were obtained. Zheng et al. (2022) proposed a Mask R-CNN-based algorithm for the segmentation of internal waves. The results show that the proposed method can identify the presence or absence of internal waves and obtain the respective positions of light and dark stripes in the image.
In addition, based on the relative positions of the identified light and dark stripes, the time of polarity transition of internal oceanic waves can be further determined. Another category of research is dedicated to inverting the parameters of internal waves. For example, Vasavi et al. (2021) combine numerical and machine learning methods. They first perform noise removal and then enhance the SAR images using a convolutional neural network (CNN), followed by segmentation and extraction of wave parameters such as frequency, amplitude, longitude, and latitude using U-Net. Finally, to model the internal waves, a Korteweg-de Vries (KdV) solver takes the internal wave parameters as input and gives the velocity and density maps of the internal waves. Zhang et al. (2022) developed an AI-based wave amplitude inversion model from laboratory experiments and in situ and satellite observations using a transfer learning approach. The results are more accurate than those of the conventional KdV equation. Pan et al. (2018) introduced a deep learning model to invert internal wave amplitudes based on many optical remote sensing images and to investigate the relationship between internal wave amplitudes and the characteristic parameters of remote sensing images. The inversion results are in good agreement with the observed data. While the above studies have almost always used a single data source, Drees et al. (2020) use a multimodal deep learning algorithm to obtain information from multiple data sources. The authors utilize a multimodal neural network approach, SONET, which is trained jointly on two modalities, radiometric full-resolution image (OLCI) and Water data product (SRAL). A joint representation of the two modalities is then obtained and compressed by the fully connected (FC) layer until the output layer returns a classification result. Good results in internal wave recognition were achieved. This multimodal approach may become a hot topic for future internal wave research.
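To make the KdV model referred to above more concrete, the sketch below evaluates the classical solitary-wave solution of the KdV equation, η_t + c0 η_x + α η η_x + β η_xxx = 0, for a two-layer ocean. The two-layer idealization, the function names, and the example layer depths and density contrast are assumptions chosen for illustration; they are not the configuration used in the studies cited here.

```python
import math

G = 9.81  # m/s^2, gravitational acceleration


def two_layer_kdv_coefficients(h1, h2, delta_rho, rho0=1025.0):
    """Linear speed c0, nonlinear coefficient alpha, and dispersion beta
    for a two-layer fluid (upper depth h1, lower depth h2, in metres)."""
    g_reduced = G * delta_rho / rho0
    c0 = math.sqrt(g_reduced * h1 * h2 / (h1 + h2))
    alpha = 1.5 * c0 * (h1 - h2) / (h1 * h2)
    beta = c0 * h1 * h2 / 6.0
    return c0, alpha, beta


def soliton_displacement(x, t, eta0, c0, alpha, beta):
    """Interface displacement eta0 * sech^2((x - V t) / width) of a KdV soliton."""
    if alpha * eta0 <= 0:
        raise ValueError("amplitude must have the same sign as alpha")
    speed = c0 + alpha * eta0 / 3.0                    # nonlinear phase speed V
    width = math.sqrt(12.0 * beta / (alpha * eta0))    # characteristic half-width
    return eta0 / math.cosh((x - speed * t) / width) ** 2


if __name__ == "__main__":
    # Assumed example: thin upper layer, so alpha < 0 and the wave is one of depression.
    c0, alpha, beta = two_layer_kdv_coefficients(h1=50.0, h2=250.0, delta_rho=3.0)
    print(f"c0 = {c0:.2f} m/s, alpha = {alpha:.4f} 1/s, beta = {beta:.0f} m^3/s")
    print("displacement at crest:", soliton_displacement(0.0, 0.0, -20.0, c0, alpha, beta))
```

The sign of α changes when the upper layer becomes thicker than the lower one, which is the polarity transition between depression and elevation waves that the stripe-ordering methods above try to detect.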

Heatwaves
Heatwaves are a major cause of weather-related deaths (Robinson, 2001). Heatwaves become increasingly severe, long-lasting, and recurrent as global temperatures rise (Asadollah et al., 2022). In recent years, heatwaves have received widespread attention for their wide-ranging impacts on human health, ecosystems, agriculture, and the economy (Gao et al., 2018). The prediction of heatwaves is the primary task in heatwave research. Defining a heatwave is one of the main obstacles to heatwave prediction and analysis. Heatwaves are usually defined according to their location and timing (Perkins and Alexander, 2013; You et al., 2017). The most common definition is the accumulation of excessive sensible heat resulting in a heat load (Sanderson et al., 2017; You et al., 2017). Based on this heat load concept, Khan et al. (2019a) and Perkins-Kirkpatrick and Gibson (2017) use different temperature thresholds and periods to define heatwaves.
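A threshold-and-duration definition of the kind described above can be sketched in a few lines. The threshold value, the minimum duration of three days, and the function names are illustrative assumptions rather than the definitions used in the cited studies.

```python
def find_heatwaves(daily_tmax, threshold, min_days=3):
    """Return (start_index, length) for each run of at least min_days
    consecutive days whose temperature exceeds threshold."""
    events = []
    run_start = None
    for day, temp in enumerate(daily_tmax):
        if temp > threshold:
            if run_start is None:
                run_start = day          # a new hot spell begins
        else:
            if run_start is not None and day - run_start >= min_days:
                events.append((run_start, day - run_start))
            run_start = None             # the spell (if any) has ended
    # Close a spell that is still running at the end of the record.
    if run_start is not None and len(daily_tmax) - run_start >= min_days:
        events.append((run_start, len(daily_tmax) - run_start))
    return events


def heatwave_days(events):
    """Total number of days that belong to any detected heatwave."""
    return sum(length for _, length in events)


if __name__ == "__main__":
    temps = [30, 36, 37, 38, 31, 36, 36, 29, 36, 37, 38, 39]  # made-up daily maxima, deg C
    events = find_heatwaves(temps, threshold=35.0, min_days=3)
    print(events, heatwave_days(events))
```

Changing the threshold or the minimum duration changes which spells count as heatwaves, which is why studies that use different definitions are hard to compare directly.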
Heatwaves are usually predicted using temperature prediction models, mainly classified as statistical or dynamic models. Dynamic models rely on physical interactions between the ocean, atmosphere, and land to develop predictive models, which makes model development computationally intensive. As a result, statistical models are widely used in developing heatwave prediction models (Perkins-Kirkpatrick and Gibson, 2017; Gao et al., 2018). Machine learning has a huge advantage in learning the complex non-linear interactions of heatwaves with large-scale atmospheric variables. Today, many studies use machine learning to predict heatwaves and report good results. Asadollah et al. (2022) develop a physical empirical model using two classical machine learning algorithms, decision trees (DT) and random forests. It employs a novel hybrid technique of Ada-Boost regression and decision trees (ABR-DT) to predict the annual number of heatwave days (HWD). The annual variability of HWD is effectively modeled, and experiments show that the model can be used to forecast heatwave years. Iglesias et al. (2015) developed a multi-task deep, fully connected neural network for predicting heatwaves, trained on historical time series data. Experimental studies have shown that the neural network is a general method that can be applied to heatwaves or various other climate problems. Chattopadhyay et al. (2020) propose a data-driven extreme weather prediction framework based on analog forecasting with a novel deep learning pattern recognition technique, capsule networks (CapsNet), which can achieve 80% accuracy for heatwaves. Khan et al. (2019b) present a statistical model called quantile regression forest (QRF) for predicting heatwaves with different time lags in Pakistan using weather and climate variables. The study demonstrates the strength of the QRF model in predicting conditional quantiles, which helps explain some of the extreme temperature behavior. Khan et al. (2021) use machine learning (ML) algorithms such as SVM, random forests, and artificial neural networks to develop a climate change resilient heatwave prediction model that has been shown to provide reliable predictions under climate change scenarios. Jung et al. (2020) used convolutional long short-term memory (ConvLSTM) to predict SST in the South Sea of Korea for up to 7 days. The study also examined anomalously high SST predictions based on three ocean heatwave categories (i.e., warning, caution, and watch out). The study shows that ConvLSTM can successfully predict ocean heatwaves up to 5 days in advance.

ENSO
ENSO is the world's largest coupled ocean-atmosphere mode, occurring in the equatorial central and eastern Pacific Ocean, influencing climate around the world, and significantly impacting ecological and agricultural development. ENSO not only changes the state of the Pacific Ocean and atmosphere but also has a significant impact on global climate, precipitation (Ropelewski and Halpert, 1987), and ecosystems in remote areas (Adams et al., 1999). Therefore, the analysis and prediction of ENSO are critical. However, traditional analytical models face challenges due to insufficient data, the spring predictability barrier (SPB), and model uncertainty. To address these issues, researchers have begun to apply AI techniques to ENSO studies to explore the impact of ENSO on extreme global climate change. With the rapid development of AI techniques, more and more researchers are applying them to the analysis and prediction of ENSO. Nooteboom et al. (2018) used a hybrid model combining the classical linear statistical autoregressive integrated moving average (ARIMA) method (Box and Pierce, 1970) with artificial neural network (ANN) methods. The hybrid model gives slightly better prediction results than the traditional numerical model, using the potential of machine learning (ML) to overcome the SPB (Guckenheimer et al., 2017). Mekanik and Imteaz (2012)  ConvLSTM outperforms almost all running models for the prediction of ENSO events when considering post-spring predictions. By combining the time series modeling features of LSTM with the multidimensional data processing properties of CNN, Gupta et al. (2020) successfully overcame the SPB using the ConvLSTM model and predicted monthly averages of the Niño3.4 index 12 months in advance. Ye et al. (2021) proposed a parallel deep convolutional neural network, MS-CNN, which could not accurately predict the Niño3.4 index of strong ENSO events and whose prediction error increased with lead time. However, the model was more accurate than other models in predicting ENSO in spring.

Sea ice
Polar sea ice is a sensitive indicator of global climate change. Information on sea ice types is essential for ship navigation and for predicting polar climate change. However, because of their large size and harsh environment, most polar regions are difficult to access, and the cost of fieldwork is very high. Ice extent and thickness are decreasing throughout the Arctic, and as the ice melts, pockets of ice become more mobile, allowing hazards such as ice floes to disperse. Sea ice retreat, especially in the Arctic, is one of the essential indicators of global climate change. Therefore, information on sea ice cover and concentration is essential for climate change studies, polar navigation, and successful offshore operations.
SAR has proven to be an ideal remote sensing technique for generating detailed sea ice information because of its inherent ability to image surfaces at high resolution independent of daylight and weather conditions. In addition, its polarization capability allows SAR to respond differently to sea ice types and open water. Since the launch of the first civilian SAR instrument, Seasat, in 1978, polar sea ice monitoring has been the primary mission of satellite-based SAR operations. To process the large amount of available data in real time, automated methods are needed to detect ice floes. Applying machine learning algorithms to sea ice classification has long been a focus of interest. Features are manually selected and extracted and then fed into traditional machine learning algorithms, such as SVM (Leigh et al., 2013; Liu et al., 2014; Zakhvatkina et al., 2017), random forest (RF) algorithms (Tan et al., 2018), or artificial neural networks (Ressel et al., 2015). Automatic sea ice segmentation (ASIS) (Soh and Tsatsoulis, 1999), developed in 1999, combines image processing, data mining, and machine learning to segment unpolarized SAR images automatically. Another sea ice classification algorithm (Scheuchl et al., 2005) uses the Wishart algorithm to classify fully polarized single- and dual-frequency SAR data of sea ice and achieves good results. Kim et al. (2015) used decision tree and random forest machine learning methods to map landfast sea ice in Antarctica. The performance of traditional machine learning depends heavily on feature selection. In contrast, data-driven deep learning models, which extract features automatically, have come to the forefront in recent years.
CNN is a classical deep learning model, but its low-level features perform poorly in sea ice detection tasks. Therefore, Gao et al. (2019) introduced multilayer feature fusion to exploit the complementary information between low-, medium-, and high-level feature representations and proposed a transferred multilevel fusion network (MLFN) model to achieve stronger feature extraction. Chen et al. (2017) proposed a combination of CNN and conditional random fields (CRF) to address the poor localization of CNN. In addition, Wang and Li (2021) stacked multiple U-Net models with different specializations for the sea ice segmentation task based on a multi-feature fusion approach, achieving higher accuracy than any individual classifier. Cooke and Scott (2019) used a DenseNet model to estimate sea ice concentration by training on a dataset containing SAR images. Building on this, Kruk et al. (2020) proposed a new deep learning algorithm for predicting sea ice development stages, using a combination of SAR images and Canadian Ice Service (CIS) ice charts to create labeled datasets and completing a comparative analysis of DenseNet and U-Net performance.

Ocean element forecasting
This section is divided into four parts (see Figure 6). The papers covered in this section are listed in Table 4.

Numerical models and statistical models
Numerical ocean models have played an essential role in ocean observation and ocean forecasting since their inception in the last century. More than 40 ocean data models are now available, including numerical ocean models for different seas and different ocean sciences.
Many countries have established such numerical ocean models, for example, the HYCOM model applied globally by the National Centers for Environmental Prediction (NCEP) of the United States; the LIM2 model of the United Kingdom Met Office; the CONCEPTS model of Canada; and the MRI.COM model of the Japan Meteorological Agency, which has developed a global ocean forecasting system and a regional nested high-resolution ocean forecasting system. For marine disaster forecasting and early warning, the United States, Japan, and other countries have developed the GFDL model, the T213L3 model, etc. Operational forecasting systems based on marine ecodynamics have been deployed internationally for marine ecological environment forecasting. The European Regional Seas Ecosystem Model (ERSEM) contains several refined cyclic processes that can characterize the vital biochemical processes in the shelf sea ecosystem. This model, coupled with a hydrodynamic model, is operational at the Met Office in the United Kingdom and can provide ecosystem health, water quality monitoring, harmful algal bloom prediction, and other product services in the northwest European shelf sea (Yu, 2021).
Numerical models are characterized by physical law constraints and require high computational and time costs to compute large models, resulting in a lack of ease of use and convenience.

Model-driven statistical models and traditional machine learning models

Statistical models
Numerical prediction models have long held the main position in marine environmental forecasting because, in the early days, observation methods were limited, observational records were short, and data samples were small. Numerical models are constrained by physical laws and require high computational and time costs for large models, making them less easy to use. As an adjunct to numerical models, statistical models have also been helpful in ocean data observation. Statistical models are not constrained by physical laws, are more concerned with correlations among ocean data, and are therefore more flexible. Statistical models in common use today include support vector regression (SVR), Markov models, etc. The autoregressive (AR) model, the autoregressive moving average (ARMA) model, and the autoregressive integrated moving average (ARIMA) model are classical time series models. AR models have been used to predict the significant wave height along the Portuguese coast (Soares et al., 1996; Soares and Cunha, 2000). Agrawal and Deo (2002) used ARIMA and ARMA models to predict waves at multiple intervals along the Indian coast. Although classical time series models are well established, their assumptions of linearity and stationarity make them unsuitable for complex ocean conditions. Significant wave heights are generally non-linear and non-stationary, so classical time series models cannot predict them accurately.

FIGURE 6
Ocean element forecasting. Four components comprise the AI approach to ocean element forecasting: physics-guided numerical models, model-guided statistical and traditional machine learning models, data-guided deep learning models, and the combination of physical models and AI. Each section describes the most prevalent model algorithms and model optimization techniques in the area of ocean element forecasting.
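As an illustration of the autoregressive models discussed in this subsection, the following minimal sketch fits a first-order AR model, x[t] = c + phi * x[t-1], by ordinary least squares and iterates it forward. The wave height series is synthetic and noise-free, and the function names `fit_ar1` and `forecast_ar1` are hypothetical; none of this reproduces the models of the cited studies.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] for a 1-D series."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion forward for a number of steps."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Synthetic wave heights in metres generated by a noise-free AR(1) process
# with c = 0.4 and phi = 0.8 (illustrative values, not observations).
wave_height = [1.0]
for _ in range(30):
    wave_height.append(0.4 + 0.8 * wave_height[-1])

c, phi = fit_ar1(wave_height)
forecast = forecast_ar1(wave_height[-1], c, phi, steps=3)
```

Because the recursion is linear with fixed coefficients, the sketch also makes the limitation noted above concrete: a series whose dynamics are non-linear or change over time cannot be captured by a single pair (c, phi).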
The principle of the SVR model is to find the optimal hyperplane that minimizes the distance from all data points in the sample to the hyperplane. In a prediction study of chlorophyll-a, Amorim et al. (2021) evaluated and compared the SVR, MLP, RF, and ARIMA algorithms. SVR reached the best R² (0.78) and RMSE (1.113 µg L⁻¹), although these results were only slightly better than those of MLP (0.76; 1.144 µg L⁻¹) and RF (0.75; 1.189 µg L⁻¹) (Amorim et al., 2021). The advantage of the SVR model in forecasting is that it avoids the complexity of the high-dimensional ocean data space and solves the corresponding high-dimensional decision problem directly in the linearly separable case. When the kernel function is known, it simplifies the analysis of the high-dimensional problem. The disadvantage is that the model requires extensive parameter tuning and consumes more memory for storing the training samples and the kernel matrix.
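The two scores quoted in the comparison above can be computed as follows. This is a sketch using the common definitions (root mean square error, and the coefficient of determination as one minus the ratio of residual to total sum of squares); the cited study may define R² differently, for example as a squared correlation, and the sample values are made up for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical chlorophyll-a concentrations (illustrative values only).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

error = rmse(observed, predicted)       # every error is 0.5, so RMSE is 0.5
score = r_squared(observed, predicted)  # SS_res = 1.0, SS_tot = 20.0
```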

Traditional machine learning models
Traditional machine learning (ML) is an effective empirical method. It is a collection of algorithms (e.g., neural networks, support vector machines, decision trees, random forests, genetic programming, etc.) that can solve multivariate, non-linear, nonparametric regression or classification problems. This modeling capability makes traditional ML well suited to solving problems in the Earth and marine sciences. The application of traditional ML-based approaches is divided into three domains (Lary et al., 2016): 1. The system's deterministic models are computationally intensive, and machine learning can be used as a code accelerator tool. 2. No deterministic model exists, but an empirically based ML model can be derived using existing data. 3. Classification problems that aim to identify specific spatial processes or events.
The generalized additive model (GAM) is characterized by simplicity and ease of interpretation. However, this simplicity is also its most significant weakness because, in the real world, the relationship between features and results is often strongly non-linear. GAM models are particularly suitable for analyzing time series datasets in the Earth and marine sciences. Time series signals can often be explained by multiple additive components, such as trends, seasonality, and daily fluctuations, which can be easily incorporated into GAM models. Researchers have widely used GAM models and their derivatives in modeling analysis to predict changes in SST (Miftahuddin, 2016; Humaira et al., 2019).
Moving from statistical models such as GAM to machine learning models, random forest (RF) performs well in complex prediction problems characterized by non-linear dynamics. RF is a classification and regression algorithm based on the aggregation of many decision trees and is widely used for sea salinity and wave prediction (Callens et al., 2020).
RF is known for its good performance with little hyperparameter tuning. As with all machine learning models, the bias-variance trade-off must be considered: a model may overfit and fail to generalize to new data, or it may underfit and fail to learn the features of the training data.
Extreme gradient boosting (XGBoost) belongs to the Gradient Boosting Decision Tree (GBDT) family of models. It shares many features and advantages with RF (interpretability, predictability, and simplicity), but the key difference is that its decision trees are constructed sequentially, each one improving on the performance of those before it. Jin et al. (2020) used XGBoost to estimate the intensity of tropical cyclones in the South China Sea. XGBoost and the related boosting algorithm AdaBoost have also been applied to wave height prediction in Malaysian waters (Anggraeni et al., 2021). However, the drawback of XGBoost is also apparent: its time and memory overhead is large, which makes it poorly suited to progressively larger training datasets.
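The sequential construction that distinguishes boosting from RF can be sketched with one-split regression trees (stumps): each new stump is fitted to the residuals left by the ensemble so far. This is plain squared-error gradient boosting on a single feature, written as an assumption-laden illustration; it omits the regularization and second-order terms of XGBoost, and the data and function names are hypothetical.

```python
def fit_stump(x, residual):
    """Best single-threshold regression stump under squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # keep both sides non-empty
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, rounds, learning_rate=0.5):
    """Fit stumps one after another, each on the current residuals."""
    base = sum(y) / len(y)
    prediction = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, prediction)]
        stump = fit_stump(x, residual)
        stumps.append(stump)
        prediction = [pi + learning_rate * predict_stump(stump, xi)
                      for pi, xi in zip(prediction, x)]
    return base, stumps, prediction


def mse(y, prediction):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, prediction)) / len(y)


# Hypothetical predictor (e.g., wind speed) and wave heights, illustrative only.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0.1, 0.3, 0.8, 1.5, 1.4, 0.9, 0.4, 0.2]

_, _, pred_0 = boost(x, y, rounds=0)    # mean-only baseline
_, _, pred_1 = boost(x, y, rounds=1)
_, _, pred_20 = boost(x, y, rounds=20)
```

In an RF the trees could be built independently and in parallel; here each round depends on the previous prediction, which is the source of both the accuracy gains and the training cost mentioned above.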
LightGBM was developed mainly to solve the problems that GBDT encounters with massive data, so that GBDT can be used better and faster in practice. While retaining the advantages of XGBoost, LightGBM optimizes the traditional GBDT algorithm, speeding up the training of GBDT models without compromising accuracy. Considering the effective performance of LightGBM models in regression, Gan et al. (2021) developed a LightGBM model for predicting estuarine water levels in the lower Columbia River region, which outperformed the commonly used NS_TIDE model overall. Su et al. (2021) combined LightGBM with Ocean and Land Colour Instrument (OLCI) and in situ data to estimate near-coastal chlorophyll-a concentrations and map their spatial distribution.
Variables that may play a role in predicting elements of the ocean environment include atmospheric conditions (temperature, solar radiation, cloud, precipitation, wind speed, etc.), autoregressive features (past values of the elements), temporal information (seasons), and variables at adjacent locations. Feature engineering uses domain expertise and statistical analysis to extract the most appropriate set of features for a given problem from the entire dataset, improving prediction accuracy and accelerating model convergence by selecting the features most relevant to the response variable. Figure 7 shows the basic steps of feature engineering. For machine learning, raw data rarely provide the most information directly, so combinations and transformations of the raw data must be considered.
Feature engineering is an essential part of machine learning. Its quality largely determines the final result of machine learning, which is also a disadvantage of machine learning compared with deep learning.
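The steps described above (constructing autoregressive and temporal features from raw data, then selecting those most relevant to the response variable) can be sketched as follows. The variable names, the synthetic SST and wind series, and the use of absolute Pearson correlation as the relevance score are all assumptions made for illustration; practical studies typically combine several selection criteria.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)


def build_features(sst, wind, months, lag=1):
    """Build lagged, atmospheric, and cyclic seasonal features."""
    features = {"sst_lag": [], "wind": [], "month_sin": [], "month_cos": []}
    target = []
    for t in range(lag, len(sst)):
        angle = 2 * math.pi * (months[t] - 1) / 12
        features["sst_lag"].append(sst[t - lag])      # autoregressive feature
        features["wind"].append(wind[t])              # atmospheric condition
        features["month_sin"].append(math.sin(angle)) # seasonal information
        features["month_cos"].append(math.cos(angle))
        target.append(sst[t])
    return features, target


def rank_features(features, target):
    """Order feature names by absolute correlation with the target."""
    scores = {name: abs(pearson(column, target))
              for name, column in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Two synthetic years of monthly data (illustrative values only).
months = list(range(1, 13)) * 2
sst = [20 + 5 * math.cos(2 * math.pi * (m - 1) / 12) for m in months]
wind = [5 + (t * 7) % 5 for t in range(len(months))]

features, target = build_features(sst, wind, months)
ranking = rank_features(features, target)
```

Encoding the month as a sine and cosine pair, rather than as an integer from 1 to 12, keeps December and January adjacent, which is one example of a transformation of raw data that carries more information than the raw value.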

Data-driven deep learning models
As Big Data brings new perspectives to scientific research, more and more disciplines are moving toward data-driven analysis, including oceanography. Data-driven deep learning techniques have recently been widely used in forecasting ocean elements. By mining the correlations between ocean elements and extracting spatial and temporal features from large amounts of ocean data, mathematical models for ocean element forecasting have been established that take realistic data as input and produce accurate forecasts. Commonly used AI methods include ANN, CNN, LSTM, and ConvLSTM. These methods have been widely used in forecasting ocean elements, such as sea wind, sea surface height (SSH), SST, etc. Because of the poor interpretability and lack of physical constraints of AI, such deep learning methods cannot replace current traditional numerical modeling methods. Combining AI with numerical ocean models is a significant direction for future development.

ANN
ANN has been widely used to predict ocean parameters because of its higher predictive efficiency than traditional statistical techniques. Although neural networks lack interpretability in terms of physical mechanisms, they can sometimes be used to complement or replace traditional methods for optimization problems. Their excellent performance has encouraged researchers to use neural network models in many practical applications. ANN was first applied to sea level prediction in a 1997 study (Röske, 1997). Since then, ANN has been more widely adopted in marine research. For example, ANN has been applied to ocean salinity (Chen and Hu, 2017) and wave prediction (Ducournau and Fablet, 2016) in specific sea areas. Some extreme marine meteorological phenomena, such as storm surges and extreme sea level heights, can also be estimated using ANN (Sahoo and Bhaskaran, 2019). Moreover, relatively simple neural networks can also accurately predict surface currents in most of the global navigable oceans (Sinha and Abernathey, 2021). Chen et al. (2022b) used a deep neural network (DNN) model to retrieve vertical profiles of chlorophyll-a from surface ocean data, an approach that improves predictive power compared to shallow ANN (Das and Roy, 2019).

FIGURE 7
Basic steps of feature engineering. From a large amount of initial feature data, the features that are not relevant to the problem are eliminated, the features that are more relevant to the problem are selected, and the subset of features that are most important to the problem is generated.

CNN
CNN plays an essential role in predicting ocean environment elements and extreme climate events by extracting image features that reflect spatial information (Huang et al., 2022). Jiang et al. (2018) combined shallow learning (SL) and CNN to forecast atmospheric typhoon events using SST. In a comparison experiment using meteorological parameters to forecast SST in Japanese waters, CNN identified extreme phenomena such as typhoons with higher accuracy than LSTM and deep MLP (Patil and Iiyama, 2022a). For marine ecology, chlorophyll-a prediction can be performed using CNN models at two different scales for overall and local training (Jin et al., 2021); the average RMSE of CNN Model II (7 × 7) was 0.191, significantly lower than the 0.463 of CNN Model I (48 × 27). Since CNN alone extracts only spatial feature information, it does not perform well in ocean element forecasting, which involves both spatial and temporal features.

RNN
RNN is a neural network with a "memory" function: the output at each time step also depends on the previous output. It was designed to track the temporal dependence within sequences of ocean elements, which makes RNN useful in marine environmental forecasting. In wave prediction, RNN achieves a higher correlation coefficient than a feedforward neural network (Mandal and Prabaharan, 2006). However, RNN suffers from the vanishing gradient problem. To overcome this problem, input, output, and forget gates are added to the RNN structure, producing the long short-term memory (LSTM) network. The powerful memory capability of LSTM has been demonstrated on time-series data in marine environment forecasting (Song et al., 2021). LSTM has been used to predict SSH (Zhang et al., 2017). The first study to use time series with FC-LSTM to predict SST was followed by comparative experiments applying RNN and LSTM to SST prediction (Aydınlı et al., 2022). An integrated model of LSTM superimposed on MLP also obtained better results in predicting SST (Jahanbakht et al., 2021). SWAN-LSTM models integrating near-coastal wave models have also emerged to achieve near-coastal wave height predictions. A series of experiments has also applied LSTM to currents, sea level, and even the ENSO phenomenon (Broni-Bedaiko et al., 2019; Ishida et al., 2020; Zulfa et al., 2021). LSTM models were also shown to be effective in predicting chlorophyll-a concentrations (Cho and Park, 2019; Rostam et al., 2021; Cen et al., 2022). These experiments have demonstrated the robust performance of the LSTM model. In addition, LSTM has a bidirectional variant, bi-directional long short-term memory (BiLSTM). In a deep learning architecture based on these two models for wind speed prediction in the Indian Ocean region, the BiLSTM model proved to be the most effective (Biswas and Sinha, 2021).
The LSTM model introduces many parameters that make training difficult, so the LSTM was simplified to produce the gated recurrent unit (GRU), which merges the forget gate and the input gate into a single update gate. It also merges the cell state and the hidden state, with some other modifications. The GRU trains faster and has a more straightforward structure than the LSTM, and it also performs well in marine environmental forecasting. Zhang et al. (2017) built medium- and long-term SST prediction models based on the GRU model, mitigating overfitting and underfitting. Sukanda and Adytia (2022) used GRU and bidirectional GRU (BiGRU) for wave prediction in the Indonesia region. The GRU model obtained relatively good values of MSE, RMSE, and R² (0.0443, 0.2106, and 0.9756, respectively), while the BiGRU model achieved a higher R² of 0.9869, with an MSE of 0.0477 and an RMSE of 0.2184. Tropical cyclones often cause intense SSH changes at sea (Meng, 2022). Meng et al. (2021a) and Song et al. (2022b) used bidirectional GRU to predict tropical cyclone tracks and the SSH changes caused by tropical cyclones. The model evolution of RNN is shown in Figure 8. Although RNN and its derived networks are effective in time series ocean element forecasting, they do not extract spatial information from marine spatio-temporal data. Because they extract only temporal features, significant room for improvement remains.

FIGURE 8
Evolution of the RNN family of models. RNN, LSTM, and GRU model structures are shown from left to right. The LSTM adds three gates (forget gate, input gate, and output gate) to the RNN, and the GRU is a variant of the LSTM that synthesizes forget and input gate into a single update gate.
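The gate structures summarized in Figure 8 can be written out explicitly. The equations below give one common formulation of the LSTM and GRU cells, offered as a sketch rather than the exact variant used in any cited study. The notation is an assumption not defined in the text above: x_t is the input, h_t the hidden state, c_t the LSTM cell state, W and U are weight matrices, b are bias vectors, \sigma is the logistic sigmoid, and \odot is the element-wise product. Some references swap the roles of z_t and 1 - z_t in the final GRU update.

```latex
% LSTM: forget, input, and output gates acting on a separate cell state
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}

% GRU: a single update gate z_t replaces the forget and input gates,
% and the cell state is merged into the hidden state
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

The additive update of c_t (and of h_t in the GRU) is what lets gradients propagate over many time steps, which is the motivation for adding gates to the plain RNN.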

CNN+RNN
To make RNN incorporate spatial dependency features in addition to the extracted temporal features, researchers combined CNN with RNN. They successfully introduced spatial dependency extraction into time series information to further improve the performance of deep learning in forecasting ocean environment elements. Braakmann-Folgmann et al. (2017) combined CNN and RNN to analyze SSH evolution and predict the sea level anomaly (SLA) in both the temporal and spatial dimensions. Wei and Chang (2021) used a GRU+CNN model to predict typhoon-induced sea winds and wave heights around Taiwan Island. Mahesh et al. (2019) combined CNN and RNN to predict ENSO using SST, with accuracy close to that of the state-of-the-art seasonal dynamic prediction model. Ren et al. (2022) constructed a C-LSTM model, which combines a CNN with LSTM, to predict typhoon paths with better results than a single LSTM model. The average surface distance error of a single LSTM model is 11,503 m, while the prediction error of CNN+LSTM is only 832 m.

ConvLSTM
Introducing spatial features into time series modeling gave rise to the ConvLSTM model. ConvLSTM replaces matrix multiplication with convolution operations for each gate in the LSTM cell (Shi et al., 2015). The difference between ConvLSTM and the CNN+LSTM model is that the former convolves the state of the hidden layer at each step, making ConvLSTM more capable of spatial feature extraction, not only for one-dimensional temporal data but also for two-dimensional spatio-temporal data. Because of its robust feature extraction capability, ConvLSTM has shown outstanding performance in high-precision 2D wave prediction and in short- and medium-term SST prediction (Xiao et al., 2019; Zhou et al., 2021). ConvLSTM has also been applied to invert SST and SSS (Song et al., 2022c). Moreover, ConvLSTM has been compared with 2D-CNN and 3D-CNN models for chlorophyll-a prediction in marine ecological studies, with reported accuracies of 0.8804 for 2D-CNN, 0.9397 for 3D-CNN, and 0.9799 for ConvLSTM (Lee et al., 2020; Wang et al., 2022b). ConvLSTM also has a twin model, ConvGRU, which has been combined with an attention mechanism to extract non-linear features of typhoons and predict typhoon trajectories.
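The replacement of matrix multiplication by convolution can be made explicit with the ConvLSTM cell equations, written here following the formulation of Shi et al. (2015). The notation is introduced for this sketch: * denotes convolution, \circ the element-wise (Hadamard) product, \sigma the logistic sigmoid, and X_t, H_t, and C_t are the input, hidden state, and cell state, all of which are three-dimensional tensors (channels and two spatial dimensions).

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c) \\
o_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```

Because H_{t-1} is convolved at every step, the state at each grid point depends on its spatial neighbourhood in the previous step, which is the property that separates ConvLSTM from a CNN feeding an ordinary LSTM.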

Optimization of AI methods
Based on the deep learning models mentioned above, we analyze the principles and processes of these marine element prediction models. We believe that two main factors affect the prediction performance of the models: dataset optimization and parameter optimization. In AI applications, the construction of datasets is even more critical than the algorithms themselves. Taking the prediction of single-element SST as an example (Patil and Iiyama, 2022b), target SST data with high spatial and temporal resolution are crucial for developing deep learning models. ERA5 is one of the most commonly used reanalysis datasets. JCOPE data have good temporal and spatial scales and are more regionally compatible with ERA5 reanalysis data (Miyazawa et al., 2017). Considering the daily variation of SST, the target SST data cover a total of 153 days, adjusted according to the sensitivity of different models to the dataset. For example, CapsNet is less sensitive to the size of the dataset, while CNN is more sensitive, because the former captures richer feature information from a single image (Chattopadhyay et al., 2020). Theoretically, multi-element data prediction has higher accuracy than single-element prediction. Using complex atmospheric and oceanic parameters to simulate more realistic environmental states helps to constitute richer features for deep learning models (Pathak et al., 2022). Parameter tuning mainly refers to tuning the external parameters, or hyperparameters, of the model, which are optimized through iterative trials or with the help of complex optimization algorithms. In that study, a random search algorithm is used to optimize the hyperparameters because the performance of the DL model is sensitive to them. The hyperparameters include the time duration, the number of hidden layers, the number of cells in each hidden layer, the size of the convolution kernel in the convolution layer, and the setting of the loss function. These hyperparameters need to be evaluated on the validation set to determine the most suitable configuration for the experiment.
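A random (stochastic) search over hyperparameters of the kind listed above can be sketched as follows. The search space values and the `validation_error` function are placeholders invented for illustration: in practice that function would train a model with the sampled configuration and return its error on the validation set.

```python
import random


def random_search(score_fn, space, n_trials, seed=0):
    """Sample configurations at random and keep the lowest-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space mirroring the hyperparameters named in the text.
search_space = {
    "window_days": [3, 7, 14],       # time duration of the input window
    "hidden_layers": [1, 2, 3],
    "units": [16, 32, 64, 128],      # cells in each hidden layer
    "kernel_size": [3, 5, 7],        # convolution kernel size
}


def validation_error(config):
    """Stand-in for 'train the model, then score it on the validation set'."""
    return (abs(config["window_days"] - 7) / 7
            + abs(config["hidden_layers"] - 2)
            + abs(config["units"] - 64) / 64
            + abs(config["kernel_size"] - 5) / 5)


best_config, best_score = random_search(validation_error, search_space, n_trials=30)
```

Scoring on a held-out validation set, rather than on the training data, is what keeps the selected configuration from simply being the one that overfits most.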

Combination of physical models and AI
Traditional machine learning and deep learning have proven effective in ocean and atmospheric science, providing new alternatives for efficiently identifying complex patterns and simulating non-linear dynamics. However, their predictions do not necessarily obey the governing laws of physical systems and cannot be generalized to different systems. Therefore, physics-guided machine learning has emerged, and some results have been achieved. Physics-guided machine learning combines physical laws with machine learning and deep learning models to solve scientific problems better.

Physics-guided loss function
Standard deep learning models have no way to learn the underlying physical laws when fitting data, which leads to low physical consistency and poor generalization. The simplest and most widely used method of incorporating physical constraints is to design physics-guided loss functions, which help deep learning models learn data patterns that conform to physical laws. Jia et al. (2019), Stewart and Ermon (2017), and Daw et al. (2017) guide network learning using a physics-based loss function that measures the violation of physical principles in the neural network's output. Beucler et al. (2021) and Wu et al. (2020a) introduce statistical and physical constraints into the loss function to regulate the predictions of physical simulations.
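A physics-guided loss of the kind described above adds a penalty for violating a physical principle to the usual data-fitting term. The sketch below assumes, purely for illustration, that the physical rule is monotonicity of a predicted vertical profile (for example, density not decreasing with depth); the rule, the weight, and the function names are hypothetical and are not taken from the cited papers.

```python
def mse(predicted, observed):
    """Data-fitting term: mean squared error."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def monotonicity_violation(profile):
    """Mean amount by which a value drops from one depth level to the next."""
    drops = [max(0.0, upper - lower)
             for upper, lower in zip(profile, profile[1:])]
    return sum(drops) / len(drops)


def physics_guided_loss(predicted, observed, weight=1.0):
    """Data term plus a weighted penalty for physically inconsistent output."""
    return mse(predicted, observed) + weight * monotonicity_violation(predicted)


# Values ordered from the surface downward (illustrative numbers only).
observed = [1.0, 2.0, 3.0]
consistent = [1.0, 2.0, 3.0]
inconsistent = [1.0, 3.0, 2.0]   # decreases between the second and third level

loss_consistent = physics_guided_loss(consistent, observed)
loss_inconsistent = physics_guided_loss(inconsistent, observed)
```

Because the penalty depends only on the predictions, it can also be evaluated on inputs for which no observations exist. It acts as a soft constraint: violations are discouraged in proportion to `weight` but are not ruled out.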

Physics-guided design of architecture
Although including physical guidance through the loss function allows the model to learn physical patterns, deep learning models are still black-box models in most cases. Physics-guided neural network structures impose hard constraints that are tighter than the soft constraints of the loss function, so physics-guided structural models are more interpretable and generalizable. For example, an LSTM architecture can incorporate an intermediate physical variable as part of a structure that maintains monotonicity (Daw et al., 2020). The model produces physically consistent predictions and attaches a dropout layer to quantify uncertainty. De Bézenac et al. (2019) proposed a warping scheme for predicting sea surface temperature but only considered the linear advection-diffusion equation. The Turbulent-Flow Net (TF-Net) combines turbulence modeling and deep learning (DL) to produce a novel deep learning model that enhances the ability of deep neural networks to predict complex turbulence (Wang et al., 2020).

Hybrid Physics-ML models
In addition to the two designs mentioned above, which incorporate the physical system into a loss function or neural network structure, physical and data-driven models can be combined into a generalized hybrid model. A straightforward approach is to provide the output of the physics-based model as input to the data-driven model, as visualized in Figure 9. An example is the physics-based machine learning (PBML) model for short-term wave forecasting (Wu et al., 2020b). The inputs and outputs of the PBML model are first determined by the main variables in the physics-based wave model, and machine learning algorithms are then trained to perform multi-step-ahead forecasts. In a framework combining GAN and physical numerical models for predicting SST, the neural network model is first trained using the physics-based numerical model, and the model parameters are then calibrated using observed data (Meng et al., 2021b).
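The straightforward hybrid approach described above, in which the output of a physics-based model becomes the input of a data-driven model, can be sketched with the simplest possible data-driven component: a linear correction fitted by least squares. The numbers are synthetic and the names are hypothetical; the cited PBML and GAN-based frameworks use far richer learners.

```python
def fit_linear(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Step 1: output of a physics-based wave model (synthetic values, metres).
physics_forecast = [1.0, 1.5, 2.0, 2.5, 3.0]

# Step 2: matching observations; here the physical model is assumed to be
# biased so that observed = 1.1 * forecast + 0.2.
observed = [1.3, 1.85, 2.4, 2.95, 3.5]

# Step 3: the data-driven model learns to map physical output to observations.
slope, intercept = fit_linear(physics_forecast, observed)


def hybrid_forecast(new_physics_output):
    """Correct a new physics-based forecast with the learned mapping."""
    return slope * new_physics_output + intercept


corrected = hybrid_forecast(4.0)
```

The physical model supplies dynamics that the data alone may not reveal, while the learned mapping absorbs its systematic errors.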

Application of AI in numerical models
The ocean and the atmosphere obey physical laws that can be represented by a set of governing equations (Haidvogel et al., 2017; Schultz et al., 2021). Numerical ocean models are a commonly used method for forecasting ocean elements and are essentially the numerical solution of a series of physical partial differential equations using different numerical analysis methods (Blumberg and Mellor, 1978). Another way of combining AI and physics is to combine AI with numerical models, thus further improving the efficiency and accuracy of numerical models. The ocean numerical forecasting process is shown in Figure 10.
The combination of AI and numerical modeling is manifested in three main areas: pre-processing, the model itself, and post-processing. Pre-processing refers to the optimization of data assimilation methods. Neural networks are powerful at approximating non-linear systems and extracting meaningful features from high-dimensional data, properties that benefit both the accuracy and the efficiency of data assimilation (O'Donncha et al., 2018). For example, Choi et al. (2022) assimilated deep learning results for spatiotemporal ocean prediction into a numerical forecasting system, improving the model's accuracy. Improving the model itself means replacing some physical parameterization schemes with a machine learning model. Parameterization means that the details of a process are not resolved explicitly; instead, the process is represented by a simplified function of other defined variables. Liang et al. (2022) used a deep neural network (DNN) model to parameterize the vertical mixing effect of ocean surface boundary layer (OSBL) turbulence and compared the DNN model with two traditional physical parameterization methods, KPP-CVMix and KPP-LF17, showing that the DNN model outperforms both of these popular physics-based schemes. Machine learning can be applied to the post-processing of numerical models in two main ways: one is to correct the predictions of the numerical models, and the other is to use the numerical model predictions as input or training data for deep learning models. Fei et al. (2022) proposed a sea surface temperature (SST) correction method based on a convolutional long short-term memory (ConvLSTM) network with a multi-attention mechanism. Zhang et al. (2013) used higher-resolution meteorological variables from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) Final Analysis (FNL) dataset to train a neural network. These studies demonstrate that combining AI and numerical models not only improves the efficiency and accuracy of numerical models but also saves computational resources.
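The first post-processing route, correcting the output of a numerical model, can be sketched as learning the forecast error rather than the forecast itself. The example below is a simplified illustration under stated assumptions: the SST series and the "numerical forecast" are synthetic, and a seasonal least-squares error model stands in for the ConvLSTM-based correction network of the cited work.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical daily SST observations over three years (degrees Celsius).
days = np.arange(3 * 365)
phase = 2 * np.pi * days / 365.0
sst_obs = 20.0 + 5.0 * np.sin(phase) + 0.2 * rng.standard_normal(days.size)

# Stand-in numerical forecast with a cold bias and a damped, shifted seasonal cycle.
sst_model = 19.0 + 4.0 * np.sin(phase - 0.2)


def design(phase):
    """Predictors of the forecast error: intercept plus annual harmonics."""
    return np.column_stack([np.ones_like(phase), np.sin(phase), np.cos(phase)])


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


train = days < 2 * 365   # fit the correction on the first two years
test = ~train            # evaluate on the third year

# Learn the error (observation minus model) from the training period.
error = sst_obs[train] - sst_model[train]
coef, *_ = np.linalg.lstsq(design(phase[train]), error, rcond=None)

# Add the predicted error back onto the raw forecast.
sst_corrected = sst_model[test] + design(phase[test]) @ coef

rmse_raw = rmse(sst_model[test], sst_obs[test])
rmse_corrected = rmse(sst_corrected, sst_obs[test])
print(f"raw RMSE: {rmse_raw:.2f}, corrected RMSE: {rmse_corrected:.2f}")
```

Learning the residual keeps the numerical model as the backbone of the forecast, so the data-driven part only has to represent the systematic error.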

Signal decomposition hybrid model
Classical time series models such as AR, ARMA, and ARIMA are statistical models whose predictions rest on linear and stationary theory, whereas the deep learning model LSTM, developed for time series problems, has strong non-linear processing capability. However, the variation of ocean elements, such as sea wind and waves, is non-linear and non-stationary. Therefore, although these models have proven robust in ocean forecasting, they remain limited when facing non-stationary problems. Signal decomposition methods have therefore been introduced to the study of ocean forecasting, as shown in Figure 11. This technique decomposes a complex time series into simpler components to overcome non-linearity and non-stationarity, and it can be combined with machine learning and deep learning to enhance the models' ability to handle non-stationary signals. Li et al. (2017) enhanced SVM models for forecasting SST using the complementary ensemble empirical mode decomposition (CEEMD) algorithm. Empirical mode decomposition (EMD) has been combined with the statistical AR model and with the deep learning LSTM model for wave prediction, enhancing the performance of both models (Duan et al., 2016; Hao et al., 2022). Ensemble empirical mode decomposition (EEMD) and CEEMD are two improved variants of EMD. Each was combined with a back propagation neural network (BPNN) to construct a hybrid model for predicting SST, and the results showed that CEEMD-BPNN predicted SST better (Wu et al., 2019). Moreover, Song et al. (2022a) combined three signal decomposition techniques (TVF-EMD, WT, and CEEMD) with ENN models in a minute-scale sea level prediction study; for time series lengths of 1,440, 720, and 360 min, the mean R2 values of TVF-EMD-ENN were the best among the four models. For the 1,440 min sequence length, the average R2 of TVF-EMD-ENN was 0.952, higher than those of WT-ENN (0.910) and CEEMD-ENN (0.929). Neshat et al. (2022) developed a hybrid model for wind and wave power prediction based on adaptive decomposition (Nelder-Mead variational mode decomposition) and a convolutional neural network with bi-directional long short-term memory. Meng et al. (2022) applied an adaptive time-frequency decomposition algorithm to predict SSH caused by tropical cyclones. Hu et al. (2021) proposed a hybrid time series prediction model based on ensemble empirical mode decomposition (EEMD), LSTM, and Bayesian optimization (BO) for wind and wave height prediction.

This figure is a concrete example of a hybrid model for predicting sea surface temperature. A convolution-deconvolution network generates a motion field of the water. A physical model then uses the motion field and the initial input to predict the state after one time step. Adapted from Figure 1

FIGURE 11
Schematic of signal decomposition. The input data are decomposed into multiple sub-signals, Z_SD1 to Z_SDn, by the signal decomposition algorithm. Each sub-signal is fed into a data-driven model, and the model outputs are combined to obtain the prediction result.
In summary, signal decomposition algorithms enhance a model's ability to handle non-stationary signals. However, not every signal decomposition algorithm suits every problem. For example, Gan et al. (2021) discussed the application of NS_TIDE, EEMD, and VMD to river tides. EEMD and VMD are general signal decomposition algorithms, and VMD eliminates mode mixing better than EEMD. Unlike VMD, NS_TIDE can resolve specific tidal components, but it cannot accurately reproduce sub-tidal water levels during periods of high flow. The appropriate signal decomposition algorithm should therefore be selected by analyzing the specific problem.
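The decompose, forecast, and recombine workflow described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited study: a centred moving average splits the series into two sub-signals as a simple stand-in for EMD, EEMD, CEEMD, or VMD, a least-squares AR model stands in for the data-driven models (SVM, BPNN, LSTM, ENN), and the input series is synthetic. The helper names are hypothetical. The coefficient of determination is computed as R2 = 1 - SS_res / SS_tot.

```python
import numpy as np


def decompose(x, window=25):
    """Split x into a smooth trend (centred moving average, odd window)
    and a residual. The two sub-signals sum back to x exactly."""
    pad = window // 2
    padded = np.pad(x, pad, mode="edge")
    trend = np.convolve(padded, np.ones(window) / window, mode="valid")
    return [trend, x - trend]


def fit_ar(x, order):
    """Least-squares AR(order) coefficients; coef[0] is the intercept."""
    rows = [np.concatenate(([1.0], x[k - order:k])) for k in range(order, len(x))]
    coef, *_ = np.linalg.lstsq(np.array(rows), x[order:], rcond=None)
    return coef


def forecast_ar(x, coef, steps):
    """Iterate the fitted AR model `steps` times beyond the end of x."""
    order = len(coef) - 1
    history = list(x[-order:])
    out = []
    for _ in range(steps):
        nxt = coef[0] + np.dot(coef[1:], history[-order:])
        out.append(nxt)
        history.append(nxt)
    return np.array(out)


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Synthetic series: slow cycle + fast cycle + noise (hypothetical data).
rng = np.random.default_rng(2)
t = np.arange(320)
series = (np.sin(2 * np.pi * t / 80) + 0.4 * np.sin(2 * np.pi * t / 9)
          + 0.02 * rng.standard_normal(t.size))
train, future = series[:300], series[300:]

# 1) decompose the history, 2) forecast each sub-signal, 3) sum the forecasts.
components = decompose(train)
forecast = sum(forecast_ar(c, fit_ar(c, order=12), steps=len(future))
               for c in components)
print("R2 on the held-out segment:", r2_score(future, forecast))
```

Only the training segment is decomposed here. Decomposing the full series before splitting would let future values influence the sub-signals, which can make a decomposition-based hybrid look more skilful than it would be in real-time use.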

Direction of the application of AI in ocean observation
In remote sensing, AI models for ocean observation need to incorporate physics, since models lacking physical constraints may not be very accurate. Most current models used to mine information from ocean remote sensing images are borrowed from computer vision, so models designed explicitly for the ocean should be developed in the future. The generalization capability of the models should also be increased so that they can be adapted to different ocean observation sensors. Intelligent sensor networks currently have limited mobility and relatively fixed locations. In the future, they need to be combined with more unmanned aerial vehicles and autonomous underwater vehicles to expand their observation range and make them more flexible. For underwater robots, current robotic intelligence relies on pre-established procedures to carry out basic behavior patterns suited to the operating environment, and it does not yet have the human brain's capabilities for advanced reasoning and evolutionary behavior. Future work must develop AI robots with higher-order intelligence and behavior capabilities.

Direction of identification of ocean phenomena
Two trends can be summarized by analyzing the evolution of methods for identifying marine processes. First, more AI methods have been used for marine process identification in recent years, and the models used by researchers are becoming more advanced; more advanced deep learning models are thus a major trend for future development. Second, other marine elements related to marine process identification are gradually gaining attention from researchers, and multi-element studies are becoming mainstream.
The current state of research shows that most work has focused on SAR images, while very little has been done to identify ocean phenomena in geostationary satellite images. To fill this gap, ocean phenomena in geostationary satellite images could become a main object of study in the future, with more advanced neural networks used to cope with the interference caused by complex imaging conditions. SAR images can be combined with geostationary satellite images to provide additional information for identifying ocean phenomena in the latter. In addition, following Drees et al. (2020), a multimodal approach that fuses other datasets with SAR images for further processing and recognition may become a major trend for future development.

Direction of ocean element forecasting
The following describes future research directions for ocean element forecasting from the perspectives of data, methods, and applications.
1. In terms of data, the performance of AI is inextricably linked to data features, and improving data validity is the basis for improving prediction accuracy. Data can be improved by incorporating physical features, by mining features from the perspective of ocean physics, or by optimizing the data through mathematical methods, statistics, or machine learning.
2. In terms of methods, different sea areas may be suited to different methods. Combining AI with marine numerical models, statistical models, and big data into a combined forecasting system, in which the components compensate for each other's shortcomings, will effectively improve forecasts.
3. In terms of application, real-time observations of marine environmental data are often characterized by missing values, significant errors, massive volume, and complex computation. Hence, it is necessary to consider the various conditions of practical applications and to improve the model's compatibility and robustness under more complex conditions.

Summary
With the emergence of AI and the age of "big data," humans have the high-tech objective of using vast quantities of data. Big data on the ocean supports people in improving the ocean ecosystem and enhances the quality of life in society. We examine the evolution of ocean observation and the role AI plays in its development. Intelligent remote sensing technology, intelligent sensor networking technology, and intelligent underwater robots were created by combining existing observation technology with AI algorithms. Among them, the data supplied by remote sensing technologies serve as the foundation for ocean forecasting and identification and are frequently used by scientists. In addition, reconstructed observation data are included in ocean big data sets. Various objective factors inevitably leave a considerable quantity of ocean observational data incomplete or unavailable, which makes data reconstruction necessary. AI methods can effectively reconstruct observation data and correct errors and gaps.
Oceanic physical processes, such as internal waves, heatwaves, and eddies, have vital impacts on the Earth's climate, marine ecology, and human activities. Consequently, it is crucial to use ocean data to identify ocean phenomena. This review highlights the use of artificial intelligence algorithms for detecting and recognizing internal waves, heatwaves, the El Niño phenomenon, and sea ice, as well as for forecasting ocean components. We categorize the forecasting models as physics-driven numerical models, model-driven statistical models, classical machine learning models, data-driven deep learning models, and physical models combined with AI models. Each model type has a different focus in ocean forecasting. Currently, CNN, LSTM, and ConvLSTM are frequently used data-driven deep learning models for forecasting marine elements such as sea wind, SSH, and SST. Because of the poor interpretability and the absence of physical constraints in AI in general, these deep learning approaches cannot yet replace classic numerical modeling techniques. Combining physical models with AI has therefore begun to yield results: using the laws of physics in conjunction with data-driven models improves the ability to address scientific challenges.
Finally, we briefly address how to improve the performance of AI systems, mainly by optimizing data sets and parameters.

Author contributions
TS: He is responsible for setting overall research goals and objectives, designing research ideas, supervising and planning the overall research plan, and reviewing and revising the original draft. CP: He is responsible for writing the original draft, collecting research papers, conducting research, and creating graphics. BH: He is responsible for writing the original draft, collecting research papers, conducting research, and creating graphics. GX: He is responsible for writing the original draft, collecting research papers, conducting research, and creating graphics. JX: He is responsible for compiling the findings and reviewing the results. HS: She is responsible for reviewing and revising the original draft. FM: He provides financial support and coordinates the entire research process.