Optical Classification of the Coastal Waters of the Northern Indian Ocean

Coastal waters are optically diverse; studying their optical characteristics is an important application of satellite oceanography. In coastal ecosystems of the northern Indian Ocean, optical diversity has been little studied, except for the global analysis by Mélin and Vantrepotte (2015). This paper is a contribution toward identification and characterization of optical classes in the coastal regions of the northern Indian Ocean. The study identified eight optical classes using the monthly climatological datasets of remote sensing reflectance for the 1998–2013 period from the Ocean Color Climate Change Initiative (OC-CCI, www.oceancolour.org). The optical classification we adopted uses the fuzzy logic method, based on Moore et al. (2009). The seasonal variations of the eight resultant optical classes of the coastal waters of the northern Indian Ocean were explored. From the mean reflectance spectral signals obtained, it appears that classes 1–6 belong to Case-1 waters and classes 7 and 8 correspond to Case-2 waters. Classes 1 and 2 appear in deeper oligotrophic waters; classes 3–6 are present at intermediate depths; classes 7 and 8 are mostly found within inshore eutrophic regions with high chlorophyll concentrations and sediments from river plumes and land runoff. The optical variability between seasons (the summer and winter monsoons and the intermonsoon seasons) is influenced by variations in physical forcing, such as surface winds, ocean currents, precipitation, and sediment influx from rivers and land runoff. The optical diversity index ranged from around 0.3 to 1.36. High diversity indices, between 1 and 1.36, were found in areas dominated by classes 1–4, whereas low diversity indices (around 0.3) occurred in areas where classes 7 and 8 dominated. The variations in the dominant optical classes are shown to be related to changes in chlorophyll concentration and suspended sediment load, as indicated by remote sensing reflectance at 670 nm. On the other hand, optical diversity appears to be high in zones of transition between dominant optical classes.


INTRODUCTION
In an ocean under rapid modification by climate change, the boundaries between marine ecological provinces will move, but in ways that are difficult to predict (Karl et al., 1995; Platt and Sathyendranath, 1999). However, there is a premium on knowing the large-scale structure of the ocean ecosystem as it changes through time, in other words on developing and maintaining a biogeography of the ocean basin. Conventional biogeography relies on collecting and identifying individual specimens through sampling and survey techniques. At large geographical scales, it is a costly and time-consuming procedure to make even a single survey at one time point; making serial surveys to detect possible changes may be prohibitive on the grounds of expense. An alternative approach would be to use data streams from sensors carried on satellites in Earth orbit. Such data have the advantages of high resolution at the ocean surface, high frequency of coverage, cost-effectiveness and synoptic coverage (Platt and Sathyendranath, 1999, 2008). Potentially, their use could yield a different kind of biogeography, based on data free from the limitations of coarse resolution in space and time. Visible spectral radiometry of the ocean provides a data stream that is particularly useful for ecosystem analysis: the visible spectrum carries information on the pigments and size of phytoplankton cells, as well as on the optical properties of the other constituents (such as suspended sediments and colored or chromophoric dissolved organic matter) in the surface waters of the ocean (Guzman et al., 1995; Babin et al., 2003; Dowell and Platt, 2009; Garaba and Zielinski, 2013). Mélin and Vantrepotte (2015) have pioneered the classification of coastal waters at global scale using annually-averaged fields of optical radiances from satellite data.
The Northern Indian Ocean is landlocked toward the north and bifurcates into two intra-continental seas: the Arabian Sea and the Bay of Bengal. Seasonally reversing monsoons and reversal of ocean currents are the major distinguishing features of the Indian Ocean basin (Shetye, 1998; Qasim, 1999). The monsoonal cycle, including the southwest or summer monsoon and the northeast or winter monsoon, determines the climate of the region. The southwest monsoon is the continuation of the southern hemisphere trade winds that bring monsoon rains and floods to the Asian landmass (Tomczak and Godfrey, 2001). The northeast monsoon is characterized by high pressure over the Asian landmass and northeasterly winds over the tropics and northern subtropics (Shetye and Shenoi, 1988). A strong coastal upwelling occurs along the western coast during the southwest monsoon season, whereas during the northeast monsoon season, cold continental winds cause convective mixing and winter cooling along the north Indian coast (Tomczak and Godfrey, 2001). Other oceanographic features of interest in this region include the Indian Ocean warm pool (Vinayachandran and Shetye, 1991) and monsoon depressions and cyclones (Schott and McCreary, 2001; Schott et al., 2009).
In the Northern Indian Ocean, biogeographical analysis has so far been restricted to what can be found using conventional methods (Krishnamurthy et al., 1978; Schills and Wilson, 2006; Obura, 2012, 2016; Jeffries et al., 2015). A few notable studies on global ocean biogeographic partitions using satellite datasets include the Longhurst province classification (Longhurst, 1998), based on regional oceanography of major oceanic basins and a global database of chlorophyll profiles, and the 56 biogeochemical provinces proposed by Reygondeau et al. (2013) using datasets of Sea Surface Temperature (SST), chlorophyll and Sea Surface Salinity (SSS). The current study region includes at least parts of four provinces proposed by Longhurst (1998): the Red Sea and Persian Gulf province (REDS), the Northwest Arabian Sea upwelling province (ARAB), the Western India coastal province (INDW), and the Eastern India coastal province (INDE). Studies on biogeographic partitioning of the Indian Ocean region using remotely-sensed datasets are relatively few. Here, we follow the lead of Mélin and Vantrepotte (2015) through a detailed application of their optical remote-sensing method to the Indian Ocean region. We extend the temporal resolution to reveal seasonal changes in the optical classification of the coastal waters of the region. We interpret the results in the context of the seasonally-reversing wind and ocean current system that is the unique oceanographic characteristic of the region.

Study Area
The northern Indian Ocean is subdivided by landmasses into the Arabian Sea in the west and the Bay of Bengal in the east, and it opens into the equatorial Indian Ocean to the south. The Bay of Bengal coast is shared among India, Bangladesh, Myanmar, Sri Lanka, and the western part of Thailand. The Arabian Sea coast is shared among India, Yemen, Oman, Iran, Pakistan, Sri Lanka, Maldives, and Somalia. The area of interest is the coastal waters of the northern Indian Ocean within the 2,000 m isobath (Figure 1), extending from 0 to 30° N latitude and 50 to 100° E longitude. Rather than using a shallower depth (100-200 m) as the outer limit of the coastal zone, we have opted to use the 2,000 m isobath. This was to explore whether optical signatures of offshore waters appeared close to shore, and vice versa. In this choice, we were guided by Antony et al. (2002), who suggested that the offshore influence of coastal waters could extend as far out as 400 km from the shore. This region is well known for the alternating upwelling and downwelling processes occurring during the contrasting seasons of the southwest and northeast monsoons.
Here, we use satellite remote-sensing reflectances (R_rs) at six wavelengths (412, 443, 490, 510, 555, and 670 nm) to identify optically-distinct regions of the coast. Figure S1 provides a schematic diagram of the methods used in the current study.

Satellite Dataset
Remote sensing reflectance (R_rs) at six wavelengths and chlorophyll datasets were obtained from Version 2 of the Ocean Color Climate Change Initiative (OC-CCI, see www.oceancolour.org) (Sathyendranath et al., 2016, 2017) with a spatial resolution of 4 km. Chlorophyll concentration was calculated from the remote-sensing reflectance, using the National Aeronautics and Space Administration (NASA) Ocean Color Chlorophyll Version 4 (OC4) algorithm (O'Reilly et al., 1998). This algorithm performed best in an algorithm comparison carried out as part of OC-CCI activities (Brewin et al., 2015). The OC-CCI satellite datasets afford superior coverage for the area of interest, compared with previously available data. These data products are band-shifted, bias-corrected and merged data archives obtained from three sensors: SeaWiFS (Sea-Viewing Wide Field-of-View Sensor), MODIS-Aqua (Moderate Resolution Imaging Spectroradiometer of the Aqua Earth Observing System), and MERIS (Medium Resolution Imaging Spectrometer). OC-CCI datasets were validated using the in-situ datasets from Teledyne/Webb APEX-Argo floats deployed in the Arabian Sea (Roxy et al., 2016). The OC-CCI dataset is limited to the six SeaWiFS wavebands in the visible. We recognize the limitations of the band set that were identified by Mélin and Vantrepotte (2015) for coastal optical classification. Therefore, the analyses and interpretation are restricted to the optical differences that are amenable to identification by the available dataset. All grid points of the selected region (depth range of 0-2,000 m) were used in the classification; grid points outside the 2,000 m depth range were excluded. Isobaths of the region were taken from the General Bathymetric Chart of the Oceans (GEBCO) 1-min gridded data set (Figure 1).

Normalization of Dataset
The remote-sensing reflectances (R_rs) at six wavelengths (412, 443, 490, 510, 555, and 670 nm) for the years 1998-2013 were used for the study. The remote-sensing reflectance values were skewed in their distribution; to minimize skewness, each R_rs spectrum was transformed to its log10 values. Each log-transformed spectrum was then normalized by its integral from λ1 (412 nm) to λ2 (670 nm):

$$x(\lambda) = \frac{\log_{10} R_{rs}(\lambda)}{\int_{\lambda_1}^{\lambda_2} \log_{10} R_{rs}(\lambda)\, d\lambda}, \qquad (1)$$

where λ is the wavelength.
The normalization allows analysis of changes in the shape of the R_rs spectra, rather than in their magnitudes. Typically, changes in the shape of the spectra would be more affected by the composition of the materials present in the water, whereas the magnitude of the spectra is likely to be more indicative of the concentration of the substances, especially of highly-scattering substances. In this work, the vector x_j of six log-transformed and normalized reflectance values from a particular location and time (pixel, here indexed by subscript j) is referred to as an object. The total number of objects in a classification is N. The notation and definitions used in this study are presented in Table 1.
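The log transformation and normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the trapezoidal rule for the integral, the function name `normalize_spectrum`, and the example spectrum are all assumptions made here.

```python
import math

# Band centres (nm) of the six wavebands used in the study.
WAVELENGTHS = [412.0, 443.0, 490.0, 510.0, 555.0, 670.0]


def trapezoid(values, wavelengths):
    """Trapezoidal integral of values over wavelength (assumed quadrature)."""
    total = 0.0
    for k in range(len(wavelengths) - 1):
        width = wavelengths[k + 1] - wavelengths[k]
        total += 0.5 * (values[k] + values[k + 1]) * width
    return total


def normalize_spectrum(rrs, wavelengths=WAVELENGTHS):
    """Log10-transform a reflectance spectrum, then divide by its integral."""
    logged = [math.log10(r) for r in rrs]
    integral = trapezoid(logged, wavelengths)
    return [v / integral for v in logged]


# Hypothetical clear-water reflectances (sr^-1), for illustration only.
clear_water = [0.010, 0.008, 0.006, 0.004, 0.002, 0.0002]
x = normalize_spectrum(clear_water)
```

By construction, the normalized vector `x` integrates to one over the 412-670 nm range, so objects from different pixels are compared on a common scale.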

Fuzzy C Mean Algorithm
Fuzzy classification evolved from classical set theory. The classical clustering approach determines whether an object is a member or non-member of a given set of any system. Only these two options are possible. In contrast, fuzzy logic allows that an object may have partial memberships in more than one set. The classification algorithms based on fuzzy logic are often used in classifying data from natural systems. The method allows for overlap between boundaries of particular classes or sets, and recognizes that more than one class may be represented at a particular location at any given time.
The membership $F_{ij}$ of a cluster i in the object j is given by $F_{ij} = 1 - Q(Z_{ij}^2)$, where $Z_{ij}$ is the Mahalanobis distance given by $Z_{ij} = (x_j - M_i)/S_i$, $M_i$ is the mean, $S_i$ is the standard deviation, and Q is a cumulative $\chi^2$ distribution (Zadeh, 1965).
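As a rough illustration of this membership function, the sketch below evaluates 1 − Q(Z²) for one object and one class. It makes two simplifying assumptions that are not taken from the text: the bands are treated as independent (a per-band standard deviation instead of a full covariance matrix), and the number of degrees of freedom equals the number of bands, which is even here so that the χ² distribution has a simple closed form. The class mean and standard deviation are hypothetical.

```python
import math


def chi2_cdf_even_dof(value, dof):
    """Cumulative chi-square distribution for an even number of degrees of freedom."""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = value / 2.0
    # Survival function: exp(-v/2) * sum_{n < dof/2} (v/2)^n / n!
    tail = math.exp(-half) * sum(half ** n / math.factorial(n) for n in range(dof // 2))
    return 1.0 - tail


def membership(x, mean, std):
    """F = 1 - Q(Z^2), with Z^2 the squared distance scaled by the class spread."""
    z2 = sum(((xv - m) / s) ** 2 for xv, m, s in zip(x, mean, std))
    return 1.0 - chi2_cdf_even_dof(z2, dof=len(x))


# Hypothetical class statistics for six bands (illustration only).
class_mean = [0.20, 0.19, 0.18, 0.17, 0.15, 0.11]
class_std = [0.01] * 6

on_centre = membership(class_mean, class_mean, class_std)
shifted = membership([m + 0.02 for m in class_mean], class_mean, class_std)
```

An object lying exactly on the class mean receives full membership, and the membership falls toward zero as the object moves away from the mean relative to the class spread.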
In this study, the log-transformed, normalized reflectance spectra (x) were analyzed using the fuzzy C-means (FCM) algorithm. Our implementation of fuzzy C-means classification follows Moore et al. (2001). It calculates the centres of each class or cluster and the percentage membership of each class in the data at each pixel. The FCM algorithm also uses several validity functions to assess the optimal number of clusters to be chosen for the classification (Bezdek, 1973; Rezaee, 2010).
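For readers unfamiliar with the algorithm, the following is a minimal sketch of the textbook fuzzy C-means iteration (Euclidean distance, fuzzifier m = 2). It is not the implementation used in this study; the function name, the fixed iteration count, the starting centres and the toy one-dimensional data are assumptions chosen only to keep the example short.

```python
def fuzzy_c_means(points, centres, m=2.0, iterations=50):
    """Textbook fuzzy C-means on a list of equal-length vectors.

    Returns (centres, u), where u[i][j] is the membership of object j in
    cluster i and the memberships of each object sum to one.
    """
    power = 2.0 / (m - 1.0)
    n_clusters, n_points = len(centres), len(points)
    u = []
    for _ in range(iterations):
        # Euclidean distance between every centre i and every object j.
        d = [[sum((p - c) ** 2 for p, c in zip(x, centre)) ** 0.5 for x in points]
             for centre in centres]
        u = [[0.0] * n_points for _ in range(n_clusters)]
        for j in range(n_points):
            on_centre = [i for i in range(n_clusters) if d[i][j] == 0.0]
            if on_centre:
                # Object coincides with a centre: full membership in that cluster.
                u[on_centre[0]][j] = 1.0
                continue
            for i in range(n_clusters):
                u[i][j] = 1.0 / sum((d[i][j] / d[k][j]) ** power
                                    for k in range(n_clusters))
        # Move each centre to the membership-weighted mean of the objects.
        centres = [
            [sum(u[i][j] ** m * points[j][a] for j in range(n_points))
             / sum(u[i][j] ** m for j in range(n_points))
             for a in range(len(points[0]))]
            for i in range(n_clusters)
        ]
    return centres, u


# Toy data: two well-separated groups on a line (illustration only).
points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centres, u = fuzzy_c_means(points, centres=[[0.0], [5.2]])
```

With these toy data the two centres settle close to the middle of each group, and every object keeps a small, non-zero membership in the more distant cluster, which is the partial-membership behaviour described above.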

TABLE 1 | Notation and definitions used in this study.
Notation | Definition
[symbol missing in source] | Distance between the centers of clusters i and k (i ≠ k)
d(x_j, c_i) | Distance between object x_j and the center of cluster C_i
F*_{i,j} | Membership of object j in cluster i, normalized by the total membership in all clusters
x_j | An object, defined as the vector consisting of six remote sensing reflectances R_rs after log transformation and normalization (Equation 1) for a given pixel j
Z_ij | Mahalanobis distance between the object x_j and the center C_i of cluster i

Optimal Cluster Validity Functions
A cluster validity function is a statistical measure used to select the optimal number of clusters in the classification (too many clusters would imply that individual clusters resemble each other; too few would imply that all possible cases are not covered).
We have used two such functions: the Xie-Beni index and the Partition Coefficient. They are used only for selecting the optimal cluster number with which to run the fuzzy C-means classification. Cluster validity methods are statistical functions that determine the performance of a clustering procedure. Criteria of merit for a clustering method include the distance between clusters (separation) and the distribution of points around a cluster (compactness) (Deborah et al., 2010). We can rely on multiple validity functions to aid selection of the optimal cluster number. The principal strategy used is to cluster the data over a range of cluster numbers (n_c) and evaluate each clustering result with each validity function (Moore et al., 2009). The Partition Coefficient and the Xie-Beni index are cluster validity methods designed specifically for use with fuzzy algorithms. These two methods are preferred to aid selection of the optimal number of clusters in fuzzy classification (Halkidi et al., 2001).

Xie-Beni Index
The Xie-Beni index X is one of the measures used to determine the best cluster number for the fuzzy classification of a particular dataset. This index depends on the geometric properties of the dataset and the membership matrix. To calculate X, we need to calculate two quantities: the sum over all clusters of the mean squared distance of each data object from the centre c_i of cluster C_i; and the square of the minimum distance between two cluster centers (Xie and Beni, 1991). The ratio of these two quantities is the Xie-Beni index. The smallest value of the index indicates the best cluster number (Halkidi et al., 2001; Zhao et al., 2009).
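The ratio described above can be sketched as follows. The sketch uses the commonly quoted form of the Xie and Beni (1991) index, in which squared distances are weighted by the membership raised to the fuzzifier m; the exact weighting used in this study is not stated, so that choice, the function name and the toy data are assumptions.

```python
def xie_beni(points, centres, u, m=2.0):
    """Xie-Beni index: compactness divided by separation (smaller is better)."""

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    n_clusters, n_points = len(centres), len(points)
    # Membership-weighted mean squared distance of the objects from the centres.
    compactness = sum(
        u[i][j] ** m * sq_dist(points[j], centres[i])
        for i in range(n_clusters)
        for j in range(n_points)
    ) / n_points
    # Squared distance between the two closest cluster centres.
    separation = min(
        sq_dist(centres[i], centres[k])
        for i in range(n_clusters)
        for k in range(i + 1, n_clusters)
    )
    return compactness / separation


# Toy example with crisp memberships (illustration only).
toy_points = [[0.0], [1.0], [10.0], [11.0]]
toy_centres = [[0.5], [10.5]]
toy_u = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
xb = xie_beni(toy_points, toy_centres, toy_u)
```

In this toy case each object lies 0.5 from its own centre and the centres are 10 apart, so the index is 0.25 / 100 = 0.0025; tight, well-separated clusters give small values.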

Partition Coefficient
The Partition Coefficient P is a validity function that uses the membership values (F ij ) to provide the optimal cluster number. It measures the amount of overlap between clusters. It is defined as the ratio of the sum of squares of the membership matrix elements of all the clusters to the total number of objects.
The index values lie in the range [1/n_c, 1], where n_c is the number of clusters. The closer the value is to one, the better the data are classified. The cluster number with the maximum partition coefficient is taken to be the best cluster number to choose for classification (Bezdek, 1973; Bezdek et al., 1984).
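The definition above translates directly into code. In this sketch the memberships of each object are assumed to sum to one, which is what gives the [1/n_c, 1] range; the function name and the two toy membership matrices are illustrative.

```python
def partition_coefficient(u):
    """Sum of squared memberships divided by the number of objects.

    u[i][j] is the membership of object j in cluster i, with the
    memberships of each object assumed to sum to one.
    """
    n_objects = len(u[0])
    return sum(value * value for row in u for value in row) / n_objects


# Crisp partition: every object belongs to exactly one of two clusters.
crisp = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Completely fuzzy partition: equal membership in each of four clusters.
uniform = [[0.25, 0.25, 0.25] for _ in range(4)]

p_crisp = partition_coefficient(crisp)
p_uniform = partition_coefficient(uniform)
```

The crisp case returns the upper bound of 1 and the uniform case returns the lower bound 1/n_c = 0.25, matching the range quoted above.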

Optical Diversity
Optical diversity is an indicator of the overall variability in optical constituents at a given location and time. The optical diversity H_j is defined here, following Mélin and Vantrepotte (2015), by analogy with the Shannon Diversity Index (Shannon, 2001):

$$H_j = -\sum_{i=1}^{n_c} F^*_{i,j} \ln F^*_{i,j}, \qquad (4)$$

where $F^*_{i,j}$ is the normalized membership of optical class i at pixel j and $n_c$ is the number of classes represented. The membership $F_{i,j}$ was normalized by the sum of $F_{i,j}$ over all optical classes to obtain $F^*_{i,j}$:

$$F^*_{i,j} = \frac{F_{i,j}}{\sum_{k=1}^{n_c} F_{k,j}}. \qquad (5)$$

RESULTS AND DISCUSSION

Selection of the Optimal Class Number
The Xie-Beni Index and the Partition Coefficient were calculated for monthly climatologies of x for the study area, computed from the OC-CCI monthly R rs climatologies, which are based on years 1998-2013. Monthly values allowed study of seasonal variations in the distribution of optical classes in this region, which is known for its pronounced seasonality. Climatologies were selected to minimize the effect of outliers through averaging, and also to improve the coverage and reduce gaps in data. Climatological data also provide a baseline against which trends in anomalies can be studied at a future date. Both indices varied between months, and the Xie-Beni index often showed a broad minimum, whereas the Partition Coefficient often showed a broad maximum, such that selection of the optical class number was not straightforward. Nevertheless, eight emerged as the optimal number. To aid the selection of optimal class number further, we also studied the maps of cumulative membership (sum of the memberships of all the classes) calculated using class numbers n c from 5 to 15. The maps were studied for evidence of over-classification (large areas where the cumulative membership was >1) and under-classification (large areas where cumulative membership was <1). This study also showed that n c = 8 gave the best compromise, with low numbers of both under-classified and over-classified pixels. Therefore, finally, eight classes were selected as the optimal cluster number for all the analyses presented here.
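The screening of cumulative membership maps described above can be illustrated with a small sketch. The tolerance band around one, the function names and the example pixels are hypothetical; the text does not state how close to one the cumulative membership had to be.

```python
def cumulative_membership(memberships):
    """Sum of the (un-normalized) memberships of all classes at one pixel."""
    return sum(memberships)


def classification_status(memberships, tolerance=0.05):
    """Label a pixel as over-, under- or adequately classified.

    The tolerance is an assumed value used only for this illustration.
    """
    total = cumulative_membership(memberships)
    if total > 1.0 + tolerance:
        return "over"
    if total < 1.0 - tolerance:
        return "under"
    return "adequate"


# Hypothetical per-pixel memberships in three classes.
pixels = {
    "a": [0.9, 0.4, 0.0],   # classes overlap: cumulative membership > 1
    "b": [0.2, 0.1, 0.0],   # poorly represented: cumulative membership < 1
    "c": [0.6, 0.4, 0.0],   # cumulative membership close to 1
}
status = {name: classification_status(f) for name, f in pixels.items()}
counts = {label: list(status.values()).count(label)
          for label in ("over", "under", "adequate")}
```

Repeating such a count for each candidate class number gives a simple way to compare how many pixels are over- or under-classified as n_c changes.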

Identification of the Optical Classes
The mean spectra M_i of the eight selected optical classes are shown in Figure 2. Optical class 1 is characterized by a maximum in the blue, with the signal decreasing progressively toward the red, indicative of clear oceanic waters. With increasing class number, the signal decreased steadily at the shortest wavelength (412 nm), and the maximum shifted toward longer wavelengths: the maximum is at 490 nm for class 6 and 555 nm for class 8. Conversely, class 1 has the minimum value in the red at 670 nm, whereas classes 7 and 8 have the highest values in the red. The values of the mean spectra at the six SeaWiFS wavelengths for each of the classes and their corresponding covariance matrix are provided in Tables S1, S2. It is useful to assess how these optical classes relate to Case-1 and Case-2 waters as defined by Morel and Prieur (1977) and Prieur and Sathyendranath (1981). From the shapes of the spectra, it appears that classes 1-6 are representative of Case-1 waters and classes 7 and 8 of turbid Case-2 waters. The distributions of the dominant classes in representative months of the four seasons are shown in Figure 3. The mean (Figure 2) and covariance values of the optical classes were then used to classify the waters of the study area for all the months of the year, using the climatological satellite R_rs data as inputs, after log-transformation and normalization to obtain x. The seasonal cycles used in the description of the optical classes are: (1) southwest or summer monsoon (June-September), (2) northeast or winter monsoon (December-March), (3) spring intermonsoon (April-May), and (4) autumn (fall) intermonsoon (October-November).

Classes 1 and 2
Optical classes 1 and 2 vary strongly with season. They occur along with class 3 over deeper waters (>200 m). During the southwest monsoon season (June-September), these classes represent very few pixels in deeper waters near the Andaman Sea. In the intermonsoon period (October-November) and northeast E (deeper waters) was also characterized by classes 1, 2, and 3 during the transition period (April-May). The Chlorophyll concentration corresponding to these classes ranged from 0 to 0.2 mg m −3 and the optical diversity index fell in the range from 1 to 1.3.

Classes 3 and 4
Optical classes 3 and 4 occur between the 100 and 2,000 m isobaths (shallow to deeper depths). These classes show irregular boundaries in the offshore waters during the southwest monsoon season along the west and east coasts of India, extending to the deeper waters of the Andaman Sea. The classes are found near the Gulf of Aden and Oman waters only in June, i.e., during the onset of the southwest monsoon. In the autumn intermonsoon (October-November), the classes were distributed over the shallow depths (0-500 m) along the near-shore waters of Gulf of Yemen, Oman, Arabian Sea, and Bay of Bengal. In the northeast monsoon, these classes occurred around the islands off the Somalia coast, Gulf of Yemen, and the west and east coasts of India. The Gulf of Oman waters flowing toward the Arabian Sea represent class 4 in May (spring intermonsoon). Chlorophyll concentration corresponding to these classes fell in the range of 0.5-0.75 mg m⁻³ and the diversity index ranged from 0.3 to 0.9.

Class 5
This optical class dominates in regions with isobaths of <1,000 m but >200 m. During the onset of southwest monsoon, class 5 is prominent in the inner Persian Gulf, Strait of Hormuz, Gulf of Oman, Somalia, west and east coasts of India. At the end of southwest monsoon season and onset of the fall intermonsoon period, this class is distributed throughout the coastline in the depth range 0-500 m. This trend persists until the month of January (northeast monsoon) along the entire coastline. In the spring intermonsoon this class is present toward the Persian Gulf, Gulf of Oman flowing into the Northwest coast and further extending toward the east coast of India including the Andaman Sea. This class has chlorophyll levels ranging from 0.75 to 1 mg m −3 and the diversity index falls between 0.2 and 0.8.

Classes 6-8
Classes 6-8 dominate in the regions with depths <200 m. Class 6 is dominant in the Persian Gulf, which is characterized by dense, highly saline waters in all seasons. Chlorophyll concentration of regions with class 6 varied from 1 to 1.5 mg m⁻³. Classes 7 and 8 are present in the inner shelf regions with shallower depths influenced by boundary currents and river influx. These classes appear in the near-shore waters off the Somalia coast, the Gulf of Oman, the inner Gulfs of Kutch and Khambhat, the inner Ganges shelf, and the Irrawaddy river basin near the Andaman Sea. Local wind-driven circulation brings in the waters of optical classes 7 and 8 from the major river deltas and minor rivers. The influxes from rivers are rain-fed and vary seasonally with changing precipitation. In the northeast monsoon, waters belonging to classes 7 and 8 flow toward the Strait of Hormuz and into the Arabian Sea in the months of December to April under the influence of the strong northwest wind during the winter monsoon (Hunter, 1983), turning the Gulf of Oman waters into classes 7 and 8 in February. These classes do not show major variations in the transition periods. The chlorophyll concentration in the regions with classes 7 and 8 was high, ranging from 2 to 2.5 mg m⁻³. The diversity index of classes 6-8 was low, around 0.3.

Optical Diversity Index
The previous section describes the distribution of the dominant classes, but contains no information on contributions to the optical signal from non-dominant classes. The optical diversity index, which depends on the membership of all classes represented in a pixel, provides complementary information on the extent to which non-dominant classes are contributing to the signal. If a single class contributed to the optical signal of a pixel, then the optical diversity would be a minimum of 0.26 in our classification with 8 classes represented. On the other hand, if all classes contributed equally, then the optical diversity index would reach a maximum of 2.08.
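The upper bound quoted above can be checked with a short sketch of a Shannon-type index computed from class memberships. The sketch assumes the natural logarithm and memberships normalized by their sum at each pixel; the function name and the example memberships are illustrative and not taken from the study.

```python
import math


def optical_diversity(memberships):
    """Shannon-type diversity of the class memberships at one pixel."""
    total = sum(memberships)
    if total <= 0.0:
        raise ValueError("at least one class must have non-zero membership")
    h = 0.0
    for f in memberships:
        share = f / total          # normalized membership of this class
        if share > 0.0:            # classes with zero membership contribute nothing
            h -= share * math.log(share)
    return h


equal_eight = optical_diversity([0.3] * 8)             # all classes contribute equally
two_classes = optical_diversity([0.6, 0.2] + [0.0] * 6)  # hypothetical transition pixel
```

With eight equally contributing classes the function returns ln 8 ≈ 2.08, the maximum quoted above. For a single contributing class this plain Shannon form gives zero, so the minimum of 0.26 quoted above presumably reflects details of the authors' calculation that this sketch does not reproduce.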
The optical diversity index H (Equations 4 and 5) was calculated for all months to study the seasonal and regional variations in optical diversity. The optical diversity index H (Figure 4)   when the areas covered by high H values were also more extensive. High diversity indices were also found in the fall intermonsoon and spring intermonsoon periods. During the summer monsoon season (June-September), the diversity index lay mostly in the 0.5-1 range along the entire study region with some pixels having index >1 appearing in the deeper waters. High diversity indices (1-1.36) occur along the productive upwelling areas, the transition zones between coast and open ocean, oligotrophic waters and regions with the influence of boundary currents. Low diversity indices occur in the regions of most turbid waters, regions of high river water influx and inland waters.

Optical Classes, Optical Diversity, and Chlorophyll Concentration
In coastal waters, we know that the optical remote-sensing reflectance spectra are affected not only by chlorophyll concentration, but also by suspended sediment load. The seasonal variability of the chlorophyll concentration for the representative months is shown in Figure 5. The remote-sensing reflectance value at 670 nm is often taken to be a measure of suspended sediment load. Therefore, in Figure 6, we have plotted the dominant optical class as a function of chlorophyll-a concentration and R_rs(670), for the monthly climatology of February, as an example. Only well-classified pixels (cumulative class membership >0.5) are plotted. We see a gradual progression in the optical classes 1-8, with increasing chlorophyll concentration and increasing R_rs(670) values, clearly indicating that the optical classification is affected by both chlorophyll concentration and suspended sediment load. Since the chlorophyll concentration in Version 2 of OC-CCI was calculated with a single, global algorithm, we can discount the possibility that the relationship seen in Figure 6 is emerging from the use of different algorithms for different optical classes. On the other hand, it is worth discussing whether a single chlorophyll algorithm would work equally well in all optical classes in the coastal waters of the northern Indian Ocean. Tilstone et al. (2011) reported that there was good agreement between OC4v6 and another algorithm (OC5) in open-ocean and coastal waters with chlorophyll concentration up to 2 mg m⁻³ for the Arabian Sea and the Bay of Bengal. In the current study, classes 7 and 8 had chlorophyll concentrations ranging from 1.5 to 2.5 mg m⁻³, quite close to the conditions discussed by Tilstone et al. (2011), so that we can assume that the OC4 algorithm was suitable for even these highly turbid classes. Nevertheless, it would be interesting, in a future study, to explore the advantages of using algorithms designed for coastal waters (e.g., Le et al., 2013; Loisel et al., 2017; Tilstone et al., 2017).
A similar plot (Figure 7) for the optical diversity index H reveals a more complex pattern, with very high and very low indices appearing in close juxtaposition to each other in clear waters (where both chlorophyll concentration and R rs values are low). Strands of high values of the index also appear for higher values of chlorophyll concentration and R rs . The differences in the distribution of H values, compared with Figure 6 for the optical classification, suggest that optical diversity H perhaps tends to be high during transition between optical classes.

Comparison of Regional Optical Classes With Results of a Global Classification
The question remains whether the regional classification presented here yields results similar to those found in the global classification of Mélin and Vantrepotte (2015). The first difference we note is that the regional classification yielded only eight classes, whereas the global classification produced 16 distinct classes. The log-normalized mean reflectance spectra (Figure 2) of our optical classes 1-4 are similar to those of classes 10-16 of Mélin and Vantrepotte (2015). The spectral characteristics of classes 8-16 of Mélin and Vantrepotte (2015) are typical of clear waters, and are similar to those of our classes 1 and 2. We also see similarities between the optical signatures of our classes 6, 7, and 8 and classes 1-7 of Mélin and Vantrepotte (2015), and both these sets are typical of highly turbid waters, with mineral particles and dissolved organic matter (Vantrepotte et al., 2012). For the optical diversity, the values of H from this study lie in the range of 0-1.3, which is lower than the range (0-3) reported by Mélin and Vantrepotte (2015) globally. These differences in values of optical diversity are associated, by definition, with the differences in the number of classes, which has a direct impact on the values of optical diversity (Mélin and Vantrepotte, 2015). It is important to note, therefore, that the values of optical diversity reported here are not directly comparable with those of Mélin and Vantrepotte (2015). Similarly, the differences in the class numbers have to be accounted for when comparing our results with those of Mélin and Vantrepotte (2015).

FIGURE 7 | Relationship between chlorophyll concentration and R_rs(670). Climatological data for February are shown as an example. Only well-classified pixels (cumulative membership >0.5) are plotted here. The optical diversity indices are identified using different colors.

CONCLUDING REMARKS
We have implemented an optical classification using a log-transformed, normalized, remote-sensing reflectance (R_rs) dataset with a spatial resolution of 4 km. In this study, eight optical classes were obtained in the coastal waters of the northern Indian Ocean. Seasonally-reversing monsoons are a defining oceanographic characteristic of the Indian Ocean. Here, we have discussed variations in optical classes with reference to the southwest and northeast monsoon seasons of the study region. The distribution pattern of optical classes in the study region showed major variations between seasons. An example is the presence of optical classes 1, 2, 3, and 4 in the latitudes 0-18° N during December-March, whereas they were not found in the months of June-September. The influence of class 5 in intermediate coastal waters is consistent in all the regions, with little variation from month to month. Class 6 is a minimal contributor to the coastal waters of India, being restricted to the Persian Gulf in the northeast monsoon season. These patterns show that in the southwest monsoon season, the optical constituents of the coastline are affected mainly by precipitation and river water intrusion; this condition is not prevalent in the northeast monsoon season. The regional distribution of dominant optical classes, and how they are related to physical and biological oceanographic features and processes, is presented in Table S3.
We have also used the memberships of different optical classes in a given pixel to study optical diversity within a pixel. Both the dominant optical class and the optical diversity index appear to be related to the chlorophyll concentration and the remote-sensing reflectance at 670 nm (used here as an index of suspended sediment load), but in quite different ways. Whereas the dominant optical classes transition in a systematic manner from classes 1 to 8 with increasing concentration of chlorophyll and increasing R_rs(670), the diversity index appears to be high in areas of transition between optical classes. We also see that the diversity index was high in clear waters around coral islands and in deeper waters away from the shore. Since it is well known that biological diversity tends to be high when chlorophyll concentration is high, these results suggest that optical diversity indices might run counter to biological diversity. This suggestion can only be verified when data on phytoplankton diversity in the study area become available on a systematic and extensive basis. But once such relationships are established, optical diversity and optical classification would pave the way for mapping biological diversity at large scales, using remote sensing.
We opted for the Ocean Color Climate Change Initiative products for the study, because of the long time series of data available, which would facilitate extension of our work to study trends and inter-annual variability, and also because of the better spatial coverage, especially during the monsoon season. However, the dataset is limited to six SeaWiFS wavebands in the visible, which was dictated by the historical sensor capabilities. The number of wavebands available also determined the extent to which optical diversity could be explored. No doubt as better-resolution data become available over long time scales from missions such as Sentinel 3, which carries the Ocean and Land Color Instrument (OLCI) sensor with 10 wavebands in the visible domain, it would become possible to investigate optical classes and optical diversity with higher spectral resolution, which may reveal additional optical classes which were not captured in the present analysis.
The optical classification presented in this work enables us to study the seasonal dynamics in the bio-optical characteristics of the coastal waters of the Northern Indian Ocean, and how they are related to the physical and biological processes. Spatio-temporal variations of the eight optical classes under the influence of seasonally reversing monsoons were profound. This study will serve as a first step for investigations of the inter-annual variations in the distribution of optical classes and their shifts in response to changing climatic conditions such as El Niño and La Niña events.

AUTHOR CONTRIBUTIONS
SM: analyzed the datasets, ran the algorithms, and worked on interpretation of the results; TP and SS: implemented and led the study; JJ: assisted with statistical measures used in the study; GG: contributed to the ideas of physical and biological oceanography of the study region; TJ: provided the computational codes for the work. All the authors contributed to the text.