CORRECTION article
Front. Remote Sens.
Sec. Terrestrial Water Cycle
This article is part of the Research Topic: Global Change and Lake Monitoring: Harnessing Earth Observations for Environmental Analysis
Advanced Phycocyanin Detection in a South American Lake Using Landsat Imagery and Remote Sensing
Provisionally accepted
- 1Universidad San Sebastian, Concepción, Chile
- 2Universidad de Concepcion Departamento de Oceanografia, Concepción, Chile
- 3Universidad Mayor, Santiago, Chile
- 4Universidad de Concepcion Facultad de Ingenieria, Concepción, Chile
- 5Laboratoire Geosciences Environnement Toulouse, Toulouse, France
- 6Institut National de Recherche pour l'Agriculture l'Alimentation et l'Environnement Direction pour la Science Ouverte, Versailles, France
- 7Universidad de Concepcion, Concepción, Chile
Algal blooms in oligotrophic lakes have become a growing concern for inland aquatic systems, where their frequency is increasing at an alarming rate (Feng et al., 2024). These lakes, once characterized by low nutrient levels and clear waters, are now vulnerable to the impacts of the triple planetary crisis of climate change, biodiversity loss and water pollution (Jenny et al., 2020; Mishra, 2023). These interconnected challenges have created conditions that exacerbate algal blooms, posing significant risks to aquatic ecosystems, water quality and human health (Igwaran et al., 2024).
Among the most damaging blooms are those caused by cyanobacteria, also known as blue-green algae (Vadeboncoeur et al., 2021). Cyanobacterial blooms are especially dangerous because many species of this group produce toxins that can be harmful to aquatic life, terrestrial animals and humans (Chorus & Welker, 2021; Moreira et al., 2022; Svirčev et al., 2022). These blooms are characterized by the presence of both general pigments, such as chlorophyll-a, and more specific pigments, such as phycocyanin, a marker for cyanobacteria (Konik et al., 2023). Phycocyanin is not only crucial for the photosynthetic processes of cyanobacteria but also serves as a valuable indicator to monitor their presence and growth (Almuhtaram et al., 2021). Detection of this pigment has become essential for tracking bloom dynamics and assessing water quality in affected lakes (Binding et al., 2021). Rising global surface temperatures, driven by climate change, have created increasingly favorable conditions for the growth and persistence of cyanobacterial blooms (Zepernick et al., 2022). Warmer waters favor stratification of lakes, reducing water circulation and creating nutrient-rich environments in the upper layers where cyanobacteria thrive (Chorus & Welker, 2021). In addition, warmer temperatures can increase the metabolic rate of these organisms, further accelerating the formation of blooms (Wu et al., 2024).
The cumulative effects of climate change, combined with nutrient enrichment from agricultural runoff and urbanization, have intensified the frequency and severity of these toxic blooms, turning once pristine oligotrophic lakes into sites of ecological concern (Sivarajah, 2020; Devlin & Brodie, 2023).
The need for advanced monitoring and mitigation strategies is more urgent than ever, as harmful algal blooms threaten not only aquatic biodiversity, but also the sustainability of freshwater resources for human use (Ahmed et al., 2022; Kazmi et al., 2022). Remote sensing technologies, especially those capable of detecting phycocyanin, are increasingly important for early detection and management of these blooms (Zahir et al., 2024). At the same time, we acknowledge the complexity of inland water optics, including the influence of inherent optical properties that can introduce uncertainty into reflectance-based estimates. Additionally, the use of multispectral satellite data, particularly from the Landsat series, is limited by its relatively broad spectral bands, which can hinder the accurate detection of narrow absorption features, such as those associated with phycocyanin. A wide range of indices has been created and evaluated, from those adapted from agricultural and terrestrial vegetation applications to combinations of spectral bands specifically designed to detect primary productivity in aquatic environments (Luo et al., 2023). For example, spectral indices such as the Surface Algal Bloom Index (SABI) (Alawadi et al., 2010) and the Floating Algae Index (FAI) (Ma et al., 2021) have been used in Lake Villarrica to determine the spatial distribution of Chl-a, while the Green Normalized Difference Vegetation Index (GNDVI) and the Normalized Difference Vegetation Index (NDVI) have been applied to measure chlorophyll concentration in the aquatic system of Lake Laja (Rodríguez-López et al., 2020). Despite these advances, significant challenges remain.
Many indices do not perform optimally in aquatic ecosystems other than those for which they were initially developed and validated (Lukhabi et al., 2023). This is because conditions in different lakes are influenced by a variety of factors, including geographic, meteorological, and physicochemical characteristics (Melese & Debella, 2023).
It is therefore essential to fill these gaps by developing methods that are more accurate and site-specific, based on the predominant optical characteristics of the aquatic system under study.
Statistical techniques, such as linear regression, are commonly used to estimate water quality parameters (Gad et al., 2023; Adjovu et al., 2023). For example, Rodríguez-López et al. (2020) developed models to estimate chlorophyll concentrations in six lakes in south-central Chile. A subsequent study (Rodríguez-López et al., 2021) showed that combining regression techniques with remote sensing data could achieve the same objective, allowing the study of spatial and temporal variability in these aquatic systems. The integration of statistical methods with data from a variety of sources, including in situ water quality monitoring and satellite observations, can improve the ability to monitor and manage aquatic vegetation (Mukonza & Chiang, 2023; Batina & Krtalić, 2024). Compared with nonlinear models or analytical approaches, linear regression offers the advantages of simplicity, interpretability, and computational efficiency, which are valuable when establishing baseline predictive models for water quality. This, in turn, will contribute to the conservation and sustainable management of these critical natural resources. In Chile, species responsible for harmful algal blooms, mainly cyanobacteria, have been reported in lakes such as Villarrica, Laguna Grande de San Pedro, Vichuquén and Laja (Rodríguez-López et al., 2020; Rodríguez-López et al., 2023; Yépez et al., 2024).
Therefore, the objective of this work is to develop statistical models to estimate algal pigments by combining water quality data with satellite observations.
Lake Villarrica, located at 39°18'S latitude and 72°05'W longitude, is one of the most studied Chilean lakes due to its environmental, economic and social importance (Rodríguez-López et al., 2022). Its basin has multiple uses, including tourism and agriculture. In addition, two important cities of the Araucanía Region lie on its shores: Villarrica and Pucón. A secondary water quality standard has been implemented for this lake that monitors Chl-a, nitrogen and phosphorus nutrients, and water transparency; however, every summer an algal bloom appears that extends over most of the lake surface and lasts for several months (Rodríguez-López et al., 2023).
Limnological parameters, including water temperature (°C), chlorophyll-a (µg/L), phycocyanin (µg/L), turbidity (NTU), and dissolved oxygen (ppm), were recorded during monitoring campaigns conducted by the Dirección General de Aguas (DGA) of Chile between 1989 and 2024. The data, collected at seven monitoring stations as detailed by Rodríguez-López et al. (2023), initially covered all four seasons from 1989 to 2009. However, from 2009 onwards, probably due to resource constraints, measurements were only made in summer and spring. The monitored parameters included surface temperature (measured using the standard thermometry method 2250 B of the NCh 2313 compendium), surface chlorophyll-a (Chl-a, using the fluorometric method), total phosphorus (determined by the standard method 4500 P B, 22nd edition, EAM), total nitrogen (measured using the standard method 4500-N C, 22nd edition, EAM), and water transparency (assessed by Secchi disk depth, SDD). At each station in the lake, water samples were collected from five different depths using a 5 L Niskin bottle.
These samples were stored in thermally insulated boxes, kept cold at about 5°C on ice, before being transported to a collection center for analysis. Chemical analyses were performed at the DGA chemical laboratory, accredited according to Chilean standard NCh ISO 17025 of 2005 by the Instituto Nacional de Normalización.
(Vanhellemont, 2020; Vanhellemont et al., 2016). In this study, two atmospheric correction approaches were used: the default Dark Spectrum Fitting (DSF) algorithm (Vanhellemont et al., 2018; Vanhellemont, 2019; Vanhellemont, 2020) and the earlier Exponential Extrapolation (EXP) method (Vanhellemont and Ruddick, 2014; Vanhellemont and Ruddick, 2015; Vanhellemont and Ruddick, 2016). Corrected Rrs bands for water and some spectral indices were derived from the ACOLITE output. These included individual bands, various band ratios, and additional spectral indices, which were subsequently used as input variables in the next steps for phycocyanin prediction. The bands used ranged from the blue to the shortwave infrared region (B, G, R, NIR, SWIR), and a total of 11 band ratios and 17 spectral indices were calculated. These variables were selected based on expert criteria and an extensive literature review focused on spectral indicators related to water quality, chlorophyll concentration, algal blooms, and nutrient levels in aquatic environments that could be related to phycocyanin (Viso-Vázquez et al., 2021; Wang et al., 2022; Rodríguez-López et al., 2023; Magrí et al., 2023; Lyu et al., 2023; Choi et al., 2023). The surface reflectance values (ρs) of each indicator were extracted in a 3 × 3 pixel window per sampling point (Rodríguez-López et al., 2023). The extraction sites were located entirely over open water, free from coastal influences, floating vegetation, and shadows. Although Figure 1 does not display the precise locations of the sampling points due to the map scale, they are situated well away from the coastline.
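The 3 × 3 extraction and the derivation of band ratios and indices can be sketched as follows. This is a minimal illustration, not the authors' workflow (which used ACOLITE and QGIS): the raster values, the pixel position, and the function names `window_mean` and `spectral_features` are hypothetical. Only NDVI, GNDVI and two band ratios are shown, using their standard normalized-difference definitions; the remaining indices are omitted because the Table 1 formulas are not reproduced in this text.

```python
import numpy as np

def window_mean(band, row, col, half=1):
    """Mean reflectance of the (2*half+1) x (2*half+1) window centred on (row, col).

    NaN pixels (e.g. masked cloud) are ignored. The point is assumed to lie
    at least `half` pixels away from the raster edge.
    """
    win = band[row - half:row + half + 1, col - half:col + half + 1]
    return float(np.nanmean(win))

def spectral_features(b, g, r, n):
    """Band ratios and normalized-difference indices from four reflectances."""
    return {
        "B": b,
        "N/G": n / g,                  # NIR / green band ratio
        "N/R": n / r,                  # NIR / red band ratio
        "NDVI": (n - r) / (n + r),     # standard NDVI definition
        "GNDVI": (n - g) / (n + g),    # standard green NDVI definition
    }

# Hypothetical constant 5 x 5 rasters; values invented for illustration only.
shape = (5, 5)
blue = np.full(shape, 0.02)
green = np.full(shape, 0.04)
red = np.full(shape, 0.03)
nir = np.full(shape, 0.01)
nir[0, 0] = np.nan  # a masked pixel that falls outside the sampled window

row, col = 2, 2  # hypothetical pixel position of a sampling point
means = [window_mean(band, row, col) for band in (blue, green, red, nir)]
features = spectral_features(*means)
print(features)
```

Averaging over the window before computing ratios, rather than averaging per-pixel ratios, is a design choice of this sketch; the source does not state which order was used.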
Importantly, both the sampling points and the extraction pixels were free of clouds and fog. This extraction was carried out in QGIS 3.40 software (QGIS Development Team, 2024), and the area of interest (ROI) was downloaded from the Chilean Geospatial Data Infrastructure (IDE-Chile, https://www.ide.cl/), accessed on 04 September 2024.
Table 1 shows the spectral indices obtained and their specific algorithms.
Building upon previous work (Rodríguez-López et al., 2024 PC), we developed linear regression models to estimate concentrations of the pigment phycocyanin (FCA). In this study, a broader set of explanatory variables was employed, derived from the processing of satellite imagery and incorporating both spectral bands and vegetation indices (see Section 2.3). Due to the limited volume of data available at individual stations (usually < 90 data points), we aggregated all observations across stations to construct generalized models rather than site-specific ones.
To identify the variables most strongly associated with FCA, we first computed pairwise Pearson correlation coefficients between all candidate predictors and the dependent variable. Only those variables exhibiting statistically significant correlations (p < 0.05) were retained for model development.
Regression models were then constructed using three variable selection techniques: stepwise selection (Efroymson, 1960), backward elimination (Kutner et al., 2005), and forward selection (James et al., 2023), to determine the most parsimonious and explanatory model configuration.
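As an illustration of one of these techniques, forward selection can be sketched as below. This is a hedged sketch under stated assumptions rather than the procedure actually run in the study: it assumes the Akaike Information Criterion as the entry criterion, uses the Gaussian least-squares form AIC = n ln(RSS/n) + 2k with constants dropped, and runs on synthetic data with hypothetical predictor names `x0` to `x3`.

```python
import numpy as np

def ols_aic(X, y, cols):
    """AIC of an OLS fit (intercept plus the columns in `cols`).

    Uses n*ln(RSS/n) + 2k, the Gaussian-error form with constants dropped.
    """
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return float(n * np.log(rss / n) + 2 * A.shape[1])

def forward_select(X, y, names):
    """Greedily add the predictor that lowers AIC most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = ols_aic(X, y, [])  # intercept-only baseline
    while remaining:
        aic, j = min((ols_aic(X, y, selected + [j]), j) for j in remaining)
        if aic >= best:
            break
        best = aic
        selected.append(j)
        remaining.remove(j)
    return [names[j] for j in selected], best

# Synthetic data (assumption): only x0 and x2 actually drive the response.
rng = np.random.default_rng(42)
n = 80
X = rng.normal(size=(n, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.1, size=n)
names = ["x0", "x1", "x2", "x3"]

chosen, aic = forward_select(X, y, names)
print(chosen, round(aic, 1))
```

Backward elimination follows the same pattern in reverse, starting from the full predictor set and dropping the variable whose removal lowers the criterion most; stepwise selection alternates both moves.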
These approaches iteratively assess combinations of predictors based on established statistical criteria, such as the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), to optimize the balance between model complexity and explanatory performance.
While linear regression was chosen for its high degree of interpretability, a critical objective for understanding the environmental drivers of phycocyanin, its suitability was formally evaluated. We conducted a direct comparative analysis against a set of common non-linear machine learning models, including Random Forest, Gradient Boosting, Support Vector Regression, and a multi-layer perceptron Neural Network (Rodríguez-López et al., 2023). This analysis was designed to empirically determine whether the substantial loss in model interpretability associated with these more complex algorithms would be justified by a statistically significant improvement in predictive performance.
To assess the generalization performance of the candidate phycocyanin (FCA) regression models and mitigate the risk of overfitting (Cohen & Jensen, 1997) inherent in using model-fit statistics alone (such as AIC or in-sample R²), we employed k-fold cross-validation (k = 10) (Mahmood & Khan, 2009). This procedure provides a robust assessment of predictive capability on data unseen during model training, which is particularly critical given the aggregation of observations across sites (see Section 2.4).
To test model stability and generalizability more rigorously, several advanced procedures were implemented. First, we conducted Leave-One-Out Cross-Validation (LOOCV), the most stringent form of cross-validation for a dataset of this size. Second, bootstrap validation with 1,000 iterations was performed to generate 95% confidence intervals for performance metrics, providing a robust measure of model stability.
Third, temporal validation was executed by training the model on earlier chronological data and testing it on more recent observations to explicitly assess its predictive power on future, unknown conditions. Finally, the selected ordinary least squares (OLS) model was benchmarked against regularized regression techniques (Ridge, Lasso), which penalize complexity to directly diagnose and prevent overfitting. The final model was selected based on the optimal trade-off between predictive accuracy (maximizing R² and minimizing RMSE) (Preacher, 2006), model parsimony, and consistent, robust performance across this entire suite of validation protocols.
Finally, to ensure that the assumptions underpinning linear regression were met, diagnostic checks were performed on the selected model. Predictor collinearity was assessed by calculating the Variance Inflation Factor (VIF) for each variable. Subsequently, residuals were examined through visual inspection of plots against fitted values to confirm linearity and homoscedasticity (Tsai et al., 1998), while quantile-quantile plots (Agustin et al., 2012) were used to assess the normality of the residual distribution. This comprehensive validation and diagnostic process ensures high confidence in the final model's predictive power and its applicability for estimating FCA.
Figure 2 and Table 2 show seasonal variations in turbidity, temperature, dissolved oxygen (DO), chlorophyll-a (Chl-a), dissolved organic matter (DOM) and phycocyanin in Lake Villarrica during 1985-2024.
The correlation analysis was performed to evaluate the linear associations between all potential predictor variables (derived as described in Section 2.4) and the target variable, phycocyanin (FCA).
The complete matrix of Pearson correlation coefficients (r) and their corresponding significance levels (p-values) is presented (Fig. 3, S1).
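Two of the checks described in the validation and diagnostic steps above, leave-one-out cross-validation and the Variance Inflation Factor, can be sketched as follows. This is a minimal illustration on synthetic data, not the study's code: the helper names `fit_ols`, `predict`, `loocv` and `vif` are hypothetical, and the third predictor is deliberately built to be collinear with the first so that the VIF has something to detect.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def loocv(X, y):
    """Refit with each observation held out once; return out-of-sample R2 and RMSE."""
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        beta = fit_ols(X[keep], y[keep])
        preds[i] = predict(beta, X[i:i + 1])[0]
    sse = float(np.sum((y - preds) ** 2))
    r2 = 1.0 - sse / float(np.sum((y - y.mean()) ** 2))
    rmse = float(np.sqrt(sse / len(y)))
    return r2, rmse

def vif(X):
    """VIF_j = 1 / (1 - R2_j), regressing predictor j on all other predictors."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        resid = X[:, j] - predict(fit_ols(others, X[:, j]), others)
        r2 = 1.0 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out.append(float(1.0 / (1.0 - r2)))
    return out

# Synthetic data (assumption): column 2 is partly a copy of column 0.
rng = np.random.default_rng(1)
n = 60
x0 = rng.normal(size=n)
x1 = rng.normal(size=n)
x2 = x0 + 0.5 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = 1.0 + 2.0 * x0 - 1.0 * x1 + rng.normal(scale=0.1, size=n)

r2, rmse = loocv(X, y)
vifs = vif(X)
print(round(r2, 3), round(rmse, 3), [round(v, 2) for v in vifs])
```

In this sketch the collinear pair (columns 0 and 2) shows inflated VIF values, while the independent predictor stays close to 1.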
Based on this analysis, and adhering to the methodology described (Sections 2.3 and 2.4), only predictor variables exhibiting a statistically significant correlation with FCA (p < 0.05, confirmed in Fig. S1) were selected for inclusion in the subsequent regression modeling phase.
The results highlighted several predictors with strong, significant linear relationships to FCA (Fig. 3). Specifically, strong positive correlations were identified between FCA and SABI (r = 0.82), CCI (r = 0.81), the blue spectral band (B; r = 0.78), N/G (r = 0.70), and N/R (r = 0.66). In contrast, significant negative correlations were observed between FCA and GNDVI (r = -0.73), NDVI (r = -0.71), and EVI (r = -0.57).
The variable selection techniques produced a range of high-performing models, as shown in Table 3. While several models yielded high initial R² values, our comprehensive validation framework (detailed in Section 2.5) identified a more parsimonious six-variable model as providing the optimal balance of predictive power, stability, and robustness against overfitting. This model, identified through the stepwise selection method, was selected as the final model for phycocyanin estimation:
FCA = N/G + NDVI + B + EVI + SABI + CCI
This model was chosen because it consistently performed well across stringent validation tests, including LOOCV and temporal validation, confirming its suitability for general application.
To validate the structural integrity of the selected six-predictor model, a suite of advanced diagnostics was performed to move beyond standard performance metrics and explicitly assess its robustness, stability, and risk of overfitting. The results provide strong empirical support for the model's generalizability.
First, to quantify the stability of the model against variations in the training data, a bootstrap validation with 1,000 resamples was conducted.
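A percentile bootstrap of this kind can be sketched as follows. The text does not specify which bootstrap variant was used, so this sketch makes an explicit assumption: observations are resampled with replacement, the model is refit on each resample, and R² is scored on that same resample. The data are synthetic and the function names are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """In-sample R2 of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

def bootstrap_r2_ci(X, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for R2 from case resampling."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # draw n rows with replacement
        stats[b] = r_squared(X[idx], y[idx])
    lo, hi = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lo), float(hi)

# Synthetic data (assumption), standing in for the predictor matrix and FCA.
rng = np.random.default_rng(7)
n = 60
X = rng.normal(size=(n, 2))
y = 0.5 + 1.2 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(scale=0.3, size=n)

lo, hi = bootstrap_r2_ci(X, y)
print(round(lo, 3), round(hi, 3))
```

An out-of-bag variant, scoring each refit on the rows left out of its resample, would give a more conservative interval.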
This analysis yielded a tight 95% confidence interval for the coefficient of determination (R²) of [0.821, 0.927], demonstrating that the model's high predictive power is not an artifact of a specific data partition but is consistently high across the resampling space.
A critical diagnostic for overfitting involved comparing the standard OLS model with a complexity-penalized Ridge regression model. The performance of the Ridge model (R² = 0.835) was nearly identical to that of the unpenalized OLS model (R² = 0.834). This lack of improvement from a penalized approach provides compelling evidence that the OLS model is not over-parameterized and that its performance is not inflated by spurious correlations among its predictors.
Furthermore, to ensure the reliability and interpretability of the individual predictor coefficients, multicollinearity was assessed using the Variance Inflation Factor (VIF). All six predictors in the final model exhibited VIF values below 6.5, a result well within the acceptable threshold, confirming the absence of debilitating collinearity.
Collectively, these diagnostic tests, visualized in Fig. 4, confirm that the selected six-variable model is not only accurate but also statistically robust, stable, and well-specified, justifying its selection as a reliable estimator for phycocyanin concentrations in Lake Villarrica.
To provide a comprehensive justification for the choice of a linear framework, the performance of the final selected model was benchmarked against a set of common non-linear machine learning algorithms. The results, visualized in Figure 5, confirm the suitability of the linear approach. While ensemble models such as Random Forest showed a marginally higher mean cross-validated R² (Fig. 5A), a Mann-Whitney U test confirmed that this performance difference was not statistically significant (p = 0.9937) (Fig. 5H).
Further diagnostics revealed broad similarities in model behavior.
Both linear and non-linear models exhibited comparable learning curves and low overfitting risk (Fig. 5B, 5C), produced similarly distributed residuals (Fig. 5E), and showed visually similar predictive accuracy when plotted against actual values (Fig. 5F). Critically, an analysis of feature importance demonstrated that both model types identified a similar set of influential spectral indices, with the Cyanobacteria Index (CCI) ranking as a key predictor across all approaches (Fig. 5D).
Given these comparable performance and diagnostic characteristics, the decision was based on practical advantages. The linear model offers significantly greater computational efficiency, with training times orders of magnitude faster than the ensemble methods (Fig. 5G). This efficiency, combined with the superior interpretability of its coefficients, positions the linear model as the optimal choice on the performance-interpretability trade-off spectrum (Fig. 5I). Therefore, because no significant performance gain was observed to justify the substantial loss of interpretability and efficiency, the linear regression model was confirmed as the most appropriate choice for the scientific objectives of this study.
The selected model (Section 3.3) was used to estimate FCA concentrations across the full time series. Due to data limitations at individual stations, model evaluation was conducted at the weekly scale for the entire lake (Fig. 6), rather than per site. The linear model successfully captured the seasonal dynamics, interannual variability, and several extreme values observed in the measured data. Overall model performance was high, with an R² of 0.91 and an RMSE of 0.050 µg/L.
To further assess the model's utility for operational and year-round monitoring, its performance was disaggregated and evaluated by season. The analysis revealed consistently high performance across distinct seasonal conditions.
The model achieved an R² of 0.93 (RMSE = 0.057 µg/L) in autumn, an R² of 0.88 (RMSE = 0.089 µg/L) in summer, and an R² of 0.80 (RMSE = 0.085 µg/L) in spring. This consistency is quantified by the coefficient of variation (CV) of the R² metric across seasons, which was a low 0.063 (6.3%). This result provides strong empirical evidence that the model's predictive power is stable and reliable throughout the year, making it a suitable tool for continuous monitoring applications.
Given the use of linear regression and the implementation of cross-validation during model selection, an additional train-test split was not required. The model's parsimony and interpretability further support its application for long-term phycocyanin estimation. Figure 7 shows the selected model mapped onto a Landsat 8 satellite image from 14 March 2020.
Figure 7. Mapping of the best estimation model selected.
Figure 8 presents bar diagrams comparing the seasonal abundance of eight phytoplankton groups at seven stations (VR1 to VR7) during summer and spring. Bacillariophyceae (brown bars) dominate in both seasons, with higher total abundance observed in summer, especially at VR5, VR6 and VR7. In particular, Cyanophyceae (green bars) show a significant presence at VR7 during summer, indicating possible cyanobacterial blooms, while their contribution is minimal in spring. Other groups, such as Chlorophyceae (yellow) and Cryptophyceae (blue), are present but less abundant in both seasons.
Seasonal differences reveal that total phytoplankton abundance is systematically higher in summer, suggesting favorable environmental conditions, such as warmer temperatures or greater nutrient availability. In spring, abundance is lower at all stations, with Bacillariophyceae still dominant but other groups contributing more visibly. Station VR7 stands out for its high variability, with Cyanophyceae thriving in summer but almost absent in spring.
These patterns highlight changes in community composition influenced by seasonal and spatial factors. In Figure 9, the bar plots depict the abundance of various cyanobacteria species across seven stations (VR1 to VR7) during summer (left) and spring (right) in Lake Villarrica. In summer, total cyanobacteria abundance is significantly higher, particularly at VR4, VR5, and VR7, with dominant species such as Microcystis spp., Dolichospermum lemmermannii, and Anabaena spp. VR5 shows the highest abundance, while VR6 and VR7 display relatively lower totals but with notable contributions from Anabaena spiroides and Aphanocapsa sp. In spring, cyanobacteria abundance is drastically reduced at all stations, with fewer dominant species, including Anabaena spp., Anabaena spiroides, and Microcystis elachista. The seasonal variation highlights a pronounced bloom in summer, likely driven by favorable conditions, while spring displays a more subdued cyanobacteria presence.
Lake Villarrica, located in south-central Chile, is one of only two lakes in the country, alongside Lake Llanquihue, that benefit from secondary environmental quality standards for the protection of continental surface waters, as well as from Exempt Resolution SMA N°671/2016, which proposes and justifies the modification of the water quality monitoring performed by the DGA in Lake Villarrica. Due to multiple anthropogenic impacts, the lake's condition has changed from oligotrophic to mesotrophic, which has led to the implementation of Decree N°19/2013 of the Ministry of Environment (Rodríguez-López et al., 2024). These standards are crucial for protecting aquatic ecosystems and public health, particularly in regions under increasing anthropogenic pressure. The lake holds significant ecological, economic, and social value, positioning it as a major tourist destination in southern Chile.
Its strategic location also includes two urban centers on its shores, the cities of Villarrica and Pucón, which according to the last census, conducted in 2017, have populations of 55,478 and 3,260 inhabitants, respectively (INE, 2017). Both contribute to, and are impacted by, the dynamics of the lake's water quality.
Over the past decade, Lake Villarrica has experienced recurrent episodes of algal blooms, primarily during the austral summer months, which coincide with the peak tourist season. These events have raised concerns due to their potential ecological consequences, aesthetic impacts, and implications for recreational water use. More recently, there has been a noticeable increase in both the frequency and duration of these blooms, which now extend from late spring through to early autumn. This shift in phenology is likely linked to climate-driven changes, particularly the warming of surface waters, which creates more favorable conditions for bloom development (Rodríguez-López et al., 2023).
In our previous research (Rodríguez-López et al., 2023), we focused on the remote detection of algal blooms in Lake Villarrica using satellite-based observations and field validation techniques. These studies enabled us to characterize the spatial and temporal patterns of bloom occurrences, as well as to identify the dominant algal taxa responsible for bloom formation. Notably, we found that the blooms were frequently dominated by Dolichospermum circinale (formerly Anabaena circinalis), a filamentous cyanobacterium belonging to the class Cyanophyceae. This species is known for its ability to produce harmful cyanotoxins, raising additional concerns for human and ecosystem health.
In Rodríguez-López et al. (2024), we extended this line of research by assessing the potential of remotely sensed phycocyanin concentrations as an indicator of bloom toxicity.
Phycocyanin is a water-soluble pigment specific to cyanobacteria and serves as a reliable proxy for their presence and abundance. By estimating phycocyanin levels using hyperspectral and multispectral satellite data calibrated with in situ measurements, we aimed to distinguish potentially toxic cyanobacterial blooms from non-toxic algal proliferations. This approach contributes to the development of early warning systems and management tools for monitoring harmful algal blooms (HABs) in freshwater ecosystems.
Unlike earlier approaches that primarily emphasized presence-absence classification, our objective was to develop predictive models capable of estimating continuous phycocyanin values using satellite-derived data.
We employed surface reflectance data obtained from the Landsat satellite series, specifically leveraging information from the blue, green, red, and near-infrared (NIR) spectral bands. In addition to individual bands, we computed a suite of spectral indices known to be relevant for aquatic environments and vegetation dynamics, including the Normalized Difference Vegetation Index (NDVI), Green NDVI (GNDVI), Enhanced Vegetation Index (EVI), Surface Algal Bloom Index (SABI), Cyanobacteria Index (CCI), and various band ratio combinations such as NIR/Green (N/G).
To explore the predictive capacity of these variables, we constructed 255 multiple linear regression models using different combinations of spectral bands and indices as independent variables, with phycocyanin concentration as the dependent variable.
Model performance was assessed using a set of validation metrics, including the coefficient of determination (R²) and root mean square error (RMSE), and we retained the ten highest-performing models for further analysis.
The most effective model, selected through a stepwise regression procedure, incorporated six statistically significant predictors (p < 0.05) and took the following form:
FCA = N/G + NDVI + B + EVI + SABI + CCI
This model achieved a strong fit to the validation data, with an R² of 0.852 and an RMSE of 0.102 µg/L, indicating high explanatory power and relatively low error in phycocyanin estimation. When applied to the complete weekly time series of satellite observations, the model successfully captured both seasonal dynamics and interannual variability in phycocyanin concentrations (R² = 0.91; RMSE = 0.050 µg/L). These results demonstrate the robustness and practical utility of the model for long-term monitoring of harmful algal blooms in Lake Villarrica.
The inclusion of multiple vegetation and algal indices allowed for improved sensitivity to variations in water optical properties associated with cyanobacterial abundance. Notably, the combination of traditional vegetation indices (e.g., NDVI, EVI) with indices specifically tailored to cyanobacterial detection (e.g., SABI, CCI) enhanced the model's capacity to discriminate bloom conditions under a range of environmental scenarios.
Our findings support the integration of satellite-based regression modeling as a cost-effective and scalable tool for monitoring toxic cyanobacterial blooms in freshwater ecosystems. The approach not only enables frequent and spatially extensive assessments but also provides a solid foundation for the development of early warning systems to protect public health and aquatic biodiversity.
We expect future research to incorporate in situ spectral measurements and explore the integration of higher-resolution sensors, such as Sentinel-2A/B, to improve model calibration and support more robust water quality assessments.
This study highlights the potential of satellite remote sensing, specifically the use of Landsat imagery, for supporting the continuous monitoring of harmful algal blooms (HABs) in ecologically and socially valuable lakes such as Villarrica in southern Chile. A multivariate regression model, integrating both general (NDVI, EVI) and cyanobacteria-targeted indices (SABI, CCI), demonstrated promising accuracy in estimating concentrations of phycocyanin (R² = 0.852; RMSE = 0.102 µg/L), a key proxy for toxic cyanobacterial presence. While these results are encouraging, it is important to acknowledge the limitations associated with the moderate spatial and spectral resolution of Landsat data, which may reduce the model's sensitivity to optically complex water conditions influenced by CDOM, suspended sediments, or bottom effects.
From an applied perspective, these findings offer a useful foundation for developing early warning systems to support lake management strategies, especially in regions facing increasing pressure from climate change and eutrophication. Future efforts should incorporate in situ spectral measurements and leverage higher-resolution satellite sensors (e.g., Sentinel-2 or hyperspectral missions) to improve detection capability and enable more responsive, cost-effective monitoring programs tailored to the needs of local water authorities.
Keywords: remote sensing, Phycocyanin, algal blooms, lake, Chile
Received: 27 Sep 2025; Accepted: 17 Nov 2025.
Copyright: © 2025 Rodríguez López, Bustos Usta, Duran-Llacer, Bravo, BOURREL, Frappart and Urrutia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Lien Rodríguez López, lien.rodriguez@uss.cl
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
