Combining Imaging Flow Cytometry and Molecular Biological Methods to Reveal Presence of Potentially Toxic Algae at the Ural River in Kazakhstan

Algal blooms occur in freshwater bodies throughout the world, often leading to fish kills. Cases of these kills along the Ural River were reported in 2018–2019, involving significant amount of sturgeon in fish farming areas. In this study, the analysis of algal samples from the delta of the Ural River up to 100 km inland was carried out from August to December 2019 using imaging flow cytometry (IFC), molecular biological, and microscopic techniques. We identified the filamentous cyanobacteria Cuspidothrix issatschenkoi, Dolichospermum cf. flos-aquae, Dolichospermum cf. macrosporum, Pseudanabaena limnetica, and Planktothrix spp. as the dominant potentially toxic phytoplankton species, and we also found minor quantities of Cylindrospermopsis raciborskii. For the first time, molecular phylogenetic investigations of field clones of cyanobacteria from Ural River were carried out to establish the taxa of the dominant species and to identify the presence of genes encoding toxins. The complementary analysis with nanopore-based next-generation sequencing overlapped with the results of IFC and was instrumental in revealing minor cyanobacteria taxa. Real-time PCR analysis and sequencing indicated the presence of Microcystis and ADA-clade spp. as well as genes associated with the production of microcystin (mcyE) and the algal neurotoxin saxitoxin (sxtA) originating from cyanobacteria. These findings suggest that toxin-producing cyanobacteria could become a threat in the Ural River near Atyrau, which can significantly affect aquaculture in the region.

Algal blooms occur in freshwater bodies throughout the world, often leading to fish kills. Cases of these kills along the Ural River were reported in 2018-2019, involving significant amount of sturgeon in fish farming areas. In this study, the analysis of algal samples from the delta of the Ural River up to 100 km inland was carried out from August to December 2019 using imaging flow cytometry (IFC), molecular biological, and microscopic techniques. We identified the filamentous cyanobacteria Cuspidothrix issatschenkoi, Dolichospermum cf. flos-aquae, Dolichospermum cf. macrosporum, Pseudanabaena limnetica, and Planktothrix spp. as the dominant potentially toxic phytoplankton species, and we also found minor quantities of Cylindrospermopsis raciborskii. For the first time, molecular phylogenetic investigations of field clones of cyanobacteria from Ural River were carried out to establish the taxa of the dominant species and to identify the presence of genes encoding toxins. The complementary analysis with nanopore-based next-generation sequencing overlapped with the results of IFC and was instrumental in revealing minor cyanobacteria taxa. Real-time PCR analysis and sequencing indicated the presence of Microcystis and ADA-clade spp. as well as genes associated with the production of microcystin (mcyE) and the algal neurotoxin saxitoxin (sxtA) originating from cyanobacteria. These findings suggest that toxin-producing cyanobacteria could become a threat in the Ural River near Atyrau, which can significantly affect aquaculture in the region.

INTRODUCTION
The incidence and intensity of harmful algal blooms (HABs) as well as the associated economic impact have increased worldwide with warming water temperatures in recent years (Paerl and Paul, 2012;Jeppesen et al., 2017;Griffith and Gobler, 2020;Wells et al., 2020;Fernandes-Salvador et al., 2021). HABs are complex events typically caused by multiple factors occurring simultaneously, may threaten ecosystems, and lead to the degradation of water quality for recreation, drinking, and aquaculture (Heisler et al., 2008;Oliver et al., 2012;Jeppesen et al., 2017) with further changes in temperature, nutrient load, and oxygenation of aquatic environments. The toxin production by certain cyanobacteria (e.g., Dolichospermum circinalis, Cylindrospermopsis raciborskii, and Microcystis aeruginosa) during algal blooms causes the accumulation of toxins in the tissues of fresh-and saltwater mussels, fish, and mammals (Rapala et al., 1997;Codd et al., 2005;Wilson et al., 2005;Carmichael, 2008;Mihali et al., 2009;Oliver et al., 2012;De Pace et al., 2014;Gibble et al., 2016;Li et al., 2016;Luerling et al., 2017;Meriluoto et al., 2017;Chernova et al., 2019). The harmful effects can be caused by either direct effects from toxins or changes in the water chemistry, such as oxygen depletion or hydrogen sulfide production (Eriksson et al., 1986;Tornazo et al., 1990). Together with nutrients, the temperature is considered an essential factor in the growth, metabolism of cyanobacteria (Reynolds, 2008), and the adaptation of invasive potentially toxic cyanobacterial species, particularly Cylindrospermopsis (=Raphidiopsis) raciborskii (Padisák, 1997;Thomas and Litchman, 2016). In the last decade, the increasing contamination of freshwater bodies by anthropogenic pollution, as well as the upward temperature gradient have had major effects on further expansion of cyanobacteria in freshwater bodies (Sukenik et al., 2015). Although most studies are focused on the seasonal algal blooms in summer and spring seasons, cold weather and winter cyanobacterial blooms have been documented throughout the literature (Naselli-Flores et al., 2007;Babanazarova et al., 2013;Wejnerowski et al., 2018). Water temperature directly affects rates of respiration, photosynthesis, and algal growth in aquatic ecosystems (Regier et al., 1990). The link between temperature and toxin production by cyanobacteria remains elusive and is governed by the complex interaction among all environmental factors.
Importantly, improving the ability to monitor and detect HABs and cyanobacterial toxins are required for better documentation and understanding of algal blooms and the prevention of their effects on fisheries and farmed aquaculture (Shumway, 1990;Pearl and Tucker, 1995;GEOHAB, 2001;Ramsdell et al., 2005;Anderson et al., 2012). Strong species-specific interactions between the environment and phytoplankton communities make HAB predictions difficult (Griffith and Gobler, 2020). Monitoring programs for HABs were developed initially for marine HABs and generally relied on the microscopic identification and manual time-consuming counting of phytoplankton (Anderson et al., 2012). The inability of conventional flow cytometry to determine species composition is one of the major reasons why flow cytometry could not be adopted as a standard tool for monitoring harmful algae (Rutten et al., 2005). In the HAB monitoring and research, there are several different approaches to identify causative species and relevant toxins, including the development of nucleic-based biosensors in microarray and real-time polymerase-chain reaction (PCR) formats (Dierks-Horn et al., 2011;rev. McPartlin et al., 2017), imaging flow cytometry (IFC) (Buskey and Hyatt, 2006;Campbell et al., 2010Campbell et al., , 2013, and satellite analysisbased algorithms. The FlowCAM (Yokogawa Fluid Imaging Technologies, United States) is an imaging flow cytometer designed to characterize microplankton size range particles (Sieracki et al., 1998) and used by different research groups for semi-automated monitoring of algal blooms (Buskey and Hyatt, 2006;Lehman et al., 2013;Dashkova et al., 2017). FlowCAM software have been allowed to discriminate and quantitate different potentially toxic algae within complex natural assemblages, including Microcystis colonies (Buskey and Hyatt, 2006;Lehman et al., 2013Lehman et al., , 2017Sukenik et al., 2015;Graham et al., 2018;Kurobe et al., 2018;Mirasbekov et al., 2021).
In common HAB monitoring strategies, the number of field samples required to be analyzed by light microscopy for taxonomical species identification dramatically reduces the possibilities of early detection of HAB initiation and dispersion phases (Antonella and Luca, 2013). In addition, the optical-based methods of identifying toxic cyanobacteria (light microscopy or IFC) do not differentiate toxin-producing from non-toxin-producing species. By taking in the account these limitations of traditional imaging approaches (microscopy and cytometry) in monitoring programs, molecular methods have been implemented for detection of pre-bloom algal abundances. That includes many real-time PCR protocols to monitor toxic cyanobacteria (rev. Martins and Vasconcelos, 2011). Identifying and quantifying of the potentially toxic species present may not suffice to estimate the potential toxic risk, because not all of the strains may produce toxins. Screening based on molecular targeting is more precise and sensitive with advantage of distinguishing non-toxic and toxic strains (Pearson and Neilan, 2008). Within most important cyanotoxins, PCR-based methods were used to identify toxic genotypes of cyanobacteria-producing microcystins, cylindrospermopsin, paralytic shellfish toxins (PSTs), anatoxins, and other toxins (Pomati et al., 2000;Foulds et al., 2002;Vaitomaa et al., 2003;Kurmayer and Christiansen, 2010;Pearson et al., 2010;Martins and Vasconcelos, 2011;Antonella and Luca, 2013). Microcystin production has been the most studied between cyanobacteria and is well known among several genera of cyanobacteria, including Microcystis, Anabaena, Pseudoanabaena, Planktothrix, and others (Kurmayer and Christiansen, 2010;Pearson et al., 2010). Cylindrospermopsin (CYN) is the second most widely occurring hepatotoxin produced by cyanobacteria species, including C. raciborskii and Aphanizomenon/Cuspidothrix spp. (Pearson et al., 2010). Several species of freshwater filamentous cyanobacteria have been reported to produce PSTs (saxitoxins) (Humpage et al., 1984;Mahmood and Carmichael, 1986). Although initially thought to be rare, cyanobacterial anatoxin-a (ANTX) has been detected in many freshwater habitats (Bruno et al., 2016;Ballot et al., 2018). Real-time PCR protocols can be helpful in defining the presence of toxic cyanobacteria genotypes during bloom-and non-bloom seasons (Martins and Vasconcelos, 2011;Antonella and Luca, 2013). However, molecular probes and PCR primers are designed for specific targets and may overlook unknown members of an algal community. In freshwater cyanobacteria research, next-generation sequencing (NGS) has been recently used to assess the general diversity mostly based on 16S rRNA gene amplicons analysis (Liao et al., 2016;Woodhouse et al., 2016;Lezcano et al., 2017;Scherer et al., 2017). Lately, the portable MinION device offered a unique solution for dinoflagellates HAB microbiome research (Hatfield et al., 2020). Recently released, the Oxford Nanopore MinION device (United Kingdom) is a third-generation portable sequencing platform valuable for applications with a large number of samples or where portability is more important than increased error rate (5-15%, Pfeiffer et al., 2018) (Castro-Wallace et al., 2017Johnson et al., 2017;Gowers et al., 2019;Burton et al., 2020). However, only a few studies focused on harmful algae in water bodies combining NGS assessment with toxic genotype analysis by real-time PCR (Fortin et al., 2015;Lee et al., 2015).
Although most studies are focused on the seasonal algal blooms in summer and spring seasons, cold-weather and winter cyanobacterial blooms have been documented throughout the literature (Naselli-Flores et al., 2007;Babanazarova et al., 2013;Wejnerowski et al., 2018). Water temperature directly affects rates of respiration, photosynthesis, and algal growth in aquatic ecosystems (Regier et al., 1990;Joehnk et al., 2008). The link between temperature and toxin production by cyanobacteria remains elusive and is governed by the complex interaction among all environmental factors. It was found that the highest toxin contents, such as microcystin (MC), cylindrospermopsin, and other toxic compounds, are not associated with temperature conditions most favorable for cyanobacteria growth (Saker and Griffiths, 2000 for cylindrospermopsin;Savadova et al., 2018 for MC).
In this work, a combination of high-resolution IFC, NGS, and PCR-based methods were used to evaluate the presence of potentially toxic cyanobacteria along the Ural River, including the places of recent fish kills. Initial detection and identification of potentially toxic algae was done with IFC, and then confirmed with molecular methods. This combination of methodological approaches will be a valuable way to determine the presence and frequency of potentially toxic detectable cyanobacteria. Additionally, metagenomic analysis with the Nanopore sequencing platform provided an alternative, complementary way to identify and characterize potential harmful algae strains.

Water Sampling
Ural River is the third river in terms of its length among European rivers, and it flows from the Ural mountains to the delta of the Caspian Sea. The temperature in this area can be as low as −38 • C and as high as 46 • C (Kazhydromet Climate Monitoring Bulletin, 2016). This project had covered the region of the river within the borders of Atyrau city, upstream from the city border (100 km away), and downstream to the delta of the Caspian Sea. The sampling sites are illustrated in Figure 1, and they were assigned as follows: S1 Delta > S2 Channel > S3 Ship > S4 low Peretaska > S5 Balykshy > S6 Peretaska > S7 Oil refinery water intake-ANPZ > S8 Atyrau Su Arnasy > S9 Geolog > S10 Almaly > S11 Saraishyk > S12 low Makhambet > S13 upper Makhambet > S14 Ural River 100 km (distances are as follows: S1-S2 14.8 km; S2-S3 21.2 km; S3-S4 9.2 km; S4-S5 8.6 km; S5-S6 2.1 km; S6-S7 1.6 km; S7-S8 6.7 km; S8-S9 5.1 km; S9-S10 16 km; S10-S11 45 km; S11-S12 23.2 km; S12-S13 17.5 km; S13-S14 24.6 km) The samples were collected across the length of the river in August-December 2019, noting that sampling in October and December was limited by two sampling points at Peretaska duct, near a local cogeneration plant and oil refinery.
Shortly, the surface water sampling was carried out, and samples for IFC analysis of the phytoplankton communities and hydrochemistry were collected at each location in plastic 5-L bottles from surface water horizon (0.5 m depth) in plastic 5-L bottles. Water turbidity was checked by the Secchi disk. Phytoplankton samples were fixed in triplicates with different fixators: paraformaldehyde (0.5%), glutaraldehyde (0.5%), and Lugol's solution. Fresh, live samples were preserved in the cold box until delivered to the laboratory and then kept in the refrigerator at 4 • C. Water parameters such as dissolved oxygen, salinity, conductivity, temperature, and total dissolved solids at the sampling site were measured on-site. The phosphate concentration was calculated by the standard molybdate method (Motomizu et al., 1983).

Satellite Data Analysis
Land surface temperature is one of the physical observation remote sensing parameters which are of great importance for different disciplines, such as environmental studies, hydrology, etc., which contributes to a better understanding of energy and water exchange. NASA's Landsat mission has collected a huge amount of relatively high spatial resolution thermal data as it has a thermal infrared (TIR) sensor onboard since 1984 (Barsi et al., 2003). In this study, Landsat 8 data were used on Atyrau city to retrieve land surface temperature for recent dates as it is the source of new images. Landsat 8 has two thermal bands with a spatial resolution of 100 m and a 16-day revisit time (Rozenstein et al., 2014). The data included bands 4, 5, and 10 of scenes LC08_L1TP_166027_20181201_20181211_01_T1 (December 01, 2018), LC08_L1TP_167027_20181208_20181211_01_T1 (December 08, 2018), LC08_L1TP_167027_20190314_2019032 5_01_T1 (March 14, 2019), and LC08_L1TP_167027_2 0191211_20191217_01_T1 (December 01, 2019).

The Land Surface Temperature Calculation
The main workflow is based on the algorithms proposed by Chander and Markham (2003), Chander et al. (2009), andWalawender et al. (2012). Band ten from Landsat 8 images was used to calculate the top of the atmospheric radiance (TOR) first, and then it was converted into the at-sensor temperature.
FIGURE 1 | Representation of sampling points on the Ural River. Sampling points from S1 to S3 were taken near the Ural delta, sampling points S4-S9 were taken within the city boundaries, where most of the fish die-off happened at the end of 2018-beginning 2019, sampling points S10-S14 were taken above the city boundaries upstream of the Ural River. Sampling points P1-P4 are taken during October 2019 expedition.
In parallel, the red and the near-infrared bands were used to calculate NDVI. The NDVI was used to calculate the ground emissivity. Both derivatives were integrated into the final land surface temperature products for each date. The time series covering the period between December 2018 and December 2019 were processed using Python 3 codes.

Ion Chromatography
To determine the ionic composition of water, ion chromatography technique (IC) was used. Before the IC analysis, samples were filtered with 0.22 µm cellulose acetate filters using a vacuum pump. The analysis of fluoride, chloride, nitrite, bromide, nitrate, phosphate, and sulfate ions in water samples was performed using a Dionex ICS-6000 HPIC system (Thermo Fisher Scientific, Waltham, MA, United States) equipped with a conductivity detector, two chromatographic columns: for the separation of cations and anions, and Chameleon 6.0 software. Anion separation was carried out using a Dionex IonPac AS11-HC-4 µm capillary, analytical, and guard column (2 × 250 mm) and 30 mmol potassium hydroxide solution as eluent. To determine the cations, we used a Dionex IonPac CS12A-5 mSm IC column (3 × 150 mm). A calibration chart was built from 1 to 30 ppb; the correlation coefficient was 0.998. Quality control was acquired at 15 ppb with 40 times diluted samples.

Ultra High-Performance Liquid Chromatography Coupled With Diode Array Detection
An UHPLC-diode array detection (DAD) method was chosen because it can quickly confirm the presence of microcystin (MC-LR) and nodularin in aqueous samples (Rapala et al., 2002;Thuret-Benoist et al., 2019). The UHPLC system used was an Ultimate 3000 instrument (Thermo Fisher Scientific, United States) equipped with DAD detector. UV-spectra were drawn from 200 to 300 nm. Separation of the toxins was achieved on reversed-phase C18 analytical column thermostated at 35 • C. Both the aqueous, water (A), and organic, acetonitrile (B), mobile phases contained 0.025% trifluoroacetic acid (TFA). The following linear gradient program was used: 0 min 70% A, 1.2 min 70% A, 6.2 min 30% A, 7 min 0% A, 7.4 min 70% A; stop time 10 min. The sample injection volume was 5 µl. The flow rate was 0.350 ml/min. For identification of cyanotoxins in water samples, we used standards of MC-LR and nodularin (Thermo Fisher Scientific, United States) with a concentration of 500 µg/L. Stock solution of the toxins was prepared in methanol and stored at −20 • C. For solid-phase extraction (SPE), C18 cartridges HyperSep Phenyl, 500 mg (Thermo Fisher Scientific, United States) were used. All solvents (distilled water, methanol, and acetonitrile) were UHPLC gradient grade. TFA was at least of analytical reagent grade. The filtered samples were passed through a SPE cartridge. Analytes were eluted from the solid phase with a small amount of 90:10 methanol:reagent water (v/v). The extract were concentrated to dryness by evaporation with SP Genevac Rocket Synergy 2 evaporator (SP Inc., Anchorage, AK, United States), and then adjusted to a 1-ml volume with 90:10 methanol:reagent water (v/v). The HPLC method was optimized by using ultrapure grade water with diluted MC-LR and nodularin standards (Thermo Fisher Scientific, United States).

Light Microscopy
For the morphological identification of organisms, samples were analyzed with a Leica DM2500 microscope (Leica, Wetzlar, Germany) with ×100, ×200, and ×400 magnifications. The plankton organisms were classified at the genus level and species level, where possible. Dolichospermum and Aphanizomenon/Cuspidothrix could not always be determined to species level because of the lack of akinetes that are a key feature in the morphological identification of nostocalean cyanobacteria. In such cases, the Dolichospermum spp. were classified according to the filament type as "curved" or "straight."

Imaging Flow Cytometry
Cell abundance was counted using FlowCAM VS-4 (Yokogawa Fluid Imaging Technologies, United States). A mixture of 5, 10, and 25 µm size beads (Yokogawa Fluid Imaging Technologies, United States) was used for the calibration of the instrument. The FlowCAM was run in autoimage and/or laser-triggered mode using a ×10 objective and a 100-µl flow cell. The results from the FlowCAM were analyzed using VisualSpreadsheet software version 4.0 (Yokogawa Fluid Imaging Technologies, United States). Classification of the dominant genera of phytoplankton was conducted using a semi-manual mode of the software and manual selection of the representative training set of images. To create the subsets within the classification process, statistical filtering, which takes the highest score value as a threshold, was used. Each run of statistical filtering chooses particles with the value of likeness higher than the threshold value. The results of the automated classification were manually rechecked in case of erroneous or missed particles. Classification included subsets of the following major cyanobacterial genera: Dolichospermum, Microcystis, Pseudanabaena, Aphanizomenon/Cuspidothrix, Cylindrospermopsis, and Sphaerospermopsis (representative images in Figure 2).

MinION Sequencing
Based on FlowCAM IFC analysis, water samples from five different Ural River locations were selected for NGS. More specifically, samples were taken from the delta of river Ural, inside Atyrau town line (Peretaska artificial stream), and near Makhambet village in August 2019 field expedition. DNA extraction was performed using PowerWater DNA Isolation Kit (Qiagen, United States). The resulting DNA concentrations were in a range of 2-10 ng/µl as defined by measurement with a Nanodrop spectrophotometer (Thermo Fisher Scientific, United States) set at 260 nm. The methodology was adapted from manufacturer's instructions in the protocol library of Oxford Nanopore Technologies website. The 16S Barcoding Kit (SQK-RAB204) was used for library preparation, which included PCR reactions and attachment of sequencing adapters. The former procedure was performed by preparation of reaction mixture that included nuclease-free water (14 µl), DNA sample (10 ng, diluted with nuclease-free water to reach 10 µl), DreamTaq Hot Start PCR Master Mix (25 µl), and 16S barcode primer at 10 µM (1 µl). There were 12 different 16S barcodes, and they were assigned for defined samples from Ural River. For instance, Barcode 01 was used for the sample site S1. The thermocycler was set for the following parameters: 1 min at 95 • C, followed by 25 cycles with 20 s at 95 • C, 30 s at 55 • C, 2 min at 65 • C for and ended by a final extension of 5 min at 65 • C. PCR products were then cleaned up by AMPure XP beads (Beckman Coulter, United States), and the mixture of PCR products (with different barcodes) was prepared. The sequencing adapters were added to this mixture and incubated for 5 min in room temperature. After the library preparation, the DNA library was mixed with loading beads and sequencing buffer (provided by the manufacturer). The MinION device was used for sequencing with flow cell R9.4. DNA library was transferred to the flow cell according to the manufacturer's recommendations, and sequencing run continued for about 14 h. The basecalling was turned on during the run, and the produced data (Fastq files) were analyzed by EPI2ME cloud-based data analysis platform. The "Fastq 16S r2020.04.06" workflow was used to build 16S classification taxonomy.

PCR Analysis of Toxic Cyanobacteria in Ural River Environmental Samples
Based on FlowCAM IFC analysis, water samples from five different locations in the Ural River were selected for molecular analysis (labeled as samples S1, S4, S6, S12, and S13). Primers were designed for the identification of microcystin synthetase gene E (mcyE), the aspartate aminotransferase domain for saxitoxin synthesis (sxtA), amidinotransferase gene for cylindrospermopsin synthesis (cyrA), and genes for the biosynthetic production of anatoxin-a (anaC), according to Vaitomaa et al. (2003) reaction volume of 50 µl. All reaction tubes were transferred to a thermocycler (Bio-Rad, Hercules, CA, United States) for a 3-min initial denaturation step at 95 • C. That was followed by 40 reaction cycles, with one cycle consisting of 30 s at 95 • C, 30 s at temperature depending on T m of primers, and 60 s at 72 • C. The PCR steps were followed by a final extension step at 72 • C for 5 min. Length of PCR products were analyzed by DNA electrophoresis using 1% agarose gel with SYBR Safe DNA Gel Stain (Life Sciences, United States). Samples with PCR products were purified and sequenced. The resulted DNA sequences were used for the identification of Microcystis-specific toxin synthase genes. Sanger sequencing was performed by commercial entity (Evrogen, RF).

Real-Time PCR
The real-time SYBR Green I qPCR was performed using the Bio-Rad CFX96 Touch Real-Time PCR instrument (Bio-Rad, United States). All reagents were provided by Syntol (Moscow, Russian Federation) and used according to the manufacturer's protocol. The reaction volume of 20 µl included 5 µl of extracted DNA [concentration range of 2-10 ng/µl as defined by measurement with a Nanodrop spectrophotometer (Thermo Fisher Scientific, United States)], 5 µl of nuclease-free water, 2.5 µl of vortexed MgCl 2 solution, 2.5 µl of dNTP, 2.5 µl of SYBR Green buffer, 1 µl of each primer (10 µM), and 0.5 µl of SynTaq-polymerase. The reaction protocol included an initial denaturation step at 95 • C for 3 min, and 40 cycles of denaturation (30 s, 95 • C), annealing (30 s, T m of primers), and extension (60 s, 72 • C) were followed by melting curve assessment. Each reaction was performed in triplicates. Microcystis genome DNA extracted from toxic strain served as a positive control, and a no-DNA reaction well served as a negative control.

Statistical Analysis
Multivariate analysis techniques provide appropriate statistical tools for analysis of environmental parameters and their effects on ecological communities (Greenacre and Primicerio, 2013). The principal component analysis (PCA) was performed to identify the significant environmental conditions influencing distribution detected by IFC cyanobacterial genera in Ural River sampling sites. The analysis was performed using STATISTICA vs. 8 (StatSoft, Tulsa, OK, United States). Environmental parameters included temperature, pH, conductivity, solids (referred as TDS), dissolved oxygen (referred as DO), and NO 3 − and NH 4 + concentrations as explanatory data (Supplementary File 1). Correlations between the abundances of different cyanobacteria and environmental factors were analyzed by Spearman correlation analysis (STATISTICA vs. 8, StatSoft, United States).

Sample Sites Description and Environmental Parameters
Sampling sites S1-S3 were located downstream the Ural River, closer to the Caspian Sea basin. Sampling sites S4-S9 were located within the city area, and sampling sites S10-S14 were located upstream 20-100 km away from the city. The main hotbed of the fish kills was in the urban region of the city, starting from sampling point S5 Balykshy until sampling point S9 Geolog. The area corresponds to the residential districts of the city, as well as the industrial districts. For instance, there is an artificially created water channel called Peretaska (S6), the main purpose of which is water intake by the local thermal power station and oil refinery. Water then exits at a higher temperature (from our measurements at 30 • C on October 10 and 26 • C on December 24, 2019) and is returned to the upper part of the Peretaska, which then heads in the direction of the Caspian Sea. The temperature difference between the upper and lower Peretaska is significant. Water in the upper part of the Peretaska channel differs from that in the Ural River in terms of its greenish color and high turbidity. The river flow is minimal and unobservable. Sampling points S4-S8 are located near the industrial areas, while S9 Geolog is a residential area near the border of the city.
By applying the satellite analysis approach described in the section "Materials and Methods, " we derived the land surface temperature of the study, referring to the four dates indicated above. Results demonstrate a strong thermal anomaly near Atyrau's power plant and oil refinery on December 1 and 8, 2018 (Figures 3A,B) and December 2019 (Figure 3D), while images acquired in the spring show a significant decrease in the surface temperature for the same area ( Figure 3C). Furthermore, the locations of a fish kill in December 2018 and 2019 were mapped and combined with the land surface temperature map, derived from the images acquired by Landsat 8 on December 1, 2018 (closer to the date of initial fish kills).
The detailed information on ion concentration is provided in Supplementary File 1. The highest concentration of nitrates in water was detected in the artificial water duct S6 Peretaska at 4.69 mg/L of water leading to the cogeneration plant. Accordingly, the oil refining plant's water intake system from the Ural River directly corresponds to the sampling site S7 Oil Refinery Water Intake, where a nitrate concentration of 3.92 mg/L was detected. Phosphate concentrations were slightly increased in the Peretaska locations than the Ural River (Atyrau Su) location. Oxygen concentrations were significantly low in the Peretaska duct (6.5 mg/L). No differences in chloride concentrations between Peretaska sites and the Ural River were detected in water samples in August-December 2019.

Identification and Abundances of Potential Toxic Algae in the River Ural Region
We have identified the cyanobacteria Cylindrospermopsis (=Raphidiopsis) raciborskii, Sphaerospermopsis aphanizomenoides, Dolichospermum spp., Aphanizomenon/Cuspidothrix spp., and Pseudanabaena limnetica as dominant potentially toxic phytoplankton species in the Ural River using light microscopy and IFC (Figure 2). The genera of these potentially toxic cyanobacteria were earlier described as forming algal blooms with the capability of releasing life-threatening toxic compounds (e.g., Codd et al., 2005;Paerl and Otten, 2013;Rastogi et al., 2015). According to the classification results, the most abundant species of potentially toxic cyanobacteria in the Ural River were Dolichospermum and Aphanizomenon/Cuspidothrix spp. Sampling site S1 is the delta of the Caspian Sea, which correspondingly encountered the lowest abundance of cells and a higher water salinity of 0.42%. The S2 channel sampling site is a shallow artificial channel connecting the Caspian Sea to the Ural River with brackish water (0.15%) and a higher abundance of the target species.
The abundance of cells of potentially toxic algae was high in the mouth of the river duct Peretaska, an area with low circulation rates and increased water temperature. The parameters of water in the S6 Peretaska site included a heightened NO 3 concentration (4.69 mg) and a slight increase compared with the sampling point temperature (23 • C) at the Ural River. The low Peretaska was low in biodiversity, and Dolichospermum spp. was most abundant among potential harmful algae with a cell count equal to 10 filaments/ml. The temperature of this sampling site was 30 • C in August and October and decreased to 26 • C by the end of December.
The maximum number of cells of Aphanizomenon/Cuspidothrix spp. reached almost 900 particles/ml at the S11 Saraishyk sampling site (Figure 4;  Supplementary File 2). The site was distinguished by the high NH 4 content in the water (1.19 mg/L). In addition, both the S11 Saraishyk and S10 Almaly sites showed low numbers of C. raciborskii. The S12 low Mahambet site had a high  concentration of NH 4 (3.25 mg/L) and is the only sampling point where Microcystis sp. was detected by IFC.

MinION-Based Sequencing
After the sequencing run of 14 h, it resulted in 1,088,000 passed reads with a total yield of 1.6 gigabase pairs of sequencing data. The average sequence length was 1,485 base pairs with an average quality score of 9.29. The 16S amplicon analysis in EPI2ME platform resulted in 896,029 classified reads with an average accuracy of 84%, 864,719 of which were used in the further taxonomic analysis in sample sites S1, S4, and S12. The raw sequences from nanopore-based NGS analysis had a significantly higher error rate compared with short-read sequencers, similar to findings by other researchers (Pfeiffer et al., 2018). Large variations in the raw number of reads between different sampling points were observed. The classification was represented by matches in NCBI database (Altschul et al., 1997), and the number of sequencing reads was interpreted to the estimated relative abundance of species in sample sites of the Ural River. Figure 6 illustrates the relative phylogenetic distribution at the phyla level. The data shows that S1 (Delta) was abundant with cyanobacterial organisms, while the sampling site of S4 (low Peretaska) was mainly inhabited by Proteobacteria and Firmicutes, respectively. A more detailed exploration of cyanobacterial phylum also indicated differences in the distribution within these sites (Figure 7). The species-level identification with cyanobacterial origin yielded from one (S4) up to 63 different matches (S1) in NCBI database. According to relative abundance data of these species, Planktothrix spp. were predominant, and low quantities of Microcystis spp. were also identified at the S12 site. Potentially toxic algae Nodularia spumigena was identified by NGS but not by IFC as the most abundant among cyanobacterial species at S1 (delta) site.

Identification of Toxic Genes in Cyanobacteria Using PCR Methods
Polymerase-chain reaction analysis was used to identify Microcystis-specific mcyE gene according to the primer design provided by Vaitomaa et al. (2003). The genome assembly of M. aeruginosa NIES-843 (Kaneko et al., 2007) was used to find an expected sequence of PCR products. Its size was 247 base pairs, and only S12 sample had a corresponding significant band. The semi-quantitative analysis by SYBR Green I realtime PCR revealed the presence of this gene in sample sites S1 and S12, assessed by the similar melt peak as in positive control. The relative gene abundance data was standardized (ratio-based) by concentration of input DNA extract. Sample S1 contained 2.05-fold more mcyE gene per nanograms of total DNA than the S12 sample.
The PCR amplification of the sxtA gene had an expected size of 148 base pairs (Rantala-Ylinen et al., 2011) and only sample S13 showed the positive band in the following size range. To validate the presence of this toxin production gene, the real-time PCR analysis was performed with positive control of standards from multiplex qPCR kit (Phytoxigene CyanoDTec TM Toxin Genes Test; Diagnostic Technology, United States). The results showed specific amplification of the sxtA gene only in sample S13. The relative sxtA gene abundance in site S13 was considerably small with a 1.23 × 10 −4 -fold difference, when compared with FIGURE 6 | Bacterial community composition at the phylum level based on 16S rRNA next-generation sequencing at the Ural River S1, S4, and S12 sites.
FIGURE 7 | Cyanobacterial community composition based on 16S rRNA next-generation sequencing at the Ural River S1, S4, and S12 sites.
positive control. However, as the positive control from the kit had a known concentration of 100,000 copies/µl, it allowed us to estimate the concentration in S13 sample to 36.8 copies per µl of the extracted total DNA sample.
The presence of cylindrospermopsin (cyrA) and anatoxin (anaC) genes was not identified by the abovementioned PCR methods. To sum up, the mcyE was detected at the environmental samples from S1 and S12 sample sites, below Makhambet village, and the delta of Ural River location, respectively, and sxtA gene was detected at S13, upper Makhambet site.

Sanger Sequencing of PCR Products
Polymerase-chain reaction products were purified and sequenced using the forward primer of Microcystis-specific mcyE gene (Evrogen, Moscow, Russian Federation). Resulted sequence data for S1 and S12 are presented in Table 2. Sequencing of samples from S4, S6, and S13 sites did not produce any DNA data of cyanobacterial origin.
Sequence data was analyzed via BLAST algorithm, and the best matches for the two sites are shown in Table 3.
The sequence homology analysis in NCBI nucleotide database revealed that PCR products from Ural delta (S1) and low Makhambet (S12) matched with Microcystis-specific mcyE gene with a percentage identity between 88 and 96%, and S13 sample (upper Makhambet) matched with ADA-clade spp. sxtA gene (99%).

DISCUSSION
CyanoHABs are very complex phenomena and early detection and tracking cyanotoxin producers and predicting of the bloom dynamics remain as unsolved challenges for water managers and aquaculture farmers. From December 2018 through March 2019, the significant mortality of farmed sturgeon and other fish species were reported in the Ural River near the town of Atyrau, Kazakhstan, with initial findings of dead fish in the artificial duct of the Peretaska channel, which supplies water to the local power plant and oil refinery. The Peretaska duct was an initial point of a huge fish die-off on the Ural

Sample
Sanger sequencing data S1-mcyE GCCCGAGCAACGAATACTCATAGACTTAAGATCGAAACTTCTATCCTAGCTTTCTTGATGACAAAAAAGACAATGGAACCGGCGATTTAGG CAAGCAAACTGCTCCCGTGTATCATTGAGTTTATGGGACGAAAAGATAATCAAGTTAAGGTCAATGGTTATCGAATTGACCCCGGAGAAATT CAAACACGACCGTTGCTCCTTGA S12-mcyE CCTCAATCACTAGCCAACCCTGAAATGACTCAGGAAATGTTAAACCTAGCTTTCTTGAGTGAGACAAAAACTCTCTTTAGAACCGGCGATTT AGGCAAGCAAACTGCTCCGGGTATCATTGAGTTTATGGGACGAAAAGATAATCAAGTTAAGGTCAATGGTTATCGAATTGACCCCGGAGAA ATTGAATATCAATTGACTCGTTATGCTCCCATTGA S13-sxtA TCGATTAGAGAGTATCCTCTCAGTATTCAATCAGTTACCATTGGAAGAACAGCCCAAATACATTGCTGATATGGGGTGTGGCGATGGAACTC TCCTGAAACGAGTCTGGGAAACA River and aquaculture fish farms in December 2018 (>100 tons of artificially grown sturgeon), and a small fish die-off was observed in December 2019. The temperature anomaly registered by Landsat 8 in this area (Figures 3A,B,D) is supported by in situ measurements of water temperature in the Peretaska duct below the oil refinery and cogeneration plant. The decrease in temperature in March 2019 coincides with a temporary interruption to the biological treatment plant at the oil refinery due to preventive maintenance ( Figure 3C). The thermal discharge from power plants leads to an elevation in temperature of the receiving waters (Li et al., 2011) and promotes phytoplankton growth and algal blooms, particularly in cold seasons (Hickman and Klarer, 1975;Jiang et al., 2012Jiang et al., , 2019a. Thermal discharge plumes near power plants are thermally dynamic environments with strong influences on aquatic organisms. Many scientists report decreasing diversity of aquatic communities, functional and species richness, and even large mortality events of benthos invertebrates near discharge sites or in discharge channels (Suresh et al., 1993;Chuang et al., 2009;Teixeira et al., 2009;de Szechy et al., 2017;Lee et al., 2018;Khosravi et al., 2019). It seems that a shift in temperature is more important than exact temperature numbers, emphasizing a possible role of toxins in quorum sensing. The number of different toxins may increase with changes in growth conditions and temperature shifts (Kleinteich et al., 2012). In our field expedition, we found a number of potentially toxic cyanobacteria species originally using FlowCAM-based IFC (supported by light microscopy), including invasive C. raciborskii, along all of the 100-km stretch of the Ural River-from the delta S1 sampling point up to S14. C. raciborskii is a cyanobacterium species with a well-known invasive origin (Padisák, 1997;Moreira et al., 2011), and causes increasing concern because of its potential toxicity (Briand et al., 2004). The presence of C. raciborskii was occasionally reported in the region close to the Northern Caspian Sea (Kazakhstan) starting in the twentieth century (Padisák, 1997). However, in the current study, we did not identify the presence of genes encoding cylindrospermopsin in the field samples.
The cyanobacterial blooms are often composed from nontoxic and toxic strains within morphologically identical species, and the reports of multicyanotoxin co-occurrence are becoming increasingly common (Graham et al., 2010;Pitois et al., 2018). The initial choice of detailed analysis of sampling points for the presence of toxic algae was made with IFC, followed by NGS and PCR-based molecular methods. We choose to use a combination of optical methods (IFC) and NGS because the classical optical-based morphological methods may not provide sufficient identification resolution (Eiler et al., 2013;Xiao et al., 2014;Zimmerman et al., 2015;Dzhembekova et al., 2017;Parulekar et al., 2017). Moreover, the traditional cyanobacterial taxonomy has been based on morphological features, but this classification is revised with an accumulation of molecular sequence data (Komárek et al., 2014;Hugenholtz et al., 2021). The taxonomic classification was done with a 16S NCBI database, however, the choice of database may have an effect on the results of taxonomic identification (Park and Won, 2018;Rizal et al., 2020;Winand et al., 2020). Metabarcoding based on 16S rRNA gene revealed a moderately diverse cyanobacterial community at S12 and S1 sampling points, and only a few cyanobacterial species were found at the S4 point (Figure 6) where the heated water stream of the artificial Peretaska channel (30 • C, October 2019) was represented mostly by Proteobacteria and Firmicutes. Proteobacteria are always present in high-temperature ecosystems and often the dominating phylum in hot springs (Chan et al., 2017;Panda et al., 2017;Adjeroud et al., 2020), as the diversity and richness of microbial communities in thermal springs are negatively affected by temperature (Chan et al., 2017). In the winter, due to the thermal discharge, the temperature in the Peretaska channel remains high (26 • C; December 2019); however, it becomes more permissible for cyanobacteria and other microalgae, which may lead to algal blooming. Temperature increase caused by thermal discharge from power plants may promote phytoplankton growth and blooms, particularly in cold seasons (Jiang et al., 2019a,b).
The structure of the cyanobacterial distribution in the field samples defined by NGS analysis suggested the predominance of filamentous Planktothrix sp. and N. spumigena in the brackish S1 sampling point. NGS results together with IFC confirmed a predominant presence of Planktothrix sp. as well as the presence of low abundant Microcystis sp. at the S12 sampling point (low Makhambet). Although there are still common concerns about the nanopore-technology accuracy, mobile sequencing with MinION has already been transformative for bacterial and viral pathogen outbreaks (Quick et al., 2015(Quick et al., , 2016Boykin et al., 2019) and in the context of freshwater analysis (Reddington et al., 2020). We used MinION-based sequencing to complement IFC information on the presence of potentially toxic cyanobacteria at sampling points and found that nanopore amplicon sequencing allowed us to get extended information about the presence of even minor quantities of cyanobacterial genera. Thus, at the S1 sampling point, the predominance of Planktothrix sp. and a presence of minor quantities of N. spumigena were confirmed by 16S rRNA gene metagenomics, but not by IFC. However, a number of experimental intricacies should be addressed using this approach related to the DNA extraction yields and sample filtrate volumes and database incompleteness that can be curated by adding local species (Abad et al., 2016).
As 16S rRNA-based NGS only provides identification of potentially toxic algae, PCR was required to confirm the presence of toxigenic cyanobacteria species utilizing toxin synthesis genes. Field samples from S12 and S13 (near Makhambet village) demonstrated the presence of strains expressing mcyE or sxtA genes identified by traditional and real-time PCR. Data from real-time PCR as well as data from Sanger sequencing suggested that low abundant Microcystis sp. was the only microcystin producer at the S12 sampling point (low Makhambet), and toxigenic ADA-clade spp. were detected at S13 (upper Makhambet) sampling point (Figure 4 and Table 3). This is in line with other studies, which found that the ADAclade is a highly significant component of cyanobacterial HABs (Dreher et al., 2021), and dominant microcystin producer is not necessarily the predominant cyanobacterial species in the water stream (Dadheech et al., 2014;Lee et al., 2015;Scherer et al., 2017). In summary, our study demonstrates that potentially toxic cyanobacteria including toxigenic strains widely distributed in Ural River waters. We hypothesize that the presence of potentially toxigenic cyanobacteria in side channels of the Ural River can lead to yearly seasonal fish kills depending on the floods dynamics and winter temperature shifts.
After our manuscript submission to the journal, another fish kill happened in this region (May 02-09, 2021; Baksay) (in the water bodies temporarily connected to Ural River main stream during spring river flooding). The results of UHPLC/DAD analysis of water samples taken a few days later confirmed the presence of MC-LR and nodularin in the water probes (Supplementary File 2).

CONCLUSION
During our field study of the Ural River in August-December 2019, we documented the occurrence of potentially toxigenic cyanobacteria in Ural River from the brackish-water delta up to 100 km. Aphanizomenon/Cuspidothrix spp., Dolichospermum spp. (most likely D. cf. flos-aquae and D. cf. macrosporum), and, in some locations, Planktothrix sp. and Microcystis sp. were dominant phytoplankton species, and we also found minor quantities of C. raciborskii. The presence in the field samples of toxigenic genotypes (mcyE+, sxtA+) was confirmed by realtime PCR and Sanger sequencing. The complementary analysis with nanopore sequencing showed potential for using portable and fast NGS for real-time monitoring and research of harmful algae, but currently, a taxonomic identification is still limited by the availability of suitable databases and cost. A combination of IFC with molecular methods demonstrates a potential for monitoring algal blooms and can lay a foundation for addressing these risks in aquaculture and river management of the Ural River region. We, therefore, suggest that the monitoring of toxic algae levels in the Ural River should continue based on the potential of combined IFC and NGS approach among others and on the construction of Ural River-specific database of toxigenic cyanobacteria taxa.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: NCBI BioProject [accession: PRJNA714785].

AUTHOR CONTRIBUTIONS
AAbd, KuS, YM, and NB conducted the field expeditions, collected and analyzed data, and contributed to the funding acquisition. YM, AAbd, KuS, KaS, DP, IV, DM, and NB performed the methodology. YM, KuS, AAbd, AAbi, KaS, DP, GA, ZI, IV, DM, and NB contributed to the data curation and data analysis. KaS contributed to the satellite data analysis. NB performed the supervision of the project. NB and YM wrote a first draft of the manuscript. All of the authors wrote, read, edited, and reviewed a manuscript.

FUNDING
The study was funded by Nazarbayev University grant FDCRGP #110119FD4513 to NB and a research grant to AAbd and KuS from FRIP Program of Young Researchers Alliance. DM was partly working under FPGEC Interdisciplinary Scientific and Educational School of MSU, and IV was supported by MES #AP05134153/GF4.