Bioassessment of multiple stressors in Afrotropical rivers: Evaluating the performance of a macroinvertebrate-based index of biotic integrity, diversity, and regional biotic indices

Many streams and rivers outside conservation areas across the Afrotropics face multiple stressors from land use change, urbanization, and excessive water withdrawals. Thus, there is a need to develop cost-effective tools for assessing and monitoring ecological changes to inform management decisions. Studies utilizing macroinvertebrate communities as indicators of the ecological condition of streams and rivers in the Afrotropics use diverse methods, including diversity, richness, biotic and multimetric indices. However, some of these indices are region- or country-specific, which limits their general use across multiple regions or countries. In this study, we address this challenge by testing and comparing the performance of diversity and richness indices (e.g., Shannon-Wiener and Simpson), regional biotic indices (the South African Scoring System Version 5 [SASS5], Tanzanian River Scoring System [TARISS] and a biotic index developed for the Ethiopian highlands [ETHbios]), and a macroinvertebrate-based index of biotic integrity (M-IBI) in assessing the ecological condition of Afrotropical rivers, with the transboundary Mara River, Kenya and Tanzania, as a case study. We analyzed water and habitat quality degradation caused by multiple stressors such as land use change, organic pollution and flow alteration, and the corresponding responses in macroinvertebrate communities. We utilized macroinvertebrate data collected from 143 sites covering the entire gradient of the river and its major tributaries in Kenya and Tanzania. To develop the M-IBI, we used 12 metrics that describe macroinvertebrate community richness, composition, tolerance to disturbances (indicator taxa), and the composition of functional feeding groups.
Although all the biotic indices were sensitive to poor water quality and human disturbance of the river, the M-IBI performed better than the regional biotic indices (SASS5, TARISS and ETHbios) and the diversity and richness indices, showing a higher ability to discriminate among site categories according to different levels and types of disturbance. Diversity and richness indices performed poorly and failed to discriminate between stressor gradients in the river. This study demonstrates a need for testing and evaluating indices or protocols before adoption and use in biomonitoring streams and rivers in other countries and regions. There is an even greater need to assess the tolerance of macroinvertebrate taxa before inclusion in biotic indices for improved performance as discriminators of multiple stressors.

Identifying the major anthropogenic threats in aquatic ecosystems and understanding how biological communities respond to the changes is essential in developing bioassessment protocols for streams and rivers. Monitoring programs prefer biological indicators because they integrate both spatial (small- and large-scale) and temporal (short- and long-term) human disturbances (Karr and Dudley, 1981; Davies and Jackson, 2006; Hughes, 2019). Development of bioassessment or biomonitoring protocols involves determining the effects of pollution and other disturbances, whether natural or human-induced, on the presence or absence of species at a site or on changes in species composition following disturbance (Barbour et al., 1999). The presence or absence data are used to assign tolerance scores or sensitivity weightings to the various species or taxa, with highly sensitive taxa assigned a high weighting and highly tolerant taxa a low weighting. Biomonitoring assists in identifying regional biotic attributes and patterns, as well as potential sources and causes of degradation. It can also be used to detect and assess cumulative impacts and evaluate the effectiveness of pollution control and remediation activities (Barbour et al., 1999; Hering et al., 2006). Biomonitoring protocols and programs form part of the regular monitoring and management of water quality and river health in riverine ecosystems worldwide (Buss et al., 2015). Compared to biological indicators, assessments of water chemistry, physical habitat structure, and landscape or riverscape condition usually explain less than half the variability in biological conditions and are extremely sensitive to sampling effort and natural variability (USEPA, 2016; Hughes, 2019).
Biological communities have been used to assess the ecological status of many ecosystem types globally (Vadas et al., 2022; Ofogh et al., 2023). Although limited in comparison to other continents (e.g., Herman and Nejadhashemi, 2015; Ruaro et al., 2020; Feio et al., 2021; Vadas et al., 2022), several studies have used biological communities to study the condition of streams and rivers in Africa. These studies use heterogeneity (richness) and diversity indices (Odume and Muller, 2011; Olawusi-Peters and Ajibare, 2014; Soko and Gyedu-Ababio, 2015; Arimoro and Keke, 2017; Masese et al., 2020), regional or country-specific biotic indices (Dickens and Graham, 2002; Aschalew and Moog, 2015; Kaaya et al., 2015; Dallas et al., 2018; Musonge et al., 2020) and multimetric indices (Odume et al., 2012; Mereta et al., 2013; Lakew and Moog, 2015; Edegbene, 2021; Edegbene et al., 2019; Tampo et al., 2020; Edegbene et al., 2022; Kaboré et al., 2022). Diversity indices provide a numerical measure of species diversity in a community based on composition and structure rather than on the number of species alone, while richness or heterogeneity indices are an indicator of the relative diversity of species in a community (Mouchet et al., 2010; Magurran, 2013). The use of diversity and richness indices as a measure of ecological condition assumes that their values decrease with environmental degradation and can reveal community compositional differences among sites or over time in response to disturbances (Lake, 2000; Ives and Carpenter, 2007; Sundstrom et al., 2017). In contrast, biotic indices employ the sensitivity values of different taxa in a community to develop an overall score of the ecological condition depending on the level of disturbance (Washington, 1984; Chessman and McEvoy, 1997; Dallas, 2021). The basis of biotic indices is the presence or absence of taxa (species, genus, family) in response to the severity of a particular stressor (Junior et al., 2015).
Examples of macroinvertebrate-based biotic indices in Africa include the South African Scoring System Version 5 (SASS5; Dickens and Graham, 2002), the Tanzanian River Scoring System (TARISS; Kaaya et al., 2015) and the biotic score (ETHbios) developed for the Ethiopian highlands (Aschalew and Moog, 2015).
Compared to species diversity, richness, and biotic indices, multimetric indices (MMIs) integrate a community's biological attributes or metrics into a single index that is responsive to different forms of disturbance (Karr, 1981; Hughes et al., 1998; USEPA, 2016). In this regard, a community attribute or metric is defined as a calculated term or enumeration representing some aspect of biological assemblage structure, function or other measurable characteristic that changes predictably with human influence (Barbour et al., 1995). MMIs are effective for biomonitoring because they are better at identifying and discriminating the effects of different stressors on ecological conditions (Hering et al., 2006; Lunde and Resh, 2012). They are considered among the most cost-effective methods for biomonitoring aquatic ecosystems (Bonada et al., 2006; Ruaro and Gubiani, 2013). Reflecting this popularity, literature reviews show that MMIs have been used to evaluate water quality and the overall ecological integrity of aquatic ecosystems globally (Resh and Jackson, 1993; Buss et al., 2015; Herman and Nejadhashemi, 2015; Ruaro et al., 2020; Eriksen et al., 2021). In Africa, macroinvertebrate- and fish-based MMIs have been developed and used in different regions, including eastern Africa (e.g., Masese et al., 2009a; Aura et al., 2010; Raburu and Masese, 2012; Achieng et al., 2021), southern Africa (e.g., Hocutt et al., 1994; Kleynhans, 1999; Odume et al., 2012) and western Africa (e.g., Hugueny et al., 1996; Thorne and Williams, 1997; Toham and Teugels, 1999; Tampo et al., 2020; Kaboré et al., 2022).
Despite the broad use of diversity, richness, biotic and multimetric indices on the African continent, studies that compare the efficacy and applicability of these different indices in assessing water quality and the overall ecological condition of streams and rivers are limited. Specifically, most biotic and multimetric indices have not been standardized or tested for evaluating aquatic ecosystems beyond the regions or countries where they were first developed (Masese et al., 2013; Mangadze et al., 2019; Dallas, 2021). Additionally, most countries in Africa lack a legal provision for the use of biological criteria as a basis for assessing the condition of surface waters at the national level, which would require defining and standardizing the biomonitoring protocols or tools used for this purpose (Dallas, 2021; Masese et al., 2022). Nevertheless, the use of biotic indices and MMIs based on benthic macroinvertebrates is growing across the continent, with some being used without testing and validation. For instance, while SASS was developed for assessing water quality and the ecological condition of South African rivers (Dallas, 1997; Chutter, 1998; Dickens and Graham, 2002), version five of SASS (SASS5) has been applied without modification in other countries in southern and eastern Africa, including Zimbabwe (Bere and Nyamupingidza, 2014; Mwedzi et al., 2016), Swaziland (Mthimkhulu et al., 2004) and Kenya (Oigara and Masese, 2017; Mbaka et al., 2014; M'Erimba et al., 2014). Similarly, TARISS, which is a modified version of SASS5 developed for assessing streams and rivers in Tanzania, has recently been applied in Rwanda and Uganda (Tumusiime et al., 2019; Turibamwe and Wangalwa, 2020; Ochieng et al., 2021) without modifications.
Likewise, while some MMIs have been tested and validated for monitoring particular African aquatic ecosystems (e.g., Raburu and Masese, 2012; Moges et al., 2016; Tampo et al., 2020; Achieng et al., 2021; Kaboré et al., 2022), others have been used without validation, including those of Toham and Teugels (1999), Masese et al. (2009a), Aura et al. (2010), Alemu et al. (2018) and Aura et al. (2021). These biotic indices and MMIs must therefore be tested for performance and validated before being used widely to assess the ecological conditions of streams and rivers across the African continent (e.g., Bere and Nyamupingidza, 2014; Musonge et al., 2020).
Because they respond in varied ways to multiple stressors, macroinvertebrates are among the most widely used bioindicator organisms in streams and rivers (Barbour et al., 1999; Raburu, 2003; Birk et al., 2012; Carter et al., 2017; Ruaro et al., 2020). In addition, macroinvertebrates have many practical advantages for bioassessments, including a wide distribution in most streams and rivers, ease of sampling and identification at the family level, and relatively inexpensive sampling equipment (Rosenberg and Resh, 1993; Barbour et al., 1999; Bonada et al., 2006). Moreover, in the Afrotropics, high-elevation, low-order streams have a very low diversity of fishes (Kleynhans, 1999; Mangadze et al., 2016; Raburu et al., 2022), the alternative bioindicator group for use in river ecosystems (Barbour et al., 1999). This low diversity of fishes makes macroinvertebrates the best available biotic group for biomonitoring headwater or high-elevation streams.
In this study, we tested the applicability and performance of diversity and richness indices, regional biotic indices, and a macroinvertebrate-based index of biotic integrity (M-IBI) to assess the influence of multiple stressors on the ecological condition of the Afromontane-savanna Mara River in Kenya and Tanzania. We focused on the Mara River as a case study because it is transboundary between Kenya and Tanzania. The river also experiences multiple stressors arising from land use change, soil erosion and nutrient runoff from farmlands and grazing areas, excessive water withdrawals, organic matter and nutrient input by livestock and large wildlife, human waste from urban settlements and tourism establishments, and mercury from artisanal gold mining in the lower basin. The specific objectives were to: 1) determine structural responses in macroinvertebrates to changes in water quality across different land uses and stressors in the river, 2) use regional indices (ETHbios, SASS5 and TARISS) and a macroinvertebrate index of biotic integrity (M-IBI) to assess the ecological condition of different sites in the river, and 3) compare the sensitivity of the different groups of indices (diversity and richness indices, biotic indices and the M-IBI or MMI) as indicators of water quality and ecological condition of the river.

Study area
This study was conducted in the Mara River basin (Figure 1). The river originates from the Mau Escarpment and Loita Hills in Kenya and drains into Lake Victoria in Tanzania. In the headwaters, the Mara River drains remnants of tropical broadleaf forests interspersed with tea plantations and mixed agriculture (crop farming and livestock rearing). Until the middle of the past century, the river basin was covered by extensive Afromontane forests in its headwaters and a mixture of shrublands and grasslands throughout its middle and lower reaches (Mati et al., 2008). However, extensive loss of the native forests, grasslands and shrublands to human settlements, agricultural and grazing lands has occurred over the years, resulting in reduced coverage of natural vegetation in the basin (Serneels et al., 2001; Mati et al., 2008; Mango et al., 2011). The land use changes have resulted in a 27% reduction in grasslands and shrublands, a 32% reduction in forest cover, and a 203% increase in agriculture (Mati et al., 2008).
The Mara River basin spans varying climatic conditions that range from the humid uplands and the sub-humid upper middle reaches on the slopes and base of the Mau Escarpment to the semi-arid middle and lower reaches in Kenya and Tanzania. Two perennial tributaries, the Nyangores and Amala, drain the humid and sub-humid uplands and join to form the Mara River mainstem (Figure 1). In the middle and lower reaches, all tributaries draining the grasslands and shrublands are ephemeral, including the Talek and Sand Rivers, despite their extensive drainage basins. The Talek and Sand Rivers drain the Kenyan part before the Mara River crosses into Tanzania. In Tanzania, most tributaries draining the Serengeti grasslands and adjoining areas are seasonal, including the Gurubi, Nyarwera, Nyarusondobiro, Tobora, Somoche and Tigithe. The Tigithe River is the last of the major tributaries in the lower reaches to join the Mara River before it enters the extensive river-mouth wetland, and it is one of the most impacted by gold mining activities.
On the Mau Escarpment, the climate is cooler and seasonal due to the high altitude, with two dry and two wet seasons. The amount of rainfall varies across the basin depending on altitude, with the highlands receiving around 2,000 mm of rain per annum (pa) while the lowlands receive around 1,000 mm pa (Camberlin et al., 2009). Dry conditions are experienced during January-March and wet conditions during March-July and October-November, which are the periods of the "long rains" and "short rains," respectively. However, variations sometimes occur at the onset and end of the rainy and dry periods. Potential evapotranspiration varies from 1,400 mm in the highlands to 1,800 mm in the lowlands (Jackson and McCarter, 1994).
The Mara River is one of the most important rivers in East Africa as the only permanent water source for large wildlife in the Serengeti-Mara Ecosystem in northern Tanzania and southern Kenya. It acts as a signpost for wildebeests that cross the river multiple times during their annual migrations (Gereta et al., 2009; Holdo et al., 2009). The Mara River basin also supports over one million people, with over 60% directly reliant on the river for watering their livestock and for domestic water supply (Hoffman et al., 2011). The basin is also ecologically diverse, with the wetter upper reaches supporting Afromontane forests and mixed but intensive agriculture of both livestock and subsistence crops. In contrast, the drier middle reaches outside conservation areas support both small- and large-scale cultivation of subsistence and cash crops and livestock grazing. However, pressures on the conservation areas in both the headwaters and middle reaches are high, and there is evidence that the changing land use has had significant effects on water quality and the natural flow regime of the river and its tributaries (Melesse et al., 2008; Kilonzo, 2014; McClain et al., 2014).

Study design
A total of 143 sites were sampled in this study. The sites cover the Mara River and its tributaries in Kenya and Tanzania. Most of the sites have been sampled before for other ecological studies, and these provided some of the historical data used in this study (Minaya et al., 2013; Masese et al., 2014a; Masese et al., 2014b; Kilonzo, 2014; McClain et al., 2014). Additional sites were selected along the Talek and Sand Rivers in Kenya, and other tributaries in Tanzania, to represent the basin's highly hydrologically variable and modified streams and river reaches.
Sites were grouped into four categories depending on the percentages of forest and agricultural land use and the levels of stressors they experienced. The percentages of forest, grasslands, and agricultural land use in the entire catchment upstream of the sampling sites were calculated based on the Digital Elevation Model of Kenya (90 m by 90 m), obtained from the Shuttle Radar Topography Mission (Wanderi et al., 2022). Sub-catchments were delineated, and the area of each major land use category upstream of each sampling site was calculated. Forest (FOR) and agriculture (AGR) sites drained catchments with >60% of the catchment land use under forest and agriculture, respectively. Mixed (MIX) sites did not meet the catchment land use criteria for FOR or AGR sites and had an intermediate mixture of two or three major land use types. The four site categories were: 1) Forested sites (FOR), which represented minimally disturbed sites in forested areas and acted as a reference for the rest of the sites, 2) Agricultural sites (AGR), which were located in agricultural areas, 3) Mixed sites (MIX), which were located in catchments draining forested and agricultural land uses, and 4) Mara sites (MARA), which were located in the Mara River mainstem and tributaries where the stressors were flow variation, loading of organic matter and nutrients, and trampling by livestock and large wildlife (mainly hippopotami). FOR sites in the Mau Forest Complex in the headwaters of the Amala and Nyangores rivers were used as reference sites. Along FOR streams, the riparian corridor was largely intact, with a mixture of indigenous vegetation throughout the catchment. Compared to AGR sites, water in FOR streams was less turbid, even immediately after storm events.
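The land use rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_site` and the boolean flag `mainstem_wildlife_zone` are hypothetical, and the flag stands in for the MARA category, which the text defines by location and stressor type rather than by a land use threshold.

```python
def classify_site(pct_forest, pct_agriculture, mainstem_wildlife_zone=False):
    """Assign a site category from upstream catchment land use (percent).

    mainstem_wildlife_zone is a hypothetical flag for MARA sites, which are
    defined by location and stressors, not by a land use percentage.
    """
    if mainstem_wildlife_zone:
        return "MARA"
    if pct_forest > 60:        # >60% forest upstream
        return "FOR"
    if pct_agriculture > 60:   # >60% agriculture upstream
        return "AGR"
    return "MIX"               # intermediate mixture of land uses


# Hypothetical catchments: (percent forest, percent agriculture)
examples = {
    "site_a": classify_site(82, 10),
    "site_b": classify_site(15, 71),
    "site_c": classify_site(45, 40),
    "site_d": classify_site(5, 20, mainstem_wildlife_zone=True),
}
```

Checking the MARA flag first reflects the assumption that mainstem sites are never reassigned on the basis of land use alone.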

Sampling methods
We sampled sites during the dry season in January-March and July-August and the wet season in April-June and October-December from 2012 to 2018. During the dry season, water flow levels were low, turbidity was low, and bed stability was greater. Sampling habitats were also more distinguishable and accessible, and macroinvertebrates were most abundant, although not necessarily richest in taxa (Melo and Froehlich, 2001; Hughes and Peck, 2008). In contrast, wet season flows were fast and deep, and the water was more turbid. The deep water and fast flows made it challenging to access some of the permanent habitats in the middle of the large rivers for macroinvertebrate sampling. This meant that in some cases, the biotopes sampled were recently inundated, resulting in low abundances of macroinvertebrates.

Environmental variables
The water chemistry and physical variables measured included physical measures, nutrients, and major ions. The physical measures were river width, depth, discharge, pH, dissolved oxygen (DO), temperature, electrical conductivity, turbidity, and total suspended solids (TSS). In situ, pH, DO concentration, temperature, and electrical conductivity were measured using a portable YSI multiprobe water quality meter (556 MPS, Yellow Springs Instruments, Ohio, United States). Turbidity was measured using a portable Hach turbidity meter (Hach Company, 2100P ISO Turbidimeter, United States). Water samples were collected from the thalweg using acid-washed HDP bottles for analysis of nutrients, dissolved organic carbon (DOC), and major ions. For TSS, water samples were filtered immediately through pre-weighed and pre-combusted glass-fibre filters (Whatman GF/F, pre-combusted at 450°C, 4 h). GF/F filters holding suspended matter were carefully folded and wrapped in aluminium foil before transport in a cooler box at 4°C to the laboratory. The filtered and unfiltered water samples were stored and transported in a cooler box and kept frozen until analysis.

Sampling of macroinvertebrates
Within each sampling site, representative river reaches were selected for macroinvertebrate sampling. Each sampling site provided a range of biotopes (riffles, pools, marginal vegetation) with varying velocities, depths and substrate types that ranged from gravel, sand, and mud (GSM) to boulders and bedrock. Sampled biotopes comprised GSM, stones-in-current and stones-out-of-current (STONES), and aquatic and marginal vegetation (VEG) (Dickens and Graham, 2002; Dallas, 2007).
We used the sampling methods of the South African Scoring System Version 5 (SASS5; Dickens and Graham, 2002) and the Tanzanian River Scoring System (TARISS; Kaaya et al., 2015). The sampling methods in these two biotic indices are similar and use the SASS net, a modified kick net measuring 300 mm by 300 mm with a mesh size of 1,000 μm. Several biotopes were sampled within a prescribed time limit and/or areal coverage. Stones-in-current (SIC) and bedrock were searched ("kicked") for 2-5 min. Similarly, stones-out-of-current (SOOC) and bedrock were searched for 1 min. The SIC and SOOC samples were combined into a "Stones" (STONES) sample. Suitable stretches covering 2 m² of marginal vegetation (in and out of current) were swept, as was aquatic vegetation covering 1 m². These represented the VEG sample. Gravel, sand, and mud (GSM sample) were stirred and swept for 1 min and checked for macroinvertebrates. Hand-picking and visual observation were also employed for 1 min, and biotopes with macroinvertebrates were recorded. Scoring was done for 15 min per biotope but stopped when no new taxa were seen after 5 min. As a modification to the SASS procedure, all samples from the three habitats (Stones, GSM and Vegetation) were preserved in formalin in separate containers and taken to the laboratory for further processing and enumeration of individuals of the various taxa.

Frontiers in Environmental Science
These data were needed for statistical analyses of the macroinvertebrate samples.
In the laboratory, macroinvertebrate samples were sorted, enumerated, and identified to the lowest taxon level possible, mainly family level. Many taxonomic keys were used, such as Gerber and Gabriel (2002)

Analysis of macroinvertebrates data

Taxa richness and diversity indices
Macroinvertebrate data were pooled for each sampling site per season to calculate the various indices. Community composition was summarized for each sampling site using the number of species (S), the total number of individuals (N), richness and diversity indices, and the relative abundance of each taxon. Several site-level diversity and richness indices were calculated by separately summing species-specific abundances across biotopes for each season. The Shannon index (H′) was used as a measure of diversity (Shannon, 1948), and the associated H′/H′max ratio (Pielou, 1975) was used for evenness. The reciprocal form of the Simpson Index (1/Ds) (Simpson, 1949) was used as a measure of community richness. We used Hill's number (i.e., gamma diversity; Hill, 1973) and Fisher's alpha (Fisher et al., 1943) as extra measures of community diversity. Hill's number was calculated as the ratio between H′ and 1/Ds. Other diversity indices calculated included Brillouin's (Brillouin, 1956), Menhinick's (Menhinick, 1964), Margalef's (Margalef, 1958), Equitability (Lloyd and Ghelardi, 1964), Berger-Parker (Berger and Parker, 1970), and Chao-1 (Chao, 1984).
These indices were used to measure different aspects of macroinvertebrate diversity in the river. The first group consists of dominance and evenness indices, which are sensitive to the abundance of the most common species rather than to species richness (Simpson, Berger-Parker, Chao-1, evenness and equitability). The second group consists of diversity and richness indices, which focus on the number of species in relation to the number of individuals (Fisher, Hill, Margalef and Menhinick). The third group consists of indices based on information theory, which take into consideration both the evenness and species richness components of diversity (Shannon and Brillouin).
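To make the calculations concrete, the following minimal Python sketch computes several of the indices named above from a vector of taxon counts for one site. The function name `diversity_summary` and the example counts are hypothetical. Natural logarithms and the proportional-abundance form of Simpson's index (Ds as the sum of squared proportions) are assumed here, since the text does not state which variants were used.

```python
import math


def diversity_summary(counts):
    """Return common diversity and richness indices for one site.

    counts: abundances per taxon (zeros are ignored).
    Assumes natural logarithms and Simpson's Ds = sum(p_i ** 2).
    """
    counts = [c for c in counts if c > 0]
    if not counts:
        raise ValueError("at least one taxon with a positive count is required")

    s = len(counts)                  # number of taxa (S)
    n = sum(counts)                  # total number of individuals (N)
    props = [c / n for c in counts]  # relative abundance of each taxon

    shannon = -sum(p * math.log(p) for p in props)  # H'
    simpson_d = sum(p * p for p in props)           # Ds

    return {
        "S": s,
        "N": n,
        "shannon": shannon,
        # Pielou evenness: H' / H'max, with H'max = ln(S)
        "pielou_evenness": shannon / math.log(s) if s > 1 else 0.0,
        "simpson_reciprocal": 1.0 / simpson_d,
        "margalef": (s - 1) / math.log(n) if n > 1 else 0.0,
        "menhinick": s / math.sqrt(n),
        "berger_parker": max(counts) / n,
    }


# Hypothetical sites: one perfectly even, one dominated by a single taxon
even_site = diversity_summary([10, 10, 10, 10])
skewed_site = diversity_summary([97, 1, 1, 1])
```

Both hypothetical sites have the same richness (S = 4), but the dominated site has a lower Shannon value and a higher Berger-Parker value, which is the distinction between richness-based and dominance-based indices drawn above.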

Biotic indices
We used SASS5 (Dickens and Graham, 2002), TARISS (Kaaya et al., 2015) and the ETHbios biotic index, which was developed to assess ecological conditions in the highlands of Ethiopia (Aschalew and Moog, 2015), to evaluate the ecological conditions of the sampled study sites. For SASS5 and TARISS, we identified macroinvertebrate specimens in the field to the family or higher levels per the protocols. Taxa identified in the field were recorded on the SASS5 score sheet, and total abundances were estimated for each taxon (Dickens and Graham, 2002; Kaaya et al., 2015). The SASS5 and TARISS scores and their associated average score per taxon (ASPT) were used to characterize macroinvertebrates at each site (Dickens and Graham, 2002). For ETHbios, the list of taxa encountered at each site was subjected to the sensitivity scores developed for the Ethiopian highlands to obtain the total ETHbios score and its related ASPT.
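The relationship between a total biotic score and its ASPT can be sketched as follows. This is an illustration only: the function name `biotic_score` is hypothetical, and the sensitivity values in `HYPOTHETICAL_SCORES` are placeholders that are not taken from the published SASS5, TARISS or ETHbios score sheets.

```python
# Placeholder sensitivity scores for illustration only; these are NOT the
# published SASS5, TARISS or ETHbios values.
HYPOTHETICAL_SCORES = {
    "Heptageniidae": 12,
    "Baetidae": 5,
    "Simuliidae": 5,
    "Chironomidae": 2,
}


def biotic_score(taxa_present, sensitivity):
    """Return (total score, number of scoring taxa, ASPT) for one site.

    Each taxon counts once regardless of abundance; taxa without a
    sensitivity score are ignored.
    """
    scoring_taxa = {t for t in taxa_present if t in sensitivity}
    total = sum(sensitivity[t] for t in scoring_taxa)
    n_taxa = len(scoring_taxa)
    aspt = total / n_taxa if n_taxa else 0.0
    return total, n_taxa, aspt


# Hypothetical taxa list for one site (a duplicate record is counted once)
site_taxa = ["Heptageniidae", "Baetidae", "Chironomidae", "Baetidae"]
total, n_taxa, aspt = biotic_score(site_taxa, HYPOTHETICAL_SCORES)
```

Because ASPT divides the total by the number of scoring taxa, it is less dependent on sampling effort than the total score, which grows with every additional taxon recorded.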

Macroinvertebrate index of biotic integrity (M-IBI)
Community composition data at the site level were also used to develop a macroinvertebrate index of biotic integrity (M-IBI) for the Mara River. Metrics for the index were selected based on their performance on range, response to changes in water quality, redundancy with other metrics, signal-to-noise ratio (repeatability), seasonal stability and ecological significance (Kurtz et al., 2001; Klemm et al., 2003). Candidate metrics that failed a test were eliminated from additional consideration and testing. In the range test, we required that the range in the values of a metric (the difference between its highest and lowest values) was >10 for percentage or proportion metrics and >5 for taxa richness or diversity metrics. Regarding the response to water quality changes, metrics were required to correlate with at least one of the water quality physico-chemical variables. Redundancy in the remaining metrics was evaluated using Spearman correlation coefficients and visual inspection of scatter plots. Metrics with a correlation coefficient r ≥ 0.85 were considered redundant, and only one was selected for inclusion in the final M-IBI (Hughes et al., 1998).
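The range and redundancy tests can be illustrated with a short Python sketch that uses only the standard library. All names (`passes_range_test`, `spearman_rho`, `redundant_pairs`) and the metric values are hypothetical. The sketch flags a pair as redundant when the absolute value of Spearman's coefficient is at least 0.85, which is an assumption, since the text states only r ≥ 0.85.

```python
import math
from itertools import combinations


def passes_range_test(values, is_percentage):
    """Range test: max - min must exceed 10 (percentages) or 5 (richness)."""
    threshold = 10 if is_percentage else 5
    return max(values) - min(values) > threshold


def _average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def redundant_pairs(metrics, threshold=0.85):
    """Return metric pairs whose |rho| meets the redundancy threshold."""
    return [(a, b) for a, b in combinations(metrics, 2)
            if abs(spearman_rho(metrics[a], metrics[b])) >= threshold]


# Hypothetical metric values at five sites
metrics = {
    "ept_taxa": [12, 9, 7, 3, 1],
    "ephemeroptera_taxa": [6, 5, 4, 2, 1],
    "pct_chironomidae": [30, 5, 65, 12, 40],
}
pairs = redundant_pairs(metrics)
```

In this made-up example the two richness metrics rank the sites identically, so only one of them would be carried forward, and `ephemeroptera_taxa` would also fail the range test because its range is exactly 5.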
We used the signal-to-noise ratio (S/N) to assess the reproducibility of metrics, i.e., to ensure that between-site differences in metrics are caused by differences in site conditions rather than by sampling variation within a site (Stoddard et al., 2008). The S/N is the ratio of the variance among sites (signal) to the variance of repeated visits at the same site (noise; repeated seasonal sampling of the same site in our case) (Kaufmann et al., 1999; Stoddard et al., 2008; Stevenson et al., 2013). To represent the S/N ratio, we used the F-ratios in an analysis of variance to evaluate a metric's ability to discriminate differences among sampling sites over the "noise" of measurement variation (Kaufmann et al., 1999). We used one-way ANOVA to calculate the F-ratio and then calculated S/N as follows:

$$\mathrm{S/N} = \frac{F - 1}{C_1}$$

where F is the F-ratio from the one-way ANOVA (the among-site variance relative to the within-site variance), and C_1 is a constant varying between one and r, the number of times the repeat-sample sites are visited (Neter and Wasserman, 1974). Metrics with higher S/N have greater reproducibility and serve as reliable indicators of differences in condition among sites (Esselman et al., 2013). Metrics with S/N higher than two (2) were retained for further consideration (Stoddard et al., 2008). A low S/N ratio indicated that a metric did not distinguish well among sites.
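As a worked illustration of the S/N calculation, the sketch below derives the F-ratio from a one-way ANOVA and converts it to S/N, assuming the Kaufmann et al. (1999) form S/N = (F − 1)/c1. The function name `signal_to_noise` and the data are hypothetical. The constant c1 is computed with the standard variance-components expression for a one-way design, which reduces to the number of visits per site when every site is visited equally often; whether the authors computed it this way is an assumption.

```python
def signal_to_noise(visits_by_site):
    """Return (F, c1, S/N) for one metric.

    visits_by_site: dict mapping site name -> list of metric values from
    repeat visits. Requires at least two sites and some repeat visits.
    """
    groups = list(visits_by_site.values())
    k = len(groups)                          # number of sites
    n_total = sum(len(g) for g in groups)    # total number of visits
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # One-way ANOVA sums of squares
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)
    f_ratio = ms_among / ms_within

    # c1 equals the visits per site in a balanced design (assumed formula)
    c1 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    return f_ratio, c1, (f_ratio - 1) / c1


# Hypothetical metric values: three sites, two seasonal visits each
f_ratio, c1, s_n = signal_to_noise({
    "site_a": [10, 12],
    "site_b": [20, 22],
    "site_c": [30, 32],
})
retain_metric = s_n > 2  # retention rule stated in the text
```

With these made-up values the sites differ far more than the repeat visits do, so the metric would be retained under the S/N > 2 rule.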
Seasonal stability in metrics was evaluated using box-and-whisker plots, whereby those metrics that failed to discriminate the reference and disturbed site categories during both the dry and wet seasons in different years were eliminated from further consideration. Metrics were also included in the final M-IBI following their proven utility in previous multimetric indices developed in the region (e.g., Masese et al., 2009a; Raburu et al., 2009a; Aura et al., 2010). The metrics fall into five categories that describe taxa richness, community composition, tolerance of the various taxa to disturbances (indicator individuals or taxa), and functional feeding groups.
Frontiers in Environmental Science frontiersin.org
Metrics sensitive to changes in substrate type or quality, channel stability and flow variation were also selected for inclusion in the M-IBI for the Mara River.

Data analysis
Differences in water quality physico-chemical variables among the four condition categories were explored separately for each season using one-way Kruskal-Wallis ANOVA. The ability of the different macroinvertebrate indices (diversity, richness, biotic and multimetric) to discriminate among the four site or condition categories was explored using box-and-whisker (box) plots. We defined separation power as the degree of overlap between boxes (i.e., the 25th and 75th percentiles) in box plots of the values of the indices for all condition categories (Barbour et al., 1996; Barbour et al., 1999). The derivatives of the biotic indices tested included the total number of taxa, total site scores (SASS5, TARISS and ETHbios) and their associated average score per taxon (SASS5-ASPT, TARISS-ASPT, and ETHbios-ASPT).
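The box-overlap criterion can be sketched as follows. This is a deliberately simplified yes/no check on whether the interquartile boxes of two site categories overlap; it is not the graded scoring scheme of Barbour et al. (1996), and the function names and index values are hypothetical.

```python
from statistics import quantiles


def interquartile_box(values):
    """Return the (25th, 75th) percentile pair that bounds the box."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def boxes_overlap(values_a, values_b):
    """True when the interquartile boxes of the two categories overlap."""
    a_low, a_high = interquartile_box(values_a)
    b_low, b_high = interquartile_box(values_b)
    return a_low <= b_high and b_low <= a_high


# Hypothetical index values for three site categories.
reference = [8, 9, 10, 11, 12]
impaired = [2, 3, 4, 5, 6]
intermediate = [9, 10, 11, 12, 13]

separated = not boxes_overlap(reference, impaired)          # boxes 9-11 vs 3-5
not_separated = boxes_overlap(reference, intermediate)      # boxes 9-11 vs 10-12
```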
We used Spearman correlations to test for relationships among diversity and biotic indices and the M-IBI. Before analyses, macroinvertebrate abundance (count) data and water quality physico-chemistry data were log (x + 1) and square-root transformed, respectively. Metrics calculated as proportions were normalized by arcsine-square-root transformation before analysis. Analyses were performed using SigmaPlot Version 12. Statistical tests were considered significant at p < 0.05.
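For readers reproducing these steps outside SigmaPlot, the three transformations can be written as below. This is a minimal sketch: the base of the logarithm is not stated in the text, so the natural logarithm is an assumption, and the function names are hypothetical.

```python
import math


def log_transform(count):
    """log(x + 1) for abundance counts (natural log assumed)."""
    return math.log1p(count)


def sqrt_transform(value):
    """Square-root transformation for physico-chemical variables."""
    return math.sqrt(value)


def arcsine_sqrt_transform(proportion):
    """Arcsine-square-root transformation for metrics expressed as proportions (0-1)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))


# A metric reported as a percentage is divided by 100 before transformation.
percent_diptera = 25.0
transformed = arcsine_sqrt_transform(percent_diptera / 100.0)
```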
We used the continuous method to score individual metrics. Positive scoring metrics (which declined with increasing water quality degradation or disturbance) were scored 0-10 points: 0 points for values less than the 5th percentile of impacted sites and 10 points for values greater than the 50th percentile of high-quality reference sites. Negative scoring metrics (which increased with increasing water degradation or disturbance) were scored 0-10 points: 0 points for values greater than the 90th percentile of impacted sites and 10 points for values less than the 50th percentile of reference sites. These percentiles were chosen to maximize the discrimination among sites for each metric. Metric scores were linearly interpolated between 0 and 10. We felt continuous scoring over the individual metric ranges of 0-10 led to a more precise index than traditional methods (Hughes et al., 1998). To scale the index from 0 to 100 points, we summed the individual scores of the twelve metrics and multiplied the sum by 0.83 (the maximum possible sum of 120 points then corresponds to approximately 100).
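The continuous scoring rule can be illustrated with a short sketch. The two anchors (the metric value that scores 0 and the value that scores 10) would come from the percentiles described above; the anchor values used here are hypothetical, as are the function names.

```python
def score_metric(value, zero_anchor, ten_anchor):
    """Linearly interpolate a 0-10 score between two anchor values.

    zero_anchor: metric value scoring 0 (impacted-site percentile).
    ten_anchor:  metric value scoring 10 (reference-site median).
    For negative metrics zero_anchor is larger than ten_anchor, and the
    same formula applies.
    """
    if zero_anchor == ten_anchor:
        raise ValueError("anchors must differ")
    fraction = (value - zero_anchor) / (ten_anchor - zero_anchor)
    return 10.0 * min(1.0, max(0.0, fraction))


def mibi_score(metric_scores):
    """Sum the twelve 0-10 metric scores and rescale to roughly 0-100."""
    return 0.83 * sum(metric_scores)


# Hypothetical positive metric (e.g., number of EPT taxa): anchors 2 and 12.
positive_mid = score_metric(7, zero_anchor=2, ten_anchor=12)      # halfway
# Hypothetical negative metric (e.g., percent tolerant individuals): anchors 80 and 20.
negative_mid = score_metric(50, zero_anchor=80, ten_anchor=20)    # halfway
best_possible = mibi_score([10.0] * 12)
```

Values beyond either anchor are clamped, so a site cannot score below 0 or above 10 on any single metric.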
Four condition categories of excellent, good, fair, and poor, using percentiles of frequency distributions of M-IBI scores at reference and impaired sites, were established to distinguish the different environmental conditions of the sampling sites (Table 8). The M-IBI scores exceeding the 75th percentile for reference sites (M-IBI >81) were classified as having "excellent" biotic integrity, and scores between the 75th and 25th percentiles (66 < M-IBI ≤81) were identified as having "good" biotic integrity. M-IBI scores between the 5th and 25th percentile for reference sites (M-IBI = 52-66) were identified as being in a "fair" condition, and scores below the 5th percentile for reference sites (M-IBI <52) were defined as in a "poor" condition.
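The class boundaries can be expressed as a small lookup. This sketch assumes that the "poor" class begins immediately below the lower bound of the "fair" class (52) and that a score of exactly 66 falls in the "fair" class; the function name is hypothetical.

```python
def condition_class(mibi):
    """Map an M-IBI score (0-100) to a narrative condition class."""
    if mibi > 81:
        return "excellent"
    if mibi > 66:
        return "good"
    if mibi >= 52:       # assumed lower bound of the fair class
        return "fair"
    return "poor"


# Hypothetical site scores.
classes = [condition_class(score) for score in (90, 81, 66, 52, 40)]
```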

Environmental variables
There were seasonal and spatial variations in water quality variables and solutes in the study area (Table 1). For most site categories, levels of in situ physico-chemical variables (DO, electrical conductivity and temperature) and major solutes (Al, Ca, Mg, K, Na, F, Cl, SO4, and DSi) were higher during the dry season compared with the wet season (Table 1). In contrast, variables related to water clarity (turbidity and TSS) had elevated levels during the wet season compared to the dry season. Nutrient concentrations did not differ significantly between seasons.
There were no significant differences in stream/river width and depth between the site categories (Table 1). Discharge only differed between the sites during the wet season, with sites in the Mara River mainstem recording significantly higher discharge. Apparent differences in water physico-chemistry were observed between the site categories (Table 1). While forest (FOR) and Mara sites (MARA) showed differences in most physico-chemical variables, for some variables, differences with agricultural (AGR) and mixed (MIX) sites were not clear-cut, and levels lay between the FOR and MARA sites. For nutrients, the lowest levels were recorded in FOR sites, while AGR and MARA sites had the highest concentrations of ammonium, SRP, TN, and TDN. Similarly, AGR and MARA sites recorded the highest concentration of dissolved organic carbon (DOC), and FOR sites recorded the lowest concentration.

Diversity indices
There were significant correlations between water quality variables and the total number of taxa (S), total number of individuals (N) and diversity indices (Table 2). The total number of species and abundance displayed the largest number of correlations, 9 and 11, respectively, out of the 22 water quality variables tested in this study. Among the diversity indices, Menhinick's index had the highest discriminant ability by having strong correlations with nine (9) water quality variables. Evenness and equitability indices had the second-highest discriminant ability with eight (7) correlations. In contrast, Simpson's, Shannon's, Hill's, Brillouin's, Margalef's, and Berger-Parker's indices had six (6) correlations with water quality variables each. Finally, Fisher's diversity index had the lowest number of correlations (4) with water quality variables (Table 2).

Performance of diversity indices
The ability of the diversity indices and the total number of taxa and families to discriminate the different levels of disturbance among the four site categories was explored using box-and-whisker plots (Figures 2, 3). During the dry season, the number of taxa and the number of families were highest in MARA sites, but this was not different from the FOR sites (Figure 2). However, the FOR sites were not separated from the MIX and AGR sites. Although most diversity indices showed a decline from FOR sites, which had the highest values, to MIX and AGR sites, which recorded the lowest values, they were not able to discriminate between FOR and MARA sites. Only Menhinick's index separated FOR, MIX, and AGR sites, but MIX and MARA sites were inseparable (Figure 2). During the wet season, the performance and separation power of all diversity indices were much reduced as none separated the different site categories (Figure 3). Overall, FOR sites recorded the highest number of taxa and families and the highest diversity compared to other site categories.

Biotic indices
In total, 96 macroinvertebrate taxa were collected in the Mara River, belonging to 19 major taxonomic groups identified to order level or higher (Table 3). Although some taxa were identified below the family level, we used sensitivity weightings or scores at the family level, which is the level used in most of the biotic indices currently used in the region (Table 3). Of the 96 taxa, SASS5 had the highest number of taxa represented in the Mara River (78), closely followed by TARISS (77). ETHbios had the lowest number of taxa (40) represented in the Mara River.
A total of 16 taxa identified mainly to the family level did not occur or have sensitivity weightings in any of the three biotic indices (Table 3, Table A1). SASS5 and TARISS had a similar number of taxa (16) missing from the Mara River list, while ETHbios had 40. The taxa in the Mara River included those most intolerant to poor water quality or pollution, which have a sensitivity weighting of 15 in the SASS5 and TARISS scores (Prosopistomatidae and Oligoneuridae). Other sensitive taxa collected in the river included Heptageniidae (13), Perlidae, Crambidae (Pyralidae), Baetidae (>2 spp.), Hydropsychidae (>2 spp.), Polycentropodidae and Scirtidae/Helodidae (12). The most tolerant taxa to pollution encountered in the river included Coelenterata (Cnidaria), Oligochaeta, Culicidae, Psychodidae and Muscidae (1), and Chironomidae (2).

Performance of biotic indices
The ability of the biotic indices to discriminate between levels of disturbance of the four site categories was evaluated using box-and-whisker plots (Figure 4). The total score derivatives of the three biotic indices (SASS5, TARISS and ETHbios) had poor discriminatory ability among disturbance categories during the dry and wet seasons. However, the ASPT derivatives of the indices had a better discriminatory power among site categories during the dry season. During the wet season, the ASPT derivatives, like the total scores, did poorly in discriminating the levels of disturbance among the sites.
There was discordance between the number of taxa and sensitivity scores. The MARA sites recorded the highest number of taxa but the lowest ASPT scores for both SASS5 and TARISS during the dry and wet seasons (Figure 4). SASS5 and TARISS agreed closely in performance, showing similar trends in both total and ASPT scores. ETHbios showed little variability among the site categories, except for a drop in ETHbios-ASPT score in the MARA sites during the dry season.

Metrics selection, testing and scoring
In total, 12 metrics falling into five categories that describe richness, composition, tolerance to disturbances (indicator individuals or taxa), functional feeding groups and effects of channel stability and flow variation were selected for inclusion in the M-IBI for the Mara River (Table 4). The final metrics in the Mara M-IBI had sufficient variability in data values among sites (data range), were reproducible (temporal stability), and were independent of other metrics, having low correlations with them (r < 0.70). Some metrics that are sensitive to the cessation of flows and river sedimentation were included in the M-IBI to capture the effects of land use change and excessive water withdrawals that have modified the flow regime of the river. These metrics include the percentage of rheophilic individuals, the percentage of non-insect individuals and the percentage of burrowing individuals (Table 4). In addition, the percentage of vegetation-associated taxa was included as a metric to capture the loss of riparian and instream vegetation caused by flash flooding, disturbance by livestock and wildlife and human over-use of streams and rivers. Metrics were also assessed for responsiveness to changes in water quality among the different sites. All macroinvertebrate metrics displayed significant Spearman correlations with various physico-chemical parameters (Table 5). SRP, F, K, and NO3-N each returned only one significant relationship with M-IBI metrics. Significant correlations were found between most water quality physico-chemical variables and M-IBI metrics (Table 5). The number of EPT taxa, number of tolerant taxa and percent Diptera individuals were the most responsive metrics, showing correlations with more than half of the 19 variables. The number of taxa and the number of predator individuals were the least responsive metrics, having significant correlations with only three of the 19 physico-chemical variables.
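The independence criterion (r < 0.70) can be sketched as a simple redundancy screen. The code below computes Spearman's rank correlation from scratch and keeps a metric only if it correlates below the threshold with every metric already kept. Two choices are assumptions made for illustration, not details given in the text: the threshold is applied to the absolute value of the correlation, and candidates are examined in the order supplied, so the first of a redundant pair is the one retained. All names and values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(metrics, threshold=0.70):
    """Keep metrics whose |rho| with every previously kept metric is below threshold."""
    kept = []
    for name, values in metrics.items():
        if all(abs(spearman(values, metrics[other])) < threshold for other in kept):
            kept.append(name)
    return kept


# Hypothetical metric values at five sites.
candidates = {
    "metric_a": [1, 2, 3, 4, 5],
    "metric_b": [2, 4, 6, 8, 10],   # perfectly rank-correlated with metric_a
    "metric_c": [3, 1, 5, 2, 4],    # weakly correlated with metric_a
}
kept = drop_redundant(candidates)
```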
Metric scoring for the M-IBI was based on distributions of reference FOR sites and the most impacted (AGR and MARA) sites (Table 6). The scoring criteria were based on dry season samples because metric values are more stable and representative of environmental conditions as compared with wet season conditions. Consequently, metrics were scored separately for the dry and wet seasons.

FIGURE 2
Performance of taxa richness and diversity indices across site categories in the Mara River Basin during the dry season. Similar letters on boxes indicate no significant difference in metric values among the four condition categories after One-Way ANOVA.

Performance of the M-IBI
The ability of the M-IBI to discriminate between the four levels of disturbance or site categories was evaluated using box-and-whisker plots (Figure 5). During the dry season (Figure 5A), the M-IBI discriminated between FOR and MIX sites, which were the least disturbed, and AGR and MARA sites, which were the most disturbed by human activities. However, during the wet season, the discriminatory ability of the M-IBI was much reduced as the four categories of sites were essentially inseparable (Figure 5B). The FOR and MIX sites scored higher values than the AGR and MARA sites, where water quality was much reduced. Overall, the M-IBI effectively assessed perturbations, including water quality and habitat disturbance, throughout the Mara River basin.

Relationships between biotic indices and M-IBI
Significant correlations existed among the total number of taxa, number of families, regional biotic indices, and the Mara River M-IBI (Table 7). During the dry season, the number of taxa had a significant positive correlation with the number of families, SASS5 score, TARISS score, and ETHbios score. The number of EPT taxa displayed a similar response to the total number of taxa. The SASS5 score had a highly significant positive relationship with the TARISS score. Both the SASS5 score and TARISS score had highly significant relationships with the number of taxa, the number of families and ETHbios score. The M-IBI showed significant positive relationships with the SASS5 score and ASPT, TARISS score and ASPT, and ETHbios score. However, the M-IBI did not display any significant relationship with the number of taxa and families, or with ETHbios-ASPT. Notably, all Spearman correlations among the total number of taxa, number of families, regional biotic indices, and the Mara River M-IBI were positive and significant (Table 7). This implies that the discriminatory ability of all the indices was reduced and could not display variability among sites.

Assessment of ecological condition
Four condition categories of excellent, good, fair, and poor were established to distinguish the different environmental conditions of the sampling sites (Table 8). The ecological condition and integrity classes were interpreted relative to the levels of human disturbance at the different groups of sampling sites. Narrative descriptions were established for different categories or classes of sites using physico-chemical data, individual metrics, final M-IBI scores, habitat conditions and flow conditions for each site (Table 8). A notable response was the association of stable substrate and clear water with high diversity and abundances of the EPT and rheophilic taxa in the river. However, habitat modifications caused by trampling and input of organic matter and nutrients by large mammalian herbivores (livestock and hippos), and increased erosion causing turbidity from agricultural areas, were associated with low numbers of intolerant EPT taxa and high numbers and dominance of tolerant taxa such as some species among Diptera, Hemiptera, Coleoptera, and Oligochaeta.

Discussion
We sought to compare the performance of macroinvertebrate-based diversity (including richness) indices, regional biotic indices, and an index of biotic integrity (M-IBI) to assess the ecological condition of the Mara River, Kenya/Tanzania. The river is influenced by multiple stressors arising from agriculture and animal grazing, organic matter and nutrient loading, and excessive water withdrawals. Significant differences in water quality variables among the site categories influenced the structure of macroinvertebrate communities. Overall, the M-IBI performed better than biotic and diversity indices by having a higher discriminatory ability of site categories according to different levels and types of disturbance. Diversity indices performed poorly and failed to discriminate between stressor gradients in the river. All the indices were very sensitive to seasonality. During the dry season, the indices were more stable and could distinguish between different levels of disturbances in the river. However, during the wet season, the indices performed poorly and largely failed to differentiate between the different disturbance levels among the site categories.

[Table footnote fragment: ... Dickens and Graham, 2002), Tanzania River Scoring System (TARISS, Kaaya et al., 2015) and ETHbios (Aschalew and Moog, 2015). (−) indicates that sensitivity weights do not exist in the three RBPs for the taxon.]

Water quality physico-chemical variables
FIGURE 4
Box-and-whisker plots comparing the ability of SASS5, TARISS, and ETHbios total index scores and average score per taxon (ASPT) to discriminate levels of disturbance among the four site categories (FOR, MIX, AGR, and MARA) in the Mara River Basin during the dry and wet seasons. Boxes show interquartile ranges (25th and 75th percentiles), and middle lines are medians. FOR, forested; MIX, mixed; AGR, agricultural; and MARA, Mara River mainstem and seasonal tributaries. Similar letters on boxes indicate no significant difference in metric values among the four condition categories after One-Way ANOVA.

Most physico-chemical variables differed among the site categories (Table 1), with turbidity, conductivity, TSS, nutrients and some major ions recording the lowest values in FOR and MIX sites and highest values in AGR and MARA sites. Turbidity, TSS and nutrients are good indicators of disturbances at the local scale, such as human activity and the loading of organic matter and nutrients by livestock and large wildlife through defecation and excretion (Dutton et al., 2018; Iteba et al., 2021). In contrast, electrical conductivity and major ions are surrogates for land use impacts or changes in geology on water physico-chemistry at the catchment scale (Gaillardet et al., 1999; Ibarra et al., 2016). The negative impact of agriculture on water quality and ecological condition of Afrotropical streams and rivers through sediments and nutrient inputs from farmlands has been well documented (Minaya et al., 2013; Masese et al., 2014a; Masese et al., 2014b; Fugère et al., 2018; Dalu et al., 2019; Kroese et al., 2020). Other stressors that negatively affect water quality and ecological integrity include the loading of organic matter from human settlements and livestock grazing areas (Dutton et al., 2018; Iteba et al., 2021). Excessive water withdrawal and changes in the natural flow regimes of streams and rivers are also emerging as significant stressors of river ecosystems. Extreme cases of flow alteration that lead to cessation of flows or supra-reduced flow levels during the dry season have been reported in the Mara River and other rivers in eastern Africa (Dutton et al., 2018; Stears et al., 2018). Reduced flows lead to the accumulation of organic matter and ammonia, increased concentration of solutes and an overall decline in DO concentration (Dutton and Subalusky, 2021; Wanderi et al., 2022).
Seasonality played a significant role in influencing water quality variables in the river through its influence on runoff, erosion and leaching of solutes, as also shown in a previous study (Wanderi et al., 2022). Solute concentrations and electrical conductivity were much higher during the dry season, probably because of increased evaporation and water temperatures. Studies show that inter-site differences can increase during the dry season as reach-scale influences (e.g., daily animal and human disturbance) increase during this period (Mathooko, 2001; Yillia et al., 2008; Minaya et al., 2013). In contrast, runoff and leaching during the wet season can increase the delivery of sediments, nutrients, and DOC into streams and rivers (Elsenbeer, 2001; Saunders et al., 2006). However, during the wet season, dilution and increased longitudinal connectivity can homogenize conditions in rivers, leading to reduced inter-site differences in water quality and ecological conditions (Leung et al., 2012; Masese et al., 2014b).

TABLE 4 Final metrics selected for inclusion in the Mara River M-IBI for assessment of the present ecological condition of the river. Details on definition and rationale or significance for inclusion, their historical use in Lake Victoria basin and predicted responses to perturbations and reduced flows are provided.
Columns: Metric; Metric definition and significance; Predicted response to increased perturbation (including flow reduction).
Surviving row fragment (metric definition): Ratio of individuals that feed on algae and periphyton and those that filter food from water currents to those that feed on organic matter on the benthos.

Performance of diversity indices
Since there are many diversity indices, it is difficult to decide a priori which one is best for a given region or study objective. By comparing the performance of eleven indices that are rarely or never used, we found that most of the indices performed poorly as discriminators of human disturbance in the river. This poor performance is probably due to taxa replacements with increasing levels of disturbance and not necessarily a loss or decline in taxa richness or diversity. For instance, the Mara River mainstem recorded the highest taxa richness, but most often, these taxa were either tolerant or moderately tolerant to poor water quality. This is a weakness of most diversity indices: they do not consider the tolerance of individual taxa or species to different levels of disturbance but only the composition of the community in terms of the number of species and their abundances.
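The effect of taxa replacement on diversity indices can be illustrated with two standard formulas, the Shannon-Wiener index (H' = -Σ p_i ln p_i) and Menhinick's index (S/√N). In the sketch below, two hypothetical communities have identical abundance distributions but entirely different taxa, one made up of sensitive families and the other of tolerant families; the counts are invented for illustration and are not data from this study.

```python
import math


def shannon(counts):
    """Shannon-Wiener index H' from a {taxon: count} mapping (natural log)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def menhinick(counts):
    """Menhinick's richness index: number of taxa divided by sqrt(total individuals)."""
    richness = sum(1 for n in counts.values() if n > 0)
    return richness / math.sqrt(sum(counts.values()))


# Hypothetical communities with the same abundance structure.
sensitive_community = {"Perlidae": 50, "Heptageniidae": 30, "Oligoneuridae": 20}
tolerant_community = {"Chironomidae": 50, "Oligochaeta": 30, "Culicidae": 20}

same_shannon = shannon(sensitive_community) == shannon(tolerant_community)
same_menhinick = menhinick(sensitive_community) == menhinick(tolerant_community)
```

Both indices return identical values for the two communities, because neither formula uses the identity or tolerance of the taxa.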
Although the diversity indices performed poorly, most were essentially in agreement regarding differences in macroinvertebrate diversity among the four site categories (Figures 2, 3). Low values of the Shannon index (<2.5) in MIX, AGR and MARA sites indicate widespread degradation affecting macroinvertebrate communities in the river. The same trend of declining richness and diversity of macroinvertebrates in MIX and AGR sites was captured by Simpson's, Hill's, Margalef's, Fisher's, and Menhinick's indices (Figure 2). Agricultural streams had the lowest diversity indices, while MARA sites had slightly higher values, although there was a lot of variability within this site category. The lack of variation in most diversity indices across the site categories indicates the similarity of macroinvertebrate communities (Jost, 2007), especially regarding the abundance of common taxa. Menhinick's diversity index had better differentiation among the site categories, suggesting that it was less sensitive to the numerical dominance of macroinvertebrate communities by a few common taxa. Hence, the index (Menhinick) is better suited for assessing anthropogenic influences on the diversity of macroinvertebrates in Afromontane-savanna rivers where the relative abundance of a few taxa (3-5) can be very high. In the Mara River, the relative abundance of the three most common taxa ranged from 30% to 93%. Shannon's, Simpson's (the reciprocal form), Hill's, and Fisher's indices could not capture declining water quality conditions when the diversity of macroinvertebrates was high. Hence, they are better suited to assessing disturbance that causes taxa richness and composition variation.
Like diversity indices, the number of taxa and families did not perform any better in discriminating different forms of river disturbance. For instance, MARA sites recorded the highest number of taxa and families, but water quality was the most degraded. The MARA sites also experienced extreme flow variation, with seasonal tributaries ceasing to flow during the dry season. Given that savanna rivers are naturally characterized by seasonal flows, high levels of organic matter, turbidity, water temperature and electrical conductivity (dissolved solutes), and low dissolved oxygen concentration (Wanderi et al., 2022), these taxa may be adapted to live in these conditions while maintaining high diversity and abundances. Indeed, some of the taxa most sensitive to poor water quality, such as Ephemeroptera and Trichoptera, recorded high taxa richness and abundance in organically polluted sites downstream of hippo pools and livestock watering points. Similarly, Coleoptera, Hemiptera and Odonata had high taxa richness and abundance in MARA sites, suggesting that these orders are essential for monitoring extreme cases of flow reduction (including flow cessation), organic matter and nutrient loading in streams and rivers (Masese et al., 2018; Masese et al., 2021).

Performance of biotic indices
The three biotic indices (SASS5, TARISS and ETHbios) performed similarly in assessing the ecological status of different site categories in the Mara River (Figure 4). However, the indices' total score and ASPT derivatives provided contrasting results. While the total scores indicated improved ecological conditions from the FOR to MARA sites, the ASPT derivative showed that the MARA sites had the poorest ecological condition. Thus, the ASPT derivatives captured the true conditions of the sites according to water quality variables. Higher ASPT scores indicate a high number of taxa that are sensitive to pollution or disturbance (Chutter, 1972; Chutter, 1998; Dickens and Graham, 2002). Overall, the results of the biotic indices indicate a general deterioration in water quality and habitat diversity in the Mara River mainstem and the seasonal tributaries. On the contrary, sites in the upper reaches of the Amala and Nyangores tributaries had natural water quality and high habitat diversity.
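The contrast between total score and ASPT can be made concrete with a small worked example. The sensitivity weightings below are the SASS5/TARISS values quoted in the Results for these families; the two site lists are hypothetical and are chosen only to show how a taxon-rich but tolerant assemblage can have a higher total score and a much lower ASPT.

```python
# Sensitivity weightings quoted in the Results (SASS5/TARISS).
SENSITIVITY = {
    "Oligoneuridae": 15,
    "Heptageniidae": 13,
    "Perlidae": 12,
    "Polycentropodidae": 12,
    "Chironomidae": 2,
    "Oligochaeta": 1,
    "Culicidae": 1,
    "Psychodidae": 1,
    "Muscidae": 1,
}


def total_score(taxa):
    """Sum of sensitivity weightings over the distinct taxa present."""
    return sum(SENSITIVITY[taxon] for taxon in set(taxa))


def aspt(taxa):
    """Average score per taxon: total score divided by the number of taxa."""
    present = set(taxa)
    return total_score(present) / len(present)


# Hypothetical taxon lists for a few-taxa sensitive site and a taxon-rich tolerant site.
sensitive_site = ["Oligoneuridae", "Heptageniidae", "Perlidae"]
rich_tolerant_site = [
    "Heptageniidae", "Perlidae", "Polycentropodidae", "Chironomidae",
    "Oligochaeta", "Culicidae", "Psychodidae", "Muscidae",
]

scores = {
    "sensitive_site": (total_score(sensitive_site), aspt(sensitive_site)),
    "rich_tolerant_site": (total_score(rich_tolerant_site), aspt(rich_tolerant_site)),
}
```

In this example the taxon-rich site has the higher total score (43 versus 40) but the lower ASPT (about 5.4 versus 13.3), which mirrors the pattern reported for the MARA sites.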
Despite having the lowest number of taxa represented in the Mara River (only 40 out of 96), ETHbios performed better than expected in comparison to both SASS5 and TARISS, which had 78 and 77 taxa, respectively, represented in the river (Table 3). The good performance of ETHbios shows that even with a subset of the taxa, sites can be assessed to determine their ecological status. However, this can be difficult when the sensitivities of native taxa are unknown, and values from elsewhere are assigned to native taxa. TARISS, which borrows heavily from SASS5, was expected to perform better than the other two indices (SASS5 and ETHbios) because it is the only biotic index empirically developed in East Africa, and macroinvertebrates in Tanzania and Kenya share similar characteristics. For instance, Dicercomyzidae in TARISS, which is missing in both SASS5 and ETHbios, is represented in the Mara River.
Similar to diversity indices, the biotic indices also showed dependence on seasonality and flow conditions for performance. The three biotic indices performed better during the dry season by distinguishing between the extremes of ecological conditions (FOR and MARA) in the river. In contrast, there were declines in the discriminatory ability of the indices from the dry to the wet season. Poor performance of the biotic indices is likely due to improvements in environmental conditions or water quality caused by the dilution and flushing of the streams and rivers during the rainy season. Improvement of water quality in the MARA sites during the wet season was captured by the presence of sensitive taxa such as Perlidae, Heptageniidae and Philopotamidae, which did not occur during the dry season. Similar findings have been reported in other rivers, whereby replenishing of water and removal of fine sediments in the benthos by river flows leads to the recolonization of streams and rivers with sensitive taxa that cannot withstand sedimentation and poor water quality during the dry season (Shivoga, 2001; Masese et al., 2009b). The performance of the SASS5 and TARISS biotic indices was very similar. This can be explained by the fact that TARISS was derived from SASS5, with all the sensitivity weightings in the two indices being similar except for three taxa (Dicercomyzidae, Ephemerythidae and Neridae) missing in SASS5 but included in TARISS. Dicercomyzidae occur in the Mara River, implying that the distribution of taxa in regions sampled for the development of TARISS, including northern Tanzania, is like that in the Mara River basin. This is not surprising because the Mara River is transboundary between Kenya and Tanzania.
In contrast, the poor performance of ETHbios was largely due to many taxa (40) in the Mara River that are missing in the index, as opposed to only 16 taxa in the Mara that are not in either SASS5 or TARISS (Table A1).

Performance of M-IBI
The M-IBI developed for the Mara River performed better than diversity and biotic indices in discriminating the different levels and types of stressors among the four site categories. IBIs are more rigorous in their assessment of ecological conditions because they incorporate metrics whose utility or sensitivity to different forms of disturbance is already tested and assured. Although most of the 12 metrics included in the M-IBI have previously been used to develop macroinvertebrate IBIs in the Lake Victoria basin (Raburu et al., 2009a; Raburu et al., 2009b; Kobingi et al., 2009; Aura et al., 2010; Masese et al., 2014a), and other river systems in Africa (Thorne and Williams, 1997; Lakew and Moog, 2015; Alemneh et al., 2019; Tampo et al., 2020; Kaboré et al., 2022), three new metrics were incorporated to capture specific stressors that are unique to Afrotropical savanna rivers as indicated above. The first new metric included in the M-IBI was the percentage of individuals of rheophilic taxa. Rheophilic taxa require a stable substrate and moderate to fast flows to meet their ecological requirements, mainly attachment for feeding, dissolved oxygen uptake and osmoregulation. Rheophilic taxa in the Mara River include Hydropsychidae, Tricorythidae, Simuliidae, Elmidae, and some Baetidae (Centroptiloides sp., Pseudocloeon sp. and Afroptilum sp.), among others. Simuliidae, Hydropsychidae, Tricorythidae, and some Baetidae have been identified to be sensitive to flow alteration in African rivers (Dallas, 2007; Rivers-Moore et al., 2007; Thirion, 2016; Masese et al., 2021). When water levels and flow velocities drop, they cannot feed and obtain enough dissolved oxygen, and consequently, they are eliminated. Incorporating rheophilic species in multimetric indices is crucial for monitoring changes in natural flow regimes of rivers caused by land use change and excessive water withdrawals. These taxa are also important for setting environmental flow requirements for streams and rivers.
The second new metric included in the M-IBI was the percent of individuals in vegetation-associated taxa, which is the number of individuals in taxa occurring in or preferring vegetated zones. Vegetation-associated taxa included Naucoridae, Belostomatidae, Lestidae, Hydrometridae, Nepidae, Notonectidae, Lepidostomatidae, Leptoceridae (plant cased), Pisuliidae, among others. Most of these taxa belong mainly to Coleoptera, Hemiptera and Odonata, and some Trichoptera, especially detritivores. The sensitivity of Trichoptera to different forms of disturbance in streams and rivers is undisputed (Barbour et al., 1999). Taxa among Odonata, Coleoptera, and Hemiptera are fast colonizers tolerant to water flow reduction or cessation and poor water quality (Velasco and Millan, 1998; Boulton and Lake, 2008; Raburu et al., 2017) but very sensitive to the loss of instream or riparian vegetation that they use for attachment. Odonata was particularly abundant in seasonal streams and river sections with emergent and submerged macrophytes. In other savanna rivers, marginal vegetation supports abundant odonate larvae and adults (Samways and Steytler, 1996), and, as a result, Odonata and other taxa that prefer marginal vegetation are good indicators of riparian deforestation or clearance of instream vegetation (macrophytes) by floods or other disturbances (da Silva Monteiro et al., 2013). The third new metric included in the M-IBI was the percentage of individuals in burrowing taxa. Taxa that burrow in a soft substrate (gravel, sand, or mud) prefer slow-flowing sections of rivers (backwaters and pools) and are soft-bodied. Burrowing taxa include some Chironomidae, Oligochaeta, Polymitarcyidae, Gomphidae, Libellulidae, Leptophlebiidae, Tubificidae, Gyrinidae, and Corixidae, among others. The abundance of burrowing taxa is negatively related to discharge and flow velocities in streams and rivers.
In the Njoro River, drying of the river resulted in a tremendous increase in oligochaetes while other sensitive taxa among the EPT were eliminated (Shivoga, 2001; Mathooko et al., 2005). In the Moiben River, the peak of the dry season coincided with a significant decline in sensitive taxa and a proliferation of burrowing and pool-associated taxa, such as Belostomatidae, Corixidae, Gomphidae, and oligochaetes (Masese et al., 2009b). Similar findings have been reported in other river systems, suggesting that burrowing or pool-associated taxa are good indicators of flow reduction, sedimentation or organic pollution in streams and rivers.
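The three percentage-of-individuals metrics described here share the same arithmetic: the number of individuals belonging to a guild divided by the total number of individuals at a site. The following is a minimal sketch of that calculation, not the scoring procedure used in this study. The guild sets contain only families named in the text and are incomplete ("among others"); families for which only some members qualify (e.g., Baetidae, Chironomidae, Leptoceridae) are left out. The function name and the site counts are hypothetical.

```python
# Illustrative guild membership, restricted to families named in the text.
# These sets are incomplete and are an assumption of this sketch.
RHEOPHILIC = {"Hydropsychidae", "Tricorythidae", "Simuliidae", "Elmidae"}
VEGETATION = {"Naucoridae", "Belostomatidae", "Lestidae", "Hydrometridae",
              "Nepidae", "Notonectidae", "Lepidostomatidae", "Pisuliidae"}
BURROWING = {"Oligochaeta", "Polymitarcyidae", "Gomphidae", "Libellulidae",
             "Leptophlebiidae", "Tubificidae", "Gyrinidae", "Corixidae"}


def percent_individuals(counts, guild):
    """Return the percentage of all individuals that belong to `guild`.

    counts: dict mapping taxon name -> number of individuals at one site.
    guild:  set of taxon names that make up the guild.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sample has no individuals in any guild
    in_guild = sum(n for taxon, n in counts.items() if taxon in guild)
    return 100.0 * in_guild / total


# Hypothetical family-level counts for a single site (100 individuals).
site_counts = {
    "Hydropsychidae": 30,
    "Simuliidae": 20,
    "Baetidae": 25,      # not assigned to a guild in this sketch
    "Gomphidae": 10,
    "Oligochaeta": 5,
    "Lestidae": 10,
}

for name, guild in [("rheophilic", RHEOPHILIC),
                    ("vegetation-associated", VEGETATION),
                    ("burrowing", BURROWING)]:
    print(f"% individuals in {name} taxa: "
          f"{percent_individuals(site_counts, guild):.1f}")
```

In practice, guild membership would be assigned at the finest taxonomic level available, because only some genera within a family may qualify.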

Comparison of diversity, biotic and multimetric (M-IBI) indices
The discriminatory ability of the different indices was in the order M-IBI > biotic indices > diversity indices. Incorporating various metrics that responded differently to various forms and levels of disturbance gave the M-IBI an edge over diversity and biotic indices in the Mara River. Despite the notably better performance of the M-IBI, the three classes of indices were insensitive to low and moderate levels of disturbance in the river, specifically between FOR and MIX sites, MIX and AGR sites, and AGR and MARA sites. Only the extreme ecological conditions in FOR and MARA sites were distinguishable, and even so, most diversity and biotic indices could not differentiate between the FOR and MARA site categories. The failure of diversity and biotic indices to identify MARA sites as disturbed shows that the indices are biased when assessing disturbed sites with high taxa richness. It has been noted that diversity measures describe community structure but not the pollution level of water bodies (Washington, 1984). Thus, when community structure remains unchanged, for instance, when sensitive taxa are replaced by tolerant taxa as water quality degrades or the level of disturbance increases, most diversity indices cannot capture this change. Similarly, biotic indices are insensitive to the number of taxa and respond instead to the sensitivity ratings or scores of individual taxa. However, most derivatives of the biotic indices were positively correlated with the number of taxa, not the number of families.
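The point that diversity indices miss the replacement of sensitive taxa by tolerant taxa can be shown with a small worked example. The Shannon-Wiener index, H' = -Σ p_i ln p_i, depends only on the relative abundances p_i and not on which taxa carry them. The sketch below compares two hypothetical assemblages with identical abundance distributions but different taxa. The taxon labels, counts and sensitivity scores are placeholders invented for illustration; they are not SASS5, TARISS or ETHbios values, and the average-score function is only a generic stand-in for a score-based biotic index.

```python
import math


def shannon(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln p_i) over taxa present."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def average_score_per_taxon(counts, scores):
    """Mean sensitivity score of the scored taxa present in the sample."""
    present = [t for t, n in counts.items() if n > 0 and t in scores]
    if not present:
        raise ValueError("no scored taxa present in the sample")
    return sum(scores[t] for t in present) / len(present)


# Placeholder sensitivity scores (higher = more sensitive); hypothetical.
SCORES = {
    "sensitive_A": 12, "sensitive_B": 10, "sensitive_C": 11,
    "tolerant_X": 2, "tolerant_Y": 1, "tolerant_Z": 3,
}

# Two hypothetical sites with the same abundance distribution (40/35/25)
# but a complete turnover from sensitive to tolerant taxa.
reference = {"sensitive_A": 40, "sensitive_B": 35, "sensitive_C": 25}
degraded = {"tolerant_X": 40, "tolerant_Y": 35, "tolerant_Z": 25}

print("Shannon, reference:", round(shannon(reference), 3))
print("Shannon, degraded: ", round(shannon(degraded), 3))
print("Mean score, reference:", average_score_per_taxon(reference, SCORES))
print("Mean score, degraded: ", average_score_per_taxon(degraded, SCORES))
```

The two Shannon values are identical because the abundance vectors are identical, whereas the mean sensitivity score falls from the reference to the degraded assemblage. A metric based on taxon identity therefore registers a change that the diversity index cannot.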

Considerations for using diversity, biotic and multimetric indices
Although diversity and biotic indices have a long history of use as discriminators of different levels of pollution and other forms of disturbance in streams and rivers, the findings of this study show that they can be limited when distinguishing low to moderate levels of disturbance. Diversity indices were particularly ineffective in identifying changes in macroinvertebrate community composition caused by the replacement of taxa rather than the loss of taxa. On the other hand, biotic indices were insensitive to the number of taxa and instead relied more on the sensitivity ratings or scores of individual taxa. This made it challenging to detect moderate forms of disturbance that maintain or increase taxa richness. These shortcomings of diversity and biotic indices are addressed in multimetric indices that incorporate various metrics of macroinvertebrates whose utility is individually evaluated before inclusion in the final index.

FIGURE 5
Box-and-whisker plots comparing the ability of the M-IBI to discriminate between the four levels of disturbance or site categories (FOR, MIX, AGR and MARA) in the Mara River Basin during the dry (A) and wet (B) seasons. Boxes show interquartile ranges (25th and 75th percentiles), and the middle lines are medians. FOR, forested; MIX, mixed; AGR, agricultural; and MARA, Mara River mainstem and seasonal tributaries. Similar letters on boxes indicate no significant difference in metric values among the four condition categories after One-Way ANOVA.

Frontiers in Environmental Science
There is growing interest in developing biotic indices for the assessment of surface waters in Africa (Dallas, 2021; Masese et al., 2022). Thus, the performance of existing indices should be evaluated to inform the development of new ones. Most importantly, it is vital to determine the situations and conditions under which their use is most appropriate. There is also a need to improve the performance of biotic indices by empirically establishing the sensitivity ratings or scores of native taxa instead of relying on values in the literature. Although TARISS and SASS5 could assess the ecological condition of the different sites in the Mara River, their performance in discriminating different levels of disturbance in the river was inadequate. Thus, if either of the indices is to be used for regular biomonitoring of the river, several improvements must be made. First, all the taxa in the Mara River that could not be scored because of missing sensitivity weights or ratings in the two indices must be assigned sensitivity scores. This can be done using water quality and distribution data or by comparing their distribution with that of taxa already scored in other indices. Secondly, the sampling methods need to be tested for representativeness, and abundance data need to be incorporated into the bioassessment protocols. Although abundance is estimated in SASS5 (Dickens and Graham, 2002), it is not clear how these data are incorporated into the categorization of sites into different ecological classes. Thirdly, the sensitivity scores of the existing taxa should be re-assessed to ensure that they are congruent with the displayed responses in water quality and the overall condition of the sampled sites.
Concerns have been raised over the possibility of regional differences in the sensitivity of similar or related taxa to the same forms of disturbance (Kaaya et al., 2015; Dallas, 2021). Lastly, the issue of seasonality should be addressed to determine the optimum conditions for sampling. This is necessary because water resources are dwindling, and increasing demand has led to excessive withdrawals that convert once permanent streams and rivers into seasonal ones. In semi-arid and sub-humid savanna grasslands, streams and rivers experience hydrological extremes characterized by flooding during spates and cessation of flows during the dry season or droughts. This makes it difficult to develop an index suitable for all flow conditions. For instance, the SASS5 protocol is not recommended for high-flow (wet) conditions (Chutter, 1972; Dickens and Graham, 2002) because its results are too variable and unreliable. However, it is very good for assessing the effects of low flow on ecological conditions and establishing minimum flow requirements (environmental flows) for sustaining the functioning of rivers (Dickens and Graham, 2002; O'Keeffe and Dickens, 2008).
Even with the development of regional indices such as TARISS and ETHbios in Eastern Africa, there is still a need to develop a biotic index for river systems in Kenya and other African countries. However, this need not start from scratch, as existing regional indices such as SASS or TARISS can be used as foundations. Most importantly, these indices must be validated and tested using native taxa and environmental conditions. Differences in climate, geology, altitude, longitude, and latitude among regions and African countries may contribute to differences in the physico-chemical characteristics of rivers, resulting in differences in macroinvertebrate assemblages and in their sensitivities to disturbance gradients and general impairment of ecosystem functioning.

Conclusion
The findings of this study show that while diversity and richness indices that measure the structure of macroinvertebrate communities are widely used in determining the level of anthropogenic disturbance in streams and rivers, their performance in Afromontane-savanna rivers is not adequate. Most diversity and richness indices tested performed poorly and failed to discriminate among different levels of human disturbance. Overall, the M-IBI performed better than biotic and diversity indices by having a higher discriminatory ability among site categories. The poor performance of regional biotic indices in assessing the river's ecological condition provides more evidence for the need to test and validate indices developed elsewhere before their use in bioassessment programs and decision-making.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions
FM, EW, KF, and MM: Conceptualization, experimental design; FM, EW, KF, and MM: Data collection; FM, EW, AA, and KN: Data analysis; FM and EW: Drafting of the manuscript; All authors commented on the manuscript and approved it for publication.