Social Network Analysis for Water, Sanitation, and Hygiene (WASH): Application in Governance of Decentralized Wastewater Treatment in India Using a Novel Validation Methodology

Social network analysis (SNA) is a versatile and increasingly popular methodological tool to understand structures of relationships between actors involved in governance situations. Given the complexity of the set of stakeholders involved in the governance of Water, Sanitation and Hygiene (WASH) and the diversity of their interests, this article proposes the use of SNA in the WASH sector. The use of SNA as an appropriate diagnostic tool for planning Citywide Inclusive Sanitation is explored. Missing data is a major problem for SNA in studies of governance situations, especially in low- and middle-income countries. Therefore, a novel validation methodology for incomplete SNA data, relying on information from internal and external experts, is proposed. SNA and the validation method are then applied to study the governance of decentralized wastewater treatment in four cities of India. The results corroborate key differences between mega and secondary cities in terms of institutions, community engagement and the overall sanitation situation, including aspects of decentralized wastewater treatment plants.


INTRODUCTION
Social Network Analysis (SNA) is a method of detecting and interpreting structures and patterns of connections between actors who may be individuals, collectives or institutions (Scott, 2017). SNA is a versatile tool for different applications due to its graphical representation, structural intuition and systematic data interpretation (Freeman, 2004; Borgatti and Ofem, 2010). It has been increasingly used in a variety of fields, from political science (Fischer and Sciarini, 2016; Victor et al., 2016), business marketing (Iacobucci, 1996) and social psychology (Pearson and Michell, 2000) to public health (Valente et al., 2008) and environmental governance (Bodin and Crona, 2009; Bodin, 2017). More substantively, SNA is designed to deal with data on relations among entities, and thus with data that describe interconnected phenomena and consist of non-independent observations.

SNA concept | Relevant interpretation in sanitation governance
Density | Indicates how closely actors within a network are connected to each other. Calculated as the number of observed network connections over the maximum number of network connections that could exist (if all actors are connected to all other actors). Useful mostly for comparing networks.
Centrality | Centrality indicates the degree to which an actor is embedded in the network. For example, high centrality refers to actors able to collect and transmit information and coordinate with other actors (Scott, 2017). Several centrality measures exist (Freeman, 1979); the most prominent ones are degree centrality (number of connections an actor has), closeness centrality (average path length to all other actors in the networks), and betweenness centrality (actor lying on shortest path between two other actors in the networks). Useful mostly to identify important or powerful actors in the network.
Core and periphery | Indicates the degree to which a network has a core-periphery structure, and whether actors belong to one or the other. The core is defined as a set of densely interlinked actors, which is positioned in the center of the whole network, whereas actors in the periphery are more loosely connected to the center, and not among each other (Borgatti and Everett, 1999). Useful to identify a power structure in the network, and identify marginalized actors.
Centralization | The degree to which centralities in the network are distributed equally or unequally among actors in the network (Freeman, 1979). High centralization exists if there is one very central actor with all other actors being much less central. Useful to identify power structure and hierarchies.
Cliques | Subgroup of actors within the network that is densely connected. Useful to identify fragmentation of the network, or coalitions of actors, etc. (Bron and Kerbosch, 1973).

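As an illustration of how the measures in the table can be computed, the following minimal sketch calculates density, raw degree, and Freeman's degree centralization for an undirected network. The actor labels and ties are hypothetical placeholders rather than study data, the function names are our own, and the centralization formula assumes at least three actors.

```python
def density(actors, ties):
    """Observed ties over the maximum number of ties among the actors."""
    n = len(actors)
    return len(ties) / (n * (n - 1) / 2)


def degrees(actors, ties):
    """Raw degree: number of ties each actor has."""
    counts = {actor: 0 for actor in actors}
    for a, b in ties:
        counts[a] += 1
        counts[b] += 1
    return counts


def degree_centralization(actors, ties):
    """Freeman (1979) degree centralization for an undirected network."""
    n = len(actors)
    deg = degrees(actors, ties)
    top = max(deg.values())
    # A star network attains the largest possible sum, (n - 1) * (n - 2).
    return sum(top - d for d in deg.values()) / ((n - 1) * (n - 2))


# Hypothetical toy network; the actor labels are placeholders, not study data.
actors = ["Utility", "PCB", "Corporation", "RWA"]
ties = {("Utility", "PCB"), ("Utility", "Corporation"),
        ("Utility", "RWA"), ("PCB", "Corporation")}

print(round(density(actors, ties), 3))
print(degrees(actors, ties))
print(round(degree_centralization(actors, ties), 3))
```

A star-shaped network, in which one actor is tied to all others and no other ties exist, yields a centralization of 1, whereas a network in which all actors have the same degree yields 0.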
Whenever a researcher believes that relations among entities are crucial for understanding a given phenomenon, SNA can provide important insights (see Table 1). Governance in water, sanitation and hygiene (WASH) for development, especially in urban sanitation, is complex and commonly involves a number of stakeholders interacting across administrative levels, sectors and demographics (Strande et al., 2014). For instance, political economy studies of WASH and related urban services in Asian low- and middle-income countries have revealed that the complexity of governance combined with weak institutions is a detriment to urban service delivery (Boex et al., 2020). In such a context, SNA can be used to describe and analyze the polycentricity of governance and institutions relevant for economic development. Furthermore, SNA has been related (e.g., Ostrom, 2009) to the crucial concepts of polycentric governance (by assessing the complex patterns of different actors participating in a diversity of parallel decision-making bodies, e.g., Lubell, 2013) and social-ecological systems (by assessing how governance networks of actors are related to underlying ecological networks, e.g., Bodin, 2017). The use of SNA in such contexts can thus take the potentially important structure of relations among different actors into account, and could offer a different and possibly more appropriate perspective compared with more conventional stakeholder analysis methods, which are often employed in WASH research and practice. Neely (2013) has indicated the importance of SNA for understanding the complex adaptive systems that exist in WASH for development, in order to answer the questions of why and how to ensure the sustainability of community WASH interventions.
More specifically, SNA has several key advantages for the analysis of complex governance situations. First, SNA can help in identifying and interpreting specific roles of given actors in the governance network, including gatekeeper or broker roles (Bodin and Crona, 2009; Ingold and Varone, 2012; Ingold, 2014). These actors can be crucial for the diffusion of information and best practices, or the elaboration of compromise solutions in governance networks. Second, a graphical representation of the SNA, a network graph (or sociogram), provides intuitive visual insights into the interactions between actors and allows for the identification of key and marginalized players, and therefore could facilitate more equitable stakeholder involvement. Such information could pave the way for effective stakeholder engagement, taking into account formal and informal networks, and reveal possibilities to build on existing social structures and points of intervention that improve success in WASH governance. For example, SNA has been used to identify collaborative social networks for better water resource governance in the Mkindo catchment, Tanzania (Stein et al., 2011). A deeper understanding of stakeholder relations can increase the likelihood of collective action, resulting in higher success of interventions (Prell et al., 2009). The use of SNA for identifying key characteristics of stakeholder networks that support institutional development has been shown in the service delivery of rural water supply in several low- and middle-income countries (McNicholl et al., 2017). Third, the very process of SNA data gathering has positive effects on the participation of stakeholders and the building of relationships with them (Jami and Walsh, 2014), while also increasing their awareness of other actors in the network.
This is particularly useful in planning for the paradigm shift in urban sanitation that is Citywide Inclusive Sanitation (CWIS), which is based on equity in sanitation service delivery, combined use of diverse sanitation systems, and safe management of fecal waste along the entire sanitation value chain (Lüthi and Narayan, 2018).
Despite the potential benefits of SNA for research in the WASH sector, there has been a preference for stakeholder analysis over SNA, especially in urban sanitation studies (Lüthi et al., 2011; Reymond, 2014; Myers, 2016). Stakeholder analysis has been criticized for a lack of consistency, for partial perspectives, and for failing to account for informal relations (Hermans, 2005; Reed et al., 2009). Stakeholder analysis is purely qualitative and relies solely on interviews, focus group discussions, and snowball sampling to identify stakeholder interest and influence. SNA, on the other hand, can be quantitative, qualitative or both, and allows for a mixed methods approach (Edwards, 2010). Studies advocate combining SNA and stakeholder analysis to produce fine-grained insights in water infrastructure planning, because this would improve rigor and offer complementary perspectives that would help to create a more complete situational diagnosis of stakeholder interests and interactions (Lienert et al., 2013). Other studies have promoted this view in natural resource governance and participatory planning (Paletto et al., 2015; Yamaki, 2017).
One important disadvantage of conventional SNA methodology and the related data gathering through surveys or interviews (Wasserman and Faust, 1994) is that it faces data collection problems similar to those of most other key informant methodologies. SNA requires reliable data to draw strong inferences from the analysis of the networks. This creates the need for a systematic validation procedure, which could mitigate the issues that arise with unreliable data, especially from research in low- and middle-income countries, where data quality and availability are consistent issues (Becker et al., 2012). Since most WASH research is carried out in such settings, an appropriate validation procedure is even more relevant.
Decentralized wastewater treatment systems in India have witnessed an exponential increase in their uptake across the country in the last decade. This was prompted by a 2006 amendment to the environmental clearance laws that mandated that large buildings (built-up area above 20,000 m²) treat sewage in situ. An estimated 20,000 small-scale Sewage Treatment Plants (STP), serving between 10 and 1,000 households, are currently in operation using various technologies. A majority of them are found in cities, both mega and secondary. However, due to the lack of a clear policy framework and jurisdictional overlap between governing agencies at various levels, the performance and sustainability of such small-scale sanitation systems (SSS) are affected. Sustainable long-term operation of such SSS requires effective governance (Ross et al., 2014). Understanding the governance of SSS can also help inform future policies for their planning, implementation and long-term monitoring. Such a study can also help in understanding the nuanced differences between mega and secondary cities in India, which differ inherently in institutional set-up, urbanization, citizen engagement, decentralized wastewater treatment, and sanitation at large. Therefore, the combined aim of this paper is to: (i) propose SNA as a useful tool for WASH research and practice, (ii) introduce a novel validation methodology for SNA, and (iii) explore the differences in sanitation governance between mega and secondary cities in India, using SNA as a tool. In doing so, this paper presents the first research applying social network analysis to urban sanitation settings.

Social Network Analysis and Low Response Rates
The goal in the first stage was to gather SNA data on the governance networks in four Indian cities based on interviews and surveys. This type of data gathering in the field is well established for SNA and has been previously used as a systematic method to describe and analyze the governance network between multiple stakeholders in areas such as the water sector (Lienert et al., 2013; Angst, 2018), natural resources governance (Bodin and Crona, 2009), climate governance (Ingold and Fischer, 2014), energy governance (Fischer, 2015), policies for reducing emissions (Brockhaus et al., 2014), and planning (Dempwolf and Lyles, 2012; Gerber et al., 2013). In this initial attempt, the relevant actors responsible for the SSS present in the four Indian cities (Chennai, Bangalore, Mysore and Coimbatore) were identified through informal expert contacts and document analysis (a set of about 15-20 actors per case, e.g., national, state and city level public administrations, international organizations, relevant boards, and associations; an overview of actors appears in Table 2). Individual representatives of the relevant organizations were then contacted by email and phone in order to interview them or have them fill out a written survey with the same content. For example, in order to assess the relevant network relations among actors, the survey/interview protocol asked actor A to "check, on a pre-defined list of all relevant actors, all those actors with which actor A regularly exchanged technical information on sanitation issues within the last 10 years." A common problem with gathering network data directly from the stakeholders themselves is low response rates, as with any other interview and survey data gathering. In the present case, the interview and survey response rates were below 40% on average (with a maximum of 50% in Bangalore and a minimum of 27% in Coimbatore).
Common reasons for non-response are that individuals do not feel competent to answer the questions, are not interested in filling in surveys, do not have time, do not want information about their organization to appear in studies, etc. These reasons were mentioned by actors in this specific case, and they correspond to common reasons for non-response in survey and interview-based research. Overall, while low response rates are a common problem in social science research in low- and middle-income countries such as India, they are also an issue in many studies of this nature elsewhere, including SNA research in the United States, for example (Lubell et al., 2017).
Low response rates lead to incomplete data. Data can be incomplete with respect to actors that are missing, or, more frequently, with respect to relations between the actors that are missing. Concerning the latter, survey and interview data gathering in the context of SNA always has two potential sources of information for the relation between two actors, that is, one or the other actor. While this can mitigate issues of low response rates (if actor A indicates a relation to actor B, but information from actor B is missing, the researcher still has partial information on that relation), missing data in SNA can still be problematic for several reasons. Most importantly, incomplete network data can lead to unreliable estimates of network-level statistics, given that network-level statistics are based on the structure of the entire network (Burt, 1987). For example, centrality is a popular network measure used to identify the most important actors in a governance network (Table 1). Centrality measures can be incorrect due to missing data, or if parts of the networks are missing or disconnected from each other (Costenbader and Valente, 2003). More substantively, the analysis of incomplete network data might lead to the erroneous identification of important actors through wrong or unstable centrality indices. It can further lead to inaccurate density measures (see Table 1) if the percentage of missing data differs between the networks to be compared.
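The effect of non-response on network-level statistics can be illustrated with a small sketch. It assumes, as described above, that a tie is recorded whenever at least one of its two actors responded, and that respondents report their ties accurately. The five actors, their ties and the set of respondents are hypothetical.

```python
def observed_ties(true_ties, respondents):
    """Keep a tie if at least one of its two actors responded and could report it."""
    return {tie for tie in true_ties
            if tie[0] in respondents or tie[1] in respondents}


def density(n_actors, ties):
    """Observed ties over the maximum number of ties among n_actors."""
    return len(ties) / (n_actors * (n_actors - 1) / 2)


# Hypothetical "true" network of five actors and six ties.
actors = ["A", "B", "C", "D", "E"]
true_ties = {("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("D", "E"), ("B", "E")}
respondents = {"A", "C"}  # a 40% response rate

seen = observed_ties(true_ties, respondents)
lost = true_ties - seen  # ties between two non-respondents go unrecorded

print(sorted(lost))
print(density(len(actors), true_ties), density(len(actors), seen))
```

In this toy case, the two ties between non-respondents are never observed, so both the density of the network and the degree of the non-responding actors are underestimated.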

Validation Methodology
In order to increase the validity of the data gathered on the four cities in India, a validation methodology was developed. The objective of the process was to validate an existing, incomplete network, using available expertise from informants who have extensive knowledge of the case and of the relationships the actors share within the network. This process of eliciting expert judgements has been previously used for WASH studies in low- and middle-income countries, where data is often not readily available and knowledge from experts was found to be invaluable (Montangero and Belevi, 2007). Similar practices have been employed, albeit rarely, to elicit network data for social network analysis. Carley and Krackhardt (1996) involved a third person within the network, the equivalent of an "insider," to comment on connections in dyadic relations. In that study, such insiders were used to examine the cognitive inconsistency between non-symmetric and non-reciprocated relations between actors. Orenstein and Phillips (1978) used press reporters to give information about political actors' relations, a case which used members completely outside of the network, that is, "outsiders." As mentioned by Dorelan et al. (1989), it is important for these outsiders to be in the margins of the study group and yet remain knowledgeable. Insiders bring in detailed information about relations between actors based on their direct experience and a perspective only available to them. Similarly, outsiders are beneficial due to their ability to view the entire network without direct involvement and, therefore, without egocentric biases (Dorelan et al., 1989). Using these two established types of informants, insiders and outsiders, simultaneously allows for an additional level of confirmation to be obtained regarding network data between actors, while also reducing any possible perception biases.
In order to improve data reliability, a seven-step validation procedure is proposed below. This procedure is based on network graphs, which are visualizations of the social network. Most importantly, these visualizations include nodes (also called vertices) to represent the actors in the governance networks and ties (also called links or edges) to represent relations between the actors. Colors and sizes of nodes and ties can be used to represent attributes of these elements. For example, different colors can be used to represent different types of actors, and tie size can be used to represent the intensity of a relation. The steps of the validation procedure are grouped as desk-based steps (1-3), field-based steps (4-6) and a reconciliation step (7).

1. Usage of existing incomplete or desk based network graph
The initial network graph stems from an incomplete social network analysis, with either missing actors or missing information on relations between actors. The incompleteness can be either due to low response rates in interviews or surveys, or to the fact that it was a purely desk based study, which needs validation from the field to bring it closer to the reality of the different types of relations among actors.

2. Expert identification
Experts could either be identified from a Power-Interest matrix, by choosing actors with high interest (Quadrants 1 and 4 in Figure 1; see footnote 5), or be chosen based on case knowledge. Depending on its size, 10-20% of the number of actors in the entire network graph could feature as experts. It is preferable to keep this percentage low; otherwise there is a risk of carrying out an elaborate conventional SNA procedure of interviewing most actors, again with problems of missing responses. A low percentage also helps to target the most valuable experts and eases the reconciliation (Step 7).

3. Insider-Outsider selection
An equal number of insiders and outsiders (defined as above) have to be selected from the experts. Those actors positioned in the core of the network graph with high centrality are classified as insiders and those actors who are either in the periphery of the previous network graph or who do not feature as an actor at all, and yet have high interest and/or knowledge about the context of the social network, will be classified as expert outsiders.
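Steps 2 and 3 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the use of raw degree as a stand-in for centrality, and the `core_min_degree` cut-off are hypothetical choices, and in practice the selection also relies on case knowledge and on the experts' interest.

```python
def n_experts(n_actors, share_percent=20):
    """Experts as a share of all actors, rounded up to an even number (at least 2)."""
    n = max(2, -(-n_actors * share_percent // 100))  # integer ceiling
    return n + n % 2  # even, so insiders and outsiders can be equal in number


def classify_experts(experts, degree, core_min_degree):
    """Split experts into insiders (well connected in the graph) and outsiders."""
    insiders, outsiders = [], []
    for expert in experts:
        # Experts absent from the graph get degree 0 and count as outsiders.
        if degree.get(expert, 0) >= core_min_degree:
            insiders.append(expert)
        else:
            outsiders.append(expert)
    return insiders, outsiders


# Hypothetical degrees from a pre-validated graph of 18 actors (only some shown).
degree = {"Utility": 9, "PCB": 8, "Consultant": 2}
experts = ["Utility", "PCB", "Consultant", "Retired engineer"]

print(n_experts(18))
print(classify_experts(experts, degree, core_min_degree=5))
```

With 15-20 actors per city and a 20% share, this rule gives four experts per city, which can be split into two insiders and two outsiders.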

4. Discussion based on a simplified unconnected version
A simple version of the network graph, where actors are arranged randomly with equal sizes and without color codes or connections between them, is presented to each expert (insider and outsider). This ensures that the experts can draw only basic inferences from the representation, and that no biases are created. In order to deal with the first basic issue concerning missing data in the SNA (missing actors), it is verified that all important actors are featured and that no unimportant actor is included. If not, the suggested actors are added or deleted (for example, Divisional PCB is removed, as mentioned in Figure 2).

5. Simplified version to make connections
After the actor verification in step 4, the expert is asked for the perceived relations between the actors in order to deal with the second missing data issue in the SNA, that is, missing relations among actors. Types of connections vary by case; in governance, typical connections include information exchange (technical and administrative), collaboration, line reporting, etc. (Victor et al., 2016). These connections could be formal only, informal only, or both, as required by the network graph. Initially, the obvious connections are marked, and then the less visible connections, such as informal or inter-sector connections, are made (for example: International Organizations and Private Companies in Figure 2). This exercise might take some time, and often requires prompt questions.

6. Existing network graph for representation questions
After the simplified unconnected version, the original non-validated network graph is presented to the expert, and representative questions are discussed. The expert is then invited to verify which actors are central or peripheral, which connections are present or not, and whether the sizes and positions of all actors are right in their view (note that the position of an actor usually represents its centrality, and the size can represent different types of information, in this case Eigenvector centrality). Additionally, any weak, nonexistent or irrelevant connections are marked to be removed (for example, a weak connection between the Central Pollution Control Board and International Organizations was marked for removal in Figure 3; similarly, connections between the urban development authority and the divisional pollution control board, and between the state funding corporation and the pollution control board, were also suggested for removal).

Footnote 5: As part of the study, a stakeholder analysis with a power-interest matrix was carried out for the above cases (Figure 1). The power-interest matrix classifies the stakeholders identified according to the power they hold and their interest in the decision-making process on all aspects of decentralized wastewater treatment plants in each of these cities (Reymond, 2014). "Power" (vertical dimension) refers to the ability of an actor to make decisions and to influence the system, independently of its formal role. "Interest" (horizontal dimension) refers to their involvement in the sector, based on their responsibility (Ackermann and Eden, 2011).

7. Data reconciliation
Based on all the data collected in steps 1-6, the corresponding binary adjacency matrix is filled with 1 or 0, indicating that the pair of actors is connected or not connected, respectively. When there are conflicting responses for the same connection from various sources, the reconciliation for the relation is carried out based on the following (see example in text further below): (i) Data from the previous network graph; (ii) Weightage of expertise of insiders and outsiders; (iii) Documentary evidence found; (iv) Justification provided during the interview; (v) Substantial case knowledge.
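The bookkeeping behind this step can be sketched as a weighted vote. This is only an assumption-laden illustration: the reconciliation described here is a qualitative judgement by the researcher, and the weights, source names and actor labels below are hypothetical. The sketch assumes undirected ties with each pair written in a fixed order, falls back on the pre-validated graph when the weighted votes are evenly split, and flags such pairs for manual review against documentary evidence and interview justifications.

```python
def reconcile(reports, weights, prior):
    """Return ({pair: 0 or 1}, [pairs needing manual review]).

    reports: {source: {(actor_a, actor_b): 0 or 1}}
    weights: {source: expertise weight}
    prior:   {(actor_a, actor_b): 0 or 1} from the pre-validated graph
    """
    decided, review = {}, []
    pairs = sorted({pair for answers in reports.values() for pair in answers})
    for pair in pairs:
        yes = sum(weights[s] for s, answers in reports.items()
                  if answers.get(pair) == 1)
        no = sum(weights[s] for s, answers in reports.items()
                 if answers.get(pair) == 0)
        if yes > no:
            decided[pair] = 1
        elif no > yes:
            decided[pair] = 0
        else:
            # Evenly split: keep the pre-validated value and flag for judgement.
            decided[pair] = prior.get(pair, 0)
            review.append(pair)
    return decided, review


# Hypothetical experts, weights and actor labels (not study data).
weights = {"insider_1": 2, "outsider_1": 1, "outsider_2": 1}
reports = {
    "insider_1": {("X", "Y"): 1, ("X", "Z"): 0},
    "outsider_1": {("X", "Y"): 0, ("X", "Z"): 1},
    "outsider_2": {("X", "Z"): 1},
}
prior = {("X", "Y"): 0, ("X", "Z"): 1}

decided, review = reconcile(reports, weights, prior)
print(decided)
print(review)
```

The resulting dictionary corresponds to the entries of the binary adjacency matrix, while the review list marks the connections for which criteria (iii) to (v) still have to be applied by the researcher.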

Validation of the Network Graph
For the validation procedure proposed in this paper, four key stakeholders were chosen for each of the four cities, and a total of 16 validation interviews were carried out (Table 3). To avoid potential research fatigue (Clark, 2008), all the stakeholders chosen were new and had not been interviewed for the previous social network analysis. This was possible since these actors were not part of the earlier SNA interviews (due to poor selection, unavailability or inaccessibility at that point of time), which resulted in the analysis being incomplete in the first place. In addition, certain experts who had retired or switched careers, yet still had significant knowledge, were included in the validation study.

Discussion on Validation Methodology
FIGURE 1 | Power-interest matrix of potential stakeholders involved in small-scale sanitation governance at the local level. Refer to Table 2 for abbreviations. Color coding is followed in all other network graphs presented below.

While such a validation method allows for the gathering of additional data to complement incomplete networks and thus provides an improvement over incomplete survey- or desk-based studies, there are obviously some challenging issues as well. Below, four such challenges and their mitigation are discussed.
Firstly, knowledge biases, exercise preferences and effective priming are concerns for the format of the validation methodology. The order of steps 5 and 6 was found to be critical in drawing out major connections in the experts' opinions without biasing them. This sequence also ensured that the experts were primed for a more visually complex, information-dense and influential network graph. Through the combined usage of the time-consuming step 5 and the visually intimidating step 6, experts who had a preference for one step over the other were also catered to. Experts are often senior and time pressed, so the process had to be time effective and flexible. This two-pronged approach therefore reduces the amount of information lost due to time constraints and methodological limitations.
Secondly, clarity in the criteria for connections is important to establish at the beginning. The interpretation of what constitutes an existing connection varies between experts, and has to be explicitly clarified. Differing assumptions could otherwise result in inaccurate connections (for example: are agencies that solely fund decentralized STP projects involved in governance, even if they have no responsibility apart from their financial contributions?). There is the possibility that large biases could emerge from the experts as well (for example, private sector experts tend to focus on their own importance, while government players tend to downplay the former's importance; see Fischer and Sciarini, 2015). Both aforementioned concerns could be mitigated by objectively administering the interview with clarity on the relational requirements and by minimizing information spill to prevent biases.
Thirdly, prompting is frequently employed in order to maximize the information elicited from the experts, especially in circumstances where inherent knowledge or previous connections are to be challenged. This could potentially lead to interview frustration or bias (Bowling, 2005). Once all major connections have been explored, prompting was found to be necessary to bring out inconspicuous connections. The researchers must have a considerable amount of prior case knowledge in order to prompt carefully when required. For example, in step 5, the connection between the private company and the pollution control board in several cases required prompting before it was considered for either connecting or not.
Finally, conflicting information leads to difficulties in reconciliation. Since the validation methodology relies on fewer respondents, albeit experts, it requires care to bring in diverse perspectives. Otherwise, the SNA could risk becoming skewed through purposeful sampling (Patton, 1990). The validation procedure ultimately rests on the systematic reconciliation of conflicting data points. This is carried out qualitatively and involves the judgement of the researcher, which, yet again, requires prior substantive case knowledge on the part of the researcher. Since the method itself is a mix of qualitative data collection and quantitative data analysis, these limitations are inherent and require careful consideration while selecting experts, as well as a systematic approach during the reconciliation. However, such limitations are prevalent in most qualitative methods (Taylor et al., 2015), including conventional social network analysis (Scott, 2017). The reconciliation procedure becomes crucial when the experts give varying and frequently conflicting network data. Therefore, a systematic assessment of the data needs to be carried out, based on expertise weightage, documentary evidence, substantive case knowledge, and the justification provided during the interviews. For example, when C3 and C4 (Table 3) had conflicting views on one specific connection between the city corporation and the state pollution control board, C4's view was upheld, since C4 had earlier held positions at both city and state levels. Additionally, C4's justification proved to be more convincing, with references to policy documents.
In the results section, we present and compare the governance of decentralized wastewater treatment in four cities based on the data received from the different steps of the data collection, including the validation procedure. Although the goal is to describe governance networks and compare different cases, SNA as a standalone method lacks the context needed to interpret the network graphs and needs to be used in conjunction with other research methods, especially qualitative methods, to gain a deeper understanding of the situation and prevent simplistic conclusions on the stakeholder interactions (Prell et al., 2009; Edwards, 2010). Therefore, the validated network data was complemented with two workshops and 76 in-depth qualitative key informant interviews, which provided the background and context on urban wastewater management in India for the selected mega and secondary cities, and the differences between them were explored (see results section). In addition, the institutional and performance analysis of the specific small-scale sanitation systems in the four cities was available to provide additional perspectives relevant to this analysis. The validated data was processed using the user-friendly, SNA-specific open source software Gephi (Bastian et al., 2009), and represented using the Force Atlas configuration without any manual manipulation.

RESULTS
In this section, four main results regarding the use of SNA for our case study are presented. Firstly, the pre-validated SNA is compared with the validated SNA, and the major modifications resulting from the validation exercise are given. Secondly, a detailed illustration of using SNA to understand the governance of decentralized wastewater treatment in one particular city, Chennai, is provided. Thirdly, the differences between mega and secondary cities in terms of sanitation are presented. Finally, the SNA results are discussed in relation to a few of these key differences.

Comparing Pre-validated SNA With Validated SNA
The initial procedure yielded an incomplete network, based on which pre-validated network graphs were created for the four cities of Chennai, Bangalore, Mysore and Coimbatore (Figures 4A-D). Similarly, network graphs were created using the validated network data for the same cities (Figures 5A-D). The five major differences that are clearly visible are discussed below: actor influence, removal of irrelevant actors, addition of important actors, centralities of actors, and densities of the overall networks.
In the interviews, it was unanimously stated that certain actors had a much bigger role in implementation than others who only had soft powers to influence policies. Actors were then broadly classified as implementing actors and influencing actors. For example, comparing Figures 4B, 5B, the Central Pollution Control Board (CPCB) and the Central Public Health and Environmental Engineering Organization (CPHEEO) are influencing actors, while Bangalore's Water Utility (BWSSB) and Resident Welfare Associations (RWAs) are implementing actors. It is important to note that the aforementioned influencing actors are at the national level, while implementing actors are at local level. CPCB sets effluent standards while CPHEEO develops engineering manuals, and both are strong influencers in designing SSS for all contexts. Whereas, BWSSB and RWAs are actors that are directly involved in the building, operation and maintenance of SSS. Although these influencing and implementing actors could have been visually marked differently in their node 7 characteristics, the validated network graph clearly makes the distinction through their position in the core or periphery (Table 1), and their node sizes that represent their centrality measures. 7 Nodes are representation of actors within the network graph. Their color, size and position are important visual characteristics that define them. Other statistics, such as various centralities for each of the nodes, can also be calculated (Scott, 2017). Through step 4, the most relevant actors were identified, and unimportant actors were removed. This resulted in changes in the actors present in the network. The main actors removed were the State Environmental Impact Assessment Agency (SEIAA), the Divisional PCB (DPCB), and the Department of Environment (DoE), due to their relative insignificance in the governance of SSS. 
SEIAA was removed because the Impact Assessment Certifications for construction and operation of STPs are within the purview of the respective state pollution control boards (CPCB, 2016). DPCB is a department within the state PCB and, therefore, does not require explicit mention. DoE as a department does not directly play any role apart from being the state-level agency that the PCB reports to.
Additions were made to the social network, as certain actors were found to play a directly influencing or implementing role in SSS for these cities. In Figure 5A, Chennai River Restoration Trust (CRRT), a special purpose vehicle (an independent legal entity with a specific goal, which in this case has the mandate of the rejuvenation of urban water bodies in Chennai), was found to be engaged in setting up SSS and in coordinating with other actors for the wider establishment of SSS, and was therefore added. Similarly, the node Private Players (Figures 4A-D) was meant to represent RWAs, NGOs, private STP companies, and consultants. Since their relationships with other actors, as recorded in the adjacency matrix, varied highly, they were split into two groups (Figures 5A-D). Further, the main agency that directed all municipal governance, including water and sanitation, was the Municipal Administration and Water Supply (MAWS) in the state of Tamil Nadu, and the Directorate of Municipal Administration (DMA) in the state of Karnataka. These agencies were found to play a bigger role in the smaller cities with respect to SSS.
Overall, the centralities of actors changed with modification in the network data. The most central agency is no longer the PCB, but the utility (CMWSSB/BWSSB) in the mega cities of Chennai and Bangalore, while the municipal corporation (CMC/MCC) became the most central actor in the secondary cities of Coimbatore and Mysore, with the parastatal water supply and drainage board (TWAD/KUWSDB) playing a bigger role in the latter two. The densities of the networks of the four cities have also changed to reflect a more uniform network density across the four cases (Table 4). This is a result of the changes in the overall number of actors and the changes in the individual relations of each actor. The higher values are due to the elimination of irrelevant actors who earlier had very few connections, which increased the overall network density.
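The effect of removing weakly connected actors on density can be illustrated with a minimal sketch. The toy network below is hypothetical and is not the data of any of the four cities; the actor labels and the helper names density and drop_actors are illustrative assumptions. Density is computed as the number of ties present divided by the number of ties possible in an undirected network.

```python
from itertools import combinations


def density(nodes, edges):
    """Density of an undirected simple graph: ties present / ties possible."""
    n = len(nodes)
    if n < 2:
        return 0.0
    possible = n * (n - 1) / 2
    return len(edges) / possible


def drop_actors(nodes, edges, removed):
    """Remove actors and every tie that touches one of them."""
    kept_nodes = [v for v in nodes if v not in removed]
    kept_edges = {e for e in edges if not (e & removed)}
    return kept_nodes, kept_edges


# Hypothetical toy network (not the study data): a fully connected core of
# four actors plus two peripheral actors, P1 and P2, with a single tie each.
core = ["A", "B", "C", "D"]
nodes = core + ["P1", "P2"]
edges = {frozenset(pair) for pair in combinations(core, 2)}
edges |= {frozenset(("A", "P1")), frozenset(("B", "P2"))}

before = density(nodes, edges)
pruned_nodes, pruned_edges = drop_actors(nodes, edges, {"P1", "P2"})
after = density(pruned_nodes, pruned_edges)
print(f"density before: {before:.2f}, after: {after:.2f}")
```

In this toy case the two peripheral actors contribute only two ties but add nine possible ties to the denominator, so dropping them raises the density.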

Using SNA to Understand Governance of Decentralized Wastewater Treatment
In order to illustrate the usage of SNA for insights into the governance of decentralized wastewater treatment, the case of Chennai is taken as an example (Figure 5A). There are a total of 13 key actors involved in the city's SSS. The network overview characteristics, such as network density and average path length, provide basic insight into the network graph. A density of 0.50 indicates quite strong connectivity, as half of all possible ties between pairs of actors are present. The network diameter of 2 shows that the two actors positioned furthest apart are two ties away from each other, so that one actor sits between them. The average path length of 1.5 corroborates this: on average, any two actors are one and a half ties apart, which means that pairs of actors are either directly connected or linked through a single intermediary. These network characteristics are particularly useful when comparing networks, but are more difficult to interpret by themselves. For example, we can state that a network in one city is denser than in another city, but it is hard to judge whether the network is dense, per se, as this depends very much on the type of network (type of context, types of nodes, types of ties, etc.).
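As a minimal sketch of how these three overview measures are obtained, the example below computes density, diameter and average path length for a small hypothetical network using breadth-first search. The five actors and their ties are invented for illustration and are not the Chennai data; the toy network was chosen only because it yields the same three values (0.50, 2 and 1.5). The function names are assumptions, and the sketch presumes a connected, undirected network.

```python
from collections import deque
from itertools import combinations


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of ties on the shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def network_overview(adj):
    """Return (density, diameter, average path length).

    Assumes a connected, undirected network given as {actor: set of neighbours}.
    """
    nodes = sorted(adj)
    n = len(nodes)
    n_pairs = n * (n - 1) // 2
    n_ties = sum(len(neigh) for neigh in adj.values()) // 2
    all_dist = {v: shortest_path_lengths(adj, v) for v in nodes}
    pair_dist = [all_dist[u][v] for u, v in combinations(nodes, 2)]
    return n_ties / n_pairs, max(pair_dist), sum(pair_dist) / n_pairs


# Hypothetical toy network (not the study data): one hub tied to every other
# actor, plus a single tie between two of the remaining actors.
ties = [("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("HUB", "D"), ("A", "B")]
adj = {}
for u, v in ties:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

density, diameter, avg_path_length = network_overview(adj)
print(density, diameter, avg_path_length)
```

In any connected network with a diameter of 2, every pair of actors is either one or two ties apart, so the average path length equals 2 minus the density. This is why a density of 0.50 and an average path length of 1.5 appear together.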
All actors perform the role of either implementing or influencing agencies and, as mentioned before, this is not explicitly labeled, but the size of the nodes and their positions form a core and periphery structure (Table 1) which indicates whether the actors are implementing or influencing. In the case of Chennai, the Utility (CMWSSB), the municipal corporation (GCC), the State PCB (TNPCB), Consultants & Private Companies, and RWAs & NGOs are directly involved in the process of commissioning, licensing, building, operating, and maintaining SSS. Therefore, they are clearly seen to be implementing agencies, while all the others remain influencing agencies, since they have only indirect involvement in the process, such as financing, setting standards for discharge and performance, providing expertise, advocating or simply approving SSS projects.
The centralities of these actors offer more detail in terms of how much power they have within the network. This also translates to how much influence they have in governance within this context. Among the many different centralities (Table 1), degree centrality and betweenness centrality are the most relevant in the present case, as they offer simple measures of an actor's influence within the network. Together, they offer a complementary set of perspectives. Degree centrality represents the simple number of connections an actor has, and thus the actor's potential to serve as a hub. Betweenness centrality represents the extent to which an actor is placed on a path between other actors. The latter shows the power an actor has in controlling information exchange between other actors, and how the network would be disrupted if that actor were removed. Table 5 provides the values of centralities for all actors involved in SSS governance in Chennai. For example, CMWSSB as the most central actor has connections to all other 12 actors, whereas four actors are connected to only a third of the network (degree centralities of 4). The betweenness centralities are more complicated to interpret directly from the measure, but suggest a clear hierarchy in terms of the actors able to connect other actors within the network. While both centrality measures offer theoretically informed complementary perspectives, they are also highly correlated, suggesting that actors accumulate different aspects of centrality and the related potential for influence. Based on the centralities, actors and their most suitable functions can be identified. For information diffusion, the actor with the highest centrality measures (both degree and betweenness) is CMWSSB. It is best placed to inform all actors of policy changes, standard settings, and best practices.
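The two centrality measures can be sketched as follows. The network is a hypothetical five-actor example, not the Chennai data in Table 5, and the function names are illustrative. Degree centrality is the count of direct ties; betweenness centrality is computed here with Brandes' algorithm for unweighted, undirected networks and left unnormalized, which may differ from the normalization used by the software that produced the published values.

```python
from collections import deque


def degree_centrality(adj):
    """Number of direct ties of each actor."""
    return {v: len(adj[v]) for v in adj}


def betweenness_centrality(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                        # actors in order of increasing distance
        preds = {v: [] for v in adj}      # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}     # dependency of s on each actor
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical toy network (not the study data): one hub tied to every other
# actor, plus a single tie between actors A and B.
ties = [("HUB", "A"), ("HUB", "B"), ("HUB", "C"), ("HUB", "D"), ("A", "B")]
adj = {}
for u, v in ties:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

degree = degree_centrality(adj)
betweenness = betweenness_centrality(adj)
for actor in sorted(adj):
    print(actor, degree[actor], betweenness[actor])
```

In this toy network the hub has the highest degree and is the only actor lying on shortest paths between others, which mirrors the pattern in which the two measures tend to be highly correlated.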
For the role of monitoring, a governmental agency needs high centrality and a position within the core of the network, yet must be independent enough that it is not easily influenced by virtue of its connections to other actors. In this case, CMWSSB, GCC and TNPCB are relevant agencies for monitoring the performance of SSS in Chennai. TNPCB has already been constitutionally mandated to monitor all sewage treatment discharges, according to the Water Act of 1974. A recent notification from the National Ministry of Forests and Environment has delegated the power of ensuring compliance with environmental standards to the urban local bodies, such as GCC. In reality, there is little clarity on these institutional mandates for the long-term monitoring of SSS, and each of these agencies has its own limitations in terms of jurisdictional reach and capacity. Therefore, looking purely at the SNA, CMWSSB is the most central actor with the highest betweenness centrality by far; it has access to most of the other actors involved in SSS. In addition, CMWSSB is an independent agency and works toward overall sanitation provision for the city; it is best suited to perform the role of monitoring individual SSS. Further, since CMWSSB itself is required to report to TNPCB about its own treatment performance, TNPCB could be the ultimate custodian of the monitoring database and capable of performing the final verification audits of SSS performance. This function suits TNPCB's limited organizational capacity.
In the planning process of CWIS projects, it is important to involve all stakeholders present (Narayan and Luthi, 2019). In this particular case of governance of SSS, actors such as CRRT, who advocate for SSS and for the restoration of urban water bodies in the city, are often not included in the planning. Similarly, CMDA, which is responsible for zoning and approval of all construction plans including those of SSS, does not even feature in conventional stakeholder analysis. This is also evident from the lack of connections between international organizations involved in SSS projects and CRRT/CMDA. Such agencies can be powerful allies when forming coalitions to create policy shifts or simply to help support the planning of SSS in CWIS projects.
SNA can also inform many other aspects of WASH research and practice, such as the important role of consultants and private companies in setting up SSS, as seen by their betweenness centrality, or the limited connections international organizations have with state and national level actors in SSS governance (visible in the network graphs in Figures 5A-D). These all have a direct effect on the governance of this sector. They are deeper insights which other methods, such as stakeholder analysis, often fall short in bringing to light.

Comparing Small Scale Sanitation in Mega and Secondary Cities
Although there is no standardized definition for the boundary of a city, the administrative jurisdiction, built-up area and degree of economic and social interconnectedness together provide a delineation of what a city is. Mega cities are, however, clearly defined as urban agglomerations with a population of more than ten million (UN DESA, 2016). Secondary cities are more complicated to describe, as they are contextually defined in terms of population, functionality, connectivity and hierarchy. However, at large, these are cities with a population that is between 10 and 50% of the largest city in the country, and that contribute significantly to the regional and subnational economies (Roberts, 2014).
In India, cities are classified under several systems by the revenue departments, census agencies, central ministry of urban development and individual state governments (Nandi and Gamkhar, 2013). At the national level, the Class system and Tier system are popular, and they classify cities by population and economic contribution. They are, however, inconsistent with international terminology and vary even between each other. Therefore, in our analysis henceforth, international definitions are followed. Mega cities have a population of more than 10 million, and secondary cities are those with a population of at least one million that feature among the top five in the economic hierarchy of the state.
Therefore, Chennai and Bangalore, with populations of 10-11 million each, feature as mega cities, whereas Coimbatore and Mysore, with populations of 1-3 million each (UN DESA, 2016) and by virtue of their positions in the respective state hierarchy, feature as secondary cities. There are several reasons for choosing to study these four cities. Among the five mega cities in India, Chennai and Bangalore were most comparable by size and demography. The states of Tamil Nadu and Karnataka, to which they respectively belong, have dedicated and progressive sanitation policies. Within the two states, the respective secondary cities of Coimbatore and Mysore were chosen due to high data availability from past projects. By reducing inherent variability in this way, the analysis could focus better on the key differences with respect to sanitation.
In the sanitation sector, especially within India, the differences between rural and urban contexts (O'Reilly and Louiss, 2014; Chaudhuri and Roy, 2017) and the characteristics of small towns have been previously explored (Sundaravadivel and Vigneswaran, 2001; Singh et al., 2015). However, there has been no study to date of the differences between mega and secondary cities in the WASH context. There are considerable differences in their institutional set up, funding availability, community engagement, urbanization and presence of SSS (Table 6) that are worth exploring (footnote 8). These differences are important in planning for CWIS, which aims to contextually determine sustainable sanitation interventions (Lüthi and Narayan, 2018). Since the governance landscape, business ecosystem, stakeholder involvement and local knowledge vary significantly between these two types of cities, accounting for these differences in the planning and design stage of sanitation systems, especially in SSS, augurs well for their success and sustainability.

Relating SNA Measures to the Differences Identified
The network graphs (Figures 5A-D) and their related measures (Table 1) that result from the SNA can be usefully related to some of the differences between mega and secondary cities with respect to sanitation, particularly SSS (Table 6). Other differences, however, are beyond the scope of SNA. The discussion below focuses on three key differences that relate to SNA.
Firstly, the differences in the institutional set up are clearly visible, as the number of actors involved and their respective positions in the network graph vary. Sanitation in mega cities is governed by a dedicated utility, while sanitation in secondary cities is often governed within the municipal corporation itself. This is clearly seen through the central actors in the network graphs (Figures 5A-D), where the utilities of Chennai and Bangalore (CMWSSB/BWSSB) assume the central positions, whereas in Mysore and Coimbatore, they are replaced by the municipal corporations (MCC/CMC), along with a larger role for the parastatal agencies (TWAD/KUWSDB). Similarly, due to the limited capacity available for SSS planning in secondary cities, consultants and private companies end up playing a larger role (see Figures 5C,D).
Secondly, community engagement is another key difference between mega and secondary cities. In the former, a higher number of non-governmental organizations (NGOs) and resident welfare associations (RWAs) are reported; yet, the quality of engagement with the citizens is relatively lower when compared to the secondary cities. One plausible explanation offered by experts is the larger number of migrants moving to mega cities for job opportunities, who have a significantly weaker connection with the governance of the cities. By comparison, residents who have spent a majority of their lives in secondary cities have a greater motivation for better governance and infrastructure. Studies have suggested that the sense of belonging among migrants toward a new city, their past experiences, and the broader narrative in place affect their involvement in urban governance (McDuie-Ra, 2012; Scholten et al., 2017; Wessendorf, 2017). This aspect cannot be clearly deduced from the present network graphs, since the quality of the relations was not accounted for in this analysis. Nevertheless, SNA as a tool has the scope to do such an analysis and can represent the quality of relations through the thickness or shades of color of the connections.
Thirdly, the overall sanitation situation in the two secondary cities has been found to be considerably better than that of the two mega cities, as seen in the results of the "Fecal Waste Flow Diagram" (also called "SFD") assessments (Eawag, 2019). The national level survey on cleanliness, which includes fecal waste and solid waste management, has placed Mysore and Coimbatore in the top 50, whereas Chennai and Bangalore are ranked 61 and 194 (MoHUA, 2019). However, Chennai, along with Bangalore, was consistently ranked outside the top 100 in past editions. The SNA for these four cities can contribute to the explanation of this diagnostic. Mega cities have issues regarding coordination and overlapping jurisdictions, which the network graphs have visually revealed, with multiple actors (Utility, Municipal Corporation, Pollution Control Board and City Development Authority) involved in SSS governance and implementation, yet having limited connections between them. This causes issues in sanitation governance and leads to slower funding cycles, even though mega cities are closer to power centers. The overall graph density further gives an insight into the relatively poorly connected actors in mega cities compared to the marginally better connected actors in secondary cities (Table 4).

DISCUSSION
The above results indicate that SNA can bring out useful information and new perspectives for WASH governance that other methods miss. SNA can also corroborate key qualitative evidence, while allowing for a systematic comparison of the governance networks in different cities.
The validation method itself goes beyond the WASH sector and can be applied in any situation where the reliability of network data is low. It is particularly useful when data reliability is low due to poor response rates; it helps validate incomplete and desk-based SNAs, which was found to be the case in the initial attempt at carrying out a conventional SNA.
The results also reveal that a simple SNA, such as the present case, has limitations in terms of the differentiating factors that can be analyzed between mega and secondary cities. Yet, this limitation can be significantly overcome. There is scope for SNA as a tool to become more complex, and to account for the quality, strength and formality of connections by weighting the relationships and representing them using the thickness, patterns and color shades of the edges connecting nodes (e.g., Brandes and Wagner, 2004).
The reconciliation procedure in the validation methodology relies on the researcher having inherent case knowledge and places emphasis on their judgement. Although the procedure is systematic, the replicability of results is uncertain, as in any other qualitative method. Because the reconciled data is a binary matrix of relations, each judgement call translates into the presence or absence of a tie, which raises the risk of low replicability. This can be mitigated if the reconciliation is based on statistical measures of centrality or on a Bayesian approach, the results of which could then be represented as weighted edges. The size of nodes, which currently represents centrality, could also be altered to represent other factors, such as perceived importance, size of organization, power, interest, or any other factor that the research would benefit from representing.
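One simple way of turning several binary accounts of a network into weighted edges can be sketched as follows. This is not the reconciliation procedure used in this paper, and it is neither a centrality-based nor a Bayesian reconciliation; it is only an illustrative assumption in which the weight of a tie is the share of sources that report it. The source names, actor labels and the function tie_weights are all hypothetical.

```python
def tie_weights(reports):
    """Share of sources reporting each tie, keyed by unordered actor pair.

    reports maps a source name to the set of (actor, actor) ties it reports.
    """
    counts = {}
    for ties in reports.values():
        # Treat ties as undirected and count each pair once per source.
        for pair in {frozenset(tie) for tie in ties}:
            counts[pair] = counts.get(pair, 0) + 1
    n_sources = len(reports)
    return {pair: count / n_sources for pair, count in counts.items()}


# Hypothetical reports (not the study data) from three sources.
reports = {
    "desk_study": {("Utility", "Regulator"), ("Utility", "Residents")},
    "internal_expert": {("Utility", "Regulator"), ("Regulator", "Residents")},
    "external_expert": {("Regulator", "Utility"), ("Utility", "Residents")},
}

weights = tie_weights(reports)
for pair, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(sorted(pair), round(weight, 2))
```

Weights of this kind could be drawn as edge thickness in the network graph, so that ties on which all sources agree stand out from ties reported by a single source.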
It is important to use SNA in tandem with other methods to derive relevant conclusions that are complementary. SNA as a standalone method risks being simplistic, with little context sensitivity. Depending on the research question, SNA combined with stakeholder analysis, qualitative interviews, focus group discussions, stakeholder workshops, discourse analysis, etc., could deliver deeper insights. This has been shown throughout the results, which use contextual information from qualitative interviews and document analysis to strengthen various arguments, such as the larger role of the private sector in driving SSS in secondary cities. Furthermore, additional useful questions could be asked based on the network data, involving more advanced statistical tools. For example, Exponential Random Graph Models (ERGMs) (Cranmer et al., 2016; Fischer and Sciarini, 2016) and similar models allow for inferences on the factors associated with network ties between two actors. Relying on such methods could, for example, reveal whether actors exchange information mainly due to their ideological similarity, or due to being part of the same institutional arena. Based on such results, concrete measures could be taken to strengthen network relations among a given set of actors in the entire network.
Therefore, SNA has the potential to be a powerful tool in the WASH sector, especially when planning for Citywide Inclusive Sanitation (CWIS), which involves participation of all stakeholders in order to provide equitable and context-appropriate solutions. The results of an SNA, along with a stakeholder analysis, thus add value to the initial step of planning: a diagnostic study of sanitation governance in the select city. SNA as a process is just as valuable as the results, since it allows for the identification of marginalized stakeholders who are part of the sanitation governance, not just by the researcher but also by the survey participants themselves (Valente et al., 2015; Hauck et al., 2016). SNA as a process, as proposed in this paper, is enriching for the participants as well, since it uses techniques of knowledge co-production which engage the local actors in social learning (see Schröter et al., 2018). Such a tool is important in the urban WASH sector, especially in low- and middle-income countries such as India, where the complexity of stakeholders involved is immense. This could help the planning for CWIS become inclusive even at the local level closest to implementation. It could identify actors who could potentially act as policy entrepreneurs or form advocacy coalitions to bring about policy shifts (Ingold, 2011).
The differences between mega and secondary cities that are presented also significantly help in planning for SSS in particular. Lack of monitoring leads to poor operation and maintenance, which then leads to poor performance of systems, and ultimately results in failure of SSS, as shown in India (Davis et al., 2019; Ulrich et al., 2019). The present SNA has been shown to identify the actors who are best suited to carry out the long-term monitoring of SSS. Although WASH governance is not rigid and can be adaptable (Rosenqvist, 2018; Chandragiri et al., 2019), an actor's position and connections can be used to explore its functional potential and to identify which actors are best placed to perform certain functions: central actors for information diffusion and overall influence, peripheral actors for support functions, cliques for collaboration, etc. Such nuanced and visual information will be a useful addition when seeking to strengthen governance by using stakeholder participation tools in local scale systems, such as The Governance Spectrum and Role play Scenarios (Mitchell and Ross, 2016), or could form the basis for action research using participatory design games, as used in the study of governance of community-managed sanitation services in Indonesia (Rosenqvist, 2018).
Further research is necessary to understand the limits of using SNA for the WASH sector, and of the validation methodology presented. The proof of concept tested in this article has fewer than 15 actors in each of the four cities. The feasibility of the usage and validation could be tested for larger networks, where the nodes are not institutional actors but individual actors, in cases directly involving implementation of CWIS interventions.

CONCLUSION
The paper proposes SNA as a useful tool for the WASH sector, especially in planning for CWIS. It provides deeper insight into the stakeholders involved in governance situations, such as decentralized wastewater treatment. Apart from visually representing the actors and the exchange of information between them, SNA has been shown to be useful for comparing contextual differences between different cases, such as SSS governance in mega and secondary cities.
The validation procedure helps to overcome the problem of low response rates in the gathering of network data, which results in incomplete SNA and leads to unreliable network graphs and centralities. The problem of incomplete or desk-based SNA, which is frequently present in research in the WASH sector of low- and middle-income countries, can be overcome through the use of the proposed validation methodology. The novel combination of insiders and outsiders with expert knowledge balances the biases and widens the perspective of the SNA.
The proof of this concept is tested in four mega and secondary cities in India (Chennai, Bangalore, Coimbatore and Mysore) in the context of the governance of decentralized wastewater treatment. Using Chennai as an example, the use of SNA to provide fine-grained insights, such as overall network densities, actor centralities, and the functional suitability of actors to perform monitoring, has been illustrated. This, combined with the inferences from qualitative analyses, shows that the SNA can corroborate a few key differences between mega and secondary cities with respect to SSS governance, their institutions, community engagement, funding availability and the overall sanitation situation. These differences are important considerations to be discussed when planning and designing CWIS projects for such cities.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
According to Eawag Ethical Review of Projects involving human subjects, this was deemed minimal risk. All participatory data collected was through verbal consent and fully anonymised.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

FUNDING
Initial network graphs were made in the 4S project, funded by Bill and Melinda Gates Foundation. All subsequent research costs were funded internally by Eawag. Open Access publication costs were covered by Lib4RI.