Representative Farm-Based Sustainability Assessment of the Organic Sector in Switzerland Using the SMART-Farm Tool

The agricultural sector faces serious environmental, social and economic challenges. In response, there has been a proliferation of labels and certifications aiming to ensure minimum farm sustainability performance. Organic agriculture (OA) is a prominent example, having received substantial research attention relating to agronomic and environmental performance. While international OA movements are evolving to include broader sustainability aspirations, limited research exists on the social and economic performance of OA. To address this, we conducted a representative farm-based assessment of the Swiss organic sector to evaluate its contribution to sustainability across a wide range of themes based on the FAO Sustainability of Agriculture and Food Assessment (SAFA) Guidelines. We assessed 185 farms using the Sustainability Assessment and Monitoring RouTine (SMART) Farm Tool, chosen through stratified random sampling by farm type and agricultural zone. The results indicate that the Swiss organic sector makes a substantially positive contribution to sustainability, with average scores for theme goal achievement of 62% (Good Governance), 77% (Environmental Integrity), 70% (Economic Resilience), and 87% (Social Well-being). A set of 45 influential indicators (28 for plant production/mix farms and 30 for livestock farms) was selected based on the ability to explain variance (using Principal Component Analysis) and importance for goal achievement. The indicator sets explained a large amount of variation (ca. 70% for both farm types) and revealed a snapshot of management topics relevant to sustainability performance across the sector. These covered socio-political engagement, emissions to air and water, biodiversity, animal welfare, profitability, vulnerability, product quality, local economy, capacity building, and workplace risks.
The spread of results across the sample, and comparisons to secondary data (literature and official statistics), revealed the importance of both well-studied issues (e.g., wide spread of energy consumption, variable yield levels/stability, local value chain dynamics) and more novel insights (e.g., strong political engagement, variable price premiums, lacking social security of farming families, insecure land tenure). We propose these topics as a basis for deeper analysis, designing improvement measures and conducting comparative research. This would bring much-needed breadth into the typically narrow debate surrounding the relative merits of OA.


INTRODUCTION
The food and agriculture sector is facing multiple global sustainability challenges. These include the environmental threats of climate change, land degradation, nutrient mismanagement, biodiversity loss, and resource depletion (Green et al., 2005; Tilman et al., 2011; Newbold et al., 2016; Poore and Nemecek, 2018). Socially, economically and culturally, the agricultural sector also faces severe challenges, such as poor labor conditions, work overload, environmental conflict, waning profitability, and threatened traditional practices (Binder et al., 2010; Rulli et al., 2013; Janker and Mann, 2018). The globalization of modern food systems means landscapes and actors are connected over vast distances via flows of goods, services, and information along global value chains (Lenzen et al., 2012a). Decisions and policies made by one set of actors along a value chain (e.g., consumers, producers, and policy makers) can have far-reaching and unpredictable local and global consequences for sustainability (Rice, 2007; Lenzen et al., 2012b; Schaffartzik et al., 2014). The multi-dimensional, interconnected nature of the challenges facing global food systems leads to overarching governance problems of designing policies and incentives to bring food systems within planetary boundaries (Rockström et al., 2009; Steffen et al., 2015).
Concerns over the sustainability of modern agriculture have given rise to a multitude of sustainability assessment approaches across social, environmental, and economic dimensions. These assessments cover specific themes (biodiversity, climate change, labor conditions, well-being) through sets of suitable indicators (Bockstaller et al., 2009; Singh et al., 2009; Binder et al., 2010; Schader et al., 2014). Approaches generally adopt an explicit normative framework (theory-driven set of well-defined criteria) and assessment structure (system boundaries, indicators, and aggregation method). The term "Sustainability Assessment Tool" (SAT) can be used to refer to a framework, method and indicators combined in some form of standard protocol or software implementation. Existing SATs are commonly used for farm-level assessments, and have received considerable attention in the literature (Marchand et al., 2014; Schader et al., 2014; de Olde et al., 2016, 2018; Arulnathan et al., 2020; Coteur et al., 2020). A frequently employed method in SATs for combining information from different dimensions (social, economic, environmental) is indicator-based Multi-Criteria Assessment (MCA). This allows a wide scope (e.g., all three sustainability dimensions) but a restricted level of detail due to a trade-off between scope and precision (Schader et al., 2014). Examples of research and commercial SATs using MCA methods include "Response-Induced Sustainability Evaluation" (RISE; Grenz et al., 2009), the "Sustainability Monitoring and Assessment RouTine" (SMART) Farm tool (Schader et al., 2016), the "Public Goods" tool (Gerrard et al., 2011) and "Indicateurs de Durabilité des Exploitations Agricoles" (IDEA; Zahm et al., 2008).
Product certifications can be considered one basic form of sustainability assessment (Schader et al., 2014). They aim to guarantee a minimum sustainability performance based on defined criteria, monitored through checklists of diverse compliance indicators (covering various sustainability dimensions). While certifications can inform consumers and other decision-makers about the sustainability performance of consumption choices, they also lead to confusion given the massive number of private and public standards and labels currently lining supermarket shelves (the label rating website "Ecolabel Index" lists 458 active sustainability labels and certifications globally; http://www.ecolabelindex.com/ecolabels/). To address this, SATs can explore sustainability claims and evaluate impacts of certifications and standards. This can provide valuable information for all parties concerned (e.g., fact checking for consumers, independent verification for decision-makers, and hotspotting risk areas for the standards/certification bodies themselves).
The organic agriculture (OA) label, and its associated principles, standards, and country-specific laws, represents one of the most widespread voluntary environmental sustainability standards in the agricultural sector. Whilst primarily restricting the use of chemical inputs and fertilizers at a practice level, the standards are rooted in a precautionary ideological perspective on agriculture not simply as biomass production but rather as the management of a multifunctional agroecological system (van der Werf et al., 2020). As interest has grown in broader sustainability, organic standards have also evolved to include wider ambitions across dimensions. National standard-setting organizations have played an instrumental role. For example, in Switzerland, the federation of Swiss organic farmers (the "Bio Suisse" label) has adopted the FAO SAFA guidelines in its rules and regulations with the goal of enhancing the sustainability of certified producers (Bio Suisse, 2020, Article 1.6, p. 49). Internationally, the umbrella organization International Federation of Organic Agriculture Movements (IFOAM-Organics International) has adopted the ambition of "Organic 3.0," which aims to achieve improvements across all pillars of sustainability (Arbenz et al., 2016).
Within agricultural research, the benefits and disadvantages of OA in comparison to conventional production have received extensive attention. When compared to conventional production, the performance of OA hinges on a set of assumptions, such as the functional unit chosen for comparison, yield differences, the theme and dimension assessed, local contextual factors and the development of consumption patterns and demand (Meier et al., 2015; Seufert and Ramankutty, 2017; van der Werf et al., 2020). Less common are studies that focus purely on the OA sector (i.e., outside of a comparative focus with conventional farming). Such studies are helpful for guiding the further evolution of the standard and the design of measures to address hotspots. This study aims to fill this gap with a novel and comprehensive perspective on the entire Swiss OA sector. It employs a Multi-Criteria Assessment (MCA) using the SMART-Farm Tool to investigate the sustainability performance of organic farms across 21 sustainability themes. The research aims were twofold: (i) to assess the overall performance of the OA sector in Switzerland across sustainability dimensions; and (ii) to identify key indicators and related farm-level management topics that determine performance (i.e., to form the basis of further research, monitoring and improvement efforts).

Farm Sustainability Assessment
In recognition of the multifunctional nature of agriculture, the Food and Agriculture Organization (FAO) developed the SAFA guidelines (Sustainability Assessment of Food and Agriculture systems) as a normative framework to guide sustainability assessment of agricultural firms, covering 21 themes and 58 subthemes (FAO, 2014). The SAFA guidelines attempt to provide a holistic and universally relevant sustainability framework made up of nested dimensions, themes and sub-themes. For each sub-theme, there is a goal definition for an agricultural business, taking into account its limited area of influence via procurement, management and sales decisions. For each sub-theme, suggested indicators (qualitative and quantitative) are provided with good and bad performance examples along a 5-point scale ranging from 0 to 100% (FAO, 2014).

SMART-Farm Method and Software Tool
Farms were assessed with the Sustainability Monitoring and Assessment RouTine (SMART) Farm Tool (Schader et al., 2016) version 4.0.1. This method operationalizes the SAFA framework (FAO, 2014) through indicators that impact on each of the 58 SAFA subthemes. SMART uses a "degree of goal achievement" (DGA) MCA approach for each indicator and sub-theme, also referred to as "degree of goal fulfillment," "distance to target," or "distance minimization" (Diaz-Balteiro et al., 2017). The DGA of each SAFA subtheme is expressed as a percent (0% = no achievement and 100% = full goal achievement) and is based on a set of indicators that are aggregated using a simple weighted arithmetic mean. The indicator weights used in the aggregation reflect the relative importance ("impact") that a change in the indicator rating will have in achieving the sub-theme goal. Indicators may interact with multiple sub-themes simultaneously. For example, water withdrawal for irrigation may improve crop yields and stability of supply, but could also increase water scarcity and damage to aquatic ecosystems. Thus, weights are specific to the interaction of indicator and sub-theme. Indicator weights were developed in an international Delphi process involving over 60 experts from different scientific backgrounds. SMART-Farm has user-friendly software and database storage on a central server. The SMART-Farm Tool is registered with the Resource Identification Initiative under RRID:SCR_018197 (Bandrowski et al., 2016).
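The weighted aggregation described above can be illustrated with a minimal sketch. It is not the SMART-Farm implementation: the indicator names, ratings, and weights below are hypothetical, and the sketch assumes that every rating is already oriented so that a higher value means better goal achievement for the sub-theme in question.

```python
def degree_of_goal_achievement(ratings, weights):
    """Weighted arithmetic mean of indicator ratings (0-100%) for one sub-theme.

    ratings: indicator name -> rating in percent for one farm
    weights: indicator name -> impact weight of that indicator on this sub-theme
    Indicators without a rating for the farm are left out of the mean.
    """
    shared = [name for name in weights if name in ratings]
    total_weight = sum(weights[name] for name in shared)
    if total_weight == 0:
        raise ValueError("no weighted indicators available for this sub-theme")
    return sum(ratings[name] * weights[name] for name in shared) / total_weight


# Hypothetical farm ratings (percent); all names and numbers are illustrative.
ratings = {"irrigation_withdrawal": 40.0, "soil_cover": 80.0, "yield_stability": 100.0}

# Weights are specific to each indicator/sub-theme pair, so the same
# indicator can carry a different weight in different sub-themes.
weights_by_subtheme = {
    "Water Withdrawal": {"irrigation_withdrawal": 3.0, "soil_cover": 1.0},
    "Stability of Supply": {"irrigation_withdrawal": 1.0, "yield_stability": 2.0},
}

dga = {
    subtheme: degree_of_goal_achievement(ratings, weights)
    for subtheme, weights in weights_by_subtheme.items()
}
# dga["Water Withdrawal"] is (40*3 + 80*1) / 4 = 50.0
# dga["Stability of Supply"] is (40*1 + 100*2) / 3 = 80.0
```

Keeping the weights in a mapping per sub-theme mirrors the point made in the text that one indicator can influence several sub-themes with different strength.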

Representative Farm Sampling
We sampled farms under a project titled "Representative sustainability assessment of Bio Suisse organic farms under the SAFA guidelines of the FAO," in partnership with Bio Suisse, the federation of Swiss organic farmers (www.bio-suisse.ch). The Bio Suisse standard is stricter than the basic legal organic requirements of the Swiss government (Swiss Organic Farming Ordinance 910.181; see www.blw.admin.ch for more details) or the European Union (Council Regulation 834/2007). It contains more ambitious criteria, including a mandatory whole-farm approach, lower limits for concentrate feed in the diets of ruminants (10% vs. 40%) and promotion of on-farm biodiversity (Bio Suisse, 2020). The vast majority of organic farms in Switzerland (6,144, or 96% of the total 6,348 in 2016; Bio Suisse, 2016) are certified under the Bio Suisse label called "Bio Knospe" ("Organic Bud" in English). A summary of the main differences between the Bio Suisse and EU organic regulations is presented in the supplementary information (Supplementary Table 1). Using population data from Bio Suisse, we took a representative sample of 185 Organic Bud farms (3% of the national total in 2016) by a combination of stratified random sampling and additional targeted sampling of high-variation groups. We aimed for overall representation of the Swiss OA sector in order to make general statements on sustainability performance. The stratification criteria were farm type (reflecting the type of production systems employed) and Swiss agricultural zone. The farm typology was taken from Schnyder et al. (2003), and is presented in the Supplementary Information (Supplementary Table 2). This detailed classification was aggregated to four basic farm types (cattle livestock, other livestock, mixed production, and plant production), which were further aggregated to the simplified classes of livestock (LS) and plant production/mixed (PM) for certain analyses.
Agricultural zones are based on climate, altitude, and land use considerations, ranging from lowland valleys to Alpine transhumance zones (summer grazing). They are part of Swiss land use planning (Directive SR 912.1; see www.blw.admin.ch for more details). The detailed zones were grouped into three broad classes for the analysis: lowland; hilly and mountain zone I; and mountain zones II-IV.
Farm sampling was carried out from March 2015 to October 2017. An initial stratified random sample of 165 farms aimed to mirror patterns in production systems across zones in the total population. After initial contact by phone, farms unwilling or unable to conduct an assessment were randomly replaced by farms of the same farm type in the same zone. Potential positive selection bias was monitored by asking the reasons for declining an assessment. After carrying out SMART-Farm assessments on these 165 farms, an additional 20 farms were sampled from farm strata (production system/zone combinations) that showed notable variability in the results of the assessment (via a visual comparison of score distributions along SAFA subthemes). The results of the sampling (farms per farm type and zone) are presented in Supplementary Table 3.
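The first sampling stage and the replacement of declining farms can be sketched as follows. This is an illustrative sketch only: the proportional allocation rule, the function names, and the toy population are assumptions, and the later targeted sampling of high-variation strata is not covered.

```python
import random
from collections import defaultdict


def stratified_sample(farms, total_n, rng):
    """Draw a proportional stratified random sample by (farm type, zone).

    Returns the chosen farms and, per stratum, a shuffled reserve list that
    can supply replacements. Rounding of quotas means the realised sample
    size can differ slightly from total_n in general.
    """
    strata = defaultdict(list)
    for farm in farms:
        strata[(farm["type"], farm["zone"])].append(farm)
    chosen, reserves = [], {}
    for key, members in strata.items():
        quota = round(total_n * len(members) / len(farms))
        shuffled = rng.sample(members, len(members))  # random order, no repeats
        chosen.extend(shuffled[:quota])
        reserves[key] = shuffled[quota:]
    return chosen, reserves


def replace_decliner(farm, chosen, reserves):
    """Swap a declining farm for a random farm of the same type and zone.

    If the stratum has no reserve farms left, the farm is simply dropped.
    """
    key = (farm["type"], farm["zone"])
    chosen.remove(farm)
    if reserves[key]:
        chosen.append(reserves[key].pop())


# Hypothetical population: 60 lowland dairy farms and 40 mountain arable farms.
farms = [{"id": i, "type": "dairy", "zone": "lowland"} for i in range(60)]
farms += [{"id": 60 + i, "type": "arable", "zone": "mountain"} for i in range(40)]

sample, reserves = stratified_sample(farms, 10, random.Random(42))
replace_decliner(sample[0], sample, reserves)  # the first farm declines
```

Because the reserve list of each stratum is already in random order, taking the next farm from it is equivalent to a fresh random draw from the farms not yet contacted in that stratum.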
Of the 185 assessed farms, an average of 17 farms per production system were sampled, with greater sampling of suckler cattle (25 farms), dairy cattle (23 farms), and mixed finishing (22 farms) systems. The least sampled farm types were arable (eight farms), pig and poultry (13 farms), horse/sheep/goat (14 farms), and mixed dairy/arable (14 farms). In terms of geography, the farms were located predominantly in the lowest agricultural zone (83 farms), with declining numbers at higher altitudes. The final sample broadly represented the larger population of organic farms in Switzerland (Supplementary Table 4), for which data were obtained with permission (data sharing contract Nr. 140450) from the Swiss Federal Office for Statistics (Bundesamt für Statistik, 2013). However, some systems were over/underrepresented by an average of 6% deviation from their actual frequency (range 0-18% deviation). A summary of sampled farm characteristics according to simplified farm type is presented in Table 1. A map of sampled farms by type and size is presented in Figure 1.

Dimension Reduction and Indicator Selection
SMART applies a large number of indicators (339 in the version used in the study) depending on the farm context (Schader et al., 2016). For this study, 298 indicators were applied as relevant to the sampled farms. Indicators can be split into generic and specific indicators. Generic indicators are applied to all farm types regardless of context, whereas specific indicators are context-dependent (e.g., production system, farm type, geographic location). For plant production and mixed farms, 186 indicators were applied to 95% or more of farms. For livestock farms, the figure was 213 indicators. In order to identify key indicators and related management practices, we used an indicator selection process based on (i) high inherent importance for the SAFA theme ("importance weights") and (ii) a high ability to explain variability in the data ("variance weights"). The importance weights are fixed in SMART and based on a Delphi process with over 60 experts in the field of agricultural sustainability. The variance weights were derived through Principal Component Analysis (PCA), which is a commonly applied method of dimensional reduction. PCA constructs Principal Components (PCs) that represent linear combinations of the input data to maximize the explanation of variance. It is frequently used to find patterns in high-dimension datasets where variables are correlated with each other (Venables and Ripley, 2002, p. 301-330; Abdi and Williams, 2010).
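To make the idea of variance-maximizing linear combinations concrete, the following sketch computes classical principal components by power iteration with deflation, using only the Python standard library. It is a didactic illustration under simplifying assumptions (classical rather than robust PCA, a tiny hypothetical data set, a fixed number of iterations), not the procedure used for the analysis in this study.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a farms x indicators table."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix.

    Power iteration fails if the start vector is exactly orthogonal to the
    leading eigenvector; the uneven start vector makes that unlikely.
    """
    p = len(matrix)
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # nothing left to extract
            break
        vec = [x / norm for x in nxt]
    length = math.sqrt(sum(x * x for x in vec))
    vec = [x / length for x in vec]
    value = sum(vec[i] * matrix[i][j] * vec[j] for i in range(p) for j in range(p))
    return value, vec


def pca(rows, n_components):
    """Return [(share of total variance, loading vector), ...] per component."""
    cov = covariance_matrix(rows)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    components = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        components.append((value / total, vec))
        # Deflation: remove the variance captured by this component.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return components


# Hypothetical ratings: the first two indicators are perfectly correlated,
# the third varies very little.
rows = [[0, 0, 50], [20, 10, 52], [40, 20, 50],
        [60, 30, 48], [80, 40, 50], [100, 50, 50]]
components = pca(rows, 2)
explained = [share for share, _ in components]
```

In this toy example almost all variance lies along the direction shared by the two correlated indicators, so the first component alone explains nearly all of it.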
PCA was applied to the unweighted matrix of indicator ratings (i.e., standardized values from 0 to 100%). Prior to the analysis, all indicators with zero variance (i.e., where the ratings are identical across farms) were removed. These indicators are deemed important contributors to overall sustainability performance in the SMART method, but they contribute nothing to differentiating farms from each other, and are thus not statistically relevant. A second filtering step removed indicators with more than 5% missing data. For the remaining indicators, missing data were imputed using a regularized iterative PCA algorithm proposed by Josse and Husson (2012). This algorithm first imputes missing values as fitted values generated through a PCA of the complete data. The imputed values are sequentially adjusted until the estimated parameters of the complete PCA converge with the fitted data (i.e., the original estimations of parameter mean, variance, and distribution are preserved). The regularization component adjusts the imputation based on the amount of noise in the data, thus reducing the influence of overfitting (which is a problem of predictions corresponding too closely to the fine structure, i.e., noise, of data). The imputation was implemented in the R package "missMDA" (Josse and Husson, 2016).
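The two filtering steps that precede imputation can be sketched as follows. The table layout (indicator name mapped to a list of ratings, with None marking a missing value) and the indicator names are assumptions made for illustration; the imputation itself is not reproduced here.

```python
def filter_indicators(table, max_missing_share=0.05):
    """Keep indicators that vary across farms and have few missing ratings.

    table: indicator name -> list of ratings (percent), None = missing.
    An indicator is dropped if all observed ratings are identical (zero
    variance) or if more than max_missing_share of its ratings are missing.
    """
    kept = {}
    for name, ratings in table.items():
        observed = [r for r in ratings if r is not None]
        if len(set(observed)) <= 1:
            continue  # zero variance: cannot differentiate farms
        missing = len(ratings) - len(observed)
        if missing > max_missing_share * len(ratings):
            continue  # too many gaps to impute reliably
        kept[name] = ratings
    return kept


# Hypothetical ratings for 40 farms.
table = {
    "equal_pay": [100] * 40,                             # identical everywhere
    "soil_sampling": [0, 100] * 18 + [None] * 4,         # 10% missing
    "energy_use": [20, 60, 90, 40] * 9 + [50, 70, 30, None],  # 2.5% missing
    "price_premium": [10, 30, 50, 70, 90] * 8,           # complete and variable
}
kept = filter_indicators(table)
```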
PCA is very sensitive to anomalous observations, which can bias estimation of the principal components and reduce the explanatory power of the principal components (in particular the initial ones). To account for this, we applied a method of "robust" PCA, which seeks to identify principal components that are less influenced by outliers. To do this, we used the ROBPCA method developed by Hubert et al. (2005), which combines two related approaches: (i) the use of a robust covariance estimator in place of the classic covariance matrix, which is suitable only for relatively small sample sizes, and (ii) "projection pursuit" techniques, which are less effective but suitable for large datasets. The ROBPCA approach first uses the latter to reduce the size of the dataset and then the former to derive the final robust PCs. Part of this involves identifying three different kinds of outliers: (i) those that lie on the same plane as the PCs but have extreme values (having a high "robust score distance"; SD), (ii) those that are orthogonal to the PCs (having a high "orthogonal distance"; OD), and (iii) those with a mix of both (high SD and OD values). Further details on the approach can be found in Hubert et al. (2005). Following the robust PCA, the variance explained by consecutive dimensions (PCs) and indicators was visualized using scree plots and contribution plots for PCs and indicators, respectively.
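The two distances used to distinguish the kinds of outliers can be sketched as follows, following the definitions in Hubert et al. (2005): the score distance measures how extreme an observation is within the PC subspace, and the orthogonal distance measures how far it lies from that subspace. The sketch assumes the robust centre, orthonormal loadings, and eigenvalues have already been estimated; the values below are hypothetical, and the cutoffs used to flag outliers are omitted.

```python
import math


def outlier_distances(x, centre, loadings, eigenvalues):
    """Return (score distance, orthogonal distance) for one observation.

    loadings: list of k orthonormal direction vectors spanning the PC subspace
    eigenvalues: variance along each of those k directions
    """
    centred = [xi - ci for xi, ci in zip(x, centre)]
    # Coordinates of the observation inside the PC subspace.
    scores = [sum(c * l for c, l in zip(centred, vec)) for vec in loadings]
    score_distance = math.sqrt(
        sum(t * t / lam for t, lam in zip(scores, eigenvalues))
    )
    # Part of the observation that the PC subspace reproduces.
    fitted = [sum(t * vec[j] for t, vec in zip(scores, loadings))
              for j in range(len(x))]
    orthogonal_distance = math.sqrt(
        sum((c - f) ** 2 for c, f in zip(centred, fitted))
    )
    return score_distance, orthogonal_distance


# Hypothetical one-dimensional PC subspace in a three-indicator space.
centre = [0.0, 0.0, 0.0]
loadings = [[1.0, 0.0, 0.0]]
eigenvalues = [4.0]

in_plane = outlier_distances([6.0, 0.0, 0.0], centre, loadings, eigenvalues)
off_plane = outlier_distances([0.0, 0.0, 5.0], centre, loadings, eigenvalues)
mixed = outlier_distances([6.0, 0.0, 5.0], centre, loadings, eigenvalues)
# in_plane is (3.0, 0.0), off_plane is (0.0, 5.0), mixed is (3.0, 5.0)
```

The three example observations correspond to the three kinds of outliers listed in the text: extreme within the plane of the PCs, orthogonal to it, and a mix of both.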
The initial PCA results using the entire dataset showed clear clustering of farms along the second PC by simplified farm type. Livestock farms were predominantly on one side of the axis, whereas plant production and mixed farms were on the other side (Figure 2). To address this, we repeated the robust PCA and all further steps of dimension reduction and indicator selection separately for these two farm groups: livestock (LS) and plant production/mixed (PM) farms. This distinction provided a much clearer insight into the key indicators influencing each broad group of farm types.

Selection and Weighting of Influential Indicators
For both farm groups (PM and LS farms), an identical process of indicator selection and weighting was used. The quality of representation of indicators by a given PC was quantified using the squared cosine (cos²) measure and its translation into a contribution score for each PC in percentage terms (%). The contribution of all indicators for a given PC adds up to 100%, and an indicator contribution larger than the average generally implies importance (Abdi and Williams, 2010). To select influential indicators, we multiplied the contribution score of an indicator by the variance explained by the particular PC to derive a measure of total variance explained for each indicator. We selected all indicators that explained at least 0.5% of the total variance through any of the first 10 PCs. Variance explanation for each indicator was then summed across all PCs and Z-normalized (i.e., the value minus the average, divided by the standard deviation) across indicators. These values were used as "variance weights" for the indicators. A second set of "importance weights" was taken from the expert judgements in the SMART-Farm Tool. These importance weights, and the anonymized primary data underpinning them, are described in Schader et al. (2019). These SAFA subtheme weights were translated into theme weights by summing them up and Z-normalizing them. Therefore, indicators that influenced multiple nested subthemes in a given theme were given a larger weight. The selected indicators were given final "combined weights" as the average of variance and importance weights, calculated for each SAFA theme and then normalized across all indicators. These were visualized using heat maps and used to rank a final set of indicators for each farm type. The farm management topics reflected by the selected indicators were then classified as a basis for further interpretation of good and bad practice.
Frontiers in Sustainable Food Systems | www.frontiersin.org
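The selection and weighting steps described in this section can be sketched for a single SAFA theme as follows. The sketch makes several assumptions that the text leaves open: Z-normalization is applied across the selected indicators only, the sample standard deviation is used, and the final normalization across themes is omitted. All indicator names and numbers are hypothetical.

```python
import statistics


def z_normalise(values):
    """Value minus the mean, divided by the (sample) standard deviation."""
    numbers = list(values.values())
    mean = statistics.mean(numbers)
    sd = statistics.stdev(numbers)
    return {name: (value - mean) / sd for name, value in values.items()}


def select_and_weight(contributions, explained, importance,
                      threshold=0.5, max_pcs=10):
    """Select influential indicators and compute combined weights.

    contributions: one dict per PC, indicator -> contribution in percent
                   (the contributions of each PC sum to 100)
    explained:     percent of total variance explained by each PC
    importance:    indicator -> expert-based importance weight for one theme
    """
    pcs = range(min(max_pcs, len(explained)))
    per_indicator = {
        name: [contributions[pc][name] / 100 * explained[pc] for pc in pcs]
        for name in contributions[0]
    }
    # Keep indicators explaining at least `threshold` percent via any PC,
    # then sum their variance explanation across PCs.
    selected = {name: sum(v) for name, v in per_indicator.items()
                if max(v) >= threshold}
    variance_w = z_normalise(selected)
    importance_w = z_normalise({name: importance[name] for name in selected})
    return {name: (variance_w[name] + importance_w[name]) / 2
            for name in selected}


# Hypothetical example with two PCs and four indicators.
explained = [20.0, 10.0]
contributions = [
    {"a": 50.0, "b": 30.0, "c": 2.0, "d": 18.0},
    {"a": 10.0, "b": 20.0, "c": 3.0, "d": 67.0},
]
importance = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 2.0}

combined = select_and_weight(contributions, explained, importance)
# Indicator "c" explains at most 0.4% of total variance through any PC
# and is therefore not selected.
```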
All statistical analyses and visualizations were performed in R (vers. 3.6.3, R Project for Statistical Computing, RRID:SCR_001905) using RStudio (vers. 1.2.5033, RStudio, RRID:SCR_000432). The analysis was implemented in RStudio's RMarkdown script format, which integrates analysis, reporting and export functions for highly reproducible research reports (Baumer and Udwin, 2015). The source datasets and RMarkdown script (including an automated report) required for all main analyses presented in the paper are publicly available (see Data Availability Statement Section). The section of the study that involved human participants was performed in accordance with all relevant institutional and national ethical guidelines. Approval by an ethics committee was not required in accordance with Swiss law. Informed consent was obtained from respondents in accordance with section 32 of Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation).
FIGURE 2 | Robust PCA biplot of farms (symbols) and indicators (arrows). The plot shows farms largely grouping by farm type (left and right side of Dim2). Several indicators also align with the direction and magnitude of the first two dimensions (longer arrows aligned with the axes that break from the group at the center of the plot).

Overall Sustainability Performance
Across the sample, overall performance per SAFA theme ranged from 23 to 100% of goal achievement (Figure 3). Average median theme values per dimension were 62% for Good Governance (s.d. = 17%), 77% for Environmental Integrity (s.d. = 9.6%), 70% for Economic Resilience (s.d. = 7%), and 87% for Social Well-being (s.d. = 6%). The average performance of the sampled farms ranked in the "Best" category (81-100%) for Social Well-being and the "Good" category (61-80%) for the other three dimensions. Farms achieved median scores in the "Best" category (81-100%) for eight themes (Participation, Water, Animal Welfare, Fair Trading Practices, Labor Rights, Equity, Human Safety and Health, and Cultural Diversity), in the "Good" category (61-80%) for 10 themes (Rule of Law, Atmosphere, Land, Biodiversity, Materials and Energy, Investment, Vulnerability, Product Quality and Information, Local Economy, and Decent Livelihoods), and in the "Moderate" category (41-60%) for three themes (Corporate Ethics, Accountability, and Holistic Management). In general, lower scores were observed in the dimensions of Good Governance and Economic Resilience, where the lower quartile of performance scores fell into the "Limited" category (21-40%) for the themes of Accountability, Holistic Management, and Local Economy. Performance scores were compared across farm types, showing an overall similar spread of performance for the majority of themes (Figure 3). Interquartile ranges of farm performance were non-overlapping in only two themes: Land (mixed farms performed lower than cattle livestock) and Animal Welfare (plant production performed better than cattle livestock). Otherwise, interquartile ranges always overlapped across farm types, despite some substantial differences in median scores (in particular, in the themes of Equity, Water, and Biodiversity) or a large spread of scores (for cattle livestock and plant production farms in Local Economy and for all farm types in Equity and Decent Livelihoods).
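The mapping from goal achievement scores to the rating categories used above can be sketched as follows. Two points are assumptions rather than statements from the text: the label for the lowest band (0-20%), here taken from the SAFA rating scale, and the treatment of non-integer scores, which are assigned to the band below the next whole-number boundary. The farm scores are hypothetical.

```python
import statistics


def rating_category(score):
    """Map a goal achievement score (0-100%) to its rating category."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score > 80:
        return "Best"        # 81-100%
    if score > 60:
        return "Good"        # 61-80%
    if score > 40:
        return "Moderate"    # 41-60%
    if score > 20:
        return "Limited"     # 21-40%
    return "Unacceptable"    # 0-20%; label assumed, not named in the text


# Hypothetical farm scores for two themes.
theme_scores = {
    "Human Safety and Health": [88, 93, 97],
    "Accountability": [35, 48, 55],
}
median_category = {
    theme: rating_category(statistics.median(scores))
    for theme, scores in theme_scores.items()
}
# The dimension averages reported in the text fall into these categories:
dimension_category = {
    "Good Governance": rating_category(62),
    "Environmental Integrity": rating_category(77),
    "Economic Resilience": rating_category(70),
    "Social Well-being": rating_category(87),
}
```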
In the top-scoring theme of Human Safety and Health (93% median score), the most important indicators as reflected by the expert-based weights related to the use of pesticide active substances of known toxicity (acute and chronic toxicity, particularly via inhalation) based on the Pesticide Action Network (PAN) database. Unsurprisingly, such substances are nearly absent on organic farms, with 148 farms in the sample not using any active substances, and the rest using up to eight substances. We observed limited use of organic pesticides with broad (non-specific) or known toxic effects (e.g., pyrethrins and potassium bicarbonate). Other important indicators in this theme include those related to the storage and disposal of hazardous substances, use of protective gear, fertilizer contamination risks, and overall knowledge of farm risks. In the theme of Fair Trading Practices (92% median score), important indicators included the perceived strength of supplier relationships, stated social criteria in the procurement of farm inputs and their sourcing from countries without problematic social conditions according to the Business Social Compliance Initiative (Amfori BSCI, 2019). In the theme of Equity (median score of 90%), farm performance was slightly more variable. The most important indicators related to equal pay for similar work (rated uniformly positive across farms), support for disadvantaged workers (applied to 68 farms with workers, of which 34% were rated positively), incidents of harassment in the workplace (five minor incidents and no major incidents reported) and social security for the partner in case of divorce, invalidity or death (75% reported positively).
The worst-performing themes were from the governance dimension. For Accountability, important indicators included whether the farm accounts for external environmental costs (through economic valuation or tracking environmental impacts per unit of turnover; no farms rated positively), the presence of a written sustainability management plan (only three farms rated positively) and the public availability of a sustainability report (only one farm rated positively). In the theme of Holistic Management, important indicators were the presence of professional agricultural accounts (all farms rated positively), conducting a sustainability assessment in the past 5 years (only 6 farms rated positively) and making a verbal commitment to sustainability (137 farms rated positively). For the theme of Corporate Ethics, important indicators were having a written commitment to sustainability (13 farms rated positively), a risk assessment of safety hazards (163 farms rated positively), and the use of soil samples to determine fertilizer requirements (19 farms rated positively).

Dimension Reduction With PCA
Removing indicators with zero variance (59 and 63 indicators for PM and LS farms, respectively), and those with more than 5% missing data (121 and 83 indicators for PM and LS farms, respectively) led to an input dataset for the PCA of 121 indicators for PM farms and 148 indicators for LS farms. Dimensional reduction via PCA was high, with the first five dimensions explaining ca. 66% of PM farm variance and 70% of LS farm variance. The first dimension explained 18% (PM farms) and 20% (LS farms). Indicator contributions to PCs were relatively modest, ranging up to 5% across all PCs for the best indicators (Supplementary Figures 1, 2). Indicator selection based on contribution to the total variance explained (at least 0.5% contribution through any of the PCs) led to a final selection of 28 indicators for PM farms and 30 indicators for LS farms (45 indicators in total, with some shared by both farm types). This final set of indicators explained a cumulative 68% (PM farms) and 70% (LS farms) of the total variance of the respective datasets.

Selection and Analysis of Influential Indicators
The interaction between influential indicators and SAFA themes was investigated using combined indicator weights (i.e., the normalized average of variance and importance weights). These were visualized using heat maps (Figure 4), which illustrate the main themes that are impacted by the selected set of indicators. These key indicators are useful as a basis for improvement measures because they combine high variability across the dataset (i.e., both good and bad performance is represented) with high importance for goal achievement (i.e., a high impact on sustainability performance). The indicators were grouped into broad farm management topics. While each indicator had impacts across multiple themes, to aid interpretation they were then grouped according to the SAFA theme where they had the highest combined weight (Table 2). Finally, the unrated, raw responses of the four key categorical and numeric indicators that feature prominently in the Discussion (Section) were visualized using bar plots and histograms, respectively (Figures 5, 6). The full set of influential indicators reflected diverse farm management topics spanning all sustainability dimensions. Across both farm types (PM and LS farms), only one indicator related primarily to the governance dimension, while 17 related to the environmental dimension, 21 to the economic dimension and six to the social dimension (Table 2). A description of all 45 selected indicators and their rating systems is provided in Supplementary Table 5.
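One possible reading of the combined weight and the theme assignment is sketched below. The exact normalization is not specified in the text, so the choices here are assumptions: each weight set is scaled by its maximum, the two are averaged, and the combined weight is set to zero where an indicator has no importance weight for a theme. All names and numbers are hypothetical.

```python
import numpy as np


def combined_weights(variance_w, importance_w):
    """Average max-scaled variance and importance weights per indicator and theme."""
    v = np.asarray(variance_w, dtype=float)       # one value per indicator
    imp = np.asarray(importance_w, dtype=float)   # indicators x themes
    v_scaled = v / v.max()
    imp_scaled = imp / imp.max()
    combined = (v_scaled[:, None] + imp_scaled) / 2
    # Assumption: no combined weight where the indicator does not affect the theme.
    return np.where(imp > 0, combined, 0.0)


# Three hypothetical indicators, two hypothetical themes.
variance_w = [2.0, 1.0, 4.0]
importance_w = [[0.5, 0.0],
                [1.0, 0.25],
                [0.0, 0.5]]

weights = combined_weights(variance_w, importance_w)
# Each indicator is grouped under the theme with its highest combined weight.
primary_theme = weights.argmax(axis=1)
```

The `weights` matrix corresponds to the cells of a heat map, and `primary_theme` to the grouping by theme of highest combined weight.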


Sustainability Performance of the Swiss Organic Sector
Our findings showed an overall high performance of the sampled farms across SAFA themes (Figure 3). This implies the Swiss OA sector as a whole makes a substantially positive contribution to sustainability, as measured by the SMART-Farm Tool. The overall good performance is largely due to strong regulatory standards within the Bio Suisse label and Swiss legal requirements. For example, in the top scoring theme of Human Safety and Health, key indicators were related to agrochemical use and general safety practices (protective gear, farm risk assessment etc.), with a high level of compliance across Swiss (organic) farms, leading to good scores overall. Likewise, for Fair Trading Practices (median score of 92%), the strong sourcing requirements of organic regulations (particularly for imported inputs and product) meant key indicators of social risk, forced labor etc. were well-represented. In the theme of Equity (median score of 90%), there was more room for farm agency, as indicators of equal pay, discrimination, harassment etc. are not necessarily regulated by law/standards, but we observed generally high performance in these indicators as well.
The worst performing themes (Accountability, Holistic Management, Corporate Ethics) were from the governance dimension and shared important indicators drawn from the corporate sustainability reporting literature. Their effective absence from small to medium family-run farms is not surprising given the costs and risks involved. Clarkson et al. (2008) highlight that voluntary sustainability reporting by businesses is limited by resource availability (information production costs) and vulnerability caused by potentially negative disclosures (proprietary costs). In other sectors, reporting on (environmental) sustainability issues remains very rare, estimated at about 5% of all companies (Bjørn et al., 2017). It is also disputed whether sustainability disclosure actually translates into better performance or simply reframes existing operations or minor internal adjustments (Clarkson et al., 2008). In the US agri-food sector, reporting is neither coordinated nor standardized across firms, and existing initiatives relate primarily to internal activities of limited wider benefit (Ross et al., 2015). Across the packaged food industry, sustainability reporting disproportionately favors (primarily social) initiatives that require the least structural change in a firm's operations (Shnayder et al., 2015). During workshops with farmers to present assessment results, participants questioned the relevance of several corporate governance indicators for family farms. Similar feedback was observed in applications of SMART-Farm in developing countries, where farming is dominated by smallholder non-commercial farms (Ssebunya et al., 2017, 2019; Winter et al., 2020). One option to address this would be to either omit such indicators or adapt them to the size and organizational context of a farm.
Farmers sampled in Uganda also suggested such indicators could be suitable for application at the "farmer group" level, relating to associations and cooperatives rather than individual farms (Ssebunya et al., 2017). A similar approach could apply to label and certification bodies. The Bio Suisse association has formally committed to sustainability improvements via an evolution of its rules and regulations (Bio Suisse, 2020, Article 1.6, p. 49). In particular, extended checks of multiple sustainability themes are applied to imported produce from high-risk areas (Bio Suisse, 2020, Ch. 5) and a wider strategy for sustainability improvements is being integrated into future standards. The need for action at the level of the individual farmer is thus questionable in this context.

Selected Indicators and Farm Management Topics
The PCA indicator selection process resulted in a set of 28 and 30 indicators for PM and LS farms, respectively. The process achieved successful dimension reduction whilst explaining a substantial amount of variance (10-fold reduction in indicators, ca. 70% of the variance explained by the final set). The indicators thus offer a good starting point for identifying important topics influencing sustainability performance in the Swiss organic sector (within the scope of SAFA and the SMART-Farm Tool). We grouped the indicators into 11 broad farm management topics, which cut across sustainability dimensions (Table 2), ranging from socio-political engagement of the farm manager to agrobiodiversity management and socio-economic vulnerability. They provide a snapshot of key on-farm conditions across the Swiss organic sector, which we briefly summarize per dimension below, highlighting examples of meaningful patterns in selected indicators that we relate to findings from the literature (indicator codes are provided in brackets, and a full description of each indicator question and rating system is provided in Supplementary Table 5).
In the governance dimension, our results show that almost one in five farm managers (17%) engaged in political activity to influence (agricultural) laws and regulations over the past year (ID 00057; Figure 5A). This compares well with monitoring data on voluntary engagement in the Swiss population at large. Based on data from 2019, active engagement in political parties or holding office extends to roughly six percent of the population (Lamprecht et al., 2020, p. 42), and up to 28% are active indirectly, such as through petitioning, contacting elected officials or supporting political organizations (Lamprecht et al., 2020, p. 119). This implies organic farms engage politically at a comparable rate to the population as a whole. In terms of social engagement outside the farm (ID 00075), we recorded a mean of 5.7 days per year and person (range 0-60, median = 1; Figure 6A). This is lower than the estimate for the general population, where official figures from 2016 indicate an average of ca. 19.5 days per person and year of institutional volunteer social engagement (BFS, 2016). This is perhaps due to time availability in a sector where a weekly working schedule is typically 55 h, 10 h more than the maximum of 45 h in other sectors. Although we lack data to compare OA with conventional farming, a higher level of socio-political engagement might be expected due to the history of OA as both a production system and a farmer-led movement (Luttikholt, 2007). OA rose from near insignificance a few decades ago to broad acceptance across Europe. This required an alliance between official politics of the state, market dynamics and, crucially, active civil society engagement by communities of farmers (Michelsen, 2001). Future development and application of such indicators (e.g., standardizing them to official data) should better assess their value in comparative analysis across farms and production systems.
This would improve coverage of farmers' wider interactions with society, an area of suggested improvement for existing sustainability assessment tools (Janker and Mann, 2018).
In the environmental dimension, a range of indicators were identified that covered topics of energy and water use, pollution, agrobiodiversity and animal welfare (Table 2). Most of these topics are well-covered in the literature comparing OA to conventional production (Schramski et al., 2013; Seufert and Ramankutty, 2017; Smith O. M et al., 2019). However, animal welfare has received less attention, despite promise as a key topic to represent OA in comparison to other systems (van der Werf et al., 2020). Our results demonstrate the usefulness of simple indicators that are quick to assess on-farm. Regarding energy indicators, renewable energy production (ID 00186) was distributed similarly across farm types, whereas electricity consumption (ID 00332) was useful for differentiating farm types, showing higher average values on PM farms (Figure 6B). Similar results emerge from total direct energy consumption (i.e., including solid and liquid fuels; see Table 1). This was largely due to a handful of crop farms with particularly high consumption. The selected agrobiodiversity indicators of rare breeds and grassland use intensity have previously been proposed as good proxies of farm-level genetic and species diversity across Europe (e.g., Herzog et al., 2012). Interestingly, the share of extensively managed grassland was higher in PM farms (with 19 farms reporting almost full extensive management), highlighting the relevance of this indicator to all farm types, notwithstanding differences in grassland area (LS farms had 3-4 times more permanent grassland than PM farms; Table 1). The indicator of alpine grazing (ID 00227) is particularly useful for capturing the off-farm benefits to species diversity in alpine pastures across multiple taxa, which is threatened by farm agglomeration and intensification (Kampmann et al., 2008; Marini et al., 2011).
Within the economic dimension, indicators reflecting traditional realms of productivity and profitability emerged, including farm viability (ID 00125), yield level (ID 00128_1), and price premiums (ID 00161) for LS farms. Reported price premiums, relative to conventional market prices, spanned a very large range, from slightly negative (two farms; LS farms only) to double the value (mainly PM farms; Figure 6C). This illustrates a significant role for farmer agency and/or local constraints in determining producer prices (e.g., through differentiated marketing strategies). Adequate organic price premiums are instrumental in covering production costs, with estimated breakeven premiums spanning 5-7% based on global crop data (Crowder and Reganold, 2015). Our findings indicate that a substantial proportion of farmers are operating under low premiums that could indicate economic vulnerability, which deserves further attention. Investment-related indicators included land ownership/tenure status (ID 00767) and the use of high-input hybrid cultivars in PM farms (ID 00247). For land ownership, a majority of LS farms, particularly cattle farms, reported insecure tenure (<10 y; generally leased land) rather than secure tenure (>10 y; owned land). This is in contrast to roughly equal shares for PM farms (Figure 5B). This distribution is sub-optimal, given that land ownership increases productivity of Swiss dairy farms through increased investment and technical change (Bokusheva et al., 2012).
Another focus of our selected indicators is on socio-economic vulnerability, where PM farms exhibited varying experience of major yield losses (ID 00095), reliance on external fertilizers (ID 00712), availability of alternative markets (ID 00623), availability of a replacement farm manager in case of inability to work (ID 00623), and planning of farm succession (ID 00124). Yield stability in particular is known to be lower in OA than in conventional production (Smith O. M et al., 2019). Our results confirm that the majority of PM farms experienced a major yield loss (>20%) within the past 5 years (Figure 5C). In terms of farm succession, our results reflect larger concerns in the agricultural sector, where a lack of succession planning represents a significant threat to farm productivity and investment (Mann et al., 2013).
Another key indicator of vulnerability is the social and economic security of spouses in farming families (ID 00456_5). This is a major issue across the agricultural sector in Switzerland, with an estimated 15,000 farms lacking social security for (predominantly female) spouses (BauernZeitung, 2020). This is roughly 30% of the 50,000 active farms in Switzerland (BLW, 2019). Under current proposals for subsidy reform, farms will only be eligible to receive direct payments if both marriage partners are socially insured in case of invalidity, death or separation (BLW, 2020). Our results point to a frequency of ca. 25% of organic farms (or 1,500 farms in total), showing slightly better performance than the national average. A further topic of product quality related to potential risks of contamination through veterinary products (antibiotics and hormonal treatments) in manure fertilizers (ID 00295 and ID 00613) and failure to meet product safety standards (ID 00170). Contamination through veterinary products is a major issue in agriculture (Grenni et al., 2018; Urra et al., 2019), but organic regulations already target common mitigation measures, such as preventative care, eliminating prophylactic use and promoting alternative treatments. As a result, antibiotic use and risks of resistance (alongside other toxic compounds) are much lower in OA compared to conventional production (Gomiero, 2018). Despite this, our results indicate that targeted measures aimed at the poorer performers could reduce the burden even further. Finally, local economy indicators stood out as influential in our analysis for LS farms, including the share of main inputs produced within 150 km of the farm (ID 00793) and the further on-farm processing of products (ID 00145). For procurement (ID 00793), a large share of mixed farms in particular could maximize local sourcing (Figure 6D), likely due to within-farm transfers between animal and plant production systems.
Further processing of all farm products occurred only at higher altitudes (data not shown), which may be due to typical alpine products (cheese, processed meats, herbs etc.) and differentiated marketing channels.
Finally, in the social dimension, the emerging topics concerned capacity development, workplace mechanization, and risks. In terms of capacity building, increased educational status has been linked to higher productivity of crop farms in Switzerland (Bokusheva et al., 2012). Our results indicate that PM farms invest actively in staff training (ID 00072), where farms primarily reported high levels of per-person training (>2 days/y), particularly for mixed farms (Figure 5D). Under the topic of workplace risk, two indicators of pesticide toxicity, acute toxicity (ID 00377_7) and toxicity via inhalation (ID 00377_75), were selected as influential. While overall risk levels were low (only 16 PM farms showing moderate to high risk levels, i.e., scores of 2 or 3), the issue does deserve attention. While OA forbids the use of synthetic pesticides, organic, plant-based alternatives also have harmful active substances, and a process of simple input substitution will not necessarily improve environmental outcomes (Bahlai et al., 2010; Smith and Perfetti, 2020; Turchen et al., 2020). Finally, the number of days of absence due to occupational injury was influential for all farm types, with an average value of 3.22 and a maximum value of 70 days per year and FTE (data not shown). The incidence of occupational injury requiring absence was very high, with almost half of all farms (48.6%) reporting some absence taken and 33.5% requiring more than 1 day of absence. According to insurance statistics from the primary industries (agriculture, forestry and fishery), 7,562 insured and non-insured cases of injury were recorded in 2017 from an estimated full-time workforce of 32,066 (Suva, 2014, p. 26-27). A further 3,700 cases can be added from part time farming (yearly average between 2012 and 2016; Suva, 2014, p. 48), giving a maximum incidence rate of about 35% across the primary sectors.
Provided only more serious cases are recorded in these statistics, our estimate of about a third of farm workers requiring significant (>1 day) absence due to occupational injury is in line with these statistics.
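The incidence figure quoted above follows from simple arithmetic on the reported case numbers. The short check below reproduces it; the variable names are illustrative, and dividing both full-time and part-time cases by the full-time workforce is what makes the result an upper bound (a "maximum" rate).

```python
# Figures as quoted in the text (Suva statistics for the primary industries).
full_time_cases = 7_562       # insured and non-insured injury cases
part_time_cases = 3_700       # yearly average from part-time farming
full_time_workforce = 32_066  # estimated full-time workforce

# Rate from full-time cases only.
base_rate = full_time_cases / full_time_workforce

# Upper bound: part-time cases are added to the numerator, but the
# denominator still counts the full-time workforce only.
max_rate = (full_time_cases + part_time_cases) / full_time_workforce

# Share of sampled farms reporting more than 1 day of absence (from the text).
survey_share = 0.335

print(f"base rate: {base_rate:.1%}, maximum rate: {max_rate:.1%}, survey: {survey_share:.1%}")
```

The survey value of 33.5% sits between the two rates. Note that the survey share is counted per farm while the insurance statistics are counted per worker, so the comparison is approximate.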

Conclusions
Organic agriculture is frequently typified as a precautionary system of production that prioritizes environmental sustainability, focusing on strengthening agroecosystem services (e.g., soil health, functional biodiversity) and the local circulation of resources, nutrients, and energy (Halberg, 2012; Niggli, 2015; van der Werf et al., 2020). While international OA standards recognize the social and economic dimensions of sustainability as additional foundations of the movement (Luttikholt, 2007), less is known about the performance of OA in these areas, and the viewpoint is not necessarily shared by all practitioners (Shreck et al., 2006). Existing research is biased toward agronomic and environmental aspects (Seufert and Ramankutty, 2017), with attention also paid to economic benefits for farmers, rural job creation, nutritional/health issues, and traceability for consumers (e.g., El-Hage Scialabba, 2013). Further research is required on issues such as labor conditions, worker wages, farm resilience and autonomy (Seufert and Ramankutty, 2017). Studies from the US and Spain on social conditions for workers have demonstrated only moderate performance (Shreck et al., 2006; Medland, 2016; Torres et al., 2016). Our research has attempted to contribute to these efforts to broaden the scope of OA research by presenting and analyzing a new dataset from the Swiss OA sector. We hope this will both help evolve OA standards and identify suitable areas for comparative research with other production systems. Using a selection of simple indicators, we have highlighted relevant sustainability topics that are both important for goal achievement of sustainability targets (i.e., SAFA themes) and variable across farms. At the same time, we emphasize that the results are neither broadly generalizable to other contexts, nor necessarily comparable to results generated with other tools (de Olde et al., 2016, 2017, 2018).
SMART-Farm is a "rapid sustainability assessment" tool (Marchand et al., 2014) aiming for broad coverage, communication and awareness raising among agricultural stakeholders. Additional research with more detail and better coverage of the identified topics is required to expand upon, validate, and refine our findings.

DATA AVAILABILITY STATEMENT
The anonymized farm data and analysis scripts used to generate the main results are available in the following repository: https://bitbucket.org/FiBL-Socioeconomics/curran_2020_biosuisse_frontsustainfoodsyst/src/master/.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
MS, CS, and LB designed the study. LB, RE, JB, VG, GL, and MC contributed to data collection, preparation and exploration. MC and GL analyzed the data. MC prepared the manuscript and all authors revised/edited and approved it.

FUNDING
Financial support for this study was provided by Bio Suisse (the federation of Swiss organic farmers) and the Gerling Foundation (Fondazione Gerling, CH-6652 Tegna). The work was based on the following project [unofficial translation from German]: Representative sustainability assessment of Bio Suisse organic farms under the SAFA guidelines of the FAO (Reference numbers: Bio Suisse 9615/FiBL 35112).

ACKNOWLEDGMENTS
The authors are grateful to Bio Suisse (the federation of Swiss organic farmers) and the Gerling Foundation for financing this research work. Furthermore, we thank the farmers who participated in the assessments for sharing information and data on their farms for the purpose of research. We thank Dr. Silvia Marton for valuable contributions during the project execution and initial data analysis. We also acknowledge the contribution of the Swiss Federal Office for Statistics for providing anonymized structural data, which greatly facilitated planning of the research. Finally, we thank two reviewers for constructive comments that improved the quality of the final manuscript.