The Setting Questionnaire for the Ayahuasca Experience: Questionnaire Development and Internal Structure

The growing interest in research on psychedelic consumption in naturalistic contexts, and on its possible medical and therapeutic benefits, requires assessment of the relationships between the substance, the individual who consumes it (set) and its context of use (setting). This study provides a novel measurement scale for the setting of ayahuasca consumption, the Setting Questionnaire for the Ayahuasca Experience (SQAE), and examines its psychometric properties. Construction of the scale began with a literature review, followed by interviews with 19 ayahuasca users from different backgrounds and with different levels of consumption experience, and an online survey for quantitative data collection (n = 2,994). Exploratory Graph Analysis (EGA) was used to investigate the questionnaire's dimensional structure in one half of the sample (n = 1,497), and multidimensional item response theory (MIRT) was used to compare the fit of the theoretical dimensions with the EGA-proposed dimensions in the independent other half (n = 1,497). EGA identified six dimensions, which corresponded partially to the theorized model (Leadership, Decoration, Infrastructure, Comfort, Instruction, and Social). The MIRT comparison found that the proposed theoretical model fit significantly better than the EGA model, providing support for the former (χ2/df = 1.967; CFI = 0.972; TLI = 0.969; RMSEA = 0.059; WRMR = 1.087). Our findings present evidence of validity for this instrument, justifying its use in future research on the influence of the setting during the ayahuasca experience. These findings may also provide a basis for expanding the settings investigated in the use of psychedelics in general.


INTRODUCTION
Studies with psychedelics have been steadily growing in the last two decades, with research centers in different countries investigating their effects and possible use as therapeutic tools in clinical environments (Johnson et al., 2019; Lawrence et al., 2021). There is also a growing interest in understanding how healing, self-empowerment, self-knowledge, and related processes occur with the consumption of psychedelics in naturalistic contexts (Labate, 2004; Winkelman, 2005; Luna, 2011; Gomes, 2013; Maia et al., 2020).
Beyond the pharmacological effects of the drug itself, other variables must be taken into account in addressing the total effect of this group of substances, the so-called set and setting (Hartogsohn, 2016; Haijen et al., 2018). Briefly defined, set involves variables related to the subject who is ingesting the substance, such as personality and life history, and setting refers to the culture, place and situation where the consumption occurs, including decoration and objects displayed, together with what other people are present, the interpersonal relationships established among the participants, what activities are being performed and the metaphysical beliefs shared among the group (Leary and Alpert, 1962; Zinberg, 1984; MacRae, 2001; Hartogsohn, 2016). The broader beliefs regarding psychedelics are particularly relevant with the beverage ayahuasca because of its characteristic setting features, which arise from its origins in indigenous ritual practices from the Amazon basin and South American religious syncretism (Labate, 2004; Luna, 2011).
Ayahuasca consumption settings have different forms that stem from different traditions, cultures and their adaptations. Within each tradition, there may be different arrangements adapted for different goals. A curing ritual, for example, may be organized differently from a celebration ritual by the same group (Gomes, 2013). Nevertheless, a typical ayahuasca setting is composed of a spiritual leader (vegetalista, mestre, curandero, shaman, "master," or "godfather") who, together with their helpers, supervises the consumption of the beverage by the participants and conducts the spiritual ritual in an appropriately decorated environment (Labate, 2004). The participants, in turn, are also part of the setting, since rituals are typically held in groups. Accommodations for people to sit, lie or stand on, such as chairs, cushions, hammocks or grass, may also vary, but are generally present and reported as influential (Pontual et al., in revision). Singing, chanting, dancing, smoke blowing and communicating with spirits are commonly found during ayahuasca rituals, and their presentation is part of an Amerindian cosmology and its beliefs about the presence and roles undertaken by spirits, plant spirits, and ayahuasca animals (Labate, 2004).
In spite of the importance of the quality of the acute psychedelic experience in determining the long-term outcomes of psychedelic experiences (Johnson et al., 2017; Roseman et al., 2018), there seems to be a lack of measurement tools available to evaluate the setting and its impact on the experiences of ayahuasca consumers and their outcomes (Pontual et al., in revision). This statement may be broadened to the study of the setting in the field of psychedelics in general, which, although highly reliant on psychometric instruments as a way of conducting its studies (Bouso et al., 2016), seems to be lacking appropriate and modern psychometric tools to evaluate the impact of settings. Perkins et al. (2021) addressed the relationship between traditional ayahuasca settings and possible therapeutic outcomes. For this, a statistical correlation was calculated between responses on standard psychological and health questionnaires among participants of different ayahuasca denominations. Kettner et al. (2021) developed and investigated the psychometric properties of a short instrument (eight items) to assess what the authors called Intersubjective Experience During Psychedelic Group Sessions, evaluating how participants of ayahuasca rituals related to each other during sessions and established social bonds among themselves. However, as the ayahuasca consumption setting is complex and involves other domains in addition to these, the development of more instruments is necessary.
The objective of the present study was to develop and validate a new multidimensional questionnaire with strong psychometric properties, appropriate for different ayahuasca intake contexts, that could assess the perception of the setting and be easily validated in other languages.

Design
Interviews were first conducted with 19 ayahuasca drinkers and group leaders from different backgrounds: Santo Daime (five participants), União do Vegetal (UDV) (four participants), Shipibo tradition (two participants), Neo-Shamanic (three participants), and mixed traditions (five participants). Participants from different backgrounds were invited in order to capture a broader range of perceptions of ayahuasca settings, with a variety of rituals and rules. These respondents also varied in their number of experiences, from fewer than five (three participants) to more than 500 (four participants). They were interviewed about the setting of their consumption and how they thought these setting features related to the nature of their personal experience. Based on these interviews and on the literature on setting (Zinberg, 1984; MacRae, 2001; Labate, 2004; Hartogsohn, 2016), thematic analysis (Braun and Clarke, 2013) and scale development guidelines (Pasquali, 1998; DeVellis, 2016) were used to elaborate 33 short items (see Supplementary Material 1) in six general dimensions: Leadership, with six items about the people conducting the ceremony; Infrastructure, with seven items about the facility; Instruction, with five items about information and guidance; Social, with seven items about the other participants; Comfort, with four items about body position; and Decoration, with four items about the place's ornamentation. To these, 15 additional descriptive questions were added about music played, activities performed, and presence of natural elements. All items were written as statements that avoided idiosyncratic, culturally specific and idiomatic expressions that can cause problems in cross-cultural translation, so that the items would allow easy translation and cross-cultural adaptation (Hambleton et al., 2004; Vijver and Matsumoto, 2011).
Items were submitted to a committee of five ayahuasca researchers, all native Portuguese speakers (one PhD in anthropology, one PhD in biology and three PhD candidates in psychology), who judged their content and pertinence to the six general dimensions using an agreement table in which items were positioned as rows and their proposed subscales as columns. Items were kept in the questionnaire if they achieved a kappa score greater than 0.8. Individual video conferences were then held with a sample of four ayahuasca drinkers from the lower educational level of the target demographic. These respondents were presented with the questionnaire and asked to read each item once, report whether its content was simple to understand, and explain their interpretation of it to the researcher. Items that did not achieve a perfect score across all participants were flagged for rewording.
Items included in the questionnaire were set up online, in Portuguese, using LimeSurvey version 1.01, on the University of Campinas server, in Brazil. The item order was randomized for each respondent, and respondents were asked to score each item on a Likert scale based on their last ayahuasca consumption: 1 - Strongly Disagree; 2 - Partially Disagree; 3 - Neither Agree nor Disagree; 4 - Partially Agree; and 5 - Strongly Agree. Invitations to participate were sent to members of ayahuasca churches, healing groups and online discussion communities, and were posted on social media.
Together with the setting questionnaire, there were descriptive questions and questions on demographics and on ayahuasca consumption habits and affiliations. A question about previous participation in the study was also added to avoid repeated data. Exclusion criteria were not completing all fields (missing data), completing the questionnaires in <5 min, which was considered insufficient time, or providing the same answer for all items, which was interpreted as invalid data. A total of 2,994 responses were considered valid and used for analysis, a sufficient number for accurate MIRT parameter estimates (Jiang et al., 2016).
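The three exclusion criteria can be sketched as a simple filter. This is only an illustrative sketch, not the authors' actual screening code: the function name, the data layout (a list of item scores plus a completion time in minutes) and the example responses are all hypothetical.

```python
from typing import Optional, Sequence

MIN_MINUTES = 5.0  # completion-time threshold stated in the text


def is_valid_response(item_scores: Sequence[Optional[int]], minutes_taken: float) -> bool:
    """Return True when a response passes all three exclusion checks."""
    # 1. Any missing field excludes the response.
    if any(score is None for score in item_scores):
        return False
    # 2. Completion in under five minutes is treated as insufficient time.
    if minutes_taken < MIN_MINUTES:
        return False
    # 3. The same answer on every item is treated as invalid data.
    if len(set(item_scores)) == 1:
        return False
    return True


# Hypothetical responses: (item scores, minutes taken)
responses = [
    ([5, 4, 5, 3, 4], 12.0),     # passes all checks
    ([5, 5, 5, 5, 5], 9.5),      # same answer for all items
    ([4, None, 3, 2, 5], 15.0),  # missing field
    ([4, 2, 3, 2, 5], 3.2),      # completed too quickly
]
valid = [r for r in responses if is_valid_response(*r)]
```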

Procedures of Exploratory Graph Analysis
The responses were randomly split into two halves, one for Exploratory Graph Analysis (EGA) and one for multidimensional item response theory (MIRT) analysis. EGA is a recently developed method from network psychometrics that has produced comparable or better accuracy in identifying dimensions than other more common methods (e.g., principal component analysis, factor analysis, and parallel analysis; Golino and Epskamp, 2017; Golino et al., 2020). EGA consists of identifying communities of items, interpreted as possible dimensions, when they are represented in a "regularized partial correlation network" using a walktrap algorithm that computes distances via random walks (Pons and Latapy, 2005). The items are presented in a network made of nodes representing variables (questionnaire items) and edges representing how they are connected. These connections are represented by penalized inverse covariances between variables, which removes spurious associations. As a result, only relevant inverse covariances remain, and items are visually organized and clustered more precisely according to their affinity to each other.
Correlations between items were first estimated by calculating a correlation matrix of all the variables and its inverse (the inverse variance-covariance matrix). The inverse covariance matrix is regularized through penalized maximum likelihood estimation to avoid overfitting. The least absolute shrinkage and selection operator (LASSO) was used to regularize the edges of the partial correlation network, a procedure in which a penalty [the lambda parameter (λ)] is imposed on the coefficients, which is sufficient to shrink some of the values to zero and thus remove them from the model. This absence indicates conditional independence and facilitates interpretation of the model, as the communities of items tend to cluster in a way that resembles a dimension, but without the need to load on a latent variable as would happen in an exploratory factor analysis. Because the regularized network has fewer correlations and is therefore sparser than the non-regularized network, the clustering of items on the network becomes more evident (Golino and Epskamp, 2017).
The degree of regularization is selected using the Extended Bayesian Information Criterion (EBIC) (Epskamp et al., 2018). The parameter adjusted through EBIC, the gamma hyperparameter (γ), determines the final number of edges that are retained in the network. This is useful to avoid overfitting of the model. The final design of the partial item correlation network is determined by a pairwise Markov random field model, more specifically the Gaussian Graphical Model (GGM). This model generates an undirected network based on the assumption that edges indicate a full conditional association between the two given nodes after conditioning on all other nodes in the network. The Fruchterman-Reingold algorithm is used to iteratively compute the optimal placement of nodes, with the most central nodes placed at the center and the least central nodes in the periphery. For the EGA calculations, R version 4.0.2 (R Core Team, 2020) was used together with the packages lavaan, semPlot, psych, ega, igraph, and qgraph (Csardi and Nepusz, 2006; Epskamp et al., 2012; Rosseel, 2012; Epskamp, 2015; Golino and Epskamp, 2017; Revelle, 2017).
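A random split of the 2,994 valid responses into two independent halves could look like the sketch below. The study's analyses were run in R; this Python version, the function name and the seed value are assumptions made purely for illustration.

```python
import random


def split_halves(n_responses: int, seed: int = 2021):
    """Shuffle response indices and return two non-overlapping halves."""
    indices = list(range(n_responses))
    # A seeded generator makes the (hypothetical) split reproducible.
    random.Random(seed).shuffle(indices)
    half = n_responses // 2
    return indices[:half], indices[half:]


# One half for the exploratory step (EGA), the other for the confirmatory step (MIRT).
ega_half, mirt_half = split_halves(2994)
```

Because each index appears in exactly one half, no respondent contributes to both the exploratory and the confirmatory analysis.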
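For readers less familiar with this procedure, the two quantities involved can be written in standard notation. The symbols are assumptions introduced here rather than taken from the source: S is the sample correlation matrix, K is the inverse covariance (precision) matrix with entries κ_ij, and λ is the LASSO penalty. Penalizing only the off-diagonal entries is one common convention.

```latex
% Partial correlation between items i and j, given all other items
\rho_{ij \mid \text{rest}} = -\frac{\kappa_{ij}}{\sqrt{\kappa_{ii}\,\kappa_{jj}}}

% LASSO-penalized maximum likelihood estimate of the precision matrix
\hat{K} = \arg\max_{K \succ 0}
  \Big[ \log\det K - \operatorname{tr}(S K) - \lambda \sum_{i \neq j} \lvert \kappa_{ij} \rvert \Big]
```

An entry κ_ij shrunk to exactly zero gives a partial correlation of zero, which is why a removed edge can be read as conditional independence between the two items.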
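The criterion is commonly written as follows in the network psychometrics literature. The notation is an assumption introduced here: L is the log-likelihood of the estimated network, E the number of nonzero edges, N the sample size, and P the number of nodes (items).

```latex
\mathrm{EBIC} = -2L + E \log(N) + 4\,\gamma\,E \log(P)
```

With γ = 0 the expression reduces to the ordinary BIC; larger values of γ add a heavier penalty per retained edge and therefore favor sparser networks.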

Procedures of Multidimensional Item Response Theory Analysis
The second half of the sample was used for a confirmatory cross-validation of the originally proposed theoretical factors and of the solution based on the communities of items found in the EGA. A multidimensional item response theory approach was used for this analysis, namely the Samejima graded response model, with parameter estimation performed by the weighted least squares mean- and variance-adjusted (WLSMV) estimator. This approach is similar to a confirmatory factor analysis, but instead of assuming one linear regression for each item loaded on a latent variable, it estimates one function for each Likert-scale category of each item. This approach was chosen instead of a classic confirmatory factor analysis because of the multivariate non-normality of the data, which suggested that the observed variables should be treated as ordinal categorical, something enabled by item response theory, instead of the linear approach used in classical confirmatory factor analysis (Samejima, 1997; Osteen, 2010).
The degrees to which the observed data followed the theoretical model, as well as the model suggested by the item communities found in the EGA, were evaluated by several goodness-of-fit indices: chi-square divided by degrees of freedom (χ2/df), Comparative Fit Index (CFI), Tucker-Lewis Index (TLI), Weighted Root Mean Square Residual (WRMR), and Root Mean Square Error of Approximation (RMSEA). The model was considered to have a good fit with values of χ2/df < 5 (Ullman and Bentler, 2003), CFI higher than 0.95, TLI higher than 0.95 (Hu and Bentler, 1999), WRMR lower than 1.5 (Hu and Bentler, 1999), and RMSEA lower than 0.05 (Browne and Cudeck, 1992). The quality of each item was evaluated in terms of discrimination, R-squared and residual variances. These indicators were not used as thresholds to determine item exclusions, but any items presenting low discrimination, a low R-squared coefficient and a high residual variance were scrutinized for their appropriateness and content validity.
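The idea of "one function for each Likert-scale category" can be illustrated with a minimal sketch of the graded response model for a single item on a single latent dimension. Several simplifications are assumed here and are not taken from the source: the logistic form of the cumulative curves (WLSMV-based estimation often implies a probit link instead), a unidimensional trait, and purely hypothetical parameter values.

```python
import math


def grm_category_probs(theta: float, discrimination: float, thresholds: list) -> list:
    """Category probabilities for one item under a logistic graded response model.

    `thresholds` must be in increasing order; K-1 thresholds give K categories.
    """
    # Cumulative curves P(X >= k): 1 for the lowest category, 0 beyond the highest.
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference between adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-category item (four thresholds), respondent at theta = 0.
probs = grm_category_probs(theta=0.0, discrimination=1.5, thresholds=[-2.0, -1.0, 0.0, 1.0])
```

The five probabilities sum to one, and raising theta shifts probability mass toward the higher response categories.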
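The cutoffs listed above can be collected into a small lookup for checking a fitted model. This is a sketch only: the dictionary keys, the function name and the example index values are hypothetical, and only the cutoff values themselves come from the text.

```python
import operator

# Cutoffs as stated in the text: (comparison, threshold)
CUTOFFS = {
    "chi2_df": (operator.lt, 5.0),
    "CFI": (operator.gt, 0.95),
    "TLI": (operator.gt, 0.95),
    "WRMR": (operator.lt, 1.5),
    "RMSEA": (operator.lt, 0.05),
}


def check_fit(indices: dict) -> dict:
    """Return, for each index, whether the value meets its cutoff."""
    return {
        name: compare(indices[name], cutoff)
        for name, (compare, cutoff) in CUTOFFS.items()
    }


# Hypothetical values, used only to show the call.
example = {"chi2_df": 2.4, "CFI": 0.96, "TLI": 0.94, "WRMR": 1.2, "RMSEA": 0.04}
result = check_fit(example)
```

Reporting each index separately, rather than a single pass or fail verdict, keeps it visible which criteria a model meets and which it misses.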

Reliability
Cronbach's alpha coefficients were calculated for the full questionnaire and its six subscales. Coefficients above 0.7 were considered acceptable (Streiner et al., 2015). In addition to the alpha, Guttman and McDonald coefficients, we also calculated individual reliability estimates based on the individual standard errors of measurement from the Samejima graded response model.
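As a reminder of what the alpha coefficient measures, the sketch below computes it from the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical; the study's own coefficients were obtained in R.

```python
from statistics import variance


def cronbach_alpha(rows: list) -> float:
    """Cronbach's alpha for complete data; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses from five respondents to a four-item subscale.
subscale = [
    [5, 4, 5, 4],
    [3, 3, 4, 2],
    [4, 4, 4, 5],
    [2, 1, 2, 2],
    [5, 5, 4, 4],
]
alpha = cronbach_alpha(subscale)
acceptable = alpha > 0.7  # threshold used in the text
```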

Ethics
The study was approved by the Research Ethics Committee of the University of São Paulo and an Informed Consent Form was presented to all participants (Authorization number 64130517.8.0000.5407).

RESULTS
During the development phase, two of the 33 elaborated items (items 02 and 28) did not achieve satisfactory agreement among the five researchers on the committee, obtaining kappa scores of <0.8, and were flagged for removal from the questionnaire. Among the semantic judges, drawn from ayahuasca drinkers of the lower educational demographic, only one item (item 12) was not fully comprehensible to a respondent after first reading and was reworded. All items are listed in Table 1, including removed items.
For the investigation of evidence of validity of the questionnaire, data were collected from 3,472 participants. After application of the exclusion criteria, a total of 2,994 responses were considered valid and used for analysis. Table 2 reports their demographics. Gender and age groups were well distributed in the sample, with the largest group being 31-40 years old (n = 934). The majority of the participants had completed a major or a professional education degree (81.4%). Regarding the number of experiences, the most frequent responses (51.2%) fell in the highest category, more than 100 experiences with ayahuasca. Participants who had their last experience in a União do Vegetal ritual were the largest group of respondents (53.6%).

SQAE Dimensions
EGA was conducted to explore the factor structure underlying the SQAE with half of the valid responses (n = 1,497). The EGA revealed six communities of items, which were, in overall terms, compatible with the six-factor theoretical model (Figure 1). The constructs according to the proposed theoretical model are: Social (S1 to S5), Leadership (L1 to L6), Decoration (D1 to D3), Comfort (C1 to C4), Infrastructure (I1 to I6), and Instruction (G1 to G5) (Figure 2). Figure 1 depicts a regularized partial correlation network of items (nodes) and their regularized partial correlations (edges). The thickness of an edge represents the degree of correlation, with positive correlations depicted in green and negative ones in red. A strong correlation also brings the respective items closer together. It is possible to observe the proximity of the items from the same proposed theoretical construct. The Leadership dimension took a central position, having more interconnections with other dimensions, especially with Instruction; in contrast, Comfort assumed a marginal position, with less correlation with other dimensions. Infrastructure item I2 was not well positioned and was removed after it also showed poor fit in the MIRT analysis. Figure 2 shows the originally proposed theoretical model and the model proposed by the walktrap algorithm. Nodes of the same color indicate common dimensionality, with the color of the edges representing positive (green) or negative (red) correlation, and their thickness representing their strength. The number of dimensions the algorithm proposes matches the proposed theoretical model, but it differs on the sixth dimension, splitting the Infrastructure dimension in two and grouping Leadership together with Instruction. It also disagrees on some items, especially central ones (Figure 2).
Taking the walktrap solution at face value as a model, both models presented fit indices in the multidimensional item response theory analysis used for confirmatory purposes (n = 1,497) that suggest good fit, with the theoretical model presenting superior results in all indices (Table 3). The internal consistency of the SQAE and its subscales is described in Table 4. The SQAE presented acceptable reliability coefficients, with the Social, Comfort and Leadership subscales also presenting acceptable internal consistency indices, and Infrastructure, Decoration and Instruction presenting scores below the 0.7 threshold. Table 5 presents a correlation matrix between the subscales of the SQAE. With the exception of Instruction and Leadership, which correlated highly (0.941), statistically suggesting a possible common dimensionality, almost all correlations presented a good coefficient, between 0.623 and 0.840. Figure 3 depicts reliability estimates for each individual participant in the study according to the proposed theoretical dimensions. Individual responses that frequently fell on the extreme option across multiple items (1 - Strongly Disagree on reversed items or 5 - Strongly Agree on direct items) showed decreased reliability compared with more moderate responses.

DISCUSSION
This study has described the processes used to develop the Setting Questionnaire for the Ayahuasca Experience (SQAE) as a tool to help researchers investigate the setting component of ritualistic ayahuasca consumption, and the properties of the data collected from a large Brazilian sample. The study then obtained evidence of validity for the SQAE based on its internal structure using exploratory graph analysis and multidimensional item response theory analysis. The semantic, expert and statistical analyses revealed that most items developed for the SQAE performed as expected, in alignment with the theoretical framework used for the SQAE construction. The study also demonstrated an acceptable level of reliability for the overall instrument and for two of its subscales, Social and Comfort, using internal consistency coefficients calculated for the sample. However, conditional reliability curves calculated using the individual standard errors of measurement from the multidimensional item response theory analysis showed that response diversity tends to decrease in respondents with a higher level of overall endorsement of the items of all subscales, homogenizing the responses and decreasing the reliability of the subscales with these data.
Importantly, both genders and the different age groups were well represented, which matters because perceptions of setting may differ between genders or age groups. Participants from the highest education groups were over-represented, likely because the data collection was online. The total sample included people with many experiences as well as those with little and medium experience, providing a view of the setting from a range of experience levels.
In comparison with other protocols used for factor extraction (e.g., Principal Component Analysis, Kaiser Rule, and Varimax rotation), EGA and MIRT can be considered modern and rigorous statistical analyses, and achieving these results with a large database can be considered positive evidence of validity. Some items and subscales indeed presented better psychometric indicators than others, but all items included were justified by theoretical thresholds or theoretical arguments. It is also possible to infer from the items' means (Table 1) that the sentences are worded in a format that encourages responses close to the extremes (1 - Strongly Disagree or 5 - Strongly Agree), leading to reduced individual reliability (Figure 3) and consequently lowering subscale reliability (Table 4). This frequency of extreme responses is not uncommon in studies that report response means with instruments in the field (Bouso et al., 2016) and could be scrutinized and improved in future versions of the questionnaire, by wording items on low-reliability subscales in a way that promotes more heterogeneous levels of endorsement.
Starting a psychometric investigation of a theme as broad as the setting is a complex venture, and many decisions had to be made. During the interviews, some influential aspects of the setting were brought to light by different participants with distinct points of view. For some participants, for example, leaders have a responsibility to ensure that participants are fully informed a priori about what to expect of the ritual and how to behave in different situations, whereas for others good leadership has more to do with spiritual and energetic matters than with formalities, which should be regarded as a different topic. These two subscales, Leadership and Instruction, were the closest in correlation among all six scales, something that can clearly be seen in the EGA and in the correlation matrix between the subscales. Another hard decision was the exclusion of the items D0 ("the place had characteristics in common with other environments that I frequent in everyday life") and S0 ("the other participants are similar to my friends") from the subscales Decoration and Social, respectively. Although these items were introduced based on interview reports, they were removed because they correlated strongly with each other, creating an undesirable new dimension, "Familiarity," which was dropped because it was not contemplated by the initially reviewed literature and the theory adopted. Although this is a new endeavor and there are no similar instruments for comparison, we believe that the SQAE can be paired well with other instruments that have recently been developed for use in psychedelic studies, such as the Emotional Breakthrough Inventory (EBI) (Roseman et al., 2019), the Challenging Experience Questionnaire (CEQ) (Barrett et al., 2016), recent versions of the Mystical Experience Questionnaire (MEQ) (Barrett et al., 2015; Schenberg et al., 2017), and the 5D-ASC (Studerus et al., 2010).
If the broadly accepted assumptions about the effects of set and setting are accurate, a positive measurement of the setting should correlate with a less challenging experience, a more mystical experience and higher levels of emotional breakthrough. If the scores on these instruments are intercorrelated, this should not be interpreted as redundant information; on the contrary, it should be considered valuable predictive information for the psychedelic experience and a guide on what to adjust to improve the chances of achieving the desired outcome of psychedelic use. To date, the combined EBI, MEQ, and CEQ model has been able to predict close to 20% of the variance in well-being changes after a psychedelic experience (Roseman et al., 2019), and we believe that the investigation of the setting with the SQAE can be the next step to improve this number.
We also believe that bringing modern techniques, such as EGA, network analysis and IRT, to psychedelic psychometrics can have a positive impact, making these assessments more comparable with other areas of the field that also use modern, high-standard techniques. One interesting direction for future research would be the use of cognitive diagnostic modeling to further investigate the internal structure of the SQAE, given the apparent within-item multidimensionality of some of the items.
The elaboration and investigation of the validity of the SQAE is the beginning of a long journey that will require use, improvements and future adaptations to extend its capacity to measure the various dimensions of setting and predict their influence on the psychedelic experience. For this, more data and broad use are needed. This was the premise of this work, so both in the elaboration of the items and in the statistical analysis, many precautions were taken at each step, testing validity and reliability, to provide foundations that can accommodate changes and future improvements.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Research Ethics Committee of the University of São Paulo. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
AP, CC-W, and LT devised the project and designed the study. AP and CC developed the theory and performed the computations. JR verified the analytical methods. AP took the lead in writing the manuscript. All authors provided critical feedback and helped shape the research, analysis, and manuscript.