The effect of liquid consistency on penetration-aspiration: a Bayesian analysis of two large datasets

Introduction Thickened liquids are commonly recommended to reduce the risk of penetration-aspiration. However, questions persist regarding the impact of bolus consistency on swallowing safety. The common practice of summarizing Penetration-Aspiration Scale (PAS) scores based on worst scores is a bias in prior analyses. The aim of this study was to examine the impact of liquid consistency on PAS scores using a Bayesian multilevel ordinal regression model approach, considering all scores across repeated bolus trials. A second aim was to determine whether PAS scores differed across thickener type within consistency. Methods We analyzed two prior datasets (D1; D2). D1 involved 678 adults with suspected dysphagia (289 female; mean age 69 years, range 20-100). D2 involved 177 adults (94 female; mean age 54 years, range 21-85), of whom 106 were nominally healthy and 71 had suspected dysphagia. All participants underwent videofluoroscopy involving ≥3 boluses of 20% w/v thin liquid barium and of xanthan-gum thickened barium in mildly, moderately and extremely thick consistencies. D2 participants also swallowed trials of slightly thick liquid barium, and starch-thickened stimuli for each thickened consistency. Duplicate blinded rating yielded PAS scores per bolus, with discrepancies resolved by consensus. PAS ratings for a total of 8,185 and 3,407 boluses were available from D1 and D2, respectively. Bayesian models examined PAS patterns across consistencies. We defined meaningful differences as non-overlapping 95% credible intervals (CIs). Results Across D1 and D2, penetration occurred on 10.87% of trials compared to sensate (0.68%) and silent aspiration (1.54%), with higher rates of penetration (13.47%) and aspiration (3.07%) on thin liquids. For D1, the probability of a PAS score > 2 was higher for thin liquids with weighted PAS scores of 1.57 (CI: 1.48, 1.66) versus mildly (1.26; CI: 1.2, 1.33), moderately (1.1; CI: 1.07, 1.13), and extremely thick liquids (1.04; CI: 1.02, 1.08). D2 results were similar. Weighted PAS scores did not meaningfully differ between thin and slightly thick liquids, or between starch and xanthan gum thickened liquids. Discussion These results confirm that the probability of penetration-aspiration is greatest on thin liquids compared to thick liquids, with significant reductions in PAS severity emerging with mildly thick liquids.


Introduction
Thickened liquids are commonly recommended as an intervention for people with dysphagia in whom airway invasion ("penetration-aspiration") of thin liquids is suspected or confirmed based on swallowing assessment.In the very first year of the Dysphagia journal, Coster & Schwarz introduced the concept of the "swallow safe bolus", described as a bolus with properties that enable its "unimpeded passage through the food pathway without aspiration, choking, or retention in the pharynx" (1).Subsequently, Curran & Groher (2) described the inclusion and exclusion of foods and liquids with specific characteristics in the "aspiration risk reduction diet" in a Veterans' Affairs hospital in New York, including the thickening of liquids to three different degrees, which were labelled nectar consistency, honey consistency and pudding consistency.Despite recommendations in textbooks that thickened liquids should be considered as an intervention of last resort, and that other compensatory interventions should be considered first as techniques for reducing penetration-aspiration (3), the literature suggests that thickened liquids quickly became one of the most recommended interventions for dysphagia (4).
The most-commonly cited research study exploring the efficacy of thickened liquids for reducing penetration-aspiration was led by Logemann & Robbins, often referred to as "Protocol 201".In the first phase of that study, a sample of 711 adults with diagnoses of Parkinson Disease and/or dementia were enrolled based on videofluoroscopic confirmation of at least one occurrence of aspiration across a series of three 3 ml boluses and three sips of thin liquid barium.These individuals were then asked to swallow up to 6 boluses each, including three 3 ml boluses and three sips of: (a) thin liquid using a chin-down posture; (b) nectar-like liquid barium; and (c) honey-like barium.The order of these intervention conditions was randomized across participants.The study results showed improved swallowing safety in 51% of participants, with the remaining 49% of participants continuing to aspirate in all three intervention conditions.Twenty-five percent of the sample showed improved swallowing safety with all three interventions, while 26% showed a best response, with the honey-like barium showing significantly lower aspiration rates compared to both the nectar-like barium and the chindown posture conditions.
Several systematic reviews in the past decade have explored the impact of bolus consistency on swallowing and the effectiveness of thickened liquids as an intervention for impaired swallowing safety (5)(6)(7)(8)(9).Although these reviews used different article inclusion criteria and explored slightly different questions, they concur in concluding that thicker consistencies of liquid are less likely to be aspirated than thin liquids (5)(6)(7).Whether long-term use of thickened liquids leads to reductions in downstream negative sequelae such as pneumonia remains unclear (7)(8)(9).
One recent study with results that diverge from the systematic review evidence was reported by Miles and colleagues (10).Unlike the majority of studies considered in the systematic reviews, Miles and colleagues used Flexible Endoscopic Evaluations of Swallowing (FEES) to evaluate swallowing safety in a prospective sample of 180 in-patients referred for FEES assessments at two urban hospitals.Worst airway protection status was classified for each participant per exam, based on the depth of airway invasion and whether or not a cough response was seen with material visualized on or below the true vocal folds for 5 ml and 50 ml challenges of thin and mildly thick liquid.The authors reported that although the mildly thick liquid trials showed an overall reduction in the frequency of aspiration compared to thin liquids, some patients who showed a cough in response to airway invasion on thin liquids failed to show a cough response to airway invasion with a thicker consistency of the same volume.
The question of whether a thicker consistency is effective in reducing penetration or aspiration may sound deceptively straight forward, but a number of methodological issues can influence the answer to this question, both in research and in clinical practice.These include the number of boluses that are tested for each consistency, the order of bolus presentation, control of potentially confounding factors such as bolus volume, and the handling of varying scores across repeated boluses of the same consistency.In research, additional considerations that vary across studies with respect to methodological rigor include rater reliability and blinding of raters to patient, consistency, order of the bolus within the protocol, and audio information in the recording.The most common reporting practice, both in research and in clinical practice, is to summarize performance within a given patient or research participant based on the worst swallowing safety status seen across boluses of a particular consistency (11,12).This practice ignores both the frequency and range in severity of penetration-aspiration events, inflating the contribution of single occurrences of more-serious airway invasion events (13,14).Furthermore, although most studies collect data for each bolus using the 8-point Penetration-Aspiration Scale (PAS) (15), it is common for the full scale scoring details to be reduced to 2-, 3-, or 4-levels prior to analysis and reporting.Unfortunately, there is no consensus regarding scale reduction practices and differences in choices hinder meaningful comparisons being drawn across studies (11,12).
One legitimate rationale for summarizing PAS scores for research analyses lies in prerequisite assumptions of common statistical tests.Traditional count-based approaches (e.g., chisquare, simple logistic regression) require that data for each participant are statistically independent.Therefore, these analyses are unable to include individual data points across repeated bolus trials of a given condition for a single participant, necessitating summarization (i.e., taking the maximum or modal score).In the presence of elevated within-participant variability [e.g.(16,17),] the summarization of PAS scores may not adequately represent swallowing function.While some may ignore this statistical assumption, its violation is known to produce highly inaccurate effect size estimates (18).Moreover, these types of analyses are unable to account for missing data, which can be common in PAS datasets, particularly when safety-motivated procedural stopping criteria result in termination of testing for a given consistency prior to collection of all planned repeated trials.Overall, approaches that summarize repeated trials within a participant result in a loss of information and detail.This issue is particularly concerning in studies exploring the physiological mechanisms behind penetration and aspiration, where the assumption of similar physiology across both safe and unsafe swallows from an individual classified as either an aspirator or a non-aspirator is unlikely to be valid.
In the case of repeated boluses in a swallowing protocol, multilevel models (i.e., hierarchical or mixed effects models) are a potential alternative to resolve the issue of non-independence.These models rely on a partial pooling approach that accounts for the nested, non-independent structure of the data via random effects.Their ability to incorporate multiple trials for each participant not only increases statistical power and improves the accuracy of effect sizes, but also reduces the frequency of false positive findings (19,20).One limiting factor of multilevel models is inability to achieve model convergence in a frequentist framework, which relies on maximum likelihood or restricted maximum likelihood methods for model fitting, due to small random effect variances or sample sizes.However, this issue can be easily resolved with a Bayesian framework by incorporating uncertainty (i.e., prior knowledge) before the data are analyzed.Given these benefits, Bayesian multilevel models have quickly become the standard analysis approach in many fields and offer richer analysis possibilities than traditional frequentist models (e.g., t-test, chi-square test, simple regression) across a wide range of outcomes.
In this paper, we use Bayesian multilevel ordinal regression models to examine patterns of airway invasion across different liquid consistencies.This is a secondary analysis of available bolus-level PAS data collected in three prior studies for which the second author served as the principal investigator.In contrast to previous published analyses of these datasets (either in full or for diagnostic subgroups within each project) (17,(21)(22)(23)(24)(25)(26), the primary analysis approach taken in this paper allows for consideration of repeated bolus presentations in the swallowing protocol and does not collapse PAS score data into binary categories of safe and unsafe.Our goal was to answer the following research questions: Q1: How do different liquid consistencies affect PAS scores?Q2: Among participants with airway invasion (PAS > Certain elements of data collection were common to all three of the source projects, namely collection of videofluoroscopy data at 30 frames per second, with multiple boluses per consistency, beginning with blocks of thin liquid and progressing to thicker consistencies.All barium stimuli were prepared in 20% w/v concentration according to standard recipes.For Projects 1a and 1b, the barium sulfate product used was Varibar ® Thin (Bracco Diagnostics).For the NIH dataset, the barium sulfate product used was E-Z-Paque ® powdered barium (Bracco Diagnostics).Thicker consistencies were prepared by adding commercial thickening agents to the thin barium recipe; for the thicker liquids, consistency was confirmed as falling into Levels 1 to 4 of the International Dysphagia Diet Standardisation Initiative (IDDSI) Framework, using IDDSI's testing methods (27, https://www.iddsi.org).Henceforth, these consistencies will be referred to using IDDSI terminology and/or abbreviations: thin (TN0), slightly thick (ST1), mildly thick (MT2), moderately thick (MO3) and extremely thick (EX4).It should be noted that according to the IDDSI Framework, moderately thick (MO3) and extremely thick (EX4) liquids are identical in flow characteristics to liquidized (LQ3) and pureed foods (PU4); therefore, we will use the abbreviations MO3/LQ3 and EX4/PU4 for these consistencies.For all 3 projects, cups containing 40-60 ml of each test stimulus were arranged in blocks in a muffin tray in the order of presentation.For each trial, participants were instructed to pick up a new cup, to take a comfortable sip and to swallow naturally without waiting for a cue from the clinician.For the moderately and extremely thick stimuli, boluses were taken by teaspoon rather than sip.Where possible, participants self-administered the test boluses.Finally, pre-and post-sip cup weights were collected on a digital balance to enable derivation of sip weight and sip volume.For Project 2, thickened stimuli were prepared using two different thickeners: (a) Resource ® ThickenUp ® Clear TM (Nestlé Health Science), a xanthan-gum based thickening agent; and (b) Resource ® ThickenUp ® (Nestlé Health Science), a starch-based thickening agent.Figure 1 illustrates the different study protocols with respect to the stimulus consistencies tested, the thickeners used, and the number of boluses presented per consistency.

Data processing and PAS rating
All three projects shared the same core lab for videofluoroscopy rating.Data processing involved splicing the original videofluoroscopy recordings into shorter clips, each containing the swallowing events for a single bolus.At this point, the audio track was also removed and black screen scrubbers were superimposed on the recordings to blind raters to any potentially biasing information such as participant identifiers, bolus consistency labels, bolus number in the protocol and the video time code.Each bolus clip was then relabeled with a random file number, and assigned as part of a batch of ∼150 bolus clips to two trained raters who documented the number of swallows seen for each bolus, and PAS ratings for each swallow.Inter-rater agreement for PAS rating was inspected for each batch; discrepant ratings were flagged and sent to a consensus meeting for review and resolution.Each consensus meeting was attended by 2-3 members of the rating team, but these were not necessarily the same individuals who provided the original ratings for any given clip.Additional details regarding the rating procedures and rater reliability can be found in the supplemental materials published with the primary articles for Projects 1a and 2 (17, 21).

Statistical analysis
Due to the previously mentioned differences in study protocols, we divided this retrospective analysis into two parts.In part 1, we analyzed the combined data from datasets 1a and 1b.In part 2, we analyzed the data from dataset 2. We used Bayesian multilevel ordinal regression models with a logit-link to estimate differences in patterns of airway invasion across liquid consistencies.A Bayesian framework afforded several benefits, including flexibility to use maximal random effects structures, quantification of uncertainty, and ease of model interpretation.All models included random intercepts of participant, which permitted multiple trials for each participant (i.e., resolved violations of independence) and incomplete data (e.g., participants without trials of extremely thick consistency), as well as a random slope of consistency, which allowed each participant to have their own effect of consistency.Model predictions provided the probability of PAS scores for each fixed effect.To estimate a weighted PAS score across trials, these probabilities were multiplied by their respective PAS scores for each fixed effect and then summed (28).
Before proceeding with the analysis, we first inspected the data to answer two preliminary questions, which influenced choices regarding factors to include in the models.First, recognizing that multiple swallows could occur for a single bolus, we examined whether PAS scores meaningfully differed by swallow number.The results of this preliminary exploration can be found in Appendix A, and show that the probability of higher PAS scores did not meaningfully increase across repeated swallows for a single bolus.Therefore, the maximum PAS score across swallows was calculated for each bolus and used in subsequent analyses.Second, we wanted to determine whether PAS scores varied across diagnoses.Statistical models with and without a fixed effect of diagnosis were compared via leave-one out cross validation (28) for Q1 and Q2.The results of this exploration showed that diagnosis did not meaningfully contribute to the models (Table B1).
On this basis, the models for Q1 examining group-level differences in PAS scores across bolus consistencies included a fixed effect of consistency.For thin liquids, within-subject variability and saturation across repeated bolus trials were also described for each dataset.For Q2, we used the same models but limited the analysis to the subset of data for participants with at least one score of concern (PAS > 2) on thin liquids.Q3 examined how different thickening agents affected PAS scores and was limited to dataset 2. Here, we first compared models  with and without a fixed effect of consistency via leave-one out cross validation.Results indicated that consistency did not meaningfully contribute to the model (Table B2).Thus, the final model included a fixed effect of thickening agent only and excluded thin liquid boluses.
Analyses were performed in R version 4.2.1 (R Core Team, 2018) with the brms package (29).Models used "weakly informative" prior distributions for fixed and random effects, which assumed no a priori effects were present and restricted implausible parameter values.All models demonstrated adequate convergence from sampling diagnostics.Post-hoc alternate prior specifications were performed, which revealed that model inferences remained stable regardless of the prior distribution (Figures C1-C3).
For Question 4, we identified the worst (i.e., highest) PAS score for each participant for thin liquids and across all thickened consistencies combined.These worst scores were cross-tabulated, with frequencies (and percentages) calculated at the participant level.

Q1: How do different liquid consistencies affect PAS scores?
Table 1 provides data regarding the frequency (number and percent) of boluses displaying different PAS scores by bolus consistency for each dataset.Figure 2 provides an illustration of the Q1 results for both datasets.

Q2: Among participants with airway invasion (PAS > 2) on thin liquids, how do thicker liquid consistencies impact PAS scores?
Table 2 provides data regarding the frequency (number and percent) of boluses displaying different PAS scores by bolus consistency for the subset of participants in each dataset who displayed at least one PAS score > 2 on thin liquids.Figure 3 provides an illustration of the results for Q2 across both datasets.

Dataset 1
Two-hundred and ninety-nine participants (44.1%) demonstrated at least one score of concern (PAS > 2) on thin liquids.More than 74% of participants completed at least five bolus trials of thin liquid, while 35% of participants completed six bolus trials.Across repeated bolus trials of thin liquid, the first score of concern (i.e., first trial of PAS > 2) was most likely to be appreciated on the first thin liquid bolus trial (n = 155, 51.84%).The second (n = 69, 23.07%), third (n = 32, 10.70%), and fourth (n = 27, 9.03%) presentations of thin liquid showed slightly lower occurrences of the first PAS score of concern, whereas this was less commonly seen on the fifth (n = 11, 3.68%) and sixth (n = 5, 1.67%) trials.
The magnitude of these weighted PAS scores differed across consistencies, such that PAS scores decreased as bolus consistency increased; however, PAS scores for moderately and extremely thick consistencies were not meaningfully different.
Table 3 provides frequencies for single vs. multiple PAS scores of concern by consistency for each dataset.PAS scores of concern are further stratified into two degrees of severity (PAS > 2; PAS > 5).Among participants with at least one trial showing a PAS > 2 across all consistencies, multiple scores of concern across repeated trials were more commonly seen for thin liquids (62.22%) compared to mildly (49.47%), moderately (35.20%), and extremely thick (9%) consistencies.A similar pattern was appreciated among participants with at least one trial of aspiration (PAS > 5).

Dataset 2
Thirty-three participants (18.64%) demonstrated at least one score of concern (PAS > 2) on thin liquids.More than 94% of participants completed at least three bolus trials of thin liquid.Across repeated bolus trials of thin liquid, the first score of concern (i.e., first trial of PAS > 2) was most likely to be appreciated on the first thin liquid bolus trial (n = 21, 11.86%), whereas the second (n = 9, 5.14%) and third (n = 3, 1.80%) presentations showed lower occurrences of the first score of concern.
The frequencies of single vs. multiple PAS scores of concern, stratified by degree of severity (PAS > 2; PAS > 5) are shown on the right hand side of Table 3 by consistency.Among participants with at least one trial showing a PAS > 2 across all consistencies, multiple scores of concern across repeated trials were more commonly seen for moderately thick liquids (71.43%), although there was a low number of airway invasion events in this dataset due to the large proportion of healthy participants.prepared with both thickeners were tested for all consistencies in 78 of the healthy participants, while selected thicknesses were included in the protocols for specific clinical cohorts, resulting in available data for a total of 125 participants.Table 4 shows the frequencies (percent) of different PAS scores by thickener, across all bolus consistencies.As shown in Figure 4 panel A, participants demonstrated similar probabilities for scores of concern (PAS > 2) between stimuli prepared with Thicken-Up (0.23%; 95% CI: 0.08, 0.47) and Thicken-Up Clear (0.31%; 95% CI: 0.11, 0.62).As shown in Figure 4 panel B, weighted PAS scores were also nearly identical across the two thickeners: Thicken-Up (1.016; 95% CI: 1.007, 1.029) and Thicken-Up Clear (1.022; 95% CI: 1.01, 1.039).This question was explored to enable comparison with the recent paper by Miles et al. (10), which reported worsening of aspiration from sensate on thin liquids to silent on thickened liquids in some patients.Figure 5 shows the crosstabulation of participant-worst scores for thin liquids and across all of the thickened consistencies combined.Panel A shows the results for dataset 1 and panel B for dataset 2. PAS scores are summarized categorically as suggested by Steele & Grace-Martin (11), with scores of 1, 2 and 4 grouped together (Category A) as scores either showing no bolus entry into the laryngeal vestibule or scores showing transient penetration with complete ejection of material back out of the airway.Category B includes PAS scores of 3, 5 and 6 in which there is penetration but material remains in the laryngeal vestibule at the end of the swallow.PAS scores of 7 and 8 are captured in categories C and D, respectively, with category D (PAS of 8) reflecting cases of silent aspiration.The cells in the Figure 5 tables represent the percent of cases with a worst PAS category of A, B, C or D on thin liquids (y-axis) for which a worst PAS category of A, B, C or D was seen across all of the thicker consistencies combined (x-axis).Green shading is

Discussion
This manuscript involves secondary analysis of two large datasets of videofluoroscopy data, collected across three prior projects.In total, the two datasets comprised bolus-level PAS ratings for 8,185 boluses (from 678 participants) and 3,407 boluses (from 177 participants), respectively.Although slightly different bolus protocols were used across the three studies, as illustrated in Figure 1, the rating procedures for both datasets were identical and performed in the same central analysis lab.All boluses were rated by two trained raters who were blinded to each other's ratings and to potentially biasing information including participant identity, diagnosis, sex and age, bolus consistency, and bolus trial order within the testing protocol.All discrepancies in PAS scores were flagged and resolved by consensus.As such, a high degree of rigor was used in the rating procedures for all three prior studies.The analyses performed for this manuscript involved pooling of data from projects 1a and 1b, which shared the use of Varibar ® barium sulfate and identical recipes for thickening to mildly thick or thicker consistencies using the same commercially available xanthan gum thickener.Analysis of dataset 2 was kept separate due to use of a different barium product, the inclusion of slightly thick liquids, and the availability of data for both starch and xanthan gum thickened liquids.Additionally, dataset 2 involved a substantial proportion of healthy participants, in whom penetrationaspiration events of concern would be unexpected.
Unlike previous analyses of these datasets, and other datasets in the literature, the statistical analysis used to address questions 1-3 in this manuscript allowed for consideration of all boluses across repeated trials of each consistency, and avoided reduction of the Penetration-Aspiration Scale.The results are generally concordant with previous findings synthesized across studies in systematic reviews, namely showing significant reductions in the frequency and severity of penetration-aspiration with liquids thickened to mildly thick consistency or thicker, compared to thin liquids.However, in contrast to prior work, these results are more trustworthy due to the inclusion of multiple bolus trials that account for within-participant variability with a Bayesian multilevel modeling approach.Given substantial sample sizes in both datasets, results were highly stable and robust regardless of the choice of prior distribution-an important and reassuring validation of these findings (see Appendix C).
In question 4, we adopted the traditional practice of summarizing worst PAS scores per participant for thin vs. thicker liquid consistencies to enable comparison to the results reported by Miles et al. (10).It should be noted that there are a number of differences in data collection methods between our studies and the Miles et al. study, including the use of videofluoroscopy rather than FEES, different bolus volumes, different stimuli (non-barium liquids), differences in the number of trials per condition, and different categorization of PAS score severity.Like those reported by Miles and colleagues, our data do show that a small proportion of participants showed worse airway invasion (i.e., higher PAS scores) on thicker liquids.Of particular note are the 7 participants whose worst scores moved Effect of different thickening agents on (A) the probability of PAS scores and (B) the weighted PAS score obtained from the model posterior distribution.Panel A displays the probability and corresponding 95% credible interval for each PAS score, stratified by thickening agent.Panel B displays the estimated posterior median average (i.e., weighted) PAS scores by thickening agent with 95% credible intervals.This is estimated by multiplying each PAS score's numerical value (1-8) by its probability (A) and summing these weighted values (28).Note that all estimates were obtained from ordinal regression models and that these estimates are averaged across consistencies.from PAS = 7 (i.e., aspiration followed by an unsuccessful sensate clearing attempt) to PAS = 8 (i.e., silent aspiration).This finding emphasizes the fact that cough responses to aspiration may not be effective for ejecting material out of the airway, and that PAS scores of 7 indicate severe impairments of airway protection similar to scores of 8.However, the predominant trend in these datasets was one of improved airway protection with thickened liquids.It should be noted that this trend was seen even with summarization of worst scores across a larger number of thickened liquid trials (i.e., a minimum of 9 in dataset 1 and a minimum of 12 in the dataset 2) compared to the number of thin liquid trials in the data collection protocols (i.e., 6 and 3 in datasets 1 and 2, respectively).
As with any research study, limitations must be acknowledged for the analyses reported in this manuscript.These include the lack of balanced sample sizes for different clinical diagnostic subgroups in the datasets (although the exploration of diagnosis in Appendix B found no meaningful differences).Additionally, we were unable to explore the impact of bolus volume on penetration-aspiration due to the use of comfortable, patientselected sip volumes in the studies from which the data were sourced.Prior analyses of those studies have reported descriptive statistics for sip-volume based on pre-and postswallow cup weights (13,17,21).Importantly, all three source projects involve the acknowledged confound that the moderately and extremely thick liquids were taken using a Cross-tabulation of the proportion of participant-worst PAS score categories for across polled thicker consistencies by participant-worst thin liquid PAS score category.Each cell represents the percent of participants with a worst PAS category of (A-D) on thin liquids (y-axis) for which a worst PAS category of (A-D) was seen across all of the thicker consistencies combined (x-axis).Green shading is used to indicate score improvement with thicker consistencies while red sharing indicates score worsening, Annotations on the right side of the table for each row summarize the frequencies of improvement or worsening seen by worst PAS score category on thin liquids.Panel A shows results for dataset 1 while Panel B shows equivalent results for dataset 2. teaspoon, which limited bolus volume to the 4-6 ml range rather than the ∼ ≥ 10 ml volumes seen for the thin, slightly and mildly thick consistencies that were sipped from a cup.Thus, part of the observed effect of thickening to moderately and extremely thick consistencies on swallowing safety may be attributable to smaller bolus volumes.Another methodological constraint of the studies explored in this manuscript is the fact that thicker liquids were always tested later in the data collection protocol.Thus, order effects, either with respect to improved airway protection with practice or with respect to possible worsening of airway protection with fatigue cannot be ruled out.Additionally, it is acknowledged that all of the data collected in the source studies required the use of a low concentration of barium sulfate contrast agent, which differs from regular liquids in taste.

Conclusion
The secondary analyses described in this manuscript provide strong confirmatory evidence that penetration-aspiration is much less likely to occur on thicker liquid consistencies than with thin liquids.The evidence in this paper corroborates previous findings that swallowing safety may vary within an individual across repeated trials of a given consistency, such that multiple trials are needed to rule out penetration-aspiration both with thin liquids (13,30), and with thicker consistencies.Additionally, the findings show that a small number of patients may show worse airway protection with thicker consistencies, supporting best practice guidance that thicker consistencies should not be recommended without both prior assessment and subsequent monitoring for adverse outcomes.

FIGURE 1
FIGURE 1Details of the different bolus presentation protocols used across the three studies.XG = Xanthan Gum. a included in some but not all of the patient cohorts; b not included in the patient cohorts.

FIGURE 2
FIGURE 2Effect of bolus consistencies on (A) the probability of PAS scores and (B) the weighted PAS score obtained from the model posterior distribution.Panel A displays the probability and corresponding 95% credible interval for each PAS score, stratified by bolus consistency across two datasets.Panel B displays the estimated posterior median average (i.e., weighted) PAS scores by bolus consistency with 95% credible intervals.This is estimated by multiplying each PAS score's numerical value (1-8) by its probability (A) and summing these weighted values (28).Note that all estimates were obtained from ordinal regression models.

3. 4
Q3: Do PAS scores on thickened liquids differ between liquids thickened with a commercially available starch-based thickener vs. a commercially available xanthan gum based thickener?3.4.1 Dataset 2 This question was only explored in dataset 2, for which both starch and xanthan gum-based thickeners were used.Stimuli

FIGURE 3 4
FIGURE 3Effect of bolus consistencies on (A) the probability of PAS scores and (B) the weighted PAS score obtained from the model posterior distribution among participants with airway invasion on at least one thin liquid bolus.Panel A displays the probability and corresponding 95% credible interval for each PAS score, stratified by bolus consistency across two datasets.Panel B displays the estimated posterior median average (i.e., weighted) PAS scores by bolus consistency with 95% credible intervals.This is estimated by multiplying each PAS score's numerical value (1-8) by its probability (A) and summing these weighted values (28).Note that all estimates were obtained from ordinal regression models.

TABLE 1
Overall frequencies (percent) of penetration-aspiration scale scores by bolus consistency for Q1.

TABLE 2
Overall frequencies (percent) of penetration-aspiration scale scores by bolus consistency for Q2 (limited to participants who showed at least one PAS score > 2 on thin liquids).

TABLE 3
Frequencies of single and repeated scores of concern stratified by dataset.

TABLE 4
Distribution of penetration-aspiration scale scores for Q3 (dataset 2 only).