Machine learning data sources in pediatric sleep research: assessing racial/ethnic differences in electronic health record–based clinical notes prior to model training

Introduction: Pediatric sleep problems can be detected across racial/ethnic subpopulations in primary care settings. However, the electronic health record (EHR) data documentation that describes patients’ sleep problems may be inherently biased due to both historical biases and informed presence. This study assessed racial/ethnic differences in natural language processing (NLP) training data (e.g., pediatric sleep-related keywords in primary care clinical notes) prior to model training. Methods: We used a predefined keyword features set containing 178 Peds B-SATED keywords. We then queried all the clinical notes from patients seen in pediatric primary care between the ages of 5 and 18 from January 2018 to December 2021. A least absolute shrinkage and selection operator (LASSO) regression model was used to investigate whether there were racial/ethnic differences in the documentation of Peds B-SATED keywords. Then, mixed-effects logistic regression was used to determine whether the odds of the presence of global Peds B-SATED dimensions also differed across racial/ethnic subpopulations. Results: Using both LASSO and multilevel modeling approaches, the current study found that there were racial/ethnic differences in providers’ documentation of Peds B-SATED keywords and global dimensions. In addition, the most frequently documented Peds B-SATED keyword rankings qualitatively differed across racial/ethnic subpopulations. Conclusion: This study revealed providers’ differential patterns of documenting Peds B-SATED keywords and global dimensions that may account for the under-detection of pediatric sleep problems among racial/ethnic subpopulations. In research, these findings have important implications for the equitable clinical documentation of sleep problems in pediatric primary care settings and extend prior retrospective work in pediatric sleep specialty settings.


Introduction
Racial/ethnic disparities are well documented and persistent in pediatric sleep at the population level (Billings et al., 2021; Meltzer et al., 2021; Reynolds et al., 2023). However, these disparities are often preventable and linked to social determinants of health at the individual, family, healthcare, and broader community/societal levels (Billings et al., 2021; Fanta et al., 2021; Yip et al., 2022; Clarkson-Townsend et al., 2023; Gueye-Ndiaye et al., 2023). Pediatric primary care is ideal for preventing pediatric sleep disparities at the population level, yet providers in this setting typically lack the time and resources necessary to identify sleep problems (Honaker and Saunders, 2018; Mosher and Piccinini-Vallis, 2022; Williamson et al., 2022; Golden et al., 2023). Efficient machine learning and clinical decision support tools embedded in the pediatric primary care electronic health record (EHR) are needed to support universal screening of pediatric sleep problems at the population level (Anan et al., 2023). In addition, EHR-embedded machine learning tools for data collection are essential to include patient self-report and aid providers with limited personnel and time constraints in pediatric primary care (Honaker et al., 2019; Huffstetler et al., 2022; Willis et al., 2022). Therefore, EHR-embedded machine learning tools have the potential to innovatively address modifiable pediatric sleep care gaps (Kang et al., 2021; Ramgopal et al., 2023).
Despite their promise, EHR-embedded machine learning tools also have the capability of worsening racial/ethnic disparities due to inherently biased healthcare data sources used for training machine learning models (Chen et al., 2023). In the context of the EHR, training data commonly used for machine learning in pediatric sleep may be inherently biased for two reasons: historical biases and informed presence (Hamilton et al., 2021). Historical biases include an overrepresentation of non-Hispanic white patients in epidemiologic pediatric sleep cohorts, which have commonly leveraged patients with confirmed sleep diagnoses (Meltzer et al., 2010; Honaker and Meltzer, 2016). Informed presence occurs when patients navigate the healthcare system but experience barriers that yield variable interactions across racial/ethnic subpopulations (Phelan et al., 2017). Informed presence is critical to account for when developing EHR-embedded machine learning tools because it can have downstream effects on how racial/ethnic subpopulations are classified, measured, and/or represented in healthcare data sources (Phelan et al., 2017). Therefore, researchers should assess for such biases (e.g., historical or induced by patients' healthcare navigation) that may be inherent in healthcare data sources prior to training and developing models for automated solutions (Huang et al., 2022).
As posited in the Peds B-SATED framework by Meltzer et al. (2021), pediatric sleep problems can be multidimensional and include unhealthy sleep behaviors (B), poor sleep satisfaction (S), difficulty with alertness during waking hours (A), inappropriate sleep timing (T), low sleep efficiency (E), and inadequate sleep durations for age (D). A reliance on sleep diagnoses and polysomnography data limits epidemiologic and population pediatric sleep research by failing to capture all the subclinical characteristics described in the Peds B-SATED framework (Yang et al., 2023). Leveraging clinical note data from the EHR is a way to improve our identification of Peds B-SATED in primary care settings using EHR-embedded machine learning tools. Natural language processing (NLP), a machine learning approach for understanding language and contextualized nuances in EHR free-text clinical notes, is an innovative and available approach for capturing Peds B-SATED framework subclinical characteristics (Gianfrancesco and Goldstein, 2021; Rahman et al., 2022). However, this data source and approach are not exempt from being inherently biased due to the reliance on clinical notes, telephone notes, patient-provider messages, and other text-based fields that are shaped by informed presence in healthcare systems (Rozier et al., 2022; Walk et al., 2022). Recent NLP work used to audit clinical notes has found that the language used to describe providers' recognition of patients' reported characteristics (e.g., sociodemographic and clinical) varies by race/ethnicity (Thompson et al., 2021; Sun et al., 2022). Therefore, it is important to assess how Peds B-SATED keywords and global dimensions are documented and captured across racial/ethnic subpopulations in NLP training data sources prior to model training and using these methods to develop EHR-embedded machine learning tools.
To address this knowledge gap, utilizing clinical notes from patients seen in pediatric primary care, the current study included two objectives: (1) A least absolute shrinkage and selection operator (LASSO)-regularized logistic regression model was used to investigate whether there were racial/ethnic differences in documentation of Peds B-SATED keywords.
(2) A mixed-effects logistic regression was used to determine whether the odds for the presence of global Peds B-SATED dimensions also differed across racial/ethnic subpopulations. We hypothesized that racial/ethnic differences in the documentation of keywords and global dimensions would be observed by LASSO regression and multilevel modeling approaches.

Participants
We conducted a cross-sectional cohort study of 44,244 patients, 5 to 18 years old, seen in a pediatric primary care network at a large academic medical center. We excluded infants and early childhood youth due to developmental reliance on caregivers to support sleep, which would require an extensive and separate set of pediatric behavioral sleep medicine keywords and phrases. The protocol was approved by the institutional review board at Nationwide Children's Hospital.

Data source and procedures
2.2.1 DeepSuggest clinical note search engine
DeepSuggest is an internally developed and validated clinical note search engine at Nationwide Children's Hospital (Moosavinasab et al., 2021). DeepSuggest queries clinical notes by a set of keywords and filters by note type, provider type, department specialty, date range, age range, and patient information such as name, date of birth, and medical record number. In addition to facilitating a search of EHR-based clinical notes, DeepSuggest expands query terms by recommending related or similar search keywords based on the similarity of keyword Word2Vec embeddings calculated across all notes in the repository on the backend (Mikolov et al., 2013). During this process, duplicate notes are not included.
We utilized DeepSuggest to expand our initial keywords and retrieve clinical notes that contained at least one Peds B-SATED keyword or phrase. For vocabulary expansion, we entered our initial set of Peds B-SATED keywords into DeepSuggest, and it determined recommended keywords based on their relevance. This vocabulary expansion increased our keywords by including those with common misspellings (e.g., "insomia"), inconsistent punctuation (e.g., "sleepwalking" vs. "sleep-walking"), abbreviations, and synonyms (e.g., "difficulty staying awake during the day" vs. "sleepy during the day").

2.2.2 Predefined keyword features set approach
We wanted to group clinical notes based on the presence of global Peds B-SATED dimensions, so we applied a predefined keyword features set approach. This is an NLP approach that uses the presence of each keyword as a representation of the clinical note, rather than a predictive NLP model. To convert these Peds B-SATED keyword occurrences into a numerical representation, we searched each clinical note for an occurrence of each of the 178 keywords, using case-insensitive matching, and phrases, using regular expressions. If a keyword or phrase was found, we marked the keyword's corresponding index in a 178-dimensional vector with a 1; if no occurrence of that keyword was found, its value was 0. At the end of this process, each note had a corresponding 178-dimensional binary keyword vector. Using this predefined keyword features set containing 178 Peds B-SATED keywords, we queried all the clinical notes for patients between the ages of 5 and 18 from January 2018 to December 2021.
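The keyword-to-vector step described above can be sketched as follows. This is a minimal illustration, not the study's code: the four-entry vocabulary, the regular-expression variants, and the function names are hypothetical stand-ins for the actual 178-keyword list.

```python
import re

# Hypothetical mini-vocabulary; the study's actual list has 178 entries.
KEYWORDS = ["insomnia", "sleepwalking", "nap", "sleepy during the day"]

# Illustrative phrase variants expressed as regular expressions (assumed).
PATTERNS = {
    "sleepwalking": r"sleep[\s-]?walking",
    "sleepy during the day": r"sleepy\s+during\s+the\s+day",
}

def compile_vocabulary(keywords, patterns):
    """Build one case-insensitive regex per keyword, in vocabulary order."""
    compiled = []
    for kw in keywords:
        # Fall back to a whole-word literal match when no pattern is given.
        body = patterns.get(kw, r"\b" + re.escape(kw) + r"\b")
        compiled.append(re.compile(body, re.IGNORECASE))
    return compiled

def note_to_vector(note_text, compiled):
    """Return a binary vector: 1 where the keyword occurs in the note, else 0."""
    return [1 if rx.search(note_text) else 0 for rx in compiled]

COMPILED = compile_vocabulary(KEYWORDS, PATTERNS)
vector = note_to_vector(
    "Pt reports Insomnia and sleep-walking. No daytime sleepiness.", COMPILED
)
```

Each position in the vector corresponds to one vocabulary entry, so vectors from different notes are directly comparable.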

LASSO regression model
To investigate whether there were racial/ethnic differences in the documentation of Peds B-SATED keywords, we fit a LASSO regression model predicting patients' race/ethnicity using the occurrence of Peds B-SATED keywords as the input features. This model assessed the Peds B-SATED keywords used in a patient's clinical note(s) to detect if they differed across race/ethnicity subpopulations in our cohort.
For this LASSO regression model, we performed two analytic steps. The first step was focused on the overall predictiveness of the Peds B-SATED keywords. We trained a LASSO regression model using 10-fold cross-validation. This included using the combined holdout subsets of data from each fold to evaluate the model's overall performance metrics: area under the receiver operating characteristic curve (AUC ROC), precision, recall, accuracy, and F1-scores. To ensure that our results were rigorous and meaningful, we repeated our 10-fold cross-validation training 1,000 times using bootstrapping to randomly resample the data in the training folds at each step, stratified by label (ensuring that the racial/ethnic proportion of the training data remained constant). This analysis investigated the predictive capability of the Peds B-SATED keywords.
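One plausible reading of the resampling scheme above is sketched here: samples are assigned to label-balanced folds, and within each split the training indices are resampled with replacement separately per label, so class proportions stay fixed. The function names, the seed handling, and the exact order of operations are assumptions; the source does not give implementation details.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k, rng):
    """Assign each sample index to one of k folds, balancing labels across folds."""
    by_label = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_label[lab].append(idx)
    folds = [[] for _ in range(k)]
    for lab in sorted(by_label):
        idxs = by_label[lab]
        rng.shuffle(idxs)
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds

def stratified_bootstrap(train_idx, labels, rng):
    """Resample training indices with replacement, separately within each label."""
    by_label = defaultdict(list)
    for idx in train_idx:
        by_label[labels[idx]].append(idx)
    sample = []
    for lab in sorted(by_label):
        pool = by_label[lab]
        sample.extend(rng.choice(pool) for _ in pool)
    return sample

def cv_splits(labels, k=10, seed=0):
    """Yield (bootstrapped_train_indices, holdout_indices) for each of k folds."""
    rng = random.Random(seed)
    folds = stratified_folds(labels, k, rng)
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield stratified_bootstrap(train_idx, labels, rng), test_idx
```

Repeating `cv_splits` with 1,000 different seeds and pooling the holdout predictions from each repetition would give a distribution of performance scores of the kind the text describes.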
The second step was focused on identifying which Peds B-SATED keywords were most influential for predicting each racial/ethnic patient subpopulation. In this step, we trained another LASSO regression model using all of the data simultaneously. By examining the coefficients of each Peds B-SATED keyword, we reported the 10 highest-ranking keywords for each racial/ethnic subpopulation. This step also included predicting patients' race/ethnicity from the Peds B-SATED keywords in their clinical notes. To do this, we collapsed patient race/ethnicity into six categories: non-Hispanic Black, Hispanic/Latino, non-Hispanic white, non-Hispanic Asian, non-Hispanic Multiracial (more than two races), and Other Race/Ethnicity (including non-Hispanic Native Hawaiian or Other Pacific Islander, non-Hispanic American Indian or Alaska Native, and no information given/unknown/refuses to answer). Race is collected by patient report at the time of registration and entered into Epic by registration staff. Registration staff are trained to directly ask the caregiver/patient to select the race/ethnicity category they most identify with, and caregivers/patients have the option to not report this information. For predicting patients' race/ethnicity, we aggregated each patient's Peds B-SATED keyword vectors across all of their clinical notes and then used the result as the input features in the LASSO regression model to predict patients' race/ethnicity category.
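The patient-level aggregation mentioned above could look like the sketch below. The source does not say how the note vectors were combined, so the element-wise maximum used here (a keyword counts if it appears in any of the patient's notes) is an assumption; summing occurrences would be an equally reasonable alternative. The function name and input layout are hypothetical.

```python
def aggregate_by_patient(note_rows):
    """Combine note-level binary vectors into one vector per patient.

    note_rows: iterable of (patient_id, binary_vector) pairs.
    Assumed rule: a keyword is marked 1 if it occurs in any of the patient's notes.
    """
    patient_vectors = {}
    for patient_id, vec in note_rows:
        if patient_id not in patient_vectors:
            patient_vectors[patient_id] = list(vec)
        else:
            current = patient_vectors[patient_id]
            patient_vectors[patient_id] = [max(a, b) for a, b in zip(current, vec)]
    return patient_vectors

rows = [
    ("p1", [1, 0, 0]),
    ("p2", [0, 0, 1]),
    ("p1", [0, 1, 0]),
]
patients = aggregate_by_patient(rows)
```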

Mixed-effects logistic regression
Descriptive analyses were used to count the patients who had global Peds B-SATED dimensions present. The percentages of patients with global Peds B-SATED dimensions are reported. Mixed-effects logistic regression models (e.g., clinical notes nested within patients, with a random effect for patients) were used to test whether the presence of global Peds B-SATED dimensions differed by patients' race/ethnicity. Statistical models were adjusted for covariates at both the note and patient levels. Note-level covariates included clinical note author type (nurse practitioner, physician, integrated pediatric psychologist, social worker, and other/trainee) and the pandemic status of the note date, dichotomized as notes prior to 1 March 2020 vs. notes after 1 March 2020. Patient-level covariates included the following sociodemographic characteristics: biological sex (female vs. male) and age. Logistic mixed-effects models were fitted using the glmer function from the lme4 package; the performance of the fitted models was compared, and the best model was selected using Akaike's information criterion and Bayesian information criterion. A p-value of <0.05 was considered statistically significant.
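For orientation, a random-intercept logistic model of the kind described above can be written as follows. This is a sketch under assumptions: the notation is ours, the exact covariate coding is not given in the source, and non-Hispanic white is taken as the reference category because the results are reported relative to that group. Here Y_ij indicates whether note i of patient j contains at least one keyword from a given dimension, r_j is the vector of race/ethnicity indicators, a_ij is the vector of author-type indicators, and u_j is the patient-level random intercept. In the information criteria, k is the number of estimated parameters, n the number of observations, and L-hat the maximized likelihood.

```latex
\begin{align}
\operatorname{logit}\,\Pr(Y_{ij}=1)
  &= \beta_0 + \boldsymbol{\beta}_{R}^{\top}\mathbf{r}_{j}
   + \beta_{S}\,\mathrm{sex}_{j} + \beta_{A}\,\mathrm{age}_{j}
   + \boldsymbol{\gamma}^{\top}\mathbf{a}_{ij}
   + \delta\,\mathrm{pandemic}_{ij} + u_{j},
  \qquad u_{j}\sim\mathcal{N}(0,\sigma_u^{2}), \\
\mathrm{aOR}_{r} &= \exp(\beta_{R,r}), \\
\mathrm{AIC} &= 2k - 2\ln\hat{L}, \qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}.
\end{align}
```

Lower AIC or BIC indicates a better trade-off between fit and complexity, and BIC penalizes additional parameters more heavily once ln n exceeds 2.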
Statistical analyses were performed using R version 4.2.2 and RStudio software (Bates et al., 2015).

Patient cohort characteristics
Our sample consisted of 44,244 patients with 111,078 clinical notes. Of the patients, 51.4% were classified as non-Hispanic Black, 24.6% as non-Hispanic white, 8.1% as non-Hispanic Multiracial, 4.6% as non-Hispanic Asian American, 11.4% as non-Hispanic Other Race/Ethnicity, and 15.6% as Hispanic or Latino. The mean age was 11.12 (SD = 3.78), and 49.1% were classified as female.

Racial/ethnic differences in documentation of Peds B-SATED keywords
Figure 1 shows a confusion matrix, a table that displays the predictive ability of the LASSO regression model used in this study. Ideally, in this particular case, the predictive ability should be near 0, meaning that the model could not predict patients' race/ethnicity from Peds B-SATED keywords. However, the matrix reveals that race/ethnicity across patient subpopulations in our cohort could be predicted from primary care providers' use of keywords in clinical notes and that this difference was most observable among non-Hispanic Black and non-Hispanic white patient subpopulations.
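To make the link between a confusion matrix and the per-class scores concrete, the sketch below derives precision, recall, and F1 for each class from a matrix of counts. The two-class toy matrix is invented for illustration and is not the study's data; the function name is hypothetical.

```python
def per_class_metrics(matrix, labels):
    """Compute precision, recall, and F1 per class.

    matrix[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # true class i, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp   # predicted i, true class otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Toy counts, for illustration only.
toy_matrix = [[6, 2],
              [3, 9]]
toy_metrics = per_class_metrics(toy_matrix, ["A", "B"])
```

A model with no ability to separate the classes would spread its predictions without regard to the true label, so large diagonal counts relative to the off-diagonal counts indicate that the keywords carry information about the label.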
A model with random classification ability will have an AUC ROC of 0.5, and a perfect model (which can separate everything without error) will have a value of 1.0. Our average AUC ROC score is 0.62. Figure 2 shows that in no case was the AUC ROC score at or below 0.5, indicating that this LASSO regression model with Peds B-SATED keywords was always able to learn to discriminate patients by race/ethnicity to some degree, which is an indication of racial/ethnic differences in primary care providers' documentation of Peds B-SATED keywords. Figure 3 shows that non-Hispanic Black patients had the highest overall model F1-score (mean = 0.56) and model recall (mean = 0.68) scores. In addition, non-Hispanic Black patients were similar to non-Hispanic white patients in terms of model precision.
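The interpretation of AUC ROC given above (0.5 for random ranking, 1.0 for perfect separation) follows from its pairwise definition, sketched below for a single binary comparison. The source does not state how the multi-class AUC was computed or averaged, so treating it as a one-vs-rest binary problem is an assumption, and the function name is hypothetical.

```python
def auc_roc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    labels must contain 1 for positives and 0 for negatives.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for a cohort of this size.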
To more specifically demonstrate how providers' documentation of specific Peds B-SATED keywords differed for racial/ethnic subpopulations, Table 1 shows the top 10 Peds B-SATED keywords for each racial/ethnic subpopulation that were documented by pediatric primary care providers, including the coefficient ranking value. The highest-ranking Peds B-SATED keywords qualitatively differed across racial/ethnic subpopulations. Most interestingly, non-Hispanic white patients had more specific Peds B-SATED keywords that may lead to follow-up support and/or a referral to specialty care to address pediatric sleep problems. However, minoritized racial/ethnic subpopulations commonly had Peds B-SATED keywords that encompassed broader descriptions and relevant daytime sequelae of pediatric sleep problems.

Racial/ethnic differences in global Peds B-SATED dimensions
Once we identified the differences in primary care providers' documentation of Peds B-SATED keywords across racial/ethnic subpopulations using LASSO regression, we sought to understand the odds of patients having the presence of global Peds B-SATED dimensions in their clinical notes. Of patients, 16.2% had one or more sleep behavior keywords present in clinical notes, 37.0% had one or more sleep satisfaction/quality keywords present, 25.2% had one or more alertness/daytime sleepiness keywords present, 13.5% had one or more sleep timing keywords present, 34.4% had one or more sleep efficiency keywords present, and only 2.1% had one or more sleep duration keywords present. Table 2 shows the racial/ethnic differences in global Peds B-SATED dimensions.
In adjusted models, non-Hispanic Black patients had a lower adjusted odds ratio (aOR = 0.59; 95% CI [0.55, 0.63]; p = 0.00) of having the presence of one or more sleep satisfaction keywords, a higher adjusted odds ratio (aOR = 1.16; 95% CI [1.09, 1.24]; p = 0.00) of having the presence of one or more alertness/daytime sleepiness keywords, and a lower adjusted odds ratio (aOR = 0.81; 95% CI [0.77, 0.86]; p = 0.00) of having the presence of sleep efficiency keywords, compared to non-Hispanic white patients.
In adjusted models, non-Hispanic Multiracial patients had a lower adjusted odds ratio (aOR = 0.62; 95% CI [0.54, 0.71]; p = 0.00) of having the presence of sleep efficiency keywords, compared to non-Hispanic white patients.
In adjusted models, non-Hispanic Asian patients had a lower adjusted odds ratio (aOR = 0.68; 95% CI [0.56, 0.83]; p = 0.00) of having the presence of one or more sleep behavior keywords, a lower adjusted odds ratio (aOR = 0.49; 95% CI [0.42, 0.58]; p = 0.00) of having the presence of one or more sleep satisfaction keywords, and a lower adjusted odds ratio (aOR = 0.62; 95% CI [0.54, 0.71]; p = 0.00) of having the presence of sleep efficiency keywords, compared to non-Hispanic white patients.
In adjusted models, non-Hispanic Other Race/Ethnicity patients had a lower adjusted odds ratio (aOR = 0.75; 95% CI [0.60, 0.93]; p = 0.01) of having the presence of one or more sleep satisfaction keywords and a higher adjusted odds ratio (aOR = 3.24; 95% CI [1.13, 9.27]; p = 0.03) of having the presence of sleep timing keywords, compared to non-Hispanic white patients.
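The adjusted odds ratios and confidence intervals reported above are exponentiated logistic regression coefficients. The sketch below shows that relationship, along with the reverse step of recovering the implied standard error from a reported interval. It assumes Wald-type intervals with z = 1.96, which the source does not state, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def implied_log_odds(lower, upper, z=1.96):
    """Recover (beta, se) on the log-odds scale from a reported Wald CI for an OR."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    beta = (log_lo + log_hi) / 2.0     # interval is symmetric on the log scale
    se = (log_hi - log_lo) / (2.0 * z)
    return beta, se

# Example with the interval reported for sleep satisfaction keywords among
# non-Hispanic Black patients: 95% CI [0.55, 0.63].
beta_hat, se_hat = implied_log_odds(0.55, 0.63)
or_hat, lo_hat, hi_hat = odds_ratio_ci(beta_hat, se_hat)
```

Because the interval is symmetric only on the log scale, an odds ratio below 1 with an upper bound below 1 corresponds to a negative coefficient whose interval excludes zero.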

Summary of findings
To our knowledge, this is the first study to assess racial/ethnic differences in NLP training data prior to model training. This study describes LASSO and multilevel modeling approaches that were used to identify these differences in pediatric primary care providers' documentation of Peds B-SATED keywords and global dimensions across racial/ethnic subpopulations. In addition, the most frequently documented Peds B-SATED keyword rankings qualitatively differed across racial/ethnic subpopulations. These racial/ethnic differences in documented Peds B-SATED keywords and global dimensions both extend and align with previous studies of providers' differential documentation of pediatric sleep problems in pediatric primary care settings (Honaker et al., 2018; Carson et al., 2023). Our findings have important implications for future practices in equitably documenting pediatric sleep problems across racial/ethnic subpopulations, which shape how patients experience the clinical workflow from identification in primary care to referral to sleep specialty settings in pediatric healthcare institutions.
In addition to our main findings, we found that by using the Peds B-SATED keywords documented in primary care providers' clinical notes, patients' race/ethnicity could be predicted. Moreover, the most frequently documented Peds B-SATED keyword rankings qualitatively differed across racial/ethnic subpopulations. These findings indicated that non-Hispanic white patients had keywords that typically noted pediatric sleep problems with more specificity, particularly those that require specialized intervention (Honaker and Saunders, 2018). Yet, among racial/ethnic minoritized patients, keywords commonly included broad descriptions and relevant daytime sequelae of a pediatric sleep problem (e.g., daytime behavior problems, napping, inattentiveness, and irritability). Although these daytime sequelae are necessary for understanding phenotypes of pediatric sleep problems, primary care providers may not have the training to interpret these subclinical characteristics as clinically meaningful proxies for an underlying sleep problem (Golden et al., 2023; Yang et al., 2023). Therefore, it may be a source of concern that these broad descriptions and daytime sequelae are ranked in the top 10 documented keywords for primarily racially/ethnically minoritized patient subpopulations. Future clinical research is needed to further investigate patient-provider factors that may influence these differences in documentation observed across racial/ethnic subpopulations (e.g., patient perception of sleep problems and provider screening practices).
We found that the top three global Peds B-SATED dimensions included satisfaction, efficiency, and alertness/daytime sleepiness (e.g., present in ~25%-37% of patient clinical notes). Although sleep duration is the dimension most commonly monitored in past epidemiologic and population sleep health research, the current study found that only ~2% of patients had the presence of keywords/phrases falling in the sleep duration dimension. We also found that the odds of certain global Peds B-SATED dimensions were both lower and higher in racial/ethnic minoritized subpopulations of patients, compared with non-Hispanic white patients. Thus, efforts to prevent racial/ethnic inequities and manage pediatric sleep problems in primary care require routine and multidimensional screening protocols (Meltzer et al., 2021). In addition, the results highlight that it is also important for future work to consider the heterogeneity in patients' susceptibility (e.g., social and environmental determinants; co-occurring health problems) that can influence their perception of, and providers' identification of, pediatric sleep problems (Rubens et al., 2016; Billings et al., 2021; Reynolds et al., 2023).

Ethical implications
The goal of our research was to apply novel and efficient approaches to support the assessment of racial/ethnic differences in NLP training data (e.g., keywords). We approached this study from a "discovery" perspective in our methods, but this study aligns with the future directions for ethical machine learning in pediatric healthcare settings previously outlined in the literature (Huang et al., 2022; Chen et al., 2023). NLP allows pediatric sleep researchers to expand their reach beyond diagnoses and polysomnography data commonly utilized in past healthcare research (Ramgopal et al., 2023). However, these more recent advances that leverage NLP-extracted data are not exempt from racial/ethnic bias that is inherently shaped by historical bias and informed presence (Phelan et al., 2017; Boch et al., 2022). This highlights the importance of prioritizing both innovation and machine learning ethics in pediatric sleep research at the population level (Mhasawade et al., 2021). Leading scholars have recently outlined key ethical and equity-centered processes to consider when using machine learning in healthcare settings (Boch et al., 2022; Walk et al., 2022; Chen et al., 2023). They deemed the process of identifying and addressing biased patterns in data collection, such as imbalanced or skewed datasets, to be an important step called preprocessing (Huang et al., 2022). This preprocessing step typically occurs prior to model training and deployment to prevent biased machine learning models (Huang et al., 2022). Ethical machine learning approaches such as preprocessing are a necessity, not an optional step, to confirm and address any underlying bias in training data sources. This step is particularly important when processing NLP-derived data from clinical documentation and starts at the data collection phase (e.g., clinical protocols for assessment and documentation).

Clinical research implications
Our analyses identified potential differences in clinical documentation of pediatric sleep problems that necessitate future research that examines how primary care shapes the clinical sleep outcomes of racial/ethnic subpopulations. Clinically, these observed differences in our cohort may highlight the importance of clinicians routinely asking about sleep problems in a multidimensional way, which may be a potential pathway for improving the equitable identification of racially/ethnically minoritized patients in pediatric primary care settings (Meltzer et al., 2021). In this pediatric primary care cohort study, we identified differential documentation of Peds B-SATED keywords and global dimensions across racial/ethnic subpopulations, but also a very low presence of various global dimensions in clinical notes overall. For example, the highest prevalence was the presence of the satisfaction dimension at ~37%, while the lowest was the duration dimension at ~2%. These findings allude to the importance of clinical machine learning to rapidly increase the efficiency of enhancing patients' sleep health literacy and education, patient-driven data collection, and the development of clinical decision support tools to aid pediatric primary care providers (Harada et al., 2021; Kang et al., 2021; Ramgopal et al., 2023). Theoretically, efficient clinical workflows for identifying such patients in pediatric primary care can aid research efforts for universal screening of pediatric sleep problems at the population level (Goldstein et al., 2020). Improving the quality and increasing the vastness of data collected in pediatric primary care settings can also determine influential social determinants of pediatric sleep problems (Huffstetler et al., 2022). With this level of population-level surveillance in reach using both patient self-report and actigraphy, researchers will be able to develop EHR-embedded machine learning tools for primary care providers to recognize predictors and profiles for distinct pediatric sleep phenotypes (Willis et al., 2022). In the age of precision health, these enhanced clinical workflows and tools are key for improving our ability to equitably reach racial/ethnic subpopulations that may benefit the most from targeted and tailored interventions (Seixas et al., 2020; Honaker et al., 2022).

Limitations
Pediatric sleep problems are multidimensional, but the variation of sleep health definitions and phenotypes among race/ethnicity subpopulations remains limited in pediatrics. Therefore, future work is needed to determine whether the differential patterns observed in providers' documentation of sleep problems are related to unique differences in symptom presentation or literacy across racial/ethnic subpopulations. More specifically, future research should examine whether variation is due to patient self-report or caregiver report, differing community beliefs and literacy about sleep as a health experience, and/or the social and environmental determinants that are potential drivers of existing pediatric sleep disparities (Reynolds et al., 2023; Yang et al., 2023). Using a cross-sectional retrospective cohort design and EHR data, this study is not capable of determining whether provider-patient interactions, implicit cognitive bias of providers or patients, or informed presence influences the ways Peds B-SATED keywords or global dimensions are captured or missed in primary care providers' documentation (Phelan et al., 2017). However, these findings do raise some concerns about differences in Peds B-SATED keyword rankings across racial/ethnic subpopulations. The way a provider documents sleep problems influences how patients' future providers monitor and treat their pediatric sleep problems as clinically meaningful (Honaker and Saunders, 2018). Consequently, varied provider documentation may yield differential care outcomes across racial/ethnic subpopulations of patients. Future studies and replication (e.g., other time periods, clinic settings, and multiple institutions) are needed to understand the factors that cause this differential documentation pattern that we observed across racial/ethnic subpopulations in our pediatric primary care cohort. In addition, the current study included a primarily non-Hispanic Black patient population.

Conclusion
Overall, the purpose of the study was to assess racial/ethnic differences in providers' documentation of Peds B-SATED keywords and global dimensions. Our findings identified racial/ethnic differences in our training data, using both LASSO and multilevel modeling approaches. The three primary results related to racial/ethnic bias in our NLP training data are both informative and addressable. First, we found that primary care providers' documentation of keywords in clinical notes was able to predict patients' race/ethnicity and that this difference was most observable among non-Hispanic Black and non-Hispanic white patient subpopulations. Second, the Peds B-SATED keyword rankings qualitatively differed across racial/ethnic subpopulations. Finally, the results of the mixed-effects models revealed that the presence of global dimensions in clinical notes varied between racially/ethnically minoritized patients and non-Hispanic white patients. In the end, the findings suggest that developing standardized guidelines for documenting pediatric sleep problems in pediatric primary care, in collaboration with specialty sleep providers, may be warranted. This also highlights implications for routine and multidimensional screening in pediatric primary care settings, due to providers' differential patterns of documenting Peds B-SATED keywords and global dimensions that may contribute to differences in clinical outcomes across racial/ethnic subpopulations. In pediatric sleep research, these findings have important implications for identifying a potential sleep care gap that is preventable in pediatric primary care.

FIGURE 1. The presented confusion matrix in this figure from the least absolute shrinkage and selection operator regression model revealed that providers' use of Peds B-SATED keywords in clinical notes predicted patients' race/ethnicity. The color spectrum on the right, ranging from 0 to 1,600, displays a near-zero predictive ability in purple and a high predictive ability in yellow. Ideally, the matrix would display that all predicted labels, race/ethnicity, were purple and near 0. NH, non-Hispanic; Multi, Multiracial.

FIGURE 2. Predict patient race: bootstrap (1,000 reps) 10-fold cross-validation. The figure displays the distribution of the area under the receiver operating characteristic curve (AUC ROC) for 1,000 bootstrapped iterations of 10-fold cross-validation predicting patient race. AUC ROC is a metric that reflects how well a model can discriminate between categories. A random model will have an AUC ROC score of 0.5, and a perfect model (can separate everything without error) will have a value of 1.0. Our average AUC ROC score is 0.6194, with a standard deviation of 0.0151. In no case was the AUC ROC score at or below 0.5, indicating that the model is always able to discriminate patients' race/ethnicity using the Peds B-SATED keywords from primary care providers' clinical notes.

FIGURE 3. Distribution of scores for 10-fold cross-validation on 1,000 rep bootstrap. The figure displays the specific performance of the least absolute shrinkage and selection operator regression model prediction of racial/ethnic subpopulations across precision, recall, F1-score (the harmonic mean of precision and recall), and the model's overall accuracy. NH, non-Hispanic; Multi, Multiracial.

TABLE 1
Top 10 Peds B-SATED keywords based on highest ranking coefficients.
Front Sleep.Author manuscript; available in PMC 2024 May 30.

TABLE 2
Mixed-effects logistic regression models predicting racial/ethnic differences in global Peds B-SATED dimensions. Reference group: NH white. aOR, adjusted odds ratio; NH, non-Hispanic. Note-level covariates include provider author type and pandemic status. Patient-level covariates include biological sex and age.