Are U.S. graduate curricula ready to prepare social data scientists for the AI era?

Dong, Yixiao; Baral, Deodatta; Baral, Kushmakar

doi:10.3389/feduc.2025.1657651

BRIEF RESEARCH REPORT article

Front. Educ., 12 January 2026

Sec. Higher Education

Volume 10 - 2025 | https://doi.org/10.3389/feduc.2025.1657651

Yixiao Dong¹,²*

Deodatta Baral²

Kushmakar Baral²

¹Department of Education, University of California, Santa Barbara, Santa Barbara, CA, United States
²Department of Research Methods and Information Science, University of Denver, Denver, CO, United States

The evolving skill demands of the data science workforce present unique challenges for individuals trained in the social science disciplines. This study examines the readiness of U.S. graduate programs in preparing social data scientists for the AI era. We collected and analyzed publicly available coursework plans (n = 97) from graduate programs at research universities in the U.S. that focus on training social data scientists. Required skills for data scientists were identified through a random sample of current job postings (n = 30) on LinkedIn and cross-validated with findings from the relevant literature. Using Python-based web scraping and text content analysis, we identified the 10 most in-demand skills within the data science industry and conducted a binary coding of whether each program offers coursework relevant to these skills. These 10 binary indicators were subsequently analyzed using Rasch modeling. The results indicate notable gaps between graduate curricula and industry expectations, and also highlight the need to reform graduate education to better prepare social data scientists for the new demands of the AI era.

Introduction

Consider this question: What skills are needed for a successful career in data science? Twenty years ago, responses would have emphasized data management, research methodology, and statistical modeling expertise. Today, while these skills remain fundamental, the list of skills in people’s responses may expand considerably to include machine learning, natural language processing, Python programming, and artificial intelligence (AI) competencies. Data science has flourished in the 21st century (Donoho, 2017; Schwab-McCoy et al., 2021). The integration of AI techniques is propelling this field to new heights, which has triggered substantial changes in various ways (Dong, 2025; Ho, 2024; Liu et al., 2024; Luan et al., 2020; Min et al., 2024). Concurrently, the skill sets required for data scientists are shifting, particularly for those trained in traditional social science disciplines (e.g., Educational Statistics, Quantitative Psychology, and Data Analytics for Social Sciences).

Data scientists represent a broad and somewhat heterogeneous population, given their diverse training in disciplines such as social science, medical science, and computer science (Donoho, 2017). Programs and curricula in computer science (or engineering schools) are more likely to keep up with rapid advancements, as they are the native residents who initiate the development of AI techniques and make early contributions to introducing AI to the data science field. In contrast, those from the social science family rarely have the first-mover advantage and might lag behind new shifts (Luan et al., 2020).

In effect, training in social science disciplines is still striving to catch up with the advanced but conventional skill sets demanded prior to the AI boom in the early 2020s. For example, Everson (2022) identifies substantial statistical skills gaps for professors within schools of education, and these gaps are evident in both advanced methods (e.g., propensity score matching, structural equation modeling, and item response theory) and software packages (e.g., R, SAS, and Stata) needed for training educational data scientists. The challenges derived from these gaps could be even more significant for those in minority-serving colleges and universities, where they tend to have less federal and financial support (Brown, 2013). The constantly evolving and dynamic nature of data science has been a major hurdle for faculty teaching up-to-date content (Schwab-McCoy et al., 2021). Now, AI is reshaping the landscape in data science, as well as the needs of the associated industrial labor market (Hijazi and Alfaki, 2020; Liu et al., 2024), which may create new challenges or magnify existing ones for training data scientists in the social science fields.

Current study

Program curricula are undoubtedly a fundamental component of data science education (Gundlach and Ward, 2021; Hardin et al., 2021; Nolan and Temple Lang, 2010; Schwab-McCoy et al., 2021). The present study aims to examine the curriculum readiness of graduate programs in preparing social data scientists for the AI era, that is, to investigate whether current graduate curricula in social science disciplines adequately cover the skills required for data scientists in today’s evolving landscape. The objective of this study is not to develop a new measure of curriculum readiness, which would typically require comprehensive psychometric validation (e.g., Boateng et al., 2018). Rather, we aimed to generate complementary and additional evidence from a measurement-analog model to address our research goal. Furthermore, identifying potential gaps in existing curricula is a crucial step for advancing data science education (Everson, 2022; Hardin et al., 2021), as well as for promoting effective curriculum reforms. Thus, this research also contributes to broader discussions on how to better align graduate training with the evolving demands placed on data scientists in the AI era.

Methods

The Methods section outlines the sources of curriculum data from graduate programs, data collection procedures (e.g., web scraping), and a description of the main analyses (i.e., text analysis and Rasch modeling) employed in the current study.

Data sources and collection procedures

To address the research purpose, we collected data from two sources: (1) web-based curricula data and (2) skill-requirement data for data scientists in the AI era, drawn from job postings.

For the curricula data, we gathered all publicly available Coursework Plans (CWPs, n = 97) from U.S. universities’ graduate programs aimed at training social data scientists. These programs included Quantitative Psychology, Measurement and Quantitative Methods, Educational Statistics, and so forth. These social science programs were chosen for a shared goal of equipping graduates with the quantitative and methodological skills for a career in data science. Additionally, we targeted graduate-level programs at R1 and R2 universities (i.e., those with high or very high research activity). This is because research universities place greater emphasis on graduate-level education, whereas teaching universities typically focus more on undergraduate education. Although graduate training is not strictly required for being a data scientist, graduate programs often offer deeper training in areas such as programming, advanced statistics, or machine learning, which are highly valued in data science (Jiang and Chen, 2022).

Based on a recent version of the Carnegie Classification of Institutions of Higher Education (2024), we compiled a list of website URLs for the program CWPs through a manual search. Specifically, two researchers scrutinized each research university’s websites to locate qualifying programs and CWPs. After that, web scraping was performed using Python (Mitchell, 2018) to extract the content of coursework plans from each URL. The web scraping gathered coursework titles and descriptions, skill emphases, and training goals of programs from each website.
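The extraction step can be sketched as follows. This is a minimal illustration that uses only the Python standard library, not the authors' actual scraper; the class `TextExtractor`, the functions `html_to_text` and `fetch_cwp_text`, and the sample page (including its course code) are hypothetical names and data introduced here.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_cwp_text(url, timeout=30):
    """Download one coursework-plan page and return its visible text."""
    with urlopen(url, timeout=timeout) as response:
        html = response.read().decode("utf-8", errors="replace")
    return html_to_text(html)


# Hypothetical page content, used here so the sketch runs without network access.
sample_html = """
<html><head><style>p {color: red;}</style></head>
<body><h1>Quantitative Methods PhD</h1>
<p>STAT 6010: Structural Equation Modeling</p>
<script>trackVisit();</script></body></html>
"""
sample_text = html_to_text(sample_html)
```

In practice, `fetch_cwp_text` would be called once per URL in the manually compiled list, and the returned text stored alongside the program name for the later keyword analysis.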

For the skill requirement data, we collected the essential or required skills for current data scientists by examining a sample of new job postings in 2024 on LinkedIn. The search terms for locating job postings on LinkedIn included “Data Scientist,” “AI or Artificial Intelligence Research Scientist,” and “Social Science.” In filtering search outcomes, we selected multiple work experience options (“Entry,” “Senior,” and “Manager”) to ensure that positions requiring different levels of data science skill proficiency were represented. Job locations were restricted to the U.S., given the study population. We randomly sampled 30 data scientist job postings, which included postings from industry leaders such as Lockheed Martin, Udemy, Gusto, Deloitte, Google, DoorDash, and UC Health. This sample encompassed organizations from various sectors that may hire data scientists with a social science background, including technology, healthcare, education, and consulting. For each job posting, skill sets (e.g., SQL, Python, and machine learning) were collected and coded based on the specific requirements (e.g., required skills or qualifications) outlined in the job descriptions.
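The sampling and tallying of coded skills might look like the sketch below. The postings, the seed, and the function names `sample_postings` and `skill_frequencies` are hypothetical and chosen only for illustration; the study's actual postings and sampling procedure are not reproduced here.

```python
import random
from collections import Counter

# Hypothetical coded postings: each entry is the set of skills one job ad requires.
coded_postings = [
    {"python", "sql", "machine learning"},
    {"python", "statistics", "research"},
    {"sql", "statistics"},
    {"python", "machine learning", "cloud computing"},
    {"research", "statistics", "python"},
]


def sample_postings(postings, k, seed=2024):
    """Draw k postings without replacement, reproducibly (seed is an assumption)."""
    rng = random.Random(seed)
    return rng.sample(postings, k)


def skill_frequencies(postings):
    """Count in how many postings each skill appears."""
    counts = Counter()
    for skills in postings:
        counts.update(skills)
    return counts


sampled = sample_postings(coded_postings, k=3)
frequencies = skill_frequencies(coded_postings)
top_skills = [skill for skill, _ in frequencies.most_common(2)]
```

Storing each posting as a set means a skill mentioned several times in one ad is counted once, so the tallies reflect the number of postings that require a skill rather than raw word counts.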

Overview of analysis

The present study primarily utilized content analysis of text through Python (Sarkar, 2016), followed by a Rasch analysis of the extracted information and recoded indicators representing curriculum readiness for training social data scientists. First, we examined the frequency of skill-related keywords in 30 online data science job postings on LinkedIn. These frequencies highlighted the skills currently demanded or preferred by employers in the U.S. data science industry. Subsequently, using the compiled list of 10 key skills, another text analysis was implemented to analyze the CWP content of the 97 graduate programs in social data science to identify gaps between industry requirements and graduate training. Each CWP was coded via Python following a dichotomous coding scheme to indicate whether each identified skill was reflected in the program’s coursework (1 = at least one course containing keywords matching the skill; 0 = the skill is not reflected in any coursework).
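The dichotomous coding scheme can be illustrated with a short sketch. The keyword lists in `SKILL_KEYWORDS`, the function `code_program`, and the example coursework text are assumptions made for this illustration; the study's actual keyword dictionaries are not given in the text.

```python
import re

# Hypothetical keyword lists for a subset of the 10 skills.
SKILL_KEYWORDS = {
    "machine learning": ["machine learning", "deep learning"],
    "python": ["python"],
    "sql": ["sql"],
    "statistics": ["statistics", "statistical"],
    "research": ["research"],
}


def code_program(cwp_text, skill_keywords=SKILL_KEYWORDS):
    """Return {skill: 1 or 0} for one coursework plan (CWP)."""
    text = cwp_text.lower()
    coded = {}
    for skill, keywords in skill_keywords.items():
        # Word boundaries keep short keywords such as "sql" from matching
        # inside longer, unrelated tokens.
        hit = any(
            re.search(r"\b" + re.escape(keyword) + r"\b", text)
            for keyword in keywords
        )
        coded[skill] = int(hit)
    return coded


example_cwp = "Required: Research Design; Statistical Inference I; Intro to Python"
example_codes = code_program(example_cwp)
```

A single keyword hit anywhere in the plan yields a code of 1, which mirrors the "at least one course" rule but says nothing about how deeply the skill is taught.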

Next, we entered the indicators into a dichotomous Rasch model to quantify the curriculum readiness of each program and to examine the alignment between current social data science graduate curricula and industry demands. The current study does not aim to comprehensively develop or validate a measure via Rasch modeling; however, certain Rasch analysis results (e.g., Wright maps; Boone et al., 2014) can effectively reveal and visualize potential skill gaps between graduate training and industry needs. In this study, three Rasch analyses, including unidimensionality, item fit, and construct coverage (i.e., Wright map), were conducted using Winsteps 5.3.1 (Linacre, 2022). It is important to note that all CWPs were collected and analyzed by the summer of 2024, by which time all graduate programs were expected to have released their coursework for the most recent academic year (i.e., 2024–2025).
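For reference, the dichotomous Rasch model is conventionally written as below. The notation is introduced here for illustration and is not taken from the study's analysis files: X_{ni} is the 0/1 indicator for program n and skill i, θ_n is the curriculum readiness of program n, and δ_i is the difficulty of skill i, both expressed in logits.

```latex
P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
```

Under this form, a program whose readiness equals a skill's difficulty has a probability of 0.5 of offering coursework on that skill, and the probability falls as the difficulty rises above the program's readiness.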

Results

Here, we first present descriptive findings that identify the key skills required for social data scientists in the AI age and assess how well these skills are covered in the analyzed social science graduate curricula. Rasch modeling results are then provided to further demonstrate the gaps between current industry needs and the skills taught in social science graduate programs for training data scientists.

Demanded skills of data scientists in the AI age

A total of ten key skills were identified from job postings: machine learning (including deep learning), Bayesian analysis, cloud computing, artificial intelligence, statistics, algorithms, programming, Python, SQL, and research. This list was also cross-validated with skills suggested in the literature for data science (e.g., Ismail and Abidin, 2016; Li et al., 2021) to ensure its coverage and representativeness. These skills predominantly include specialized technical and programming skills such as Python, deep learning, machine learning, and artificial intelligence, as well as general research and statistics skills valued in traditional social data scientist training. The results highlight the high demand for technical skills among social data scientists (Costa and Santos, 2017) and reflect the evolving nature of the data science field.

Gaps between industry needs and social science graduate curricula for data scientists

Table 1 summarizes the number and percentage of programs offering courses that cover each data science skill identified from the analyzed job posts, ranked from the lowest to the highest percentages. The majority of graduate programs offer courses covering content related to research (97.94%) and statistics (86.60%). However, beyond these traditional skills, fewer than 10% of the programs provide training in more advanced technical skills or tools (e.g., machine learning, algorithms, and cloud computing).

Table 1. Number and percentage of programs offering courses that cover each data science skill.

A total skill-coverage score was calculated for each program based on the 10 dichotomously coded skill variables (i.e., whether or not each of the 10 skills was covered by each program). The total score had a possible range from 0 to 10, with lower course-skill coverage scores indicating a more severe misalignment between program training and industrial demands. Among the 97 programs, the mean score was 2.13 out of 10 (SD = 0.78), with a median of 2, indicating that most programs’ current coursework covers only a limited number of skills (typically traditional research and statistics skills) required for data scientists in the AI era.
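The score construction can be sketched as follows. The three programs and their codes are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which SD formula was applied.

```python
from statistics import mean, median, stdev

# Hypothetical 0/1 codes for three programs across the 10 skills.
coded_programs = {
    "program_a": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "program_b": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "program_c": [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}


def coverage_score(codes):
    """Total skill-coverage score: the number of skills coded 1 (range 0 to 10)."""
    return sum(codes)


scores = [coverage_score(codes) for codes in coded_programs.values()]
summary = {
    "mean": mean(scores),
    "sd": stdev(scores),  # sample SD; an assumption for this sketch
    "median": median(scores),
}
```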

Psychometric evidence and implications from Rasch modeling

In addition to the descriptive statistics presented above, Rasch modeling was applied to quantify and investigate the curriculum readiness of each program for training social data scientists, using nine of the 10 dichotomous skillset indicators. The “cloud computing” indicator was excluded from the analysis because it showed no variance: no program involved in this study offered coursework covering this particular skill.

We first examined the dimensionality of the curriculum readiness items through a principal components analysis of residuals (PCAR). The Rasch dimension accounted for 86.1% of the variance in the observations, while the first contrast in the residuals (i.e., the largest secondary dimension) had a relatively small eigenvalue of 2.78 and explained only 4.3% of the variance. This indicates that the Rasch dimension explained the overwhelming majority of variance in the data, and including a secondary dimension would contribute minimally to explaining additional variance. Collectively, these results provide strong evidence of the unidimensionality of the measure (Linacre, 2025).

Table 2 summarizes the mean square (MNSQ) and standardized (ZSTD) item fit statistics. The “Research” indicator exhibited an outfit MNSQ of 9.9 and a ZSTD of 9.91, both of which substantially exceed the typical acceptable fit range (e.g., MNSQ between 0.6 and 1.4; Wright and Linacre, 1994; ZSTD between −2 and 2; Bond and Fox, 2015). The misfit of the “Research” indicator suggests that it is not an appropriate item for representing curriculum readiness in the context of AI data science training. This is likely because nearly all collected graduate programs offer research-related coursework, making it ineffective at differentiating program-level readiness.
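To show how a single unexpected response can inflate an outfit statistic, the sketch below computes infit and outfit mean squares from their textbook definitions for one item. It is not a reproduction of the Winsteps computation, and the readiness values, item difficulty, and responses are hypothetical.

```python
import math


def item_fit(responses, thetas, delta):
    """Return (infit_mnsq, outfit_mnsq) for one dichotomous item.

    responses: 0/1 codes, one per program
    thetas:    program readiness measures in logits
    delta:     item (skill) difficulty in logits
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = 1.0 / (1.0 + math.exp(-(theta - delta)))  # Rasch expected score
        variances.append(p * (1.0 - p))
        squared_residuals.append((x - p) ** 2)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: information-weighted mean square.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical case: a very easy item (delta = -8) that one of five
# programs at -3.5 logits unexpectedly does not cover.
infit, outfit = item_fit([1, 1, 1, 1, 0], [-3.5] * 5, delta=-8.0)
```

In this toy case, the one program that lacks the very easy item contributes a large squared standardized residual, which pushes the outfit mean square far above the 0.6 to 1.4 range cited above.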

Table 2. Summary of item fit statistics.

We further examined the alignment between graduate curricula and industry skill demands using a Wright map (see Figure 1). The curriculum readiness level of programs was placed on the left side of the continuum, while the difficulty levels of the nine skill indicators were plotted on the right. Within the current research context, a more difficult skill indicator generally indicates that programs are less likely to offer coursework covering that skill. From Figure 1, most skill indicators (all except “Statistics” and “Research”) cluster near the upper range of the continuum, between +1.5 and +4 logits. In contrast, programs are concentrated toward the lower end, around the −3.5 logit position. These results echo the preliminary descriptive findings and further confirm the existence of a substantial gap between the skills taught in U.S. social science graduate curricula and those demanded by data science industry jobs. Although the general misalignment was evident, several programs demonstrated better alignment. For example, the “Statistics/Machine Learning Joint PhD” program at Carnegie Mellon University (CMU) was positioned near +3 logits at the high end of the continuum. The program’s CWP covers six of the ten outlined skills and has shown the closest alignment between its curriculum and industry demands among the 97 programs analyzed. This finding suggests that a joint training model that integrates social science and computer science could be an innovative and effective strategy for preparing social data scientists in the AI era.
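The size of the gap on the Wright map can be made concrete with a small calculation under the dichotomous Rasch model. The logit values below are illustrative readings of the positions described in the text (programs near −3.5, most skills from +1.5 upward, the CMU program near +3), not estimates taken from the analysis output, and `offer_probability` is a name introduced here.

```python
import math


def offer_probability(readiness, difficulty):
    """Model-implied probability that a program covers a skill (logit inputs)."""
    return 1.0 / (1.0 + math.exp(-(readiness - difficulty)))


# Illustrative positions on the logit scale.
typical_program = -3.5
high_program = 3.0
easiest_technical_skill = 1.5

p_typical = offer_probability(typical_program, easiest_technical_skill)
p_high = offer_probability(high_program, easiest_technical_skill)
```

With these values, a typical program has a probability of roughly 0.007 of covering even the easiest of the technical skills, whereas a program near +3 logits has a probability of roughly 0.82.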

A vertical item-person map that plots the distribution of difficulty levels for the nine skill indicators and the curriculum readiness level of graduate programs. Skills include SQL, AI, Python, ML (Machine Learning), Bayes, Programming, Statistics, Algorithms, and Research. Measures range from four to negative eleven logits. Each hash represents five programs, and each dot represents one to four programs.

Figure 1. Wright map of curriculum readiness for preparing social data scientists in the AI era.

Discussion

Building on the results above, we discuss the observed misalignment between academic preparation and industry demands in social data science and elaborate on several possible pathways to narrow this gap. The limitations and future directions of the current work are also included in this section.

The growing misalignment

When we look back at the long-standing conversations around academic preparation versus industry demands (e.g., Everson, 2022; Stone et al., 2009; Trauth et al., 1993), gaps seem to be inevitable. However, the misalignment between social data scientist curricula and industrial demands observed in the current study appears to be more substantial than usual. Graduate programs are expected to provide foundational learning opportunities to help students build expertise in data science. Unfortunately, the skill gap was strikingly large and appeared unlikely to be bridged through work experience alone, as most programs do not offer the emerging skills (e.g., machine learning, cloud computing, and programming) required by the data science labor market in the current AI era. This might result in employment challenges for graduates, as well as raise concerns about the value of current higher education.

Moreover, the actual gap could be larger than what the present work observed for two reasons. First, it is a stringent assumption to expect all graduates to master the skills by just taking a course that covers relevant content. Curricula are fundamental but insufficient for students to build expertise in data science, especially in the age of AI (Spanjaard et al., 2018). Additionally, the current study investigates the graduate curricula of R1 and R2 universities. When these generally better-resourced institutions in graduate education lag behind, others might struggle even more to provide the necessary training.

The arrival of AI has clearly played a role in enlarging this gap. AI gained more popularity with the widespread use of ChatGPT in 2022, but it has permeated data science research and practice (Dong, 2025) and affected the industrial labor market (Liu et al., 2024) for decades. As Liu et al. (2024) highlighted, since 2010, there has been a growing emphasis on “hard” technical skills within AI-related data scientist roles. The current study echoes this finding, showing that technical skills are highly demanded. Therefore, addressing this gap, especially in the area of hard skills, is essential for graduates to land jobs in the data science industry.

To narrow the gap: curricula reform or training mode reform?

To narrow the gap, reforming graduate curricula to better align with industrial needs seems to be a necessary step. However, this task could be challenging if we only seek solutions within social science disciplines. College faculty are the main agents to deliver curricula. As Everson (2022) noted, even the current faculty in social science commonly struggle with catching up on programming skills and have high demands for related professional development. A shortcut could be hiring new faculty with developed AI expertise (or with computer science backgrounds), but the immediate hiring of new faculty does not seem to be a viable strategy for every university. The existing tenure track system makes faculty turnover slower than in industrial or corporate sectors, which means most programs could take years or decades to accomplish a faculty iteration. By that time, dynamic industrial demands would have shifted again.

Then, reforming the mode of program training could be a more promising path to narrow the identified gap (Maassen and Cloete, 2006). The direction of reform may integrate interdisciplinary and technical training by collaborating with other schools (e.g., schools of engineering) that already have experts and talents in AI-related skills. The CMU joint PhD program, which offers students in social science both traditional statistical coursework and technical machine learning content, could be a good example. More importantly, such a mode might be practically scalable because it primarily reorganizes and reconciles already existing resources within a university. In addition to formalizing a joint program, incorporating individualized elective or cognate courses from other disciplines into students’ CWPs can be an alternative but more flexible strategy. Notably, such an approach often requires more advising support to help students identify suitable courses for their skill development. Meanwhile, it is generally recommended to integrate practical or experiential learning opportunities into programs to further align academic preparation with industry demands (Kolb, 2014). Such changes may collectively improve graduates’ employability and readiness and ensure that data science programs in social science disciplines remain competitive in the AI era.

Limitations and future directions

Given the samples of graduate programs and job posts, research conclusions are limited to social science graduate programs and the social data science industry in the U.S. Future research may examine the generalizability of the study findings in a global setting where non-U.S. graduate programs and job markets are included in the analyses. In particular, it would be beneficial to increase the number of job posts analyzed. The current study used Python-based text analysis to efficiently analyze course webpages and job posts, which may limit the interpretive depth of available data. Certain research findings (e.g., the skill demands of industry and program training coverage) warrant cross-validation through further investigation, such as in-depth interviews with industry leaders or employers in the data science field regarding their specific skill demands for data scientist employees and their perspectives on how to better align graduate training in social science with industry needs. Additionally, the keyword matching approach applied in the current work may assess the course coverage of skills but could be less effective in understanding the depth of training for each skill. Some skills (e.g., AI literacy) cannot be adequately represented or captured by one or two keywords. Future research should consider developing more sophisticated coding schemes to capture related skills based on broader textual contexts.

Data availability statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found at: https://osf.io/vmwzq/.

Author contributions

YD: Project administration, Visualization, Formal analysis, Methodology, Resources, Investigation, Conceptualization, Writing – review & editing, Supervision, Software, Writing – original draft. DB: Investigation, Software, Writing – review & editing, Writing – original draft, Formal analysis, Validation, Data curation. KB: Software, Investigation, Writing – review & editing, Writing – original draft, Data curation, Methodology.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was used in the creation of this manuscript. Tools for copy-editing this paper (e.g., Grammarly) may utilize generative AI engines; however, no generative AI applications were used to produce any original content in this paper.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Boateng, G. O., Neilands, T. B., Frongillo, E. A., Melgar-Quiñonez, H. R., and Young, S. L. (2018). Best practices for developing and validating scales for health, social, and behavioral research: a primer. Front. Public Health 6:149. doi: 10.3389/fpubh.2018.00149

Bond, T. G., and Fox, C. M. (2015). Applying the Rasch Model: Fundamental Measurement in the Human Sciences (3rd edition). Psychology Press.

Boone, W. J., Staver, J. R., and Yale, M. S. (2014). “Wright maps: first steps” in Rasch Analysis in the Human Sciences. eds. W. J. Boone, J. R. Staver, and M. S. Yale, 111–136.

Brown, M. C. II. (2013). The declining significance of historically black colleges and universities: relevance, reputation, and reality in Obamamerica. J. Negro Educ. 82, 3–19. doi: 10.7709/jnegroeducation.82.1.0003

Carnegie Classification of Institutions of Higher Education (2024). Carnegie classifications. The Carnegie Foundation for the Advancement of Teaching. Available online at: https://carnegieclassifications.acenet.edu/

Costa, C., and Santos, M. Y. (2017). The data scientist profile and its representativeness in the European e-competence framework and the skills framework for the information age. Int. J. Inf. Manag. 37, 726–734. doi: 10.1016/j.ijinfomgt.2017.07.010

Dong, Y. (2025). Pre-uniform measures in the artificial intelligence era. Curr. Psychol. 44, 7919–7933. doi: 10.1007/s12144-025-07374-1

Donoho, D. (2017). 50 years of data science. J. Comput. Graph. Stat. 26, 745–766. doi: 10.1080/10618600.2017.1384734

Everson, K. C. (2022). Statistical skills gaps of professors of education at U.S. universities and HBCUs. J. Stat. Data Sci. Educ. 30, 45–53. doi: 10.1080/26939169.2022.2034488

Gundlach, E., and Ward, M. D. (2021). The data mine: enabling data science across the curriculum. J. Stat. Data Sci. Educ. 29, S74–S82. doi: 10.1080/10691898.2020.1848484

Hardin, J., Horton, N. J., Nolan, D., and Lang, D. T. (2021). Computing in the statistics curricula: a 10-year retrospective. J. Stat. Data Sci. Educ. 29, S4–S6. doi: 10.1080/10691898.2020.1862609

Hijazi, R., and Alfaki, I. (2020). Reforming undergraduate statistics education in the Arab world in the era of information. J. Stat. Educ. 28, 75–88. doi: 10.1080/10691898.2019.1705943

Ho, A. D. (2024). Artificial intelligence and educational measurement: opportunities and threats. J. Educ. Behav. Stat. 49, 715–722. doi: 10.3102/10769986241248771

Ismail, N. A., and Abidin, W. Z. (2016). Data scientist skills. IOSR J. Mob. Comp. Appl. 3, 52–61. doi: 10.9790/0050-03045261

Jiang, H., and Chen, C. (2022). Data science skills and graduate certificates: a quantitative text analysis. J. Comput. Inf. Syst. 62, 463–479. doi: 10.1080/08874417.2020.1852628

Kolb, D. A. (2014). Experiential Learning: Experience as the Source of Learning and Development. Upper Saddle River, NJ: FT press.

Li, G., Yuan, C., Kamarthi, S., Moghaddam, M., and Jin, X. (2021). Data science skills and domain knowledge requirements in the manufacturing industry: a gap analysis. J. Manuf. Syst. 60, 692–706. doi: 10.1016/j.jmsy.2021.07.007

Linacre, J. M. (2022). Winsteps® (version 5.3.1) [Computer software]. Winsteps.com.

Linacre, J. M. (2025). Winsteps Rasch Measurement Computer Program User’s Guide. Winsteps.com.

Liu, J., Chen, K., and Lyu, W. (2024). Embracing artificial intelligence in the labour market: the case of statistics. Humanit. Soc. Sci. Commun. 11:1112. doi: 10.1057/s41599-024-03557-6

Luan, H., Geczy, P., Lai, H., Gobert, J., Yang, S. J. H., Ogata, H., et al. (2020). Challenges and future directions of big data and artificial intelligence in education. Front. Psychol. 11:580820. doi: 10.3389/fpsyg.2020.580820

Maassen, P., and Cloete, N. (2006). “Global reform trends in higher education” in Transformation in Higher Education: Global Pressures and Local Realities. eds. N. Cloete, P. Maassen, R. Fehnel, T. Moja, T. Gibbon, and H. Perold (The Netherlands: Springer).

Min, J., Song, X., Zheng, S., King, C. B., Deng, X., and Hong, Y. (2024). Applied statistics in the era of artificial intelligence: a review and vision. arXiv. doi: 10.48550/arXiv.2412.10331

Mitchell, R. (2018). Web Scraping with Python: Collecting More Data from the Modern Web. Sebastopol, CA: O'Reilly Media.

Nolan, D., and Temple Lang, D. (2010). Computing in the statistics curricula. Am. Stat. 64, 97–107. doi: 10.1198/tast.2010.09132

Sarkar, D. (2016). Text Analytics with Python, vol. 2. New York, NY: Apress.

Schwab-McCoy, A., Baker, C. M., and Gasper, R. E. (2021). Data science in 2020: computing, curricula, and challenges for the next 10 years. J. Stat. Data Sci. Educ. 29, S40–S50. doi: 10.1080/10691898.2020.1851159

Spanjaard, D., Hall, T., and Stegemann, N. (2018). Experiential learning: helping students to become ‘career-ready’. Australas. Mark. J. 26, 163–171. doi: 10.1016/j.ausmj.2018.04.003

Stone, K. B., Kaminski, K., and Gloeckner, G. (2009). Closing the gap: education requirements of the 21st century production workforce. J. Ind. Teach. Educ. 45, 5–33.

Trauth, E. M., Farwell, D. W., and Lee, D. (1993). The IS expectation gap: industry expectations versus academic preparation. MIS Q. 17:293. doi: 10.2307/249773

Wright, B. D., and Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch Meas. Trans. 8, 370–371.

Keywords: artificial intelligence, coursework, curriculum gap, data science, Rasch modeling, text analysis, web-scraping

Citation: Dong Y, Baral D and Baral K (2026) Are U.S. graduate curricula ready to prepare social data scientists for the AI era? Front. Educ. 10:1657651. doi: 10.3389/feduc.2025.1657651

Received: 01 July 2025; Revised: 06 December 2025; Accepted: 23 December 2025;
Published: 12 January 2026.

Edited by:

Barbara Jones, Bibliometrica Limited, United Kingdom

Reviewed by:

Francisco Rafael Trejo-Macotela, Universidad Politécnica de Pachuca, Mexico
Rany Sam, National University of Battambang, Cambodia

Copyright © 2026 Dong, Baral and Baral. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yixiao Dong, ydong@ucsb.edu