AUTHOR=Lyu Shiyang , Craig Simon , O'Reilly Gerard , Taniar David TITLE=The development and use of data warehousing in clinical settings: a scoping review JOURNAL=Frontiers in Digital Health VOLUME=Volume 7 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/digital-health/articles/10.3389/fdgth.2025.1599514 DOI=10.3389/fdgth.2025.1599514 ISSN=2673-253X ABSTRACT=IntroductionThe emergence of data warehousing in clinical settings has greatly enhanced data analysis capabilities, facilitating the accurate and comprehensive extraction of valuable information. This scoping review explores the contributions of data warehouses in clinical settings by analysing the strengths, challenges and implications of each type of data warehouse, with a particular focus on general and specialised types.MethodsThis scoping review adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. We searched four databases (PubMed, CINAHL, Scopus and IEEE-Xplore), identifying peer-reviewed, English-language studies from 1st January 2014 to 1st January 2024, that focus on data warehousing in healthcare, covering either general or specialised data warehouse applications. Python programming was used to extract the search results and transform the data into a tabular format for analysis.ResultsAfter removing 1,194 duplicates, 4,864 unique papers remained. Abstract screening excluded 4,590 as irrelevant, leaving 274 for full-text evaluation. In total, 27 papers met the inclusion criteria, of which 17 focused on general data warehouses and 10 on specialised data warehouses.General data warehouses were found to be primarily used to address data integration issues, particularly for electronic health record (EHR)/ Electronic medical Record (EMR) and general clinical data. These warehouses typically use a star schema architecture with online analytical processing (OLAP) and query analysis capabilities. In contrast, specialised data warehouses were focused on improving the quality of decision support by handling a wide range of data specific to diseases, using specialised architectures and advanced artificial intelligence (AI) capabilities to address the unique and complex challenges associated with these tasks.ConclusionsGeneral purpose data warehouses effectively integrate disparate data sources to provide a comprehensive view of disease management, patient care, and resource management. However, their flexibility and analytical capabilities need improvement. In contrast, specialised data warehouses are gaining popularity for their focus on specific diseases or research purposes, using advanced tools such as data mining and AI for superior analytical performance. Despite their innovative designs, these specialised warehouses face scalability challenges due to their customised nature. Addressing these challenges with advanced analytics and flexible architectures is critical.