AUTHOR=Kumar Devender, Puthusserypady Sadasivan, Dominguez Helena, Sharma Kamal, Bardram Jakob E. TITLE=CACHET-CADB: A Contextualized Ambulatory Electrocardiography Arrhythmia Dataset JOURNAL=Frontiers in Cardiovascular Medicine VOLUME=Volume 9 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2022.893090 DOI=10.3389/fcvm.2022.893090 ISSN=2297-055X ABSTRACT=The electrocardiogram (ECG) is a non-invasive tool for arrhythmia detection. In recent years, wearable ECG-based ambulatory arrhythmia monitoring has gained increasing attention. However, arrhythmia detection algorithms trained on existing public arrhythmia databases show a higher false-positive rate (FPR) when applied to such ambulatory ECG recordings. This is primarily because the existing public databases are relatively clean, as they are recorded using clinical-grade ECG devices in controlled clinical environments. They may not represent the signal quality and artifacts present in ambulatory patient-operated ECG. To help build and evaluate arrhythmia detection algorithms that can work on wearable ECG from free-living conditions, we present the design and development of the CACHET Contextualised Arrhythmia Database (CACHET-CADB), a multi-site contextualized ECG database from free-living conditions. The CACHET-CADB is a subpart of the REAFEL study, which aims at reaching the frail elderly patient to optimize the diagnosis of atrial fibrillation. In contrast to the existing databases, CACHET-CADB provides, along with the ECG, a continuous recording of patients’ contextual data such as activities, body positions, movement accelerations, symptoms, stress level, and sleep quality. These contextual data can aid in improving machine/deep learning-based automated arrhythmia detection algorithms on patient-operated wearable ECG. 
Currently, CACHET-CADB has 259 days of contextualized ECG recordings from 24 patients, and 1602 manually annotated 10-second heart-rhythm samples. The patient’s ambulatory context information (activities, movement acceleration, body position, etc.) is extracted cumulatively for every 10-second interval. Our analysis shows that nearly 11% of the ECG data in the database is noisy. A software toolkit for the use of the CACHET-CADB is also provided.
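
The 10-second segmentation described above can be illustrated with a minimal Python sketch. It is not the toolkit shipped with CACHET-CADB: the function name `segment_with_context`, the sampling-rate parameters, and the choice of a majority vote to summarize the activity label within each window are all assumptions made for illustration. Only the 10-second window length comes from the abstract.

```python
from collections import Counter

WINDOW_SECONDS = 10  # window length stated in the abstract


def segment_with_context(ecg, activity_labels, ecg_hz, context_hz):
    """Split an ECG trace into 10-second windows and attach one activity label each.

    ecg             -- sequence of ECG samples (hypothetical input layout)
    activity_labels -- sequence of activity labels sampled at context_hz
    ecg_hz          -- ECG sampling rate in Hz (assumed integer, caller-supplied)
    context_hz      -- context sampling rate in Hz (assumed integer, caller-supplied)
    """
    ecg_step = WINDOW_SECONDS * ecg_hz
    ctx_step = WINDOW_SECONDS * context_hz
    # Keep only windows fully covered by both streams.
    n_windows = min(len(ecg) // ecg_step, len(activity_labels) // ctx_step)

    segments = []
    for i in range(n_windows):
        ecg_window = ecg[i * ecg_step:(i + 1) * ecg_step]
        ctx_window = activity_labels[i * ctx_step:(i + 1) * ctx_step]
        # Assumption: summarize context by the most frequent label in the window.
        dominant = Counter(ctx_window).most_common(1)[0][0]
        segments.append({
            "start_s": i * WINDOW_SECONDS,
            "ecg": ecg_window,
            "activity": dominant,
        })
    return segments


if __name__ == "__main__":
    # Toy data: 20 s of ECG at 2 Hz and activity labels at 1 Hz.
    ecg = list(range(40))
    labels = ["walk"] * 7 + ["sit"] * 13
    for seg in segment_with_context(ecg, labels, ecg_hz=2, context_hz=1):
        print(seg["start_s"], seg["activity"], len(seg["ecg"]))
```

Trailing samples that do not fill a complete 10-second window are dropped in this sketch; a real pipeline might pad or keep them instead.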