3D gait analysis in children using wearable sensors: feasibility of predicting joint kinematics and kinetics with personalized machine learning models and inertial measurement units

Introduction: Children's walking patterns evolve with age, exhibiting less repetitiveness at a young age and more variability than adults. Three-dimensional gait analysis (3DGA) is crucial for understanding and treating lower limb movement disorders in children, traditionally performed using Optical Motion Capture (OMC). Inertial Measurement Units (IMUs) offer a cost-effective alternative to OMC, although challenges like drift errors persist. Machine learning (ML) models can mitigate these issues in adults, prompting an investigation into their applicability to a heterogeneous pediatric population. This study aimed at 1) quantifying personalized and generalized ML models' performance for predicting gait time series in typically developed (TD) children using IMU data, 2) comparing random forest (RF) and convolutional neural network (CNN) models' performance, and 3) finding the optimal number of IMUs required for accurate predictions.
Methodology: Seventeen TD children, aged 6 to 15, participated in data collection involving OMC, force plates, and IMU sensors. Joint kinematics and kinetics (targets) were computed from OMC and force plates' data using OpenSim. Tsfresh, a Python package, extracted features from raw IMU data. Each target's ten most important features were input in the development of personalized and generalized RF and CNN models. This procedure was initially conducted with 7 IMUs placed on all lower limb segments and then performed using only two IMUs on the feet.
Results: Findings suggested that the RF and CNN models demonstrated comparable performance. RF predicted joint kinematics with a 9.5% and 19.9% NRMSE for personalized and generalized models, respectively, and joint kinetics with an NRMSE of 10.7% for personalized and 15.2% for generalized models in TD children. Personalized models provided accurate estimations from IMU data in children, while generalized models lacked accuracy due to the limited dataset. Furthermore, reducing the number of IMUs from 7 to 2 did not affect the results, and the performance remained consistent.
Discussion: This study proposed a promising personalized approach for gait time series prediction in children, involving an RF model and two IMUs on the feet.


1 Introduction
Children's walking patterns are distinct from those of adults and evolve with age (Cigali et al., 2011; Senden et al., 2023). At a young age, gait tends to be less repetitive and differs from that of adults (Ganley and Powers, 2005; Jain et al., 2016), emphasizing the need to build normative data for a pediatric population (Ganley and Powers, 2005). The differences in school-aged children's walking patterns are often attributed to their significant musculoskeletal changes (Bari et al., 2023), given that they are in a critical stage of growth and development (Onis et al., 2007). A recent study (Bach et al., 2021) suggested that the degree of gait maturity does not always directly relate to the chronological age of the child. This finding underscores the complexity of assessing gait development in children. Moreover, it has been shown that, compared to adults, young individuals exhibit more variable kinematic patterns when performing repetitive movements (Kuhtz-Buschbeck et al., 1996). Additionally, the within-session variability of electromyography (EMG) waveforms (muscle activation level) in children was approximately twice that in adults (Granata et al., 2005), affecting their joint kinematics and kinetics within a single session.
Three-dimensional gait analysis (3DGA) is a valuable tool for understanding a child's gait pattern and how it compares with normative data of typically developed children's gait (Ito et al., 2022; Bari et al., 2023). The insights gained from 3DGA in children affected by lower limb movement disorders serve as a foundation for clinical assessment to target personalized treatment and improve their walking patterns (Bari et al., 2023). Considering the unique challenges and broad spectrum of motor impairments in this population, addressing developmental challenges requires a tailored approach. The current gold standard for performing 3DGA involves Optical Motion Capture (OMC) along with force plates due to its high accuracy and robustness (Chester et al., 2005). However, the high cost of OMC systems and the time-consuming data post-processing needed lead to long waitlists for patients and sometimes long-distance travel for families coming from rural areas (Aminian and Najafi, 2004).
Wearable sensors like Inertial Measurement Units (IMUs) are potential alternatives to the OMC system, making it possible to capture 3DGA in rural areas and natural environments (Gurchiek et al., 2019). Unlike OMC systems, IMUs are inexpensive, small, and lightweight and can be used outside the clinic by wearing them or attaching them to the children's limbs or pelvis (Aminian and Najafi, 2004). Although IMU sensors are very promising in motion analysis, challenges such as drift errors that grow over time, which result in less accurate estimations, still need to be overcome (Aminian and Najafi, 2004). Moreover, traditional approaches, such as sensor fusion algorithms (Sabatini, 2006; Madgwick et al., 2011), as well as the tool package OpenSense (Al Borno et al., 2022), require placing an IMU on each body segment for accurate kinematics calculations and functional calibration.
The challenges associated with processing IMU data in adult populations have been addressed in previous studies (Findlow et al., 2008; Luu et al., 2014; Dorschky et al., 2020; Giarmatzis et al., 2020; Lim et al., 2020; Stetter et al., 2020; Mundt et al., 2021; Sharifi Renani et al., 2021; Tan et al., 2022; Moghadam et al., 2023a) by implementing machine learning (ML) models. While each of these studies utilized a combination of IMUs and ML techniques, their focuses varied: some concentrated on predicting joint kinematics (Findlow et al., 2008; Luu et al., 2014; Dorschky et al., 2020; Sharifi Renani et al., 2021; Tan et al., 2022), some on joint kinetics (Giarmatzis et al., 2020; Stetter et al., 2020), and a few on both kinematics and kinetics prediction (Lim et al., 2020; Mundt et al., 2021; Moghadam et al., 2023a). These ML models can establish a direct relationship between the IMU data and OMC-derived gait time series such as joint kinematics, joint kinetics, and muscle forces (Moghadam et al., 2023a). Prior research indicated the efficacy of this approach in adult populations, demonstrating highly accurate results with low errors for personalized models (tested on the same individual used for training). Additionally, generalized models (tested on new participants not included in the training set) yielded reliable estimations, even in scenarios with limited dataset availability. Among the various ML models developed for the adult population, artificial neural networks (ANN) have been widely utilized for predicting gait time series. However, there is a limited body of literature exploring alternative data-driven models that may demand smaller datasets while achieving comparable results to ANNs. Building on this context, in a prior study, we demonstrated that Random Forest (RF) models can yield results comparable to more intricate machine learning models such as Convolutional Neural Networks (CNNs) for 3DGA in adults (Moghadam et al., 2023a). Given the greater heterogeneity in children's gait, it will be interesting to explore whether RF or CNNs can be applied to a pediatric population with similar performance.
The primary focus of existing ML models for 3DGA in children lies in gait classification (Kamruzzaman and Begg, 2006; Zhang et al., 2009; Zhang and Ma, 2019; Choisne et al., 2020; Khaksar et al., 2021) rather than the development of models for predicting gait time series. There are only a handful of studies focused on predicting children's gait using ML techniques (Kwon et al., 2012; Vigneron et al., 2017; Morbidoni et al., 2021; Kolaghassi et al., 2022; Kolaghassi et al., 2023). One research group used EMG sensor signals to predict the knee moment of children with cerebral palsy (CP) and achieved high correlation coefficients between 0.71 and 0.93 for different participants (Kwon et al., 2012). Another study showed the feasibility of using neural networks to predict gait events from surface EMG signals in hemiplegic cerebral palsy (Morbidoni et al., 2021). Other studies have employed ML techniques to estimate one-step-ahead gait trajectories to control lower-limb robotic devices in children with CP (Kolaghassi et al., 2022; Kolaghassi et al., 2023). However, none of the mentioned studies utilized IMU data to develop the ML model. Given the effective performance of a combination of IMU and ML models in adults, exploring its applicability in a heterogeneous pediatric population would be an interesting avenue for investigation.
It is noteworthy that previous studies have indicated the feasibility of predicting diverse gait time series in adults using a single IMU on the pelvis (Lim et al., 2020) or a pair of IMUs on the shanks (Sharifi Renani et al., 2020; Yeung et al., 2023) or the feet (Gholami et al., 2020). However, given the unique challenges posed by children's gait, it remains crucial to extend this inquiry to children's gait analysis by exploring the applicability of using a reduced number of IMUs. A high number of IMU sensors on the body could be impractical in real-world gait analysis, particularly for at-home applications, as it requires high computational power to monitor numerous IMUs (Sivakumar et al., 2019). Another aspect requiring investigation is therefore the optimal number of IMUs needed for accurately estimating gait time series in children.
Therefore, this study aimed to assess the feasibility of leveraging data from IMUs to construct ML models for predicting gait time series in school-aged children. This goal was pursued through three key objectives. Firstly, we sought to explore whether personalized and generalized ML models for predicting gait time series in children could demonstrate comparable efficacy to their adult counterparts. Secondly, we compared the accuracy of two distinct ML models (the multi-output RF and CNN models) for predicting gait time series in children. The final objective centered on exploring the potential of placing a single IMU on each foot, as opposed to employing seven IMUs distributed across all lower limb segments.
2 Materials and methods

Participants
Seventeen typically developed (TD) children (9 females, 8 males; age = 10.5 ± 2.8 yr [6:15]; height = 147.2 ± 16.9 cm [119:174]; weight = 37.1 ± 11.7 kg [19.7:56.9]) were recruited for this study. Each child's legal guardian provided informed consent prior to data collection. The research strictly adhered to ethical principles outlined in the Helsinki Declaration and received approval from the University of Auckland (New Zealand) human participant ethics committee (reference number 021615).

FIGURE 1 Pictures of the front (A) and back (B) of a participant, illustrating markers' placement (numbered in the image) and sensor locations (IMU axes depicted in red). The study did not utilize data from the Electromyography (EMG) sensors.

Data collection
OMC, force plate, and IMU data were recorded while each participant completed one static and a minimum of 15 over-ground walking trials for over 20 m at their self-selected speed. For the initial five participants, we affixed 37 reflective markers, indicated by numbers 1 to 37 in Figure 1, on their body segments. Subsequently, a paper by Bakke and Besier (2022) from our lab suggested a streamlined marker set, removing markers 6, 7, 8, 9, 10, 11, 18, 21, 32, and 33, which demonstrated equivalent accuracy in kinematics calculation. For the remaining participants, we adopted this refined marker set with 27 markers. Marker trajectories were traced by a 14-camera optical motion capture system (Vicon Motion Systems Ltd., United Kingdom) at a sampling frequency of 100 Hz for all trials. In addition, seven IMU sensors [Blue Trident, Vicon IMeasureU Ltd. (NZ)] were secured on the participants' pelvis (between left and right posterior superior iliac spine markers), thighs (1 cm above the lateral aspect of the patella), shanks (1 cm above the lateral aspect of the ankle), and feet (on the dorsal surface) as shown in Figure 1, and recorded three axes of angular velocity and linear acceleration at 2 kHz. Ground reaction forces (GRFs) were acquired at 2 kHz from three force plates (Bertec, Columbus, Ohio) embedded in the gait lab floor. The Vicon Nexus software (version 2.12) was used to collect and synchronize marker trajectory, GRF, and IMU data and subsequently to reconstruct markers' trajectories.

Data processing
After extracting data as C3D files from Nexus, MOtoNMS, a Matlab motion data elaboration toolbox for neuromusculoskeletal applications (Mantoan et al., 2015), was used to filter marker trajectories and ground reaction forces (GRF) using a fourth-order Butterworth low-pass filter with an 8 Hz cut-off. MOtoNMS was then employed for rotating and aligning the lab coordinate system to the OpenSim coordinate system, where X, Y, and Z-axes correspond to the frontal, transverse, and sagittal planes, respectively. Additionally, MOtoNMS was utilized to determine hip joint center (HJC) locations using Harrington regression equations from static trials for scaling in OpenSim (Harrington et al., 2007).
A musculoskeletal model was created for each participant by linearly scaling the OpenSim Gait 2392 model (Delp et al., 2007), which is a generic adult model. The Gait 2392 model is a detailed biomechanical representation, featuring 23 degrees of freedom and 92 musculotendon actuators. Within this model, the pelvis and hip joints offer three rotational degrees of freedom each, allowing for movements in the three planes of motion. The pelvis in Gait 2392 allows for movements such as tilt, obliquity, and rotation in the transverse plane, facilitated by its complex structure of joints. The hip joint is characterized as a ball-and-socket joint, enabling motions such as flexion/extension, adduction/abduction, and internal/external rotation. The knee model is a simple hinge joint with one degree of freedom allowing for flexion/extension. Additionally, the ankle (allowing for ankle dorsi/plantar flexion) and subtalar (allowing for ankle inversion/eversion) joints are simulated as frictionless revolute joints. The scaling tool in OpenSim (version 3.3) aligns virtual markers on the generic model with those placed on specific anatomical landmarks of the participant's body during the static trial. The HJCs calculated by MOtoNMS were used to scale the femur. The kinematics and kinetics of the lower limb joints, including the pelvis (3 DOF), hip (3 DOF), knee (1 DOF in the sagittal plane), and ankle (2 DOF; sagittal and frontal planes), were calculated for all participants using the inverse kinematics (IK) and inverse dynamics (ID) tools in OpenSim. To estimate joint kinematics and kinetics, we picked two gait cycles from each trial, resulting in a minimum of 30 gait cycles for each participant. The IK tool employs an optimization technique to ensure precise alignment between the virtual markers on the scaled model and the corresponding experimental markers in a least-squares sense (Lu and O'Connor, 1999; Knudson, 2007). For joint kinetics prediction, we focused on the gait cycles occurring on the force plates to allow for joint forces and moments calculation through the ID tool, which solves the equations of motion (Davis et al., 1991). We excluded trials where the participant's feet were not entirely within the force plates. Therefore, a variable number of kinetics gait cycles remained for each participant, ranging from a minimum of 8 to a maximum of 18.
This process resulted in a dataset encompassing measurements for 15 joint kinematics and 15 joint kinetics targets, including pelvis tilt, pelvis rotation, pelvis obliquity, hip rotation, hip flexion, hip abduction/adduction, knee flexion/extension, ankle dorsi/plantar flexion, and ankle inversion/eversion joint angles and moments for both legs. Finally, the IMU data were downsampled to 100 Hz to align their frequency with that of the joint kinematics and kinetics. This also reduced the computational load for feature extraction and machine learning (ML) model construction.
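The rate reduction from 2 kHz to 100 Hz corresponds to an integer factor of 20. The text does not state which resampling method was used, so the sketch below is only one possible approach: it averages each block of 20 consecutive samples, which acts as a crude anti-aliasing step. The function name and the synthetic signal are illustrative assumptions.

```python
def downsample_by_averaging(signal, factor):
    """Reduce the sampling rate by an integer factor using block means."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_blocks = len(signal) // factor  # an incomplete trailing block is dropped
    return [
        sum(signal[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


SOURCE_HZ = 2000  # IMU sampling rate reported in the text
TARGET_HZ = 100   # rate of the joint kinematics and kinetics
factor = SOURCE_HZ // TARGET_HZ

# Synthetic one-second signal (not study data): a ramp repeating every 20 samples.
one_second = [float(i % 20) for i in range(SOURCE_HZ)]
resampled = downsample_by_averaging(one_second, factor)
```

A filter-based resampler would usually be preferable for real signals; block averaging is used here only to keep the example dependency-free.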

Joint kinematics and kinetics prediction using ML models
After processing data for the 17 participants, a total of 73,364 time points for joint kinematics and 21,855 time points for joint kinetics were used for the development of ML models. The outlined procedures (Figure 2), including windowing IMU data, feature extraction, feature selection, model development, and model evaluation, were executed as detailed in the subsequent sections.

Training and testing sets
We implemented two distinct data splitting methods to facilitate two types of examinations (Figure 2, Step 1): the first assessed intra-subject prediction accuracy, and the second assessed inter-subject prediction accuracy.
Intra-subject examination: In this approach, the training dataset consisted of 70% of a participant's gait cycles, and the remaining 30% of gait cycles were allocated to the testing dataset. A total of 17 training and testing datasets were created to cover all participants and perform the intra-subject examination.
Inter-subject examination: To create training and testing datasets for this examination, we employed a leave-one-out approach for our cohort of 17 participants. The dataset was partitioned to assess the model's generalization across diverse individuals. During each iteration, one participant's gait cycles were set aside for testing, while the gait cycles from the remaining 16 participants constituted the training dataset. This process was repeated 17 times, each time excluding a different participant from the training set.
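The two splitting schemes can be sketched as follows. This is a minimal illustration under stated assumptions: the gait-cycle identifiers are hypothetical, and the 70/30 split is done after a seeded shuffle, although the text does not say whether cycles were shuffled or split in recording order.

```python
import random


def intra_subject_split(cycles, train_fraction=0.7, seed=0):
    """Shuffle one participant's gait cycles and split them 70/30."""
    shuffled = list(cycles)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def leave_one_subject_out(cycles_by_subject):
    """Yield (held_out_id, train_cycles, test_cycles) once per participant."""
    for held_out in cycles_by_subject:
        train = [
            cycle
            for subject, cycles in cycles_by_subject.items()
            if subject != held_out
            for cycle in cycles
        ]
        yield held_out, train, list(cycles_by_subject[held_out])


# Hypothetical data: 17 participants with 30 gait-cycle identifiers each.
data = {f"P{p:02d}": [f"P{p:02d}_c{c}" for c in range(30)] for p in range(1, 18)}

train, test = intra_subject_split(data["P01"])
folds = list(leave_one_subject_out(data))
```

Splitting by whole gait cycles, and by whole participants in the inter-subject case, keeps samples from one cycle or one child from appearing on both sides of a split.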

IMU sensors data windowing and feature extraction
To enhance the accuracy of the learned models and emphasize the main characteristics of the input data (Laird and Saul, 1994), we adopted a feature engineering technique. From each IMU, we took six time series, encompassing triaxial angular velocity and linear acceleration, to extract features. Thus, we had a total of 42 data vectors from seven IMUs. We organized the input time series data into sequences of consecutive, sliding, and overlapping windows. We selected a window size of 0.75 s, which has been shown to be the most accurate in predicting gait time series (Moghadam et al., 2023b).
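At 100 Hz, a 0.75 s window spans 75 samples. The windowing step can be sketched as below; the stride of one sample is an assumption, since the text specifies the window length but not the overlap.

```python
def sliding_windows(series, window_size, step):
    """Return overlapping windows of length window_size, advanced by step."""
    return [
        series[start:start + window_size]
        for start in range(0, len(series) - window_size + 1, step)
    ]


SAMPLING_HZ = 100      # IMU rate after downsampling
WINDOW_SECONDS = 0.75  # window length reported in the text
window_size = round(WINDOW_SECONDS * SAMPLING_HZ)
STEP = 1               # assumed stride in samples; not stated in the text

channel = list(range(200))  # synthetic 2 s of one IMU channel
windows = sliding_windows(channel, window_size, STEP)
```

With a stride of one sample, a series of N samples yields N - 74 windows, so consecutive windows share 74 of their 75 samples.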
Then, we employed the Tsfresh (Time Series FeatuRe Extraction on the basis of Scalable Hypothesis tests) Python package (Christ et al., 2018) to perform feature extraction on the windowed input data (Figure 2, Step 2). This process yielded a feature vector $\vec{x}_i = (f_1(x_i), f_2(x_i), \ldots, f_m(x_i))$ for each vector of input data ($x_i$). Tsfresh extracted 788 distinct features from each channel of IMU data, resulting in a total of 33,096 features derived from the 42 channels of input data.
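The mapping from a window to a feature vector can be illustrated with a toy extractor. The four statistics below are hypothetical stand-ins chosen for illustration and are not the Tsfresh feature set; the channel names are also assumptions. The last lines reproduce the feature count given in the text.

```python
import statistics

# Hypothetical stand-ins for the feature calculators f_1 ... f_m.
FEATURES = {
    "mean": statistics.fmean,
    "std": statistics.pstdev,
    "minimum": min,
    "maximum": max,
}


def extract_features(window_by_channel):
    """Map every channel window x_i to (f_1(x_i), ..., f_m(x_i))."""
    return {
        f"{channel}__{name}": func(window)
        for channel, window in window_by_channel.items()
        for name, func in FEATURES.items()
    }


# Synthetic window for two of the 42 channels (not study data).
window = {
    "foot_L_gyr_x": [0.0, 1.0, 2.0, 3.0],
    "foot_L_acc_z": [9.0, 9.0, 10.0, 10.0],
}
features = extract_features(window)

# Feature count reported in the text: 788 features for each of 7 x 6 channels.
n_channels = 7 * 6
total_features = 788 * n_channels
```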

Feature selection
The presence of irrelevant and noisy features may considerably reduce the performance of the ML model. The process of removing irrelevant features and selecting the most relevant features is called feature selection (Figure 2, Step 3). We first eliminated all zero-variance features. Then, the Tsfresh feature selector's built-in function was utilized to remove any non-significant feature, using the Benjamini-Hochberg method (Benjamini and Hochberg, 1995). In the next step, the remaining features were ranked based on their Gini importance in predicting each target using a Random Forest (RF) regressor (Hasan et al., 2016). Then, the top ten features associated with each target were selected. Our previous findings demonstrated that this selection of 10 features per target yields precise estimations in multi-output models (Moghadam et al., 2023b).
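The Benjamini-Hochberg step-up rule keeps the k features with the smallest p-values, where k is the largest rank satisfying p_(k) <= (k / m) * q for m tested features and false discovery rate q. The sketch below is a standalone illustration of that rule, not the Tsfresh implementation; the p-values and the level q = 0.05 are assumptions.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one flag per feature: True if it is kept as significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k / m * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    keep = [False] * m
    for idx in order[:cutoff_rank]:
        keep[idx] = True
    return keep


# Hypothetical p-values from per-feature significance tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
relevant = benjamini_hochberg(p_values, fdr=0.05)
```

Because the rule is step-up, a feature whose own threshold is missed is still kept when a feature with a larger p-value meets its threshold.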
After feature selection, two comprehensive feature sets were constructed, each including 150 features. The first merged all the top features related to kinematics targets, forming the basis for a multi-output ML model dedicated to kinematics prediction. The second feature set put all the top features associated with kinetics targets together to develop a multi-output ML model tailored for kinetics prediction. We retained only the features present in the super feature set from all the extracted features for the training dataset (Figure 2, Step 4).
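Merging the top-ranked features of every target into one super feature set can be sketched as follows. The importance scores are hypothetical, and removing duplicates is an assumption: with 15 targets and ten features each, the merged set holds at most 150 features, which is the size reported in the text.

```python
def top_k_features(importances, k=10):
    """Names of the k features with the highest importance for one target."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def build_super_feature_set(importances_by_target, k=10):
    """Concatenate each target's top-k features, keeping first occurrences."""
    merged = []
    for importances in importances_by_target.values():
        for name in top_k_features(importances, k):
            if name not in merged:
                merged.append(name)
    return merged


# Hypothetical importances: 3 targets, 6 candidate features, top 2 kept.
scores = {
    "hip_flexion": {"f1": 0.40, "f2": 0.30, "f3": 0.10, "f4": 0.10, "f5": 0.05, "f6": 0.05},
    "knee_flexion": {"f1": 0.05, "f2": 0.45, "f3": 0.05, "f4": 0.35, "f5": 0.05, "f6": 0.05},
    "ankle_flexion": {"f1": 0.10, "f2": 0.10, "f3": 0.10, "f4": 0.10, "f5": 0.35, "f6": 0.25},
}
super_set = build_super_feature_set(scores, k=2)
```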

Non-linear regression ML models
We developed RF and CNN models to assess their accuracy in predicting lower limb joint kinematics and kinetics during gait (Figure 2, Step 5). The hyperparameters for both RF and CNN models were chosen based on previously optimized models (Moghadam et al., 2023a). We employed an RF model comprising 500 trees, each with a maximum depth of 25.
For the CNN model, we used a multi-output architecture with five hidden layers. The selected features were scaled using the Standard Scaler function from the Sklearn library to ensure all variables fell within the same range (between zero and one). Targets were also scaled, and post-prediction, they were rescaled to their original values using the same scaler. The model's architecture featured an input layer with a size of 150, followed by two convolutional layers, each followed by a max-pooling layer. Both convolutional layers comprised 256 filters with a kernel size of three and employed a "relu" activation function. The max-pooling layers had a pool size of two. Subsequently, the data was flattened and passed through a dense output layer with a linear activation function. The number of units in the output layer corresponded to the number of targets (15 for both CNN models utilized for kinematics and kinetics prediction). The "Adam" solver with a learning rate of 0.01 was used for weight optimization, employing the mean squared error as the loss function. An early stopping mechanism monitored validation loss and halted training if no improvement was observed after five epochs. The batch size was set to 32, and the model was trained for a maximum of 100 epochs.
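To make the layer sizes concrete, the sketch below traces the tensor shape through the described architecture. It assumes one-dimensional convolutions over the 150 features treated as a single-channel sequence, stride 1, and no padding; none of these details are given in the text, so the resulting sizes are illustrative only.

```python
def conv1d_length(length, kernel_size):
    """Output length of a stride-1 convolution without padding."""
    return length - kernel_size + 1


def maxpool1d_length(length, pool_size):
    """Output length of non-overlapping max pooling (remainder dropped)."""
    return length // pool_size


N_FEATURES, N_FILTERS, KERNEL, POOL, N_TARGETS = 150, 256, 3, 2, 15

length = N_FEATURES
shapes = [("input", length, 1)]
for block in (1, 2):
    length = conv1d_length(length, KERNEL)
    shapes.append((f"conv{block}", length, N_FILTERS))
    length = maxpool1d_length(length, POOL)
    shapes.append((f"pool{block}", length, N_FILTERS))

flattened = length * N_FILTERS
dense_weights = flattened * N_TARGETS + N_TARGETS  # weights plus biases
```

Under these assumptions the sequence length goes 150, 148, 74, 72, 36, so the flattened vector feeding the 15-unit output layer has 36 x 256 entries.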

Models' evaluation
To evaluate the performance of the CNN and RF models, we began by extracting the super feature set from the windowed test dataset (Figure 2, Step 6). Subsequently, the trained ML models were employed to predict the targets (joint kinematics and kinetics) using the extracted features from the test datasets (Figure 2, Step 7). Then, we computed the root mean square error (RMSE) and normalized RMSE (NRMSE) between the OpenSim outputs and the predicted values generated by each ML model for all targets. Violin plots were utilized to illustrate the distribution of RMSEs across various IMU configurations and examinations (intra- and inter-subject) for each target. These plots offer a visual representation of how data is spread out within each category. In a violin plot, the width of the shape at any given point indicates the probability of values occurring. Additionally, within the violins, the median is depicted as a short horizontal line, providing a clear reference point for the central tendency of the data. The reported RMSEs and NRMSEs for intra-subject and inter-subject examination are averages of personalized and generalized models, respectively. After determining the optimal ML model and IMU sensor configuration, we conducted additional analysis by 1) plotting average waveforms from both OpenSim and ML models' outputs, 2) calculating R² values and creating correlation plots, and 3) generating Bland-Altman (Bland and Altman, 1986) plots to evaluate agreement between OpenSim outputs and predicted values for the selected model.
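The two error metrics can be sketched as follows. The NRMSE here divides the RMSE by the range of the measured (OpenSim) values and reports a percentage; normalizing by the measured range is an assumption, and the angle values are synthetic.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally long sequences."""
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(measured, predicted):
    """RMSE as a percentage of the range of the measured values."""
    data_range = max(measured) - min(measured)
    return 100.0 * rmse(measured, predicted) / data_range


# Synthetic knee flexion angles in degrees (not study data).
opensim = [0.0, 20.0, 40.0, 60.0, 40.0]
model = [2.0, 18.0, 43.0, 57.0, 42.0]

error_deg = rmse(opensim, model)
error_pct = nrmse_percent(opensim, model)
```

Normalizing by the range lets targets with different units and amplitudes, such as angles in degrees and moments in Nm/kg, be compared on one scale.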

The effect of reducing IMU sensors to feet IMUs
In a prior study (Moghadam et al., 2023b), we demonstrated the feasibility of accurately estimating gait time series using machine learning models with just two IMUs positioned on the feet. To explore the applicability of this approach to children's data, we replicated the steps described in Figure 2 using two IMUs placed on the feet instead of the full set of 7 IMUs.

3 Results

Joint kinematics prediction
The distribution of predicted joint kinematics RMSE revealed similar predictive accuracy between the RF and CNN models for both the personalized models (intra-subject) and the generalized models (inter-subject) (Figure 3). Across all joints and planes of motion, the RF model's prediction errors were, on average, 0.22° lower in intra-subject examinations and 0.20° lower in inter-subject examinations than those of the CNN model.
For the personalized models, reducing the number of IMUs to only one on each foot did not alter the prediction of joint kinematics compared to using all seven IMUs (Figure 3A). Interestingly, in the inter-subject examination, using only two IMUs decreased the prediction errors for pelvis rotation, hip rotation, and ankle inversion/eversion angles but increased the RMSE for pelvis tilt and hip flexion/extension (Figure 3B). It is worth mentioning that these differences were not statistically significant. Independently of the model and the number of IMUs used, the average RMSE across all joints and planes of motion was considerably lower in the intra-subject examinations than in the inter-subject examinations. In the intra-subject evaluation, the RMSE spanned from a minimum of 1.0° (pelvis tilt) to a maximum of 6.7° (ankle inversion/eversion). For the inter-subject evaluation, the range of

FIGURE 3 Violin plots illustrate the RMSE in degrees for joint kinematics predictions, comparing OpenSim IK outcomes with those from ML models. The red violins represent errors from the RF model, while the blue violins depict errors from the CNN model. Darker hues indicate models utilizing data from the full set of IMUs (n = 7), and lighter hues denote models using data solely from foot-mounted IMUs (n = 2). Panel (A) presents results from the intra-subject examination, while panel (B) displays the inter-subject examination results, utilizing models designed to generalize across participants.

TABLE 1 The normalised RMSE (NRMSE) along with their corresponding standard deviation (SD) values for joint angle prediction across all joints and planes of motion in intra- and inter-subject examinations, based on RF models' output using two IMUs.

Table 1 columns: joint kinematics target; intra-subject examination; inter-subject examination.

In our analysis, we found that overall, the RF model gave slightly better results, and the number of IMUs used (two vs. seven) did not have an impact on the results. Therefore, we concentrated on the results provided by the RF model with two IMUs for the subsequent analysis. After normalizing the RMSE values to the data range, we observed that the lowest normalized RMSE (NRMSE) was associated with knee flexion/extension angle, and the highest NRMSE value was related to the pelvis tilt angle (Table 1). In the hip and ankle joint angle predictions, the lowest error appeared in the sagittal plane; however, in the case of the pelvis, the highest error was associated with the sagittal plane. This finding held true for both intra- and inter-subject examinations. Notably, the NRMSE values for all joints and planes of motion in the inter-subject results were nearly twice as high as those observed in the intra-subject examination. Specifically, the average NRMSE across all targets increased from 9.5% to 19.9% (Table 1). When comparing the average NRMSE for children below 10 years with children older than 10 years, a clear trend emerges. On average, the NRMSE is lower (1.7% in intra-subject and 0.3% in inter-subject examinations) in the older age group when predicting joint kinematics (refer to Supplementary Table SA1).
To understand whether the prediction accuracy is consistent across the gait cycle for the intra-subject examination, we performed further analysis of the RF model outputs, encompassing 1) a comparison of the average range of motion (ROM) between the OpenSim IK tool and the RF model's output, 2) correlation plots and R-squared (R²) assessment, and 3) Bland-Altman analysis to provide insights into the agreement between predicted and measured variables.
Hip, knee, and ankle joint angles in the sagittal plane are presented in Figure 4. Additional results for other targets, including pelvis angles in all planes of motion, hip joint angles in the frontal and transverse planes, and ankle joint angles in the frontal plane, are detailed in Supplementary Figure SA1.
Plotting the average ROM (± standard deviation, SD) across all participants revealed that the predicted waveforms closely followed the measured waveforms obtained from the OpenSim IK tool. The SD area of the predicted values fell within the shaded area representing the measured values, indicating a close fit between the predicted and measured data in the intra-subject examination (Figures 4A, D, G; Supplementary Figures SA1A, D, G, J, M, P).
Furthermore, there was a strong correlation between the OpenSim IK outputs and predicted joint angles, with R² values exceeding 0.83 for pelvis angles in all planes of motion (Supplementary Figures SA1B, E, H), 0.76 for hip angles (Supplementary Figures SA1K, N; Figure 4B), 0.97 for knee angle (Figure 4E), and 0.77 for ankle joint angles (Supplementary Figure SA1Q; Figure 4H). The high performance of the RF model in the sagittal plane was observed at the hip, knee, and ankle.

FIGURE 4 The plots are made across all participants in the intra-subject examination, specifically for hip (A-C), knee (D-F), and ankle (G-I) joint angles in the sagittal plane. Panels (A, D, G) present the RF model's average predictions (the dashed red line represents the average, and the red shaded area indicates the SD) for joint angles, utilizing data from IMUs placed on the feet. These predictions are compared to the joint angles derived from the OpenSim IK tool (the solid blue line represents the average, and the blue shaded area indicates the SD). Panels (B, E, H) illustrate the correlation and R-squared (R²) values for the mentioned joint angle targets. In (C, F, I), we utilized Bland-Altman plots to visually depict the errors throughout one gait cycle for all participants. In these plots, the dashed blue line represents the mean error, and the mean ± 2SD is depicted as dashed red and green lines. Each distinct color in these plots represents the results of one participant.
Strong agreement between the measured and predicted values was evident in the Bland-Altman plots, with the error falling within the range of two standard deviations from the mean value for most participants. No specific pattern in the error values was seen based on these plots; however, the bias between predicted and measured kinematics was around zero for all targets (Figures 4C, F, I; Supplementary Figures SA1C, F, I, L, O, R).
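The Bland-Altman quantities referred to above (the bias and the mean ± 2SD limits) can be illustrated with a minimal sketch. The function name `bland_altman`, the sample angle values, and the use of the sample standard deviation are assumptions made for illustration; they are not taken from the study's code.

```python
from statistics import mean, stdev


def bland_altman(measured, predicted, k=2.0):
    """Return (bias, lower_limit, upper_limit) for paired samples.

    The bias is the mean of (predicted - measured); the limits are
    bias -/+ k sample standard deviations of those differences.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    diffs = [p - m for m, p in zip(measured, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1 in the denominator)
    return bias, bias - k * sd, bias + k * sd


# Hypothetical knee flexion angles in degrees (illustrative values only).
measured = [5.0, 12.0, 20.0, 40.0, 60.0, 35.0]
predicted = [6.0, 11.0, 21.0, 39.0, 61.0, 34.0]

bias, lower, upper = bland_altman(measured, predicted)
print(f"bias={bias:.2f} deg, limits=[{lower:.2f}, {upper:.2f}] deg")
```

A bias near zero together with errors that stay inside the limits corresponds to the agreement pattern described in the text.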

Joint kinetics prediction
When predicting joint kinetics, the RF model demonstrated slightly superior performance when compared to the CNN model (Figure 5). The RF model yielded lower prediction error values than the CNN model across all joints and planes of motion, with a reduction of 0.017 Nm/kg RMSE in intra-subject examinations and 0.037 Nm/kg RMSE in inter-subject examinations. The better performance of the RF model was more pronounced in the inter-subject examination (Figure 5B).
In terms of the number of IMU sensors used for joint kinetics prediction, we found nearly identical results when employing only the feet IMUs as compared to using all 7 IMUs. However, in specific kinetics targets, such as pelvis tilt and hip flexion/extension in the intra-subject examination and pelvis obliquity, hip adduction/abduction, and hip rotation in the inter-subject examination, even lower prediction errors were achieved by utilizing just two IMUs placed on the feet. It is important to highlight that irrespective of the model type and the number of IMUs employed for prediction, the RMSE values in the inter-subject examination were consistently higher than the RMSE in the intra-subject examination.
As for the kinematics, we concentrated on the results provided by the RF model with two IMUs for further analysis. After calculating the NRMSE between outputs of the RF model and the OpenSim ID tool, we observed that the highest NRMSE values were associated with the pelvis tilt in the intra-subject examination (similar to the kinematics analysis) and hip flexion/extension in the inter-subject examination. Conversely, ankle dorsi/plantar flexion exhibited the lowest NRMSE for intra-subject examination, while knee flexion/extension displayed the lowest NRMSE for inter-subject examination.

Violin plots illustrate the RMSE in degrees for joint kinematics predictions, comparing OpenSim ID outcomes with those from ML models. The red violins represent errors from the RF model, while the blue violins depict errors from the CNN model. Darker hues indicate models utilizing data from the full set of IMUs (n = 7), and lighter hues denote models using data solely from foot-mounted IMUs (n = 2). Panel (A) presents results from the intra-subject examination, while panel (B) displays the inter-subject examination results, utilizing models designed to generalize across participants.
TABLE 2 NRMSE values with their corresponding SD for joint moment prediction across all joints and planes of motion in intra- and inter-subject examinations, based on the RF models' output using feet IMUs (columns: joint kinetics target, intra-subject examination, inter-subject examination).

Just like with joint kinematics, the NRMSE values for all joints and planes of motion in the inter-subject results were higher than in the intra-subject examination. Specifically, the average NRMSE across all targets increased from 10.7% to 15.2% (Table 2). Similar to the prediction of joint kinematics, a consistent trend was noted in the prediction of joint kinetics (refer to Supplementary Table SA2). Notably, the NRMSE was lower by 1% in intra-subject and 4.3% in inter-subject examinations for older children (above 10 years old) as opposed to their younger counterparts (below 10 years old).
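For readers less familiar with the two error metrics, the sketch below shows one common way to compute RMSE and NRMSE for a predicted joint moment. Normalizing by the range of the measured signal is an assumption here, since this section does not state the normalization used; the function names and sample values are likewise hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equally long sequences."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nrmse_percent(measured, predicted):
    """RMSE as a percentage of the measured range (assumed normalization)."""
    span = max(measured) - min(measured)
    if span == 0:
        raise ValueError("measured signal has zero range")
    return 100.0 * rmse(measured, predicted) / span


# Hypothetical joint moments in Nm/kg (illustrative values only).
measured = [0.0, 0.5, 1.0, 0.5, 0.0]
predicted = [0.1, 0.4, 0.9, 0.6, 0.1]

print(f"RMSE  = {rmse(measured, predicted):.3f} Nm/kg")
print(f"NRMSE = {nrmse_percent(measured, predicted):.1f} %")
```

Expressing the error as a percentage makes targets with different units and amplitudes, such as angles and moments, easier to compare.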
To understand if the prediction accuracy is consistent across the gait cycle for the intra-subject examination, we performed further analysis of the RF model outputs, encompassing: 1) a comparison of the average normalized moment between the OpenSim ID tool and the RF model's output, 2) a correlation plot and R-squared (R²) assessment, and 3) a Bland-Altman analysis to provide insights into the agreement between predicted and measured variables.
The results for the hip, knee, and ankle joint moments in the sagittal plane are shown in Figure 6. Additional results for other targets, including pelvis moments in all planes of motion, hip joint moments in the frontal and transverse planes, and ankle joint moment in the frontal plane, are detailed in Supplementary Figure SA2.
Plotting the average and standard deviation waveforms for joint moments throughout a gait cycle in the intra-subject examination, we observed that the RF model's predictions effectively tracked the OpenSim ID tool output. However, the SD area of the predicted values did not consistently fall within the shaded area related to the measured values across the entire gait cycle (Figures 6A, D, G; Supplementary Figures SA2A, D, G, J, M, P). Especially toward the end of the stance phase (50%-60% of the gait cycle), the model predicted a lower range of joint moments in the three joints in the sagittal plane. Also, at the beginning of the stance phase, some discrepancies can be observed in knee flexion and hip extension moment predictions.
Compared to joint angles, the correlation between the measured and predicted joint moments was lower for the pelvis, with R² values of 0.43, 0.31, and 0.41 for pelvis tilt, obliquity, and rotation, respectively (Supplementary Figures SA2B, E, H). The R² values for other joint kinetics were consistent with the performance seen in joint kinematics prediction. Specifically, the hip joint showed R² values higher than 0.73 in all planes of motion (Supplementary Figures SA2K, N; Figure 6B), the R² for the knee joint was 0.80 in the sagittal plane (Figure 6E), and the ankle displayed R² values higher than 0.83 in both the frontal (Supplementary Figure SA2Q) and sagittal planes (Figure 6H).
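As an illustration of how an R² value from a correlation plot can be obtained, the sketch below computes the squared Pearson correlation between measured and predicted samples. Treating R² as the squared Pearson coefficient is an assumption (a coefficient of determination computed from residuals would give different numbers), and the function name and data are hypothetical.

```python
from statistics import mean


def r_squared(measured, predicted):
    """Squared Pearson correlation between two equally long sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(measured), mean(predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    sxx = sum((x - mx) ** 2 for x in measured)
    syy = sum((y - my) ** 2 for y in predicted)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical paired samples (illustrative values only).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 2.0, 4.0]

print(f"R^2 = {r_squared(measured, predicted):.2f}")
```

Note that this quantity is insensitive to a constant offset or scaling of the predictions, which is one reason it is reported alongside RMSE and the Bland-Altman bias.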
According to the Bland-Altman plots, there was good agreement between the measured and predicted targets, as the errors were within a range of two SD of the mean value for most participants. Similar to joint kinematics prediction, the bias between measured and predicted variables was around zero for all kinetics targets. An interesting observation was that during the final phase of the gait cycle (swing phase), errors were almost zero for most joint kinetics (Figures 6C, F, I; Supplementary Figures SA2C, F, I, L, O, R).

Discussion
The aim of this study was to investigate the feasibility of using a combination of IMU data and ML models for predicting joint kinematics and kinetics in school-aged children. To address this aim, the first objective was to assess the accuracy of RF and CNN ML models by quantifying both intra-subject and inter-subject prediction errors. The second objective was to evaluate the influence of using only one IMU on each foot versus seven IMUs, one on each segment of the lower limb, on kinematics and kinetics prediction performance. We employed a feature engineering technique to extract and select the most important features from the IMUs' acceleration and angular velocity data to enhance the models' performance.

ML comparison
Regarding the first objective, findings suggested that the RF and CNN models demonstrated comparable performance for predicting joint kinematics (NRMSE of 9.5% versus 10.6% for personalized and NRMSE of 19.9% versus 22.5% for generalized models, respectively) and kinetics (NRMSE of 10.7% versus 12.9% for personalized and NRMSE of 15.2% versus 17.9% for generalized models, respectively) in TD children. This implies that the complexity of deep neural network structures may not be necessary for gait time series prediction. Consequently, it opens the door to more efficient and easily interpretable modeling approaches, such as the RF model (Breiman, 2001). Supporting this notion, a separate study found RF models to outperform CNN models in estimating step length, showing an absolute error of 5.09 cm for RF compared to 5.26 cm for CNN (Seifer et al., 2023). Furthermore, in another study on gait trajectory generation, RF models achieved an average error of 5.57°, surpassing the neural network model with an average error of 6.00° (Ren et al., 2019).
The higher performance of RF models could be attributed to their resilience against overfitting. This resilience arises from their capacity to amalgamate multiple decision trees trained on bootstrapped data, coupled with the utilization of feature randomization, pruning, and averaging (Breiman, 2001). On the other hand, CNNs exhibit a notable susceptibility to overfitting, particularly when dealing with smaller datasets (Slijepcevic et al., 2023).

Intra-subject examination
We created 17 personalized models for predicting kinematics and kinetics based on customized feature sets specific to each participant. Our results from the RF model using data from two IMUs demonstrated strong predictive accuracy, with an average RMSE ranging from 1.61° to 4.16° (NRMSE of 5.2%-14.1%) across all joint kinematics. The RMSE values for joint kinematics stayed below the 5° error threshold, which is often considered a clinically acceptable level of deviation for assessing joint movements (Slater et al., 2018). However, the joint kinematics prediction errors were higher than for adults in other studies, where observed values ranged from 1.38° to 3.96° for all targets (Findlow et al., 2008; Giarmatzis et al., 2020; Moghadam et al., 2023a; Yeung et al., 2023). Similarly, joint kinetics prediction errors were higher in this study (0.038-0.233 Nm/kg) than in adult populations, where the RMSE ranged from 0.042 to 0.198 Nm/kg (Dey et al., 2019; Mundt et al., 2020; Moghadam et al., 2023a). This elevated error in both kinematics and kinetics prediction in the intra-subject evaluation of children compared to the adult models underscores the greater variability in gait time series within individual children across different trials. Differences in gait maturity level (Bach et al., 2021), as well as heightened variability in gait patterns (Kuhtz-Buschbeck et al., 1996) and EMG gait waveforms (Granata et al., 2005) in children compared to adults, have been shown in other studies. This discrepancy may be attributed to the increased susceptibility of children to distractions during walking (Stolze et al., 1998) or to the heterogeneity in children's gait cycles compared to adults.
Despite the higher errors in the children's personalized ML model compared to adults, the Bland-Altman plots revealed a notable level of agreement between the measured and predicted values during the intra-subject examination. Most participants exhibited errors within the range of two SD from the mean error value. Additionally, average errors consistently remained near zero for all predicted targets, underlining the good overall agreement between the IMU-based and OMC-based kinematics and kinetics. It is worth noting that no discernible patterns in the error values were observed, indicating a lack of systematic bias in the predictions (Bland and Altman, 1986). These findings emphasize the practicality and suitability of employing this approach, which involves a personalized RF model utilizing IMU data for accurately estimating gait time series in children.
One interesting finding of personalized modeling (intra-subject examination) was the good model performance within the sagittal plane compared to the other planes of motion for joint kinematics prediction, especially in the case of hip and ankle joint angles. The knee angle, only computed within the sagittal plane, demonstrated a high correlation between the actual and predicted values (R² of 0.97). The enhanced performance of the RF model in the sagittal plane (higher R² and lower errors) can be attributed to the more prominent joint movements within this plane, which, in turn, yield more distinct signals from the IMUs. Consequently, this clarity in the IMU signals contributes to the model's improved predictive performance in the sagittal plane. While other planes of motion, such as the frontal and transverse planes, contribute to a comprehensive understanding of gait, the sagittal plane takes precedence in gait analysis due to its primary role in capturing the fundamental aspects of forward movement. We have shown that the proposed personalized method achieves a clinically acceptable level of error, particularly in the sagittal plane. This precision makes it valuable for advancing the gait analysis of children.

Inter-subject examination
In the inter-subject examination of the RF model, the results were less promising compared to the intra-subject test, with average RMSE ranging from 3.5° to 9.6° (NRMSE of 9.6%-33.1%) for joint kinematics. When comparing the outcomes with adult cohorts, it becomes apparent that the RMSE values in adults are lower (RMSE between 2.17° and 6.53°) (Luu et al., 2014; Dorschky et al., 2020; Lim et al., 2020; Sharifi Renani et al., 2021; Moghadam et al., 2023a). Similar findings were obtained for joint moments, with an NRMSE of 10.3%-26.4% in this study compared to 4.54%-10.74% in previous adult studies (Giarmatzis et al., 2020; Lim et al., 2020). In fact, these errors are of such magnitude that they do not provide confidence in the accurate prediction of gait time series in children not included in the training set. This contrasts with the previously demonstrated success of inter-subject modeling in predicting time series for the adult population with limited data (Giarmatzis et al., 2020; Lim et al., 2020; Stetter et al., 2020; Moghadam et al., 2023a).
The primary reason for the elevated error in the children's generalized model compared to adults can be attributed to the diverse gait patterns among individual children, given their ongoing musculoskeletal changes and developmental stages within the specified age range of six to 15 years in this study (Onis et al., 2007; Bari et al., 2023). However, as age advances, there is a reduction in variability within the gait pattern, as demonstrated in our findings revealing higher errors in gait analysis for younger children compared to older ones. When analysing the average (±SD) waveforms of targets within a gait cycle, we observed a noticeable standard deviation surrounding the average waveform for the children, reaffirming the high variability of gait patterns among children. This is comparable to a study by Fukuchi et al., where a greater deviation area for younger people compared to the adults' normative gait data is shown (Fukuchi et al., 2018). The secondary reason for the high error in generalized modeling lies in the limitations of the dataset. Effective ML models typically require access to extensive datasets comprising a wide spectrum of walking patterns. Consequently, the performance of a model trained on a small dataset featuring only 16 participants is inherently limited when applied to new, unseen subjects. The considerable errors observed in generalized modeling render this approach less advisable for children who were not part of the initial training dataset.
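The inter-subject setting, in which a model is evaluated on a child whose data were not used for training, can be sketched as a leave-one-subject-out split. Whether the study used exactly this scheme is an assumption inferred from the mention of 16 training participants out of 17; the subject labels, the three trials per subject, and the function name are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx) so that no subject's
    samples appear in both the training and the testing indices."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical layout: 17 subjects with 3 trials each (illustrative only).
subject_ids = [f"S{n:02d}" for n in range(1, 18) for _ in range(3)]

folds = list(leave_one_subject_out(subject_ids))
for held_out, train_idx, test_idx in folds[:2]:
    print(held_out, len(train_idx), len(test_idx))
```

Splitting by subject rather than by trial matters here: mixing trials from the same child across training and testing would measure personalized rather than generalized performance.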

Number of IMUs
Concerning the second objective, we demonstrated that utilizing two IMUs on the feet, instead of a total of seven IMUs, resulted in similar accuracy of the models, specifically in intra-subject examination.Concerning the inter-subject examination, while there was a slight increase in error for some targets (pelvis tilt, hip flexion/extension angles and moments, and ankle dorsi/plantar moment), reducing the number of IMUs to feet IMUs resulted in decreased errors for specific targets such as pelvis rotation angle, hip rotation angle, ankle inversion/eversion angles, pelvis obliquity moment, hip adduction/abduction moment, and hip rotation moment.
These findings are consistent with our prior research, which suggested that in adult gait prediction, employing ML models allows us to achieve nearly identical results using only feet IMUs, as opposed to utilizing seven IMUs (one for each segment) (Moghadam et al., 2023b). This can be attributed to the proficiency of ML models in establishing a robust relationship between IMU data and targets. Another contributing factor is the identification of alternative features to raw IMU data, thereby augmenting the predictive capabilities of the ML model, even when working with a limited number of IMUs. The feasibility of employing a single IMU on the pelvis (Lim et al., 2020) or a pair of IMUs on shanks (Sharifi Renani et al., 2020; Yeung et al., 2023) or feet (Gholami et al., 2020) for predicting a diverse range of gait time series has been demonstrated in prior studies. Reducing the number of IMUs streamlines model implementation, decreases data processing time, and lowers sensor-related costs. Additionally, the potential integration of foot IMUs within shoes, rendering them inconspicuous during community use, could enhance patient compliance.

Limitations and future work
This study presents limitations to be addressed by future research. First, a generic adult model was used for scaling and constructing the musculoskeletal models for children. The issue lies in the potential discrepancies between generic adult models and the individual anatomical characteristics of children. Developing more precise, subject-specific models would ideally involve leveraging medical imaging data, such as MRI, CT, or X-rays, which can be both time-consuming and financially burdensome (Nolte et al., 2016). To address this limitation, future research could explore alternative methods like statistical shape modeling to build children's musculoskeletal models (Carman et al., 2022).
Another limitation of this study is that we did not investigate the potential impact of slight variations in IMU placement that may occur when different individuals place the sensors. The concern here is that small variations could influence the data collected and, consequently, affect the accuracy and reliability of the results. Addressing this limitation in future research might involve conducting a sensitivity analysis or implementing standardized procedures for IMU placement to mitigate the potential impact of such variations on the study outcomes. It is noteworthy that a similar analysis was undertaken in an adult study and did not change the results (Moghadam et al., 2023a). We therefore expect these findings to apply to children as well, meaning that small changes in sensor placement should not substantially affect the outcomes.
Our study established the models on TD children, whose gait differs from that of other populations, such as children with cerebral palsy (CP). The choice of ML model and the number of required IMUs may differ, as children with movement disorders often exhibit more complex and diverse gait patterns. Several research groups have successfully employed regression machine learning models to estimate gait time series in specific patient cohorts. Examples of previous studies on patients include the prediction of knee joint moments during gait in individuals with CP (Kwon et al., 2012), the estimation of knee joint kinematics in patients with knee osteoarthritis (Tan et al., 2022), and the forecasting of gait parameters for patients with osteoarthritis (OA) and those undergoing total knee arthroplasty (TKA) (Sharifi Renani et al., 2020). Notably, these investigations demonstrated high correlation coefficients ranging from 0.71 to 0.99, showcasing the viability of gait time series prediction in targeted patient groups using wearable sensors and machine learning models. While the model employed in this study demonstrates robust performance with TD children, its suitability for diverse pathologies warrants exploration. Gait patterns vary significantly across different conditions, making it imprudent to apply the exact same model to a new population. Consequently, our future endeavours will involve evaluating the performance of our algorithm on additional patient cohorts, including children with CP.
Another notable limitation is the computational resources required for the primary feature extraction and selection processes. We utilized high-performance computers with 80 GB of RAM to address this demand. However, once the model is trained, it can be executed on less powerful computers, focusing solely on extracting the selected features and providing inference from the model.
It is also important to acknowledge that the accuracy of estimations using data from other labs may not match the precision of our own results. This discrepancy can be related to variations in equipment and sensor usage across different laboratories. However, by incorporating data from multiple labs into the training dataset for our models, we can enhance the models' ability to generalize across different settings. In future work, it would also be valuable to consider the integration of a contactless monitoring system, akin to the approach developed in a separate study (Huang et al., 2024). Integrating such systems into the ML model holds the potential for real-time prediction of gait time series in children.

Strengths and contributions
By developing an ML model for predicting gait time series in children with diverse gait patterns, we achieved results comparable to studies focused on the adult population, particularly in the context of personalized modeling. We believe that our model offers several advantages over traditional methods that rely on IMUs for gait analysis. For instance, our model can predict a comprehensive set of lower limb joint angles and moments during gait using only two IMUs attached to the feet. To the best of our knowledge, this is the first study tailored to children's gait time series prediction, leveraging a combination of IMU data and ML techniques.
Other methodologies, which use sensor fusion algorithms rather than ML, require additional normalization steps to calculate each IMU sensor's orientation relative to each body segment, leading to inaccuracies and numerical drift errors. In contrast, our personalized models have good accuracy, can be streamlined, and work independently of the user's expertise. Following a single data collection session in a gait lab, remote patient monitoring becomes feasible by placing IMUs on the patient's feet and feeding the IMU data into the model for inference. Furthermore, this workflow can be utilized in real-time, as the inference time for the RF model is on the order of milliseconds.

Conclusion
The current study showed that RF and CNN models exhibit comparable results in the context of gait analysis within a typically developed pediatric population. The practicality of employing only two IMUs placed on the feet for predicting a comprehensive set of lower-limb joint kinematics and kinetics was successfully demonstrated. The presented workflow, employing foot IMUs, not only reduces processing time but also streamlines the integration of wearable sensors in clinical settings. Our forthcoming research endeavors will include increasing the sample size and introducing more variability to the overground walking scenarios to enhance the accuracy of our generalized model. Moreover, future work will be dedicated to developing ML models tailored to a cohort of children with movement disorders, specifically children with CP. This expansion promises to bring valuable insights and tools to the field of pediatric gait analysis, serving as a testament to the potential for advanced technology to benefit those with unique clinical requirements.

FIGURE 2 The workflow to develop the ML models. Step 1: Split the data into training and testing sets. Step 2: Window IMU data and extract features for the training dataset. Step 3: Feature selection. Step 4: Keep only the selected features in the training set. Step 5: Train ML models using selected features in the training set. Step 6: Window IMU data and extract the super feature set (determined in step 3) for the test dataset. Step 7: Test the trained model on the testing set.
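The windowing and feature extraction steps of this workflow (steps 2 and 6) can be illustrated with a minimal sketch. It uses plain Python summary statistics as a stand-in for the tsfresh feature set used in the study; the window size, step, signal values, and function names are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def sliding_windows(signal, size, step):
    """Split a 1-D signal into fixed-size windows taken every `step` samples."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]


def window_features(window):
    """Simple summary features for one window (stand-in for tsfresh)."""
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "min": min(window),
        "max": max(window),
    }


# Hypothetical foot gyroscope samples in rad/s (illustrative values only).
gyro_z = [0.0, 0.2, 0.5, 0.9, 0.4, -0.1, -0.6, -0.3, 0.1, 0.3]

windows = sliding_windows(gyro_z, size=4, step=2)
features = [window_features(w) for w in windows]
print(len(windows), features[0])
```

In such a pipeline, the same window size, step, and selected feature names would be reused unchanged on the test data, so that no information from the test set influences feature selection.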

FIGURE 6 The plots are made across all participants in the intra-subject examination, specifically for hip (A-C), knee (D-F), and ankle (G-I) joint moments in the sagittal plane. Panels (A,D,G) present the RF model's average predictions (the dashed red line represents the average, and the red shaded area indicates one SD) for joint moments, utilizing data from IMUs placed on the feet. These predictions are compared to the joint moments derived from the OpenSim ID tool (the solid blue line represents the average, and the blue shaded area indicates one SD). Panels (B,E,H) illustrate the correlation and R-squared (R²) values for the mentioned joint moment targets. In (C,F,I), we utilized Bland-Altman plots to visually depict the errors throughout one gait cycle for all participants. In these plots, the dashed blue line represents the mean error, and the mean ± 2SD is depicted as dashed red and green lines. Each distinct color in correlation and error plots represents the results of one participant.