A Framework for Sensor-Based Assessment of Upper-Limb Functioning in Hemiparesis

The ultimate goal of any upper-limb neurorehabilitation procedure is to improve upper-limb functioning in daily life. While clinic-based assessments provide an assessment of what a patient can do, they do not completely reflect what a patient does in his/her daily life. The use of compensatory strategies such as the use of the less affected upper-limb or excessive use of trunk in daily life is a common behavioral pattern seen in patients with hemiparesis. To this end, there has been an increasing interest in the use of wearable sensors to objectively assess upper-limb functioning. This paper presents a framework for assessing upper-limb functioning using sensors by providing: (a) a set of definitions of important constructs associated with upper-limb functioning; (b) different visualization methods for evaluating upper-limb functioning; and (c) two new measures for quantifying how much an upper-limb is used and the relative bias in their use. The demonstration of some of these components is presented using data collected from inertial measurement units from a previous study. The proposed framework can help guide the future technical and clinical work in this area to realize valid, objective, and robust tools for assessing upper-limb functioning. This will in turn drive the refinement and standardization of the assessment of upper-limb functioning.


INTRODUCTION
After neurological injury, individuals require physical rehabilitation to promote recovery, minimize disability, and maximize independent living. Despite years of research pointing to the benefits of repetitive practice, the time patients spend in inpatient rehabilitation settings is often much less than the recommended guidelines (De Wit et al., 2005; Barrett et al., 2018). Moreover, after discharge, patients do not have enough opportunities to do targeted movement therapy at home, sometimes leading to a pattern of "learned non-use" (André et al., 2004) and other compensatory strategies to accomplish daily activities.
Valid and reliable assessments are crucial for gaining a better understanding of a subject's sensorimotor state, and allowing us to tailor intervention strategies or improve health services. While clinic-based assessments of body function and activity can measure the capability of a patient, they are poor indicators of the actual use of a limb in day-to-day life (Mallinson and Hammel, 2010; Lemmens et al., 2012; Rand and Eng, 2012; Van Meulen et al., 2016). Thus, the assessment of movement behavior in natural settings is vital to evaluate recovery and the real-world impact of rehabilitation interventions. In the context of hemiparesis, such assessments can help gauge the extent to which changes in day-to-day activities can be attributed to true recovery or compensatory strategies.
There are four inter-related aspects that need consideration to build a comprehensive picture of upper-limb functioning in daily life: (a) amount of use (duration and/or intensity), (b) hand preference, (c) ability and capability, and (d) quality of movement. They can be posed as the following questions of interest to a clinician:
• Q1. How much is an upper-limb used during daily life?
• Q2. What is the relative preference for using the more-affected limb over the less-affected one?
• Q3. What kind of upper-limb tasks does the subject achieve in day-to-day activities?
• Q4. What is the quality of upper-limb movements performed during day-to-day activities?
Assessments such as the motor activity log (MAL) (Uswatte et al., 2006b) have been devised to capture, to an extent, such aspects of upper-limb functioning. In the MAL, the amount and quality of use are rated on an 11-point Likert scale for a set of pre-selected tasks. The amount and quality of the more-affected limb's use are reported by comparing it either to the less-affected limb or to the pre-stroke condition of that limb. However, the MAL has limited sensitivity and relies on a patient's ability to recall upper-limb use from memory. Thus, the MAL can only provide a coarse and subjective evaluation of upper-limb functioning in daily life. There is growing interest in wearable sensors for continuous and objective monitoring of upper-limb functioning (Bailey et al., 2014, 2015; McLeod et al., 2016; Bochniewicz et al., 2017; de Lucena et al., 2017; Lang et al., 2017; Leuenberger et al., 2017; David et al., 2020; Lum et al., 2020). Inertial sensors composed of accelerometers and gyroscopes have been the preferred modality for assessing upper-limb functioning in the natural setting, due to their availability, affordability, and compact size (Bailey et al., 2014, 2015; McLeod et al., 2016; Bochniewicz et al., 2017; de Lucena et al., 2017; Lang et al., 2017; Leuenberger et al., 2017; David et al., 2020; Lum et al., 2020). Thus far, the focus of sensor-based assessment in hemiparesis has been the quantification of the overall amount (Q1) and the relative bias (Q2) in using the upper-limbs during daily life (Bailey et al., 2014, 2015; McLeod et al., 2016; Bochniewicz et al., 2017; de Lucena et al., 2017; Lang et al., 2017; Leuenberger et al., 2017; David et al., 2020; Lum et al., 2020).
The current methods for quantifying the amount of upper-limb use have used either: (a) the magnitude of acceleration [e.g., activity counting (AC) (Uswatte et al., 2006a; Bailey et al., 2014; de Lucena et al., 2017)] or (b) the duration of functional movements detected from sensor data [e.g., gross movement (GM) score (Leuenberger et al., 2017; David et al., 2020), machine learning (ML) algorithms (McLeod et al., 2016; Bochniewicz et al., 2017; Lum et al., 2020)]. Although related, movement duration and intensity convey slightly different information, and individually they only provide a partial characterization of how much a particular arm is used. A complete picture of how much an arm is used in daily life requires knowledge of both the duration and the intensity of the upper-limb movements. Also, there is currently little work on using sensor data for determining the nature of tasks/activities and quantifying the quality of movements performed during daily life in neurorehabilitation applications. These aspects are likely to be explored in the coming years with the increasing interest in this area, and the availability of more data and more sophisticated data analysis methods.
In order to develop rigorous methods to assess different aspects of upper-limb functioning, now is an opportune moment to lay a good foundation for this problem through a formal framework consisting of: (a) definitions of essential concepts and their interrelationships, (b) visualization methods for the information collected and computed from the sensor data, and (c) quantitative measures for different aspects of upper-limb functioning. Such a framework can help steer future technical developments in the appropriate direction, and limit work on ill-founded methods.
This paper presents a framework for the sensor-based assessment of upper-limb functioning, targeting researchers developing and validating quantitative methods for sensorimotor assessments. This framework focuses on questions Q1 and Q2 described earlier, which must be addressed before the other two questions can be answered. The paper starts with formal definitions of the various concepts (section 3) of the framework. This is followed by qualitative and quantitative analysis (section 4) of different methods for visualizing how much an upper-limb is used (Q1), and the relative bias between the two upper-limbs (Q2). We note two important points about the current work to set the right context for the reader:
1. The work presented here is theoretical in nature, concentrating primarily on clearly defining essential concepts and delineating their relationships. Demonstrations of the different concepts, new visualizations, and measures are preliminary in nature, carried out through data collected from a previous pilot clinical study.
2. The framework presented in sections 3-4 does not make any overt assumptions on the sensing modality used for assessing upper-limb functioning. The exact nature of the sensing modality can, however, have a major influence on the fidelity of the assessment.
The clinical relevance of the proposed framework and its limitations are discussed in the final section of the paper (section 5). We also bring to light some important issues that should be addressed in the coming years for making pervasive, sensor-based objective assessment of upper-limb functioning a clinical reality.

THE NATURE OF ASSESSMENT OF UPPER-LIMB FUNCTIONING
Any assessment of human behavior in a natural setting is a nontrivial undertaking due to the lack of control over important confounding factors influencing behavior (e.g., desk vs. manual jobs would lead to very different movement patterns), and the constraints in measuring the information of interest (e.g., privacy/security issues). Note that behavior in this context refers to the different tasks, postures, and movements carried out by a subject. In standard clinic-based assessments of motor ability (e.g., FMA, ARAT, etc.), these factors are controlled by defining them as part of the assessment protocol (e.g., definition of the task, limb to be used, measurement approach, etc.). A controlled environment for assessment enables clear interpretation of the observed movement behavior, and simplifies intra- and inter-subject comparison of motor abilities. This luxury is absent in the assessment of upper-limb functioning in natural settings, which uses measurements during unconstrained behavior to assess the different aspects of upper-limb functioning (questions Q1-Q4 in section 1). Behavior is affected by two types of factors:
• Intrinsic factors are directly related to upper-limb functioning, and are the factors to be estimated by the assessment process (shown within the purple ellipse in Figure 1). The motor ability and the preference for the two upper-limbs will determine the types of tasks, amount, and quality of movements performed by a subject during daily life. For example, lower ability is likely to reduce the overall use of the affected upper-limb and result in poorer quality of movements. This will also discourage its use in complex, high intensity, and long duration tasks. Similarly, a subject might avoid doing fine manipulation tasks with the affected dominant upper-limb.
• Extrinsic factors are confounders that influence behavior and thus affect the assessment outcome (listed on the left of the purple ellipse in Figure 1). Some of these factors include the time of observation of behavior, personal and professional constraints, etc. For instance, the observed behavior is likely to be different earlier vs. later in a day, or on different days of the week/month, etc., due to changes in the requirements and constraints of daily routine. Similarly, constraints from personal and professional life will influence behavior.
Interpretation of behavior through measurements is influenced by two factors, namely, the nature of the sensing modality used, and the duration for which a subject is observed (shown above the green ellipse in Figure 1). The exact choice of sensing modality is constrained by the conflicting requirements of being minimally obtrusive while gathering maximal information for the accurate assessment of upper-limb functioning. An ideal sensing modality must be compact, wearable, aesthetic, and must comply with the necessary privacy requirements, while still gathering rich behavioral information. Furthermore, the duration of observation will also determine the quality of information gathered for assessment; longer observation periods are likely to better capture "typical" behavior than shorter ones, and thus provide a less biased estimate of different aspects of upper-limb functioning. However, longer observation periods are likely to have poor compliance, due to the inconvenience of using the sensors for recording daily behavior. Thus, the assessment of upper-limb functioning must control for as many extrinsic factors as possible (e.g., fix the time of observation within and across subjects), while also ensuring the choice of sensing modality and the duration of observation are kept unchanged. This can minimize the effect of these factors on the variability of assessment outcomes within and across subjects, thus improving the interpretability of the outcomes. It is, however, crucial to be aware that there are other extrinsic factors that still influence the outcome, and thus interpreting outcomes of upper-limb functioning must be done with care.

ASSESSING UPPER-LIMB FUNCTIONING: FORMAL DEFINITIONS
FIGURE 2 | A directed graph representation of the connections between the different constructs defined in the proposed framework. The leftmost node represents the measurements, while the rest of the nodes are constructs of interest in the assessment of upper-limb functioning. The construct at the end of a directed edge is derived using the construct/measurements at the start of the directed edge. The measures (gray text) used to quantify a construct from measurements are placed above the directed edge. The brown text next to some of the constructs indicates how two constructs are combined to derive the target construct.
Before getting into the details of the framework, we start with a brief overview of the general process of sensorimotor assessment in the context of neurorehabilitation. This detour is necessary to establish the meanings of the terms "evaluation," "assessment," "measure," and "measurements," as some of these terms are used to mean different things in the current literature. The process of determining the sensorimotor state of a subject is a hierarchical process with clinical evaluation at its highest level. We define evaluation as the process of interpreting the results of one or more assessments to gauge a subject's sensorimotor state with respect to a reference, either the same subject at a different time point (intra-subject), or another subject (inter-subject). For instance, an evaluation is performed when comparing the results of ARAT assessments across different time points, or comparing smoothness of reaching movements of a patient against normative data. Evaluations can be aided through visualizations that allow interpretation of assessments. One level below in this hierarchy are assessments, which we define as the process of quantifying (i.e., putting numbers to) abstract theoretical constructs (e.g., smoothness, coordination, synergies). For instance, the Fugl-Meyer upper-limb assessment is a process of quantifying the constructs "motor function," "synergy," and "coordination." Unlike an evaluation, an assessment only deals with assigning numbers (or labels in some cases) to constructs of interest. Assessments require clearly defined protocols for collecting data (e.g., tasks/movements to be performed), and measures. A measure is a well-defined mathematical function/formula, a computational algorithm, or a set of rules for mapping measurements or observations to quantities that are interpretable in the context of the given construct. For instance, spectral arc length (SPARC) and log dimensionless jerk (LDLJ) are measures of the construct "movement smoothness"; the rules used for assigning a score to the flexion synergy task in the Fugl-Meyer assessment are a measure of the construct "flexion synergy." Measures with good properties are essential to obtain valid, reliable, and interpretable assessments.
Finally, measurements are records of variables (e.g., speed, position, orientation, etc.) obtained through various sensors or through human observation. Measurements are used by measures to quantify constructs, e.g., measurements of movement speed profile are used by the SPARC measure to quantify movement smoothness.
In the rest of this section we provide definitions of the constructs of the proposed framework for assessing upper-limb functioning. The relationships between the various constructs introduced in this section are summarized in Figure 2. Relevant literature that supports the constructs defined below is given in Table 1. We make no explicit assumption on the types of measurements available for quantifying the different constructs defined below. It should, however, be evident that the different types of measurements will vary in terms of the amount of information they convey about the different constructs.

Measurement Space
The type of sensor measurements available for an assessment will determine the steps in the analysis pipeline and the nature of information extracted about a subject's sensorimotor state. It is thus crucial for any assessment procedure to clearly state the measurement variables used to quantify a specific construct. In this context, we define measurement space as the following.
Definition Measurement space is the universal set of all possible sensor measurements available from an upper-limb for quantifying the different constructs in an assessment.
We denote this set, the measurement space, by $\mathcal{M}$ and assume that the same quantities are measured from both upper-limbs for the given assessment procedure. Inertial sensing is one of the most common modalities used in the current literature, where the arm movements are measured using wrist-worn accelerometers ($\mathcal{M} = \mathbb{R}^3$) or IMUs¹ ($\mathcal{M} = \mathbb{R}^6$). For a more elaborate measurement setup that includes wrist position and orientation, along with $k$ joint angles of the arm, $\mathcal{M} = \mathbb{R}^3 \times SO(3) \times [0, 2\pi)^k$; where $\mathbb{R}^3$ is the set of all possible wrist positions, $SO(3)$ is the special orthogonal group of all rotation matrices representing 3D orientations of the wrist, and $[0, 2\pi)^k$ is the set of $k$ joint angles.
TABLE 1 | Relevant literature for the constructs of the framework (only the following rows are recoverable).
Construct | Relevant literature
… | None*
Average upper-limb activity ($A_i$) | None*
Task ($\tau_i$) | … for detecting tasks using different sensors (e.g., Bulling et al., 2014; Nweke et al., 2018)
Movement quality | None*
For the components of the framework that have not been explored in the current literature, the entry is filled as "None". *To the best of the authors' knowledge.
Measurements for an assessment are made over a finite observation period referred to as the measurement epoch. Let $[0, T]$ denote the measurement epoch, and let $\mathcal{M}([0, T])$ denote the set of all measurement signals defined over it, i.e., functions mapping $[0, T]$ to $\mathcal{M}$. In addition to specifying $\mathcal{M}$, it is also essential for the reproducibility of an assessment to clearly specify the exact sensors used for the measurements, their accuracy, noise characteristics, resolution, sampling rate, etc. The values of these parameters have practical implications for data analysis and interpretation. These practical issues will not be considered in this manuscript, and to simplify the presentation the mathematical formalism used assumes all measurements to be continuous in time and space.
¹ IMU (Inertial Measurement Unit): consists of an accelerometer and a gyroscope.

Upper-Limb Use
Definition Upper-limb use is a binary construct indicating the presence or absence of a voluntary, meaningful movement, or posture of an upper-limb.
In this definition, the boundary of what constitutes a "meaningful" movement/posture must be defined a priori. Some examples of meaningful use include reaching and grasping, turning a doorknob, stabilizing an object with one limb while manipulating it with the other, holding a glass, writing, typing, upper-limb therapy exercises, etc. Under this definition, involuntary and passive upper-limb movements/postures are not considered meaningful, e.g., resting the arm on a table, the upper-limb being moved by an external force, etc. There are, however, cases where the presence/absence of upper-limb use is ambiguous, e.g., arm swing during walking, passively resting the upper-limb on a book to prevent the pages from turning, etc. Such ambiguities are best resolved in an application-specific manner, where the set of tasks considered as meaningful is clearly stated a priori. For instance, in the current upper-limb use literature arm swing during walking is not considered meaningful, even though it is unlikely to be a purely passive movement (Blouin and Fitzpatrick, 2010).
Upper-limb use can be mathematically represented as a binary signal over time computed from upper-limb measurements $M_i \in \mathcal{M}([0, T])$, where $i \in \{l, r\}$. Let $f_u$ be a function representing a measure that maps a given measurement signal $M_i$ to a binary signal $u_i$ over the same temporal domain, i.e.,
$$u_i = f_u(M_i), \quad u_i(t) \in \{0, 1\} \;\; \forall\, t \in [0, T] \tag{1}$$
where $u_i$ is the upper-limb use signal of the upper-limb $i$. The choice of $f_u$ is determined by several factors, such as the measurement space $\mathcal{M}$, computational complexity of the measure, sensitivity/specificity of the measure, etc. All measures of upper-limb use exploit some common structure in functional/meaningful movements present in the measured data to detect upper-limb use. Some examples of the current measures $f_u$ that make use of accelerometers or IMUs are:
• Thresholded activity counting. Activity counting (AC) is one of the most popular methods in the literature to quantify upper-limb functional and non-functional activity (Uswatte et al., 2006a; Bailey et al., 2014, 2015; de Lucena et al., 2017). AC has high sensitivity, but poor specificity (Subash et al., 2020). AC can be used with both accelerometers and IMUs. Upper-limb use $u_i$ is computed from AC by assigning a value of 1 whenever the AC is above a threshold, and 0 otherwise.
• Gross movement (GM) score. The GM score (a.k.a. Gross Counts or Gross Movement Identification method) proposed by Leuenberger et al. (2017) reconstructs the forearm orientation using a wrist-worn IMU to detect movements that occur in a pre-specified range of forearm orientations. It is 1 whenever there are arm movements in this pre-specified range of forearm orientations, else it is 0. The GM score is highly specific, but has low sensitivity (Subash et al., 2020), and it can only be used with an IMU.
• Random forests classifier. Bochniewicz et al. (2017) proposed the use of a random forests classifier to detect upper-limb use from features extracted from an accelerometer. This machine learning (ML) approach can be used with both accelerometers and IMUs, and has reasonable sensitivity and specificity (Lum et al., 2020).
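As a rough illustration of how a threshold-based measure of upper-limb use maps measurements to a binary signal, the sketch below thresholds the magnitude of a sampled acceleration signal. It is a toy stand-in, not the published activity counting algorithm: the function name `detect_use`, the threshold value, and the assumption that gravity has already been removed from the samples are all hypothetical.

```python
import math

def detect_use(accel, threshold):
    """Toy use measure: 1 where the acceleration magnitude exceeds
    `threshold`, else 0. `accel` is a list of (ax, ay, az) samples that
    are assumed (hypothetically) to be gravity-compensated."""
    return [
        1 if math.sqrt(ax * ax + ay * ay + az * az) > threshold else 0
        for ax, ay, az in accel
    ]

# Four samples from a (hypothetical) right wrist sensor, in m/s^2.
accel_r = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.5, 0.0), (6.0, 8.0, 0.0)]
u_r = detect_use(accel_r, threshold=1.0)
print(u_r)  # [0, 1, 0, 1]
```

A measure of this kind responds to any sufficiently large acceleration, functional or not, which is in line with the high sensitivity and poor specificity reported for thresholded activity counting.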
Upper-limb use as defined in this section is an idealized construct, and its detection in practice using sensor measurements will be error-prone due to measurement noise, the natural intra- and inter-subject movement variability, and the relative sensitivity of the sensor measurements to movements and postures. The nature of the measurements $M_i$ and the choice of measure $f_u$ will influence how well upper-limb use can be quantified (e.g., sensitivity and specificity) in practice. The upper-limb use signals from the two limbs can be used for defining unimanual and bimanual upper-limb use at time $t$ as the following:
$$\text{Unimanual use of the right limb: } u_r(t) \cdot \bigl(1 - u_l(t)\bigr), \qquad \text{Bimanual use of both limbs: } u_r(t) \cdot u_l(t) \tag{2}$$
with unimanual use of the left limb obtained by exchanging the roles of the two limbs. Unimanual use refers to a situation where only one of the upper-limbs is used, i.e., only $u_r(t)$ or $u_l(t)$ is 1 at time instant $t$, but not both. On the other hand, bimanual use involves the use of both limbs simultaneously, i.e., both $u_r(t)$ and $u_l(t)$ are 1 at a given time instant $t$. Bimanual use will include a wide range of coordination patterns between the two limbs (Kantak et al., 2017), ranging from completely independent use of the two limbs (e.g., writing with one hand while holding and drinking from a cup with the other hand) to movement of the two limbs with tight spatio-temporal synchronization (e.g., holding and balancing a tray of glasses with both hands).
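The unimanual and bimanual use signals can be sketched for sampled binary use signals as follows. The bimanual signal is the product of the two use signals, as in the text; encoding unimanual use as "this limb is 1 and the other is 0" is our reading of the verbal definition, and the function name `split_use` is hypothetical.

```python
def split_use(u_r, u_l):
    """Split two binary use signals (lists of 0/1 of equal length) into
    unimanual-right, unimanual-left, and bimanual signals."""
    uni_r = [r * (1 - l) for r, l in zip(u_r, u_l)]  # right used, left not
    uni_l = [l * (1 - r) for r, l in zip(u_r, u_l)]  # left used, right not
    bi = [r * l for r, l in zip(u_r, u_l)]           # both used together
    return uni_r, uni_l, bi

u_r = [1, 1, 0, 0]
u_l = [1, 0, 1, 0]
uni_r, uni_l, bi = split_use(u_r, u_l)
print(uni_r, uni_l, bi)  # [0, 1, 0, 0] [0, 0, 1, 0] [1, 0, 0, 0]
```

At every sample at most one of the three signals is 1, and all three are 0 only when neither limb is in use.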

Instantaneous Intensity of Use
Definition Instantaneous intensity of use is a construct that reflects how strenuous a movement/posture is at a particular instant of time, when the upper-limb is in use.
Some examples of measures $f_\mu$ to quantify instantaneous intensity of use include the magnitude of movement velocity, acceleration, interaction force, muscle activity, etc. Let $\mu_i$ represent the instantaneous intensity of use signal for the upper-limb $i$. It assumes non-negative values when the upper-limb is used, and is defined to be zero otherwise,
$$\mu_i(t) = \begin{cases} f_\mu(M_i)(t) \ge 0, & u_i(t) = 1 \\ 0, & u_i(t) = 0 \end{cases} \tag{3}$$
The exact choice for $f_\mu$ is application-specific and dictated by $\mathcal{M}$. We also note that results obtained from different types of measurements and different measures $f_\mu$ might not be comparable, e.g., the magnitude of movement velocity can be independent of the magnitude of movement acceleration. Thus, it is imperative to report the exact $f_\mu$ and its units when reporting the instantaneous intensity of use. Activity counting, as defined in Bailey et al. (2014), Bailey et al. (2015), and de Lucena et al. (2017), is an example of an instantaneous intensity of use measure in the current literature.
In general, $\mu_i(t)$ will not be uniformly zero in a continuous interval of time $t \in [t_1, t_2]$ where there is a functional movement. However, $\mu_i(t)$ can be uniformly zero in a continuous interval under two circumstances: (a) the upper-limb is not in use during the interval, i.e., $u_i(t) = 0$; or (b) the upper-limb holds a posture during the interval, for some choice of measurement signal and $f_\mu$. For example, activity count and the magnitude of movement acceleration/velocity will be zero during an upper-limb posture. On the other hand, the magnitude of muscle activity controlling the upper-limb will not be zero even while holding a voluntary posture.
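The ambiguity of a zero intensity value can be made concrete with a small sketch. Here the intensity measure is taken, purely as an assumption for illustration, to be the magnitude of gravity-compensated acceleration, gated by the use signal; the function name `instantaneous_intensity` is hypothetical.

```python
import math

def instantaneous_intensity(accel, u):
    """Intensity signal: acceleration magnitude while the limb is in use
    (u == 1), and zero otherwise. `accel` holds (ax, ay, az) samples that
    are assumed to be gravity-compensated."""
    return [
        math.sqrt(ax * ax + ay * ay + az * az) if use == 1 else 0.0
        for (ax, ay, az), use in zip(accel, u)
    ]

accel = [(3.0, 4.0, 0.0), (0.0, 0.0, 0.0), (6.0, 8.0, 0.0)]
u = [1, 1, 0]  # sample 2: voluntary posture (in use); sample 3: not in use
mu = instantaneous_intensity(accel, u)
print(mu)  # [5.0, 0.0, 0.0]
```

The second and third samples both yield zero intensity, the former because the limb holds a posture and the latter because the limb is not in use, so the intensity signal alone cannot tell the two cases apart.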

Average Upper-Limb Use or Relative Duration
Definition Average upper-limb use or relative duration is a construct that reflects the proportion of time an upper-limb is used in a given time period D.
Average upper-limb use at time $t$, denoted by $U_i(t; D)$, can be computed as the average value of $u_i$ in the past $D$ seconds,
$$U_i(t; D) = \frac{1}{D} \int_{t-D}^{t} u_i(s)\, ds \tag{4}$$
where $U_i(t; D) \in [0, 1]$ is a smoothed version of $u_i$. We will drop $D$ in the parenthesis in the rest of the manuscript and use it only if it needs to be explicitly mentioned. From Equation (4), we can immediately identify some essential properties of $U_i$:
• $U_i(t)$ conveys the proportion of the interval $(t - D, t]$ in which the upper-limb was used, but not the specific time instants where the upper-limb was used, i.e., where $u_i(t)$ was 1. Thus, there are infinitely many $u_i$s that can result in the same $U_i$.
• The value of the parameter $D$ will depend on the application, and controls the amount of smoothing of $u_i$; larger values of $D$ will result in a smoother $U_i$ while compromising time localization of the information conveyed by $U_i$. When $D = T$, $U_i$ measures the proportion of time the upper-limb $i$ was used over the entire measurement epoch.
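For sampled data, average upper-limb use reduces to a trailing moving average of the binary use signal. The sketch below is a minimal discrete-time version under stated assumptions: the window is given in samples rather than seconds, and the window is simply truncated at the start of the recording. The function name `average_use` is hypothetical.

```python
def average_use(u, window):
    """Trailing mean of a binary use signal over the last `window`
    samples (fewer samples are used at the start of the recording)."""
    out = []
    for n in range(len(u)):
        seg = u[max(0, n - window + 1): n + 1]
        out.append(sum(seg) / len(seg))
    return out

u_a = [1, 1, 0, 0]  # used early in the window
u_b = [0, 0, 1, 1]  # used late in the window
print(average_use(u_a, 4)[-1], average_use(u_b, 4)[-1])  # 0.5 0.5
```

The two use signals differ at every sample yet give the same average use over the full window, which illustrates that the average discards the timing of use.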

Average Intensity of Use
Definition Average intensity of use is a construct that reflects the average intensity of movements during upper-limb use in a given time period $D$.
Average intensity of use $I_i(t)$ can be computed from the upper-limb use signal $u_i$ and the instantaneous intensity of use signal $\mu_i(t)$ as the following,
$$I_i(t) = \begin{cases} \dfrac{\int_{t-D}^{t} \mu_i(s)\, ds}{\int_{t-D}^{t} u_i(s)\, ds}, & U_i(t) > 0 \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
where $I_i(t) \in \mathbb{R}_{\ge 0}$. The same ambiguity as for $\mu_i(t)$ exists when $I_i(t) = 0$ for some time instant $t$. $I_i(t) = 0$ could mean that the upper-limb was either not used during the time interval $(t - D, t]$ or that it was used for performing upper-limb postures, depending on the measure used to quantify instantaneous intensity of use.
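A discrete-time sketch of average intensity of use is given below. It assumes one plausible reading of the construct: the mean of the instantaneous intensity over the in-use samples of a trailing window, defined as zero when the limb was not used in the window. The window is in samples, and the function name `average_intensity` is hypothetical.

```python
def average_intensity(mu, u, window):
    """Mean intensity over in-use samples in a trailing window of
    `window` samples; 0.0 when the limb was not used in the window."""
    out = []
    for n in range(len(u)):
        lo = max(0, n - window + 1)
        seg_u = u[lo: n + 1]
        seg_mu = mu[lo: n + 1]
        used = sum(seg_u)
        # Only intensity recorded while the limb is in use contributes.
        total = sum(m * x for m, x in zip(seg_mu, seg_u))
        out.append(total / used if used > 0 else 0.0)
    return out

mu = [2.0, 4.0, 0.0, 0.0]
u = [1, 1, 0, 0]
print(average_intensity(mu, u, window=2))  # [2.0, 3.0, 4.0, 0.0]
```

The last value is zero because the limb was not used in the final window; a zero would also result from a window containing only postures if the chosen intensity measure is zero during postures.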

Average Upper-Limb Activity
The amount of use of an upper-limb during a measurement epoch depends on both the duration and intensity of movements performed during this period, which are captured by $U_i$ and $I_i$, respectively.
Definition Average upper-limb activity is a construct that reflects how long and how intensely an upper-limb is used in a given time period $D$.
High amounts of average upper-limb activity correspond to long duration, high intensity movements, while low activity corresponds to short duration, low intensity movements. Average upper-limb activity $A_i$ of the upper-limb $i$ can be captured by the product of $U_i$ and $I_i$, which quantifies the co-variation of these two factors. We thus define $A_i$ as,
$$A_i(t) = U_i(t) \cdot I_i(t) \tag{6}$$
where $A_i(t) \in \mathbb{R}_{\ge 0}$ assumes non-negative values and is upper-bounded by $I_i(t)$, since $U_i(t) \le 1$. A subject with high values of $A_i$ would be referred to as more active than one with lower values of $A_i$. Visualization of how much an upper-limb is used during a measurement epoch, and its quantification through a single number using $A_i$, are discussed in section 4.2.
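The product form of average upper-limb activity can be checked numerically with a short sketch. It assumes sampled signals, a trailing window measured in samples, and average intensity taken as the mean intensity over in-use samples; the function name `average_activity` is hypothetical.

```python
def average_activity(mu, u, window):
    """Return (U, I, A) at the last sample for a trailing window of
    `window` samples, with A defined as the product U * I."""
    seg_u = u[-window:]
    # Intensity contributes only while the limb is in use.
    seg_mu = [m * x for m, x in zip(mu[-window:], seg_u)]
    used = sum(seg_u)
    U = used / len(seg_u)                         # proportion of time in use
    I = sum(seg_mu) / used if used > 0 else 0.0   # mean intensity while in use
    return U, I, U * I

mu = [2.0, 4.0, 0.0, 0.0]
u = [1, 1, 0, 0]
U, I, A = average_activity(mu, u, window=4)
print(U, I, A)  # 0.5 3.0 1.5
```

Under these assumptions the activity equals the mean of the intensity signal over the whole window, (2.0 + 4.0) / 4 = 1.5, and it never exceeds the average intensity because the average use is at most 1.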

Functional Workspace
We are also often interested in knowing the region of space around a person's body where an upper-limb is used to reach and manipulate the environment (Ploderer et al., 2016). The in-clinic assessment of the active range of motion of an upper-limb only tells us about the space that can be reached by a subject. It does not necessarily convey information about the space a subject regularly moves in to carry out daily activities. We refer to the latter as the functional workspace, defined as the following.
Definition Functional workspace of an upper-limb is a quantitative representation of space traversed by an upper-limb when carrying out functional activities during daily life.
In this definition, space could be the endpoint (Euclidean) space of the hand represented in an egocentric frame of reference, or the joint space of the upper-limb, composed of the various joints of the limb.
Let $W_i(t)$ represent the functional workspace of upper-limb $i$ computed at the time instant $t$ from a segment of $M_i$ from the last $D$ sec, i.e., from $t - D$ sec to $t$ sec. In order to have a general and informative representation of the workspace, we define $W_i$ to be a probability density function over the space of interest, where the density at a given spatial location is proportional to the relative amount of time spent in a small volume of space around the spatial location during the last $D$ sec of performing functional movements or postures,
$$W_i(t) = f_W(M_i, u_i; t, D) \tag{7}$$
where $f_W(M_i, u_i; t, D)$ is the function that computes a probability density function from the segment of the measurement $M_i$ and upper-limb use $u_i$ signals between times $t - D$ and $t$. Such a probability density function would allow useful visualization of the functional workspace as a heatmap representing regions of space that a subject is comfortable traversing during their daily activities. Such heatmaps have been found to be useful by clinicians to understand the nature of use of the upper-limb (Ploderer et al., 2016). It should be noted that one might also be interested in the workspace of the limb when performing non-functional movements (the non-functional workspace), which can be obtained from $f_W(M_i, 1 - u_i; t, D)$. Additionally, the use of probability density functions for $W_i$ also enables one to compute various summary measures about the nature of the workspace, such as the volume of the functional workspace, preferred spatial locations during day-to-day activities, symmetry of the functional workspace between the two limbs, etc., which help to characterize different aspects of the functional workspace.
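One simple way to approximate such a workspace from sampled data is a normalized occupancy histogram over a voxel grid, keeping only the samples where the limb is in use. The sketch below is an assumption-laden illustration: it presumes that hand positions in an egocentric frame are available, uses a hypothetical cubic voxel size `bin_size`, and returns probability mass per voxel (dividing by the voxel volume would give a density). The function name `functional_workspace` is hypothetical.

```python
from collections import Counter

def functional_workspace(positions, u, bin_size):
    """Fraction of in-use samples falling in each voxel of side
    `bin_size`. Returns {voxel_index: probability}; empty if the limb
    was never in use."""
    counts = Counter(
        tuple(int(c // bin_size) for c in p)
        for p, use in zip(positions, u)
        if use == 1
    )
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()} if total else {}

# Hand positions in metres; the last sample is ignored because u == 0.
positions = [(0.05, 0.05, 0.0), (0.07, 0.02, 0.0), (0.25, 0.05, 0.0), (0.9, 0.9, 0.9)]
u = [1, 1, 1, 0]
W = functional_workspace(positions, u, bin_size=0.1)
for cell, prob in sorted(W.items()):
    print(cell, round(prob, 3))
```

Passing the complemented use signal (1 - u at each sample) to the same function gives the corresponding histogram for non-functional movements.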

Task
The constructs discussed so far ($u_i$, $\mu_i$, $U_i$, $I_i$, $A_i$, and $W_i$) are task-agnostic constructs that only depend on whether or not a meaningful movement or posture is performed, irrespective of its type (e.g., reaching, manipulation, drawing). To elucidate the nature of upper-limb use, task-specific measures are required, i.e., measures that can classify the types of tasks being performed, how well these tasks are performed, etc. This information could be used to target therapy to accomplish specific rehabilitation goals. To carry out task-specific analysis, one must first define a set of tasks of interest that can be identified from the measurement $M_i$.
Definition Task is any upper-limb movement or postural pattern of interest.
Let the set T 0, 1, 2, . . . p ⊂ N be a set of natural numbers representing the p distinct tasks of interest; the numbers from 1 to p correspond to the p tasks, and 0 represents all tasks other than these p tasks of interest. Let τ i (t) ∈ T ([0, T]) represent the task performed by the upper-limb i at time t.
n, Task n is being performed by UL i at time t. 0, None of the n tasks of are being performed by ULi at time t.
(8) The function f τ is a measure that maps the measurement signal . We assume that, in general, the p tasks of interest are functional in nature, which implies τ i (t) can take on a non-zero value only if u i (t) = 1. Similar to upper-limb use, the detection of tasks from the measurement data will also be probabilistic in nature due to the natural intra-and inter-subject movement variability (Bulling et al., 2014). Human activity recognition using various sensing modalities is a current area of research in the machine learning and artificial intelligence community (Bulling et al., 2014;Nweke et al., 2018). Wearable sensors such as IMUs and vision systems are two commonly employed sensing modalities for recognizing different activities or tasks. The choice of the algorithm f τ will depend on several interrelated factors, all of which have a bearing on its overall performance in detecting tasks: • the exact nature of the measurements M, determined by the types and number of sensors used for measuring movement behavior. A diverse set of sensing modalities and higher numbers of sensors is likely to result in better detection performance. For wearable sensors, there is evidence that indicates that sensors on multiple segments of the arm can result in better performance than a single sensor just on the hand (Bulling et al., 2014). • the sets of tasks T that one is interested in detecting.
The choice of the specific tasks of interest is application specific, and must ensure they can be detected using the available sensor data. Ideally, one must avoid complex tasks which are composed of a set of "sub-tasks" (e.g., cooking) and require contextual information about the user and the environment that is often not available (Van Meulen et al., 2016). Furthermore, the chosen tasks in T must have distinct kinematic patterns that can be robustly distinguished with the available measurements.
• the size of annotated ground-truth data available for training a chosen algorithm will determine the nature of the algorithm that can be used. This factor also has direct implications for how well the algorithm's performance generalizes to unseen data. The size of the training dataset required for any algorithm will depend on the nature of the algorithm, the number of tasks, and the intra- and inter-class variability in the data for the problem of interest. For example, machine learning algorithms perform better in detecting movement behavior in healthy subjects compared to patients (Lum et al., 2020), arguably due to an increased inter-subject (and possibly also intra-subject) variability in patient populations.
• the availability of computational resources will also influence the types of algorithms that can be trained and used for detecting tasks. Real-time algorithms running on wearable devices will have more constraints (Laput and Harrison, 2019) than algorithms that work on the data offline on a PC.
A range of different algorithms have been explored for human activity recognition, such as Hidden Markov Models, Decision Trees, Support Vector Machines, k-Nearest Neighbors, etc. (Bulling et al., 2014). More recently there has been an increased use of deep learning networks for activity recognition (Nweke et al., 2018), which have shown very promising results even with a single wrist-worn smartwatch measuring accelerations of the wrist (Laput and Harrison, 2019).
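As a rough illustration of the constraint that τ i (t) can be non-zero only when u i (t) = 1, the following Python sketch turns per-sample class probabilities into a task signal. The classifier output `class_probs`, the function name `detect_tasks`, and the `min_confidence` threshold are hypothetical and not part of the framework; any of the algorithms mentioned above could supply the probabilities.

```python
from typing import Sequence


def detect_tasks(class_probs: Sequence[Sequence[float]],
                 use: Sequence[int],
                 min_confidence: float = 0.5) -> list:
    """Return one task label per sample.

    class_probs[k][n] is a hypothetical classifier's probability that
    task n (0 = "other", 1..p = tasks of interest) occurs at sample k.
    use[k] is the binary upper-limb use signal u_i at sample k.
    """
    if len(class_probs) != len(use):
        raise ValueError("class_probs and use must have the same length")
    labels = []
    for probs, u in zip(class_probs, use):
        best = max(range(len(probs)), key=lambda n: probs[n])
        # Tasks of interest are assumed functional, so they require u_i = 1;
        # low-confidence detections (an assumed rule) also fall back to 0.
        if u != 1 or probs[best] < min_confidence:
            best = 0
        labels.append(best)
    return labels


probs = [[0.1, 0.8, 0.1], [0.2, 0.1, 0.7], [0.4, 0.35, 0.25], [0.05, 0.9, 0.05]]
use = [1, 1, 1, 0]
tau = detect_tasks(probs, use)
```

In this toy input the last sample is confidently classified as task 1, but it is still labeled 0 because the limb is not in use at that instant.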

Movement Quality (MQ)
Definition: Movement quality (MQ) is a construct that reflects the quality of the underlying sensorimotor control. MQ is a high level construct indicative of the motor ability of a user, and is composed of other constructs such as movement smoothness (Balasubramanian et al., 2015), coordination (Levin, 1996; Cirstea et al., 2003), tremor (Mansur et al., 2007), etc. Movement quality includes both task-specific and task-agnostic constructs, which differ primarily in terms of their computational procedure and their interpretation. Task-agnostic measures of movement quality (e.g., amount of tremor) can be computed from M i without worrying about the underlying tasks being performed. For instance, the amount of tremor in a frequency band of interest could be computed over short segments of movement data from the entire measurement epoch. The numbers thus obtained indicate the change in the amount of tremor as a function of time. Although there can be several reasons (e.g., the task currently performed) that influence the amount of tremor experienced by a subject, these reasons do not influence what the numbers mean at a given time instant. On the other hand, task-specific measures (e.g., smoothness, coordination, etc.) must be computed only from complete data segments corresponding to a particular occurrence of a specific task. The appropriate interpretation of such task-specific movement quality measures requires contextual information, which must include at least the task being performed (Balasubramanian et al., 2015). For example, an equally smooth reaching movement and a drawing movement will result in two different values computed from the same smoothness measure, because the spatio-temporal constraints of the two tasks are different (Balasubramanian et al., 2015). Thus, for such task-specific constructs the numbers alone are not sufficient to appropriately interpret the quality of the observed movement. The context in which the different tasks are performed is also crucial for meaningfully interpreting these numbers in this scenario (Ploderer et al., 2016).
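The task-agnostic tremor computation described above can be sketched as follows: the signal is cut into short windows and, for each window, the power in a frequency band is summed from a discrete Fourier transform. The 4 to 8 Hz band, the 2 s window, and the function names are assumptions made for illustration only; a naive DFT is used to keep the sketch dependency-free.

```python
import cmath
import math


def band_power(segment, fs, f_lo, f_hi):
    """Sum of squared, length-normalized DFT magnitudes within [f_lo, f_hi] Hz."""
    n = len(segment)
    mean = sum(segment) / n
    centered = [x - mean for x in segment]  # remove the constant offset
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * m / n)
                        for m, x in enumerate(centered))
            power += abs(coeff) ** 2 / n ** 2
    return power


def tremor_profile(signal, fs, window_s=2.0, f_lo=4.0, f_hi=8.0):
    """Band power in consecutive non-overlapping windows of window_s seconds."""
    win = int(window_s * fs)
    return [band_power(signal[start:start + win], fs, f_lo, f_hi)
            for start in range(0, len(signal) - win + 1, win)]


fs = 50.0
# Synthetic data: 2 s of a slow 1 Hz movement, then 2 s with a 6 Hz component added.
slow = [math.sin(2 * math.pi * 1.0 * n / fs) for n in range(100)]
shaky = [math.sin(2 * math.pi * 1.0 * n / fs)
         + 0.5 * math.sin(2 * math.pi * 6.0 * n / fs) for n in range(100)]
profile = tremor_profile(slow + shaky, fs)
```

The resulting `profile` holds one number per window, so it can be read as a time series of tremor amount without knowing which task was performed in each window.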
Similar to the other constructs discussed above, issues such as the nature of the available sensor data, the use of appropriate measures for movement qualities, and the applicability of these measures to different types of sensor data need to be considered. For instance, recent work on estimating movement smoothness with IMUs has shown that the SPARC measure cannot be used with acceleration data, even though the SPARC has been shown to be a good measure of movement smoothness when applied to movement velocity.
Given these difficulties, it is not surprising that there is currently little work on assessing movement quality of upper-limb functioning in daily life using sensors (Bulling et al., 2014). We note that it might also be of interest to evaluate the quality of postures (e.g., the amount of scapular elevation used by a subject to hold an object against gravity), in which case movement quality can be generalized to mean movement or posture quality. Furthermore, common compensatory strategies, such as the use of the trunk to compensate for shoulder and elbow deficits (Levin et al., 2002, 2009), would also fall within the purview of movement quality. Such compensatory strategies can be seen as some form of task-specific abnormal coordination between the different joints of the trunk and upper-limb.

VISUALIZATION AND QUANTIFICATION OF UPPER-LIMB FUNCTIONING
Measuring upper-limb movements during daily life can result in vast amounts of data, which need to be summarized through appropriate quantitative and graphical means. A well-designed graphical summary can provide quick and clear insights into data, and allow users to answer specific questions about upper-limb behavior. In this section, we present three graphical approaches for summarizing answers to Q1 and Q2 discussed in the introduction: (a) temporal profile of upper-limb functioning for depicting the variation of upper-limb use over time; (b) summary of upper-limb activity; and (c) relative use of the two upper-limbs. Each graphical approach mentioned above is explained using data obtained from a previous study by David et al. (2020). The measurements were obtained from IMUs donned on each wrist, i.e., $M_i(t) = \left[a_i(t)^\top \; \omega_i(t)^\top\right]^\top \in \mathbb{R}^6 = \mathcal{M}$, and consist of the linear acceleration a i (t) and angular velocity ω i (t) measured by the triaxial accelerometer and gyroscope, respectively, at time t from the upper-limb i. Upper-limb use was estimated using the GM score algorithm (Leuenberger et al., 2017), and instantaneous intensity of upper-limb use was chosen to be the activity counts (Brønd et al., 2017) derived from the accelerometer data. Average upper-limb use and intensity were computed using D = 60 s. Note that the visualizations discussed below are not restricted to one particular sensing modality but can be appropriately adapted for different measurements as discussed in section 5.1.
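The averaging over a window of length D can be sketched as below. The exact definitions of U i and I i are given by the equations earlier in the paper and are not reproduced in this section, so this sketch makes two labeled assumptions: U i is the mean of the binary use signal over the trailing D seconds, and I i is the mean of µ i over the samples in that window where the limb was in use (and 0 if there were none). The function name and the sample-based windowing are also hypothetical.

```python
def trailing_averages(use, intensity, fs, window_s=60.0):
    """Return (avg_use, avg_intensity) lists, one value per sample.

    Assumed definitions (see the lead-in): avg_use is the fraction of
    samples with use == 1 in the trailing window; avg_intensity is the
    mean intensity over those in-use samples, or 0.0 if there are none.
    Windows at the start of the recording are shorter than window_s.
    """
    win = max(1, int(round(window_s * fs)))
    avg_use, avg_intensity = [], []
    for k in range(len(use)):
        lo = max(0, k - win + 1)
        u_win = use[lo:k + 1]
        used = [mu for u, mu in zip(u_win, intensity[lo:k + 1]) if u == 1]
        avg_use.append(sum(u_win) / len(u_win))
        avg_intensity.append(sum(used) / len(used) if used else 0.0)
    return avg_use, avg_intensity


# Toy example at 1 Hz with a 4 s window instead of D = 60 s.
use = [0, 1, 1, 0, 0, 0, 0, 0]
intensity = [0, 10, 30, 0, 0, 0, 0, 0]
U, I = trailing_averages(use, intensity, fs=1.0, window_s=4.0)
A = [u * i for u, i in zip(U, I)]  # average activity A_i = U_i * I_i
```

Under these assumptions, I i drops to zero exactly when U i does, which matches the behavior described for the temporal plots.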

Temporal Profile of Upper-Limb Functioning
The plots of u i , µ i , U i , I i over the course of the measurement epoch allow the user to see variations in these constructs over the course of a day or days. Sample plots of u l , U l , µ l , I l for a healthy (left column) and an impaired subject (right column) over a period of 90 min are shown in Figure 3. The left upper-limb use u l is visualized as an event plot in Figures 3A,B, where the presence of a gray vertical line at time t means u l (t) = 1, else it is 0. The average upper-limb use or duration U l is displayed as a red trace in Figures 3A,B. Figures 3C,D display the corresponding µ l and I l for this period in gray and blue traces, respectively. We note that µ l (t) = 0 in these plots corresponds to either the upper-limb not being used or a functional posture, since both the GM score algorithm and the activity counts are insensitive to postures. Similarly, I l (t) = 0 when the upper-limb was not used or used in a posture in the last D seconds, i.e., U l (t) = 0.
I l (t) only provides a summary of the intensity of left upper-limb use in a temporal segment by computing the average intensity. A more detailed depiction of movement intensity can be provided by displaying the relative proportions of time the movement intensity is low, medium, or high in an observation window; the definitions of the three intensity levels are provided in the figure's caption for this particular case. The plots in Figure 3 can aid clinical evaluation. They indicate that the overall amount and intensity of use for the patient (right column) is lower than that of the healthy subject. The patient also has little or no high intensity movements compared to the healthy participant (Figures 3E,F).
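The proportions of low, medium, and high intensity within an observation window could be computed along the following lines. The thresholds used here (50 and 100) and the choice to normalize by the in-use samples only are assumptions for illustration; the actual level definitions are those in the caption of Figure 3.

```python
def intensity_proportions(intensity, use, low_max=50.0, high_min=100.0):
    """Fractions of in-use samples whose intensity is low, medium, or high.

    low:    intensity <  low_max
    medium: low_max <= intensity < high_min
    high:   intensity >= high_min
    The thresholds are hypothetical placeholders.
    """
    used = [mu for mu, u in zip(intensity, use) if u == 1]
    if not used:
        return {"low": 0.0, "medium": 0.0, "high": 0.0}
    n = len(used)
    low = sum(mu < low_max for mu in used) / n
    high = sum(mu >= high_min for mu in used) / n
    return {"low": low, "medium": 1.0 - low - high, "high": high}


window = intensity_proportions([10, 60, 120, 20, 0], [1, 1, 1, 1, 0])
```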

Visualization of Upper-Limb Activity
A visual summary of the amount of upper-limb use during a measurement epoch can be provided through a scatter plot of U i (t) vs. I i (t), ∀t, such that U i (t) ≠ 0. This plot will be referred to as the Use vs. Intensity (UI) plot, which provides a simple visual summary of the overall upper-limb activity. With no loss of generality, we have chosen I i and U i to be the x and y axes of the UI plot, respectively, which has the following properties:
• All points of this scatter plot belong to the set P = {(x, y) | x ≥ 0, 0 < y ≤ 1}. This is a strip of height 1 extending along the positive x axis.
• By definition, the x axis is not part of the plot since only data points where U i ≠ 0 are considered.
• Depending on the measurement signal and the choice of measure f µ , the set of all points {(0, y) | 0 < y ≤ 1} will correspond to upper-limb postures; this will not be true when I i (t) ≠ 0 for meaningful postures.
• Scatter points with large values for x and low values for y correspond to short duration, high intensity movements, e.g., swatting a fly, whereas points with values of y close to 1 and low values for x correspond to prolonged low intensity movements, e.g., writing, typing.
Although not shown in these figures, it would also be useful to indicate in such plots periods of time where there is no data available, i.e., periods where a wearable sensor has been removed and is not recording movement data from a subject.
Data from both upper-limbs can be visualized in a single plot by plotting them in the first and second quadrants as shown in Figure 4. Here, the right and left upper-limbs are depicted in the first and second quadrants, respectively; note that the data in the second quadrant are plotted by negating the value of I i . The light red colored lines in these plots correspond to constant average upper-limb activity lines, i.e., A i = U i · I i = c, where c is a constant. Figures 4b,c display the UI scatter plots for a healthy and a stroke participant, respectively, using data collected from a single day (6-8 h) of recording. For the healthy subject, most points are of short to medium duration (U i < 0.5) and low intensity (I i < 50) in Figure 4b, with some long duration, high intensity movements performed with both limbs. In comparison, most movements of the stroke participant were of relatively shorter duration (U i < 0.2), with low to medium intensity (I i < 100); high intensity movements (I i > 100) were rare. This observation is also evidenced by the reduced number of constant A i lines that cut through the scatter plot in Figure 4c compared to that of the healthy subject.
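A minimal sketch of how the coordinates for such a two-quadrant UI plot might be prepared is given below. It only builds the point lists (left limb with negated intensity, points with U i = 0 dropped) and samples a constant-activity curve U · I = c; the function names are hypothetical and the plotting itself is left to whichever library is preferred.

```python
def ui_points(avg_use, avg_intensity, left=False):
    """(x, y) = (I, U) pairs with U != 0; x is negated for the left limb."""
    sign = -1.0 if left else 1.0
    return [(sign * i, u) for u, i in zip(avg_use, avg_intensity) if u != 0]


def constant_activity_curve(c, intensities):
    """Points (I, U) satisfying U * I = c, kept only where 0 < U <= 1."""
    return [(i, c / i) for i in intensities if i > 0 and c / i <= 1.0]


right = ui_points([0.0, 0.5, 1.0], [0.0, 40.0, 10.0])
left = ui_points([0.0, 0.5, 1.0], [0.0, 40.0, 10.0], left=True)
curve = constant_activity_curve(10.0, [5.0, 10.0, 20.0, 40.0])
```

The curve omits intensities for which U would exceed 1, since such points lie outside the strip that contains the UI plot.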

Quantification of Overall Upper-Limb Activity
The distribution of points in a UI plot can be thought of as a sample obtained from a bivariate probability density function of U and I, p I,U (x, y). The univariate probability densities of U i , I i , and A i can be obtained from p I,U as follows,

$$p_I(x) = \int_0^1 p_{I,U}(x, y)\,dy, \qquad p_U(y) = \int_0^\infty p_{I,U}(x, y)\,dx, \qquad p_A(a) = \int_0^1 \frac{1}{y}\, p_{I,U}\!\left(\frac{a}{y}, y\right) dy \tag{9}$$

We define a quantitative measure of how much an upper-limb is used, H q , as the q th percentile of A, which can be computed from its probability density function p A ,

$$\int_0^{H_q} p_A(a)\,da = \frac{q}{100} \tag{10}$$

where the subscript q in H q indicates that the measure is computed using the q th percentile.
Properties of H q . We demonstrate through a set of simulated scenarios (Figure 5) that the measure H q agrees with our intuition. Consider the scenarios depicted in Figure 5, which shows five UI plots, in the top row, with different distributions of points. In each of these plots, points are assumed to be uniformly distributed in the gray regions shown; the light red colored curves are the A i = c lines, where c is a constant. The rows of plots below the UI plots show the univariate probability density functions p I , p U , and p A estimated from the data points sampled from the corresponding distributions p I,U shown in the UI plots in the top row, along with the corresponding q th percentile values of the sample data (q was set to 90).
FIGURE 4 | Use vs. Intensity (UI) plot to depict the overall amount of use of the upper-limbs. (a) This plot provides the details of a typical UI plot and highlights some critical elements to help interpretation. The x axis cannot be part of the plot, and light red colored curves are the constant upper-limb activity lines. If f µ is the magnitude of acceleration, as is the case in (b,c), then the y axis represents meaningful/functional postures where the intensity can be zero. (b) UI plot for a healthy participant using data collected from a single day. The 1st and 2nd quadrants of the scatter plot depict the right (blue) and left (red) upper-limbs, respectively. (c) UI plot for a stroke participant using data collected from a single day. It is clear that the stroke participant has a low level of activity compared to the healthy participant, which is also reflected in their corresponding H q scores.
The following observations can be made about the five scenarios depicted in Figure 5, which are reflected in the measure H q :
• Scenario-2 has higher activity (H q = 6.10) than scenario-1, as it contains movements of larger duration or higher intensity in addition to movements similar to scenario-1. This results in larger values for A i ∈ [0, 10] compared to scenario-1.
• Scenario-3 has higher activity (H q = 14.78) than scenario-2, as it has longer duration and higher intensity movements than scenario-2, resulting in an even larger range of values for A i ∈ [0, 18] than scenario-2.
• Scenario-4 has movements with longer duration and higher intensity than scenarios 2 and 3, resulting in a larger interval for the possible values of A i ∈ [0, 50] compared to scenarios 2 and 3. This results in a much higher level of activity, H q = 36.82.
• The difference in upper-limb activity between scenario-4 and scenario-5 is smaller than that between scenario-4 and scenario-3. Scenario-4 has more long duration and high intensity movements than scenario-3, but has more shorter duration and lower intensity movements than scenario-5. Scenario-5 only has longer duration and higher intensity movements.
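The kind of simulation described above can be sketched in a few lines: points are drawn uniformly from a rectangular region of the UI plane and H q is taken as the q th percentile of A = U · I over the sampled points. The rectangle bounds below are hypothetical and do not correspond to the regions in Figure 5, and the nearest-rank percentile is one of several possible percentile estimators.

```python
import math
import random


def percentile(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def overall_activity(avg_use, avg_intensity, q=90.0):
    """H_q: q-th percentile of A = U * I over points with U != 0."""
    activity = [u * i for u, i in zip(avg_use, avg_intensity) if u != 0]
    return percentile(activity, q)


def simulate_h_q(u_max, i_max, n=2000, q=90.0, seed=0):
    """H_q for points uniform on (0, u_max] x [0, i_max) (assumed region)."""
    rng = random.Random(seed)
    avg_use = [u_max * (1.0 - rng.random()) for _ in range(n)]
    avg_intensity = [i_max * rng.random() for _ in range(n)]
    return overall_activity(avg_use, avg_intensity, q)


h_small = simulate_h_q(u_max=0.2, i_max=20.0)   # short, low intensity movements
h_large = simulate_h_q(u_max=1.0, i_max=50.0)   # longer, more intense movements
```

As in the scenarios above, enlarging the region toward longer duration and higher intensity increases the resulting H q .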

Visualization of Relative Use of the Upper-Limbs
Visualizing the relative use of the upper-limbs has been explored through 2D scatter plots or heat-maps of different variables related to the use of the upper-limbs (Bailey et al., 2014; David et al., 2020). Relative upper-limb use can be visualized and quantified using measures of average upper-limb use (U r , U l ), average upper-limb intensity (I r , I l ), or average upper-limb activity (A r , A l ); here, we use average upper-limb intensity for demonstration purposes. We only consider data points where at least one of the two upper-limbs was used, i.e., I r (t) + I l (t) > 0; it is meaningless to talk about relative use when neither upper-limb is used. In general, relative use of the upper-limbs can be visualized by plotting two functions g (I r , I l ) and h (I r , I l ) of the subject's data along the x and y axes, respectively. These two functions g (·) and h (·) will determine the nature of the distribution of data points in this "gh" scatter plot and its fundamental properties. A qualitative understanding of these properties can be obtained from four families of curves, L 1 to L 4 , in the gh plot (Equation 11), where c ∈ R ≥0 . L 1 and L 2 are particularly useful in explaining the shape of the distribution of points in the different visualization plots, where the bounding curves of a scatter plot are generated from different L 1 and L 2 curves. We present the analysis of three visualization methods: the first is based on the work of Bailey et al. (2014), the second on the work of David et al. (2020), and the third is a rotated version of the second plot. Two additional visualization methods based on BMMR and LIRI are presented in Appendix A.

Bilateral-Magnitude vs. Magnitude-Ratio (BMMR) Plot
This method, proposed by Bailey et al. (2014, 2015) and Lang et al. (2017), used activity counts to plot a heatmap of the bilateral magnitude (BM) against the magnitude ratio (MR). Bailey et al. bounded the value of MR to be within ±7, which we ignore in this discussion. The mathematical definitions of L 1 to L 4 , and the plot of these curves for different values of c, are shown in Figure 6A. The following are some of the essential properties of the BMMR plot:
• The vertical line x = 0 corresponds to I l = I r , and divides the plot into two halves, x > 0 and x < 0, corresponding to right and left dominated halves, respectively.
• Pure unilateral use, I l = 0 or I r = 0, corresponds to x = ±∞, which was approximated to be x = ±7 by Bailey et al. (2014).
• Equal, unbiased use of the two upper-limbs results in a symmetric leaf-like distribution of points (blue curves in Figures 7a,b). The region enclosed by the closed blue curve in Figure 7a corresponds to 5 ≤ I l , I r ≤ 500. We note that the shape of the heatmaps for healthy subjects in Bailey et al. (2015) closely resembles this symmetric leaf shape.
• Biased use of the upper-limbs results in an asymmetric distribution of points, with more points located at a larger distance from the x axis on the side with increased use (red curve in Figures 7a,c). The region enclosed by the closed red curve in Figure 7a corresponds to 1 ≤ I l ≤ 50 and 1 ≤ I r ≤ 250.
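The properties listed above can be checked numerically with a small sketch. It assumes that the magnitude ratio is the natural logarithm of I r / I l (so that x > 0 is right dominated) and that the bilateral magnitude is the sum I r + I l ; both forms are assumptions chosen to be consistent with the properties above, and the function name and the ±7 clipping option are illustrative.

```python
import math


def bmmr_point(i_r, i_l, mr_limit=7.0):
    """Return (MR, BM) for one time instant, or None if neither limb is used.

    Assumed forms: MR = ln(i_r / i_l), clipped to +/- mr_limit;
    BM = i_r + i_l.
    """
    if i_r < 0 or i_l < 0:
        raise ValueError("intensities must be non-negative")
    if i_r + i_l == 0:
        return None  # relative use is undefined when neither limb is used
    if i_l == 0:
        mr = mr_limit       # pure right use: ratio tends to +infinity
    elif i_r == 0:
        mr = -mr_limit      # pure left use: ratio tends to -infinity
    else:
        mr = max(-mr_limit, min(mr_limit, math.log(i_r / i_l)))
    return mr, i_r + i_l


equal = bmmr_point(10.0, 10.0)
right_only = bmmr_point(5.0, 0.0)
left_only = bmmr_point(0.0, 5.0)
```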

Left Intensity vs. Right Intensity (LIRI) Plot
This simple approach was proposed by David et al. (2020), where the authors used the average upper-limb use instead of intensity. Here, we use the average upper-limb intensities I r and I l (Figure 6B),

$$x(t) = g(I_l, I_r) = I_r(t), \quad I_r(t) \geq 0; \qquad y(t) = h(I_l, I_r) = I_l(t), \quad I_l(t) \geq 0$$

The following are some of the essential properties of the LIRI plot:
• The line y = x corresponds to I l = I r and divides the first quadrant into an upper and a lower half about this diagonal, which correspond to relatively high left use (I l > I r ) and relatively high right use (I l < I r ), respectively.
• Pure right and left unilateral use correspond to points along the x and y axes, respectively.
• Equal, unbiased use of the two upper-limbs results in a square-shaped region of distribution of points (blue curve in Figures 7g,h); the square is symmetric about the y = x line.
• Biased use of the upper-limbs results in a rectangular distribution of points, with the longer side of the rectangle oriented along the axis corresponding to the upper-limb with increased use (red curve in Figures 7g,i).
FIGURE 6 | Analysis of (A) bilateral magnitude vs. magnitude ratio (BMMR) plot (Bailey et al., 2014), (B) left intensity vs. right intensity (LIRI) plot, and (C) intensity sum vs. intensity difference (ISID) plot, by investigating the nature of the family of four curves L 1 (blue), L 2 (red), L 3 (green), and L 4 (black) introduced in Equation (11). The solid and dashed lines indicate different values of c for the same curve.

Intensity Sum vs. Intensity Difference (ISID) Plot
This plot is derived by rotating the LIRI plot by 45° counterclockwise, which results in a plot of the sum vs. the difference between the average upper-limb intensities (Figure 6C).
Because I i is non-negative, the points y < |x| are not part of the plot, which is shown by the shaded region in Figure 6C. The following are some of the essential properties of the ISID plot:
• Like the BMMR plot, the line x = 0 corresponds to I l = I r .
• Pure right and left unilateral use correspond to points along the y = x and y = −x lines, respectively.
• The shape of the distribution of points is the same as in LIRI but is rotated by 45° counterclockwise (Figures 7h,i).
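The relationship between the LIRI and ISID coordinates can be illustrated with the sketch below. The ISID coordinates are computed here as a pure 45° counterclockwise rotation of the LIRI point (I r , I l ), which yields the difference and the sum scaled by 1/√2; whether the original plot uses this scaling or the unscaled sum and difference is not specified in this section, so the scaling is an assumption. The function names are hypothetical.

```python
import math


def liri_point(i_r, i_l):
    """LIRI coordinates: x is the right intensity, y is the left intensity."""
    return (i_r, i_l)


def isid_point(i_r, i_l):
    """LIRI point rotated by 45 degrees counterclockwise.

    x is proportional to the difference (i_r - i_l) and y to the sum
    (i_r + i_l); the 1/sqrt(2) factor comes from the rotation (assumed).
    """
    s = math.sqrt(0.5)
    return (s * (i_r - i_l), s * (i_r + i_l))


equal = isid_point(3.0, 3.0)        # lies on the line x = 0
right_only = isid_point(4.0, 0.0)   # lies on the line y = x
left_only = isid_point(0.0, 4.0)    # lies on the line y = -x
```

Because both intensities are non-negative, every point produced this way satisfies y ≥ |x|, which is the unshaded region of the plot.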

Quantification of Relative Upper-Limb Use
A quantitative measure of relative upper-limb use should allow us to distinguish between different levels of relative use of the upper-limbs through a single number. Such a measure should: (a) map the spectrum of pure unimanual behavior to pure bimanual behavior to a compact interval on the real line, and (b) report low values for unimanual and high values for bimanual behaviors.
We can conceive such a quantitative measure of relative upper-limb use through an approach similar to that of H q . Consider the joint probability density $p_{I_r, I_l}(r, l)$ of I l and I r . We can compute the marginal densities of I r and I l , and the probability density of I r · I l , from $p_{I_r, I_l}(r, l)$ using the approach in Equation (9). We define a measure of relative upper-limb use, R q , where $R_q : \mathbb{R}_{\geq 0}^{[0, T]} \times \mathbb{R}_{\geq 0}^{[0, T]} \to [0, 1]$ maps two time signals I r and I l to the set [0, 1]. The subscript q in R q indicates that the measure is computed using the qth percentiles, and q r , q l , and q rl are the qth percentiles of the probability density functions of I r , I l , and I r · I l , respectively. It should be noted that q r and q l will never be simultaneously zero as we only include data points where I l (t) + I r (t) > 0. The mapping of different movement behaviors to the interval [0, 1] by this measure is shown in Figure 8, where the LIRI plot was chosen for depicting different types of unimanual and bimanual movement behaviors. The distribution of points in these LIRI plots is indicated by the gray regions, where we assume the points are distributed with uniform density; plots with just a black line depict scenarios where the points are distributed uniformly along the line. The red diagonal line in each of these LIRI plots is the x = y line. The value of R q for each of these plots is shown in the respective plots, and their location in the interval [0, 1] on the real line is shown at the bottom of the figure (thick black line) with colored vertical lines. The R q measure has the following properties.
• Pure unimanual use. R q (I r , I l ) = 0 indicates pure unilateral use, such that I r (t) · I l (t) = 0, ∀t.
• Symmetric bimanual use. R q (I r , I l ) = 1 indicates pure symmetric bimanual use, such that I r (t) = I l (t), ∀t.
• Symmetry about the x = y line. R q is symmetric about the x = y line, i.e., R q (I r , I l ) = R q (I l , I r ). Two distributions of points that are mirror symmetric about the x = y line will have the same value for R q . Thus, low values for R q only indicate biased use and do not provide any information about the direction of the bias. This implies that R q (I r , m · I l ) = R q (I r , (1/m) · I l ) = m, 0 ≤ m ≤ 1.
• R q is independent of uniform scaling of I r and I l , i.e., R q (I r , I l ) = R q (c · I r , c · I l ), where c > 0 is the scaling factor.
The measure R q only tells us if one limb is used over the other, and is silent about which of the two limbs is used more. This information can be obtained from the sign of the difference between q r and q l , which is +1 when the right limb is used more than the left, and −1 when it is vice versa. R q along with the sign of q r − q l will provide information about the amount of bias in using the upper-limbs, along with the preferred limb.
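To make the listed properties concrete, the sketch below evaluates one candidate form of the measure, R q = q rl / max(q r , q l )², together with the sign of q r − q l . This particular expression is an assumption: it is a form that satisfies the properties listed above (zero for pure unilateral use, one for identical signals, symmetry, scale invariance, and the value m when one signal is m times the other), and it should not be read as the defining equation of the measure. The nearest-rank percentile and the function names are likewise illustrative choices.

```python
import math


def percentile(values, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def relative_use(i_r, i_l, q=90.0):
    """Return (r_q, side) for paired right/left average intensities.

    r_q uses the ASSUMED form q_rl / max(q_r, q_l) ** 2.
    side is +1 if q_r > q_l, -1 if q_r < q_l, and 0 if they are equal.
    """
    # Keep only instants where at least one limb is used.
    pairs = [(r, l) for r, l in zip(i_r, i_l) if r + l > 0]
    if not pairs:
        raise ValueError("no samples with I_r + I_l > 0")
    q_r = percentile([r for r, _ in pairs], q)
    q_l = percentile([l for _, l in pairs], q)
    q_rl = percentile([r * l for r, l in pairs], q)
    r_q = q_rl / max(q_r, q_l) ** 2
    side = (q_r > q_l) - (q_r < q_l)
    return r_q, side


base = [float(v) for v in range(1, 11)]
symmetric = relative_use(base, base)
right_biased = relative_use(base, [0.5 * v for v in base])
left_biased = relative_use(base, [2.0 * v for v in base])
unilateral = relative_use(base, [0.0] * len(base))
```

With this toy data, halving or doubling the left signal gives the same magnitude of bias and only the sign distinguishes which limb is preferred.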

DISCUSSION
The framework presented here is a step toward a rigorous foundation for the sensor-based assessment of upper-limb functioning by formalizing existing ideas/concepts. Lack of rigor is not an uncommon problem in movement science, which is reflected in the literature as ambiguous definitions of constructs, lack of clear specifications for measures, and absence of theoretical and experimental validation of measures proposed to quantify constructs of interest. Movement smoothness is a prime example of such a construct that was quantified using several measures with little or no knowledge about their properties (Balasubramanian et al., 2012, 2015). Given the increasing interest in assessment of upper-limb functioning using sensors, we strongly believe that the proposed framework can help guide future developments in this area.
In this section, we highlight some important issues with sensor-based assessment of upper-limb functioning, and point out the limitations of the current work, and avenues for future work.

On the Importance of Measurements and Measures
Measurements and measures form the basis of any assessment procedure. Measurements contain "raw" information about an underlying behavior, and measures map measurements to numbers that quantify and summarize constructs of interest. Thus, the choice of measurements and measures determines the quality of information obtained from an assessment.
In the assessment of upper-limb functioning, several practical issues play a major role in the choice of sensing modality, such as the compactness, power efficiency, aesthetics, ease of donning and doffing of the sensors, privacy, etc. These constraints on a chosen sensing modality can limit the nature and fidelity of information about upper-limb functioning. For example, accelerometers have been shown to perform a little better at detecting different tasks than gyroscopes alone (Bulling et al., 2014). Upper-limb use involving fine finger, wrist, and hand movements is unlikely to be captured by a single IMU worn on the forearm (Subash et al., 2020). Detecting tasks involving physical interactions with the environment will require some form of vision technology (Tsai et al., 2020), as these cannot be identified purely from body segment kinematics. Similarly, the movement qualities that can be quantified also depend on the sensing modality, e.g., smoothness cannot be computed from pure accelerometer data, except under special circumstances.
Since these are early days in the field of assessment of upper-limb functioning, it would be unwise to make recommendations for the sensing modalities required for accurate assessment of upper-limb functioning in daily life. However, one can confidently speculate that a compact, body worn sensing system [e.g., wrist band (Bailey et al., 2014; David et al., 2020; Lum et al., 2020), sensorized clothing (Lorussi et al., 2016), etc.] with more than one sensing modality (Maceira-Elvira et al., 2019) [e.g., inertial sensor, pressure sensor, physiological sensing, vision, radar-on-a-chip for hand movement tracking (Malešević et al., 2019), magnetic ring finger tracking (Friedman et al., 2014), etc.] will become the standard for assessing upper-limb functioning.

On Task-Agnostic and Task-Specific Analysis of Upper-Limb Functioning
Upper-limb use u i and instantaneous intensity of use µ i , their averages (U i , I i , A i ), and the functional workspace W i together provide a measure of how much an upper-limb is used during a measurement epoch. These constructs are independent of the nature of the task being performed by the subjects, as they only demarcate functional behaviors from non-functional ones. The work presented in this manuscript focused only on task-agnostic analysis, given that these have been of primary interest in the recent literature. This is necessary information which only sheds light on the overall incorporation of the upper-limbs in daily life, without divulging the details of how the upper-limbs are used. Although these task-agnostic constructs provide some information about motor impairments, a more fine-grained task-specific analysis is required for identifying limitations in activity and participation.
Task-specific analysis requires the segmentation of measurements based on features of specific tasks of interest. Such analysis can allow the estimation of various impairment, activity, and participation level parameters to help build a comprehensive profile of the subject's disability. The details of the tasks performed during the measurement epoch provide information about limitations at the activity (e.g., time taken and range of motion while performing a task) and participation levels (e.g., limitations in carrying out household and work-related activities). The fidelity of such an analysis will depend on the nature of the available measurements, and on algorithms that can accurately and robustly detect the task of interest. To our knowledge, there is currently no work on task-level analysis for assessing upper-limb movement functioning. This too is likely to change in the coming years with advances in human activity classification using sensors (Chen et al., 2020). Recent work by Schambra et al. on a taxonomy for upper-limb motion provides a useful framework for decomposing functional movements into different "functional primitives" (Schambra et al., 2019). They also demonstrated that most upper-limb functional activities carried out during therapy are captured by this taxonomy. One possible approach to leverage this work for task-specific analysis of upper-limb functioning is to develop algorithms to detect the five different functional primitives (reach, reposition, transport, stabilize, idle) defined in this taxonomy, and use these detected primitives to further identify higher level tasks/activities. This bottom-up approach to detecting tasks/activities would also help quantify the "functional" composition of day-to-day movements in terms of functional primitives. Such a decomposition might be relevant for therapy planning, allowing therapists to focus therapy on primitives that might be limiting the patient's daily activity and participation.
Ploderer et al. (2016) investigated the usefulness of different visualization methods for conveying information about upper-limb functioning in daily life. They found that temporal plots of the amount of upper-limb activity (similar to Figure 3) can be useful in understanding the use of the upper-limb over several hours to weeks. The clinicians also emphasized the importance of a visualization method in providing a quick overview of the upper-limb functioning (Ploderer et al., 2016) over those provided by Use vs. Intensity (UI) and the relative upper-limb use plots in Figures 4, 7. Plots of functional workspace in the form of range of motion plots of joint angles and/or heatmaps of hand position in an egocentric frame were also found to be useful, which were not explored in the current study; data from a single wrist-worn IMU does not allow the extraction of hand position information or the arm joint angles. This again highlights the importance of the sensing modality in determining the information that can be obtained about upper-limb functioning.

On the Visualization of Upper-Limb Functioning
The Use vs. Intensity (UI) plot provides information about how much the upper-limbs are used during the measurement epoch, taking into account both the duration U i and intensity I i of movements. The nature of the distribution of points in a UI plot depends on: (a) the nature of measurements M; (b) the functions f u and f µ used to quantify u i and µ i ; and (c) the window length D (Equations 4 and 5) used to compute U i and I i . The extent to which the choice of these parameters affects interpretation was not investigated in this paper and requires further investigation.
Three approaches for visualizing relative use of the upper-limbs were analyzed in this paper. To promote the development and standardization of an appropriate visualization method, we make the following recommendations:
• Avoiding complex transformations will make it easier to interpret graphs. The LIRI plot is simpler than the BMMR plot, as I l and I r are visualized without any non-linear transformations. BMMR, MPMR, and BIUNI plots require complex transformations that hinder intuitive interpretation of these plots.
• Symmetry about the x = 0 line might be easier to interpret. Plots where the x = 0 line corresponds to I l = I r divide the plot into two regions where the use of one upper-limb is higher than the other. These plots are easier to interpret. For instance, the ISID plot, which is a rotated version of LIRI, is probably easier to interpret than LIRI.
• Elucidating the properties of a visualization method. Understanding a new visualization method can be made easier by depicting plots of special cases. For instance, the family of four curves L 1 to L 4 (Equation 11) was used to demonstrate some properties of the visualization approaches for relative use of the upper-limb. Thus, we recommend that researchers make use of such an approach when developing new visualization methods.
The visualization and quantification of relative use of the upper-limb were demonstrated using (I r , I l ). Although the properties of the visualization and quantification using (U r , U l ) or (A r , A l ) are likely to be similar, there will be some differences. One must be cautious of these differences to ensure proper interpretation of the data. For example, unlike the plots with I i and A i , the LIRI plot with (U r , U l ) is restricted to the square 0 ≤ x, y ≤ 1.
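The description of the ISID plot as a rotated version of LIRI can be sketched as a plane rotation. This is an illustration under assumptions, not the paper's definition: the text states only that ISID is a rotation of LIRI with x = 0 corresponding to I l = I r . The 45 degree angle, the placement of I r on the LIRI x-axis and I l on the y-axis, and the function name `liri_to_isid` are choices made here.

```python
import math
from typing import Tuple


def liri_to_isid(i_r: float, i_l: float) -> Tuple[float, float]:
    """Rotate the LIRI point (x=i_r, y=i_l) by 45 degrees counter-clockwise.

    Assumption: this angle and axis orientation are illustrative only.
    After rotation, x is proportional to (i_r - i_l) and y to (i_r + i_l).
    """
    k = 1.0 / math.sqrt(2.0)  # cos(45 deg) = sin(45 deg)
    x = k * (i_r - i_l)
    y = k * (i_r + i_l)
    return x, y


# Equal use of both limbs lands on the x = 0 line.
x_eq, y_eq = liri_to_isid(0.5, 0.5)
# Right-only and left-only use land on opposite sides of x = 0.
x_right, _ = liri_to_isid(1.0, 0.0)
x_left, _ = liri_to_isid(0.0, 1.0)
```

Under these assumptions, the unit square occupied by (U r , U l ) maps to a diamond with |x| ≤ 1/√2 and 0 ≤ y ≤ √2, so the bounded range noted above carries over to the rotated plot in a different shape.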

On the Clinical Relevance of the Proposed Framework
Information about how an individual uses their upper limbs in everyday activities is arguably a fundamental criterion of interest to a clinician. Our proposed framework aims at removing the ambiguity of what is meant by upper-limb functioning by defining key components that are necessary to depict "how" individuals behave in everyday life in an objective manner. This information is conveyed by upper-limb use, intensity, and their averages, which relate to the overall duration and strenuousness of upper-limb use. The combination of these two constructs provides a good measure of how active an upper-limb is during daily life. Visualization of this information over the course of the day or across days was found to be useful by clinicians for monitoring upper-limb use in daily life (Ploderer et al., 2016).
Asymmetry of upper-limb use can be evaluated from the level of activity of the two limbs which, when compared to normative data, can provide a measure of the compensation employed by the patient through use of the less affected limb. The ability to measure this asymmetry can help identify its underlying cause, such as specific sensorimotor impairments, learned non-use (Taub et al., 2006), etc. The detailed characterization of upper-limb use by decomposing it into different tasks can provide ecologically relevant information about activity and participation. Furthermore, tracking the time of occurrence, duration, and frequency of the different tasks identified during daily life can help evaluate changes in the ability and confidence in using the upper-limb, whether due to therapeutic interventions or spontaneous recovery. Finally, the quality of movement performed while carrying out different tasks can provide additional clues about the sensorimotor control ability and its relationship with hand preference and behavior. The realization of the clinical utility of the different constructs in this framework and their incorporation in routine clinical use is at least a few years away, given the numerous technical and clinical hurdles that need to be overcome. To this end, we note some of the limitations of the current work and make suggestions for future research in the following subsection.

Limitations
This work is an initial attempt toward a framework for the systematic analysis and interpretation of upper-limb functioning using sensors. We hope that the ideas presented here form a base for future work in this area, and anticipate that these ideas will be further refined and improved in the coming years. To aid this process, we make explicit the limitations of the current work, which are as follows:
• The ideas presented in this work are theoretical in nature, and do not provide any specific algorithms or methods for quantifying the different constructs. Appropriate algorithms for realizing the measures f u , f µ , f W , and f τ are essential for the practical implementation of a good assessment procedure, and will be an active area of research in the coming years.
• The different components of the proposed framework were chosen based on the authors' experience and understanding of the current clinical needs, and the trends in the neurorehabilitation literature. However, the clinical utility of these ideas (concepts, measures, and visualization methods) needs further validation.
• The work targets the evaluation of upper-limb functioning in hemiparesis. Thus, not all of the ideas presented here would be relevant for other conditions, such as those involving tremors, chorea, dystonia, etc. Application of this framework to other conditions, e.g., Parkinson's disease or orthopaedics, might require new concepts or revised definitions.
• Assessments of upper-limb functioning using sensors usually result in large amounts of data. The analysis methods that have been employed in the current literature and proposed in the current paper typically extract only a portion of the information available in the measured data. Future work must focus on exploring data mining algorithms for identifying patterns of recurring behavior across time. Recent developments in computational ethology (von Ziegler et al., 2020) and automatic behavioral clustering (Berman et al., 2014) could be leveraged to identify such patterns. There is also currently little work on investigating patterns of upper-limb functioning within and across days, which might be useful in evaluating the participation of a patient in different day-to-day activities and their life roles.
• The work only addresses questions Q1 and Q2 presented in the introduction section, which deal with how much the upper-limbs are used in daily life and the bias in using one limb over the other. More detailed task-level analyses are likely to be of increasing interest in the future. Further, the work did not explore measures for constructs such as "ability," which may be of interest to a clinician; "ability" is likely to depend on the amount of activity, types of tasks, and movement quality.
To address the aforementioned limitations and to advance the state-of-the-art in sensor-based assessment of upper-limb functioning, we make the following suggestions for future research:
• Multi-modal sensing system. Compact wrist/forearm-worn inertial sensors have been the most popular solution for measuring upper-limb movements in recent times. Given the popularity of this form factor, exploring the use of additional sensing modalities such as radar-on-a-chip, EMG sensing, etc. for picking up hand movements might be useful. Other sensing modalities such as textile-based sensing (Lorussi et al., 2016), egocentric cameras (Tsai et al., 2020), and wrist-mounted cameras (Chen et al., 2018) might be able to provide more information about full body movements and object interactions.
• Robust, accurate methods for detecting upper-limb use and tasks. Recent work by Lum et al. (2020) and Subash et al. (2020) has carried out direct comparisons of existing methods for detecting upper-limb use. This line of work, along with the development, validation, and comparison of more sophisticated methods that leverage recent developments in machine learning, should be pursued to improve the accuracy and robustness of detecting upper-limb use and tasks of interest. One possible approach is to adapt ideas from the human activity recognition literature to the specific needs of assessing upper-limb functioning. For instance, the taxonomy proposed by Schambra et al. (2019) could be used for building an algorithm that exploits the hierarchical structure of complex activities proposed in this taxonomy. Low-level algorithms can be devised for detecting the occurrence and duration of functional primitives. Information about the timing, duration, and amplitudes of these functional primitives could be used to detect the occurrence of more complex activities. The hierarchical analysis of sensor data might also be beneficial in identifying specific movement difficulties during daily life.
• Open dataset of upper-limb behavior. The development and validation of the algorithms proposed in the previous points require data from the target population. The neurorehabilitation research community would benefit immensely from the availability of an annotated open dataset consisting of relevant movement behaviors of interest from healthy and patient populations with varying degrees of impairment. The development of such datasets and the sharing of data from various studies carried out in the community in this area can help drive the field forward in the coming years. The ImageNet (Deng et al., 2009) dataset played a crucial role in the recent success of object recognition models in computer vision using machine learning algorithms. Similar efforts are already being made in the human activity recognition community (Laput and Harrison, 2019).
• In-clinic and home-based clinical trials. The clinical usefulness of the framework needs to be evaluated through both in-clinic and home-based studies. In-clinic studies for tracking upper-limb functioning of in-patients undergoing therapy would be relatively easier to carry out, and the availability of some information about the day-to-day routine of patients would allow validation of the assessment of upper-limb functioning using sensors. These can be followed by home-based studies to evaluate the usefulness of the proposed framework for assessing upper-limb functioning in the natural setting.
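The hierarchical detection idea in the suggestion on robust methods above can be sketched in two stages. This is a toy illustration under stated assumptions, not an algorithm from the paper or from Schambra et al. (2019): it assumes some upstream classifier has already assigned a primitive label to each sensor sample, and the label names, the pattern, and the function names `segment_primitives` and `find_activity` are all hypothetical.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

# (label, start index, duration in samples)
Primitive = Tuple[str, int, int]


def segment_primitives(labels: Sequence[str]) -> List[Primitive]:
    """Low level: run-length encode per-sample labels into primitives."""
    segments: List[Primitive] = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, length))
        start += length
    return segments


def find_activity(
    segments: Sequence[Primitive],
    pattern: Sequence[str],
    min_duration: int = 1,
) -> List[int]:
    """High level: start indices where `pattern` occurs as consecutive
    primitives, each lasting at least `min_duration` samples."""
    hits: List[int] = []
    k = len(pattern)
    for i in range(len(segments) - k + 1):
        window = segments[i:i + k]
        if all(seg[0] == name and seg[2] >= min_duration
               for seg, name in zip(window, pattern)):
            hits.append(window[0][1])
    return hits


# Hypothetical per-sample labels, as if produced by a low-level classifier.
labels = ["idle", "idle", "reach", "reach", "transport",
          "transport", "transport", "idle", "reach", "transport"]
segments = segment_primitives(labels)
hits = find_activity(segments, ["reach", "transport"], min_duration=2)
```

The duration threshold is one simple way to use primitive timing when detecting a more complex activity; a practical method would likely also use amplitudes and tolerate labeling errors.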

CONCLUSION
The paper presented a framework for sensor-based assessment of upper-limb functioning, with a focus on hemiparesis. The proposed framework provided formal definitions of constructs in upper-limb functioning, methods for their visualization, and two generic measures for quantifying the amount and the bias in using the two upper-limbs. A demonstration of some of these components was provided through preliminary data obtained from a previous study. We also pointed out the limitations of the current work, which are likely to be addressed in the coming years. We firmly believe that the proposed framework can act as a scaffold for researchers in the field to build and test different ideas for the assessment of upper-limb functioning. These future explorations will help identify issues with the framework, while adding, revising, and even completely replacing elements of the framework that are not clinically and technically relevant. We hope this work is a useful step toward realizing an objective, accurate, and clinically relevant assessment tool to evaluate the true effect of neurorehabilitation in patients' daily life.

DATA AVAILABILITY STATEMENT
The data analyzed in this study was obtained from a previously published article. Requests to access these datasets should be directed to Sivakumar Balasubramanian, siva82kb@cmcvellore.ac.in.

AUTHOR CONTRIBUTIONS
AD and SB conceived the initial skeleton for the framework. The details of the framework were developed through discussions among AD, TS, SV, AM-C, and SB. AD and TS carried out the data collection and analysis presented in the paper. The initial manuscript was prepared by AD and SB. AD, TS, SV, AM-C, and SB reviewed, revised, and approved the final manuscript. All authors contributed to the article and approved the submitted version.

Figure caption (fragment): (e,f) Corresponding BIUNI plots for the same subjects. The closed black curves shown in the plots for the healthy and impaired participant correspond to the 2.5th and 97.5th percentiles for I l and I r .