Objective assessment of dysarthric disorders in patients with multiple sclerosis depending on sex, age, and type of text read

Purpose To assess dysarthric disorders in multiple sclerosis (MS) patients in comparison with healthy individuals and MS patients without dysarthria, depending on the patient's sex, age, and the type of text read, using an objective tool. Methods The study was carried out in a group of 72 persons: 24 with MS presenting dysarthria (study group), 24 healthy individuals (healthy control group), and 24 with MS without dysarthria (MS control group). Performance (reading) time was evaluated by means of an objective tool created for the purpose of the analysis. Results The study showed statistically significant differences in the performance time of poetry reading, prose reading, and a diction exercise between persons with MS presenting dysarthria (study group) and both control groups (p < 0.05). Reading the poem, reading the prose, and performing the diction exercise took more time in the study group than in both control groups, which did not differ significantly from each other. Similarly, the comparison between the groups in terms of sex and age showed disturbances in the above-mentioned parameter in the study group. No significant differences in the evaluated speech parameters were demonstrated depending on sex or age within the group of MS patients with dysarthria or within either control group (p > 0.05). Conclusion The objective tool created for the purpose of speech analysis is useful in detecting discrepancies in performance (reading) time between MS patients with dysarthria and both healthy individuals and MS patients without dysarthria, and it can be used in clinical practice for diagnostic purposes. However, further research is essential to complete its validation.


Introduction
Multiple sclerosis (MS) is an immunological, inflammatory disease demyelinating the central nervous system, which constitutes the most common nontraumatic cause of disabilities in young adults (1). Currently, a total of 2.8 million people are estimated to live with MS worldwide (35.9 per 100,000 population) (2). MS prevalence has increased in every world region since 2013 (2). The newest data indicate that MS incidence and prevalence in Poland are higher than previously reported and in 2019 amounted to 6.6 and 131.2/100,000 inhabitants, respectively (3).
The onset of MS usually occurs between the ages of 20 and 40 and is two to three times more common in women than men. Most commonly the disease is of relapsing-remitting type (85-90%), progressing over time. A small percentage of patients (10%) are diagnosed with primary progressive MS, which is characterized by progression from the onset of the disease (4).
In the relapsing-remitting form, each relapse may be associated with a different type of neurological deficit. Typical clinical symptoms include retrobulbar inflammation of the optic nerve, eye movement disorders, cerebellar ataxia, spastic paresis, and sensory disturbances (5). Damage to the pyramidal pathways causes paresis of the limbs, increased muscle tone, exaggerated deep reflexes, and the presence of pathological symptoms. These symptoms occur in 32-41% of patients in the initial phase of the disease and in the majority of MS patients (90%) in the chronic phase. Cerebellar ataxia stems from dysfunction in the cerebellum, resulting in uncoordinated movements, whereas sensory ataxia arises from the impairment of sensory input in regulating movement. The symptoms of the cerebellar syndrome include dysarthria, dysmetria, dysdiadochokinesis, intention tremor, dysrhythmia, and disturbances in motor coordination and balance. Unlike cerebellar ataxia, sensory ataxia (damage to the dorsal columns) is not accompanied by dysarthria, nystagmus, or postural abnormalities, but by impairment in deep sensation, attenuation or loss of deep reflexes, and worsening of the finger-nose test with eyes closed (proprioception deficit) (6). In the study group, all subjects were symptomatically characterized by features of atactic disorders and pyramidal signs. In 8.33% of the study group, additional disorders associated with cranial nerves were observed.
The combination of neurological symptoms can be extremely varied and variable. Speech and voice disorders are among the least accurately described clinical symptoms of MS, although their estimated prevalence reaches 40-50% (7)(8)(9). Demyelinating damage to the central nervous system may cause spasticity, weakness of the tongue muscles, and impaired motor coordination of the tongue, jaw, soft palate, vocal cords, and diaphragm (10). Communication impairment may result from difficulties in voice control and articulation of words due to the dysfunction of the speech-responsible muscles and insufficient subglottic pressure (11). The most commonly reported speech disorders include reduced speech speed, voice quality deterioration, hoarseness, volume and tone control disorders, imprecise articulation, impaired speech fluency, and swallowing problems (10,12,13). Speech disorders in patients with MS are associated with negative physical and psychosocial consequences, including communication problems, frustration, low self-esteem, and limited participation in daily activities (14). Studies show that dysarthria, i.e., a motor disorder of speech function arising as a result of sudden as well as chronic diseases and causing problems with effective verbal communication, is the most common speech disorder in MS, affecting up to 45% of patients (15). There are many types of dysarthria, including cerebellar, spastic, bulbar, and dystonic. In MS, when the disorder is present, it can mirror either a single type or a mixture of a few types, with spastic-ataxic being the most frequent. Spastic dysarthria is a combination of weakness and spasticity. It can manifest itself as slow speech with reduced range and force. Ataxic dysarthria is associated with damage to the cerebellar control circuit. Associated with disturbed coordination, it may occur at all speech levels (respiration, phonation, resonation, and articulation); however, it is most noticeable in terms of articulation and prosody (16). For our research we chose a group presenting mixed, spastic-ataxic dysarthria. The study of these disorders can (especially in the context of tracking the dynamics of changes) become an effective and accurate diagnostic tool, especially in chronic diseases whose complications include neuromuscular disorders (15).
So far, in Poland, the deficits have been commonly assessed on the basis of an interview collected from patients (17). The Speech Pathology Specific Questionnaire for people with MS has also been developed and validated (7). Based on the review of the world literature, it can be concluded that in many countries the following scales are most often used in phoniatric practice: Vocal Tract Discomfort, GRBAS listening scale, Voice Handicap Index (18)(19)(20)(21). All of the above-mentioned scales are subjective tools and are therefore burdened with a certain margin of error. In addition, they do not allow for an unequivocal distinction between physiological and pathological values; they are therefore of limited use in routine clinical examinations of the vocal organ and do not address the mechanisms inducing dysarthria, which is crucial in conducting therapy (22).
The observations became the motivator for undertaking the discussed research. Another reason was the fact that researchers currently report there are no MS-specific diagnostic instruments for dysarthria (15,23). Providing there is an objective tool for measuring speech deterioration and its sensitivity to disease activity is confirmed in the studies, dysarthria may, in the future, be used as a biomarker for the progression of the disease (15). Therefore, we decided to create an objective tool to assess dysarthric disorders in patients with MS.
The purpose of the study was to assess dysarthric disorders in MS patients in comparison with healthy individuals and MS patients without dysarthria depending on the patient's sex, age, and the type of text read using an objective tool.

Participants and setting
The study was carried out in the Clinical Neurology Ward with Stroke Unit at the Clinical Hospital No 2 in Rzeszow, Poland, as a part of broader research analyzing speech parameters of native speakers of Polish suffering from MS. It was conducted in a group of 72 persons, including 24 with multiple sclerosis (MS) presenting mixed spastic-ataxic dysarthria (study group), 24 healthy individuals (healthy control group), and 24 persons with MS without dysarthria (MS control group). The patients were treated for their MS in the same ward as part of the government-funded pharmacological treatment. The study group comprised 12 women and 12 men, with a mean age of 39.2 ± 12.30 years. Both control groups were age- and sex-matched to the study group. The characteristics of the three groups are shown in Table 1.
The study group included people diagnosed with MS, presenting with mixed, spastic-ataxic dysarthria, in remission, who gave their informed consent to participate in the study. Patients were excluded if they had cognitive deficits impairing the ability to understand and follow instructions (Mini-Mental State Examination score <24), visual impairment, speech disorders other than spastic-ataxic dysarthria, or any comorbidities that may affect the quality of speech; if they were in a period of relapse; or if they did not consent to participate in the study. The MS control group included people diagnosed with MS without dysarthria or any other speech disorders, matched in terms of age and sex. The healthy control group consisted of healthy people without any speech disorders, matched in terms of age and sex.
The study protocol was assessed and accepted by the Bioethical Committee at the University of Rzeszów (approval no. 3/01/2020). All the procedures were executed in full compliance with the principles set forth in the Declaration of Helsinki. All the study participants gave their informed consent in writing.

Procedures
This work analyzes performance time based on recorded samples of the study participants reading three texts: (1) a poem, (2) a text in prose, and (3) a one-line diction exercise. Every participant had time to read the text before the recording began. The recording started only when the examiner had made sure that everything in the text was understandable and that the subject was ready to read it out loud.
(1) A poetic text: 16 lines of 10 syllables each (a poem in verses of 2 lines each).
(2) A text in prose: simple and complex sentences containing all sounds of the Polish language, as well as phonetic phenomena indicating potential dysarthric disorders, for instance, consonantal clusters provoking phonetic mistakes. It comprised 4 sentences, 3 complex and 1 simple: 1st sentence, 24 words; 2nd sentence, 4 words; 3rd sentence, 21 words; 4th sentence, 23 words.
(3) The diction exercise: accurate (without phonetic mistakes) realization of 10 syllables in the fastest possible manner: PA TA KA PA TA KA PA TA KA PA.
All samples qualified for the study were then assessed by a specialist in terms of the severity of the disorder, considering the performance time (PT) of each text in seconds.

Outcome measures
For the purposes of this study, an objective speech analysis tool (an IT tool, concerning information technology, software with elements of machine learning) was created. The neural network model was implemented based on the TensorFlow library.

Layers of the model
The model consists of 19 layers with a total of 572,868 parameters (including 572,100 that undergo the learning process). Eleven layers, created in line with the Convolutional Neural Network (CNN) approach, comprise three blocks with the following layout: Convolution → Convolution → Data Normalization → Max Pooling. Because the sound samples vary in length, the network had to be able to remember its previous state while 'listening' to the recordings. In a classic neural network, the so-called feed-forward neural network, the input data passes through the hidden layers, which determine the output data. In a trained network, specific inputs always generate exactly the same outputs. In a recurrent neural network (RNN), an additional loopback is used to remember the state between successive calls. Input x may return different y values each time, depending on the currently stored state of the network h (hidden state). Classic RNN layers suffer from a gradient that vanishes exponentially over time during backpropagation, which causes older data to be forgotten quickly, with the newest data having the greatest impact on final results. The advantage of classic RNNs is that they learn faster and require fewer resources. Unfortunately, in this case, they did not give satisfactory effectiveness. For this reason, the LSTM (Long Short-Term Memory) architecture was used. If we treat the LSTM cell as a black box, it is similar to a classic recurrent cell, except that its state is divided into two vectors: h (hidden, storing short-term data) and c (cell, storing long-term data).
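The split of the LSTM state into h and c can be illustrated with a minimal sketch of a single-unit LSTM cell in plain Python. This is not the authors' TensorFlow model: the scalar formulation, the function names, and all weight values are assumptions chosen only for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, w):
    """One time step of a single-unit LSTM cell with scalar input x.

    h is the short-term (hidden) state, c is the long-term (cell) state.
    """
    f = sigmoid(w["wf_x"] * x + w["wf_h"] * h + w["bf"])    # forget gate
    i = sigmoid(w["wi_x"] * x + w["wi_h"] * h + w["bi"])    # input gate
    o = sigmoid(w["wo_x"] * x + w["wo_h"] * h + w["bo"])    # output gate
    g = math.tanh(w["wg_x"] * x + w["wg_h"] * h + w["bg"])  # candidate value
    c_new = f * c + i * g          # keep part of the old memory, add new content
    h_new = o * math.tanh(c_new)   # expose part of the memory as the output
    return h_new, c_new


# Hypothetical weights (assumption, not taken from the study).
WEIGHTS = {
    "wf_x": 0.5, "wf_h": 0.1, "bf": 1.0,
    "wi_x": 0.6, "wi_h": 0.2, "bi": 0.0,
    "wo_x": 0.4, "wo_h": 0.3, "bo": 0.0,
    "wg_x": 0.9, "wg_h": 0.5, "bg": 0.0,
}


def run_sequence(xs, w=WEIGHTS):
    """Feed a sequence of any length through the cell, carrying h and c along."""
    h, c = 0.0, 0.0
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


# The same input value produces a different output at each step,
# because the stored state (h, c) differs between calls.
outputs = run_sequence([1.0, 1.0, 1.0])
print(outputs)
```

Because the state is carried from step to step, the same loop handles recordings of different lengths without changing the cell.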

Activation function
In the model, the networks use the following activation functions (Figure 1):
- [...] was applied in all hidden layers of the convolutional neural network.
- Tanh is similar to the sigmoid function; however, it is centered at 0 and covers a wider range. It is quite flat for large values, so it can still cause slow network learning. It was used in hidden layers in the recurrent parts (RNN).
- Softmax returns probabilities of belonging to disjoint classes. The values are normalized (the sum of probabilities equals 1.0). It is a softer version of the argmax function, which outputs the index of the highest value. It was used in the output layer to determine the probability of receiving a particular evaluation.
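The properties of tanh and softmax described above can be checked with a short plain-Python sketch. The function names and the example scores are assumptions for illustration; they are not taken from the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function: output in (0, 1), equal to 0.5 at x = 0."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1.0."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the highest value (the 'hard' counterpart of softmax)."""
    return max(range(len(values)), key=lambda k: values[k])


# tanh is centered at 0 and spans (-1, 1); sigmoid spans (0, 1).
print(math.tanh(0.0), sigmoid(0.0))
# Both functions flatten for large inputs, which slows learning.
print(math.tanh(5.0), sigmoid(5.0))

# Hypothetical output-layer scores for three evaluation classes.
scores = [1.2, 0.3, 2.5]
probs = softmax(scores)
print(probs, argmax(probs))
```

Softmax keeps the ordering of the scores, so the most probable class is the one argmax would pick, while the remaining classes still receive non-zero probability.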

Activation thresholds
In each neuron, all input signals are multiplied by their individual weights and summed, and the result is compared to the activation threshold. Reaching the activation threshold activates the neurons in the consecutive network layers.
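As a minimal sketch of this rule, the following plain-Python function models a single threshold neuron. The function name, inputs, weights, and threshold are hypothetical values chosen for illustration, and the hard threshold is a simplification of the smooth activation functions used in trained networks.

```python
def neuron_fires(inputs, weights, threshold):
    """Return True when the weighted sum of the inputs reaches the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return weighted_sum >= threshold


# Hypothetical values: 1.0 * 0.4 + 0.5 * 0.6 = 0.7, which reaches 0.5.
print(neuron_fires([1.0, 0.5], [0.4, 0.6], threshold=0.5))
# With smaller weights the sum is 0.2, so the neuron stays inactive.
print(neuron_fires([1.0, 0.5], [0.1, 0.2], threshold=0.5))
```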

Network learning process
The neural network underwent a learning process based on a proper selection of weight factors, informed by the examination of breathing and phonation disorders. Therefore, it was essential to find the best method and algorithm to conduct the process effectively. Training data were prepared with the Short-Time Fourier Transform (STFT) in the form of binary samples. For learning, the network uses the backpropagation algorithm, which has proven effective in the supervised training of multilayer feed-forward neural networks. The name of the algorithm originates from the order in which the error signals (δ) are computed, which runs in the opposite direction to the way the signals travel within the network, that is, from the output layer through the hidden layers back to the input layer. The algorithm made it possible to compute gradients, which indicate the direction in which the error is minimized. These values are passed to a first-order optimizer, which corrects the weights in the model based on gradient analysis. The Adam optimization method was applied as the first-order optimizer. The algorithm was first described by Kingma and Ba (24).
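How a first-order optimizer such as Adam turns gradients into weight corrections can be sketched in plain Python. The example below is an assumption-laden toy: it trains a single linear neuron on one made-up sample, the function names are hypothetical, and the hyperparameters are commonly used defaults rather than the values used by the authors.

```python
import math


def adam_minimize(grad_fn, w0, steps=3000, lr=0.01,
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """Minimize a one-parameter function with the Adam update rule."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)                       # gradient from backpropagation
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)  # weight correction
    return w


def squared_error_gradient(w, x=2.0, target=6.0):
    """Gradient of (w * x - target)**2 with respect to w (chain rule).

    The sample (x = 2, target = 6) is made up; the error is zero at w = 3.
    """
    prediction = w * x
    return 2.0 * (prediction - target) * x


w_final = adam_minimize(squared_error_gradient, w0=0.0)
print(w_final)  # ends close to 3
```

In a multilayer network the same update is applied to every weight, with backpropagation supplying each weight's gradient.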

Data analyses
The analysis was performed in the R program, version 4.2.2. Quantitative variables (i.e., expressed in numbers) were analyzed by calculating the mean, standard deviation, median, and quartiles. ANOVA (followed by Fisher's LSD post-hoc test) was used to compare quantitative variables between the three groups. The relationship between two quantitative variables was assessed with Pearson's coefficient of correlation. The analysis adopted a significance level of 0.05. Thus, all p-values below 0.05 were interpreted as significant associations.
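The between-group comparison rests on the one-way ANOVA F statistic, which can be sketched as follows. The authors ran their analysis in R; this Python version is only an illustration, the function name is hypothetical, and the performance times below are invented values, not study data. The p-value and the Fisher's LSD post-hoc step are not reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented performance times in seconds (assumption, not study data).
study_group = [3.0, 3.2, 3.4]
healthy_controls = [1.5, 1.6, 1.7]
ms_controls = [1.6, 1.7, 1.8]

f_stat, df_b, df_w = one_way_anova_f(
    [study_group, healthy_controls, ms_controls])
print(f_stat, df_b, df_w)
```

A large F means the group means differ by more than the within-group scatter would explain; the p-value then follows from the F distribution with the two degrees of freedom.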

Comparison within the groups: sex and age
On analyzing the performance time (PT) of the three suggested texts, (1) a poem [s], (2) a text in prose [s], and (3) a diction exercise [s], between women and men separately in the study group and in both control groups, no statistically significant differences were found (p > 0.05). The PT parameter was then compared between the age groups 20-40 and 40-62 among women and men in the study group and both control groups, with no statistically significant differences found (p > 0.05). Similarly, in the corresponding age groups, i.e., 20-40 and 40-62, between women and men separately in the study group and in both control groups, no statistically significant differences were found (p > 0.05).

Comparison between the groups
Study group to control group
Statistically significant differences were found (p < 0.05) in the performance time (PT) of people from the study group and both control groups. Reading a poem, reading prose, and completing the diction exercise took more time in the dysarthria study group than in both control groups, which did not differ significantly (Table 2).

Study group to control group: sex and age
In the analysis of differences in PT between women from the study group and both control groups, and between men from the study group and both control groups, statistically significant differences were found in all instances (p < 0.05). Reading a poem, reading prose, and completing the diction exercise took more time in the dysarthria study group than in both control groups, which did not differ significantly (Table 3). Similarly, in the analysis of performance time between women in the 1st age group of the study group and women in the 1st age group of both control groups, statistically significant differences were found in all analyzed speech parameters (p < 0.05). Poem reading, prose reading, and diction exercise took longer in the study group than in both control groups, which did not differ significantly. Statistically significant differences were found between women in the 2nd age group of the study group and women in the 2nd age group of both control groups (p < 0.05) in terms of reading prose and performing diction exercises, which took more time in the study group (Table 4).
Figure 1. The activation function in the network model.
Frontiers in Neurology 05 frontiersin.org
In the analysis of differences in performance time between men in the 1st age group of the study group and men in the 1st age group of both control groups, statistically significant differences were found in all analyzed speech parameters (p < 0.05). Poem reading, prose reading, and diction exercise took longer in the study group than in both control groups, which did not differ significantly. Statistically significant differences were found between men in the 2nd age group in the study group and men in the corresponding 2nd age group of both control groups (p < 0.05). Poem reading, prose reading, and a diction exercise took longer in the study group (Table 5).

Reference to the Expanded Disability Status Scale
No significant dependencies (p > 0.05) were found in the study group between the Expanded Disability Status Scale (EDSS) level and the performance time of reading a poem, reading prose, or completing the diction exercise (Table 6).

Discussion
A review of the literature shows that research evaluating the character of speech disorders in MS is limited (15,23). The available dysarthria scales are based on more or less subjective data, which are more difficult to compare (15,18-23). As noted in the review article by Noffs et al., objective speech assessment is more accurate, replicable, and feasible than perceptual analysis (25), which is used in the majority of studies on dysarthria in the course of neurological diseases. Therefore, our study focused on an objective assessment of speech features in dysarthric disorders in MS patients in comparison with healthy individuals and MS patients without dysarthria, depending on sex, age, and type of text read, using an objective tool. Hartelius et al. attempted a subjective assessment of speech difficulties in people with various types and degrees of dysarthria using a self-report questionnaire, Living with Neurologically Based Speech Difficulties (Living with Dysarthria) (26). As in our study, the authors showed that the degree of communication difficulties was not dependent on age and sex, and the dominant speech difficulties were associated with reduced speech speed and the need for repetition as a consequence of misunderstandings in communication with other people (26). Therefore, it can be assumed that sex and age do not differentiate dysarthric speech disorders, regardless of their cause.
We were unable to find publications against which our results could be discussed and verified with respect to differences in speech rate between women and men, analyzed separately in the study group and in both control groups (healthy controls and MS controls without dysarthria), when reading particular texts in Polish: a poem, a text in prose, and a diction exercise.
As far as the analysis of performance time (PT) depending on the type of text read is concerned, statistically significant differences (p < 0.05) were observed between the study group and both control groups. It is worth mentioning that all of the parameters were substantially higher in the study group than in both control groups, with no significant [...] syllables /pa ta ka/ and (puh puh kuh). The participants were asked to produce as many syllables as possible (minimum 7) per breath (8,27). Then, the average number of syllables produced per second was studied. In our opinion, making patients inhale a maximum of air changes the nature of speech production and increases the risk of biased results. We measured the time of producing 10 syllables without provoking the patients to make a disproportionate effort to complete the task, which, in our view, is more reliable and easier for a therapist to interpret in the context of the diagnosis of dysarthria as well as the potential evaluation of the dynamics of its progression. Moreover, when contrasted with speech rate, determining performance time, that is, the total time needed to read each text, is in this case a less demanding task with an easier analysis process. Speech rate is expressed as a count of words per minute or syllables per second, whereas performance time is expressed in seconds. We tested whether the prolongation of the diction exercise in the study group relative to the control group depended on which of the two measures was used. We did not observe any significant differences; the proportions were nearly identical. In the diction exercise, the average SR (speech rate) in the study group amounted to 3.15 syllables/s and the average PT (performance time) was 3.17 s. In the control group of healthy individuals, the average SR was 6.18 syllables/s and the average PT was 1.62 s. Relative to the healthy controls, execution of the exercise by the dysarthric patients in the study group was 96.2% longer when calculated from the speech rate and 95.7% longer when calculated from the performance time. We believe that speech speed expressed as PT is a fundamental parameter differentiating correct speech from dysarthric speech. The results of the aforementioned studies cannot be compared to ours due to the phonetic variety of the languages (English/Czech) and the unit used: we concentrate on the performance time of producing the 10 syllables of the diction exercise and the total reading time of the suggested text. The cited studies (8,27) analyzed speech rate and articulation rate measured as words/min and syllables/s. We found no studies that assessed differences in the reading of a poetic text.
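The relationship between the two measures can be verified with a short calculation. The sketch below uses only the averages reported above for the 10-syllable diction exercise; the function and variable names are assumptions introduced for illustration.

```python
SYLLABLES = 10  # length of the diction exercise (PA TA KA ... PA)


def speech_rate(performance_time_s):
    """Speech rate in syllables per second for the 10-syllable exercise."""
    return SYLLABLES / performance_time_s


def percent_longer(slower, faster):
    """How much longer (in %) the slower value is than the faster one."""
    return (slower / faster - 1.0) * 100.0


# Averages reported in the text.
pt_study, pt_healthy = 3.17, 1.62   # performance time [s]
sr_study, sr_healthy = 3.15, 6.18   # speech rate [syllables/s]

# Speech rate and performance time carry the same information here.
print(round(speech_rate(pt_study), 2), round(speech_rate(pt_healthy), 2))

# Prolongation in the study group, computed from each measure.
by_pt = percent_longer(pt_study, pt_healthy)
by_sr = percent_longer(sr_healthy, sr_study)  # rates are inverted times
print(round(by_pt, 1), round(by_sr, 1))
```

Since speech rate for a fixed number of syllables is the reciprocal of performance time, the two percentages differ only through rounding of the reported averages.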
A few studies comparing speech differences were concerned with other groups of patients, e.g., adults with cerebral palsy compared to healthy people (28). In their research, Liu and Chen tested whether consonant landmarks could be used as predictors of dysarthric speech in adult patients with cerebral palsy, as well as whether there was a link between these landmarks and the exacerbation of the speaking disorder. The researchers contrasted the speech of seven adults with cerebral palsy suffering from dysarthria with the speech of seven healthy persons from the control group, matched in terms of sex and age to the subjects from the study group (28). Similarly to our research, significant differences were observed between the subjects from the study group and the control group. Moreover, all landmark features were noted in the case of patients from the study group (28). On the other hand, Alhinti et al. assessed acoustic differences in the emotional speech of four patients with dysarthria caused by cerebral palsy (1 person) or by Parkinson's disease (3 persons) in comparison with 21 healthy individuals. The authors analyzed the speech rate (determined by the number of syllables spoken per time unit, calculated using a Praat script) and HNR (harmonic-to-noise ratio) of the dysarthric patients and contrasted them with healthy subjects. Furthermore, shimmer and jitter values were compared between female and male speakers (29). The study does not include an analysis of features such as performance time, which is the subject of our analysis. Sechidis et al., as in our own research, conducted an objective assessment of speech through a machine learning modeling approach, however, in patients with Parkinson's disease. The researchers used the Mixture-of-Experts (MoE) architecture to recognize speech-related emotions (30). The difference between our objective device and that of Sechidis et al. is that our device examines dysarthric disorders in the course of neurological diseases (MS in this case) based on the time taken to read individual texts, without touching on the emotionality of the statement. We also used longer texts for the study, which, in our opinion, makes the assessment of the disorder more objective.
To sum up, our findings pave the way to a better understanding of speech characteristics in the group of MS patients with dysarthria and also indicate directions for the therapeutic process of dysarthric speech disorders. The examined aspects seem to be important because the incidence of MS is increasing both nationally and globally. In addition, it should be emphasized that currently in Poland there are no objective tools for assessing speech disorders; introducing the first device of this type in our country adds a practical dimension to this study.

Limitations
The study presents some limitations. First of all, our group of surveyed people with MS was practically homogeneous in terms of level of education (83.33% had secondary education); therefore, we were unable to analyze speech according to the level of education, which may have an impact on speech. In our research, we also did not analyze the type of work the subjects performed. Therefore, further research on a bigger group of MS patients is necessary to divide them according to their level of education and occupation. Secondly, the study included the assessment of mixed spastic-ataxic dysarthria only in the course of multiple sclerosis in patients aged 20 to 62. Therefore, further research is necessary to consider dysarthric patients suffering from other illnesses, other types of dysarthria, and other age groups, i.e., children, teenagers, and the elderly. Thirdly, the study is of preliminary character, aiming at evaluating whether the created objective tool is useful in detecting discrepancies in speech parameters between persons with speech disorders and controls. It is essential to continue research in a larger group of patients in order to validate the created device, compare it with a test already in use, and assess its reliability and sensitivity. Additional studies should also include evaluating the effectiveness of therapeutic programs in terms of improving speech parameters.

Conclusion
The study showed statistically significant differences in speaking speed in all analyzed speech samples, i.e., reading a poem, reading prose, and performing a diction exercise, between people from the study group with MS with dysarthria and both control groups: healthy controls and MS controls without dysarthria. Reading a poem, reading prose, and completing the diction exercise took more time in the dysarthria study group. Similarly, the comparison between the groups in terms of sex and age showed disturbances in the analyzed samples in the study group. However, there were no significant differences in terms of sex and age within the group of people with MS with dysarthria or within either control group. The developed objective tool for speech analysis is useful in detecting differences in speech parameters, such as speed, between people with MS with dysarthria and both healthy people and people with MS without dysarthria. It can serve diagnostic purposes in clinical practice and improve the understanding of speech characteristics of MS patients with dysarthria. However, further research is needed to validate the created device.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by the Bioethical Committee at the University of Rzeszów (approval no. 3/01/2020). The patients/participants provided their written informed consent to participate in this study.

Author contributions
WW: conceptualization and methodology. MP, WW, and AG: investigation, formal analysis, and writing-original draft preparation. HB-P, AG, and MD: data curation and writing-review and editing. WW and MP: project administration. All authors contributed to the article and approved the submitted version.

Funding
The IT tool was created in the course of the project run by BD CENTER Ltd., 'Innovative IT tool aiding diagnostic process, prognosis and tracking of change dynamics in neurological patients - development work and its implementation' (No. RPPK.01.02.00-18-0004/20), co-financed by the European Regional Development Fund and carried out within the Regional Operational Programme of Podkarpackie Voivodeship 2014-2020, Priority Axis 1: Competitive and Innovative Economy.

Conflict of interest
WW and MP were employed by BD Center Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.