Neural Network Based Mental Depression Identification and Sentiments Classification Technique From Speech Signals: A COVID-19 Focused Pandemic Study

COVID-19 (SARS-CoV-2) was declared a global pandemic by the World Health Organization (WHO) in March 2020. This led to previously unforeseen measures aimed at curbing its spread, such as the lockdown of cities and districts and restrictions on international travel. Various researchers and institutions have focused on multidimensional opportunities and solutions for countering the COVID-19 pandemic. This study focuses on the mental health and sentiment effects of the global lockdowns across countries, which have resulted in mental health decline among individuals. This paper discusses a technique for identifying the mental state of an individual by sentiment analysis of feelings such as anxiety, depression, and loneliness caused by isolation and the interruption of normal daily routines. The research uses a Neural Network (NN) to extract patterns and validate threshold-trained datasets for decision making. The technique was applied to 2,173 global speech samples and achieved 93.5% accuracy in classifying the behavioral patterns of patients suffering from COVID-19 and pandemic-influenced depression.


INTRODUCTION
The world is at present facing an uncertain time due to the global pandemic caused by Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), also known as COVID-19. The pandemic has forced nations to impose lockdowns as a preventive measure to slow the spread of the virus. It has resulted in economic failure and disruptions in supply chains all over the world, and triggered a race for a vaccine among modern drug and research organizations. Beyond respiratory disorders and related symptoms, the pandemic has caused major adverse effects such as mental depression, isolation, anxiety, and loneliness. Depression and other mental health issues have been driven by lockdowns and restrictions on travel and work, with a new normal social life now conducted via technological platforms.
The pandemic has brought adverse implications for psychosocial behavior and mental health, such as depression, anxiety, and loneliness. In this research, a systematic evaluation was conducted on the behavior of users based on recorded speech signals, using a machine learning technique to extract keywords and classify data with sentiment analysis techniques (1). The research in this article also focuses on identifying the user's mental state via speech signals recorded over technological platforms used for virtual meetings and other gatherings (2). The research aims to provide schematic evaluation and validation approaches and to classify patients based on medical conditions.

LITERATURE SURVEY
The global pandemic situation under COVID-19 has left traces of various adverse effects on people, resulting from isolation, lockdown, mental health destabilization, and much more. One line of work has examined the younger generation, focusing on children's behavior and reactions to the new normal; that study shows the overall implications of isolation on children and adolescents. Pfefferbaum and North (5) have discussed the impact and relationship of mental stress caused by the global pandemic situation, with detailed insights into public health emergencies and the influence of the pandemic on looming health conditions. Furthermore, a discussion of the challenges faced by health care workers (HCWs) and their state of mental stress is documented by Spoorthy et al. (6). The HCWs are frontline workers and hence require assistance in evaluating and validating their mental health via the main mode of communication now used, i.e., speech signals through digital media platforms and applications; a similar discussion is highlighted in other studies (7, 8).
Some studies have focused on technological solutions for the mental distress caused by the pandemic. These solutions have outlined the use of a telemedicine approach for reaching the maximum and remote population of a developing country like India. A study by Ahmed et al. (9) discussed Multidimensional Optimal Medical (MooM) dataset processing over a telemedicine channel. These MooM datasets include a signal processing unit for a standardized approach and can be used for the intended processing in the proposed study, with a supporting algorithm from (10). The method of detecting and validating speech signals proposed in this article is likewise based on the influence of telemedicine approaches, with numerical clustering validation by (11) and (12).
The latest findings in the survey are recorded with real-time datasets, as discussed in (4). This approach aims to validate treatment and handling, focusing on pandemic control and coordination. The prediction and modeling of the pandemic are discussed by Iwendi et al. (13) and Ngabod et al. (14), who propose a technique for classifying pandemic growth in smart cities. During the processing stage, the dedicated networking model discussed by Ahmed et al. under a dynamic user cluster grouping approach (15)(16)(17) can be utilized. These developments have provided a reliable solution for handling pandemic data using text mining and decision support. Classifications of COVID-19 studies and surveys are reported and validated by (18, 19).

METHODOLOGY
The proposed methodology focuses on the detection and validation of speech signals for depression and mental disorder identification based on speech signal processing using a neural network (NN). The process is defined using mass datasets from 2,173 speech samples, as discussed in the architecture model in Figure 1. The aim of the proposed technique is to establish a correlation with trained datasets for extracting and evaluating speech samples and classifying them on demand. These speech signals are interdependent and have a higher order of distinction in recovering and validating samples of COVID-19 patients' mental stability and sentiments (20).
The processing datasets are computed in a centralized database with user-to-user interface coordination, thereby generating a pool of databases consisting of raw and unprocessed data from the users. The process is initiated with data alignment and pre-processing techniques, as discussed in the mathematical modeling of the proposed technique. The process is designed with a trained database of the speech signals with a heap address of thresholds relating to global attributes such as country, location, gender, age, and professional practice.
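As an illustration of this pre-processing stage, the following sketch normalizes a raw speech sample and looks up a trained threshold keyed by global attributes such as country, gender, and age band. The attribute keys, threshold values, and function names are hypothetical placeholders, not the study's actual implementation.

```python
import numpy as np

TRAINED_THRESHOLDS = {
    # (country, gender, age_band) -> amplitude threshold (illustrative values)
    ("IN", "F", "18-30"): 0.42,
    ("US", "M", "31-50"): 0.38,
}
DEFAULT_THRESHOLD = 0.40

def preprocess(signal, country, gender, age_band):
    """Normalize a raw speech signal and fetch its trained threshold."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()          # remove DC offset
    peak = np.max(np.abs(signal))
    if peak > 0:
        signal = signal / peak               # peak-normalize to [-1, 1]
    threshold = TRAINED_THRESHOLDS.get((country, gender, age_band),
                                       DEFAULT_THRESHOLD)
    return signal, threshold

sig, thr = preprocess([0.1, 0.5, -0.3, 0.2], "IN", "F", "18-30")
```

In this sketch the centralized database is reduced to an in-memory dictionary; in a deployed system the lookup would query the pooled database described above.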
The trained datasets provide the threshold process for the extracted attributes of the user input signals. The process is designed with a comparative validation model to assure the process execution, as demonstrated in

MATHEMATICAL REPRESENTATION
The computation of the speech signals and detection of mental stress is achieved under the processed instruction architecture, as demonstrated in Figure 1. The process aims to validate the signals into coordination datasets with a synchronization approach of proving learning and pooling clusters of similar patterns, as demonstrated in Figure 3. The mathematical approach is discussed in this section.

Attribute Extraction and Dependencies Validation
Consider a dataset (D) with a raw calibrated ecosystem of attributes (A), where A = {A_1, A_2, A_3, ..., A_n}, such that each attribute (A_i) resembles the paradigm of operation, as in Equation (1).
Here, each i-th attribute has a correlated paradigm of operation and process extraction. Thus, the extracted attributes (A_e) are as shown in Equation (2), where the extracted attribute set (A_e) is processed over the raw attribute set to extract the most relevant threshold attributes, such as the peak frequency of a word or a repeated phrase of a sentence, with a dilution of D_z and a mapping with T as a threshold paradigm for validating all processed attributes (A_e) in the speech signal.
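The two threshold attributes named above, peak frequency and a repeated phrase, can be sketched as follows. The transcript, keyword handling, and threshold values are illustrative assumptions, not the paper's trained parameters.

```python
import numpy as np
from collections import Counter

def extract_attributes(signal, sample_rate, transcript,
                       freq_threshold_hz=100.0, repeat_threshold=2):
    """Return the extracted attribute set A_e that passes each threshold."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    peak_freq = freqs[np.argmax(spectrum[1:]) + 1]   # skip the DC bin

    # Most repeated word in the transcript (illustrative phrase attribute)
    word, count = Counter(transcript.lower().split()).most_common(1)[0]

    attributes = {}
    if peak_freq >= freq_threshold_hz:
        attributes["peak_frequency_hz"] = float(peak_freq)
    if count >= repeat_threshold:
        attributes["repeated_word"] = (word, count)
    return attributes

t = np.linspace(0, 1, 8000, endpoint=False)
sig = np.sin(2 * np.pi * 440 * t)                    # 440 Hz test tone
attrs = extract_attributes(sig, 8000, "alone so alone again")
```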

Segmentation of Samples
Samples are primarily divided into extracted attribute (A_e) sets, such that each attribute's Region of Interest (ROI) is highlighted and marked in the entire speech signal, as shown in Figure 3.
Consider the segmentation (S) of the overall input signal (the speech signal) with a highlighted extracted attribute (A_e). Each attribute in the signal has an occupancy time (Δt) in operation, and thus a reflective ratio of division is processed based on the CNN's evaluation paradigms.
The signal (S) of an independent sample (S_i) tends to occur as an ROI at an independent location of the time matrix (Δt). Hence, the segmentation of the signal (S) is as shown in Equation (4), where each signal strength is measured as ΔR over a signal time Δt for all regional attribute extraction. For segmentation to be processed completely, the schematics of each attribute's signal strength (ΔR) are computed against an exhausted peak of the ROI from the signal, as shown in Figure 3.
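A minimal sketch of this ROI segmentation step: frames whose short-time energy exceeds a fraction of the peak energy are marked as regions of interest. The frame length and peak fraction are assumed values, not parameters from the study.

```python
import numpy as np

def segment_roi(signal, frame_len=4, peak_fraction=0.5):
    """Return (start, end) frame-index pairs whose energy crosses the ROI bar."""
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).sum(axis=1)       # short-time energy per frame
    bar = peak_fraction * energy.max()       # fraction of the exhausted peak
    active = energy >= bar

    segments, start = [], None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, n_frames))
    return segments

quiet, loud = [0.01] * 4, [0.9] * 4
segs = segment_roi(quiet + loud + quiet + loud)
```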
The pattern process for speech signals is internally correlated to the amplitude of the signal (amp), represented as f_amp = {f_amp1, f_amp2, f_amp3, ..., f_ampn}. The amplitude of each frequency feature can be extracted as shown in Equation (5).
Here, each signal pattern (S_n) represents the overall coordination in the speech signals, and 'α' represents the band filters of the speech signal with a coefficient of amplitude and frequency. On extraction of the patterns correlated in Equation (5), the frequency patterns can be sorted by independent bandwidth, as shown in Equation (6).
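The amplitude extraction and bandwidth sorting described around Equations (5) and (6) can be sketched as below: per-bin amplitudes are computed from the spectrum, summed into assumed frequency bands, and ranked. The band edges are illustrative, not the paper's filter design.

```python
import numpy as np

def band_amplitudes(signal, sample_rate, band_edges_hz=(0, 300, 1000, 4000)):
    """Return (band, total amplitude) pairs, sorted strongest band first."""
    signal = np.asarray(signal, dtype=float)
    amp = np.abs(np.fft.rfft(signal))                # f_amp for each bin
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    bands = {}
    for lo, hi in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        bands[(lo, hi)] = float(amp[mask].sum())
    # Sort the independent bandwidths by extracted amplitude
    return sorted(bands.items(), key=lambda kv: kv[1], reverse=True)

t = np.linspace(0, 1, 8000, endpoint=False)
sig = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)
ranked = band_amplitudes(sig, 8000)
```

With the stronger 200 Hz component, the low band ranks first and the band holding the 2000 Hz component second.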
The 'P_i' in Equation (7) is the pattern of repeated learning from the CNN framework. The internal arrangement can be represented as the frequency (f) under the operation of amplitude, represented as f_amp, and further graded by the Gaussian constant (G). The process in Equation (7) is then concluded as shown in Equation (8).
P_i = Σ_{k=0}^{n} α_ij · amp(f_k) · ∂t²    (9)

Thus, Equation (10) represents the coordinates of the pattern extracted and validated with respect to a segment (S_i). In summary, the representation can be written as P = {P_1, P_2, P_3, ..., P_n}, correlated to the coordination of segments S_p = {S_p1, S_p2, S_p3, ..., S_pn}, where 'n' is the last segment of the given input signal.

Clustering and Classification of Datasets
Equation (10) retrieves the pattern of individual segments, and thus the coefficients of such segments are summarized and represented in S_p = {S_p1, S_p2, S_p3, ..., S_pn}. The clustering is shown in Figure 4.
The cluster (C) is retained from a group of values and their corresponding coefficients for value re-compensation. The clusters are internally evaluated with a focus on association.
Each cluster (C_i) is validated with a corresponding pattern coefficient and a threshold value (ΔT). The internal threshold value (ΔT) is validated and evaluated. In summary, the clusters are C = {C_1, C_2, C_3, ..., C_n}. These clusters have an association of common patterns, represented for example as {(C_i ∩ C_j) ∩ C_k}, and these associations are subjected to attribute validation, as shown in Figure 4.
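The cluster validation against ΔT can be sketched as follows: each segment coefficient is assigned to its nearest centroid and kept only if the match distance stays within ΔT. The centroids and ΔT value are assumptions for illustration, not trained values from the study.

```python
import numpy as np

def cluster_segments(coefficients, centroids, delta_t=0.15):
    """Return cluster_index -> list of coefficients within ΔT of that centroid."""
    clusters = {i: [] for i in range(len(centroids))}
    for c in coefficients:
        distances = [abs(c - m) for m in centroids]
        best = int(np.argmin(distances))
        if distances[best] <= delta_t:       # ΔT threshold validation
            clusters[best].append(c)
    return clusters

# Hypothetical segment pattern coefficients S_p and cluster centroids:
s_p = [0.12, 0.14, 0.48, 0.52, 0.95]
clusters = cluster_segments(s_p, centroids=[0.1, 0.5, 0.9])
```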

Threshold Validation and Decision Making
The clusters and the classification of speech signals using them are validated and approved for processing into decision making. The decision-making approach is driven by a threshold value consultation, i.e., the overall technique extracts the validated pattern coefficient and synchronizes it with the most likely match under schema validation. The proposed approach validates the decision of signal segmentation using the threshold value, thereby segregating the dataset of speech signals based on emotions. These emotion-based evaluations are inherently computational, and hence the most likely decision is processed.
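A hedged sketch of this decision step: the validated pattern coefficient is compared against per-emotion threshold bands and the most likely (closest-matching) emotion is returned. The band boundaries and emotion set are illustrative assumptions.

```python
EMOTION_BANDS = {
    # emotion -> (low, high) coefficient band, illustrative values only
    "anxiety":    (0.00, 0.35),
    "depression": (0.35, 0.70),
    "loneliness": (0.70, 1.00),
}

def decide_emotion(pattern_coefficient):
    """Pick the emotion whose band contains (or is nearest to) the coefficient."""
    def distance(band):
        lo, hi = band
        if lo <= pattern_coefficient < hi:
            return 0.0                       # coefficient falls inside the band
        return min(abs(pattern_coefficient - lo), abs(pattern_coefficient - hi))
    return min(EMOTION_BANDS, key=lambda e: distance(EMOTION_BANDS[e]))

label = decide_emotion(0.62)
```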

RESULTS AND DISCUSSIONS
The proposed technique has successfully retrieved the signal attributes and the prediction ratio for evaluation. The input signals from the users via a remote connecting platform are uploaded to a centralized database in a cloud computing ecosystem using AWS-sponsored services. The datasets are processed and validated according to a multidimensional approach. The variation in predicting the sentiments is based on the information designed and developed via the clustered datasets. The prediction ratio is summarized in Figures 5 and 6, respectively, with a comparative evaluation against previous systems. Table 1 shows the parameters related to mental stress and the paradigms that provide decision support. The table highlights evaluation parameters such as the occurrence delay of a keyword in clustering, as shown in Equation (11). The supported approach thus classifies the pattern of these keyword occurrence sequences for decision making.
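The keyword occurrence delay named above can be sketched as the mean gap between successive timestamps of the same keyword in a sample; the timestamps below are hypothetical, and this is one plausible reading of the evaluation parameter rather than the paper's exact Equation (11).

```python
def occurrence_delay(timestamps):
    """Mean delay (seconds) between successive occurrences of a keyword."""
    if len(timestamps) < 2:
        return None                          # a single occurrence has no delay
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical timestamps (seconds) of the word "alone" in one speech sample:
delay = occurrence_delay([2.0, 5.0, 11.0])
```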
The results of data/signal processing and decision making are shown in Table 2. The results show promising outcomes, proving a precision of 90% and higher across users of various languages and locations. The results of processing a single sample are included in Table 3. The processed signal magnitude and the power spectrum computation demonstrate a higher order of signal clarity in analysis and validation.

CONCLUSION
The technique proposed in the present study uses a neural network to learn and develop a pool of clusters and patterns and thereby provide a systematic and reliable decision for categorizing speech signals. The processing system is based on open database processing to validate the mental health conditions of users during the isolation and lockdowns caused by the COVID-19 pandemic. The results show a promising outcome, with a precision of 90% and higher across various users and a projected accuracy of 93.5% under the open validation platform in a computational evaluation. The proposed technique could be extended in the future to classify and categorize patients' behavior, with supervised approaches to keyword extraction and classification in dynamic signals.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.