

Front. Mech. Eng., 13 June 2024
Sec. Mechatronics
Volume 10 - 2024

Improvement of action recognition based on ANN-BP algorithm for auto driving cars

Yong Tian* and Jun Tan
  • New Media College, Sichuan Film and Television University, Chengdu, China

Introduction: With the development of artificial intelligence and autonomous driving technology, the application of motion recognition in automotive autonomous driving is becoming more and more important. The traditional feature extraction method uses adaptive search hybrid learning and requires the feature extraction process to be designed manually, which makes it difficult to meet the recognition requirements of complex environments.

Methods: In this paper, a fusion algorithm is proposed that classifies driving characteristics through time-frequency analysis and performs backpropagation in an artificial neural network to improve the convergence speed of the algorithm. The performance analysis experiments were carried out on the Autov dataset, and the results were compared with those of three other algorithms.

Results: When the vehicle action coefficient is 227, the judgment accuracy of the four algorithms is 0.98, 0.94, 0.93 and 0.95, respectively, indicating that the fusion algorithm is stable. When the road sample is 547, the vehicle driving ability of the fusion algorithm is 4.7, which is the best performance among the four algorithms, indicating that the fusion algorithm has strong adaptability.

Discussion: The results show that the fusion algorithm has practical significance in improving the autonomous operation ability of autonomous vehicles, reducing the frequency of vehicle accidents during driving, and contributing to the development of production, life and society.

1 Introduction

Action recognition (AR) has a wide range of applications in auto driving (AD) vehicles, where it helps the vehicle make real-time decisions and thus improves driving safety. Data preprocessing is the key link in AR. Existing processing methods rely on dimensionality reduction, and as the data volume increases, the quality of the raw data is reduced. AR refers to the analysis of sensor data to recognize human actions. In AD cars, this method can be used to identify other vehicles on the road, pedestrians, and traffic signs, which helps the car choose an appropriate driving style. In the field of AR of autonomous vehicles, optimization algorithms have been the subject of extensive attention and research, with the objective of improving recognition accuracy and system performance (Abdar et al., 2021). The advent of deep learning technology has enabled the development of sophisticated tools for AR tasks (Claussmann et al., 2019). The construction of a multi-layer nonlinear network structure enables deep learning to automatically learn complex feature representations from the original data, thereby significantly enhancing the accuracy of recognition (Patole et al., 2017; Song et al., 2024). In the field of AR, many experts use artificial neural networks (ANN) to model the temporal and spatial relationships of driving video frames. The core of an ANN is its neurons, which transfer information to one another through weights. Traditional ANN carries the risk of data blockage. To solve this problem, this study applies back propagation (BP) to the ANN's driving data chain, which enhances obstacle avoidance and robustness in the AD process and reduces the frequency of data transmission barriers, yielding a fusion algorithm (ANN-BPAD). The most significant innovations of the research are as follows: 1. The traffic characteristics are classified by time-frequency analysis, which enhances the model's ability to analyze the characteristics of time series signals. 2. A mechanism for adaptive feature extraction is employed, enabling the process to be adjusted in real time according to traffic conditions. Parallel computing technology is utilized to optimize the weight updating process, resulting in a notable acceleration of the training speed of the model. Furthermore, the decision process is optimized and the calculation delay is reduced.

The study's primary content is broken down into four sections. The first section examines and summarizes the existing uses of ANN and BP. The second part presents the method that connects AD with ANN-BP and applies it to AD. The third part conducts simulation experiments on the Autov dataset. The last part analyzes and compares the performance of the algorithm with the traditional algorithms and presents the remaining deficiencies of the research. The practical significance of this research is to increase the autonomous operation ability of AD vehicles and thus reduce the frequency of vehicle accidents during driving. It also addresses the demand for algorithms that characterize timing signals, which in turn contributes to productive life and social development.

2 Literature review

The positive effect of ANN on AR in AD cars has been extensively studied internationally. More et al. (2023) developed an improved machine learning model for characterizing the AR capabilities of AD cars. The method was based on regularization with strong constraints and appropriately assigned exemplary correction levels. The training process was non-gradient-based, which they analyzed and discussed extensively. Their method outperformed most of the known models in experiments to predict the capabilities of different AD cars (More et al., 2023). Chu et al. (2023) confirmed that the defects of BP methods cannot be ignored in AD cars. To solve the local minima problem of the BP method, they proposed an integrated learning-based improvement method for AD cars. Remote sensing images and training samples were used as input datasets, and the base learner was used to generate multiple car inversion results. Finally, they performed tests to validate the method. The results indicated that the proposed method can solve the local minima problem and obtain high robustness (Chu et al., 2023). Ali et al. (2021) used the BP method to predict the safety of unmanned vehicles in a cement plant and compared their results with the manned driving situation. Based on this, they developed a model trained on the input features. With 10 neurons in the hidden layer, the mean square error for validation was 0.283. The experimental results suggest that this method can maintain the safety of unmanned driving in cement plants (Ali et al., 2021).

As the research progressively advances, the convergence rate of the algorithms tends to decrease, and attention has gradually shifted to the application of feature classification and time-frequency analysis in signal acquisition. Ludwig-Barbosa et al. (2023) carried out diffraction and multipath studies in order to project the signal into different cars. When the auxiliary plane coincided with the diffracted region, the interference to the BP was minimized. This method was applied to irregularities in the location, but there was no supplementary data to validate the estimate. Therefore, some test cases were designed to evaluate the method in multiple irregular regions. The experimental results indicated that the location estimation accuracy followed the resolution of the implementation of the method (Ludwig-Barbosa et al., 2023). Parikh et al. (2023) argued that temporal features in vehicle signals are affected by a variety of complex factors, and that identifying features that improve classification accuracy is a major problem in vehicle operation research. They compared transform domain feature extraction for different state classifications and evaluated its feature generation capability using a convolutional autoencoder. They tested the features generated by the model feature fusion on a dataset, and the experimental results proved its effectiveness (Parikh et al., 2023). Klonecki et al. (2023) argued that most vehicle driving methods do not take into account the cost information associated with features, and they therefore formulated a cost-constrained feature selection problem. This approach permitted the construction of models with high predictive power and no more than a specified range of ability to predict individual vehicles. They also considered the balance between the relevance of vehicle feature subsets and their actions, and concluded that it is not important in practice. Experiments on real datasets demonstrated the effectiveness of the approach (Klonecki et al., 2023).

In conclusion, although progress has been made in the field of motion recognition of autonomous vehicles, there are still some significant limitations. Conventional feature extraction methods depend on artificially designed fixed patterns, which are challenging to adapt to the intricate and dynamic traffic environment, resulting in suboptimal recognition accuracy and resilience. Furthermore, the existing ANN is susceptible to overfitting when confronted with large-scale data, which impairs the model’s generalizability. While BP has the advantage of network weight optimization, it also has the disadvantage of local minimization and slow convergence when dealing with nonlinear and dynamically changing environmental characteristics. The parallel processing ability and self-learning ability of ANN enable it to effectively extract features from a large number of sensor data. Furthermore, the weight adjustment mechanism of the BP algorithm can optimize the network structure according to the error feedback, thus improving the prediction performance of the model. The integration of BP into ANN enables operators to not only model the intricate driving environment through the nonlinear mapping capacity of ANN, but also to dynamically adjust the weights within the network through the back propagation mechanism of the BP algorithm, thereby facilitating adaptation to the ever-changing traffic conditions. To make more effective use of computer technology to improve autonomous vehicles, the study associated ANN and BP to construct a novel autonomous driving technology, thereby providing a technical reference for the vehicle engineering industry.

3 ANN-based action recognition model design

ANN mimics the structure of the human nervous system and is a recognition model with a learning function. It allows information transfer between neurons, which is used to realize AR on vehicles. This study introduces BP into the ANN to reduce the probability of errors, which in turn provides an improved model for AD vehicles, aiming to create value for the development of society.

3.1 Combining ANN and BP for time domain features analysis

In the field of AR, ANN is widely used to classify human actions. In this study, time domain features (TDF) analysis of actions is performed based on ANN for automatic recognition of human actions. By tracking the signal's evolution over time, one may determine the information properties of the signal (Yumatov et al., 2022). In this process, the ANN shows a powerful learning and error correction ability by propagating errors through multiple layers of mapped neurons. The operation of this kind of network is divided into three steps, as shown in Figure 1.

Figure 1

Figure 1. Three steps of error propagation by ANN.
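The error propagation through input, hidden, and output layers described above can be illustrated with a minimal sketch. It assumes a single hidden layer, sigmoid activations, and a squared-error loss; none of these choices, nor the function names, come from the paper, which does not specify them at this point.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Input layer -> hidden layer -> single output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def squared_error(x, target, w_hidden, w_out):
    """Loss assumed for this sketch: 0.5 * (output - target)^2."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (output - target) ** 2


def backward(x, target, w_hidden, w_out):
    """Propagate the output error back and return the weight gradients."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output neuron (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    grad_hidden = []
    for j, h in enumerate(hidden):
        # Error signal passed back to hidden neuron j.
        delta_h = delta_out * w_out[j] * h * (1.0 - h)
        grad_hidden.append([delta_h * xi for xi in x])
    return grad_hidden, grad_out


def train(samples, n_hidden=3, lr=0.5, epochs=200, seed=0):
    """Plain online gradient descent using the gradients from backward()."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, target in samples:
            grad_hidden, grad_out = backward(x, target, w_hidden, w_out)
            for j in range(n_hidden):
                w_out[j] -= lr * grad_out[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * grad_hidden[j][i]
    return w_hidden, w_out
```

A quick way to gain confidence in such a sketch is to compare the gradients returned by `backward` with finite-difference estimates of `squared_error`.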

In Figure 1, the study first collects action datasets for training and testing, which contain different kinds of human action video clips. In order to improve the performance of the model, the study performs sufficient sample preparation in the input layer and considers multiple action categories (Wang et al., 2021). For each clip in the video, the study labels the corresponding action category, at which point the hidden layer can accept the network propagation of the video. The preprocessing includes the image processing step shown in Eq. 1.


In Eq. 1, the smallest unit of the image processing process is denoted as γi. With the number of images in the video set to N, the j th feature is denoted as ξij. For video data, the study first extracts each action segment frame by frame to generate an image sequence at the output layer of the network (Hou et al., 2020). Then, the obtained image sequence is preprocessed, including color channel conversion and image enhancement operations. In the feature extraction process of the image, the feature vectors of each frame are labeled, and their relationship is given in Eq. 2.


In Eq. 2, ZS and ZF are the vectors of the initial frame and the action frame, respectively, ES denotes the total number of features of the vector, and the magnitude of the image enhancement is denoted as IS. Because some null values arise in the experimental environment during data acquisition, the accuracy of the algorithm is reduced when such data are used for feature processing. When the feature vectors are relatively large, the ANN will overfit during training, making the algorithm's recognition results unsatisfactory. Therefore, the study performs a null removal operation on the data, as shown in Eq. 3.


In Eq. 3, the null value of the initial frame is denoted as re, and the deviation of the action frame is represented using Tei. Under the condition that the model is not overfitted, Te¯ represents the accurate recognition result. When the result is inaccurate, SOMi is the current feature, and the mean value of multiple trials is denoted as SOM¯. After the model evaluation and cleaning are completed, the study deploys the model into the AD car to realize the automatic recognition of the driver’s action, and the flow is shown in Figure 2.

Figure 2

Figure 2. Flow chart of automatic identification of model in auto driving car.
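The null removal step described before Figure 2 can be sketched as a simple filter over per-frame feature vectors. This is only an illustration under stated assumptions: the representation of frames as lists of floats, the treatment of `None` and NaN as nulls, and the helper names are hypothetical and do not reproduce Eq. 3.

```python
import math


def is_null(value):
    """Treat None and NaN as missing values (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_null_frames(frames):
    """Keep only frames whose feature vector has no missing entries."""
    return [frame for frame in frames if not any(is_null(v) for v in frame)]


def deviation_from_mean(frames):
    """Deviation of each frame's mean feature value from the overall mean."""
    frame_means = [sum(frame) / len(frame) for frame in frames]
    overall = sum(frame_means) / len(frame_means)
    return [m - overall for m in frame_means]


raw_frames = [[1.0, 2.0], [None, 3.0], [float("nan"), 1.0], [3.0, 4.0]]
clean_frames = drop_null_frames(raw_frames)
deviations = deviation_from_mean(clean_frames)
```

Dropping whole frames is the simplest policy. Imputing missing entries would be an alternative, but the text does not say which the authors used.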

Figure 2 shows the workflow of the ANN model in the AR task. This model can be used for time series modeling in human AR. This research integrates spatial features and time series features to design a hybrid model based on ANN and BP. In addition, the model includes several layers for extracting different levels of feature representations, and the working relationship between these layers follows Eq. 4.


In Eq. 4, T represents the total layers in the ANN, the layer where the current video is located is denoted using t. ωmax,ωmin represent the maximum and minimum values when the image is parsed, respectively. The BP process is used to model the time series of video frames, which in turn captures the timing information of the action (Charles, 2023). In this study, BP is used as the core model of ANN to model the time series of feature sequences extracted by ANN to achieve efficient feature extraction of complex time series signals, and the extraction method is shown in Eq. 5.


In Eq. 5, x is called the complex timing signal and y represents the barrier to signal propagation. The retrieval time is denoted as n, χ represents the weakening of the signal, ei is the speed of feature extraction, and the resulting group of video signals is represented using p. The understanding of different human actions is then achieved by learning the dependencies between the time series features. After the model design is completed, the study utilizes the training data to train the model. In addition, as Eq. 6 illustrates, this validation data may be used to fine-tune the model.


In Eq. 6, the result of the tuning is denoted as b*. The term αi* makes the tuning process smoother, and the cross-entropy loss function of the training results is denoted as sgn. In this process, the study uses stochastic gradient descent to update the model parameters. The study also provides technical support to the AD automotive field by monitoring real-time motion video through the deployed model (Song et al., 2022). The parameter optimization process of the ANN-BP model is shown in Figure 3.

Figure 3

Figure 3. Parameter optimization method diagram of ANN-BP model.
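The text above states that the parameters are updated by stochastic gradient descent under a cross-entropy loss. A minimal sketch of that combination follows, assuming a linear softmax classifier over hand-made feature vectors. The model form, the toy data, and all names are illustrative assumptions, not the paper's ANN-BP model.

```python
import math
import random


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    probs = softmax(scores)
    return probs.index(max(probs))


def sgd_step(weights, x, label, lr):
    """One SGD update in place; returns the cross-entropy loss before the update."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    probs = softmax(scores)
    for k, row in enumerate(weights):
        # Gradient of cross-entropy w.r.t. the score of class k.
        err = probs[k] - (1.0 if k == label else 0.0)
        for i in range(len(row)):
            row[i] -= lr * err * x[i]
    return -math.log(probs[label])


def train(samples, n_classes, lr=0.1, epochs=50, seed=0):
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    weights = [[0.0] * n_features for _ in range(n_classes)]
    order = list(samples)
    history = []  # mean loss per epoch
    for _ in range(epochs):
        rng.shuffle(order)  # "stochastic": visit samples in random order
        losses = [sgd_step(weights, x, y, lr) for x, y in order]
        history.append(sum(losses) / len(losses))
    return weights, history


# Toy two-class data; the last component is a constant bias input.
toy = [([1.0, 0.0, 1.0], 0), ([0.9, 0.1, 1.0], 0),
       ([0.0, 1.0, 1.0], 1), ([0.1, 0.9, 1.0], 1)]
weights, history = train(toy, n_classes=2)
```

On this separable toy set the mean loss in `history` falls over the epochs, which is the behavior one would check before moving to real video features.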

In Figure 3, the study first receives the video in the ANN through the input layer and cuts the video in the hidden layer. For the output results obtained from the output layer, the research will go through the hidden layer for BP. When the weight threshold degree of the network reaches the limit, the research will add timing signals in the TDF analysis for feature extraction and classification of the ANN-BP model. When the clarity of the video does not meet the requirements, the study introduces Eq. 7 at this point to solve the problem.


In Eq. 7, ,υ are the video quality before and after adjustment, respectively, l and Ω represent the temporal signals of the image, and the conditional parameter of this process is denoted as τ. In this way, the study uses the pre-trained ANN-BP model as the TDF extractor, which in turn obtains the feature vector of each video frame. In summary, this approach realizes the automatic recognition and classification of human actions, which is valuable for application in the study of action acquisition models.
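To make the notion of time domain features concrete, the sketch below computes a few generic hand-crafted descriptors over sliding windows of a one-dimensional signal. This is not the learned ANN-BP extractor described above. The chosen descriptors (mean, RMS, peak, zero crossings) and the function names are assumptions picked only for illustration.

```python
import math


def sliding_windows(signal, size, step):
    """Split a signal into fixed-size windows, advancing by step samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def time_domain_features(window):
    """Simple descriptors of how the signal behaves over one window."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(v * v for v in window) / n)
    peak = max(abs(v) for v in window)
    # Count sign changes between consecutive samples.
    zero_crossings = sum(
        1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0)
    )
    return {"mean": mean, "rms": rms, "peak": peak, "zero_crossings": zero_crossings}


signal = [1.0, -1.0, 1.0, -1.0, 2.0, 2.0]
features = [time_domain_features(w) for w in sliding_windows(signal, size=4, step=2)]
```

Each window yields one small feature dictionary, so a video-derived signal becomes a sequence of feature vectors that a classifier can consume.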

3.2 Improvement of ANN-BP model for auto driving

The ANN-BP model for AD technology is a hot research topic in the field of artificial intelligence, and the ANN is an important component of this system. This research proposes an ANN-BP model that optimizes the parameter selection process by continuously adjusting the weight threshold of the network through BP. In TDF analysis, ANN-BP is used to characterize the time domain signals. The current ANN-BP model has some problems when applied to AD, such as long training time and high video requirements (Amin et al., 2023). This study proposes Eq. 8 to increase the convergence speed of the model.


In Eq. 8, a1 and a2 are the convergence speeds of the model before and after acceleration, and o and q are the TDF analysis and spatial features of the image sequence in the AD process, respectively. To address the long training time of the ANN-BP model, this study uses a parallel computing method. In the working process of the ANN-BP model, the adjustment of the weights during training accounts for the largest share of the running time. For the weight thresholds of the model in this process, the study uses the computational measures shown in Figure 4.

Figure 4

Figure 4. Measure chart of weight threshold calculation in training process of ANN-BP model.
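The idea of parallelizing the weight update, mentioned before Figure 4, is commonly realized as data parallelism: each worker computes a gradient on its own shard of the batch and the results are combined. The sketch below shows that pattern for a linear model with squared error. The model, the sharding scheme, and the function names are assumptions for illustration, since the paper does not detail its parallel scheme.

```python
from concurrent.futures import ThreadPoolExecutor


def shard_gradient(shard, weights):
    """Sum of squared-error gradients for a linear model over one shard."""
    grad = [0.0] * len(weights)
    for x, y in shard:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for i, xi in enumerate(x):
            grad[i] += err * xi
    return grad


def parallel_gradient(samples, weights, n_workers=4):
    """Average gradient over all samples, computed shard by shard."""
    shards = [samples[i::n_workers] for i in range(n_workers)]
    shards = [s for s in shards if s]  # drop empty shards for tiny batches
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: shard_gradient(s, weights), shards))
    total = [sum(column) for column in zip(*partials)]
    return [g / len(samples) for g in total]


def update(weights, grad, lr):
    return [w - lr * g for w, g in zip(weights, grad)]
```

Because the shard gradients are summed before averaging, the result matches the serial gradient up to floating-point rounding. In CPython, threads mainly illustrate the pattern; an actual speedup would typically need processes or a vectorized numerical library.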

Figure 4 demonstrates the adaptation scheme of the weight thresholds when the ANN-BP model is trained. In Figure 4, the study adjusts the weight thresholds of the network through the BP method, which makes the network adapt to the relationship between inputs and outputs. The BP process passes through multiple hidden layers to process complex vehicle driving videos (Dang and Pham, 2021). In this process, the ANN-BP model continuously adjusts the connection weights to minimize the error between the actual output and the desired output, and thereby predicts the input pattern, as shown in Eq. 9.


In Eq. 9, MF refers to the connection weight at the time of detection, the error of the current output is denoted as Ubef, and the minimum value of the error at the same time is denoted as Upro. For TDF analysis of the ANN-BP model, the study uses ANN for feature extraction of the timing signal in combination with the BP method using AD optimization (Zeng et al., 2022). ANN can extract the abstract features of the signal, and BPAD can make further predictions from these features, as shown in Eq. 10.


In Eq. 10, the prediction range of BPAD is denoted as t, the indices i, j, and p represent the time nodes within the ANN, and its environmental factors are denoted by δ and O. ε represents the time range of sampling, and η is the adjustment method of adaptive features. Combining TDF and spatial features, as chosen in this research, increases the dimension of the feature set of the ANN-BPAD algorithm, which in turn reduces its convergence speed. The research therefore reduces the dimensionality of the feature set, as shown in Eq. 11.


In Eq. 11, the convergence rate of the ANN-BPAD algorithm is denoted as dλ, the dimension of the feature set at this point is denoted as u, and dt represents the dimension of the dataset after feature reduction. This type of parallel computing can distribute the data processing process to multiple nodes to achieve the dimensionality reduction of vehicle driving features (Bibyan et al., 2023). In this step, AR is an important component of AD that helps the vehicle to recognize different objects in the surrounding environment. For driving decision making for driving safety, the study senses the environment around the vehicle and obtains the relevant sensor data in the way shown in Figure 5.

Figure 5

Figure 5. Method diagram of sensor learning automobile surroundings.

As seen in Figure 5, the research firstly records the driving objects around the vehicle by LiDAR and inputs these data into the model for recording in order to realize the recognition of different actions. By monitoring the actions of the vehicle’s surroundings in real time, this method can help the AD vehicle to make corresponding program improvements, thus improving driving efficiency. To address the problem of excessive generalization ability of ANN-BPAD algorithm in AR, the study uses Eq. 12 for optimization.


In Eq. 12, Φm cos(ϖt + κ) is the generalization around the vehicle during driving and Φ0 is the driving efficiency during actual driving. ϖt represents the feature extraction ability of the image data and κ is the complexity of the data. This method can improve the training efficiency and generalization ability of the ANN-BPAD algorithm to achieve the AR of the AD car. This value is trained together with the direction variables until it meets the preset conditions, at which point the final result is output (Liu et al., 2022). In order to screen the valid values within the corresponding TDF, this study uses the method shown in Eq. 13.


In the above Eq. 13, the effective time domain in a single experimental setting is denoted as σ0, and N represents the number of vehicle ADs performed during this time. With this technique, the ANN-BPAD algorithm can recognize the speed and direction of travel of other vehicles, thus avoiding traffic accidents. In areas where there are many pedestrians, the algorithm is able to detect pedestrians in a timely manner, thus protecting them in the road. For the recognition of traffic signs during driving, the ANN-BPAD algorithm has the ability to drive by rules, thus improving traffic efficiency. During these recognitions, the data is recorded in a way that follows Figure 6.

Figure 6

Figure 6. Data recording mode in the recognition process of ANN-BPAD algorithm.

Figure 6 presents the ANN-BPAD algorithm for pedestrians and traffic signs. In Figure 6, the study extracts their features for AR based on the original sensor data. This method can effectively deal with a large amount of driving data and automatically learn the feature representation of the data through BPAD, which in turn improves driving safety. For vehicle AR, sensor data is the most important input. When the sensor faces noise, the study introduces the following Eq. 14.


In the above Eq. 14, k represents the number of error corrections for the sensor data and Ek is the noise handling capability of the ANN-BPAD algorithm. This method is effective in reducing the requirements of input data, which in turn enhances the weak feature extraction ability of the data. When faced with manual preprocessing and feature extraction, the applicability of the ANN-BPAD algorithm is limited. In this condition, the study designs the method of automatic feature learning so that it can quickly screen the TDF and frequency domain features of the video as shown in Eq. 15.


In Eq. 15, Γ0 is the speed of automatic learning of vehicle features, the TDF of the video is denoted as Γm, and Γm represents its frequency domain features. In summary, the design of vehicle AD models for the ANN-BPAD algorithm involves data preparation and preprocessing, model design and training, and actual deployment of the algorithm. This research analyzes these aspects and improves the flaws contained in them. This method will achieve more significant results and create more economic value for the transportation industry, and its specific performance needs to be verified by experiments. This study presents a transfer learning method based on the distributed centroid, which is employed to construct an intermediate representation between the target domain and the source domain. This approach is designed to facilitate more effective cross-domain knowledge transfer. The objective is to design a tag recovery and track designable network that identifies and corrects mislabeled data and learns more robust feature representations to properly account for the effects of tag noise and data decentralization in motion recognition tasks for autonomous vehicles.

4 Improved performance analysis of auto driving car based on ANN-BP algorithm

To validate the proposed ANN-BPAD algorithm for AR improvement in AD cars, this study conducted experiments on the Autov dataset. The dataset includes a total of 57 roads, each carrying a different number of vehicles. The experiments cover the training efficiency, accuracy, and robustness of the model.

4.1 Experimental validation of ANN-BPAD algorithm for action recognition

In this study, the data in the Autov dataset were divided into two groups according to the 9:17 ratio in order to make rational use of the limited total amount of data. The two groups were used to validate the learning performance of the model and the performance of AR while driving the car, respectively. Before the experiment, the study determined the equipment and parameters as shown in Table 1.

Table 1

Table 1. Equipment selection and parameter determination in ANN-BPAD algorithm.

Table 1 presents the data set utilized in the experiment, designated Autov. This data set comprises videos recorded in a variety of traffic scenes. Each video clip is equipped with detailed annotation information, including the type of action, the time stamp when the action occurred, and the location of the action subject. The distribution of action categories in the data set is balanced, thus avoiding the problem of an excess of some categories and a deficiency of others. Furthermore, the actions are captured in real time, reflecting the dynamic changes in the real world. The data set is divided into two distinct portions: 70% for model training and 30% for experimental testing.

After determining the parameters of the experiment in accordance with Table 1, the study conducts experiments to analyze the performance of the ANN-BPAD algorithm and compares the experimental results with those of the spotted hyena algorithm (SH), the ANN algorithm, and the BPAD algorithm in order to verify the superiority of the algorithm. The population size of the SH is set to 50, and the feature dimension of each individual is consistent with the problem space. The crossover rate of 0.8 and mutation rate of 0.01 are chosen based on the findings of previous studies and preliminary experimental tests, with the objective of balancing exploration and exploitation in the algorithm. To ensure that the algorithm can adequately search the solution space, the number of iterations is set to 100. The ANN algorithm comprises an input layer, two hidden layers, and an output layer. The first hidden layer comprises 128 neurons, while the second contains 64 neurons. This configuration is subject to adjustment based on the characteristics of the data set and the outcomes of numerous experiments. The activation function is ReLU, and the learning rate is 0.01, decaying at a ratio of 0.95 as the number of iterations increases. The decaying learning rate is used to accelerate convergence and avoid local minima. A momentum term is introduced into the BPAD algorithm to enhance the efficiency of the network weight updates. The momentum parameter is set to 0.7, a value that has been shown in numerous experiments to accelerate convergence and reduce oscillation. The initial value of the learning rate is set to 0.01, and an adaptive adjustment strategy is employed to enable the algorithm to respond dynamically to changes in the network error. The results of the experiment are presented in Figure 7.
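The training settings described above (a 70%/30% split, an initial learning rate of 0.01 decayed by 0.95, and a momentum of 0.7) can be sketched as follows. The sketch assumes that the decay is applied multiplicatively once per iteration and that momentum follows the classical velocity form. Both are interpretations, and the function names are hypothetical.

```python
import random


def split_dataset(items, train_fraction=0.7, seed=0):
    """Shuffle and split into training and test portions (70/30 by default)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def decayed_learning_rate(iteration, initial=0.01, decay=0.95):
    """Learning rate after a number of iterations, assuming per-iteration decay."""
    return initial * decay ** iteration


def momentum_step(weights, velocity, grad, lr, momentum=0.7):
    """Classical momentum: the velocity accumulates past gradient steps."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


train_clips, test_clips = split_dataset(range(10))
w, v = momentum_step([1.0], [0.0], grad=[2.0], lr=0.1)
w, v = momentum_step(w, v, grad=[2.0], lr=0.1)
```

With a constant gradient the second step is larger than the first, which is the accelerating effect that the momentum term is meant to provide.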

Figure 7

Figure 7. Test of the ANN-BPAD algorithm's superiority. (A) Experiment of vehicle blind guide error. (B) Action judgment accuracy test.

The experimental findings comparing the four algorithms are displayed in Figure 7. In Figure 7A, as the vehicle AR index grows, the error rate of all four algorithms for vehicle AD decreases. Their control ability for AD vehicles stabilizes when the recognition index reaches 413. At this point, the error rate of the proposed ANN-BPAD is the lowest among the four algorithms, at 2.7 × 10⁻⁴, indicating that the algorithm has the strongest action judgment ability. In Figure 7B, the accuracy of motion judgment shows an upward trend for all four algorithms. Beyond an automobile action coefficient of 227, the judgment accuracy of the four algorithms remains steady. At this point, the ANN-BPAD, SH, ANN, and BPAD algorithms have accuracy values of 0.98, 0.94, 0.93, and 0.95, respectively. The ANN-BPAD has the best accuracy of all of them, indicating the strongest error avoidance capability. To confirm that this algorithm can operate vehicles, the study carried out trials, as illustrated in Figure 8.

Figure 8

Figure 8. Extended test of vehicle running ability with four algorithms. (A) Vehicle running ability of ANN-BPAD algorithm. (B) Vehicle running ability of SH algorithm. (C) Vehicle running ability of ANN algorithm. (D) Vehicle running ability of BPAD algorithm.

In Figure 8, the driving control for each vehicle action category is positively correlated with the number of sample data. Categories 1, 2, 3, 4, and 5 represent acceleration, deceleration, turning, reversing, and uniform driving, respectively. In addition, as the number of samples on the road increases, the control effect during driving improves. When the road sample is 547, the vehicle driving ability values of the four algorithms are 4.7, 3.5, 3.2, and 2.6, respectively. Among them, the ANN-BPAD algorithm has the best experimental performance, which indicates that the ANN-BPAD is optimal in terms of vehicle operation ability. The results of this experiment show that the algorithms are effective for AD. The study also conducts experimental verification of the practical effectiveness of the ANN-BPAD algorithm in road operation.

4.2 Validation of the effectiveness of a driving automation algorithm combining ANN and BP

To test the driving automation effect of the ANN-BPAD algorithm on different roads, the study varies the driving distance while keeping the driving time the same, and then verifies the wide applicability of the ANN-BPAD algorithm on the road. The devices and sensors used in the experiment are shown in Figure 9.

Figure 9

Figure 9. Experimental device and sensor.

In Figure 9, the experiment employs a pure electric rear-drive four-wheel vehicle as the experimental apparatus. The front video sensor is situated on the upper surface of the front windshield and the rearview mirrors on both sides, while the rear video sensor is located on the upper surface of the rear windshield. The experimental data were reconstructed and stored by EmerNeRF, as illustrated in Figure 10.

Figure 10. Data acquisition and reconstruction.

As shown in Figure 10, after the video data are collected, the sub-frames are analyzed, reconstructed, and saved. The accuracy test results are shown in Figure 11.

Figure 11. Accuracy test results of the four algorithms. (A) Algorithm working index experiment. (B) Algorithm driving ability experiment.

In Figure 11A, the behavioral judgment ability of the four algorithms for driving automation increases in proportion to the algorithm work index. The ANN-BPAD method has the highest AR ability of the four algorithms at 23, while the driving environment detection abilities of the other three algorithms are 14, 17, and 11, respectively. In Figure 11B, the on-road driving ability of the four algorithms changes as the algorithm working condition rises, but the ANN-BPAD algorithm consistently performs best. These results demonstrate the broad applicability of the ANN-BPAD algorithm. The study then tested the effectiveness of the algorithm, as shown in Figure 12.

Figure 12. Large-scale test results of unmanned driving with the ANN-BPAD algorithm.

Figure 12 compares the vehicle running distance of the ANN-BPAD, SH, ANN, and BPAD algorithms. When the driving time is 17.3 h, the running dimensions of the four algorithms are 1.2, 2.3, 2.7, and 3.4, respectively, which indicates that the ANN-BPAD algorithm is optimal in terms of energy saving in the data processing of unmanned vehicles. The running time of the proposed algorithm gradually stabilizes as the distance traveled by the vehicle increases. It stabilizes when the driving distance is 7.3 km, and its value of 14.7 h is the best performance among the four algorithms. These results show the continuity of the ANN-BPAD algorithm. To verify the effect of the algorithm on vehicle operation in different environments, the study carries out robustness experiments, the results of which are shown in Figure 13.

Figure 13. Robustness test of four algorithms. (A) Experiment of algorithm auto driving. (B) Driving F1 value test.

Figure 13 shows the robustness experiments conducted on the ANN-BPAD algorithm. In Figure 13A, the AD capability of all four algorithms grows as the algorithm working time increases. The highest vehicle driving abilities of the ANN-BPAD, SH, ANN, and BPAD algorithms are 29, 24, 19, and 17, respectively, indicating that ANN-BPAD is superior in driving distance. In Figure 13B, the ANN-BPAD algorithm always has the lowest F1 value among the four algorithms. When the vehicle driving distance is 17 km, its accuracy reaches a stable level of 1.3. The experimental results show that the proposed ANN-BPAD algorithm is effective, superior, and robust for AR in vehicular AD, and is suitable for obstacle avoidance in vehicular AD.
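
For readers unfamiliar with the metric, the standard F1 score is the harmonic mean of precision and recall. The sketch below computes per-class and macro-averaged F1 over the five action categories listed for Figure 8. It is only an illustration under stated assumptions: the text does not say how the quantity plotted in Figure 13B was computed, the label lists are invented and are not data from the study, and the helper names `f1_per_class` and `macro_f1` are hypothetical.

```python
from collections import Counter

# Action categories as numbered in Figure 8.
ACTIONS = {1: "acceleration", 2: "deceleration", 3: "turning",
           4: "reversing", 5: "uniform driving"}


def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1}, where F1 = 2TP / (2TP + FP + FN) for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true class was t
            fn[t] += 1  # the true class t was missed
    scores = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Score a class as 0 when it is neither present nor predicted.
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Hypothetical labels, for illustration only (not data from the study).
y_true = [1, 1, 2, 3, 3, 4, 5, 5]
y_pred = [1, 2, 2, 3, 3, 4, 5, 1]
per_class = f1_per_class(y_true, y_pred, sorted(ACTIONS))
overall = macro_f1(y_true, y_pred, sorted(ACTIONS))
```

Macro averaging weights every action category equally, which is one reasonable choice when rare maneuvers such as reversing matter as much as common ones; a sample-weighted average would be an equally valid alternative.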

5 Conclusion

As research on AD vehicles advances, algorithms increasingly tend to collect information at high speed. To add environmental risk prediction capability to vehicle AD, this study combined ANN and BP operations to generate the ANN-BPAD algorithm. The study introduced time-frequency feature analysis, which enhanced the model’s adaptability to dynamic environments through in-depth analysis of driving features in the time-frequency domain and enabled more accurate AR in complex traffic scenes. To address the significant time and computational resources that the traditional BP algorithm requires during training, the study introduced parallel computing technology, a momentum term, and an adaptive learning rate, which also enhance the model’s generalizability. The study compared the performance of the proposed algorithm with that of three other algorithms. When the road sample was 547, the vehicle driving ability of the four algorithms was 4.7, 3.5, 3.2, and 2.6, respectively, indicating that ANN-BPAD has the strongest AD capability. When the driving time was 17.3 h, the running dimensions of the algorithms were 1.2, 2.3, 2.7, and 3.4, respectively. According to the experimental results, the ANN-BPAD algorithm offers high training efficiency, high accuracy, and robustness in environmental risk probing for vehicular AD, and is suitable for lowering the risk of vehicular AD. However, this study was conducted only on small vehicles, whereas road traffic also includes large vehicles. As the equipment is gradually upgraded, experiments on large vehicles will be carried out in future studies.
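
The momentum term and adaptive learning rate mentioned above can be illustrated with a minimal sketch. The code trains a small one-hidden-layer network by backpropagation, adds a momentum term to each update, and adapts the learning rate with the "bold driver" heuristic (raise the rate after an improving step, reject the step and halve the rate otherwise). Everything here is an assumption made for illustration: the article does not state its exact update rule, the network size, constants, function names, and toy data are hypothetical, and the parallel computing component is omitted.

```python
import math
import random

H = 3  # hidden units (arbitrary choice for this sketch)


def unpack(theta):
    """Split the flat parameter list into W1 (H x 2), b1, W2, b2."""
    w1 = [theta[2 * j:2 * j + 2] for j in range(H)]
    return w1, theta[2 * H:3 * H], theta[3 * H:4 * H], theta[4 * H]


def loss_and_grad(theta, data):
    """Mean squared error and its gradient, computed by backpropagation."""
    w1, b1, w2, b2 = unpack(theta)
    grad = [0.0] * len(theta)
    loss = 0.0
    for x, t in data:
        # Forward pass: tanh hidden layer, sigmoid output.
        h = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j])
             for j in range(H)]
        z = sum(w2[j] * h[j] for j in range(H)) + b2
        y = 1.0 / (1.0 + math.exp(-z))
        loss += (y - t) ** 2 / len(data)
        # Backward pass: propagate the output error toward the inputs.
        d_out = 2.0 * (y - t) * y * (1.0 - y) / len(data)
        for j in range(H):
            d_hid = d_out * w2[j] * (1.0 - h[j] ** 2)
            grad[2 * j] += d_hid * x[0]
            grad[2 * j + 1] += d_hid * x[1]
            grad[2 * H + j] += d_hid
            grad[3 * H + j] += d_out * h[j]
        grad[4 * H] += d_out
    return loss, grad


def train(data, epochs=300, lr=0.1, momentum=0.9, seed=0):
    """Gradient descent with momentum and a bold-driver adaptive rate."""
    rng = random.Random(seed)
    theta = [rng.uniform(-0.5, 0.5) for _ in range(4 * H + 1)]
    velocity = [0.0] * len(theta)
    loss, grad = loss_and_grad(theta, data)
    history = [loss]
    for _ in range(epochs):
        new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        candidate = [p + v for p, v in zip(theta, new_velocity)]
        new_loss, new_grad = loss_and_grad(candidate, data)
        if new_loss < loss:
            # Accept the step and cautiously raise the learning rate.
            theta, velocity = candidate, new_velocity
            loss, grad = new_loss, new_grad
            lr *= 1.05
        else:
            # Reject the step, drop the momentum, and halve the learning rate.
            velocity = [0.0] * len(theta)
            lr *= 0.5
        history.append(loss)
    return theta, history


# Toy data, invented for illustration: (acceleration, steering rate) -> turning?
data = [((0.0, 0.0), 0.0), ((1.0, 0.0), 0.0), ((-1.0, 0.0), 0.0),
        ((0.0, 1.0), 1.0), ((1.0, 1.0), 1.0), ((0.0, -1.0), 1.0)]
theta, history = train(data)
```

Because a step is kept only when it lowers the loss, the recorded loss history never increases; the momentum term speeds progress along consistent gradient directions, while the adaptive rate removes the need to hand-tune a fixed step size.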

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

YT: Conceptualization, Investigation, Methodology, Writing–original draft, Writing–review and editing. JT: Data curation, Software, Validation, Visualization, Writing–original draft, Writing–review and editing.


Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


References

Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., Liu, L., Ghavamzadeh, M., et al. (2021). A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion 76, 243–297. doi:10.1016/j.inffus.2021.05.008

Ali, A., Kamal, K., Ratlamwala, T. A. H., Fahad Sheikh, M., and Arsalan, M. (2021). Power prediction of waste heat recovery system for a cement plant using back propagation neural network and its thermodynamic modeling. Int. J. Energy Res. 45 (6), 9162–9178. doi:10.1002/er.6444

Amin, S. N., Shivakumara, P., Jun, T. X., Chong, L., Zan, D. L. L., and Rahavendra, R. (2023). An augmented Reality-Based approach for designing interactive food menu of restaurant using android. Artif. Intell. Appl. 1 (1), 26–34. doi:10.47852/bonviewaia2202354

Bibyan, R., Anand, S., and Jaiswal, A. (2023). Software reliability testing coverage model using feed-forward back propagation neural network. Int. J. Model. Identif. Control 43 (2), 126–133. doi:10.1504/ijmic.2023.10056489

Charles, D. (2023). The Lead-Lag relationship between international food prices, freight rates, and Trinidad and Tobago's food inflation: a support vector regression analysis. Green Low-Carbon Econ. 1 (2), 94–103. doi:10.47852/bonviewglce3202797

Chu, S., Cheng, L., Cheng, J., Zhang, X., Zhang, J., Chen, J., et al. (2023). Shallow water bathymetry based on a back propagation neural network and ensemble learning using multispectral satellite imagery. Acta Oceanol. Sin. 42 (5), 154–165. doi:10.1007/s13131-022-2065-6

Claussmann, L., Revilloud, M., Gruyer, D., and Glaser, S. (2019). A review of motion planning for highway autonomous driving. IEEE Trans. Intelligent Transp. Syst. 21 (5), 1826–1848. doi:10.1109/tits.2019.2913998

Dang, H. D., and Pham, T. T. (2021). Predicting contract participation in the mekong delta, vietnam: a comparison between the artificial neural network and the multinomial logit model. J. Agric. Food Industrial Organ. 20 (2), 135–147. doi:10.1515/jafio-2020-0023

Hou, L., Lei, Y., Fu, Y., and Hu, J. (2020). Effects of lightweight gear blank on noise, vibration and harshness for electric drive system in electric vehicles. Proc. Institution Mech. Eng. Part K J. Multi-Body Dyn. 234 (3), 447–464. doi:10.1177/1464419320915006

Klonecki, T., Teisseyre, P., and Lee, J. (2023). Cost-constrained feature selection in multilabel classification using an information-theoretic approach. Pattern Recognit. J. Pattern Recognit. Soc. 15 (2), 141–147. doi:10.1016/j.patcog.2023.109605

Liu, Y., Li, Y., Ma, L., Chen, B., and Li, R. (2022). Parametric design approach on high-order and multi-segment modified elliptical helical gears based on virtual gear shaping. Proc. Institution Mech. Eng. Part C J. Mech. Eng. Sci. 236 (9), 4599–4609. doi:10.1177/09544062211058396

Ludwig-Barbosa, V., Rasch, J., Sievert, T., Carlström, A., Pettersson, M. I., Thuy Vu, V., et al. (2023). Detection and localization of F-layer ionospheric irregularities with the Back-Propagation method along the radio occultation ray path. Atmos. Meas. Tech. 16 (7), 1849–1864. doi:10.5194/amt-16-1849-2023

More, b, Almeida de, M., Emailprotected, E., Jakse, N., and Poloni, R. (2023). Artificial neural network-based density functional approach for adiabatic energy differences in transition metal complexes. J. Chem. Theory Comput. 19 (21), 7555–7566. doi:10.1021/acs.jctc.3c00600

Parikh, H., Patel, S., and Patel, V. (2023). Evaluation of deep learning and transform domain feature extraction techniques for land cover classification: balancing through augmentation. Environ. Sci. Pollut. Res. 18 (9), 515–531. doi:10.1007/s11356-022-23105-6

Patole, S. M., Torlak, M., Wang, D., and Ali, M. (2017). Automotive radars: a review of signal processing techniques. IEEE Signal Process. Mag. 34 (2), 22–35. doi:10.1109/msp.2016.2628914

Song, L. K., Li, X. Q., Zhu, S. P., and Choy, Y. S. (2024). Cascade ensemble learning for multi-level reliability evaluation. Aerosp. Sci. Technol. 148, 109101. doi:10.1016/j.ast.2024.109101

Song, Y., Mo, S., Feng, Z., Song, W., and Hou, M. (2022). Research on dynamic load sharing characteristics of double input face gear Split-Parallel transmission system. Proc. Institution Mech. Eng. Part C J. Mech. Eng. Sci. 236 (5), 2185–2202. doi:10.1177/09544062211026349

Wang, H., Wu, Y., Gao, C., Deng, Y., Zhang, F., Huang, J., et al. (2021). Medication combination prediction using temporal attention mechanism and simple graph convolution. IEEE J. Biomed. Health Inf. 25 (10), 3995–4004. doi:10.1109/jbhi.2021.3082548

Yumatov, E. A., Dudnik, E. N., Glazachev, O. S., Filipchenko, A. I., and Pertsov, S. S. (2022). Revealing true and false brain states based on wavelet analysis of electroencephalogram. Neurosci. Med. 13 (2), 61–69. doi:10.4236/nm.2022.132006

Zeng, F., Wan, R., Cao, Y., Song, F., Peng, C., and Liu, H. (2022). Predicting the self-diffusion coefficient of liquids based on backpropagation artificial neural network: a quantitative Structure–Property relationship study. Industrial Eng. Chem. Res. 61 (48), 17697–17706. doi:10.1021/acs.iecr.2c03342

Keywords: artificial neural networks, back propagation, feature classification, action recognition, auto driving, time-frequency analysis

Citation: Tian Y and Tan J (2024) Improvement of action recognition based on ANN-BP algorithm for auto driving cars. Front. Mech. Eng 10:1400728. doi: 10.3389/fmech.2024.1400728

Received: 14 March 2024; Accepted: 23 May 2024;
Published: 13 June 2024.

Edited by:

Mohamed Arezki Mellal, University of Boumerdés, Algeria

Reviewed by:

Lu-Kai Song, Beihang University, China
Bin Yang, Xi’an Jiaotong University, China

Copyright © 2024 Tian and Tan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yong Tian,