Edited by: Giray Gozgor, Istanbul Medeniyet University, Turkey
Reviewed by: Yuhua Song, Zhejiang University, China; Yu Guan, Renmin University of China, China
This article was submitted to Environmental Psychology, a section of the journal Frontiers in Psychology
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
In this paper, we regard the Baidu index as an indicator of investors' attention to China's epidemic stocks. We believe that when seeking information to guide investment decisions, investor sentiment is usually affected by the information provided by the Baidu search engine, which may cause stock prices to fluctuate. Therefore, we constructed a GARCH extended model including the Baidu index to predict the return of epidemic stocks and compared it with the benchmark model. The empirical research in this paper finds that the forecast model including the Baidu index is significantly better than the benchmark model. This has important reference value both for investors in predicting stock trends and for the government's formulation of policies to prevent excessive stock market volatility.
Information has become one of the most valuable assets in the financial market. A growing body of literature shows that there is a complex correlation between information and financial market fluctuations. However, due to data limitations, most of the previous research has focused on the exploration of the role of micro-enterprise financial information. Many studies have also theoretically explored the importance of information to financial markets. With the popularization of mobile Internet, the emergence of big data information has completely changed the production, intermediary, dissemination, and consumption of information in the financial industry. Investors usually use enhanced information searches to cope with uncertainty and assist decision-making. The main advantages of using Internet searches are that they are free, wide-ranging, and timeliness.
The use of information obtained from search engines for stock market return forecasts is gradually gaining attention. The availability of big data brought about by search engines reinforces this trend. These data can usually provide investors with real-time information, thereby enhancing the scientific nature of investment decisions. In view of the gradual popularity of the mobile Internet, it has become a standard practice for most investors to seek online information before making an investment decision. Investors leave behind data about what they are looking for every time they use a search engine. If these retrieved data are analyzed systematically, investor sentiment in the real world can be effectively tracked. Therefore, the analysis of stock market dynamics usually depends on the availability of online information. Among them, the use of Google search data to predict epidemics, unemployment rates, and epidemiology is widely known.
Although Google ranks first in the global search engine market, it is unavailable to Chinese users. In China, investors usually use the Baidu search engine to retrieve data. This search engine's market share ranks first in China. Therefore, it is reasonable to believe that the data provided by Baidu searches can reliably reflect the attention of Chinese investors. The Baidu Index is a tool similar to Google Trends, launched by Baidu to reflect the intensity and interest of users in searching for specific keywords. Unlike Google Trends' weekly data, the Baidu Index can obtain daily user retrieval data, but there is a fee to download this data. In the current context in which the COVID-19 epidemic has not yet been effectively controlled, the availability of such high-frequency data will not only assist consumers in making investment decisions, but also help government departments to monitor epidemic trends in real time, contributing to efforts to control the epidemic.
In the raging situation of the COVID-19 epidemic, most existing studies use Google Trends to analyze the epidemic's impact, predict its trends, determine the users' needs for epidemic prevention materials and information, and monitor epidemiological trends. There are also some studies using Google Trends to examine the impact of COVID-19 on the macro economy and the stock market (Szczygielski et al.,
In recent years, some scholars have used Google Trends to study epidemics and disease spread. There are also some studies that predict financial markets based on Google Trends data, but the conclusions are inconsistent.
Yung and Nafar (
Research by Vlastakis and Markellos (
Preis et al. (
In view of the data availability and habits of Chinese investors using search engines, we use the Baidu index to analyze investors' attention to information about the epidemic. For keywords related to the epidemic, we have selected representative terms such as “N95 masks,” “Wuhan epidemic,” “latest news about COVID-19,” “medical masks,” etc. We use N95, Wuhan, COVID-19, and Mask to represent the above variables, respectively. All keyword data are taken with natural logarithmic values. We selected Walvax, a listed company, as the research object for the stock return data of the epidemic. Walvax is a modern biopharmaceutical enterprise specializing in the research and development, and the production and sales of biological drugs such as vaccines and blood products in China. It is a nationally recognized high-tech enterprise and a national enterprise technology center. The company was listed on the Growth Enterprise Market of the Shenzhen Stock Exchange of China in November 2010. Its stock code is 300142. We use the following formula to calculate stock returns:
In the above formula, Closing
Before exiting the Chinese market in 2010, Google ranked first in the Chinese search engine market. Since then, Baidu's market share has continued to rise from 60.2% in 2010 to 72.4% in May 2021 (Statcounter, 2021). In view of Baidu's leading position in the current Chinese search engine market, we choose Baidu Index instead of Google Trends as the source of search data to obtain investor sentiment (
The statistical description of the main variables is shown in
Statistical description of the main variables.
Mean | 0.016 | 6.424 | 7.617 | 10.058 | 5.930 |
Median | 0.096 | 6.417 | 7.617 | 10.041 | 5.872 |
Maximum | 12.423 | 7.272 | 9.526 | 11.439 | 7.403 |
Minimum | −19.996 | 5.429 | 6.758 | 9.141 | 4.997 |
Std. Dev. | 4.129 | 0.325 | 0.475 | 0.486 | 0.438 |
Skewness | −0.370 | −0.500 | 0.744 | 0.625 | 0.641 |
Kurtosis | 5.895 | 3.891 | 4.352 | 3.273 | 4.271 |
Jarque-Bera | 68.468 | 13.739 | 30.971 | 12.536 | 25.008 |
Probability | 0 | 0.001039 | 0 | 0.001896 | 0.000004 |
Sum | 2.910 | 1182.072 | 1401.443 | 1850.660 | 1091.151 |
Sum Sq. Dev. | 3119.236 | 19.308 | 41.262 | 43.163 | 35.151 |
Observations | 184 | 184 | 184 | 184 | 184 |
Before empirical research, we need to test the stationarity of the data. We use ADF and PP tests to investigate the stationarity of the data, and the test results are shown in
Data stationarity test.
Return | −16.31137 |
−16.31657 |
N95 | −3.686776 |
−3.752247 |
Wuhan | −3.711559 |
−3.418142 |
COVID-19 | −2.856401 |
−2.974697 |
Mask | −3.359155 |
−3.365990 |
The forecast error of financial time series usually has heteroscedasticity. The size of the residual is related to the most recent residual value. Ignoring the effects of ARCH may lead to reduced effectiveness. We use the ARCH LM method to test the ARCH effect.
The null hypothesis of the ARCH LM test is: there is no ARCH effect up to the order in the residual sequence, and the following regression is required:
In formula (2), û
The test results are shown in
LM test results.
Breusch-Godfrey Serial Correlation LM Test: | |||
Null hypothesis: No serial correlation at up to 1 lag | |||
F-statistic | 3.486551 | Prob. F(1,245) | 0.0631 |
Obs*R-squared | 3.535849 | Prob. Chi-Square (1) | 0.0601 |
Since many asset prices are conditionally heteroscedastic, GARCH is widely used in the financial industry. We first estimate the GARCH(
Given that the investor sentiment represented by Baidu search keywords may affect stock returns, we add the Baidu Index variable to the above formula (3) to extend the benchmark model. But we need to first determine the lagging term of the Baidu Index. The estimation results show that selecting the lag period 0 can pass the 1% significance test. This is mainly because the Baidu Index search terms are based on days and are reflected in real-time. In other words, the investor sentiment reflected in the Baidu Index may be reflected in the stock market returns of the day. To this end, we establish the following extended GARCH model including the Baidu Index:
Based on the extended model of Equation (4), we use the one-step forward method to predict the return of epidemic stocks. The entire sample period covers March 4, 2020 to March 9, 2021. We first use the time series data from March 4, 2020 to September 6, 2020 to estimate the model, and then we use the above model to predict the variance on September 7, 2020. We continue to repeat the above steps, using the one-step forward method, to predict the variance of the entire sample period. At the end of the sample period, we can use the prediction error to judge the prediction accuracy of the model.
The prediction error value of a model is the difference between the actual model and the conditional variance of the predicted value. The forecast results are shown in
MSE of the benchmark and extended models.
To accurately calculate the degree of improvement in the prediction accuracy of the extended model relative to the benchmark model, we calculate the root mean square error of the two models. The calculation results show that, compared with the benchmark model, the extended model with the Baidu index reduces its root mean square error by an average of 3.16%. We also divide the entire sample period into a period of high volatility and a period of low volatility. High volatility refers to the period when the conditional variance exceeds the mean. Low volatility refers to periods that are below average. The results of the study show that during periods of high volatility, the number of searches on the Baidu index is also higher. This shows that in turbulent phases of the market, investors increase their information searches to assist in decision-making to avoid risks. This sentiment is directly reflected in the increase in transaction volume. In addition, the calculation results show that the degree of MSE decline in high volatility periods is more significant. Overall, regardless of the period, the extended model including the Baidu Index has greater prediction accuracy than the benchmark model. This shows that the extended model that includes the Baidu index is more suitable for predicting the returns of epidemic stocks. In other words, given the real-time nature of investor retrieval data, investor sentiment can be reflected in epidemic stocks in a timely manner. This has a shorter time lag than the previous official release of data with a significant lagging period on stock prices. Regardless of whether we consider investors or market regulators, including real-time investor sentiment into the prediction model will help improve investment returns or predict the direction of the stock market in real time. Incorporating diversified big data into stock and financial market analysis has become a standard practice.
In stock investment, although professional investors will monitor the leading index, this approach is largely inaccessible to retail investors due to cost and ability. In the raging situation of the COVID-19 epidemic, once a drastic change in the epidemic is discovered, investors may use search engines such as Google and Baidu as a source of information to assist in investment decisions. Epidemic data obtained by using the Baidu index can well represent investor sentiment and concern about epidemic stocks. After searching for relevant information on stocks related to the epidemic, investors may take investment actions immediately or the next day to affect stock returns.
In this paper, we use Baidu index data to measure investors' interest in searching for stock information about the epidemic. We find that during periods of severe market turbulence, investors' attention to the stock market increases substantially. To avoid risks, investors tend to increase the trading volume of stocks, which leads to large fluctuations in stock prices. By incorporating the Baidu index into the extended GARCH model, we find that investor sentiment is reflected in the stock returns of the epidemic in a timely manner. Especially in periods of market turbulence, the model including the Baidu index can significantly improve forecasts. Therefore, investor sentiment reflected in search engine data constitutes a valuable source of information for predicting future volatility. Such real-time information can improve investor returns and help regulators to monitor the trends of the epidemic in real time. This provides information and decision-making reference points for controlling the epidemic and judging the demand for epidemic prevention materials. Under the COVID-19 pandemic, we can also predict the outbreak of the epidemic and the demand for medical supplies in advance by monitoring investors to search for keywords, and provide a scientific basis for the formulation of a vaccine plan.
Publicly available datasets were analyzed in this study. This data can be found here:
BJ: writing—original draft. HZ: writing—review. JZ: investigation and design. CY: draft writing and software. RS: editing. All authors contributed to the article and approved the submitted version.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.