Does business news sentiment matter in the energy stock market? Adopting sentiment analysis for short-term stock market prediction in the energy industry

Lee, Chi-Yuan; Anderl, Eva

doi:10.3389/frai.2025.1559900

BRIEF RESEARCH REPORT article

Front. Artif. Intell., 23 July 2025

Sec. AI in Finance

Volume 8 - 2025 | https://doi.org/10.3389/frai.2025.1559900

This article is part of the Research Topic: Applications of AI and Machine Learning in Finance and Economics

Chi-Yuan Lee

Eva Anderl^*

HM Business School, Hochschule München University of Applied Sciences, Munich, Germany

Characterized by high volatility, the energy stock market provides ample research potential for stock market prediction using machine learning models. This paper investigates using business news as an indicator of market sentiment in Recurrent Neural Networks. The authors adopt a finance-specific Transformer-based model, FinBERT, for news sentiment analysis and use a Long Short-Term Memory (LSTM) model for stock prediction. As prior research indicates that sentiment may vary for different news elements, they specifically explore differences between news headlines and content. Results show that (1) Transformer-based sentiment analysis of business news can improve stock market prediction in the energy industry and that (2) sentiment of news content is more effective than sentiment of news headlines.

1 Introduction

Predicting stock market prices is a well-known challenge in machine learning due to the complex and dynamic nature of financial markets (Rouf et al., 2021). Various data sources have been explored to predict market performance, including technical factors and macroeconomic indicators (Latif et al., 2025) as well as global crises (Nuta et al., 2024). Over the last years, market sentiment, i.e., investors’ emotions toward a specific target affecting their decision-making (Medhat et al., 2014), has gained attention as a valuable predictor. Social media (Herrera et al., 2022; Reboredo and Ugolini, 2018) and news (Gupta and Banerjee, 2019; Li et al., 2021) have been identified as relevant sources influencing investors’ emotions—especially as interaction frequency with online content increases.

Sentiment analysis, a subfield of natural language processing (NLP), makes it possible to computationally determine the emotional tone of digital textual content (Liagkouras and Metaxiotis, 2024), thereby supporting its integration into stock market prediction models. Recently, Transformer models have emerged as a promising advancement in NLP (Rahali and Akhloufi, 2023). Transformer models are neural networks that leverage the self-attention mechanism and have outperformed traditional models across various NLP tasks, including speech recognition and machine translation (Rahali and Akhloufi, 2023). In sentiment analysis, they have proven particularly strong at capturing contextual meaning and complex linguistic patterns (Mishev et al., 2020).

Prior research shows that market sentiment features derived using sentiment analysis can enhance model performance for stock prediction in different industries (Jin et al., 2020; Sarkar et al., 2020) and for market indexes (Li et al., 2014; Shi et al., 2018). Yet, as machine learning approaches for stock prediction are not always directly applicable to other industries (Ebadi et al., 2019), this study aims to find out whether sentiment analysis with novel Transformer-based models can be used for short-term stock market prediction in a highly volatile industry such as the energy sector. Its high volatility makes the energy sector an interesting field of application for predicting stock market movements. According to Moran (2020), the energy sector exhibited the highest volatility compared to the commodities, financial, and technology sectors from 2009 to 2019. This volatility is also reflected in strongly fluctuating annual returns. For instance, the energy sector in the S&P 500 returned −33.68%, 54.64%, 65.72% and −1.33% in 2020, 2021, 2022 and 2023, respectively (US Bank, 2024).

Selected studies have already examined the use of social media sentiment for energy stock prediction (Ben Yahia et al., 2024; Herrera et al., 2022; Reboredo and Ugolini, 2018). However, there is a lack of research on sentiment analysis of general business news in the energy industry. Studies in other domains have shown that business news can have an impact on stock prediction (Deveikyte et al., 2022; Li et al., 2014; Ranco et al., 2016; Sarkar et al., 2020). For example, Deveikyte et al. (2022) find evidence of correlation between sentiment in tweets and news headlines and stock market movements for the largest 100 companies listed on the London Stock Exchange using VADER, a lexicon and rule-based sentiment analysis tool. Yet, to the best of our knowledge, the application of Transformer-based models for sentiment analysis of general business news in the context of industry-specific stock market prediction remains unexplored.

Given that news headlines and news content serve different purposes for both publishers and readers, we propose to analyze them separately, as their sentiment and predictive capability may differ. Especially in a digital environment, news headlines are an important tool to attract readers’ attention. They are supposed to raise the curiosity of readers and entice them to open the article. Prior research has shown that including sentimental words significantly increases the click-through performance of headlines (Kuiken et al., 2017), which might prompt editors to formulate headlines whose sentiment differs from that of the article content.

Taken together, this paper will analyze the following research questions to investigate the applicability of Transformer-based sentiment analysis of business news in the energy industry:

1. Can Transformer-based sentiment analysis of business news improve short-term stock market prediction in the energy industry?

2. Do news headlines and content differ in their predictive capabilities?

2 Methods

To address our research questions, we conduct a quantitative study using real-world data. Our dataset comprises approximately 10 years of data from 7 energy sector companies. We employ FinBERT, a finance-specific Transformer-based model, for news sentiment analysis and utilize a Long Short-Term Memory (LSTM) model for stock price prediction.

2.1 Data collection

For our research, we select 7 energy companies listed on the New York Stock Exchange (NYSE) with top-ranking capital values according to the S&P Energy Index in November 2022: Exxon Mobil (XOM), Chevron Corporation (CVX), ConocoPhillips (COP), EOG Resources (EOG), Schlumberger N. V. (SLB), Occidental Petroleum Corp (OXY), and Pioneer Natural Resources Company (PXD). The research time interval is set from January 2013 to November 2022, covering 9 years and 11 months in total. This time interval is considerably longer than the average time period covered in prior stock market prediction studies (Kumar et al., 2021), allowing for a comprehensive analysis.

Stock data and business news are collected separately. Stock data are collected via the Yahoo Finance open source API (GitHub, 2023) with all features in numerical form. The stock data contain six basic features from historical trades, namely open, high, low, close, adjusted close, and volume. Open refers to the opening price of the trading day, high to the highest price within the trading day, low to the lowest price of the trading day, and close to the closing price of the trading day. Adjusted close refers to the closing price adjusted for dividend payouts on a certain date, and volume refers to the number of shares traded during the trading day.

The collection of textual data is more complex. We design a web scraping program to collect news pages within the research time interval automatically. The program extracts the publication timestamps, publishers, news headlines, and news content from 10 selected news publishers with widespread readership covering global financial markets and corporate affairs (Bloomberg, CNBC, The Economist, Financial Times, Forbes, The New York Times, Reuters, The Wall Street Journal, The Washington Post, and Yahoo Finance). In total, we collect 18,254 news headlines and corresponding URLs from Google News (2025) using Python. Utilizing the requests library, we then retrieve the full HTML content for 17,862 of these articles, resulting in a dataset comprising 17,862 complete data points.

2.2 Data preprocessing

In the data preprocessing stage, stock data and business news are subjected to different processing approaches. For stock data, we select close, open, and volume as the basic information for each stock. We normalize close, open, and volume to the range between 0 and 1 separately for each stock. We then aggregate the 7 normalized stocks into an index close, index open, and index volume. The purpose of creating an index is to simplify the analysis and to compensate for the lack of data for individual company stocks. Finally, we create three extra features, percent, diff, and fluctuation, to represent the differences between index close and index open.
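The normalization and aggregation step can be illustrated with a minimal sketch. It assumes that the index is the day-by-day mean of the normalized series (the text does not state whether a mean or a sum is used) and uses made-up toy prices; the function names `min_max` and `build_index` are hypothetical. The derived features percent, diff, and fluctuation are omitted because their exact definitions are not given.

```python
def min_max(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_index(series_by_ticker):
    """Normalize each stock separately, then average across stocks per day.

    Averaging is an assumption; the source only says the stocks are aggregated.
    """
    normalized = [min_max(series) for series in series_by_ticker.values()]
    n_days = len(normalized[0])
    return [sum(s[d] for s in normalized) / len(normalized) for d in range(n_days)]


# Toy closing prices (illustrative values, not real market data).
closes = {
    "XOM": [60.0, 62.0, 61.0, 64.0],
    "CVX": [100.0, 110.0, 105.0, 120.0],
}
index_close = build_index(closes)
print(index_close)
```

Normalizing per stock before aggregating keeps a high-priced stock from dominating the index purely because of its price level.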

For business news, we first remove duplicate entries based on identical URLs, followed by the elimination of HTML codes and irrelevant information such as advertisements, suggested readings, article information, and unrelated context. To identify passages relevant to our selected companies, we tokenize each article into individual sentences. For each sentence containing the target keyword, we extract the surrounding five sentences. After removing overlapping segments to avoid duplication, the resulting snippets are used as input for sentiment analysis.
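A minimal sketch of the snippet extraction follows. It rests on several assumptions that the text leaves open: sentences are split with a naive regular expression rather than a dedicated tokenizer, "the surrounding five sentences" is read as the matching sentence plus two sentences on each side, and overlapping or touching windows are merged into one snippet. The function names are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace.

    Abbreviations such as 'N. V.' would be split incorrectly; a real
    sentence tokenizer is preferable in practice.
    """
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_snippets(text, keyword, radius=2):
    """Return merged sentence windows around each mention of `keyword`.

    radius=2 gives a five-sentence window (assumed reading of the text).
    """
    sentences = split_sentences(text)
    hits = [i for i, s in enumerate(sentences) if keyword.lower() in s.lower()]
    windows = []  # list of [start, end) index pairs
    for i in hits:
        start = max(0, i - radius)
        end = min(len(sentences), i + radius + 1)
        if windows and start <= windows[-1][1]:
            # Overlaps or touches the previous window: extend it instead.
            windows[-1][1] = max(windows[-1][1], end)
        else:
            windows.append([start, end])
    return [" ".join(sentences[a:b]) for a, b in windows]


article = "One. Two Chevron. Three. Four. Five. Six. Seven. Eight. Nine Chevron. Ten."
for snippet in extract_snippets(article, "Chevron"):
    print(snippet)
```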

To perform the sentiment analysis, we use FinBERT (Huang et al., 2023), a specialized financial language model based on BERT (Bidirectional Encoder Representations from Transformers), designed for natural language processing tasks in finance. FinBERT has been trained on financial texts, such as analyst reports, financial news, and SEC filings, to enhance its understanding of financial terminology and sentiment. It has been shown to substantially outperform other machine learning algorithms in sentiment classification (Huang et al., 2023). Prior researchers have already successfully used FinBERT in sentiment analysis of summarized news extracted from The New York Times (Kim et al., 2023).

Using the five-sentence snippets as input, the final layer of FinBERT produces a softmax output, yielding a probability distribution across three sentiment classes: positive, negative, and neutral. The sum of the three categories equals 1.¹ Using this classification, we create six aggregate sentiment features for each day: headline_positive, headline_negative, headline_neutral, content_positive, content_negative, and content_neutral.
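The softmax output and the daily aggregation can be sketched as follows. The sketch does not call FinBERT itself; it starts from hypothetical raw class scores. The label order and the use of the daily mean as the aggregate are assumptions, since the text specifies neither.

```python
import math
from collections import defaultdict

# Assumed label order; the order of a real model's outputs may differ.
LABELS = ("positive", "negative", "neutral")


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def daily_features(scored, prefix):
    """Average class probabilities per day (mean aggregation is an assumption).

    `scored` is a list of (day, [p_positive, p_negative, p_neutral]) pairs;
    `prefix` is 'headline' or 'content'.
    """
    by_day = defaultdict(list)
    for day, probs in scored:
        by_day[day].append(probs)
    return {
        day: {
            f"{prefix}_{label}": sum(row[k] for row in rows) / len(rows)
            for k, label in enumerate(LABELS)
        }
        for day, rows in by_day.items()
    }


# Hypothetical scores for two snippets published on the same day.
scored = [
    ("2022-11-01", softmax([2.0, 0.5, -1.0])),
    ("2022-11-01", softmax([-0.5, 0.2, 1.5])),
]
print(daily_features(scored, "content"))
```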

Because 1,373 samples have no news sentiment available for their date, our final training dataset consists of 2,497 samples with 11 features and 1 target variable (i.e., index_close). Table 1 provides an overview of the features in our final dataset.

Table 1. Descriptive statistics of features.

As shown in Table 1, neutral sentiment dominates in both headlines and content. In headlines, negative sentiment exceeds positive sentiment, whereas in content, positive sentiment is more prevalent than negative. In addition, we observe that the distributions of the sentiment features are all left-skewed, whereas the stock price features are more symmetric.

After preparing the dataset, we create five subsets of features for performance comparison between models with and without sentiment features (see Table 2). Our baseline is the subset only including the stock price features. Subset 1 contains stock price features and content sentiment features. Subset 2 contains stock price features and headline sentiment features. Subset 3 includes stock price features, content sentiment features, and headline sentiment features. Subset 4 contains only headline and content sentiment features.

Table 2. Subset details.
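The five feature sets can be written down as a simple configuration. Only the six sentiment names and the target index_close appear in the text; the stock feature column names used here are hypothetical. The sketch also assumes that index_close serves as the target only, which matches the stated count of 11 features but is not confirmed by the text.

```python
# Hypothetical column names for the stock price features.
STOCK = ["index_open", "index_volume", "percent", "diff", "fluctuation"]
# Sentiment feature names as given in the text.
CONTENT = ["content_positive", "content_negative", "content_neutral"]
HEADLINE = ["headline_positive", "headline_negative", "headline_neutral"]

TARGET = "index_close"

FEATURE_SETS = {
    "baseline": STOCK,
    "subset_1": STOCK + CONTENT,
    "subset_2": STOCK + HEADLINE,
    "subset_3": STOCK + CONTENT + HEADLINE,
    "subset_4": HEADLINE + CONTENT,  # sentiment only, no stock price features
}

for name, columns in FEATURE_SETS.items():
    print(name, len(columns))
```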

2.3 Model architecture and training

A popular approach in stock-market prediction is using Recurrent Neural Networks (RNNs), which have become state-of-the-art models for a variety of machine learning problems (Greff et al., 2017). Specifically, RNNs with Long Short-Term Memory are a frequently used tool in time series forecasting (Guo, 2020; Jin et al., 2020; Sarkar et al., 2020). As LSTMs have been successfully used for stock prediction with market sentiment (Bhandari et al., 2022; Fischer and Krauss, 2018; Sarkar et al., 2020; Yadav et al., 2020), we adapt this approach and use an LSTM model.

We employ an architecture of five stacked LSTM layers with 50 hidden units each. Every LSTM layer is followed by a dropout layer with a dropout rate of 0.2. A dense layer with a single output unit is added at the end of the model. The hyperparameters were determined using a grid search.
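To give a sense of the size of such a model, the following sketch counts its trainable weights with the standard LSTM formula: each layer has four weight blocks (input, forget, and output gates plus the cell candidate), each with input weights, recurrent weights, and a bias. The input dimension of 11 is an assumption based on the feature count of the full feature set, and the function names are hypothetical. Dropout layers add no weights.

```python
def lstm_params(input_dim, units):
    """Trainable weights of one LSTM layer.

    Four weight blocks, each with units * input_dim input weights,
    units * units recurrent weights, and units biases.
    """
    return 4 * (units * (input_dim + units) + units)


def stacked_lstm_params(n_features, units=50, n_layers=5):
    """Total weights of the stacked LSTM plus the single-unit dense layer."""
    total, input_dim = 0, n_features
    for _ in range(n_layers):
        total += lstm_params(input_dim, units)
        input_dim = units  # later layers read the previous layer's output
    return total + (units + 1)  # dense layer: one weight per unit plus a bias


# 11 input features is an assumption taken from the stated feature count.
print(stacked_lstm_params(11))
```

Under these assumptions the first layer contributes 12,400 weights, each of the four following layers 20,200, and the dense layer 51, for a total of 93,251.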

For model training, we split the whole dataset into training, validation, and test datasets (70, 15, and 15%). Since we are dealing with time-series data, we do not shuffle the samples, so that their chronological order is preserved. Each model receives a window of 60 consecutive time steps as input and predicts the target variable at the 61st time step. To prevent overfitting, we use a checkpoint to retain the epoch with the lowest validation loss. Finally, we use the test data as an out-of-sample dataset to evaluate the final results. Because of the stochastic nature of LSTMs, model performance might vary even with the same hyperparameters. To examine and compare the true performances, we retrain and evaluate each model 100 times to collect the performance distributions. As a robustness check, we add an alternative implementation using XGBoost (Chen and Guestrin, 2016), a widely used machine learning algorithm.
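The chronological split and the windowing described above can be sketched as follows. Whether windows are built before or after the split is not stated; this sketch simply provides both helpers. The function names are hypothetical, and the toy data are integer placeholders.

```python
def chronological_split(rows, train_pct=70, val_pct=15):
    """Split rows into train/validation/test without shuffling."""
    n = len(rows)
    a = n * train_pct // 100
    b = n * (train_pct + val_pct) // 100
    return rows[:a], rows[a:b], rows[b:]


def make_windows(features, target, lookback=60):
    """Pair each run of `lookback` consecutive rows with the next target value.

    Rows 0..59 predict target[60], rows 1..60 predict target[61], and so on.
    """
    X, y = [], []
    for end in range(lookback, len(features)):
        X.append(features[end - lookback:end])
        y.append(target[end])
    return X, y


# Toy data: 100 time steps with a single placeholder feature.
features = [[t] for t in range(100)]
target = list(range(100))

train, val, test = chronological_split(features)
X, y = make_windows(features, target)
print(len(train), len(val), len(test), len(X))
```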

3 Results

For model evaluation, we use Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). A review of existing research indicates that MSE is most frequently employed for stock market prediction, closely followed by Accuracy and MAE (Ketsetsis et al., 2020). Another systematic literature review found that RMSE, MAPE, and MSE were the most frequently employed metrics (Nti et al., 2020).
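For reference, the four metrics can be computed as in the sketch below, which uses their standard definitions; the example values are made up for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent; undefined if a true value is 0."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the same unit as the target."""
    return math.sqrt(mse(y_true, y_pred))


# Illustrative values only.
y_true = [2.0, 4.0]
y_pred = [1.0, 5.0]
print(mae(y_true, y_pred), mape(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred))
```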

As we retrain each model 100 times, we obtain a performance distribution for each subset. We also apply statistical tests, calculating the z-score and p-value for MAE, to examine whether using the sentiment features leads to significant differences.
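One common way to obtain a z-score and p-value for two sets of repeated runs is a two-sample z-test on the mean MAE, sketched below. The text does not specify the exact test variant, so the unpooled standard error and the two-sided p-value are assumptions; the function name and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def two_sample_z(a, b):
    """z-score and two-sided p-value for the difference in means of a and b.

    Uses the unpooled standard error; reasonable for large samples such as
    100 runs per model (the two-sided alternative is an assumption).
    """
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p


# Made-up MAE values from repeated runs of two models.
mae_with_sentiment = [0.10, 0.11, 0.12, 0.10, 0.11]
mae_baseline = [0.12, 0.13, 0.12, 0.14, 0.13]
z, p = two_sample_z(mae_with_sentiment, mae_baseline)
print(round(z, 2), round(p, 4))
```

A negative z-score here means that the first model has the lower mean MAE.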

As shown in Table 3, for the LSTM, average performances in subset 1 (stock price features + news content) are better than in the baseline relying only on stock price features. The model performs better in subset 3 (stock price features + news content + news headlines) than in subset 2 (stock price features + news headlines), whereas in subset 4 (news content + news headlines), which includes only sentiment scores and no stock price features, the model does not perform well.

LSTM and XGBoost achieve similar predictive accuracy (MAE, MAPE, MSE, and RMSE). Yet, interestingly, XGBoost seems to rely less on sentiment features, with the baseline model outperforming the other models. However, the ranking of sentiment-based models (subset 1 performing better than subsets 2 and 3) remains consistent with the LSTM results across all metrics, supporting the robustness of our results.

Table 3. Model performance distributions.

To compare performance differences among the subsets, we use statistical tests to verify if the sentiment features bring a significant improvement in average LSTM performance. We compare the mean model performance in the baseline and the mean model performance for other subsets using MAE, whose scale depends on the scale of the data, thus making it easy to interpret if applied to a single dataset (Hyndman and Koehler, 2006). Table 4 shows the statistical test results for the LSTM models.²

Table 4. Statistical test results (LSTM).

As Table 4 shows, the performance improvements over the baseline in subset 1 (p = 0.001) and subset 3 (p = 0.038) are statistically significant. Thus, we conclude that for the LSTM, subset 1 (stock price features and news content) and subset 3 (stock price features, news content, and news headlines) significantly outperform the baseline.

4 Discussion

In this paper, we examine whether business news sentiment features created with the Transformer-based model FinBERT can be used for stock price prediction in the energy industry. Combining stock price features with news sentiment features significantly improves the average predictive performance of LSTM models. Subset 1 (stock price features and content sentiment features) is the most effective combination. In comparison, headline sentiment features seem to be less effective.

Our findings have several important theoretical implications. First, we show that sentiment analysis of business news can successfully be applied in the energy sector. While social media sentiment has been used for energy stock prediction (Ben Yahia et al., 2024; Herrera et al., 2022; Reboredo and Ugolini, 2018), general business news has not been tested in the energy sector. Second, we provide an example of how domain-specific Transformer models can be applied in sentiment analysis. While FinBERT has already been shown to outperform more established analysis approaches regarding classification accuracy of central bank communication (Kim et al., 2024), applications in an industry-specific context are still rare. As the data preparation steps taken are not specific to the energy industry, our paper provides a blueprint of how to apply sentiment analysis of general business news for stock market prediction in different industry settings. Third, we show that news content and news headlines differ in their predictive ability. A potential explanation is that news headlines might include exaggerated sentiment to catch readers’ attention (Rieis et al., 2021). Prior studies support the existence of an incongruity between headlines and content (Deveikyte et al., 2022; Yoon et al., 2019; Yoon et al., 2021). Yet, unlike Deveikyte et al. (2022), our study finds that for LSTMs, news content is more effective than news headlines. This divergence might be explained by three reasons: (1) In contrast to Deveikyte et al. (2022), our study uses the same data source for headlines and content, thus eliminating the risk of potential differences in the datasets. (2) We analyze general business news rather than financial news and focus on the energy industry. Writing styles and thus sentiment distribution might differ between these news types and across industries. (3) The importance of features may differ between traditional machine learning models (such as XGBoost) and deep learning models (Lai et al., 2019). Collectively, our findings thus highlight the importance of analyzing headlines and content independently and show a need for more differentiated research. Content features, which require more effort to extract and are thus less commonly utilized in existing research, provide additional value and should not be neglected.

From a practical perspective, our study has implications for several groups interested in stock market prediction. While adopting sentiment features as the only feature set is not sufficient to predict stock prices (as in subset 4), Transformer-based sentiment analysis of business news can help to improve stock prediction performance in the energy industry, especially when analyzing headlines and content separately. Quantitative traders can thus introduce business news sentiment features generated by domain-specific Transformer models in trading models. Additionally, this research also points out a risk of stock market manipulation using news sentiment. Regulatory authorities should monitor this closely and might need to update policies and regulation accordingly.

Like all research, our study is subject to several limitations. First, we collected data from 10 news publishers and 7 energy companies from January 2013 to November 2022. Even though this provides us with sufficient data to conduct our analyses, we suggest that future studies expand the scope of data sources and extend the time interval. Second, our research could be expanded by testing other types of stock prediction such as percentage of return, stock picking, and, potentially, a swing trade strategy. Third, we focused on one LSTM specification and on FinBERT as a single sentiment analysis model. We suggest that future researchers adopt other models and apply different sentiment analysis approaches as outlined by Liagkouras and Metaxiotis (2024) to investigate the strengths and weaknesses of different approaches.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Author contributions

C-YL: Writing – original draft, Writing – review & editing. EA: Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. C-YL was supported by the Oskar-Karl-Forster Büchergeldstipendium by Hochschule München.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Footnotes

1. For details about the implementation of FinBERT, please refer to Liu et al. (2021).

2. For XGBoost, a significance test is not possible.

References

Ben Yahia, S., Garcia Sanchez, J. A., and Kaffel, R. H. (2024). Impact of sentiment analysis on energy sector stock prices: a FinBERT approach [Preprint].


Bhandari, H. N., Rimal, B., Pokhrel, N. R., Rimal, R., Dahal, K. R., and Khatri, R. K. (2022). Predicting stock market index using LSTM. Mach. Learn. Appl. 9:100320. doi: 10.1016/j.mlwa.2022.100320