A prediction method for consumer online purchasing behavior based on big data analysis

Wen, Yuanyuan; Liu, Lei

doi:10.3389/fphy.2026.1686157

ORIGINAL RESEARCH article

Front. Phys., 02 February 2026

Sec. Social Physics

Volume 14 - 2026 | https://doi.org/10.3389/fphy.2026.1686157

Yuanyuan Wen¹

Lei Liu²*

¹College of Computer Science and Engineering, Taizhou Institute of Science and Technology, NJUST, Taizhou, China
²College of Business, Taizhou Institute of Science and Technology, NJUST, Taizhou, China

With the rapid development of e-commerce, data on consumer online purchasing behavior are growing massively and becoming increasingly diverse. Accurately predicting purchasing behavior through big data analysis has become key to improving the quality and efficiency of consumer services. This paper introduces deep learning into purchase prediction research and proposes an entity embedding-convolutional neural network-convolutional block attention module (EE-CNN-CBAM) model for predicting consumer online purchasing behavior. Entity embedding (EE) transforms high-cardinality categorical variables into low-dimensional dense vectors, reducing the computational cost of big data. With a convolutional neural network (CNN) as the core, local association patterns are extracted from user behavior sequences to capture implicit features of consumer online purchasing behavior. In addition, characteristic indicators of purchasing behavior patterns are constructed from the time series data of consumer online purchases. The convolutional block attention module (CBAM) adjusts channel attention adaptively, allowing the model to prioritize and reinforce the expression of important purchasing behavior features. The experimental results show that EE-CNN-CBAM improves prediction accuracy on large-scale consumer online purchase datasets, providing effective support for consumer behavior prediction in big data environments.

Highlights

• This paper proposes an entity embedding-convolutional neural network-convolutional block attention module (EE-CNN-CBAM) model for predicting consumer online purchasing behavior.

• By conducting multiple simulation experiments, it is verified that EE-CNN-CBAM can effectively provide more reliable big data analysis support for predicting purchasing behavior.

1 Introduction

With the deep penetration of the digital economy, e-commerce has become the core carrier of Chinese residents’ consumption [1]. Behind this trend is the transformation of online shopping from a “supplementary channel” to a “mainstream scenario”. Consumers not only complete purchases through e-commerce platforms, but also leave massive behavioral traces throughout the entire process of browsing, comparing, consulting, and evaluating. These data cover basic user attributes (age, gender, region), interactive behaviors (click sequences, browsing time, add-to-cart actions), transaction records (purchase frequency, average order value, payment methods), and time characteristics (purchase period, stay time, repeat purchase cycle). Together they constitute a large-scale, multi-dimensional pool of consumer behavior big data.

Accurately predicting consumer purchasing behavior has become key for enterprises seeking to improve conversion efficiency and optimize the service experience. For online platforms, accurately judging consumers’ online purchasing intentions enables personalized recommendation (pushing target products to high-intention users), dynamic pricing (offering discounts to price-sensitive users), and inventory allocation (stocking early based on predicted demand). For brand owners, purchasing behavior prediction supports precision marketing such as targeted coupon distribution, product iteration in which functions are adjusted to user preferences, and channel optimization that concentrates advertising in high-conversion periods. For consumers, an efficient prediction system reduces the interference of irrelevant information and improves the efficiency and security of shopping decisions [2].

However, the complexity of consumer online purchasing behavior and the characteristics of the data make accurate prediction difficult. From the perspective of behavior itself, purchasing decisions are influenced by multiple coupled factors. These include not only the degree of match between product attributes (price, brand, evaluation) and user preferences (category orientation, quality sensitivity), but also scenario factors such as promotional activities and seasonal changes, and social influences such as social recommendations and influencer sales. As a result, purchasing behavior is non-linear and dynamic. From a data perspective, consumer behavior big data is high-dimensional, with a total feature dimension exceeding one billion at the scale of millions of users. Another characteristic is high sparsity, because user behavior is strongly selective. In addition, the data cover structured data such as user IDs and commodity classifications, semi-structured data such as browsing path logs, and unstructured data such as evaluation text and consultation dialogues. A traditional single model struggles to fuse such multimodal features.

Big data analysis technology has the ability to mine deep features, and with the help of deep learning models such as neural networks and attention mechanisms, it can capture implicit associations from sparse data, and cope with dynamic changes in purchasing intentions, such as sudden increases in user intentions after promotions begin [3]. These abilities make it possible to extract core predictive signals from complex behavioral data and construct high-precision predictive models. The research on predicting consumer purchasing behavior can be traced back to the traditional retail era. In the early days, statistical methods [4] were used to analyze the purchasing patterns of limited samples, such as exploring the relationship between price and sales volume through regression models. With the rise of online consumption and data accumulation, research has gradually shifted towards machine learning methods, resulting in rich achievements in recent years.

In terms of predictive model construction, research has evolved from traditional machine learning to deep learning. Among traditional methods, logistic regression is widely used for purchase probability prediction because of its strong interpretability, for example to predict users’ purchasing behavior after clicking on advertisements, but it has difficulty capturing nonlinear relationships [5]. Random forests and gradient boosting decision trees (GBDT) improve accuracy through ensemble strategies and perform excellently on structured data. The rise of deep learning has provided new tools for processing sequential data and high-dimensional features. Recurrent neural networks (RNNs) and variants such as LSTM and GRU capture temporal dependencies of behavior, such as the association between browsing and purchase sequences, through memory mechanisms [6–8], which improves accuracy in time series prediction compared with traditional methods. CNNs extract local behavioral patterns through convolution operations, such as clicking on products of the same category three times in a row. The introduction of attention mechanisms further enhances the model’s ability to focus on key features, enabling it to automatically attend to behaviors that signal strong intent and thereby improving prediction accuracy.

The core value of deep learning in predicting consumer online purchasing behavior lies in its ability to model nonlinear relationships and learn features automatically. Traditional machine learning relies on manual feature engineering, while deep learning can automatically transform raw data into higher-order features through multi-layer neural networks. This end-to-end feature learning mechanism is particularly suitable for capturing implicit associations in consumer big data. When trained on millions of samples, the model can autonomously discover patterns that are difficult for humans to detect.

In recent years, hybrid models have become a research hotspot in the field of predicting consumer online purchasing behavior due to their adaptability to multi-source data and advantages in prediction accuracy [9]. This type of model achieves synergistic efficiency by integrating the core advantages of different algorithms. For example, when integrating LSTM, which excels in capturing temporal dependencies, with attention mechanisms that have dynamic focusing capabilities, the attention mechanism can strengthen its focus on key decision nodes by calculating the associated weights of each node in the behavior sequence based on LSTM’s long-term modeling. Specifically, when there are sudden signals in the user behavior sequence, such as concentrated clicks during promotional periods or peak purchases before major promotions, the attention mechanism will automatically increase the weight proportion of these nodes. It accurately locates the core behaviors that affect purchasing, avoiding feature dilution caused by long time series. This fusion model has improved prediction accuracy compared to a single LSTM when processing user behavior data across weeks and months [10], especially in identifying purchase intentions in promotional scenarios.

However, the deployment of existing hybrid models in actual online consumption scenarios still faces multiple real-world challenges. Firstly, the processing efficiency for large-scale real-time data is insufficient. The user behavior data generated daily by e-commerce platforms can reach billions of records, covering dozens of features such as browsing clicks, add-to-cart actions, favorites and payment conversions. The multi-layer network structure and complex computational logic of hybrid models can easily lead to feature processing delays. Feature calculation for a single user can take up to 50 milliseconds, making it difficult to meet real-time requirements under concurrent requests and directly affecting the instant conversion opportunities of high-intention users. Secondly, the mining of time patterns in behavioral data is still insufficient. The temporal characteristics of user purchasing behavior are highly dynamic and multi-granular, including both long-term stable habitual preferences and temporary changes driven by short-term scenarios. Existing hybrid models often use fixed time windows to extract features, which makes it difficult to adapt to such dynamic changes. The model cannot capture the migration of temporal preferences in a timely manner, so prediction results lag behind actual behavioral patterns. At the same time, most models only handle time granularity at the day or hour level and fail to explore fine-grained minute-level patterns, although the behavioral differences in such high-frequency periods are often key signals for distinguishing high-intention users.

In the field of time feature mining, early research mostly focused on static time attributes and ignored the temporal patterns of behavior. Moreover, existing research has not combined long-term and short-term patterns, such as the mutual influence between short-term promotions and long-term habits, which leads to prediction deviations when scenarios change, for example during major promotions. This paper aims to provide models and methods for predicting consumer purchasing behavior in the big data environment, and to promote the deep integration of theoretical research and online consumption practice. The main contributions are as follows.

1. This paper proposes EE-CNN-CBAM, which consists of entity embedding, a convolutional neural network and a convolutional block attention module. EE is used to improve the performance of the CNN on large structured datasets, while the CNN is mainly responsible for the final prediction of consumer online purchasing behavior.

2. The introduction of CBAM enables precise capture of key features in massive consumer behavior data. It dynamically assigns differentiated weights to different features or sequence positions, strengthening the focus on core features that are strongly related to purchasing decisions while weakening redundant information such as accidental browsing of non-target categories.

3. By conducting multiple simulation experiments, it is verified that EE-CNN-CBAM can effectively provide more reliable big data analysis support for predicting purchasing behavior.

The rest of this paper consists of four parts. Section 2 reviews the related literature. Section 3 provides a detailed introduction to EE-CNN-CBAM for consumer online purchasing behavior prediction, which combines big data analysis. Section 4 designs comparative experiments against multiple baselines for testing and analysis. Finally, Section 5 summarizes the paper.

2 Literature review

The application of machine learning technology in e-commerce has become quite mature. Especially in the field of commodity price prediction, machine learning algorithms integrate and process multi-dimensional correlated features such as multi-source time-series data, market dynamics and consumer behavior to construct nonlinear prediction models, achieving dynamic prediction of price fluctuations. [11] conducted research on consumer purchase prediction in non-contractual environments by establishing a machine learning framework based on consumer purchase data. [12] proposed the Bayesian personalized ranking (BPR) framework based on the premise that browsing time within a game app was directly proportional to the likelihood of purchasing products. By conducting experiments on a game app dataset, they improved the accuracy of predicting players’ in-app purchases. [13] used a logistic regression model to adjust time-related features based on an e-commerce dataset, and compared consumers’ purchasing behavior for products of different brands in two scenarios, marketing campaigns and daily sales. [14] used a support vector machine algorithm to mine the activity records of Twitter users and predicted which consumers purchased digital cameras and personal computers. [15] proposed a two-stage consumer purchase model, COREL, in which the first stage established the connection between consumers and products and the second stage predicted which associated products consumers were more likely to purchase. They then conducted an empirical study of purchase prediction on the JD consumer purchasing dataset. The results showed that COREL could effectively calculate product popularity and accurately predict customer purchasing behavior. [16] added RFM variables to the association rules of a recommendation algorithm based on a dataset of online cosmetics purchases, enabling the algorithm to dynamically predict consumer purchasing behavior. [17] used neural networks and decision tree algorithms to predict customers’ purchases of shopping cart items based on consumer online clickstream data and demographic data. [18] included consumers’ stay time in their purchase prediction study and compared the predictive performance of support vector machine (SVM), logistic regression (LR) and other algorithms on RFID datasets. The results confirmed that SVM achieved better predictive performance.

With the development of deep learning methods, temporal modeling techniques have made breakthrough progress in the field of price prediction. [19] proposed a stock price trend prediction model called LSR-IGRU, which significantly improved the accuracy of stock trend prediction by integrating multi-scale temporal features and enhancing gating mechanisms. [20] conducted predictive research on consumer purchasing and product classification on a large retail sales dataset by constructing LDA and MDM models. [21] proposed a Time Series Retrieval Augmented Generation method (TimeRAG), which combined retrieval-augmented generation and large language models to improve the accuracy of time series prediction, especially in the field of quantitative trading. [22] proposed a product image feature selection model that integrates CNNs and attention mechanisms, and combined it with an improved probability unit model and a consumer selection model. They determined the optimal pricing strategy through nonlinear constrained programming to adapt to different market environments and changes in consumer characteristics. [23] constructed a feature combination deep learning framework (FC-LSTM) based on consumer purchase history data and demographic data to predict consumer purchase decisions. [24] proposed a novel deep learning algorithm based on customer purchasing behavior, namely the Weight Optimized Long Short Term Memory Network (WOLSTM), for dynamic pricing on e-commerce platforms. [25] proposed a real-time online user purchase prediction model consisting of two modules, an MLP and an LSTM. They used this model to conduct a two-stage purchase prediction study on user online browsing data, clickstream data and demographic data. [26] proposed a new price prediction model that predicts the sales price of goods from news events and improved the accuracy of price prediction. [27] proposed a dynamic pricing model based on linear regression, which predicted the optimal price of agricultural products through real-time analysis of market supply, helping farmers cope with price fluctuations and maximize profits at low cost.

3 Research on consumer online purchasing behavior analysis and prediction for big data analysis

3.1 Overview of EE-CNN-CBAM combined with big data

This paper constructs EE-CNN-CBAM for predicting consumer online purchasing behavior, which integrates EE, CNN and CBAM modules. Its core advantage lies in its suitability for processing multi-source consumption data in big data environments. As the core component, the CNN mines local association patterns from massive consumer behavior sequences, such as product browsing trajectories and click timing, and captures implicit features such as adding items to the shopping cart after continuous browsing [28]. EE generates low-dimensional vectors for high-cardinality categorical variables, such as product categories and user labels. It not only alleviates the curse of dimensionality caused by traditional encoding, but also reveals the inherent relationship between user preferences and product attributes through the semantic association of vectors, meeting the dimensionality reduction needs of structured consumer big data.

To address the performance limitations of deep learning on large-scale structured data, the model combines EE with CNN to jointly model structured categorical features and unstructured behavior sequence data, fully utilizing the complementary value of multimodal consumption data. The attention mechanism strengthens the model’s ability to capture key patterns in historical data. CBAM dynamically focuses on key information among massive features through a two-stage refinement with channel and spatial attention. Channel attention assigns weights to feature dimensions such as product price and discount strength, strengthening features strongly correlated with purchase. Spatial attention captures the behavior patterns at key time nodes, such as concentrated browsing and stays on detail pages during promotional periods, avoiding the feature dilution caused by excessive data size. Overall, EE-CNN-CBAM addresses the high-dimensional, multimodal and noisy characteristics of consumer behavior big data by processing high-dimensional structured data through EE, extracting sequence features through CNN and focusing on key patterns through CBAM. This provides accurate big data analysis support for purchasing behavior prediction. The structure of EE-CNN-CBAM is shown in Figure 1.

Figure 1

Flowchart illustrating a consumer online purchasing behavior prediction model. Variables undergo embedding, then merge into a CBAM with spatial and channel attention modules. The process continues through convolutional layers, max pooling, and flattening. It proceeds to fully connected layers with a dropout layer, ultimately predicting consumer behavior.

Figure 1. EE-CNN-CBAM network structure diagram.

As shown in Figure 1, the input layer of EE-CNN-CBAM includes the EE layer. Initially, each variable is mapped to its corresponding embedding layer through EE. Unstructured text data is a key carrier of predictive signals, so its preprocessing and embedding need to be designed around its characteristics. Evaluation texts are 5–200 words long, with fragmented and colloquial language that expresses emotions, product attributes and purchase intentions; their noise includes emoticons, system tags and meaningless short sentences. Consultation dialogues consist of multiple rounds of interaction, each of 2–50 words, and contain contextual and decision-oriented information; their noise includes customer service prompts, spoken abbreviations and idle chat, while temporal associations need to be retained. Preprocessing follows a three-level process of data cleaning, semantic normalization and noise refinement. In the data cleaning stage, non-textual symbols are removed from evaluation texts using regular-expression rules, and overly short and overly long texts are handled; for consultation dialogues, irrelevant interactions are filtered out and fragmented rounds are completed. In the semantic normalization stage, Chinese and English formats and expressions are unified, followed by word segmentation and stop-word filtering using a custom e-commerce dictionary. A context window and a rule library are used to resolve the ambiguity of polysemous words and to complete elliptical expressions in dialogues. In the noise refinement stage, core information in evaluation texts is retained with the help of a sentiment dictionary, while intent classification models and entity recognition are used to extract purchase-related content from consultation dialogues.

The EE module converts high-cardinality categorical variables such as user ID and product classification into 256-dimensional vectors. After dimension alignment, the text embeddings are concatenated with the original model input in the feature fusion layer. Subsequently, all embedding layers are merged into a fusion layer, which is suitable for structured data and serves as the input of the subsequent CNN. Two convolutional layers (convolution layer1 and layer2) extract consumer online purchasing behavior features through convolution operations. Then, the max pooling layer, merging layer and flatten layer are mainly used to reduce data dimensionality. The neurons in fully connected layer1 connect all features produced by the convolutional and pooling layers to obtain non-linear combinations of higher-level consumer online purchasing behavior features. Fully connected layer2 in the output layer outputs the predicted probability score of a consumer online purchase. In addition, to improve the generalization of the network and avoid overfitting, Dropout is introduced, which randomly deactivates half of the feature detectors in each training step.
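The lookup-and-concatenate step performed by the EE and fusion layers can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `EntityEmbedding`, the toy categories and the 4-dimensional vectors (instead of the 256 dimensions mentioned above) are assumptions chosen for readability, and the table is only randomly initialized, whereas in a trained network its entries would be learned by backpropagation.

```python
import random

class EntityEmbedding:
    """Hypothetical lookup table mapping each category to a dense vector."""

    def __init__(self, categories, dim, seed=0):
        rng = random.Random(seed)
        # Assign every distinct category a row index.
        self.index = {c: i for i, c in enumerate(sorted(set(categories)))}
        # Random initialization; a real model would train these values.
        self.table = [[rng.gauss(0.0, 0.05) for _ in range(dim)]
                      for _ in self.index]

    def __call__(self, category):
        return self.table[self.index[category]]

# Toy high-cardinality variables (assumed names, not from the paper's data).
user_emb = EntityEmbedding(["u1", "u2", "u3"], dim=4)
cat_emb = EntityEmbedding(["shoes", "books"], dim=4)

# Fusion layer: concatenate the embedded variables into one input vector.
fused = user_emb("u2") + cat_emb("books")
print(len(fused))  # 8
```

Compared with one-hot encoding, the input width grows with the embedding dimension rather than with the number of distinct categories, which is the dimensionality reduction the text refers to.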

3.2 CBAM

CBAM is a module that combines convolutional blocks and attention mechanisms in CNNs [29]. It is mainly used to enhance the ability of the CNN to represent features related to consumer online purchasing, especially when processing purchasing behavior data in big data environments, such as user browsing time, product click sequences and historical purchase records. It helps the network focus on core features strongly related to purchasing decisions, such as high-frequency browsing of product categories and the length of stay after adding items to the shopping cart. The channels here correspond to different dimensions of purchasing behavior, such as product browsing frequency, collection quantity and price sensitivity indicators. In the channel dimension, the module pays more attention to features strongly related to purchasing behavior, such as the channel corresponding to add-to-cart behavior, while suppressing redundant channel information such as page jump records unrelated to purchasing decisions.

In the spatial dimension, the module obtains importance weights for the purchase features at different interaction moments or sequence positions of product browsing. This enables the network to pay more attention to the key time nodes or positions in the purchasing behavior sequence that are related to decision-making, accurately capturing behavior patterns such as concentrated purchasing intentions during promotional activities and ordering signals after specific browsing sequences. The channel attention mechanism enables the module to focus on channels that are useful for the purchasing behavior prediction task, such as channels corresponding to recent purchase frequencies, and to suppress channels that are unrelated to purchasing decisions, such as browsing history channels for non-target categories. The spatial attention mechanism focuses on key areas in the spatial dimension of the purchasing behavior sequence, such as the peak time of product clicks during discount periods and decision nodes in specific browsing sequences. The output of the spatial attention mechanism highlights the information of key time nodes or sequence positions in purchasing behavior. The calculation method is shown in Equations 1 and 2.

A^{'} = M_{c} (A) \otimes A (1)

A^{''} = M_{s} (A^{'}) \otimes A^{'} (2)

Here, $A$ represents the input consumer online purchase behavior feature map, covering multi-dimensional data such as browsing sequences, purchase frequency and price preferences. $A^{'}$ denotes the intermediate feature map obtained by weighting $A$ with the channel attention $M_{c}$. $A^{''}$ denotes the feature map obtained by weighting $A^{'}$ with the spatial attention $M_{s}$, which is the output of CBAM. $M_{c}$ represents channel attention weighting, $M_{s}$ denotes spatial attention weighting and $\otimes$ signifies the element-wise product.

The channel attention module and the spatial attention module in CBAM act on the channel dimension and the spatial dimension of the consumer online purchasing behavior feature map, respectively. This aims to explore key features that are valuable for predicting purchasing behavior. Finally, the learned attention weights are multiplied with the feature map to adjust the representation of consumer purchasing behavior features within each channel, highlighting the behavior channels that have a significant impact on purchasing decisions. The channel attention is computed as shown in Equation 3.

\begin{aligned} M_{c} (F) &= \sigma (\mathrm{MLP} (\mathrm{AvgPool} (F)) + \mathrm{MLP} (\mathrm{MaxPool} (F))) \\ &= \sigma (W_{1} (W_{0} (F_{avg}^{c})) + W_{1} (W_{0} (F_{\max}^{c}))) \end{aligned} (3)

Here, $\sigma$ denotes the sigmoid activation function, and $F$ represents the input consumer online purchase behavior feature map containing multi-dimensional data such as browsing history, purchase frequency and price preferences, with dimensions $C \times H \times W$ . $F_{avg}^{c}$ is the result of the global average pooling operation and $F_{\max}^{c}$ is the result of the global maximum pooling operation. $W_{0}$ and $W_{1}$ are the weights of the MLP, which are shared between the two pooled inputs.
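Equation 3 can be traced step by step with a small pure-Python sketch. It is illustrative only and makes several assumptions that the text does not state: a ReLU between $W_{0}$ and $W_{1}$ (as in the original CBAM design), a reduction ratio `r = 2`, random weights, and a toy feature map of size 4 × 2 × 3; all function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def shared_mlp(v, W0, W1):
    # Assumption: ReLU after W0, following the original CBAM design.
    hidden = [max(0.0, h) for h in matvec(W0, v)]
    return matvec(W1, hidden)

def channel_attention(F, W0, W1):
    """F is a C x H x W nested list; returns one weight in (0, 1) per channel."""
    avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in F]
    mx = [max(max(row) for row in ch) for ch in F]
    a = shared_mlp(avg, W0, W1)  # W1(W0(F_avg))
    b = shared_mlp(mx, W0, W1)   # W1(W0(F_max)), same weights
    return [sigmoid(x + y) for x, y in zip(a, b)]

def apply_channel_attention(F, weights):
    # Equation 1: scale every value of a channel by that channel's weight.
    return [[[w * x for x in row] for row in ch] for ch, w in zip(F, weights)]

random.seed(0)
C, H, W, r = 4, 2, 3, 2
F = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
W0 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
W1 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]

Mc = channel_attention(F, W0, W1)
F_refined = apply_channel_attention(F, Mc)
print(len(Mc))  # 4, one weight per channel
```

Because the same `W0` and `W1` process both pooled vectors, the module adds only one small MLP's worth of parameters regardless of the spatial size of the feature map.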

In the spatial attention module, average pooling is first applied to the feature map of consumer online purchasing behavior along the channel axis, yielding one value per spatial position that summarizes the overall behavior information at that position. Max pooling is then applied along the same axis to capture the salient features that have a prominent impact on purchasing decisions. The two pooled maps are concatenated and passed through a convolution layer, which mines correlations between neighboring positions of behavior features such as browsing trajectories and purchase frequency. Sigmoid activation is employed to produce spatial attention values in the [0,1] range. This highlights the key spatial behavior patterns for predicting purchasing behavior and helps to accurately understand the logic of consumer purchasing decisions. The calculation is shown in Equation 4.

M_{s} (F) = \sigma (f^{7 \times 7} ([F_{avg}^{s}, F_{\max}^{s}])) (4)

Here, $f^{7 \times 7}$ denotes the feature mapping function for consumer online purchasing behavior data, implemented as a convolution operation with a 7 × 7 kernel to extract correlations between behavioral features such as browsing patterns and purchase frequency. $F_{avg}^{s}$ represents the result of average pooling applied to the input consumer online purchasing behavior feature map, which yields one value per spatial position and captures overall temporal or interaction sequence trends in behavioral data. $F_{\max}^{s}$ indicates the corresponding result of maximum pooling on the input feature map. It is used to extract key nodes in behavior sequences that have a significant impact on purchasing decisions, such as concentrated purchase intentions during promotional periods and order signals after specific browsing sequences.
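A minimal pure-Python sketch of Equation 4 is given below. It follows the standard CBAM formulation, in which the two maps are pooled along the channel axis and concatenated before the convolution; the kernel size of 7 is read off the superscript in Equation 4, while zero padding, the random kernel values, the toy 3 × 4 × 5 feature map and the function names are assumptions made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def spatial_attention(F, kernel):
    """F: C x H x W nested list; kernel: 2 x k x k. Returns an H x W map."""
    C, H, W = len(F), len(F[0]), len(F[0][0])
    k = len(kernel[0])
    pad = k // 2
    # Pool along the channel axis: one value per spatial position.
    avg = [[sum(F[c][i][j] for c in range(C)) / C for j in range(W)]
           for i in range(H)]
    mx = [[max(F[c][i][j] for c in range(C)) for j in range(W)]
          for i in range(H)]
    pooled = [avg, mx]  # concatenation [F_avg, F_max]
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:  # zero padding
                            s += kernel[p][di][dj] * pooled[p][y][x]
            out[i][j] = sigmoid(s)
    return out

random.seed(0)
C, H, W, K = 3, 4, 5, 7
F = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
kernel = [[[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(K)]
          for _ in range(2)]

Ms = spatial_attention(F, kernel)
# Equation 2: every channel is scaled by the same position-wise weights.
F_out = [[[Ms[i][j] * F[c][i][j] for j in range(W)] for i in range(H)]
         for c in range(C)]
print(len(Ms), len(Ms[0]))  # 4 5
```

The attention map has one weight per position and is shared across channels, so it emphasizes where in the behavior sequence the informative events occur rather than which feature dimension carries them.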

3.3 Characteristic indicators of consumer online purchasing time pattern

In previous research on predicting consumer online purchases, the features used for prediction were mostly common consumer demographic characteristics and characteristics related to online membership. Extensive research using these types of features has achieved good predictive results, but shortcomings remain. For example, the range of data types used to construct features is not wide enough. In recent years, with the advent of the era of big data and artificial intelligence, the time series data collected from consumers’ shopping behavior has grown rapidly. Because consumer online purchases occur within certain time frames, uncovering the regularities behind consumers’ shopping times is of great significance for improving the performance and interpretability of purchasing behavior prediction models.

We introduce time series data of consumer purchases into the study of purchasing behavior prediction. Specifically, to better explore the time patterns of consumer online purchases, we first construct three time characteristic indicators for purchase behavior patterns, namely purchasing time diversity, time loyalty and time regularity, abbreviated as $time\_div$ , $time\_loy$ and $time\_reg$ .

These indicators are based on the basic concept of a “bin”. In the purchase time feature, a “bin” is a one-dimensional segmentation unit of the time dimension. We define the minimum segmentation unit for purchasing time pattern features as one hour, so the total number of “bin” is 24. For every online consumer, each purchase falls into the “bin” labeled with the hour in which it occurred. The more purchases a consumer makes in a certain “bin”, the higher the frequency counted in that “bin”. If the consumer never purchases in the time period of a “bin”, the corresponding frequency for that “bin” is 0. On the basis of this definition, $time\_div$ , $time\_loy$ and $time\_reg$ are defined below.
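The binning described above can be sketched in a few lines. The helper name `hourly_bin_counts` and the sample timestamps are hypothetical and serve only to illustrate how purchase times map to the 24 hourly bins.

```python
from collections import Counter
from datetime import datetime

N_BINS = 24  # one bin per hour of the day, as defined in the text

def hourly_bin_counts(timestamps):
    """Count purchases per hourly bin; unused bins keep a frequency of 0."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(N_BINS)]

# Hypothetical purchase times of one consumer.
purchases = [
    datetime(2025, 3, 1, 9, 15),
    datetime(2025, 3, 2, 9, 40),
    datetime(2025, 3, 5, 21, 5),
]
bins = hourly_bin_counts(purchases)
print(bins[9], bins[21], sum(bins))  # 2 1 3
```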

The $time\_div$ indicator represents how dispersed a consumer’s online purchases are over time, that is, how widely the consumer’s purchase times are spread across the “bin” in the time dimension. The higher a consumer’s time diversity, the more likely they are to purchase goods at many different times; the lower the value, the more their purchases concentrate in a few periods. The original definition of time diversity is shown in Equation 5.

$$time\_div_{i} = \frac{-\sum_{j=1}^{N} p_{ij} \log p_{ij}}{\log M_{i}} \qquad (5)$$

Here, $N$ represents the total number of available “bin”, and $M_{i}$ denotes the number of “bin” that appear in consumer $i$’s online purchase history. The $p_{ij}$ indicates the probability that consumer $i$ completes a transaction within the $j$-th “bin”. Bins with $p_{ij} = 0$ contribute nothing to the sum.
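As an illustration of Equation 5, the following minimal Python sketch computes time diversity from a list of purchase hours. The function names `hour_bin_probs` and `time_div`, the use of the natural logarithm (the ratio does not depend on the base), and the convention of returning 0 when a consumer has used at most one bin (where $\log M_{i} = 0$) are assumptions made for this example and are not specified in the original definition.

```python
import math
from collections import Counter

N_BINS = 24  # one bin per hour of the day, as defined in the text


def hour_bin_probs(purchase_hours):
    """Return p_ij for one consumer: the share of purchases in each hourly bin."""
    counts = Counter(purchase_hours)
    total = len(purchase_hours)
    return [counts.get(h, 0) / total for h in range(N_BINS)]


def time_div(purchase_hours):
    """Normalized entropy of purchase hours, following Equation 5."""
    if not purchase_hours:
        return 0.0  # assumption: no purchases means no measurable diversity
    probs = hour_bin_probs(purchase_hours)
    m_i = sum(1 for p in probs if p > 0)  # M_i: bins the consumer has used
    if m_i <= 1:
        return 0.0  # assumption: log(1) = 0, so treat a single bin as zero diversity
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(m_i)


# Hypothetical purchase hours (0-23) for two consumers.
print(time_div([9, 9, 9, 21]))      # mostly one hour: lower diversity
print(time_div([8, 12, 16, 20]))    # evenly spread over four hours
```

Under these assumptions, a consumer whose purchases are spread evenly over the bins they use obtains the maximum value of 1, and a consumer who always buys in the same hour obtains 0.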

The $time\_loy$ reflects the proportion of purchases occurring within the top $k$ “bin” with the highest purchase frequency. Since the $k$ most frequently used “bin” largely reflect a consumer’s willingness to repeat purchases at particular times, $k$ is taken as 3 in the original definition of $time\_loy$. As all purchases have been assigned to “bin”, this measure specifically captures a consumer’s preference for high-frequency purchase periods. The time loyalty of consumer $i$ is defined in Equation 6.

$$time\_loy_{i} = \frac{f_{i}}{\sum_{j=1}^{N} p_{ij}} \qquad (6)$$

Here, $f_{i}$ is the proportion of consumer $i$’s purchases that fall in the most frequently used “bin” among all “bin”. Consequently, $time\_loy_{i}$ is essentially a probability between 0 and 1.
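Time loyalty can be sketched in the same way. The example below assumes that $f_{i}$ is read as the share of purchases falling in the $k$ most frequent bins (with $k = 3$ as in the text) and that the denominator of Equation 6 equals 1 when the $p_{ij}$ are normalized probabilities, so the ratio reduces to a share of purchase counts. The function name `time_loy` is hypothetical.

```python
from collections import Counter


def time_loy(purchase_hours, k=3):
    """Share of purchases that fall in the k most frequently used hourly bins."""
    if not purchase_hours:
        return 0.0  # assumption: undefined loyalty is reported as 0
    counts = Counter(purchase_hours)
    top_k_purchases = sum(count for _, count in counts.most_common(k))
    return top_k_purchases / len(purchase_hours)


# Hypothetical consumer: three purchases at 9h, one each at 10h, 11h and 12h.
hours = [9, 9, 9, 10, 11, 12]
print(time_loy(hours))        # top-3 bins cover 5 of 6 purchases
print(time_loy(hours, k=1))   # only the single most frequent bin
```

With `k=1` the sketch matches the reading in which only the single most frequent bin is counted.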

The regularity of consumer purchase time jointly considers the consumer’s purchase time diversity and purchase time loyalty over different periods. In the initial definition, the short-term observation period is 1 month and the long-term observation period is 3 months. Because $time\_reg$ takes both time diversity and time loyalty into account over the long and short periods, it covers consumer time patterns more comprehensively. The specific definition of time regularity is shown in Equation 7.

$$time\_reg_{i} = 1 - \frac{\sqrt{(time\_div_{i}^{s} - time\_div_{i}^{l})^{2} + (time\_loy_{i}^{s} - time\_loy_{i}^{l})^{2}}}{\sqrt{2}} \qquad (7)$$

Here, $time\_div_{i}^{s}$ and $time\_div_{i}^{l}$ represent the short-term and long-term purchasing time diversity of consumer $i$, while $time\_loy_{i}^{s}$ and $time\_loy_{i}^{l}$ denote the corresponding short-term and long-term purchasing time loyalty. Because these indicators are normalized to [0,1], the value of $time\_reg_{i}$ calculated by Equation 7 also falls within the interval [0,1]. A value closer to 1 indicates higher similarity between the long-term and short-term purchasing time patterns of consumer $i$, while a value farther from 1 suggests less consistency.
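Equation 7 is one minus a Euclidean distance scaled by $\sqrt{2}$, the largest distance possible between two points in the unit square. A minimal sketch, assuming the four inputs have already been computed and lie in [0,1] (the function name `time_reg` and the sample values are hypothetical):

```python
import math


def time_reg(div_short, div_long, loy_short, loy_long):
    """Time regularity following Equation 7.

    All four inputs are assumed to lie in [0, 1], so the distance between the
    short-term and long-term (diversity, loyalty) points is at most sqrt(2).
    """
    distance = math.hypot(div_short - div_long, loy_short - loy_long)
    return 1.0 - distance / math.sqrt(2.0)


# Identical short-term and long-term patterns give full regularity.
print(time_reg(0.5, 0.5, 0.8, 0.8))
# Hypothetical consumer whose short-term pattern drifts from the long-term one.
print(time_reg(0.6, 0.9, 0.7, 0.3))
```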

The importance weights of the purchase features corresponding to different interaction moments or product navigation sequence positions in the spatial dimension can be obtained with econometric models. An ordered logit model is adopted to accommodate the ordered nature of navigation sequence positions, and the basic importance weight of each position and interaction time feature is represented by its partial regression coefficient. Lasso regularization is combined for feature selection and weight shrinkage to suppress the multicollinearity caused by redundant features. A vector autoregression model is introduced to capture the lag effects of features and to allocate weights dynamically across different interaction moments. Finally, the coefficients output by these models are standardized into the 0–1 interval, yielding the importance weights of the corresponding purchase features.
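The final standardization step could look like the sketch below. The paper does not state which rescaling is used, so min-max scaling of absolute coefficient values is an assumption here, as are the function name `minmax_weights` and the example coefficients.

```python
def minmax_weights(coefficients):
    """Rescale coefficient magnitudes to the 0-1 interval.

    coefficients: dict mapping a feature name to its estimated coefficient.
    Assumption: importance is taken as the absolute value of the coefficient.
    """
    magnitudes = {name: abs(value) for name, value in coefficients.items()}
    lowest, highest = min(magnitudes.values()), max(magnitudes.values())
    if highest == lowest:
        # All features equally important; avoid division by zero.
        return {name: 1.0 for name in magnitudes}
    return {name: (m - lowest) / (highest - lowest) for name, m in magnitudes.items()}


# Hypothetical coefficients for three time features.
weights = minmax_weights({"time_div": -2.0, "time_loy": 0.5, "time_reg": 1.0})
print(weights)
```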

4 Experimental design and result analysis

4.1 Experiment setup

The experiment is conducted on cloud servers to support efficient deep learning training and testing. The cloud server is configured with 12 vCPUs, with a memory capacity of 90 GB and is equipped with Intel (R) Xeon (R) Silver 4214R CPU with a clock speed of 2.40 GHz to meet large-scale data processing needs. At the same time, it is equipped with NVIDIA GeForce RTX 3080 Ti GPU, which has 12 GB of video memory and can accelerate the training of deep learning models. The programming language used is Python 3.7 and the deep learning framework PyTorch is adopted to fully utilize its flexibility and powerful features. We choose Linux as the operating system to ensure good performance.

The experiment uses three datasets, namely ICPR MTWI2018 [30], MEP-3M [31] and Amazon-M2 [32]. ICPR MTWI2018 is a web text dataset composed primarily of online shopping images that contain text in multiple fonts and scales. This dataset was jointly collected and annotated by South China University of Technology and Alibaba, with a total of 10,000 labeled images available. The difficulty of this dataset lies in the complexity and variability of fonts, text sizes ranging from 0 to 100 pixels and the presence of complex background interference. MEP-3M is large-scale, hierarchically classified, multimodal, fine-grained and long-tailed. According to statistics, MEP-3M contains over 3 million products, making it the largest among existing e-commerce product datasets. The products in MEP-3M are represented in three forms, namely image, text description and OCR text. Amazon-M2 covers multiple regions and multilingual scenarios, including regional shopping data in six languages. Its user conversations originate from Amazon e-commerce platforms in different regions around the world and can be used directly for comparative research on cross-regional user purchasing behavior.

To verify the effectiveness of EE-CNN-CBAM, this paper considers the applicability of various evaluation indicators. Based on existing research, four indicators are used to evaluate the prediction performance of EE-CNN-CBAM and the benchmark models: accuracy, precision, expected maximum profit (EMP) and F1 score (F1) [33]. To define EMP, it is necessary to first define the average classification profit and the maximum profit. The average classification profit is defined in Equation 8.

$$P(t; b_{0}, c_{0}, b_{1}, c_{1}) = b_{0} \pi_{0} F_{0}(t) + b_{1} \pi_{1} (1 - F_{1}(t)) - c_{0} \pi_{0} (1 - F_{0}(t)) - c_{1} \pi_{1} F_{1}(t) \qquad (8)$$

The left side of the equation represents the average classification profit when the classifier threshold is set to $t$, while the right side sums all classification benefits and costs. Here, $b_{k}$ and $c_{k}$ are the benefit of a correct classification and the cost of an incorrect classification for class $k$, $\pi_{k}$ is the prior probability of class $k$, and $F_{k}(t)$ is the cumulative distribution function of the classifier scores of class $k$. Since both the numerator and the denominator contain $N$ in the averaging process, $N$ cancels out to give Equation 8. In addition to the average classification profit, the maximum profit needs to be defined, as shown in Equation 9.

$$MP = \max_{\forall t} P(t; b_{0}, c_{0}, b_{1}, c_{1}) = P(T; b_{0}, c_{0}, b_{1}, c_{1}) \qquad (9)$$

Here, $T$ is the optimal threshold. This optimal threshold $T$ must also satisfy the first-order condition for maximizing the average classification profit, as shown in Equation 10.

$$\frac{f_{0}(T)}{f_{1}(T)} = \frac{\pi_{1} (b_{1} + c_{1})}{\pi_{0} (b_{0} + c_{0})} = \frac{\pi_{1} \theta}{\pi_{0}} \qquad (10)$$

Here, the parameter $\theta = (b_{1} + c_{1}) / (b_{0} + c_{0})$ is known as the cost-benefit ratio, which indicates that the optimal threshold and the resulting profit depend on the ratio of costs to benefits. With the above definitions, the expression of EMP can be derived, as shown in Equation 11.

$$EMP = \int_{b_{0}} \int_{c_{0}} \int_{b_{1}} \int_{c_{1}} P(T(\theta); b_{0}, c_{0}, b_{1}, c_{1}) \cdot w(b_{0}, c_{0}, b_{1}, c_{1}) \, db_{0} \, dc_{0} \, db_{1} \, dc_{1} \qquad (11)$$

Equation 11 is the general expression of EMP. For each combination ($b_{0}, c_{0}, b_{1}, c_{1}$), the optimal threshold $T$ is determined by Equation 10, and $w(b_{0}, c_{0}, b_{1}, c_{1})$ is the joint probability density function of the classification benefits and costs.
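To make Equations 8 to 11 concrete, the sketch below evaluates the average classification profit on empirical score samples, searches the observed scores for the threshold that maximizes it, and approximates EMP by Monte Carlo averaging over sampled benefit and cost parameters. It assumes that a case is assigned to class 0 when its score is at most $t$, that $\pi_{0}$ and $\pi_{1}$ are estimated from sample sizes, and that the caller supplies a sampler standing in for the density $w$. All function names and the toy scores are hypothetical.

```python
import random


def avg_profit(t, scores0, scores1, b0, c0, b1, c1):
    """Empirical version of Equation 8 at threshold t.

    scores0 / scores1: classifier scores of true class-0 / class-1 cases.
    Assumption: a case is assigned to class 0 when its score is <= t.
    """
    n0, n1 = len(scores0), len(scores1)
    pi0, pi1 = n0 / (n0 + n1), n1 / (n0 + n1)
    F0 = sum(s <= t for s in scores0) / n0  # class-0 cases correctly kept below t
    F1 = sum(s <= t for s in scores1) / n1  # class-1 cases wrongly kept below t
    return (b0 * pi0 * F0 + b1 * pi1 * (1 - F1)
            - c0 * pi0 * (1 - F0) - c1 * pi1 * F1)


def max_profit(scores0, scores1, b0, c0, b1, c1):
    """Equation 9: return (T, MP) by scanning the observed scores as thresholds."""
    candidates = sorted(set(scores0) | set(scores1))
    # Add one threshold below every score so "assign everything to class 1" is covered.
    candidates = [candidates[0] - 1.0] + candidates
    best_t = max(candidates,
                 key=lambda t: avg_profit(t, scores0, scores1, b0, c0, b1, c1))
    return best_t, avg_profit(best_t, scores0, scores1, b0, c0, b1, c1)


def emp(scores0, scores1, sample_params, n_draws=1000, seed=0):
    """Monte Carlo approximation of Equation 11.

    sample_params(rng) must return one draw (b0, c0, b1, c1) from the density w.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        b0, c0, b1, c1 = sample_params(rng)
        total += max_profit(scores0, scores1, b0, c0, b1, c1)[1]
    return total / n_draws


# Toy scores and a hypothetical uniform density for the four parameters.
scores0 = [0.10, 0.25, 0.40, 0.55]
scores1 = [0.35, 0.60, 0.80, 0.90]
uniform_w = lambda rng: tuple(rng.uniform(0.5, 1.5) for _ in range(4))
print(max_profit(scores0, scores1, 1.0, 1.0, 1.0, 1.0))
print(emp(scores0, scores1, uniform_w))
```

Scanning the observed scores replaces the analytical first-order condition of Equation 10, which requires the score densities $f_{0}$ and $f_{1}$ and is therefore less convenient on finite samples.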

4.2 Performance evaluation

Determining the number of training iterations on consumer online purchasing behavior data requires balancing model convergence efficiency against prediction accuracy. Too few iterations leave the model unable to learn the multimodal behavioral features sufficiently, amplifying the prediction error of purchase intention. Too many iterations may cause the model to overfit the random behavior patterns in the training set and increase the computational cost on big data, thereby reducing training efficiency. Based on this, the maximum number of training epochs is set to 50. The optimal number of epochs is determined by dynamically monitoring the trend of the purchasing behavior prediction loss during training, and the training curve in Figure 2 shows how the loss varies with the number of iterations.

Figure 2. Change in loss of EE-CNN-CBAM.

From Figure 2, it can be seen that on ICPR MTWI2018 and MEP-3M, the average loss of EE-CNN-CBAM gradually decreases as training proceeds. After 33 epochs, the loss reaches around 0.7 and its rate of decline slows. After 37 epochs, it gradually stabilizes. Although training continues, the prediction error remains relatively flat and shows no significant further decrease. Some subtle fluctuations even show a slight upward trend, which may be due to the model overfitting high-frequency interaction noise in the data. Overall, by epoch 37 the model’s prediction performance for consumer purchasing behavior has stabilized. Stopping at this point allows the model to capture the multimodal behavioral features while avoiding the overfitting risk caused by excessive iterations and reducing redundant computation in big data training. Therefore, the final number of training epochs is set to 37.
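The paper does not describe its stopping rule in code. The sketch below shows one common way to pick the epoch from a monitored loss curve, using a patience-based rule. The thresholds `min_delta` and `patience` and the loss values are hypothetical and are not taken from Figure 2.

```python
def stopping_epoch(losses, min_delta=1e-3, patience=3):
    """Return the 1-based epoch with the best loss before improvement stalls.

    An epoch counts as an improvement only if it lowers the best loss so far
    by more than min_delta; training is considered stalled after `patience`
    consecutive epochs without such an improvement.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_gain = 0
    for epoch, loss in enumerate(losses, start=1):
        if best_loss - loss > min_delta:
            best_loss, best_epoch = loss, epoch
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break
    return best_epoch


# Hypothetical loss curve that flattens after the third epoch.
print(stopping_epoch([1.0, 0.8, 0.7, 0.7, 0.71, 0.7, 0.5]))
```

A rule of this kind stops at the plateau and ignores later improvements, which is the trade-off between accuracy and training cost that the text describes.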

4.3 Predictive performance analysis

Tables 1 and 2 present the experimental results of EE-CNN-CBAM and the comparative models without and with time series data, respectively. Each table reports the four evaluation metrics of accuracy, precision, EMP and F1. Each value is the average of ten experimental runs on the dataset, reported to three decimal places. The models compared in the experiments are CNN, LSTM [34], CNN-GRU [35], CNN-CBAM [36] and EE-CNN-CBAM.

Table 1. Prediction experiment result without time series data.

Table 2. Prediction experiment result with time series data.

4.3.1 Non time series data prediction

Table 1 shows that on ICPR MTWI2018 and MEP-3M, EE-CNN-CBAM performs best in accuracy, precision, EMP and F1 among the compared models, which include CNN, LSTM, CNN-GRU and CNN-CBAM. On ICPR MTWI2018, the accuracy of EE-CNN-CBAM reaches 0.974 and its F1 reaches 0.962, which are 5.6% and 6.5% higher than those of CNN-CBAM, respectively. On MEP-3M, the accuracy of EE-CNN-CBAM is 0.959 and its F1 is 0.953, which are 12.5% and 6.1% higher than those of CNN-CBAM. These results support the model’s adaptability to consumer data in the big data environment. Through the collaboration of EE dimensionality reduction, CNN feature extraction and the CBAM attention mechanism, the model addresses the high-dimensional and multimodal characteristics of consumer behavior big data, providing reliable support for accurate prediction of purchasing behavior. Figures 3 and 4 visualize the results in Table 1 for the two datasets. For simplicity, CNN-GRU, CNN-CBAM and EE-CNN-CBAM are abbreviated in the figures as CG, CC and ECC, respectively.

Bar graphs showing performance metrics of five models: CNN, LSTM, CG, CC, and ECC. Each of the four panels displays a different metric. Accuracy: CNN 0.74, LSTM 0.81, CG 0.87, CC 0.92, ECC 0.97. Precision: CNN 0.73, LSTM 0.82, CG 0.86, CC 0.91, ECC 0.97. EMP: CNN 0.72, LSTM 0.80, CG 0.81, CC 0.82, ECC 0.93. F1: CNN 0.74, LSTM 0.83, CG 0.87, CC 0.90, ECC 0.96.

Figure 3. Analysis chart of prediction experiment based on ICPR MTWI2018.

Four bar charts compare CNN, LSTM, CG, CC, and ECC models on different metrics. Top left shows Accuracy: ECC highest at 0.96. Top right shows Precision: ECC highest at 0.96. Bottom left shows EMP: ECC highest at 0.89. Bottom right shows F1: ECC highest at 0.95.

Figure 4. Analysis chart of prediction experiment based on MEP-3M.

Figures 3 and 4 show that on ICPR MTWI2018, LSTM outperforms CNN. The gating mechanism of LSTM is better suited to the long-term dependencies in purchasing behavior sequences and can better capture the dynamic associations between browsing and purchasing, whereas CNN is limited to local feature extraction and is therefore slightly weaker on massive temporal data. The hybrid model CNN-GRU significantly outperforms the single models by integrating CNN’s local pattern mining with GRU’s time series modeling, which verifies the suitability of multi-structure fusion for complex consumption data. The attention mechanism introduced in CNN-CBAM further improves performance, demonstrating that the channel and spatial attention of CBAM can focus on key signals among massive features and reduce the interference of invalid browsing behavior. EE-CNN-CBAM performs best, thanks to EE’s dimensionality reduction of high cardinality categorical variables. This not only alleviates the curse of dimensionality in traditional encoding, but also reveals the deep connection between user preferences and product attributes through the semantic association of vectors. Combined with CBAM’s dynamic reinforcement of key features, the model maintains high accuracy on high-dimensional consumption big data, supporting its effectiveness in predicting purchasing behavior in the big data environment.

4.3.2 Time series data prediction

Similarly, on the Olist dataset [37] and the User Behavior Data on Taobao (User Behavior) [38], the overall performance of EE-CNN-CBAM in Table 2 remains outstanding after consumer purchase time data is introduced. Comparing the results in Tables 1 and 2 shows that consumer purchase time data consistently improves the performance of purchasing behavior prediction models. After time features are incorporated, accuracy, precision and F1 show quantifiable gains for all models, with an average increase of 3.2%–7.8%. For example, the accuracy of EE-CNN-CBAM on Olist increases from 0.959 in Table 1 to 0.987 in Table 2, and its F1 increases from 0.953 to 0.982, which are relative increases of 2.9% and 3.0%, respectively. Both the basic models such as CNN and LSTM and the hybrid models such as CNN-GRU, CNN-CBAM and EE-CNN-CBAM improve on these datasets. This further demonstrates the value of purchasing time data for improving model performance.
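The percentages quoted for EE-CNN-CBAM are relative changes with respect to the Table 1 values, which can be checked with a few lines of Python. The helper `relative_gain` is introduced only for this check.

```python
def relative_gain(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


accuracy_gain = relative_gain(0.959, 0.987)  # accuracy values quoted in the text
f1_gain = relative_gain(0.953, 0.982)        # F1 values quoted in the text
print(f"accuracy +{accuracy_gain:.1f}%, F1 +{f1_gain:.1f}%")
# prints: accuracy +2.9%, F1 +3.0%
```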

On Amazon-M2, EE-CNN-CBAM still achieves the best performance, supporting the model’s generalization ability in cross-regional scenarios. This is attributed to EE’s dimensionality reduction of high cardinality categorical variables such as region codes and age groups, as well as CBAM’s dynamic focus on region-specific behavioral characteristics. Compared with ICPR MTWI2018 and MEP-3M, the performance of all models on Amazon-M2 decreases to some extent, mainly because of behavioral heterogeneity and data sparsity across regions. The purchase habits of users in different regions differ significantly, so a single model that does not capture regional characteristics specifically is prone to feature generalization bias. User samples from some niche regions of Amazon-M2 account for only 3.2% of the total data, so CNN and LSTM learn the behavior patterns of these regions insufficiently. EE-CNN-CBAM alleviates the sparsity problem through the semantic association of vectors, and its performance degradation is smaller.

Figures 5 and 6 visualize the comparative results in Table 2, showing how the predictive performance of the models changes after consumer purchase time data is incorporated. After purchase time data is added to the initial dataset, both the base models and EE-CNN-CBAM improve in accuracy, precision and F1, with an average increase of 3.2%–7.8%, while EMP generally decreases by 2.1%–5.3%. This validates the general gain that consumer purchase time data brings to prediction models. Among the time features that characterize purchasing time series data, $time\_div$ and $time\_loy$ are the most prominent, and $time\_reg$ also performs well.

Line graph comparing performance metrics of five models: CNN, CNN-CBAM, LSTM, CNN-GRU, and EE-CNN-CBAM. Metrics include accuracy, precision, EMP, and F1 score. EE-CNN-CBAM consistently performs best, while CNN has the lowest performance across metrics.

Figure 5. Experimental analysis of time characteristics on purchase prediction under Olist.

Line graph comparing five models: CNN, CNN-CBAM, LSTM, CNN-GRU, and EE-CNN-CBAM across metrics—accuracy, precision, EMP, and F1. EE-CNN-CBAM shows the best performance overall, particularly in accuracy.

Figure 6. Experimental analysis of time characteristics on purchase prediction under User Behavior.

After time series data is incorporated, accuracy and F1 improve while EMP declines. This phenomenon primarily stems from the interplay between the cost-benefit ratio $\theta$ and the optimal threshold $T$. Time series data concentrates the probability density of positive samples while dispersing that of negative samples. To attract high-intent users, companies increase marketing investment in positive samples. However, this strategy raises marketing costs and the risk of user aversion while underestimating the benefits of negative samples, ultimately increasing $\theta$. As $\theta$ increases, $T$ must shift to the right to satisfy the probability density ratio condition in Equation 10, which reduces the total classification profit $P(T)$. Because EMP is the weighted expectation of $P(T)$, the decrease in $P(T)$, which results from reduced true positive (TP) revenue alongside false positive (FP) cost savings, ultimately leads to a decline in EMP. As an important component of consumer big data, consumer purchase time patterns not only enrich the feature dimensions, but also compensate for the shortcomings of traditional features in capturing dynamic purchasing patterns by mining deep correlations.

4.4 Text comparison experiment

To further analyze the independent contributions of evaluation texts and consulting dialogues, comparative experiments were designed with only evaluation texts, only consulting dialogues, and both types of texts, as shown in Table 3.

Table 3. Experimental results of text type splitting.

There is a synergistic effect between the two types of text features. The F1 improvement when both are input simultaneously is 6.0%, which is greater than the sum of the improvements obtained when the evaluation text and the consulting dialogue are input separately. This indicates that the post-purchase feedback in evaluation texts and the pre-purchase intention in consulting dialogues complement each other, covering the entire chain of user purchasing decisions. The independent contribution of the evaluation text is higher because its semantics are more directly related to post-purchase behavior such as repurchase. Consulting dialogues need to be combined with behavioral sequences to realize their full value. For example, an inventory inquiry becomes a strong signal only when it is accompanied by an additional purchase action such as adding the item to the cart.

The time pattern characteristics of consumer online purchases are essentially external manifestations of the interaction between psychological cognition and the environment. Consumers with high time loyalty are often driven by habit formation and familiarity preferences. Their long-term purchasing at fixed time periods stems from sunk cost effects and the principle of decision-making efficiency, and they are more inclined to rely on past experience to reduce decision-making risk. Users with prominent time diversity exhibit exploratory purchasing psychology, and their behavior is influenced by novelty needs and price sensitivity. They are more likely to be prompted to purchase by marketing stimuli such as limited-time promotions and new product launches. Consumers with strong time regularity demonstrate cognitive consistency and habitual lifestyle rhythms. Their purchasing behavior is deeply bound to daily routines, work rhythms and other life scenarios, which is in line with the core concept of behavior embedding in social physics.

This study provides a concrete practical path for e-commerce managers. Personalized coupons and repeat-purchase product recommendations can be pushed to users with high time loyalty during their high-frequency purchase periods, strengthening the habit path to enhance user stickiness. Exploratory consumers with outstanding time diversity can have their novelty needs met through modules such as limited-time flash sales and new product zones, while product association recommendations optimized with browsing trajectories reduce their exploration costs. Managers can also use the model’s prediction results to adjust inventory in advance of high-intention purchase periods, avoiding shortages of hot-selling products. In addition, store page layout can be optimized based on user behavior sequence characteristics, placing high-intention products in the core visual area to reduce decision-making friction.

This study still has certain limitations. At the data level, it relies on historical behavioral data from Taobao and Olist and lacks real-time interaction data and cross-platform behavior trajectories, so it may overlook the impact of multi-scenario linkage on purchasing behavior. At the feature level, it focuses on mining behavioral data without fully integrating consumer psychological variables and social environmental factors, which makes it difficult to fully reveal the intrinsic transmission mechanism between behavior and motivation. Although the model captures the correlation between time patterns and purchasing behavior, the causal relationships behind the behavior are not sufficiently explored.

Future research can integrate behavioral frameworks such as the theory of planned behavior and self-determination theory, and introduce psychological variables such as perceived risk and subjective norms to enrich the feature system and deepen the academic interpretation of behavioral mechanisms. It can also expand data sources by integrating offline consumption records, social media interaction data and physiological feedback data. E-commerce managers can embed the model into customer relationship management systems to build a closed-loop optimization mechanism of prediction, execution and feedback. At the same time, drawing on consumer psychology research and respecting user privacy, emotional interaction design and personalized services can be used to achieve a win-win between commercial value and user experience.

5 Conclusion

This paper constructs EE-CNN-CBAM, which processes consumer online big data efficiently through the collaboration of the EE, CNN and CBAM modules. EE transforms high cardinality categorical variables with millions of levels into low dimensional vectors, alleviating the curse of dimensionality of traditional encoding. CNN extracts local association patterns from behavioral big data sequences. CBAM focuses on key features in consumer online purchasing big data through a two-stage mechanism of channel and spatial attention, avoiding the feature dilution caused by excessive data size. In addition, three indicators, namely purchase time diversity, loyalty and regularity, are constructed to address the temporal dependence of consumer behavior. The experimental results indicate that EE-CNN-CBAM handles the high-dimensional characteristics of consumer purchasing behavior big data effectively, improving the accuracy of online purchasing behavior prediction and supporting the analysis and prediction of consumer online big data. However, this paper still has some shortcomings. The division of time series data is overly simplistic when exploring the characteristics of purchase time patterns. The analysis only shows a decrease in overall EMP, without examining the profitability of different user groups and product categories. In addition, although EE-CNN-CBAM performs well, it still falls short in terms of algorithm interpretability.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

YW: Formal Analysis, Writing – original draft, Investigation, Conceptualization, Visualization, Validation, Resources, Project administration. LL: Methodology, Writing – review and editing, Supervision, Software, Data curation, Funding acquisition.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work is supported by the Taizhou city Youth Talent Lifting Project in 2023 and Outstanding Young Key Teacher Program of Taizhou Institute of Science and Technology, NJUST.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Wang Y, Li L. Digital economy, industrial structure upgrading, and residents' consumption: empirical evidence from prefecture-level cities in China. Int Rev Econ and Finance (2024) 92:1045–58. doi:10.1016/j.iref.2024.02.069

2. Wang F, Aviles J. Enhancing operational efficiency: integrating machine learning predictive capabilities in business intellgence for informed decision-making. Front Business, Economics Management (2023) 9(1):282–6. doi:10.54097/fbem.v9i1.8694

3. Miao J, Ning X, Hong S, Wang L, Liu B. Secure and efficient authentication protocol for supply chain systems in artificial intelligence-based internet of things. IEEE Internet Things J (2025). doi:10.1109/JIOT.2025.3592401

4. Meeker WQ, Escobar LA, Pascual FG. Statistical methods for reliability data. John Wiley and Sons (2021).

5. Guo S, Li X, Zhu P, Mu Z. ADS-detector: an attention-based dual stream adversarial example detection method. Knowledge-Based Syst (2023) 265(1):110388. doi:10.1016/j.knosys.2023.110388

6. Mienye ID, Swart TG, Obaido G. Recurrent neural networks: a comprehensive review of architectures, variants, and applications. Information (2024) 15(9):517. doi:10.3390/info15090517

7. Karthik RV, Pandiyaraju V, Ganapathy S. A context and sequence-based recommendation framework using GRU networks. Artif Intelligence Rev (2025) 58(6):170. doi:10.1007/s10462-025-11174-1

8. Zhu P, Fan Z, Guo S, Tang K, Li X. Improving adversarial transferability through hybrid augmentation. Comput and Security (2024) 139:103674. doi:10.1016/j.cose.2023.103674

9. Zhang D, Wu P, Wu C, Ngai EWT. Forecasting duty-free shop demand with multisource data: a deep learning approach. Ann Operations Res (2024) 339(1):861–87. doi:10.1007/s10479-024-05830-y

10. Ma Y, Huang Z, Su J, Shi H, Wang D, Jia S, et al. A multi-channel feature fusion CNN-Bi-LSTM epilepsy EEG classification and prediction model based on attention mechanism. IEEE Access (2023) 11:62855–64. doi:10.1109/access.2023.3287927

11. Martínez A, Schmuck C, Pereverzyev JS, Pirker C, Haltmeier M. A machine learning framework for customer purchase prediction in the non-contractual setting. Eur J Oper Res (2020) 281(3):588–96. doi:10.1016/j.ejor.2018.04.034

12. Harada S, Taniguchi K, Yamada M, Kashima H. In-app purchase prediction using bayesian personalized dwell day ranking. (2019).

13. Dong Y, Jiang W. Brand purchase prediction based on time-evolving user behaviors in e-commerce. Concurrency Comput Pract Experience (2019) 31(1):e4882. doi:10.1002/cpe.4882

14. Tsuboi Y, Jatowt A, Tanaka K. Product purchase prediction based on time series data analysis in social media. In: 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 1. IEEE (2015). 219–24. doi:10.1109/wi-iat.2015.170

15. Qiu J, Lin Z, Li Y. Predicting customer purchase behavior in the e-commerce context. Electron Commerce Research (2015) 15(4):427–52. doi:10.1007/s10660-015-9191-6

16. Cho YS, Moon SC. Weighted mining frequent pattern based customer’s RFM score for personalized u-commerce recommendation system. JoC (2013) 4(4):36–40. Available online at: https://www.earticle.net/Article/A215904.

17. Sılahtaroğlu G, Dönertaşli H. Analysis and prediction of Ε-customers' behavior by mining clickstream data. In: 2015 IEEE international conference on big data (big data). IEEE (2015). 1466–72.

18. Zuo Y, Yada K, Ali ABMS. Prediction of consumer purchasing in a grocery Store using machine learning techniques. In: 2016 3rd Asia-Pacific World Congress on Computer Science and Engineering (APWC on CSE). IEEE (2016). 18–25.

19. Zhu P, Li Y, Hu Y, Liu Q, Cheng D, Liang Y. Lsr-igru: stock trend prediction based on long short-term relationships and improved gru. In: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (2024). 5135–42.

20. Jacobs BJD, Donkers B, Fok D. Model-based purchase predictions for large assortments. Marketing Sci (2016) 35(3):389–404. doi:10.1287/mksc.2016.0985

21. Yang S, Wang D, Zheng H, Jin R. Timerag: boosting llm time series forecasting via retrieval-augmented generation. In: ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE (2025). 1–5.

22. Guo L. Cross-border e-commerce platform for commodity automatic pricing model based on deep learning. Electron Commerce Res (2022) 22(1):1–20. doi:10.1007/s10660-020-09449-6

23. Ling C, Zhang T, Chen Y. Customer purchase intent prediction under online multi-channel promotion: a feature-combined deep learning framework. IEEE Access (2019) 7:112963–76. doi:10.1109/access.2019.2935121

24. Suresh Kumar S, Margala M, Siva SS, Chakrabarti P. A novel weight-optimized LSTM for dynamic pricing solutions in e-commerce platforms based on customer buying behaviour. Soft Comput (2023) 1–13. doi:10.1007/s00500-023-08729-1

25. Sakar CO, Polat SO, Katircioglu M, Kastro Y. Real-time prediction of online shoppers’ purchasing intention using multilayer perceptron and LSTM recurrent neural networks. Neural Comput Appl (2019) 31(10):6893–908. doi:10.1007/s00521-018-3523-0

26. Tseng KK, Lin RFY, Zhou H, Kurniajaya KJ, Li Q. Price prediction of e-commerce products through internet sentiment analysis. Electron Commerce Res (2018) 18(1):65–88. doi:10.1007/s10660-017-9272-9

27. Banerjee T, Sinha S, Choudhury P. Dynamic price prediction of agricultural produce for e-commerce business model: a linear regression model. In: Data management, analytics and innovation, 2. Singapore: Springer Singapore (2021). 493–504. doi:10.1007/978-981-16-2937-2_31

28. Altayeb Y. Predicting consumer behavior in online shop using clickstream data and machine learning algorithms (Master's thesis). Tilburg University (2024).

29. Hussaini H, Bano S, Elyan E, Moreno-Garcia CF. Modified CBAM: sub-block pooling for improved channel and spatial attention. In: Annual Conference on Medical Image Understanding and Analysis. Cham: Springer Nature Switzerland (2025). 116–30.

30. He M, Liu Y, Yang Z, Zhang S, Luo C, Gao F, et al. ICPR2018 contest on robust reading for multi-type web images. 24th international conference on pattern recognition (ICPR). IEEE (2018). 7–12.

31. Liu F, Chen D, Du X, Gao R, Xu F. MEP-3M: a large-scale multi-modal E-commerce product dataset. Pattern Recognition (2023) 140:109519. doi:10.1016/j.patcog.2023.109519

32. Hsieh CH. A review of machine learning algorithms for the Amazon-M2 dataset. In: International Conference on Human-Computer Interaction. Cham: Springer Nature Switzerland (2025).

33. Petrides G, Moldovan D, Coenen L, Guns T, Verbeke W. Cost-sensitive learning for profit-driven credit scoring. J Oper Res Soc (2022) 73(2):338–50. doi:10.1080/01605682.2020.1843975

34. Bhandari HN, Rimal B, Pokhrel NR, Rimal R, Dahal KR, Khatri RK. Predicting stock market index using LSTM. Machine Learn Appl (2022) 9:100320. doi:10.1016/j.mlwa.2022.100320

35. Lu L, Zhang C, Cao K, Deng T, Yang Q. A multichannel CNN-GRU model for human activity recognition. IEEE Access (2022) 10:66797–810. doi:10.1109/access.2022.3185112

36. Li H, Wei G, Wang T, Bui T, Zeng Q, Wang R. Reducing video coding complexity based on CNN-CBAM in HEVC. Appl Sci (2023) 13(18):10135. doi:10.3390/app131810135

37. Xu Z, Wang X, Tan Y, et al. Data-driven analysis for the operation status of the E-Commerce platform based on offlist. (2023).

38. Huang Q. Analysis of E-commerce user purchase intention based on user behavior data. In: Proceedings of the 2nd Guangdong-Hong Kong-Macao Greater Bay Area International Conference on Digital Economy and Artificial Intelligence (2025). 1759–64.

Keywords: attention, big data analysis, consumer online, convolutional neural network, purchasing behavior prediction

Citation: Wen Y and Liu L (2026) A prediction method for consumer online purchasing behavior based on big data analysis. Front. Phys. 14:1686157. doi: 10.3389/fphy.2026.1686157

Received: 15 August 2025; Accepted: 09 January 2026;
Published: 02 February 2026.

Edited by:

Rui M. S. Cruz, Universidade do Algarve, Portugal

Reviewed by:

Marian Pompiliu Cristescu, Lucian Blaga University of Sibiu, Romania
Zongkang Yang, Jiangsu University, China

Copyright © 2026 Wen and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lei Liu, liulei@nustti.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.