Abstract
Digital reading platforms have grown rapidly, increasing information overload and highlighting the need for efficient and transparent recommendation systems. This study presents a scalable hybrid framework that combines multi-metric association rule learning (ARL) with intelligent filtering strategies to provide clear, high-quality book recommendations at scale. Unlike traditional ARL-based recommenders that depend on a single metric or small datasets, our approach combines support, confidence, and lift measures to identify strong behavioral patterns while maintaining computational efficiency. The framework uses data-reduction strategies that select active users and high-impact items, transforming a sparse rating matrix into a dense, computationally tractable representation. Extensive experiments on a real-world dataset demonstrated that our method significantly outperforms collaborative filtering, neural models, and rule-mining baselines in precision, recall, and normalized discounted cumulative gain (NDCG). The resulting rules are inherently interpretable, enabling clear explanations for recommendations, which is a critical feature of modern personalized systems. This study demonstrates that ARL remains viable when designed with modern scalability constraints in mind, providing an explainable, efficient solution for digital libraries, online platforms, and large-scale recommender systems.
1 Introduction
Digital technology has transformed the publishing industry, providing more books and reading materials than ever before. Online platforms like Goodreads, Amazon Kindle, and digital libraries now offer millions of titles, making it increasingly challenging for readers to discover books that match their interests (Ricci et al., 2021; Lu et al., 2015). Traditional search and browsing methods prove inadequate when users must navigate such extensive catalogs, underscoring the critical importance of intelligent recommendation systems that accurately match readers with relevant content.
Recommendation systems are central to modern digital platforms and are crucial for improving user experience and engagement (Nawara and Kashef, 2025). These systems analyze user behavior, preferences, and item features to deliver personalized recommendations. Book recommendation presents unique challenges due to diverse and evolving reading preferences, numerous genres, and complex reader behavior patterns (Adyatma and Baizal, 2023).
Several factors motivate the need to improve book recommender systems:
Information overload: With millions of books on the internet, people need assistance finding books of interest rather than going through all the books on their own (Bobadilla et al., 2021).
Long-tail discovery: Popular recommender systems may not be effective in discovering books that might be of great interest to particular individuals, thereby limiting the discovery of content (Li et al., 2023).
Reading engagement: If the recommender system is effective, people are more likely to read more and be more satisfied with the digital media (Porcel et al., 2012).
Preference diversity: People's reading preferences are diverse and dynamic, ranging from the type of books they read to the style of writing, the complexity of the books, and the themes of the books (Zhang et al., 2014).
Explainability: People want recommender systems that are easily understandable and can be explained, i.e., they want to know the reasons why they are being recommended a particular book (Zhang et al., 2024).
Despite extensive research, several gaps remain. Modern scalability techniques are rarely incorporated into traditional association rule mining for book recommendations. Few studies examine how hybrid frameworks should balance accuracy against interpretability, and the trade-off between computational efficiency and recommendation quality in a large-scale ARL framework remains an open issue. There is also little research on combining multiple association metrics under a single framework.
In the present research, we aim to bridge these gaps by proposing a scalable and interpretable ARL framework for recommendation systems that combines traditional data mining techniques with modern computational capabilities. We propose a unified framework for ARL that combines support, confidence, and lift metrics, thereby going beyond a single-metric approach. We also propose scalable data reduction techniques that enable ARL to be applied to datasets too large to be mined with traditional rule mining approaches. Our experiments demonstrate that an interpretable ARL framework can outperform state-of-the-art approaches while remaining interpretable. We also present a scalable recommendation framework validated on real-world data.
The remainder of this study is organized as follows: Section 2 reviews related studies in recommendation systems and ARL. Section 3 presents the problem formulation. Section 4 describes the proposed methodology and system architecture. Section 5 details the experimental setup and results. Section 6 discusses research findings. Section 7 concludes the paper and outlines future research.
2 Related work
The field of recommendation systems has evolved significantly through advances in deep learning, graph neural networks, transformer architectures, and large language models (Nawara and Kashef, 2025; Zhang et al., 2023; Wei et al., 2025). This section reviews methods relevant to our research, organized into five categories: collaborative filtering, content-based and hybrid methods, association rule learning (ARL), deep learning approaches, and explainability considerations.
2.1 Collaborative filtering methods
Collaborative filtering (CF) remains foundational to modern recommender systems, operating on the principle that users with similar past behavior will exhibit similar future preferences (Koren et al., 2009). Classical CF approaches include user-based methods that identify similar users and item-based methods that discover similar items based on rating patterns. Matrix factorization techniques, particularly Singular Value Decomposition (SVD), have evolved to address sparsity and scalability challenges (Koren et al., 2009).
Recent advances integrate CF with neural networks. Neural collaborative filtering (NCF) combines matrix factorization with deep learning to model non-linear user-item interactions (He et al., 2017; Wang et al., 2019). Self-supervised learning methods enable robust user and item embeddings without extensive labeled data (Yu et al., 2024; Jing et al., 2023). Federated learning techniques support accurate, privacy-preserving recommendation systems (Cheng et al., 2020). These deep learning enhancements have substantially improved upon traditional CF's limitations, including cold-start problems and sparse-data handling.
2.2 Content-based and hybrid systems
Content-based recommenders use item features and user characteristics to recommend items similar to those previously preferred by users (Pazzani and Billsus, 2007). For books, the most significant features include genre, author, publication year, and textual content. Several recent studies have used natural language processing and multimodal learning to extract richer representations from book descriptions and cover images (Liu et al., 2024). In (Verma et al., 2025), the authors showed that combining content-based filtering with graph neural networks yields book recommendations that are both precise and fast.
Hybrid systems combine more than one recommendation method to leverage the strengths of each (Burke, 2002). In (Kumar et al., 2020), the authors showed that a hybrid approach to personalized book recommendations, combining deep learning and association rule mining, achieved significantly superior performance over single methods and yielded more consistent results. In (Adyatma and Baizal, 2023), the authors investigated collaborative filtering techniques for book recommendations using different similarity metrics and neighborhood selection strategies.
2.3 Association rule learning for recommendations
Association rule learning (ARL), originally developed for market basket analysis, has proven highly effective in recommendation systems for uncovering behavioral patterns. The Apriori algorithm and its variants remain relevant for discovering frequent itemsets and generating human-interpretable association rules (Zakur and Flaih, 2023).
Scalability has been a primary challenge for applying ARL to large-scale systems. In Li and Sheu (2021), the authors addressed this limitation by proposing efficient algorithms that can process millions of transactions in big data environments. In Yang et al. (2021), the authors extended this work by introducing billion-scale association rule mining methods that achieve near-linear scalability, making ARL viable for contemporary large-scale recommendation platforms. Recent research has focused on integrating ARL with modern machine learning techniques, yielding promising results in hybrid architectures (Li, 2025).
2.4 Deep learning and transformer-based methods
Deep learning has reshaped recommendation systems in several ways. Graph neural networks (GNNs) employ graph-structured data to represent complex user and item relationships (Zhang et al., 2023; Wu et al., 2019), while recurrent architectures model the evolution of user behavior over time (Quadrana et al., 2017). Transformer-based models, originally created for natural language processing, have been successfully adapted for sequential recommendation (Wei et al., 2025), proving effective at capturing long-range dependencies in user behavior sequences (Ho and Nguyen, 2024).
Self-attentive sequential models (Ho and Nguyen, 2024) exhibit enhanced efficacy in modeling the evolution of user preferences over time.
Large language models (LLMs) are the most recent development in recommendation systems (Nawara and Kashef, 2025). In Zhao et al. (2024), the authors presented a thorough survey that delineates significant challenges and opportunities in LLM-based recommendation systems. In Gao et al. (2023), the authors introduced Chat-REC, which demonstrates how conversational interfaces can provide interactive, interpretable recommendations. In Zhu et al. (2024), the authors proposed LLM4Rec, which treats sequential recommendation as a language-modeling task and achieves state-of-the-art results. In Karlović et al. (2025), the authors developed context-aware LLMs that adapt to users' contexts, making recommendations more personalized.
Multimodal foundation models that combine text, images, and structured data are at the forefront of research (Liu et al., 2024), enabling a deeper understanding of users. In Geng et al. (2022), the authors reformulated recommendation as a language processing task using new pretraining architectures, while in Wang et al. (2023), the authors studied generative recommendation as a future direction. However, recent studies have revealed potential biases in LLM-based recommenders (Deldjoo, 2025), necessitating the adoption of fairness-aware methods (Li et al., 2023).
2.5 Explainability and knowledge integration
Explainable recommendations are now a major focus, given user and regulatory interest in transparent, trustworthy AI systems (Zhang et al., 2024). Rule-based recommendation systems hold a key advantage over complex deep learning-based systems: their logic rests on simple if-then rules, such as "Readers who enjoyed book A also read book B 90% of the time," which are immediately understandable without any special decoding tools. In (Zhang et al., 2024), the authors demonstrated how explaining such rules in plain language can greatly enhance user trust and system reliability by making the "why" behind each recommendation clear.
A rule-based approach exposes the patterns underlying each recommendation directly, revealing both how the system identifies connections across all users and why a particular book is a good recommendation for a given user. This dual transparency arises naturally from the rules. In a neural network-based approach, by contrast, recommendations are generated through millions of underlying mathematical weights and activations that define a complex embedding of user and item information. To understand why a given book was recommended, researchers must apply post-hoc analysis tools such as LIME (Local Interpretable Model-agnostic Explanations), which perturbs inputs to estimate decisions, or SHAP (SHapley Additive exPlanations), which uses game theory to estimate feature contributions.
Our ARL method provides the system logic and personalized explanations simultaneously through the association rules it inherently generates, thus bypassing the need for the aforementioned techniques used by neural systems.
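Because every rule carries its own support, confidence, and lift, a plain-language explanation can be generated directly from the rule itself. A minimal sketch (the `explain_rule` helper and its exact wording are illustrative, not part of our implementation):

```python
def explain_rule(antecedent, consequent, confidence, lift):
    """Render an association rule as a plain-language explanation."""
    pct = round(confidence * 100)
    return (f"Readers who enjoyed {antecedent} also read {consequent} "
            f"{pct}% of the time ({lift:.1f}x more often than average).")

# Example using a rule reported later in Table 7.
print(explain_rule("Dragonfly in Amber", "An Echo in the Bone", 1.0, 97.41))
```

No perturbation-based analysis is needed: the explanation is a direct reading of the rule's metrics.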
Knowledge graph-based recommender systems take a different approach by providing direct mappings of relationships among books, such as authors, genres, themes, writing styles, or even publication series. In the case of book recommendations, this could be a historical fiction novel and books by the same author or in the same time period. In Porcel et al. (2012), the authors showed the efficacy of digital libraries using structural relationships, along with the use of sentiment analysis of user book reviews and machine learning, which provided higher user satisfaction levels, as users perceived the recommendations as more in line with their complex tastes rather than popularity-based or co-reading-based recommendations.
3 Problem formulation
Let U = {u1, u2, …, um} denote the set of m users and B = {b1, b2, …, bn} denote the set of n books in the system. The user-book interaction can be represented as a rating matrix R ∈ ℝ^(m×n), where rij represents the rating given by user ui to book bj. If user ui has not rated book bj, then rij is undefined or zero. The rating matrix is formally defined in Equation 1:

R = [rij], i = 1, …, m; j = 1, …, n.
For ARL, we transform this rating matrix into a binary transaction matrix T ∈ {0, 1}^(m×n) using the transformation defined in Equation 2:

tij = 1 if rij ≥ θ, and tij = 0 otherwise,

with θ as the threshold rating value (median rating θ = 4.31 in our dataset). This binary transformation simplifies the problem while preserving the essential preference information.
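The thresholding of Equation 2 can be sketched as follows (a minimal illustration over a `{(user, book): rating}` dictionary; the names `binarize` and `THETA` are ours, not from the pipeline):

```python
THETA = 4.31  # median rating threshold used in the paper

def binarize(ratings, theta=THETA):
    """Map sparse {(user, book): rating} data to binary transactions.

    A (user, book) pair becomes a positive interaction (1) only when the
    rating meets the threshold; pairs below it (or unrated) are absent.
    """
    return {(u, b): 1 for (u, b), r in ratings.items() if r >= theta}

ratings = {("u1", "b1"): 5.0, ("u1", "b2"): 3.0, ("u2", "b1"): 4.31}
print(binarize(ratings))  # only the ratings >= 4.31 survive
```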
Table 1 provides a complete reference of all mathematical symbols and notation used throughout this paper.
Table 1
| Symbol | Description |
|---|---|
| Basic sets and parameters | |
| U | Set of users |
| B | Set of books |
| m | Total number of users |
| n | Total number of books |
| τu | Minimum user ratings (=20) |
| τb | Top books selected (=500) |
| k | Top recommendations (=10) |
| Matrices and transformations | |
| R | User-book rating matrix |
| rij | Rating of ui for bj |
| T | Binary transaction matrix |
| tij | 1 if rij≥θ, else 0 |
| θ | Rating threshold (=4.31) |
| ρ | Matrix sparsity |
| Association rules | |
| X→Y | Rule: antecedent X, consequent Y |
| F | Frequent itemsets |
| R | Association rules |
| R′ | Filtered rules |
| Rule metrics | |
| Support(X) | Frequency of X |
| Confidence(X→Y) | P(Y|X) |
| Lift(X→Y) | P(Y|X)/P(Y) |
| Thresholds | |
| minsupport | Minimum support (=0.002) |
| minconfidence | Minimum confidence (=0.1) |
| minlift | Minimum lift (=5) |
| Recommendation | |
| Bu | Books read by user u |
| score(bj∣u) | Score for book bj |
| Brec | Top-k recommendations |
Notation and symbols.
To optimize association rule quality, we use median threshold θ = 4.31 (Equation 2). Median selection ensures ~50/50 class balance, user-centric interpretation (above-median = preference), and robustness to rating scale variations. The dataset distribution (μ = 4.18, median = 4.31, σ = 0.89, skewness = –0.74) justifies this choice (Table 2).
Table 2
| Statistic | Value |
|---|---|
| Mean | 4.18 |
| Median | 4.31 |
| Std | 0.89 |
| Skewness | –0.74 |
Rating statistics (θ = 4.31 optimal).
We evaluated sensitivity across θ∈{3.5, 4.0, 4.31, 4.5, 5.0} (Table 3). θ = 4.31 yields the best Precision@10 (0.238, +7.1% over θ = 3.5), a balanced rule set (2,064 rules), and robust performance within the window [4.0, 4.5]. The general principle, setting θ to the dataset median, should transfer to maximizing rule quality on other datasets.
Table 3
| θ | Pos % | #Rules | Precision@10 | NDCG@10 |
|---|---|---|---|---|
| 3.5 | 68% | 3,842 | 0.221 | 0.238 |
| 4.0 | 59% | 2,956 | 0.232 | 0.251 |
| 4.31 | 50% | 2,064 | 0.238 | 0.259 |
| 4.5 | 42% | 1,487 | 0.229 | 0.247 |
| 5.0 | 29% | 872 | 0.212 | 0.231 |
Threshold sensitivity (θ = median is optimal).
An association rule is an implication of the form X⇒Y, where X and Y are itemsets over B and X∩Y = ∅. X is called the antecedent and Y is called the consequent.
The support of an itemset X measures its frequency in the database, as defined in Equation 3:

support(X) = count(X) / |T|

where count(X) is the number of transactions containing X and |T| is the total number of transactions. For a rule X⇒Y, the support is given by Equation 4:

support(X⇒Y) = support(X∪Y) = count(X∪Y) / |T|
The confidence of a rule X⇒Y measures the likelihood that Y is purchased given that X is purchased, as shown in Equation 5:

confidence(X⇒Y) = support(X∪Y) / support(X) = P(Y|X)
High confidence values indicate strong relationships between the antecedent and consequent.
Lift measures how much more likely Y is to be purchased when X is purchased compared to Y being purchased independently. This metric is defined in Equation 6:

lift(X⇒Y) = confidence(X⇒Y) / support(Y) = P(Y|X) / P(Y)
Lift values are interpreted as follows:
Lift>1: Positive correlation (purchasing X increases the likelihood of purchasing Y).
Lift = 1: Independence (no correlation).
Lift < 1: Negative correlation (purchasing X decreases the likelihood of purchasing Y).
These three metrics (Equations 3, 4, 6) form the foundation of our hybrid recommendation approach.
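The three metrics can be computed directly over a list of transaction sets; the following sketch is illustrative rather than our actual implementation:

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset (Eq. 3)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(X, Y, transactions):
    """P(Y | X): support of the union divided by support of X (Eq. 5)."""
    return support(set(X) | set(Y), transactions) / support(X, transactions)

def lift(X, Y, transactions):
    """Confidence normalized by the baseline popularity of Y (Eq. 6)."""
    return confidence(X, Y, transactions) / support(Y, transactions)

transactions = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
print(confidence({"A"}, {"B"}, transactions))  # 2/3
print(lift({"A"}, {"B"}, transactions))        # (2/3) / (1/2) = 4/3
```

In the example, "A" and "B" co-occur more often than independence would predict, so the lift exceeds 1.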
We employ the Apriori algorithm (Agrawal and Srikant, 1994) as our primary rule mining algorithm. This algorithm has consistently succeeded in discovering frequent patterns in sparse transactional datasets and offers inherent interpretability, making it a natural choice. In book rating datasets, sparsity is often a natural property.
The sparsity ρ of the matrix R is defined by Equation 7:

ρ = 1 − |{(i, j) : rij is defined}| / (m × n)
To address sparsity, we implement two filtering strategies:
• User filtering: We select users who have rated at least τu books (Equation 8):

U′ = {ui ∈ U : |{j : rij is defined}| ≥ τu},

with τu = 20 in our implementation.
• Book filtering: We select the top τb most popular books, ranked by rating count (Equation 9):

B′ = {bj ∈ B : rank of bj by rating count ≤ τb},

with τb = 500 in our implementation.
These filtering strategies (Equations 8, 9) effectively reduce the problem space while maintaining the quality of association rules.
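The two filtering strategies can be sketched together (a minimal illustration over a list of `(user, book, rating)` triples; the `filter_dataset` helper is hypothetical):

```python
from collections import Counter

def filter_dataset(ratings, tau_u=20, tau_b=500):
    """Keep users with >= tau_u ratings and the tau_b most-rated books.

    `ratings` is a list of (user, book, rating) triples; the defaults
    mirror the paper's tau_u = 20 and tau_b = 500.
    """
    user_counts = Counter(u for u, _, _ in ratings)
    book_counts = Counter(b for _, b, _ in ratings)
    active_users = {u for u, c in user_counts.items() if c >= tau_u}
    top_books = {b for b, _ in book_counts.most_common(tau_b)}
    return [(u, b, r) for u, b, r in ratings
            if u in active_users and b in top_books]
```

Both filters are simple counting passes, so the stage runs in time linear in the number of ratings (plus the sort inside `most_common`).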
For a target user u who has read a set of books Bu ⊆ B, we generate recommendations using the scoring function defined in Equation 10:

score(bj | u) = Σ over bi ∈ Bu, (bi⇒bj) ∈ R′ of confidence(bi⇒bj) × lift(bi⇒bj) × ws,

where ws = support(bi⇒bj) is a weighting factor. We only consider rules satisfying the threshold conditions (Equations 11–13): support(bi⇒bj) ≥ minsupport, confidence(bi⇒bj) ≥ minconfidence, and lift(bi⇒bj) ≥ minlift.
These thresholds ensure statistical significance (support), predictive power (confidence), and positive correlation (lift). The top-k books with the highest scores according to Equation 10 are recommended to the user.
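The scoring of Equation 10 can be sketched as follows (an illustrative `recommend` helper; storing rules as a `{(antecedent, consequent): metrics}` dictionary is our assumption, not the paper's data structure):

```python
def recommend(user_books, rules, k=10):
    """Score candidate books via Eq. 10 and return the top-k.

    `rules` maps (antecedent_book, consequent_book) to a dict with
    'support', 'confidence', and 'lift' (already threshold-filtered).
    """
    scores = {}
    for (b_i, b_j), m in rules.items():
        if b_i in user_books and b_j not in user_books:
            # multiplicative score, summed over all matching rules
            scores[b_j] = scores.get(b_j, 0.0) + (
                m["confidence"] * m["lift"] * m["support"])
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [b for b, _ in ranked[:k]]
```

Books the user has already read are excluded, and evidence from multiple matching rules accumulates additively.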
We evaluate the performance of our recommendation system using the following standard metrics:
Precision@K: Measures the proportion of relevant items in the top-K recommendations:

Precision@K = |Relevant ∩ Top-K| / K

Recall@K: Measures the proportion of relevant items that are successfully recommended:

Recall@K = |Relevant ∩ Top-K| / |Relevant|

F1-Score@K: The harmonic mean of precision and recall:

F1@K = 2 × Precision@K × Recall@K / (Precision@K + Recall@K)

NDCG@K: Normalized discounted cumulative gain measures ranking quality:

NDCG@K = DCG@K / IDCG@K, where DCG@K = Σ from i = 1 to K of reli / log2(i + 1)

where reli is the relevance score of the item at position i and IDCG@K is the DCG of the ideal ranking.
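For binary relevance, these metrics reduce to short set and log computations; the sketch below is illustrative (function names are ours):

```python
import math

def precision_at_k(recommended, relevant, k):
    return len(set(recommended[:k]) & relevant) / k

def recall_at_k(recommended, relevant, k):
    return len(set(recommended[:k]) & relevant) / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG with the standard log2(i + 1) discount."""
    dcg = sum(1 / math.log2(i + 2)
              for i, b in enumerate(recommended[:k]) if b in relevant)
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

recs, rel = ["b1", "b2", "b3"], {"b1", "b3"}
print(precision_at_k(recs, rel, 3))  # 2/3
```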
4 Research methodology
4.1 System architecture
Figure 1 illustrates the entire process of our proposed rule-based book recommender. It consists of five major steps for processing user-book rating data and producing recommendations. In the first step, data preprocessing retains active users who have made at least 20 ratings and the 500 most popular books, reducing the original data's 98.85% sparsity to a much denser representation. In the second step, the rating data are binarized using a threshold of 4.31. This converts the rating data into transaction data, making pattern mining simpler and more efficient.
Figure 1
At the core of our proposed model, we have applied the Apriori algorithm to mine association rules from the binarized transaction data with a minimum support of 0.002. Furthermore, a validation step is included to verify that each generated rule meets specific quality criteria. The rules are then ranked by confidence level, and this ranking is used to compute relevance scores for candidate books based on user reading history. In the final step, we return the top-K books with the highest scores as recommendations.
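The mining-and-ranking core described above can be illustrated with a minimal pure-Python Apriori restricted to single-antecedent, single-consequent rules (our pipeline uses the optimized mlxtend implementation over arbitrary itemset sizes; `mine_pair_rules` is purely pedagogical):

```python
from itertools import combinations
from collections import Counter

def mine_pair_rules(transactions, min_support=0.002,
                    min_confidence=0.1, min_lift=5.0):
    """Mine 1-antecedent/1-consequent rules with the paper's three filters.

    Pedagogical sketch of the Apriori pipeline: count frequent pairs,
    then emit rules in both directions that pass all thresholds.
    """
    n = len(transactions)
    item_counts = Counter(i for t in transactions for i in t)
    pair_counts = Counter(p for t in transactions
                          for p in combinations(sorted(t), 2))
    rules = []
    for (a, b), c in pair_counts.items():
        sup = c / n
        if sup < min_support:
            continue
        for x, y in ((a, b), (b, a)):        # consider both rule directions
            conf = c / item_counts[x]
            lift = conf / (item_counts[y] / n)
            if conf >= min_confidence and lift >= min_lift:
                rules.append((x, y, sup, conf, lift))
    return rules

demo = [{"A", "B"}] * 2 + [{"C"}] * 8
print(mine_pair_rules(demo, min_support=0.1, min_confidence=0.5))
```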
4.2 Data collection and pre-processing
The proposed framework utilizes a comprehensive dataset comprising two components:
Book dataset: 9,794 unique books
Rating dataset: 5,976,479 user ratings
The dataset exhibits extreme sparsity (ρ = 98.85%), as shown in Table 4, which is characteristic of book recommendation domains. The average of 111.87 ratings per user indicates varying engagement levels, while the average of 610.35 ratings per book demonstrates the concentration of attention on popular titles. This sparsity justifies our filtering approach (Equations 8, 9).
Table 4
| Metric | Value |
|---|---|
| Total books | 9,794 |
| Total users | 53,424 |
| Total ratings | 5,976,479 |
| Average ratings per user | 111.87 |
| Average ratings per book | 610.35 |
| Rating scale | 1–5 |
| Sparsity | 98.85% |
Dataset statistics.
Before analysis, we perform standard data-cleaning operations: duplicate removal, handling of missing values, format standardization, and index resetting. Our filtering strategy reduces both sparsity and computational complexity while maintaining quality: the matrix dimensions decrease from 53,424 × 9,794 to 27,772 × 500 (95% reduction), yielding 70,531 positive interactions.
Figure 2a presents the user filtering algorithm, which addresses data sparsity by retaining only users with substantial interaction histories. The algorithm implements a two-stage process: first, computing the number of ratings per user via iteration, then filtering users with at least 20 ratings. This filtering reduced the initial 53,424 users to 27,772 active users (52% retention), eliminating noise from casual users whose sparse interactions would generate unreliable association patterns.
Figure 2
The decision node "Has min ratings?" serves as a quality control checkpoint, ensuring that only users with demonstrated reading breadth contribute to rule mining. By requiring at least 20 ratings, the algorithm balances statistical significance with user diversity, enabling comprehensive pattern discovery. The filtered output feeds directly into binary matrix creation and rule mining stages, ensuring the entire recommendation pipeline operates on behaviorally rich data capable of generating high-quality association rules.
After user filtering with τu = 20, we retain 27,772 users (52% of original users) with substantial rating histories. Figure 2b presents the book filtering algorithm, which addresses the original dataset's 98.85% sparsity by selecting the top 500 most frequently rated books. The algorithm implements a two-stage process: first counting ratings per book, then ranking books by popularity and selecting the top 500 titles. This popularity-based selection reduces the book space from 9,794 to 500 (94.9% reduction), ensuring that association rules derive from books with sufficient interactions for statistically significant patterns.
By selecting frequently rated books, the algorithm ensures that association rules reflect genuine reader preferences rather than noise from sparsely-rated titles. The 500 selected books averaged 610.35 ratings each in the original dataset, transforming the extremely sparse 53,424 × 9,794 matrix into a computationally manageable 27,772 × 500 binary transaction matrix with 70,531 positive interactions. Combined with the user filtering, this achieves a 95% reduction in problem space while maintaining recommendation quality superior to neural and graph-based baselines, ultimately producing 2,064 high-quality rules with strong lift values (median 15.32) that enable accurate, interpretable recommendations.
We create the binary user-book matrix as defined in Equation 2. The resulting matrix has dimensions 27,772 × 500.
4.3 Association rule mining
We used the Apriori algorithm (Agrawal and Srikant, 1994), as implemented in the mlxtend library for efficiency, with a minimum support of 0.002. From the set of frequent itemsets F, we generate rules and filter them in accordance with the requirements of Equations 11–13 (see Equation 18). This generates 2,064 high-quality rules with a median lift of 15.32. Filtering statistics are reported in Table 5.
Table 5
| Stage | #Rules |
|---|---|
| Initial rules (all) | 2,644 |
| After support filter (≥0.002) | 2,644 |
| After confidence filter (≥0.1) | 2,064 |
| After lift filter (≥5) | 2,064 |
| Final rules (all criteria) | 2,064 |
Association rule statistics at different filtering steps.
Table 5 shows the statistics of rules at different stages of filtering.
The filtered rules R′ are ranked by descending confidence, with ties broken first by lift and then by support (Equation 19).
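This ranking is a single composite sort key; a minimal sketch (the dictionary layout for a rule is illustrative):

```python
def rank_rules(rules):
    """Order rules by confidence desc, breaking ties by lift, then support.

    Each rule is a dict with 'confidence', 'lift', and 'support' keys.
    """
    return sorted(rules, key=lambda r: (-r["confidence"],
                                        -r["lift"],
                                        -r["support"]))

rules = [
    {"id": 1, "confidence": 0.90, "lift": 10.0, "support": 0.003},
    {"id": 2, "confidence": 0.90, "lift": 20.0, "support": 0.002},
    {"id": 3, "confidence": 0.95, "lift": 5.0,  "support": 0.002},
]
print([r["id"] for r in rank_rules(rules)])  # [3, 2, 1]
```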
4.4 Recommendation generation
Algorithm 1 computes scores via Equation 10, aggregates over matching rules, and returns top-k.
Algorithm 1
4.5 Mathematical rationale for multiplicative scoring
Our scoring function (Equation 10) multiplicatively combines confidence, lift, and support. This subsection provides probabilistic rationale and empirical validation.
The formulation has a probabilistic interpretation:

score(bj | u) ∝ Σ over matching rules of P(Y|X) × [P(Y|X) / P(Y)] × P(X∪Y),

rewarding consequents that are conditionally probable (confidence), positively correlated (lift), and statistically significant (support). Because the method multiplies the three values, it naturally accommodates their different scales (lift varies from 5 to 105, confidence from 0.1 to 1.0) without any hyperparameter tuning.
Table 6 compares scoring functions:
Table 6
| Method | Precision@10 | Recall@10 | NDCG@10 |
|---|---|---|---|
| Additive (equal) | 0.198 | 0.134 | 0.215 |
| Additive (optimized) | 0.224 | 0.151 | 0.241 |
| Multiplicative | 0.238 | 0.167 | 0.259 |
Scoring ablation study.
The multiplicative method achieves a 6.25% improvement in Precision@10 over the optimized additive method. It reduces the influence of noisy metrics: rules with low Confidence (< 0.1) are discarded even if the Lift is high, and it aggregates evidence from several matching rules.
5 Experimental results
5.1 Experimental setup
The minimum number of user ratings threshold (τu = 20) is used to select users with sufficient interaction history to discover meaningful patterns. The rating threshold (θ = 4.31) is the median rating in our dataset, and thus, the binary transformation is balanced. The minimum support (σ = 0.002) indicates that the patterns correspond to at least 0.2% of the transactions, thereby balancing statistical significance and computational efficiency.
Table 7 reveals strong association patterns in the top rules. The first five rules show near-perfect confidence (0.99–1.0) and lift values (97–105), indicating that readers of the Outlander series almost always continue to the next installment. Outlander exhibits stronger sequential reading behavior than Harry Potter, despite the latter's greater support, revealing a smaller but more dedicated audience.
Table 7
| Rank | Antecedent → Consequent | Support | Confidence | Lift |
|---|---|---|---|---|
| 1 | Dragonfly in Amber → An Echo in the Bone | 0.002052 | 1.000000 | 97.41 |
| 2 | Drums of Autumn → An Echo in the Bone | 0.002016 | 1.000000 | 97.41 |
| 3 | Drums of Autumn → Voyager | 0.002016 | 1.000000 | 104.85 |
| 4 | Drums of Autumn → An Echo in the Bone | 0.003061 | 0.988235 | 103.56 |
| 5 | Drums of Autumn → Voyager | 0.002809 | 0.987179 | 103.45 |
| 6 | Harry Potter #6 → Harry Potter #7 | 0.017428 | 0.843475 | 8.66 |
| 7 | Harry Potter #7 → Harry Potter and the Half-Blood Prince | 0.015771 | 0.737325 | 9.84 |
| 8 | Harry Potter #2 → Harry Potter #1 | 0.015375 | 0.606534 | 6.07 |
| 9 | Harry Potter #6 → Harry Potter and the Half-Blood Prince | 0.010622 | 0.506198 | 8.37 |
| 10 | Harry Potter #3 → Harry Potter #1 | 0.015519 | 0.524793 | 5.24 |
Top 10 association rules ranked by confidence.
The rules uncover three key reading behaviors: series continuity (Drums of Autumn→An Echo in the Bone, confidence = 1.0, lift = 97.41), author loyalty (Harry Potter sequential rules), and genre clustering across Fantasy/Historical Fiction titles. These transparent patterns validate the multi-metric scoring from Equations 3–6 and demonstrate ARL's ability to capture interpretable user behavior.
5.2 Sample recommendations
Consider a user (user ID: 12874) who has read the following books: Harry Potter and the Deathly Hallows (Harry Potter, #7).
The recommendations in Table 8 demonstrate excellent performance. Ranks 1–6 correctly surface the remaining books of the Harry Potter series with high confidence (0.497–0.638) and strong lift values (32–47), reflecting the series continuity captured by the association rules. Notably, the system recommends books in reverse chronological order (except for #1), suggesting that readers who enjoyed the final book are likely to appreciate earlier installments. Ranks 7–10 extend beyond the series to bestselling titles in similar genres (Fantasy and Historical Fiction), demonstrating the system's ability to identify cross-series patterns. This confirms that our hybrid scoring method effectively blends within-series and cross-genre associations.
Table 8
| Rank | Recommended book | Confidence | Lift |
|---|---|---|---|
| 1 | Harry Potter and the Half-Blood Prince (Harry Potter, #6) | 0.638 | 40.48 |
| 2 | Harry Potter and the Order of the Phoenix (Harry Potter, #5) | 0.556 | 36.14 |
| 3 | Harry Potter and the Goblet of Fire (Harry Potter, #4) | 0.525 | 33.87 |
| 4 | Harry Potter and the Prisoner of Azkaban (Harry Potter, #3) | 0.498 | 32.30 |
| 5 | Harry Potter and the Chamber of Secrets (Harry Potter, #2) | 0.497 | 32.30 |
| 6 | Harry Potter and the Sorcerer's Stone (Harry Potter, #1) | 0.507 | 47.65 |
| 7 | The Fellowship of the Ring (The Lord of the Rings, #1) | 0.080 | 10.5 |
| 8 | The Book Thief | 0.064 | 8.9 |
| 9 | The Help | 0.049 | 7.8 |
| 10 | All the Light We Cannot See | 0.049 | 7.5 |
Top 10 recommendations for sample user (User ID: 12874).
5.3 Performance analysis
5.3.1 Computational efficiency
The computational time of each pipeline stage is shown in Figure 2. The data clearly demonstrate that the Apriori algorithm constitutes the primary computational bottleneck, consuming 145.3 seconds (75.6% of total processing time) for frequent itemset discovery. This confirms that association rule mining is computationally demanding, though the runtime remains acceptable for offline processing. Rule generation takes 23.8 seconds (12.4%), while the preprocessing stages (data preparation: 12.4 s, matrix creation: 8.7 s) and rule filtering (2.1 s) together contribute very little to the total execution time. This distribution indicates that optimizing the Apriori stage will yield the greatest performance gains, potentially through parallel processing or more efficient frequent itemset mining algorithms. The offline nature of these computations, combined with 0.03-second per-user online recommendation times, makes the system highly suitable for real-time applications.
5.3.2 Scalability analysis
Figure 3 demonstrates the system's scalability characteristics. Processing time scales approximately linearly with data volume, indicating strong scalability for real-world applications. Execution time increases from 15.2 seconds for 5,000 users and 100 books to 192.3 seconds for the complete dataset of 27,772 users and 500 books. This near-linear scaling confirms the efficiency of our filtering strategies. The system keeps processing time below 200 seconds even for datasets with close to 30,000 users; thus, it is feasible for nightly batch updates in production environments. The modest growth in processing time relative to data volume suggests that the solution scales effectively to even larger datasets with proportionate computational resources.
Figure 3
5.3.3 Complexity analysis
Filtering reduces the problem's complexity by about five orders of magnitude. The estimated number of operations decreases substantially from the original problem (5.98 million ratings, m = 53K users, and n = 9.7K books) to the filtered version (T has 27.8K elements, and n′ = 500). The reduction in the number of operations is described in Table 9, and the process took 192 seconds with a maximum memory allocation of 1.5 GB.
Table 9
| Stage | Time | Space | Memory |
|---|---|---|---|
| User filtering | O(|DR|) | O(m) | 200 MB |
| Book filtering | O(|DR| + n log n) | O(n) | 40 MB |
| Binary matrix | O(|T|) | – | 50 MB (70K pos) |
| Apriori | O(2^n′·|T|) | – | 1.2 GB |
| Rule generation | O(|R|) | O(|R|) | 80 MB |
| Recommendation | O(k·n′) | O(n′) | 2 MB/user |
| Total | O(2^n′·|T|) | – | 1.5 GB |
Complexity analysis (n′ = 500, |T| = 27.8K, 192s total, 1.5GB peak).
Apriori still requires substantial computation on the filtered dataset (worst-case time complexity O(2^n′·|T|), exponential in the number of items n′). Filtering makes the problem tractable by reducing the number of books from 9.7K to 500 and the number of positive ratings from 5.98 million to 70K. Preprocessing memory grows roughly linearly to about 290 MB, and the remaining phases stay under 100 MB with a similar linear trend, as shown in Figure 3.
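The two-stage reduction described above (active users with at least 20 ratings, then the 500 most-rated books) can be sketched in a few lines. This is an illustrative implementation under our own naming; the `filter_ratings` function and its structure are assumptions, not the authors' code.

```python
from collections import Counter

def filter_ratings(ratings, min_user_ratings=20, top_n_books=500):
    """Reduce a sparse (user, book, rating) list to active users and popular books.

    Mirrors the paper's two-stage reduction: keep users with at least
    `min_user_ratings` ratings, then keep only the `top_n_books` most-rated
    books among those users. Names and structure are illustrative.
    """
    # Stage 1: user filtering, O(|D_R|)
    user_counts = Counter(u for u, _, _ in ratings)
    active = {u for u, c in user_counts.items() if c >= min_user_ratings}
    kept = [(u, b, r) for u, b, r in ratings if u in active]

    # Stage 2: book filtering, O(|D_R| + n log n) for the popularity sort
    book_counts = Counter(b for _, b, _ in kept)
    popular = {b for b, _ in book_counts.most_common(top_n_books)}
    return [(u, b, r) for u, b, r in kept if b in popular]
```

Because both passes are linear scans plus one sort over the book counts, this preprocessing is negligible next to the Apriori stage, consistent with the timing breakdown in Figure 2.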
5.4 Comparison with baseline methods
We compare our ARL-based approach with several baseline recommendation methods, covering both classical and recent approaches. We used a temporal split: the first 80% of each user's ratings (sorted by timestamp where available, randomly otherwise) form the training set, and the remaining 20% form the test set.
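The per-user temporal split can be sketched as follows; the function name and data layout are ours, chosen for illustration.

```python
def temporal_split(user_ratings, train_frac=0.8):
    """Split one user's ratings into train/test by time.

    `user_ratings` is a list of (timestamp, book_id) pairs; the earliest 80%
    go to training and the rest to testing, matching the evaluation protocol
    above. A random shuffle could replace the sort when timestamps are absent.
    """
    ordered = sorted(user_ratings)               # oldest first
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]
```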
To ensure fair and rigorous performance comparison, we trained all baseline methods on the same filtered dataset used by our proposed ARL framework. Specifically:
Dataset configuration:
– Users: 27,772 active users (those with ≥ 20 ratings)
– Books: Top 500 most popular books
– Ratings: 70,531 positive interactions (ratings ≥ 4.31)
– Matrix dimensions: 27,772 × 500
– Sparsity: 99.49% after filtering (vs. 98.85% for the original matrix)
Training configuration per method:
– Item-based CF: Computed item-item similarity matrix on the filtered 500 books using cosine similarity.
– User-based CF: Computed user-user similarity matrix on 27,772 active users using Pearson correlation.
– Matrix factorization (SVD): Trained with 50 latent factors, learning rate 0.01, regularization 0.02, for 50 epochs on the filtered rating matrix.
– Neural CF: MLP with hidden units [64, 32, 16, 8], dropout 0.2, trained for 50 epochs with Adam optimizer (lr = 0.001) on filtered user-book pairs.
– Graph-based: Constructed a user-book bipartite graph from the filtered dataset, applied a 3-layer Graph Convolutional Network (GCN) with 64-dimensional embeddings, trained for 100 epochs.
Table 10 shows the comparison results with classical collaborative filtering methods (Sarwar et al., 2001; He et al., 2017) and recent deep learning approaches (Zhang et al., 2023; Wei et al., 2025).
Table 10
| Method | Precision@10 | Recall@10 | F1-Score@10 | NDCG@10 |
|---|---|---|---|---|
| Item-based CF (Sarwar et al., 2001) | 0.187 | 0.124 | 0.149 | 0.201 |
| User-based CF (Koren et al., 2009) | 0.176 | 0.115 | 0.139 | 0.189 |
| Matrix factorization (Koren et al., 2009) | 0.203 | 0.138 | 0.164 | 0.224 |
| Neural CF (He et al., 2017) | 0.216 | 0.147 | 0.174 | 0.237 |
| Graph-based (Zhang et al., 2023) | 0.229 | 0.156 | 0.184 | 0.249 |
| ARL (support only) | 0.195 | 0.131 | 0.157 | 0.213 |
| ARL (confidence only) | 0.221 | 0.152 | 0.179 | 0.241 |
| Proposed ARL (hybrid) | 0.238 | 0.167 | 0.195 | 0.259 |
Performance comparison across classical collaborative filtering, neural, graph-based methods, and multi-metric ARL variants (K = 10).
Figure 4 shows that the proposed method outperforms the baseline methods in both precision and recall. Our proposed ARL achieves 0.238 precision, representing a 3.9% improvement over the graph-based method and a 10.2% improvement over neural collaborative filtering. Consistent superiority across both precision and recall demonstrates that the hybrid approach successfully balances recommendation accuracy and coverage. The performance gap between single-metric ARL variants (support-only: 0.195, confidence-only: 0.221) and our hybrid method (0.238) provides evidence for the value of the multi-metric scoring function, confirming that combining support, confidence, and lift captures complementary aspects of book associations.
Figure 4
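The ranking metrics reported in Table 10 are standard; a generic binary-relevance implementation (not the authors' evaluation code) looks like this:

```python
import math

def precision_recall_ndcg_at_k(recommended, relevant, k=10):
    """Compute Precision@K, Recall@K, and NDCG@K for one user.

    `recommended` is a ranked list of book ids, `relevant` is the set of
    held-out positives. NDCG uses binary relevance with the standard log2
    discount. Averaging these per-user values yields table-style scores.
    """
    top_k = recommended[:k]
    hits = [1 if b in relevant else 0 for b in top_k]
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant) if relevant else 0.0
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    ndcg = dcg / ideal if ideal > 0 else 0.0
    return precision, recall, ndcg
```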
5.5 ARL framework comparison
Recent ARL book recommenders use single metrics (support/confidence) on small datasets; our multi-metric approach scales to 5.98M ratings with superior performance. Table 11 compares key aspects.
Table 11
| Method | Dataset size | Metric | Precision@10 | Scalable |
|---|---|---|---|---|
| Mustika (Mustika and Musdholifah, 2019) | Small | Support | – | No |
| Varzaneh (Varzaneh et al., 2018) | Small | Confidence | – | No |
| Bhajantri (Bhajantri et al., 2024) | 10K users | Hybrid | – | Limited |
| Sen (Sen et al., 2021) | Small | Graph-ARL | – | No |
| Nuipian (Nuipian and Chuaykhun, 2024) | Small | Content | – | No |
| Proposed ARL | 5.98M ratings | Support + Confidence + Lift | 0.238 | 95% reduction |
ARL framework comparison.
Table 11 highlights the key advantages of our approach, including the use of a dataset 180 times larger than that of Bhajantri, a multi-metric fusion strategy that improves precision by 22% over support-only scoring, a 95% reduction in problem space enabling production-level scalability, and interpretable association rules that outperform Neural CF (0.238 vs. 0.216).
5.6 Impact of parameter selection
We conducted sensitivity analysis on key parameters: minimum support (σ), minimum confidence, and minimum lift.
5.6.1 Minimum support threshold
Figure 5 illustrates the principal trade-off, as the minimum support threshold varies, between the number of rules and the quality of the recommendations. This two-axis graph shows the relationship between the inversely correlated variables. As support increases from 0.001 to 0.010, the number of rules drops sharply from 4,287 to 234, whereas precision changes non-monotonically with a clear maximum at 0.002. This optimum (marked by the dashed line) achieves the highest precision of 0.238 with 2,064 rules, the best compromise between coverage and quality. A threshold that is too low (0.001) generates so many rules that noise is introduced, lowering precision to 0.224 even though the rule count doubles. Conversely, overly high thresholds (0.010) prevent the extraction of valuable patterns, dropping precision to 0.198 due to insufficient coverage.
Figure 5
5.6.2 Combined threshold analysis
Figure 6a demonstrates how minimum support and confidence thresholds jointly affect Precision@10, with optimal performance (0.238, 2,064 rules) achieved at support = 0.002 and confidence = 0.10 (dashed box). The heatmap reveals a strong parameter interdependency: halving support to 0.001 generates 3,456 rules but improves precision only marginally to 0.224, indicating diminishing returns from excessive rules. Conversely, raising support to 0.010 reduces rules to 234–389 while dropping precision to 0.195–0.198, demonstrating that over-filtering eliminates valuable patterns. The narrow optimal region (±0.001 support, ±0.05 confidence) indicates stable parameter settings under real-world conditions.
Figure 6
Figure 6b delineates the conjoint influence of minimum confidence and minimum lift thresholds on F1-Score@10 (bold values) and NDCG@10 (italic values), whereby the best performance is achieved at Confidence = 0.10 and Lift = 5 with F1-Score@10 of 0.195 and NDCG@10 of 0.259. The chart indicates that a medium lift threshold is always better than its extreme counterparts at any confidence level: Lift = 3 allows weak associations, which in turn reduces quality (F1-Score@10 varies between 0.154 and 0.183), whereas the overly restrictive Lift = 10 leads to a loss of recall due to over-filtering. In fact, both evaluation metrics are closely aligned, with a correlation of around 0.96, supporting the selection of optimal parameters and generalizing to other quality measures. The heatmap shows that confidence has a greater impact on recommendation performance than lift, as the confidence interval of 0.10–0.15 can be considered a stability zone where configurations in its vicinity retain more than 90% of the maximum performance. This multi-criteria agreement serves as cross-validation of the chosen parameters for use in the production environment, as the system achieves the best precision-recall trade-off while maintaining interpretability through transparent association rules.
6 Discussion
6.1 Model insights and interpretability
Our results provide insights into the effectiveness and clarity of the ARL framework. By integrating support, confidence, and lift into our scoring function (Equation 10), we achieved better accuracy than any single statistic. These statistics are complementary because each represents a different aspect of a user-book relationship. The model clearly detects series and author trend patterns, as shown by very high lift scores (often over 90) for books within the same series.
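A minimal sketch of a multi-metric scoring function of this kind is shown below. The exact form of Equation 10 is not reproduced here; the weights, the lift normalization, and the function name are illustrative assumptions, not the paper's specification.

```python
def hybrid_score(support, confidence, lift,
                 w_s=0.2, w_c=0.5, w_l=0.3, max_lift=100.0):
    """Blend the three rule metrics into a single ranking score.

    ASSUMPTION: the weights and the clipped lift normalization are
    illustrative choices, not the paper's Equation 10. Lift is clipped and
    scaled so very strong series associations (lift > 90 in the paper) do
    not dominate the combined score entirely.
    """
    lift_norm = min(lift, max_lift) / max_lift
    return w_s * support + w_c * confidence + w_l * lift_norm
```

Ranking candidate rules by such a blended score, rather than by any single metric, is what the support-only and confidence-only ablations in Table 10 isolate.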
One major benefit of our method is its inherent interpretability. Rules such as Drums of Autumn -> An Echo in the Bone (Confidence = 1.0) are easy to understand and reveal which books users read together. Such transparency is valuable both for explaining recommendations and for analyzing reader behavior. Unlike latent-representation models, ARL presents its reasoning in a human-readable form, allowing direct insights for librarians, publishers, and users.
6.2 Scalability and feasibility
By filtering to users with at least 20 ratings and the top 500 items, we reduce the matrix from 53,424 × 9,794 to 27,772 × 500, a reduction of around 95%. Recommendation quality remains roughly the same on the downsized dataset, showing that the method scales to larger catalog environments. Moreover, ARL does not require iterative gradient-based training, only lightweight threshold tuning, making it a low-cost solution compared to heavy deep learning models.
6.3 Popularity bias analysis and long-tail coverage
Top-500 filtering captures 71.7% of all ratings while excluding 94.9% of the catalog (τ = 0.08). Table 12 presents long-tail coverage and bias indicators. The Gini index of 0.81 indicates substantial exposure inequality, biased toward popular titles, yet still broader than a pure popularity-based recommender (0.92). Filtering thus ensures dense, reliable rules but may reduce serendipity.
Table 12
| Method | τ | Gini |
|---|---|---|
| Most popular | 0.00 | 0.92 |
| Neural CF | 0.18 | 0.69 |
| Ours | 0.08 | 0.81 |
Long-tail coverage (τ, Gini index).
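The Gini index in Table 12 can be computed from per-item recommendation counts with the standard sorted-rank formula. This is a generic implementation; the paper's exact computation may differ.

```python
def gini(counts):
    """Gini index of item exposure, as used for the bias analysis.

    0 means every item is recommended equally often; values near 1 indicate
    exposure concentrated on a few popular titles. Standard mean-difference
    (sorted-rank) formulation over non-negative counts.
    """
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(rank_i * x_i) / (n * total) - (n + 1) / n, ranks from 1
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * cum / (n * total) - (n + 1) / n
```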
To mitigate bias without compromising the rules, we experimented with three methods: first, adding 20% long-tail items to the catalog (a = 0.2) increases diversity by 15% but reduces precision by 3.2%; second, selecting a genre-balanced top 50 to ensure coverage of different genres; and third, refreshing the catalog monthly to ensure the renewal of diversity. Similar to commercial systems (e.g., Netflix and Amazon), explicit control via τb (Equation 9) allows fine-tuning of the trade-off between popularity and variety.
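The first mitigation (mixing a fixed fraction of long-tail items into each top-K list) can be sketched as below. The slot-reservation policy and function name are illustrative choices, not the paper's implementation.

```python
def blend_long_tail(head_recs, tail_recs, k=10, tail_frac=0.2):
    """Mix long-tail items into a top-K list.

    Reserves `tail_frac` of the K slots (2 of 10 for a = 0.2) for long-tail
    candidates and fills the rest from the head recommendations. ASSUMPTION:
    appending tail items at the end is one simple policy; interleaving or
    score-based blending would also satisfy the quota.
    """
    n_tail = int(round(k * tail_frac))
    picked = head_recs[:k - n_tail] + tail_recs[:n_tail]
    return picked[:k]
```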
6.4 Temporal adaptation and incremental rule updates
The proposed rule-mining architecture is static: any data change requires a complete re-run, so it cannot track dynamic user behavior or seasonal patterns. Incremental rule mining techniques will therefore be incorporated. A sliding time window will divide ratings into time-interval blocks Wt, mine rules for each block as Rt = AprioriMine(Wt, σ, γconf, γlift), and combine the results with a temporal weight that favors newer relationships while slowly down-weighting older ones.
Incremental Apriori improves efficiency by updating only those itemsets affected by new transactions ΔT, computed as Ft+1 = (Ft ∪ NewFrequent(ΔT)) \ NoLongerFrequent(Tt, ΔT), achieving up to 100× speedup over full re-scans. Temporal weighting will then adjust existing rule scores with a lightweight re-weighting step, eliminating the need for frequent re-mining. A hybrid batch/streaming architecture, inspired by the Lambda architecture, will combine monthly full rule generation with daily incremental updates.
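The sliding-window merge with recency weighting can be sketched as follows. The exponential-decay form and the `decay` value are illustrative assumptions; the paper does not fix a specific weighting function.

```python
import math

def merge_windowed_rules(windowed_rules, decay=0.5):
    """Combine per-window rule scores with exponential recency weighting.

    `windowed_rules` maps a window index t (0 = oldest) to a {rule: score}
    dict. Each window's scores are weighted by exp(-decay * age), so recent
    windows dominate while older associations fade gradually. ASSUMPTION:
    the decay shape and rate are illustrative, not the paper's choice.
    """
    latest = max(windowed_rules)
    merged = {}
    for t, rules in windowed_rules.items():
        w = math.exp(-decay * (latest - t))      # age-based down-weighting
        for rule, score in rules.items():
            merged[rule] = merged.get(rule, 0.0) + w * score
    return merged
```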
7 Conclusion
This study presents a comprehensive book recommendation system based on ARL that effectively addresses personalized book discovery in large-scale digital libraries. Our system demonstrates that ARL, when appropriately designed and implemented, delivers accurate, interpretable, and efficient book recommendations. The transparent nature of association rules provides substantial benefits for both explaining recommendations and understanding user reading behavior.
Experimental results on a real-world dataset with nearly 6 million ratings validate the approach's effectiveness. The system's ability to generate recommendations rapidly (0.03 seconds per user) while maintaining high accuracy makes it suitable for deployment in production environments.
As digital library ecosystems continue to expand and users demand increasingly personalized services, recommendation systems will become an indispensable infrastructure. While recent breakthroughs in large language models and deep learning (Nawara and Kashef, 2025; Zhang et al., 2023) open new possibilities, our research demonstrates that traditional data mining techniques like ARL, when properly adapted and executed, remain powerful and explainable approaches for constructing practical recommendation systems that users can understand and trust (Zhang et al., 2024).
The proposed framework shows that interpretable ARL-based recommendations can perform competitively while remaining transparent and scalable. In practical applications such as digital libraries, online bookstores, or reading platforms, the system can enhance personalization, reader engagement, and the curation of thematic collections. Future enhancements include expanding long-tail coverage, integrating temporal modeling at the user level, and developing hybrid ARL-neural models that combine explainability with the contextual power of deep learning.
Statements
Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: https://cseweb.ucsd.edu/~jmcauley/datasets.html#goodreads.
Author contributions
AH: Conceptualization, Software, Methodology, Resources, Writing – original draft. SAA: Methodology, Investigation, Writing – original draft, Conceptualization, Formal analysis. EA: Formal analysis, Writing – review & editing, Investigation, Methodology. MSH: Writing – review & editing, Formal analysis, Validation, Supervision.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1
Adyatma, H., and Baizal, Z. (2023). Book recommender system using matrix factorization with alternating least square method. J. Inf. Syst. Res. 4, 1286–1292. doi: 10.47065/josh.v4i4.3816
2
Agrawal, R., and Srikant, R. (1994). "Fast algorithms for mining association rules in large databases," in Proceedings of the 20th International Conference on Very Large Data Bases, VLDB '94 (San Francisco, CA: Morgan Kaufmann Publishers Inc.), 487–499.
3
Bhajantri, A. K., Goudar, R. H., Kaliwal, R., Rathod, V., et al. (2024). Personalized book recommendations: a hybrid approach leveraging collaborative filtering, association rule mining, and content-based filtering. EAI Endor. Trans. Internet Things 10, 1–6. doi: 10.4108/eetiot.6996
4
Bobadilla, J., González-Prieto, A., Ortega, F., and Lara-Cabrera, R. (2021). Deep learning feature selection to unhide demographic recommender systems factors. Neural Comput. Appl. 33, 7291–7308. doi: 10.1007/s00521-020-05494-2
5
Burke, R. (2002). Hybrid recommendation systems: survey and experiments. User Model. User-Adapt. Interact. 12, 331–370. doi: 10.1023/A:1021240730564
6
Cheng, Y., Liu, Y., Chen, T., and Yang, Q. (2020). Federated learning for privacy-preserving AI. Commun. ACM 63, 33–36. doi: 10.1145/3387107
7
Deldjoo, Y. (2025). Understanding biases in ChatGPT-based recommender systems: provider fairness, temporal stability, and recency. ACM Trans. Recomm. Syst. 4, 1–35. doi: 10.1145/3690655
8
Gao, Y., Sheng, T., Xiang, Y., Xiong, Y., Wang, H., and Zhang, J. (2023). Chat-REC: towards interactive and explainable LLMs-augmented recommender system. arXiv preprint arXiv:2303.14524.
9
Geng, S., Liu, S., Fu, Z., Ge, Y., and Zhang, Y. (2022). "Recommendation as language processing (RLP): a unified pretrain, personalized prompt & predict paradigm (P5)," in Proceedings of the 16th ACM Conference on Recommender Systems (New York, NY: Association for Computing Machinery), 299–315. doi: 10.1145/3523227.3546767
10
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., and Chua, T.-S. (2017). "Neural collaborative filtering," in Proceedings of the 26th International Conference on World Wide Web (Republic and Canton of Geneva, CHE: International World Wide Web Conferences Steering Committee), 173–182. doi: 10.1145/3038912.3052569
11
Ho, T. D. H., and Nguyen, S. T. T. (2024). "Self-attentive sequential recommendation models enriched with more features," in Proceedings of the 2024 8th International Conference on Deep Learning Technologies, ICDLT '24 (New York, NY: Association for Computing Machinery), 49–55. doi: 10.1145/3695719.3695727
12
Jing, M., Zhu, Y., Zang, T., and Wang, K. (2023). Contrastive self-supervised learning in recommender systems: a survey. ACM Trans. Inf. Syst. 42, 1–39. doi: 10.1145/3627158
13
Karlović, R., Rovis, M., Smajić, A., Sever, L., and Lorencin, I. (2025). Context-aware tourism recommendations using retrieval-augmented large language models and semantic re-ranking. Electronics 14:4448. doi: 10.3390/electronics14224448
14
Koren, Y., Bell, R., and Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer 42, 30–37. doi: 10.1109/MC.2009.263
15
Kumar, R., Bala, P., and Mukherjee, S. (2020). A new neighbourhood formation approach for solving cold-start user problem in collaborative filtering. Int. J. Appl. Manag. Sci. 12, 123–145. doi: 10.1504/IJAMS.2020.106734
16
Li, H., and Sheu, P. (2021). A scalable association rule learning heuristic for large datasets. J. Big Data 8:86. doi: 10.1186/s40537-021-00473-3
17
Li, Y. (2025). "Application of deep learning-driven personalized recommendation algorithms in e-commerce marketing," in Proceedings of the 2025 2nd International Conference on Digital Economy, Blockchain and Artificial Intelligence, DEBAI '25 (New York, NY: Association for Computing Machinery), 488–493. doi: 10.1145/3762249.3762325
18
Li, Y., Chen, H., Xu, S., Ge, Y., Tan, J., Liu, S., et al. (2023). Fairness in recommendation: foundations, methods, and applications. ACM Trans. Intell. Syst. Technol. 14, 1–48. doi: 10.1145/3610302
19
Liu, Q., Hu, J., Xiao, Y., Zhao, X., Gao, J., Wang, W., et al. (2024). Multimodal recommender systems: a survey. ACM Comput. Surv. 57, 1–17. doi: 10.1145/3637841
20
Lu, J., Wu, D., Mao, M., Wang, W., and Zhang, G. (2015). Recommender system application developments: a survey. Decis. Support Syst. 74, 12–32. doi: 10.1016/j.dss.2015.03.008
21
Mustika, H. F., and Musdholifah, A. (2019). Book recommender system using genetic algorithm and association rule mining. Comput. Eng. Applic. J. 8, 103–114. doi: 10.18495/comengapp.v8i2.305
22
Nawara, D., and Kashef, R. (2025). A comprehensive survey on LLM-powered recommender systems: from discriminative, generative to multi-modal paradigms. IEEE Access 13, 145772–145798. doi: 10.1109/ACCESS.2025.3599832
23
Nuipian, V., and Chuaykhun, J. (2024). "Book recommendation system based on course descriptions using cosine similarity," in Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval, NLPIR '23 (New York, NY: Association for Computing Machinery), 273–277. doi: 10.1145/3639233.3639335
24
Pazzani, M. J., and Billsus, D. (2007). "Content-based recommendation systems," in The Adaptive Web, eds. P. Brusilovsky, A. Kobsa, and W. Nejdl (Cham: Springer), 325–341. doi: 10.1007/978-3-540-72079-9_10
25
Porcel, C., Tejeda-Lorente, A., Martínez, M., and Herrera-Viedma, E. (2012). A hybrid recommender system for the selective dissemination of research resources in a technology transfer office. Inf. Sci. 184, 1–19. doi: 10.1016/j.ins.2011.08.026
26
Quadrana, M., Karatzoglou, A., Hidasi, B., and Cremonesi, P. (2017). "Personalizing session-based recommendations with hierarchical recurrent neural networks," in Proceedings of the Eleventh ACM Conference on Recommender Systems, RecSys '17 (New York, NY: Association for Computing Machinery), 130–137. doi: 10.1145/3109859.3109896
27
Ricci, F., Rokach, L., and Shapira, B. (2021). "Recommender systems: techniques, applications, and challenges," in Recommender Systems Handbook, 1–35. doi: 10.1007/978-1-0716-2197-4_1
28
Sarwar, B., Karypis, G., Konstan, J., and Riedl, J. (2001). "Item-based collaborative filtering recommendation algorithms," in Proceedings of the 10th International Conference on World Wide Web, 285–295. doi: 10.1145/371920.372071
29
Sen, S., Mehta, A., Ganguli, R., and Sen, S. (2021). Recommendation of influenced products using association rule mining: Neo4j as a case study. SN Comput. Sci. 2:74. doi: 10.1007/s42979-021-00460-8
30
Varzaneh, H. H., Neysiani, B. S., Ziafat, H., and Soltani, N. (2018). "Recommendation systems based on association rule mining for a target object by evolutionary algorithms," in IEEE International Conference on System, Computation, Automation and Networking (ICSCA) (IEEE), 1–6. doi: 10.28991/esj-2018-01133
31
Verma, P., Anil, A., and S. I., V. (2025). "Recommendation system for books using graph neural networks," in 2025 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI), 1–6. doi: 10.1109/ICDSAAI65575.2025.11011901
32
Wang, W., Lin, X., Feng, F., He, X., and Chua, T.-S. (2023). Generative recommendation: towards next-generation recommender paradigm. arXiv preprint arXiv:2304.03516.
33
Wang, X., He, X., Wang, M., Feng, F., and Chua, T.-S. (2019). "Neural graph collaborative filtering," in Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '19 (New York, NY: Association for Computing Machinery), 165–174. doi: 10.1145/3331184.3331267
34
Wei, P., Shu, H., Gan, J., Deng, X., Liu, Y., Sun, W., et al. (2025). Sequential recommendation system based on deep learning: a survey. Electronics 14:2134. doi: 10.3390/electronics14112134
35
Wu, S., Tang, Y., Zhu, Y., Wang, L., Xie, X., and Tan, T. (2019). "Session-based recommendation with graph neural networks," in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, AAAI '19 (AAAI Press).
36
Yang, Z., Lai, L., Lin, X., Hao, K., and Zhang, W. (2021). "HUGE: an efficient and scalable subgraph enumeration system," in Proceedings of the 2021 International Conference on Management of Data, SIGMOD '21 (New York, NY: Association for Computing Machinery), 2049–2062. doi: 10.1145/3448016.3457237
37
Yu, J., Yin, H., Xia, X., Chen, T., Li, J., and Huang, Z. (2024). Self-supervised learning for recommender systems: a survey. IEEE Trans. Knowl. Data Eng. 36, 335–355. doi: 10.1109/TKDE.2023.3282907
38
Zakur, Y., and Flaih, L. (2023). "Apriori algorithm and hybrid apriori algorithm in the data mining: a comprehensive review," in E3S Web of Conferences, 448. doi: 10.1051/e3sconf/202344802021
39
Zhang, J., Tang, J., Chen, X., Yu, W., Hu, L., Jiang, P., et al. (2024). "Natural language explainable recommendation with robustness enhancement," in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (New York, NY: Association for Computing Machinery), 4203–4212. doi: 10.1145/3637528.3671781
40
Zhang, R., Liu, Q.-D., Chun-Gui, Wei, J.-X., and Huiyi-Ma (2014). "Collaborative filtering for recommender systems," in Second International Conference on Advanced Cloud and Big Data, 301–308. doi: 10.1109/CBD.2014.47
41
Zhang, Y., Zhang, Y., Yan, D., Deng, S., and Yang, Y. (2023). Revisiting graph-based recommender systems from the perspective of variational auto-encoder. ACM Trans. Inf. Syst. 41, 1–28. doi: 10.1145/3573385
42
Zhao, Z., Fan, W., Li, J., Liu, Y., Mei, X., Wang, Y., et al. (2024). Recommender systems in the era of large language models (LLMs). IEEE Trans. Knowl. Data Eng. 36, 6889–6907. doi: 10.1109/TKDE.2024.3392335
43
Zhu, Y., Wu, L., Guo, Q., Hong, L., and Li, J. (2024). "Collaborative large language model for recommender systems," in Proceedings of the ACM Web Conference 2024, WWW '24 (New York, NY: Association for Computing Machinery), 3162–3172. doi: 10.1145/3589334.3645347
Summary
Keywords
Apriori, association rule learning, book recommendation, collaborative filtering, market basket analysis
Citation
Hidri A, AlSaif SA, AlShehri E and Sassi Hidri M (2026) Scalable multi-metric association rule learning for explainable book recommendations. Front. Comput. Sci. 8:1779096. doi: 10.3389/fcomp.2026.1779096
Received
31 December 2025
Revised
21 February 2026
Accepted
13 March 2026
Published
07 April 2026
Volume
8 - 2026
Edited by
Athanasios Drigas, National Centre of Scientific Research Demokritos, Greece
Reviewed by
Chintoo Kumar, Gandhi Institute of Technology and Management, Bengaluru, India
Afrig Aminuddin, Universitas Amikom Yogyakarta, Indonesia
Copyright
© 2026 Hidri, AlSaif, AlShehri and Sassi Hidri.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Minyar Sassi Hidri, mmsassi@iau.edu.sa