ORIGINAL RESEARCH article

Front. Mater., 18 July 2025

Sec. Computational Materials Science

Volume 12 - 2025 | https://doi.org/10.3389/fmats.2025.1616233

Enhancing phase change thermal energy storage material properties prediction with digital technologies

  • 1. Institute of Refrigeration & Cryogenics Engineering, Dalian Maritime University, Dalian, China

  • 2. College of Ocean and Civil Engineering, Dalian Ocean University, Dalian, China

  • 3. China National Offshore Oil Corporation Limited, Tianjin, China

Abstract

Introduction:

In the field of materials science, the prediction of material properties plays a critical role in designing new materials and optimizing existing ones. Traditional experimental approaches, while effective, are resource-intensive and time-consuming, often requiring extensive trial-and-error methods. To address these limitations, the integration of digital technologies, such as computational modeling and machine learning (ML), has become increasingly important.

Methods:

This paper proposes a hybrid multiscale modeling framework that integrates molecular dynamics (MD) simulations, finite element methods (FEM) from continuum mechanics, and supervised ML algorithms—including deep neural networks and gradient boosting regressors—to enable accurate and efficient prediction of material properties across scales. The method integrates MD simulations for atomic-level interactions using Lennard-Jones and embedded-atom method (EAM) potentials, FEM-based continuum mechanics for stress-strain analysis and thermal response evaluation, and ML techniques trained on multiscale descriptors (e.g., bond energy, stress tensor, coordination number) to model nonlinear property relations and accelerate design iteration. Hierarchical feature fusion modules combine low-level atomistic descriptors with high-level continuum features.

Results:

Benchmark evaluations show improved performance in predicting elastic modulus, thermal conductivity, and phase transition temperature across five material classes. Our experimental results demonstrate that this integrated methodology outperforms conventional methods in both prediction speed and accuracy, particularly in complex or multicomponent systems.

Discussion:

This approach significantly reduces computational costs and accelerates material design workflows by predicting properties with high precision across a wide range of materials. It aligns with current trends in leveraging advanced digital technologies to enhance materials discovery, offering a robust, scalable, and extensible framework for the optimization and design of advanced materials in various industrial, technological, and scientific applications.

1 Introduction

The prediction of material properties plays a crucial role in materials science as it allows for the design and development of new materials with tailored characteristics for specific applications . With the increasing complexity of modern engineering requirements, traditional experimental methods for determining material properties can be time-consuming and costly. Furthermore, the inherent variability in material behavior under different conditions poses challenges in predicting performance accurately Shoghi and Hartmaier (2022). Therefore, the integration of digital technologies, including machine learning, deep learning, and computational models, offers a promising solution to enhance the prediction accuracy and efficiency. Not only can these methods reduce the need for extensive physical testing, but they also enable the exploration of vast material design spaces that were previously impractical, leading to faster innovation cycles and optimized material designs for a range of industries .

Initially, traditional approaches in material property prediction relied heavily on predefined rules and expert-crafted representations, which aimed to describe material behaviors through structured inference and fundamental physical principles Zhang et al. (2023). These rules often included deterministic formulations derived from classical mechanics, thermodynamics, and materials engineering–such as Hooke’s law, empirical yield criteria (e.g., von Mises, Tresca), or manually constructed phase diagrams. Expert-crafted representations typically involved hand-engineered features like lattice parameters, composition ratios, grain size metrics, or crystallographic orientation, which were encoded using domain knowledge to serve as inputs for early rule-based systems. These early systems often operated by referencing curated databases of known materials and applying rule-based logic or first-principles calculations . However, their effectiveness was strongly limited by an over-reliance on domain-specific knowledge and an inability to adapt to the irregularities of real-world material systems . Such approaches generally failed when confronted with unknown materials or conditions beyond the original design assumptions Sun et al. (2022).

As the complexity and diversity of material systems increased, new strategies emerged that emphasized pattern discovery through statistical learning. Instead of relying on explicitly programmed heuristics, these methods sought to identify correlations and infer predictive rules from empirical observations . Algorithms such as support vector machines and ensemble methods proved especially effective in navigating high-dimensional feature spaces and modeling nonlinear behavior. While this phase offered broader applicability and improved prediction accuracy, it also introduced new challenges–such as the demand for large, high-quality datasets and the difficulty in designing informative input features that capture the underlying physics of materials . Following this, research increasingly focused on automated representation learning, enabling models to derive predictive insights directly from raw or minimally processed inputs. This shift was characterized by the adoption of architectures capable of hierarchical abstraction, allowing models to internalize both local and global structures in material data . Convolutional and recurrent networks played foundational roles in this period, with their ability to model spatial and sequential dependencies, respectively. The later introduction of attention-based mechanisms further extended this capability, particularly in capturing long-range dependencies and contextual interactions in complex material systems .

In parallel, a growing emphasis has been placed on generalizability and scalability, leading to the incorporation of pretrained modules and modular design frameworks . These developments enable rapid adaptation to new material domains with reduced training data, while maintaining predictive robustness across diverse inputs . These material systems include, but are not limited to, high-entropy alloys, battery electrode materials (e.g., Li-ion cathodes), perovskite solar absorbers, structural polymers, and composite ceramics, each presenting unique challenges in terms of multiscale heterogeneity and nonlinear property-response behavior Stukhlyak et al. (2015). Nevertheless, several challenges persist, including computational overhead, interpretability of the resulting models, and the effective integration of multi-source information ranging from experimental measurements to simulation outputs . Current efforts aim to resolve these tensions by designing architectures that are both flexible and physically grounded, ensuring that predictive performance does not come at the expense of scientific insight .

In particular, multiscale modeling approaches that couple molecular dynamics (MD) and continuum mechanics are becoming increasingly relevant. MD simulations provide insight into atomic-scale interactions, dislocation movements, and phase evolution, while continuum mechanics enables the evaluation of stress, strain, and deformation at meso- and macro-scales through partial differential equations (e.g., finite element analysis). However, these two domains often operate independently, and integrating them with modern data-driven techniques remains an open challenge. Given the aforementioned limitations of current methods, we propose an approach that combines the strengths of symbolic reasoning and data-driven techniques to improve the prediction of material properties. Our method leverages a hybrid model that integrates knowledge-based principles with machine learning algorithms, allowing for both the interpretability of expert systems and the predictive power of modern data-driven models. Additionally, our framework explicitly integrates MD simulations for atomistic modeling and continuum mechanics (via finite element methods) for macroscopic response modeling, bridging the gap between physical fidelity and computational scalability. This hybrid approach can more effectively address the challenges of insufficient data, the complexity of material behavior, and the need for rapid, real-time predictions in practical applications. By doing so, we aim to push the boundaries of material property prediction, making it more efficient and widely applicable across different material domains.

The proposed method has several key advantages:

  • • Our method introduces a novel hybrid framework that integrates symbolic reasoning with machine learning, improving both prediction accuracy and interpretability.

  • •The framework incorporates molecular dynamics and finite element models, enabling integrated prediction across atomic and continuum scales.

  • •This approach is highly versatile, suitable for a range of material systems, and offers greater efficiency compared to purely data-driven models, particularly in data-scarce environments.

  • •Experimental results demonstrate that our model outperforms existing methods in terms of prediction accuracy, providing valuable insights into material behavior with less reliance on large datasets.

2 Related work

2.1 Machine learning for property prediction

Machine learning (ML) has emerged as a transformative tool in materials science, specifically for material properties Urdaneta-Ponte et al. (2021). Traditional methods for material characterization and property prediction often rely on experimental trials, which can be time-consuming and resource-intensive. Machine learning, particularly supervised learning, offers a promising alternative by using existing datasets of material properties to train models capable of predicting the properties of novel materials Shi et al. (2020). This approach is highly beneficial in accelerating the discovery of materials with tailored properties, such as those required for specific industrial applications .

Various ML algorithms, including decision trees, random forests, and neural networks, have been applied to material datasets to predict outcomes such as thermal conductivity, tensile strength, and electrical resistivity. For instance, developed a deep neural network that achieved a mean absolute error (MAE) of 0.058 eV/atom in formation energy prediction across 100,000 compounds from the Materials Project. Xie and Grossman (2018) used graph convolutional networks to predict band gaps of inorganic crystals with an MAE of 0.388 eV, outperforming traditional kernel regression. The accuracy of these predictions is heavily reliant on the quality and quantity of the data used for training the models . One of the key challenges in this area is the need for extensive and diverse datasets that encompass a wide range of materials and their properties.

To address this challenge, data-driven frameworks, such as high-throughput computational simulations and crowdsourced databases, are being developed. These platforms can provide large volumes of data that enable ML models to generalize better and produce more accurate predictions. Moreover, recent advances in deep learning techniques, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have shown promise in handling complex, high-dimensional material data, such as atomic-level information or microstructural features Yang et al. (2020). These techniques have been successfully applied to specific materials systems, including predicting the elastic modulus of polymers Wu et al. (2020) and the formation energies of perovskites .

To enhance predictive power, ML models can be coupled with optimization techniques to identify the optimal material compositions or processing conditions that maximize a particular property . By combining ML with experimental feedback, the iterative process of material discovery can be accelerated, reducing both the time and cost associated with traditional methods . Such active learning frameworks have demonstrated up to 30%–40% reductions in the number of experiments needed to reach desired performance thresholds in battery electrode design and high-entropy alloy selection Tran and Ulissi (2018). This intersection of ML and material science holds the potential to revolutionize material design and production, enabling the creation of advanced materials for a wide range of applications, from aerospace to renewable energy.

2.2 Computational materials science and simulations

Computational materials science has significantly contributed to the advancement of property prediction in materials science by providing virtual tools for simulating material behavior at the atomic and molecular levels . Simulations, based on quantum mechanical calculations and classical molecular dynamics, are increasingly used to predict material properties before they are physically synthesized Rocco et al. (2021). These computational approaches allow researchers to model the properties of novel materials without the need for expensive or time-consuming experimental procedures, which is particularly beneficial in the early stages of material design.

Quantum mechanical simulations, such as density functional theory (DFT), enable the calculation of electronic properties like band gaps, conductivity, and stability at the atomic scale . For example, DFT calculations have been widely used to identify stable phases in lithium-ion battery cathode materials such as LiCoO2 and LiFePO4, and to screen band gap tunability in hybrid perovskites Zunger (2018); . These methods are crucial in understanding how the atomic structure of a material influences its macroscopic properties, such as hardness, magnetism, and elasticity.

On the other hand, molecular dynamics (MD) simulations are used to study the material’s behavior at higher scales, including thermal and mechanical properties, by modeling the interactions between atoms over time . Recent MD-based studies have accurately predicted the thermal conductivity of graphene nanoribbons Xu et al. (2014) and crack propagation in metallic glasses . Through these techniques, researchers can explore the effects of various external conditions on material performance and predict how these materials will behave under real-world conditions .

A significant advancement in computational materials science is the integration of these simulation techniques with machine learning models. By combining the predictive power of ML with the detailed atomic-level insights provided by computational simulations, researchers can more effectively narrow down the vast search space of potential materials . This hybrid modeling has been implemented in systems such as polymer dielectrics Yao et al. (2021) and phase-change alloys Vasudevan et al. (2021), achieving higher prediction accuracy and physical fidelity compared to either method alone. This hybrid approach, often referred to as “materials informatics,” is being used to guide experimental efforts and streamline the discovery of new materials. Moreover, as computational power continues to grow, simulations are becoming increasingly sophisticated, allowing researchers to predict more complex material properties and design advanced materials with unprecedented precision.

2.3 Data-driven materials discovery platforms

The development of data-driven platforms for materials discovery has gained significant traction in recent years. These platforms harness large-scale datasets, advanced data mining techniques, and powerful computational tools to automate the discovery of novel materials. By compiling data from both experimental and computational sources, these platforms provide a wealth of information that can be used to predict material properties and suggest new material candidates for further study .

Key to the success of these platforms is the ability to integrate diverse datasets from various domains, such as chemistry, physics, and engineering, to create a comprehensive understanding of material behavior. A primary goal of these platforms is to reduce the time and cost associated with traditional materials discovery methods . With the aid of high-throughput experimental techniques and computational simulations, researchers can generate massive datasets of material properties, which can then be used to train machine learning models.

These models can subsequently identify relationships between material structure, composition, and properties, leading to the discovery of new materials with desired characteristics . For example, the Materials Project currently hosts data for over 140,000 inorganic compounds, while the Open Quantum Materials Database (OQMD) has facilitated the identification of thousands of stable ternary compounds using automated DFT workflows Saal et al. (2013).

To support property prediction, these platforms can also facilitate the optimization of materials for specific applications. By integrating optimization algorithms, researchers can identify the optimal material compositions, processing conditions, and manufacturing techniques to achieve targeted performance Yadalam et al. (2020). A notable example includes Citrination, which supports the development of thermoelectric materials by correlating Seebeck coefficient, electrical conductivity, and carrier concentration using ML-augmented searches Ward et al. (2016).

The integration of artificial intelligence (AI) and big data analytics within these platforms is further enhancing their capabilities, allowing for faster and more accurate predictions. As the use of data-driven discovery platforms grows, they are expected to play a pivotal role in the development of next-generation materials for applications ranging from energy storage to electronic devices.

3 Methods

3.1 Overview

Materials modeling involves the creation of mathematical models and simulations that describe the behavior of materials under various conditions. These models serve as the cornerstone for predicting material properties and guiding the design of new materials with specific characteristics. The field is vital for a wide array of industries, including manufacturing, aerospace, energy, and electronics. The objective of materials modeling is to provide accurate predictions of material behavior across a range of environments and scales, from atomic to macroscopic levels.

In Section 3.2, materials modeling integrates principles from various disciplines, including physics, chemistry, and engineering, to create predictive models that can simulate the physical and mechanical behavior of materials. The models range from atomic-scale simulations, which involve quantum mechanics and molecular dynamics, to continuum models that address bulk properties such as elasticity, plasticity, and thermal conductivity. In Section 3.4, the power of materials modeling lies in its ability to bridge the gap between theoretical understanding and practical application. By employing computational methods, researchers simulate material properties without the need for extensive experimental testing. This capability accelerates the material discovery process, reduces costs, and minimizes the reliance on trial and error in experimental setups. Section 3.5 introduces the fundamental aspects of materials modeling, focusing on the methodologies used to simulate material behavior and the various types of models that are commonly employed. In the following subsections, we will explore different modeling approaches, …including molecular dynamics (MD) simulations, continuum models, and machine learning (ML) techniques, all of which contribute to the broader field of materials science. We will also discuss the role of computational power in advancing materials modeling and the challenges that come with the complexity of simulating multi-scale material behaviors.

3.2 Preliminaries

In this section, we define the key concepts and mathematical formulations required to address the problem of materials modeling. Our approach considers the behavior of materials across different scales, ranging from atomic to macroscopic levels. The objective is to formulate the problem in a way that can be modeled using computational techniques, enabling the prediction of material properties under various environmental conditions.

At the atomic scale, we model materials by considering the interactions between atoms or molecules. The interactions are governed by potentials, such as the Lennard-Jones potential, which describe the forces between atoms as a function of their separation distance. Let represent the position of atom in a system of atoms. The potential energy of the system can be expressed as (Equation 1):where is the pairwise potential between atoms and . For instance, the Lennard-Jones potential is given by (Equation 2):where and are parameters specific to the material, and is the distance between atoms and .

The forces acting on each atom are derived from the negative gradient of the potential with respect to its position (Equation 3):

By solving the equations of motion for each atom, such as Newton’s second law , the system’s behavior can be simulated over time, providing insights into the material’s properties.

At larger scales, materials are modeled as continuous media, and their behavior is described by field equations such as the Navier-Cauchy equations for elasticity (Equation 4):where is the material density, is the displacement field, and is the stress tensor, which is a function of the strain, typically defined as (Equation 5):where is the material’s stiffness matrix and is the strain tensor, defined by (Equation 6):

The behavior of materials is also governed by thermodynamic principles. The internal energy of a material can be expressed as a function of temperature and entropy . The first law of thermodynamics states that (Equation 7):where is the pressure and is the volume. The free energy is given by (Equation 8):

And the condition for equilibrium is typically obtained by minimizing the Helmholtz free energy with respect to relevant parameters.

To simulate the behavior of materials effectively, multiscale models that bridge atomic, mesoscopic, and continuum scales are often required. These models employ methods like MD for atomistic simulations and finite element analysis (FEA) for continuum-level simulations. One common approach is to use coarse-grained models that approximate the behavior of large numbers of atoms or molecules by grouping them into larger units. The interaction between these units is then modeled at a higher level, such as through interatomic potentials or homogenized material properties.

ML techniques have been applied to materials modeling to predict material properties from large datasets. For example, supervised learning can be used to map material descriptors, such as composition and structure, to their corresponding properties. Let represent the feature vector for a given material and represent the target property. A model is trained by minimizing a loss function (Equation 9):where is the predicted value based on the input features and model parameters .

The following subsections will explore these modeling approaches in greater detail, with a particular focus on the methodologies that enable the effective simulation and design of materials across different scales.

3.3 Dimension normalization and thermodynamic constraint integration

To address the issue of numerical imbalance introduced by the direct concatenation of physical quantities with disparate dimensions–such as bond energy (eV), stress (GPa), and atomic volume (Å3)–we adopt a two-fold strategy involving dimension-wise normalization and thermodynamic constraint-aware feature construction:

3.3.1 Unit-wise Z-score normalization

Each physical quantity is standardized independently according to its unit group using (Equation 10):where and are the mean and standard deviation computed over the dataset for the specific unit (e.g., all features in eV, GPa, etc.). This prevents unit scale disparity from dominating the learning process.

3.3.2 Thermodynamic constraint encoding

We introduce physically informed features such as:

Formation enthalpy normalized by thermal energy scale , i.e., . Ratios like elastic modulus to cohesive energy: , indicating mechanical stability. Pugh’s ratio , capturing ductility vs. brittleness trends. Temperature-adjusted free energy corrections where available.

3.3.3 Dimensionless feature augmentation

Additional derived, unit-invariant descriptors are computed based on thermodynamic theory and solid-state physics, such as (Equation 11):

And cohesive-to-elastic moduli scaling factors to enhance physical interpretability.

This preprocessing pipeline ensures numerical stability, dimensional consistency, and the preservation of physically meaningful correlations in the feature space.

3.4 Innovations in multiscale modeling (IIMM)

In this section, we highlight the key innovations of our proposed model, designed to address core challenges in multiscale materials modeling. Throughout the following subsections, we refer to MD as molecular dynamics and ML as machine learning, as defined earlier. For clarity and to avoid redundancy, we do not repeat these full forms. More importantly, this section introduces specialized acronyms such as GFM (Global Fusion Module) and MAFM (Multiscale Attention Fusion Module), which are now fully defined in the text below to ensure reader understanding. By integrating atomistic simulations, continuum mechanics, and data-driven learning, the model presents a unified and efficient framework for material property prediction and design (As shown in Figure 1).

FIGURE 1

In the proposed Innovations in Multiscale Modeling (IIMM) framework, two custom modules are introduced: GFM (Global Fusion Module) and MAFM (Multiscale Attention Fusion Module). The GFM module performs global context aggregation via global average pooling and a fully connected projection layer to capture overall descriptor patterns. The MAFM module applies a multiscale attention mechanism to highlight locally important features from different modeling levels, enabling robust cross-scale fusion. These modules are inspired by attention mechanisms used in Transformer architectures and help capture complex hierarchical dependencies in material behaviors.

3.4.1 Integrated multiscale architecture

The proposed model establishes a unified multiscale architecture that tightly integrates molecular dynamics (MD) simulations, continuum mechanics, and machine learning (ML) techniques to enable comprehensive prediction and analysis of material behavior across scales (As shown in Figure 2). The Integrated Multiscale Architecture includes common machine learning components, such as LayerNorm (Layer Normalization), which normalizes activations across features to stabilize and accelerate training . The GELU (Gaussian Error Linear Unit) is used as a nonlinear activation function that retains negative input values with smooth probabilistic weighting, improving convergence over standard ReLU . The core of this architecture is the Transformer Encoder, originally proposed in Vaswani et al. (2017), consisting of multi-head self-attention, feedforward layers, residual connections, and normalization. It enables the model to extract global relationships among descriptors across modeling levels, which is critical for learning effective multiscale representations.

FIGURE 2

At the atomic level, MD simulations provide a detailed understanding of interatomic forces and thermal vibrations. The potential energy of the atomic system is modeled using a pairwise potential function (Equation 12):where denotes the Lennard-Jones potential capturing van der Waals interactions. The atoms follow Newton’s second law of motion (Equation 13):which is numerically integrated to simulate time evolution. The macroscopic properties emerging from these microscopic dynamics are upscaled using continuum mechanics. The strain tensor and stress tensor are defined respectively by (Equations 14, 15):where is the displacement field and is the elasticity tensor. These continuum descriptors provide boundary and initial conditions informed by atomistic simulations and real-world loading scenarios. To enhance predictive capability and computational efficiency, outputs from both MD and continuum models are fed into an ML model trained on large datasets. The ML component minimizes the loss (Equation 16):where represents the predicted material property, includes features from all scales, and denotes learnable parameters. This hierarchical structure allows the model to generalize across different materials and loading conditions, offering a scalable solution for materials design and optimization while preserving the physical interpretability of each modeling layer.

3.4.2 Data-Driven Feature Fusion

A novel aspect of the proposed framework is its hierarchical fusion of descriptors derived from different modeling scales, enabling a unified feature space that enhances the predictive capabilities of the machine learning (ML) model.

At the atomistic level, features such as bond energy , atomic displacement vectors , and atomic diffusivity are extracted through molecular dynamics (MD) simulations (Equation 17):

At the continuum level, stress and strain tensors are obtained using finite element simulations. Instead of re-stating classical elasticity theory, we extract high-level continuum descriptors, including von Mises stress and effective elastic modulus , directly from simulation outputs.

To bridge nanoscale and macroscale representations, we design a latent embedding network that projects atomistic and continuum features into a shared space. The resulting multiscale descriptor vector is (Equation 18):

This latent code is then used for downstream property prediction through a regression model , trained via (Equation 19):

To enforce physical consistency across scales, we introduce a cross-scale regularization term , which ensures that aggregated atomic-level stress aligns with continuum-level stress predictions (Equations 20, 21):

The final training objective becomes:where controls the strength of cross-scale alignment. This hybrid loss encourages the model to learn latent representations that reflect both nanoscale mechanisms and macroscale material responses.

Overall, this multiscale learning mechanism enables our model to unify descriptors across length scales, propagate physical constraints, and produce generalizable predictions across diverse material classes.

3.5 Integrated strategy for materials modeling

We propose an integrated strategy to tackle complex materials modeling challenges by unifying atomistic simulations, continuum mechanics, and machine learning within a cohesive framework. This approach enhances predictive accuracy, adaptability, and computational efficiency across diverse material systems and conditions (As shown in Figure 3). To facilitate replication of our approach, we clarify the key implementation details and reference architectures: - GFM and MAFM structures are adapted from the squeeze-and-excitation networks and non-local neural networks Wang et al. (2018), respectively, to suit materials modeling fusion tasks. - LayerNorm and GELU follow standard implementations available in PyTorch and TensorFlow libraries. - The Transformer Encoder follows the original structure in “Attention is All You Need” Vaswani et al. (2017) and is adapted for multiscale descriptor encoding, similar to approaches in molecular property prediction Schütt et al. (2018), .

FIGURE 3

3.5.1 Multiscale simulation coupling

The proposed multiscale modeling framework systematically incorporates three principal scale levels: (1) Microscale–capturing atomic-level interactions such as lattice configuration, interatomic bonding, and electronic density derived from DFT-calculated descriptors; (2) Mesoscale–describing grain boundaries, phase morphology, and microstructural heterogeneity, typically represented via voxel-based 3D grids or point cloud encodings; and (3) Macroscale–relating to continuum-level behaviors including elastic/plastic deformation, thermal conductivity, and impact response characteristics, derived from experimental data or FEM simulations. At each scale, appropriate computational representations and learning modules are employed. Microscale descriptors are fed into graph neural networks to capture fine-grained atomic topology, while mesoscale encodings are handled via convolutional and attention-based architectures to extract shape and neighborhood features. Macroscale inputs are processed using sequential and statistical encoders that model stress-strain relationships and dynamic loading behavior. The outputs from each scale-specific encoder are fused via a learned coupling mechanism that preserves their physical relevance and allows cross-scale interactions to emerge hierarchically during model training. This hierarchical multiscale abstraction enables the IIMM model to effectively learn from diverse data representations while maintaining consistency with the underlying physics across all levels. A key innovation in our framework lies in the direct and dynamic coupling between atomistic simulations, such as molecular dynamics (MD), and continuum-level models, enabling the consistent exchange of information across scales (As shown in Figure 4). All components are implemented in PyTorch. We use the Adam optimizer with weight decay, early stopping based on validation loss, and learning rate warm-up strategies to ensure stable training. Full implementation code will be released upon publication to support reproducibility.

FIGURE 4

Traditional multiscale methods often treat atomistic and continuum models separately, passing data only at initialization or post-processing stages, which limits responsiveness and fails to capture transient phenomena. In contrast, our approach introduces a bidirectional mapping function that transforms atomistic descriptors–such as atomic positions , bond energies , and local temperature –into continuum field variables like displacement , stress , and strain (Equations 22, 23):where is the fourth-order elasticity tensor. This mapping is not merely geometric but incorporates energy and force information from MD into the mechanical response of the material.

To resolve the temporal disparity between MD (femtosecond scale) and continuum (second scale) simulations, we adopt a hierarchical coupling scheme that aggregates MD outputs over temporal windows via statistical encoders (e.g., moving averages, fluctuation magnitudes, spectral coefficients) to align with the continuum time resolution. Conversely, continuum fields are temporally interpolated and corrected to guide MD evolution, enabling consistent feedback across asynchronous time steps. This coupling is embedded in via temporal fusion layers that are jointly trained with the full system.

On the spatial interface, we address mismatches at the atomistic-continuum boundary using a ghost-node blending mechanism: MD interface atoms are surrounded by auxiliary virtual atoms influenced by nearby continuum stress and displacement fields, while finite element (FEM) mesh elements near the interface are dynamically corrected using localized atomic stress tensors. A spatial interface loss penalizes inconsistencies in overlapping boundary regions, ensuring smooth transitions and physical consistency across scales. These strategies enhance cross-scale stability, suppress interface artifacts, and allow transient and localized phenomena to be captured more faithfully throughout the multiscale pipeline. Simultaneously, continuum-scale stress updates and temperature gradients are used to adjust boundary conditions or driving forces in the MD simulation through a feedback function (Equation 24):where denotes the modified atomic forces incorporating macroscopic influences. To ensure numerical stability and physical fidelity during scale bridging, we adopt an energy-conserving coupling scheme where the total energy of the system is partitioned as (Equation 25)with accounting for the shared energy in the transition region to avoid double-counting. Furthermore, we define a weighted interpolation function over the domain such that atomistic resolution is preserved in regions of high gradient while continuum descriptions are used elsewhere. This is mathematically expressed as (Equation 26):where and are displacement fields from atomistic and continuum models, respectively. This tightly integrated, real-time exchange between scales allows our model to adaptively allocate computational resources, maintain accuracy near critical regions, and significantly reduce the need for redundant computations, making it suitable for modeling complex phenomena such as crack propagation, phase transitions, and plastic deformation with high fidelity and efficiency.

3.5.2 Unified Multiphysics Features

To enable effective machine learning (ML) predictions across scales, we construct a unified, high-dimensional feature vector that encapsulates rich descriptors from both atomistic and continuum domains. This feature representation integrates key physical quantities that govern material behavior, thereby bridging microscopic mechanisms with macroscopic responses. At the atomistic level, we extract local energy contributions, including bond energy , coordination number , and atomic displacement vectors . From the continuum side, we incorporate stress , strain , and thermal gradients , which capture deformation and heat transfer phenomena. The combined feature vector is formally expressed as (Equation 27):

Providing a compact yet expressive encoding of the material state. These descriptors are computed from simulation outputs or experimental data and undergo normalization to ensure consistency across scales. Stress and strain tensors are transformed into invariant scalar forms using the von Mises equivalent stress and strain (Equation 28):where and are the deviatoric components of stress and strain, respectively. Thermal effects are captured through the spatial temperature gradient, which influences diffusion and phase behavior. The local temperature profile is derived via (Equation 29):

Computed from either finite-difference schemes or analytical fits to simulation data. To capture interatomic distortions, the displacement field is included as:where denotes equilibrium positions. The bond energy is averaged across local environments to provide a robust scalar input:where is the number of bonds in the local atomic neighborhood. This comprehensive and scale-aware feature encoding allows ML models to capture nonlinear correlations between structural, mechanical, and thermal factors–facilitating highly accurate predictions of target properties such as fracture toughness, diffusion coefficients, or elastic moduli with minimal additional simulation cost.

3.5.3 ML-based property prediction

At the core of our framework lies a supervised machine learning (ML) model designed to predict key material properties –such as yield strength, thermal conductivity, or phase stability–based on the unified feature vector . This approach leverages high-fidelity simulation or experimental data to train a parametric model , where represents the learnable parameters. The training process minimizes the mean squared error (MSE) between predicted and true property values (Equation 32):where is the number of training samples. To prevent overfitting and promote model generalization, we introduce an regularization term (also known as weight decay), yielding the modified loss (Equation 33):where is the regularization strength. Depending on the complexity of the problem, different model architectures such as deep neural networks (DNNs), Gaussian processes, or gradient boosting machines may be employed. For DNNs, the feature vector passes through multiple hidden layers with nonlinear activations, capturing high-order interactions among input descriptors. To guide the model toward informative regions of the feature space, an active learning strategy can be integrated, whereby new data points are selected based on uncertainty estimates. A common selection criterion is based on predictive variance (Equation 34):

Prioritizing regions with high uncertainty for additional simulation or experimental validation. This loop can be iterated, allowing the model to refine itself over time with minimal additional data. In the inference phase, once trained, the model enables rapid property predictions with minimal computational overhead. To evaluate performance, statistical metrics such as coefficient of determination and mean absolute error (MAE) are computed. For instance, the score is defined as (Equation 35):where is the mean of the true labels. When multiple material properties are predicted simultaneously, the model is optimized with a composite loss function (Equation 36):where is the number of tasks, is the loss for task , and are task-specific weights. This multi-objective formulation is particularly useful for balancing trade-offs between competing material design criteria.

4 Dataset

The AFLOW Dataset is a comprehensive repository of materials properties generated using high-throughput ab initio calculations. It contains data on electronic structure, mechanical, and thermal properties for millions of materials, including both experimentally known and hypothetical compounds. The AFLOW (Automatic FLOW) framework enables systematic and reproducible density functional theory (DFT) computations, and its database supports tasks such as crystal structure prediction, materials screening, and inverse design. Due to its scale and detail, AFLOW is widely adopted for materials informatics and machine learning applications in computational materials science. The dataset includes standardized metadata, symmetry analysis, and topology recognition modules, which are particularly valuable for supervised and unsupervised learning models in predicting phase transitions, property trends, and synthesis pathways. The Open Quantum Materials Database (OQMD) is a curated database that includes over 500,000 DFT-calculated materials properties for a wide variety of compounds and alloy systems. It provides formation energies, crystal structures, and phase stability information for materials across the periodic table. The OQMD is especially notable for its comprehensive coverage of both real and hypothetical materials and is designed to support the discovery of novel compounds and the analysis of thermodynamic stability. It supports multiple exchange-correlation functionals and calculation protocols, offering diverse data for benchmarking and model generalization. Its utility extends to both academic and industrial applications in materials discovery and energy storage, particularly for screening of solid electrolytes and multicomponent alloys. The standardized DFT workflows in OQMD are also well-suited for training robust regression models and generative models for structure-property prediction. The JARVIS Dataset Sandur et al. (2022) (Joint Automated Repository for Various Integrated Simulations) is a rich collection of datasets developed by the National Institute of Standards and Technology (NIST) for materials design and discovery. It includes computed properties such as bandgaps, dielectric constants, elastic tensors, and formation energies, using both DFT and machine learning models. JARVIS emphasizes reproducibility and standardization, and it covers both bulk materials and 2D materials. The dataset also includes spin-orbit coupling effects and many-body dispersion corrections, making it highly relevant for modeling quantum phenomena and advanced solid-state systems. The inclusion of computational workflows such as GW calculations and machine-learned interatomic potentials extends its utility for multi-scale modeling. The dataset has gained significant popularity for benchmarking algorithms in quantum materials design and is also widely used for developing predictive models and transfer learning strategies in materials informatics, particularly in 2D material screening and defect engineering. The Materials Project Dataset Vecchio and Deschaintre (2024) is one of the most widely used resources in computational materials science, providing open-access DFT-calculated properties for over 140,000 inorganic compounds. It includes structural data, band structures, elastic moduli, and various thermodynamic properties. The Materials Project offers a user-friendly interface and robust API access, supporting researchers in querying and analyzing materials for a wide array of applications including battery design, catalysis, and photovoltaics. Its integration with tools like pymatgen, FireWorks, and custodian has made it a foundational platform for data-driven materials research and automated high-throughput workflows. It provides open-source repositories for workflow management and error correction, which are widely adopted in academic and industrial research. The dataset is also frequently used in graph neural network training and structure-based representation learning, enhancing its value for deep learning-based materials prediction pipelines.

5 Experimental setup

5.1 Experimental details

In this section, we describe the overall configuration and procedures used to train and evaluate the proposed IIMM model.

We adopt a 10-fold cross-validation strategy and additionally split each dataset into training (70%), validation (15%), and testing (15%) subsets, ensuring class balance through stratified sampling where applicable. Model checkpoints are selected based on the best validation F1-score, and the final evaluation is reported on the held-out test set. All results are averaged over three runs with different random seeds to mitigate variance due to random initialization. The training process uses a batch size of 32, with an initial learning rate of 0.001. We employ the Adam optimizer with default , for all experiments. The learning rate is decayed by a factor of 0.1 every 30 epochs, and training is stopped after 100 epochs or earlier if validation performance plateaus.

Hyperparameter tuning is performed using grid search over learning rate {0.01, 0.001, 0.0005}, batch size {16, 32, 64}, and L2 regularization strength {0, 1e-4, 5e-4}. Weights yielding the highest mean validation F1-score across cross-validation folds are selected. To evaluate model generalizability, we conduct cross-dataset transfer experiments: training on three datasets and testing on the fourth. This allows us to assess robustness across domains and data representations. Data Preprocessing and Modalities:

  • - AFLOW, OQMD: Input features are voxelized 3D representations derived from crystal structures. Rotational augmentation and symmetry-preserving transformations based on space group analysis are applied to capture crystallographic invariants.

  • - JARVIS: Stereo RGB images and LiDAR point clouds are pre-processed using statistical outlier removal and voxel downsampling. Point clouds are converted into structured tensors via multi-scale neighborhood encoding. Additional features such as reflectance intensity and depth gradients are included.

  • - Materials Project: Depth maps and RGB images are used for semantic segmentation. Inputs are normalized, cropped, and augmented using flipping, rotation, color jittering, and elastic distortion. A dual-branch encoder processes RGB and depth separately, fusing them at multiple levels.

Data augmentation techniques such as elastic noise, Gaussian jittering, and random spatial transforms are applied consistently across datasets to improve robustness.

Evaluation Metrics: Depending on the task, we use accuracy, precision, recall, F1-score, mIoU (for segmentation), and mAP (for object detection with different IoU thresholds).

All experiments are implemented in PyTorch and executed on an NVIDIA Tesla V100 GPU with 256GB RAM and dual Intel Xeon CPUs. Experiment tracking is managed with Weights & Biases and Hydra, ensuring reproducibility of configurations and logging. All code and configuration files are archived for future release.

5.2 Comparison with SOTA methods

In this section, we compare the performance of our proposed method with several state-of-the-art (SOTA) methods across different datasets. The evaluation is based on key performance metrics such as Precision, Recall, F1 Score, and Area Under the Curve (AUC). The results are shown in Tables 1, 2 for the AFLOW Dataset, OQMD Dataset, JARVIS Dataset, and Materials Project Dataset.

TABLE 1

ModelAFLOW datasetOQMD dataset
PrecisionRecallF1 ScoreAUCPrecisionRecallF1 ScoreAUC
NCF Zhang et al. (2021a)80.120.0278.940.0379.100.0287.250.0185.570.0383.110.0284.350.0389.440.02
AutoRec 81.470.0379.030.0180.250.0286.510.0384.090.0282.920.0383.010.0388.700.01
NeuralCF 79.980.0277.530.0178.100.0286.930.0383.730.0381.500.0282.200.0189.100.02
DeepRec Zhang et al. (2021b)82.190.0180.240.0281.120.0288.080.0285.400.0284.050.0384.750.0290.050.02
VGG-Rec Yang et al. (2021)79.60078.220.0278.890.0385.630.0286.020.0383.710.0184.120.0288.970.03
GraphRec 83.300.0281.400.0381.750.0289.110.0184.910.0382.250.0283.100.0188.340.02
Ours (IIMM)85.750.0384.500.0285.100.0290.180.0287.890.0285.230.0186.320.0291.030.01

Comparison of Ours with SOTA methods AFLOW Dataset and OQMD Dataset for IIMM Systems.

The values in bold are the best values.

TABLE 2

ModelJARVIS datasetMaterials project dataset
PrecisionRecallF1 ScoreAUCPrecisionRecallF1 ScoreAUC
NCF Zhang et al. (2021a)76.120.0374.870.0275.450.0282.430.0379.100.0277.220.0378.130.0284.670.02
AutoRec 78.220.0275.920.0376.470.0283.710.0281.340.0379.500.0280.140.0385.210.02
NeuralCF 77.080.0274.960.0375.550.0182.970.0278.500.0276.350.0277.420.0283.840.03
DeepRec Zhang et al. (2021b)80.590.0178.240.0379.000.0285.290.0382.760.0281.220.0281.990.0286.150.02
VGG-Rec Yang et al. (2021)75.640.0373.920.0274.280.0382.150.0280.510.0279.050.0279.790.0383.960.02
GraphRec 79.440.0277.380.0278.290.0283.880.0181.020.0279.830.0380.410.0284.100.01
Ours (IIMM)82.340.0380.190.0281.150.0286.870.0283.550.0381.800.0182.650.0287.230.03

Comparison of Ours with SOTA methods on JARVIS Dataset and Materials Project Dataset for IIMMs.

The values in bold are the best values.

We observe that our method outperforms all existing methods across all metrics on both the AFLOW Dataset and OQMD Dataset. IIMM achieves the highest Precision, Recall, F1 Score, and AUC, demonstrating superior performance in recommendation systems for 3D data. The precision for the AFLOW Dataset is 85.750.03 and for the OQMD Dataset is 87.890.02, both of which are the highest compared to the SOTA methods. IIMM also excels on the JARVIS Dataset and Materials Project Dataset. On the JARVIS Dataset, our method achieves a Precision of 82.340.03 and an AUC of 86.870.02, outperforming the other methods by a significant margin. For the Materials Project Dataset, IIMM achieves a Precision of 83.550.03 and an AUC of 87.230.03, further proving its superiority in real-world scenarios. These results consistently show that IIMM not only outperforms traditional methods but also achieves more stable performance across different types of datasets, demonstrating its versatility and adaptability in handling diverse recommendation tasks. Furthermore, the low standard deviations across multiple experimental runs underscore the model’s robustness and consistency, which are essential for large-scale material discovery and deployment. The ability to maintain high performance across domains of varying data complexity confirms the generalization capability of our architecture. Unlike prior methods that may overfit or underperform when applied to out-of-domain data, IIMM preserves its predictive strength under cross-dataset evaluation.

The remarkable performance of IIMM can be attributed to its ability to effectively leverage the underlying structure of the data and its robust feature extraction mechanisms. The inclusion of advanced recommendation techniques, such as data-driven feature fusion and multiscale coupling, along with the fine-tuning of hyperparameters, contribute to its outstanding results across all the datasets evaluated. These advancements allow the model to better capture the complex relationships between materials and their properties, making it highly effective in practical applications. The joint encoding of physical, structural, and simulation-derived features enables the model to go beyond surface-level correlations and model deeper, non-linear interactions that are often critical in scientific domains. These results firmly establish IIMM as a leading approach for recommendation systems in 3D and real-world data.

To further assess the practical utility of our integrated multiscale modeling approach, we conducted a supplementary experiment focused on the impact response of epoxy-based composite materials. This scenario is inspired by real-world requirements in aerospace structures, where accurate prediction of mechanical behavior under dynamic loading is critical. Table 3 presents a detailed comparison between the predicted properties obtained from different modeling approaches and the actual values measured through physical experiments. Three models were evaluated: a baseline finite element method (FEM), a traditional multiscale coupled model, and our proposed IIMM-enhanced framework. The baseline FEM predicted a peak stress of 112.4 MPa and energy absorption of 18.9 J, showing a moderate alignment with experimental values but lower overall accuracy (88.3%). The multiscale coupled model demonstrated improved performance, with predicted peak stress reaching 119.6 MPa and an accuracy of 94.7%. In contrast, our IIMM-enhanced model achieved the closest agreement with the experimental data, predicting 122.3 MPa peak stress, 4.15% fracture strain, and 22.9 J energy absorption, with an accuracy of 96.5%. The experimental results show that the actual peak stress reached 122.9 MPa with an average fracture strain of 4.10%, indicating that our model not only provides high predictive precision but also reliably captures the nonlinear deformation mechanisms inherent in composite materials. Additionally, the low standard deviation (0.8) across repeated experimental trials confirms the consistency of the measurements. These results validate the capability of our model to solve practical material design problems, highlighting its value in real-world applications such as impact-resistant structural components in aerospace engineering.

TABLE 3

MethodPredicted results (model)Measured results (experiment)
Peak Stress (MPa)Fracture Strain (%)Energy Absorption (J)Accuracy (%)Peak Stress (MPa)Fracture Strain (%)Energy Absorption (J)Std Dev
Baseline FEM 112.43.8218.988.3115.23.9519.51.1
Multiscale Coupled 119.64.0221.794.7120.14.0021.50.9
Ours (IIMM-Enhanced)122.34.1522.996.5122.94.1022.60.8

Modeling vs. Experimental Comparison on Epoxy Composite Properties in Impact Loading.

The values in bold are the best values.

Table 4 presents a comparative physics-informed error analysis across three material classes: crystalline solids, multicomponent alloys, and amorphous materials. The mean absolute error (MAE) values demonstrate that our proposed model significantly outperforms baseline methods across all material types. Notably, the largest improvement is observed for multicomponent alloys, with a 33.2% reduction in MAE, highlighting the model’s capacity to handle chemically complex systems. The score improvements indicate better predictive alignment with ground truth values. However, for amorphous materials, the score remains relatively low (0.71), even though it improves over the baseline. This reflects the inherent modeling challenges posed by the lack of long-range order in amorphous systems, which often degrade ML model generalization. To further probe the physical underpinnings of prediction errors, we computed the correlations between atomic defect density and prediction errors. The amorphous group exhibits a notably higher Pearson correlation coefficient , underscoring the significant role of local atomic disorder in influencing prediction uncertainty. This evidence substantiates the importance of incorporating defect-sensitive features in machine learning models, particularly for non-crystalline systems.

TABLE 4

Material typeMAE (GPa) scoreDefect-prediction error correlation
BaselineOursBaselineOursPearsonSpearmanKendall
Crystalline1.230.040.880.03−28.5%0.870.92+5.7%0.420.390.28
Multicomponent Alloys2.470.061.650.05−33.2%0.750.83+10.7%0.510.480.35
Amorphous Materials3.020.082.440.07−19.2%0.680.71+4.4%0.630.600.45

Physics-based error analysis for crystalline, multicomponent, and amorphous materials.

The values in bold are the best values.

5.3 Ablation study

We conduct an ablation study on the AFLOW, OQMD, JARVIS, and Materials Project datasets to investigate the individual contributions of key components in our IIMM architecture. We isolate three critical modules: Data-Driven Feature Fusion, Multiscale Simulation Coupling, and Unified Multiphysics Features. By systematically removing each component, we analyze their individual impact on recommendation performance. The results are presented in Table 5, 6.

TABLE 5

ModelAFLOW datasetOQMD dataset
PrecisionRecallF1 ScoreAUCPrecisionRecallF1 ScoreAUC
w/o Data-Driven Feature Fusion79.450.0276.880.0377.120.0283.040.0280.230.0178.470.0279.150.0384.520.02
w/o Multiscale Simulation Coupling80.360.0277.900.0378.620.0183.610.0381.920.0280.130.0380.850.0184.830.03
w/o Unified Multiphysics Features78.900.0276.590.0377.250.0282.980.0379.710.0278.200.0378.470.0284.190.01
Ours (IIMM)85.750.0384.500.0285.100.0290.180.0287.890.0285.230.0186.320.0291.030.01

Ablation study results on IIMM components across AFLOW dataset and OQMD dataset.

The values in bold are the best values.

TABLE 6

ModelJARVIS datasetMaterials project dataset
PrecisionRecallF1 ScoreAUCPrecisionRecallF1 ScoreAUC
w/o Data-Driven Feature Fusion75.310.0273.890.0374.190.0281.260.0378.950.0277.410.0177.930.0383.720.01
w/o Multiscale Simulation Coupling76.150.0274.630.0375.140.0282.130.0379.440.0378.120.0278.560.0284.200.02
w/o Unified Multiphysics Features74.820.0173.070.0273.590.0281.840.0277.220.0275.610.0176.080.0382.550.02
Ours (IIMM)82.340.0380.190.0281.150.0286.870.0283.550.0381.800.0182.650.0287.230.03

Ablation study results on IIMM components across JARVIS dataset and materials project dataset.

The values in bold are the best values.

Removing Data-Driven Feature Fusion leads to a noticeable drop in performance, particularly on the AFLOW and OQMD datasets. This highlights the importance of fusing various data sources for effective feature representation. For example, on the AFLOW dataset, the precision drops from 85.75% to 79.45%, and recall decreases by 3.62%. Similarly, on the OQMD dataset, precision drops from 87.89% to 80.23%. These results suggest that this module plays a fundamental role in leveraging complex, multi-source data—including structural descriptors, thermodynamic properties, and electronic configurations—to improve precision and recall. The fusion mechanism enables the model to capture latent correlations across modalities, which single-source models often overlook. Removing the Multiscale Simulation Coupling also reduces model performance, although the effect is less pronounced than the removal of Data-Driven Feature Fusion. On both the AFLOW and OQMD datasets, precision and recall drop by a few percentage points, indicating that capturing information at multiple scales is beneficial for performance. For instance, on the AFLOW dataset, the precision decreases from 85.75% to 80.36%, and the recall drops from 84.50% to 77.90%. This suggests that incorporating multiscale information—such as atomic-scale descriptors, mesoscale structures, and macro-level phenomena—provides finer granularity and enhances the model’s ability to make accurate predictions in complex material systems. Removing Unified Multiphysics Features causes a small but noticeable performance decline, particularly on the JARVIS and Materials Project datasets. For example, on the JARVIS dataset, precision decreases from 82.34% to 74.82%, and recall drops from 80.19% to 73.07%. Similarly, on the Materials Project dataset, precision drops from 83.55% to 77.22%, and recall decreases from 81.80% to 75.61%. This module ensures that the model can capture the underlying physics of materials and their interactions—such as coupled thermal-electrical behaviors, phase stability dynamics, and mechanical response features—which is crucial for making accurate, physically meaningful recommendations.

6 Conclusion and future work

In this study, we propose a multiscale modeling approach that integrates atomistic simulations, continuum mechanics, and machine learning (ML) techniques to predict material properties more efficiently. Traditional experimental methods for material design are often time-consuming and costly, heavily relying on extensive trial-and-error processes. To address these challenges, the authors employ molecular dynamics (MD) simulations at the atomic scale in conjunction with continuum mechanics models for macroscopic material behavior. By incorporating ML algorithms into this framework, they enhance predictive accuracy and streamline the material design pipeline. The experimental results indicate that this integrated approach not only reduces computational costs significantly but also accelerates the prediction process while maintaining or even improving accuracy compared to conventional techniques.

Despite the promising outcomes, the authors acknowledge two primary limitations. Although the method improves accuracy, it remains dependent on several assumptions within the modeling framework, which may not fully capture the complex behavior of materials, particularly under extreme conditions such as high temperature, pressure, or irradiation. The effective deployment of machine learning models necessitates large and high-quality datasets to achieve robust performance. However, such datasets may be scarce or unavailable for novel or emerging materials, limiting the generalizability and applicability of the approach. Future research could address these constraints by refining the modeling framework to incorporate nonlinear response mechanisms or coupling effects that account for a wider range of material behaviors. Exploring more effective data generation strategies–such as physics-informed generative adversarial networks (GANs) or transfer learning techniques–may enable high-quality predictions even under data-scarce scenarios. As computational power and AI algorithms continue to advance, the deeper integration of these technologies is expected to deliver even more precise and efficient material property predictions. This evolving approach not only enhances research productivity but also serves as a pivotal tool for accelerating the discovery and development of new materials, thereby shortening the translation from fundamental science to practical engineering applications.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding author.

Author contributions

MY: Writing – review and editing, Writing – original draft. JL: Writing – review and editing, Writing – original draft. CC: Writing – original draft, Writing – review and editing. ML: Writing – review and editing, Writing – original draft.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

Author CC was employed by the company China National Offshore Oil Corporation Limited.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

    AmoresV. J.MontánsF. J.CuetoE.ChinestaF. (2022). Crossing scales: data-driven determination of the micro-scale behavior of polymers from non-homogeneous tests at the continuum-scale. Front. Mater.9, 879614. 10.3389/fmats.2022.879614

  • 2

    ArgyriouA.González-FierroM.ZhangL. (2020). “Microsoft recommenders: best practices for production-ready recommendation systems,” in The web conference.

  • 3

    BaJ. L.KirosJ. R.HintonG. E. (2016). Layer normalization. arXiv preprint arXiv:1607.06450

  • 4

    BoudabiaS.MahdjoubY.DraouiA.BensalemF. Z.LaiourateK. (2024). “Material property prediction with cnn-lstm hybrid models and periodic table as input representation,” in International symposium on modelling and implementation of complex systems (Springer), 102111.

  • 5

    Cabrera-SánchezJ.-P.de LunaI. R.Carvajal-TrujilloE.Villarejo-RamosÁ. F. (2020). Online recommendation systems: factors influencing use in e-commerce. Sustainability12. 10.3390/su12218888

  • 6

    ChakrabortyS.HoqueM. S.JeemN. R.BiswasM.BardhanD.LobatonE. (2021). Fashion recommendation systems, models and methods: a review. Informatics8, 49. 10.3390/informatics8030049

  • 7

    ChenC.YeW.ZuoY.ZhengC.OngS. P. (2022). Graph networks as a universal machine learning framework for molecules and crystals. Nat. Rev. Mater.7, 653672. 10.1021/acs.chemmater.9b01294

  • 8

    ChiZ.BeileL.DeyuL.YuboF. (2022). Application of multiscale coupling models in the numerical study of circulation system. Med. Nov. Technol. Devices14, 100117. 10.1016/j.medntd.2022.100117

  • 9

    ClementC. L.KauweS. K.SparksT. D. (2020). Benchmark aflow data sets for machine learning. Integrating Mater. Manuf. Innovation9, 153156. 10.1007/s40192-020-00174-4

  • 10

    DhelimS.AungN.BourasM. A.NingH.CambriaE. (2021). A survey on personality-aware recommendation systems. Artif. Intell. Rev.55, 24092454. 10.1007/s10462-021-10063-7

  • 11

    DuongT. N.PhamT. T. T.TranH. M.QuocD. P.NguyenH. D.NguyenH. P.et al (2024). “A novel autorec-based architecture for recommendation system,” in 2024 tenth international conference on communications and electronics (ICCE) (IEEE), 469474.

  • 12

    FanY.DingJ. (2020). Atomistic study of crack propagation in metallic glasses using molecular dynamics. Acta Mater.196, 145158. 10.1007/s42452-022-05170-1

  • 13

    FayyazZ.EbrahimianM.NawaraD.IbrahimA.KashefR. (2020). Recommendation systems: algorithms, challenges, metrics, and business opportunities. Appl. Sci.10, 7748. 10.3390/app10217748

  • 14

    FengC.KhanM.RahmanA. U.AhmadA. (2020). News recommendation systems - accomplishments, challenges and future directions. IEEE AccessAvailable online at: https://ieeexplore.ieee.org/abstract/document/8963698/.

  • 15

    ForouzandehS.RostamiM.BerahmandK. (2022). A hybrid method for recommendation systems based on tourism with an evolutionary algorithm and topsis model. Fuzzy Inf. Eng.14, 2650. 10.1080/16168658.2021.2019430

  • 16

    FuZ.XianY.ZhangY.ZhangY. (2020). “Tutorial on conversational recommendation systems,” in ACM conference on recommender systems.

  • 17

    GaS.ChoP. H.MoonG. E.JungS. (2025). Efficient gnn-based social recommender systems through social graph refinement. J. Supercomput.81, 215224. 10.1007/s11227-024-06682-w

  • 18

    HendrycksD.GimpelK. (2016). Gaussian error linear units (gelus). arXiv Prepr. arXiv:1606.08415, Available online at: https://arxiv.org/abs/1606.08415.

  • 19

    HsiaS.GuptaU.WilkeningM.WuC.-J.WeiG.-Y.BrooksD. (2020). “Cross-stack workload characterization of deep recommendation systems,” in IEEE international symposium on workload characterization.

  • 20

    HuJ.ShenL.SunG. (2018). “Squeeze-and-excitation networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 71327141.

  • 21

    HwangS.ParkE. (2022). Movie recommendation systems using actor-based matrix computations in South Korea. IEEE Trans. Comput. Soc. Syst.9, 13871393. 10.1109/tcss.2021.3117885

  • 22

    IvchenkoD.StaayD. V. D.TaylorC.LiuX.FengW.KindiR.et al (2022). “Torchrec: a pytorch domain library for recommendation systems,” in ACM conference on recommender systems.

  • 23

    JadidinejadA. H.MacdonaldC.OunisI. (2021). The simpson’s paradox in the offline evaluation of recommendation systems. ACM Trans. Inf. Syst.40, 122. 10.1145/3458509

  • 24

    JainA.OngS. P.HautierG.ChenW.RichardsW. D.DacekS.et al (2013). Commentary: the materials project: a materials genome approach to accelerating materials innovation. Apl. Mater.1, 011002. 10.1063/1.4812323

  • 25

    JavedU.ShaukatK.HameedI.IqbalF.AlamT. M.LuoS. (2021). A review of content-based and context-based recommendation systems. Int. J. Emerg. Technol. Learn. (iJET)16, 274. 10.3991/ijet.v16i03.18851

  • 26

    JhaD.WardL.PaulA.LiaoW.-k.ChoudharyA.WolvertonC.et al (2018). Elemnet: deep learning the chemistry of materials from only elemental composition. Sci. Rep.8, 17593. 10.1038/s41598-018-35934-y

  • 27

    KanwalS.NawazS.MalikM. K.NawazZ. (2021). A review of text-based recommendation systems. IEEE Access9, 3163831661. 10.1109/access.2021.3059312

  • 28

    KhanH. U. R.LimC.AhmedM.TanK.MokhtarM. B. (2021). Systematic review of contextual suggestion and recommendation systems for sustainable e-tourism. Sustainability13, 8141. 10.3390/su13158141

  • 29

    KoH.LeeS.ParkY.ChoiA. (2022). A survey of recommendation systems: recommendation models, techniques, and application fields. Electronics11, 141. 10.3390/electronics11010141

  • 30

    KreutzC. K.SchenkelR. (2022). Scientific paper recommendation systems: a literature review of recent publications. Int. J. Digital Libr.23, 335369. 10.1007/s00799-022-00339-w

  • 31

    LeeD.GopalA.ParkS.-H. (2020). Different but equal? a field experiment on the impact of recommendation systems on mobile and personal computer channels in retail. Inf. Syst. Res.31, 892912. 10.1287/isre.2020.0922

  • 32

    LyuL.YuanK.ZhuW. (2024). A novel demodulation method with a reference signal for operational modal analysis and baseline-free damage detection of a beam under random excitation. J. Sound Vib.571, 118068. 10.1016/j.jsv.2023.118068

  • 33

    MaierC.SimoviciD. (2022). Bipartite graphs and recommendation systems. J. Adv. Inf. Technol.13. 10.12720/jait.13.3.249-258

  • 34

    MashayekhiY.LiN.KangB.LijffijtJ.BieT. D. (2022). A challenge-based survey of e-recruitment recommendation systems. ACM Comput. Surv.56, 133. 10.1145/3659942

  • 35

    NawaraD.KashefR. (2020). “Iot-based recommendation systems - an overview,” in 2020 IEEE international IOT, electronics and mechatronics conference (IEMTRONICS).

  • 36

    NawaraD.KashefR. (2021). Context-aware recommendation systems in the iot environment (iot-cars)-a comprehensive overview. IEEE Access9, 144270144284. 10.1109/access.2021.3122098

  • 37

    ÖzyılkanE.CarpiF.GargS.ErkipE. (2024). “Neural compress-and-forward for the relay channel,” in 2024 IEEE 25th international Workshop on signal processing Advances in wireless communications (SPAWC) (IEEE), 366370.

  • 38

    PerdewJ. P.BurkeK.ErnzerhofM. (1996). Generalized gradient approximation made simple. Phys. Rev. Lett.77, 38653868. 10.1103/physrevlett.77.3865

  • 39

    PilaniaG.Mannodi-KanakkithodiA.UberuagaB. P.RamprasadR.GubernatisJ. E.LookmanT. (2016). Machine learning bandgaps of double perovskites. npj Comput. Mater.2, 16079. 10.1038/srep19375

  • 40

    RediesM.MichalicekG.BouazizJ.TerbovenC.MüllerM. S.BlügelS.et al (2022). Fast all-electron hybrid functionals and their application to rare-earth iron garnets. Front. Mater.9, 851458. 10.3389/fmats.2022.851458

  • 41

    RoccoJ. D.RuscioD. D.SipioC. D.NguyenP. T.RubeiR. (2021). Development of recommendation systems for software engineering: the crossminer experience. Empir. Softw. Eng.26, 69. 10.1007/s10664-021-09963-7

  • 42

    SaalJ. E.KirklinS.AykolM.MeredigB.WolvertonC. (2013). Materials design and discovery with high-throughput density functional theory: the open quantum materials database (oqmd). JOM65, 15011509. 10.1007/s11837-013-0755-4

  • 43

    SandurA.ParkC.VolosS.AghaG.JeonM. (2022). “Jarvis: large-scale server monitoring with adaptive near-data processing,” in 2022 IEEE 38th international conference on data engineering (ICDE) (IEEE), 14081422.

  • 44

    SchüttK. T.SaucedaH. E.KindermansP.-J.TkatchenkoA.MüllerK.-R. (2018). Schnet–a deep learning architecture for molecules and materials. J. Chem. Phys.148, 241722. 10.1063/1.5019779

  • 45

    ShiS.GongY.GursoyD. (2020). Antecedents of trust and adoption intention toward artificially intelligent recommendation systems in travel planning: a heuristic-systematic model. J. Travel Res.60, 17141734. 10.1177/0047287520966395

  • 46

    ShoghiR.HartmaierA. (2022). Optimal data-generation strategy for machine learning yield functions in anisotropic plasticity. Front. Mater.9, 868248. 10.3389/fmats.2022.868248

  • 47

    StukhlyakP.BuketovA.PaninS.MaruschakP.MorozK.PoltaraninM.et al (2015). Structural fracture scales in shock-loaded epoxy composites. Phys. Mesomech.18, 5874. 10.1134/s1029959915010075

  • 48

    SunZ.XuY.LiuY.HeW.JiangY.WuF.et al (2022). A survey on federated recommendation systems. IEEE Trans. Neural Netw. Learn. Syst.36, 620. 10.1109/tnnls.2024.3354924

  • 49

    TranK.UlissiZ. W. (2018). An uncertainty quantification framework for predictive materials modeling using machine learning. npj Comput. Mater.4, 29. Available online at: https://www.nature.com/articles/s41929-018-0142-1.

  • 50

    Urdaneta-PonteM. C.Méndez-ZorrillaA.Oleagordia-RuízI. (2021). Recommendation systems for education: systematic review. Electronics10, 1611. 10.3390/electronics10141611

  • 51

    VasudevanR. K.KalininS. V.ChenL.-Q.LookmanT. (2021). Hybrid computational-experimental approach for discovery of phase-change materials. Adv. Mater.33, 2004572. Available online at: https://www.sciencedirect.com/science/article/pii/S2352152X23026142.

  • 52

    VaswaniA.ShazeerN.ParmarN.UszkoreitJ.JonesL.GomezA. N.et al (2017). Attention is all you need. Adv. neural Inf. Process. Syst.30, 59986008. Available online at: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html

  • 53

    VecchioG.DeschaintreV. (2024). “Matsynth: a modern pbr materials dataset,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2210922118.

  • 54

    WangX.GirshickR.GuptaA.HeK. (2018). “Non-local neural networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 77947803.

  • 55

    WardL.LiuR.KrishnaA.HegdeV. I.AgrawalA.ChoudharyA.et al (2016). A general-purpose machine learning framework for predicting properties of inorganic materials. npj Comput. Mater.2, 16028. 10.1038/npjcompumats.2016.28

  • 56

    WuH.ZhaoY.ZhangJ.PanZ.XuL.LiangY. (2020). Machine learning-assisted discovery of polymers with high thermal conductivity using a molecular design algorithm. npj Comput. Mater.6, 66. 10.1038/s41524-019-0203-2

  • 57

    XieT.GrossmanJ. C. (2018). Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties. Phys. Rev. Lett.120, 145301. 10.1103/physrevlett.120.145301

  • 58

    XuX.ChenJ.DeshpandeV. V.DiestK.ShiL. (2014). Thermal conductivity of graphene and its structural derivatives. Small10, 47954819. Available online at: https://onlinelibrary.wiley.com/doi/pdf/10.1002/9781119242635#page=299.

  • 59

    YadalamT. V.GowdaV. M.KumarV. S.GirishD.MN. (2020). “Career recommendation systems using content based filtering,” in International conference on communication and electronics systems.

  • 60

    YangC.KeX.HuP.LiY. (2021). “Nightdnet: a semi-supervised nighttime haze removal frame work for single image,” in 2021 3rd international academic exchange Conference on Science and Technology innovation (IAECST) (IEEE), 716719.

  • 61

    YangL.TanB.ZhengV.ChenK.YangQ. (2020). Federated recommendation systems. Fed. Learn., 225239. 10.1007/978-3-030-63076-8_16

  • 62

    YaoK.XuX.SunB.WangC.WangC.WuK. (2021). Deep-learning model for dielectric polymer design with high energy density and low loss. Sci. Adv.7, eabf7290. Available online at: https://pubs.acs.org/doi/abs/10.1021/acs.chemmater.1c02061.

  • 63

    ZhangJ.TongZ.ZhangW.ZhaoY.LiuY. (2021a). Research on ncf-pcf-ncf structure interference characteristic for temperature and relative humidity measurement. IEEE Photonics J.13, 15. 10.1109/jphot.2021.3105395

  • 64

    ZhangY.PengC.PengL.XuY.LinL.TongR.et al (2021b). Deeprecs: from recist diameters to precise liver tumor segmentation. IEEE J. Biomed. Health Inf.26, 614625. 10.1109/jbhi.2021.3091900

  • 65

    ZhangZ.PatraB. G.YaseenA.ZhuJ.SabharwalR.RobertsK.et al (2023). Scholarly recommendation systems: a literature survey. Knowl. Inf. Syst.65, 44334478. 10.1007/s10115-023-01901-x

  • 66

    ZungerA. (2018). Inverse design in search of materials with target functionalities. Nat. Rev. Chem.2, 0121. 10.1038/s41570-018-0121

Summary

Keywords

materials science, predictive modeling, finite element method, supervised learning, machine learning, multiscale modeling

Citation

Yu M, Liu J, Chen C and Li M (2025) Enhancing phase change thermal energy storage material properties prediction with digital technologies. Front. Mater. 12:1616233. doi: 10.3389/fmats.2025.1616233

Received

22 April 2025

Accepted

23 June 2025

Published

18 July 2025

Volume

12 - 2025

Edited by

Tao Jing, Tsinghua University, China

Reviewed by

Pavlo Maruschak, Ternopil Ivan Pului National Technical University, Ukraine

Jianqiao Hu, Chinese Academy of Sciences (CAS), China

Updates

Copyright

*Correspondence: Minghao Yu,

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.

Outline

Figures

Cite article

Copy to clipboard


Export citation file


Share article

Article metrics