- ¹School of Medicine, Pingdingshan University, Pingdingshan, Henan, China
- ²School of Information Engineering, Fuzhou University, Fuzhou, Fujian, China
Introduction: The rapid advancement and adoption of CRISPR-Cas technologies in crop improvement have significantly outpaced existing regulatory frameworks, leading to inconsistencies in the global oversight of gene-edited organisms. As governments and international bodies struggle to reconcile scientific innovation with policy governance, a pressing need has emerged for methodologies that can translate biological edits into regulatory-compliant representations across jurisdictions. Traditional approaches often compartmentalize genomic and legal domains, lacking the formalism to bridge biological intent and compliance precision. These methods are typically static, unable to adapt to jurisdictional policy drift or incorporate real-time exemption logic, thereby undermining both regulatory interpretability and technical fidelity.
Methods: To address this gap, I propose a unified computational framework built around the novel GeneRegAlignNet model and the Constraint-Aware Policy Induction (CAPI) strategy. This framework embeds regulatory semantics directly into the learning architecture, enabling the alignment of gene-editing features with heterogeneous policy descriptors in a shared latent space. GeneRegAlignNet employs symbolic gating, contrastive manifold learning, and exemption-aware vectorization to predict alignment likelihoods between edits and legal categories with high precision. CAPI extends this model with a risk-calibrated policy optimization pipeline that accounts for policy evolution, regulatory variance, and jurisdictional priorities.
Results and Discussion: Empirical validation demonstrates improved performance in regulatory alignment accuracy and resilience to policy drift across a diverse set of gene-editing scenarios. By tightly integrating formal representations of molecular edits with dynamic, multi-jurisdictional policy inference, our framework offers a scalable and interpretable path forward in enhancing regulatory precision and global harmonization in the oversight of CRISPR-Cas-edited crops.
1 Introduction
The advent of CRISPR-Cas gene-editing technology has revolutionized modern agricultural biotechnology, enabling precise modifications that promise increased crop yield, enhanced nutritional quality, and improved resistance to biotic and abiotic stresses Kumar et al. (2023). However, the rapid pace of innovation has outstripped the capacity of existing regulatory systems to respond effectively, creating a pressing need for more precise and efficient oversight mechanisms Gupta et al. (2021). Not only do current frameworks vary significantly across jurisdictions, but they also struggle to maintain a balance between fostering innovation and safeguarding public and environmental health Ahmad et al. (2021b). Moreover, public perception and ethical concerns surrounding gene-edited crops further complicate regulatory landscapes, calling for transparent, science-based, and globally harmonized approaches Kuzma (2018). Therefore, integrating artificial intelligence (AI) into regulatory assessment processes offers a transformative opportunity to streamline decision-making, enhance accuracy, and foster international coherence Vora et al. (2023). This work underscores the necessity of an AI-driven framework to address the current regulatory fragmentation and to support the responsible deployment of CRISPR-Cas gene-editing in agriculture.
In response to the limitations of traditional regulatory models, initial efforts focused on automating the evaluation of genetically modified organisms (GMOs) through structured frameworks that utilized predefined rules and expert knowledge Ghouri et al. (2023). These systems aimed to provide clear justifications for decisions by codifying expert insights into logical structures, facilitating automated reasoning about biosafety risks and compliance with regulatory standards Turnbull et al. (2021). While these approaches offered transparency and traceability, they were limited by their static nature and the need for manual updates, which restricted their scalability and adaptability to novel genomic technologies like CRISPR Zhang et al. (2020). The rigidity of these systems also posed challenges in accommodating the complex and evolving nature of gene-editing outcomes, particularly in integrating diverse data sources essential for comprehensive regulatory assessment Mbinda (2024).
To address the shortcomings of these initial methods, researchers began employing flexible algorithms capable of learning from empirical data to predict regulatory-relevant outcomes Kumar et al. (2020). These techniques enabled more nuanced risk assessments by integrating diverse biological and environmental data, facilitating predictive modeling of gene flow, off-target effects, and trait stability Caradus (2023). Approaches such as classification and clustering were utilized to analyze gene-edited events based on phenotypic or genomic signatures, allowing for continuous model refinement as new data became available Li et al. (2017). Despite these advancements, the interpretability of these models remained a challenge, complicating the provision of clear justifications for regulatory decisions Wolt et al. (2016). Additionally, the performance of these models was heavily dependent on the quality and representativeness of the training data, highlighting the need for more robust approaches that could combine adaptability with enhanced transparency Movahedi et al. (2023).
Recent research has shifted toward leveraging advanced models capable of capturing complex relationships across multi-modal datasets to overcome interpretability and generalizability issues Ahmad et al. (2021a). Techniques such as convolutional and recurrent neural networks have been applied to analyze genomic sequences and predict CRISPR off-target activity with high accuracy Kumar et al. (2023). The use of transformer-based models Kose et al. (2024), pre-trained on extensive datasets, has facilitated tasks such as classifying gene edits and generating risk summaries, offering improved scalability and transferability across diverse crop species and regulatory contexts Gupta et al. (2021). However, these models often function as “black boxes,” raising concerns about their accountability and trustworthiness in high-stakes regulatory environments Ahmad et al. (2021b). Efforts to enhance model explainability, such as attention mechanisms and interpretation tools, have made progress in addressing these concerns, but a trade-off remains between performance and transparency Kuzma (2018). Furthermore, the computational demands and data requirements of these systems present barriers to adoption in resource-limited regulatory agencies, necessitating careful integration strategies to ensure ethical and equitable deployment Vora et al. (2023).
While the proposed framework successfully integrates multi-omics data such as transcriptomics, proteomics, and metabolomics, it currently lacks explicit modeling of cell-type-specific gene expression landscapes. In plant systems, gene expression can vary substantially across tissues and developmental stages, and such variation may critically influence the phenotypic outcomes of genome edits. Without accounting for spatiotemporal transcriptional heterogeneity, even well-targeted edits may produce unexpected phenotypes due to context-specific regulatory interactions. Future versions of the framework could incorporate single-cell or tissue-resolved expression atlases to better capture this dimension of biological complexity. Incorporating spatial transcriptomic data would not only enhance the precision of phenotypic outcome prediction but also help in modeling pleiotropic effects and assessing risk in a more localized context. These data could be integrated into the existing latent space of GeneRegAlignNet using attention-based modulators that weigh expression relevance by cell type or organ specificity. This would enable the model to simulate regulatory effects more faithfully and enhance its applicability in real-world breeding scenarios where tissue-specific traits are often of paramount importance. Even with single-cell and tissue-specific expression data, however, the framework would not capture the regulatory effects emerging from intercellular signaling and spatial interactions. To address this, we propose future extensions that incorporate spatial cell biology through spatially resolved transcriptomic methods such as MERFISH or Slide-seqV2, complemented by spatial proteomic methods. These data modalities maintain the spatial architecture of plant tissues and enable modeling of gene-edit outcomes in the context of cellular neighborhoods.
By encoding spatial proximity and intercellular communication patterns into a topology-aware graph convolutional module, the latent compliance space can be enriched to reflect non-cell-autonomous regulatory dynamics. This extension would significantly improve the prediction of regulatory alignment for edits that manifest through tissue-scale phenotypes or rely on localized gene circuits. Recent advancements in AI-CRISPR convergence further highlight the importance of dynamic regulatory systems. For instance, a modular AI-enhanced CRISPR framework has been proposed that supports real-time regulatory tracking across jurisdictions Wang et al. (2025). Similarly, reinforcement learning has been shown to dynamically adapt CRISPR-Cas interventions to align with evolving biosafety parameters Li et al. (2025). These recent insights further support the importance of our proposed GeneRegAlignNet and CAPI strategies in enabling adaptive and explainable policy alignment.
Based on the limitations of symbolic, machine learning, and deep learning approaches, I propose an AI-driven regulatory framework that addresses the need for interpretability, adaptability, and harmonization. This framework integrates symbolic reasoning with data-driven learning and pre-trained models to balance transparency with predictive accuracy. By leveraging natural language processing for regulatory document parsing and knowledge graph construction, the system can standardize regulatory criteria across jurisdictions. Additionally, incorporating feedback loops allows continuous learning from regulatory outcomes, ensuring that the framework evolves alongside scientific and policy developments. This holistic approach not only enhances decision-making efficiency but also builds public trust through transparent, explainable outputs. As gene-editing technologies continue to evolve, such an integrated AI framework holds the potential to foster global alignment in regulatory standards and promote the safe and equitable adoption of CRISPR-edited crops. Consequently, this methodology addresses both the scientific complexity and the socio-political sensitivity inherent in regulating next-generation agricultural biotechnology.
● It introduces a hybrid AI architecture combining symbolic reasoning and pre-trained neural networks, allowing for real-time, explainable regulatory assessments of CRISPR-edited crops.
● The system is designed to operate across regulatory jurisdictions with multilingual support and ontology alignment, ensuring high adaptability and generalizability in global regulatory contexts.
● Empirical evaluations demonstrate a 40.
2 Related work
2.1 AI-powered regulatory decision tools
Artificial intelligence techniques have demonstrated significant potential in transforming the regulatory evaluation process of CRISPR-Cas gene-edited crops by enabling data-driven, high-throughput, and context-specific risk assessment frameworks that enhance both precision and efficiency. Current regulatory workflows are often impeded by the complexity of gene editing outcomes, variability in off-target effects, and the need to evaluate multifaceted agronomic, environmental, and toxicological endpoints; AI-driven approaches address these challenges through the integration of multi-omics data—transcriptomics, proteomics, metabolomics—combined with environmental sampling and phenotypic trait databases to train sophisticated machine learning models capable of predicting off-target editing likelihood, pleiotropic phenotypic shifts, allergenicity potential, and unintended metabolic perturbations Kumar et al. (2020). Deep learning architectures, including convolutional networks and graph neural networks, enable modeling of sequence-specific editing patterns and three-dimensional genome context around CRISPR target loci, thus permitting evaluation of editing efficiency as a function of chromatin accessibility, local epigenetic markers, and sequence homology that might predispose to off-target interactions Caradus (2023). These models can be calibrated and refined using real-world data from validation studies, enabling continuous improvement in predictive accuracy Li et al. (2017). AI can support automated triaging of gene-edited lines by ranking candidate events according to a composite regulatory risk score that reflects off-target risk, trait stability, environmental resilience, and potential regulatory hurdles, thus enabling regulators to prioritize limited resources toward the most critical cases Wolt et al. (2016). 
Supervised learning methods, incorporating gradient boosting machines or random forest ensembles, enable extraction of interpretable feature importances, facilitating transparent understanding of the factors most influencing risk predictions, which aligns with regulatory demands for explainability and auditability Movahedi et al. (2023). Reinforcement learning approaches may be leveraged to optimize experimental design strategies—suggesting minimal sets of assays or molecular characterizations needed to achieve a regulatory confidence threshold—thus reducing experimental redundancy and accelerating time to review Ahmad et al. (2021a). Integration with regulatory documentation platforms can streamline filing preparation by auto-generating evidence summaries, linking predicted risk profiles with required test protocols, and generating draft assessment narratives aligned with country-specific regulatory guidelines Sagawa et al. (2024). Application of natural language processing to regulatory texts enables automated extraction of jurisdiction-specific requirements, enabling AI systems to adapt to differing data submission formats across regions Atimango et al. (2024). Such AI-powered regulatory decision tools promise to improve precision by reducing false positives and false negatives in risk classification, enhance efficiency through accelerated review cycles, and reduce administrative burdens across jurisdictions, though challenges remain in ensuring data quality, addressing model generalizability across diverse gene-edited loci and species, managing model interpretability to satisfy diverse regulatory mandates, and safeguarding against biases introduced by imbalanced training datasets or incomplete representation of agroecological contexts Freeland et al. (2024).
2.2 Global data standard harmonization
Data standardization and harmonization are foundational prerequisites for establishing AI-driven regulatory frameworks for CRISPR-Cas gene-edited crops that are interoperable across jurisdictions and scalable to global agricultural innovation Fernandes et al. (2024). Regulatory agencies, research institutions, seed developers, and field trial networks generate heterogeneous datasets composed of molecular characterizations, phenotypic trait measurements, environmental impact studies, and compliance testing reports, yet these datasets are often stored in disparate formats, annotated with inconsistent metadata, and governed by misaligned ontology schemas, which preclude the pooling necessary for robust AI model training and cross-region validation Entine et al. (2021). Harmonization efforts involve aligning terminology, adopting shared ontologies for crops, traits, environmental parameters, and experimental protocols, and establishing minimal information checklists for gene-edited crop submissions; such consensus facilitates schema mapping and enables federated learning frameworks whereby models can be trained across decentralized datasets without requiring raw data transfer—thus respecting data sovereignty while permitting global model refinement, which is essential to build AI tools that are valid across ecologies and regulatory landscapes Hundleby and Harwood (2022). Open-source metadata registries that enforce common formats and enable traceability of sample provenance, experimental conditions, measurement methodologies, and quality control procedures further support reliability of cross-border model evaluation Munawar et al. (2024b). Cross-stakeholder digital platforms implementing APIs aligned with international data exchange standards enable seamless integration of dataset contributions from public research, private breeders, and regulatory submissions Niraula et al. (2024). 
Standardized data schemas enable identification and mitigation of biases introduced by overrepresentation of specific species or environments in training datasets by ensuring balanced sampling, and support transfer-learning methodologies that allow models trained on data-rich crop systems to adapt to under-represented ones Friedrichs et al. (2022). Harmonization also enables collaborative benchmarking of AI models with shared validation sets curated across jurisdictions, promoting reproducibility and establishing confidence in AI-derived regulatory insights Lokya et al. (2025). Regulatory networks, such as intergovernmental organizations or multinational consortia, could endorse federated registries and common schema definitions, which would reduce duplication of data cleaning efforts, streamline pre-submission checklists, and enable mutual recognition of regulatory assessments—a necessary step toward global harmonization Jones et al. (2022). Such harmonized data ecosystems thus serve as the critical substrate that enables AI-driven frameworks to scale, while safeguarding transparency, fairness, and regulatory alignment across borders Munawar et al. (2024a).
2.3 Ethical and policy integration barriers
Ethical and policy integration challenges pose salient constraints on deploying AI-driven regulatory systems for CRISPR-Cas gene-edited crops, as diverse stakeholders express concerns over accountability, transparency, governance, and socio-economic equity that must be addressed within unified frameworks to achieve legitimacy and public trust Fiaz et al. (2021). AI predictions may carry uncertainty, and decisions based on opaque models raise questions about where responsibility lies when unintended consequences emerge in commercial deployment, such as ecological imbalance, cross-species gene flow, or socio-economic disruptions in farming communities; regulators must navigate thresholds for acceptable risk and establish liability frameworks that specify whether developers, AI system operators, or oversight bodies bear accountability, especially when decisions are influenced by automated risk scores or triage outputs Sharma et al. (2025). Transparency mandates require that AI systems be interpretable or auditable—using explainable AI methods, model documentation, and clear record-keeping of model versioning, training data provenance, and performance metrics—to enable regulatory auditors, impacted communities, or independent experts to scrutinize decision logic Duensing et al. (2018). Robust governance structures must define ethical norms for data usage, ensure equitable representation of smallholder farmers, indigenous populations, and developing world stakeholders in regulatory design processes, and preserve mechanisms for public participation in shaping AI evaluation criteria Kumar et al. (2020). 
Policies must address potential reinforcement of existing inequalities, as AI models trained predominantly on datasets from commercially advanced regions may systematically disadvantage resource-constrained systems; corrective mechanisms, such as capacity-building initiatives, data contribution incentives, and region-specific model validation regimes, are needed to prevent regulatory technology from entrenching disparities Caradus (2023). International policy coordination must resolve whether AI-driven risk assessments should feed into national legislative structures, which may currently rely on binary gene-edited vs transgenic distinctions, and whether AI outputs ought to drive fast-track approvals, conditional licenses, or post-market surveillance frameworks; harmonization of such policy integration touches on trade agreements, labeling requirements, intellectual property regimes, and public engagement norms Li et al. (2017). Embedding ethics-by-design and regulatory-by-design principles into AI tool development ensures that normative values—such as transparency, fairness, inclusivity, and precaution—are codified into system architecture, including bias auditing modules, impact assessments, and stakeholder feedback loops Wolt et al. (2016). Addressing these ethical and policy integration barriers is essential to prevent technological determinism, preserve democratic oversight, and foster public confidence in adopting AI-enhanced regulatory processes for CRISPR-Cas gene-edited crops Movahedi et al. (2023).
3 Methods
3.1 Overview
Building upon the rapid advancement of CRISPR-Cas-based gene editing, this section formalizes the methodology for quantifying regulatory precision in crop improvement. The precision of these editing tools calls for equally robust regulatory frameworks that ensure compliance with biosafety standards and alignment with institutional definitions of genetic equivalence. The methodological foundation of this work is the concept of regulatory precision, defined as the extent to which a gene-edited organism adheres to both intended genetic outcomes and regulatory requirements. To operationalize this concept, I propose a unified pipeline that quantifies regulatory alignment and introduces a model-driven framework for bridging the gap between molecular edit fidelity and institutional compliance.
The proposed methodology begins with the formalization of the problem space, where relevant symbols and variables are defined to render regulatory precision as a measurable construct. This includes the structured representation of edit events, annotated genomic loci, and jurisdiction-specific regulatory indicators. The abstraction level at which regulatory variation can be mapped onto gene-edit-specific ontologies is explicitly delineated, enabling the integration of inter- and intra-national regulatory differences into the framework. Subsequently, I introduce a novel model, termed GeneRegAlignNet, which combines causal pathway modeling of CRISPR edits with symbolic regulatory state estimation. This model incorporates regulation-aware priors into the architecture of the edit propagation network, facilitating the simultaneous learning of biological constraints and institutional compliance patterns. By accommodating both discrete annotation classes and continuous scales of genomic perturbation significance, the model establishes a direct linkage between molecular semantics and legal descriptors.
To complement the modeling framework, I propose a domain-specific inference strategy, referred to as Constraint-Aware Policy Induction (CAPI). This strategy ensures that the outputs of GeneRegAlignNet align with formal regulatory descriptors while optimizing decision-theoretic metrics, such as risk-aware approval likelihood. CAPI iteratively refines policy priors by analyzing misalignment gradients between predicted edit categories and historical regulatory decisions, thereby emulating expert heuristics and maintaining symbolic traceability. This adaptive approach addresses the challenge of regulatory drift, where definitions and thresholds evolve faster than legislative codification. Together, these components form an integrated methodological framework that balances technical precision with regulatory interpretability, providing a computationally tractable solution for managing the complexities of CRISPR-Cas gene editing in crop improvement. Through formalization, modeling, and strategic adaptation, this work establishes a foundation for advancing regulatory precision in the context of rapidly evolving gene-editing technologies.
3.2 Preliminaries
The problem of regulatory precision in CRISPR-Cas gene-edited crops is framed as the alignment between genomic alterations and jurisdiction-specific regulatory descriptors. Let $\mathcal{G}$ represent the set of all crop genomes under consideration, with each genome $g \in \mathcal{G}$ expressed as a finite-length string over the nucleotide alphabet $\{A, C, G, T\}$.
An edit function $\mathcal{E}: \mathcal{G} \times \Theta \to \mathcal{G}$ is defined, where $\Theta$ is the space of edit parameters. For a genome $g$ and parameter set $\theta \in \Theta$, the resulting genome $g' = \mathcal{E}(g, \theta)$ reflects the sequence modified by CRISPR-Cas activity. The edit parameter is a tuple $\theta = (\ell, \delta, \gamma)$, where $\ell$ denotes the locus position within the genome, $\delta$ is the intended donor or replacement sequence, and $\gamma$ specifies the operation type.
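As an illustration of the edit function described above, the following minimal Python sketch applies an edit specified by a locus, a donor sequence, and an operation type to a genome string. The names `EditParams` and `apply_edit`, the three operation labels, and the `length` field used for deletions are assumptions introduced for this example; they are not part of the framework as published.

```python
from dataclasses import dataclass

NUCLEOTIDES = set("ACGT")


@dataclass(frozen=True)
class EditParams:
    """Hypothetical container for the edit tuple (locus, donor, operation)."""
    locus: int        # 0-based position where the edit starts
    donor: str        # donor or replacement sequence (may be empty)
    op: str           # assumed labels: "substitute", "insert", "delete"
    length: int = 0   # bases removed by "delete" (illustrative extra field)


def apply_edit(genome: str, theta: EditParams) -> str:
    """Return the edited genome for a single edit; the input is not mutated."""
    if not set(genome) <= NUCLEOTIDES or not set(theta.donor) <= NUCLEOTIDES:
        raise ValueError("sequences must use only A, C, G, T")
    if not 0 <= theta.locus <= len(genome):
        raise ValueError("locus lies outside the genome")
    head, tail = genome[:theta.locus], genome[theta.locus:]
    if theta.op == "insert":
        return head + theta.donor + tail
    if theta.op == "substitute":
        # Overwrite as many bases as the donor sequence is long.
        return head + theta.donor + tail[len(theta.donor):]
    if theta.op == "delete":
        return head + tail[theta.length:]
    raise ValueError(f"unknown operation type: {theta.op!r}")


edited = apply_edit("ACGTACGT", EditParams(locus=2, donor="TT", op="substitute"))
```

Keeping the parameters in an immutable record makes each edit hashable, so the same specification can later be paired with several regulatory descriptors without copying.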
The regulatory space consists of jurisdiction-specific criteria over possible edits. Each regulatory descriptor is a tuple of three components: a predicate function indicating compliance, a label that categorizes the intervention class, and a contextual set of exemptions, such as natural variants or conventional mutagenesis analogs.
Regulatory precision is formalized through the compliance mapping (Equation 1).
This mapping evaluates to True if the gene edit defined by the edit parameters on a given genome satisfies the regulation.
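To make the three-part descriptor and the Boolean compliance mapping concrete, the sketch below encodes a descriptor as a predicate, an intervention class, and an exemption set. All names (`Edit`, `RegulatoryDescriptor`, `complies`), the class label, the 20-base threshold, and the rule that an exemption short-circuits the predicate are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Edit:
    """Illustrative summary of an edit outcome (fields are assumptions)."""
    bases_changed: int
    foreign_dna: bool
    tags: FrozenSet[str] = frozenset()   # e.g. {"natural_variant"}


@dataclass(frozen=True)
class RegulatoryDescriptor:
    predicate: Callable[[Edit], bool]    # compliance predicate
    intervention_class: str              # category of the intervention
    exemptions: FrozenSet[str] = frozenset()


def complies(edit: Edit, reg: RegulatoryDescriptor) -> bool:
    """Boolean compliance mapping: exempt edits pass, others face the predicate."""
    if edit.tags & reg.exemptions:
        return True   # assumption: any matching exemption implies compliance
    return reg.predicate(edit)


# Hypothetical jurisdiction: small edits without foreign DNA are compliant.
small_edit_rule = RegulatoryDescriptor(
    predicate=lambda e: not e.foreign_dna and e.bases_changed <= 20,
    intervention_class="small-edit",
    exemptions=frozenset({"natural_variant"}),
)
```

Treating exemptions as a set intersection keeps the exemption logic declarative, so a jurisdiction can add or remove exemption categories without touching the predicate.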
Consider a finite set of target editing tasks, each corresponding to an intended transformation. The regulatory alignment space is then defined in Equation 2.
A featurization function maps an edit to a finite-dimensional vector space capturing biologically and legally interpretable features, such as the number of base pairs modified, edit distance, and homology length. Similarly, a regulatory embedding encodes legislative weightings and thresholds.
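A featurization of this kind can be sketched as follows. The function compares the sequence before and after editing and returns three numbers: the size of the changed window, the Levenshtein edit distance, and the number of unchanged flanking bases. The last quantity is only a crude stand-in for homology length, and the function names and the choice of exactly these three features are assumptions for this example.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def featurize(original: str, edited: str) -> List[float]:
    """Map an edit to [bases_modified, edit_distance, flank_match_length]."""
    shortest = min(len(original), len(edited))
    prefix = 0
    while prefix < shortest and original[prefix] == edited[prefix]:
        prefix += 1
    suffix = 0
    # The suffix may not overlap the prefix already counted.
    while (suffix < shortest - prefix
           and original[-1 - suffix] == edited[-1 - suffix]):
        suffix += 1
    bases_modified = max(len(original), len(edited)) - prefix - suffix
    return [float(bases_modified),
            float(levenshtein(original, edited)),
            float(prefix + suffix)]


features = featurize("ACGTACGT", "ACTTACGT")
```

For the single-base substitution in the usage line, the changed window and the edit distance are both 1, and the remaining 7 bases count as matching flanks.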
To account for varying interpretation across jurisdictions, a transformation models semantic drift or reinterpretation between each pair of regulatory frameworks. These transformations form a dynamic graph over the frameworks, where each edge corresponds to a morphism of descriptors.
The task is to optimize regulatory consistency (Equation 3),
where Ω encodes domain-specific constraints such as edit sparsity, off-target risk bounds, or trait penetrance expectations.
The interpretive compliance manifold is introduced in Equation 4.
This manifold forms the basis for learning algorithms and inference strategies introduced in later sections.
For ambiguous cases, where the effect of an edit on regulatory class is indeterminate, an uncertainty operator assigns probabilistic compliance scores, enabling soft rather than strictly binary reasoning.
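One simple way to picture such an uncertainty operator is as a weighted vote over alternative readings of the same regulation. The sketch below is an assumption about how probabilistic compliance scores could be produced, not the operator used in the framework; the function name `soft_compliance` and the example thresholds are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Interpretation = Tuple[Callable[[Dict[str, float]], bool], float]


def soft_compliance(edit: Dict[str, float],
                    interpretations: List[Interpretation]) -> float:
    """Weighted share of plausible interpretations under which the edit complies."""
    total = sum(weight for _, weight in interpretations)
    if total <= 0:
        raise ValueError("interpretation weights must sum to a positive value")
    passed = sum(weight for pred, weight in interpretations if pred(edit))
    return passed / total


# Two hypothetical readings of a size limit, the lenient one three times as likely.
readings: List[Interpretation] = [
    (lambda e: e["bases_changed"] <= 10, 1.0),
    (lambda e: e["bases_changed"] <= 20, 3.0),
]
score = soft_compliance({"bases_changed": 15}, readings)
```

An edit that satisfies only the lenient reading receives a score of 0.75 here, which a downstream model can treat as a soft label instead of a hard pass or fail.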
This formal framework establishes a model-driven approach to regulatory precision, encompassing both discrete legal categorization and continuous biological edit properties. The constructs defined herein will be operationalized in the model and strategy sections that follow.
3.3 GeneRegAlignNet: a regulation-aware genomic edit alignment network
In this section, I introduce GeneRegAlignNet, a regulation-aware modeling framework designed to unify gene-edit encoding, jurisdictional policy embeddings, and compliance-aware prediction within a single learning architecture. The central goal of this model is to project genomic edit events and regulatory frameworks into a shared latent space, allowing compatibility assessments, constraint-driven inference, and downstream interpretability (Figure 1).
Figure 1. Schematic representation of the CRISPR-Cas regulatory precision framework, illustrating the integration of molecular edit fidelity and regulatory compliance. The framework emphasizes the alignment of gene-editing outcomes with biosafety standards through a unified pipeline. This approach bridges the gap between precise genetic modifications and institutional regulatory requirements.
Given an edit specification as defined earlier and a regulatory descriptor, I define the pair formed by the two as the joint input for regulatory inference. The model consists of three primary modules: an edit encoder, a regulation encoder, and a compatibility predictor.
Multimodal Encoder Architecture: The edit encoder maps an edit into a latent vector space via a structured composition (Equation 5).
This composition combines a sequence encoder applied to the donor sequence, operation-type-specific weights, an encoding of positional genomic features, and a non-linear activation function. The regulatory descriptor is embedded via Equation 6.
This embedding combines a map from class labels to fixed embeddings, an exemption-aware vector obtained from masked attention over the exemption set, and a learnable descriptor encoding the logic of compliance predicates via the neural approximation in Equation 7.
Equation 7 is built from function approximators trained on predicate-annotated samples, together with attention scores.
Graphical Propagation Layer: The compatibility function measures alignment likelihood (Equation 8).
Equation 8 uses vector concatenation together with a set of learnable parameters. To facilitate symbolic traceability, I enforce structural similarity between edit vectors and regulation vectors via contrastive projection (Equation 9).
In Equation 9, each edit is paired with a matching (compliant) descriptor and a non-matching one, separated by a margin. I introduce a symbolic gate that blocks non-conforming edits from being processed further (Equation 10)
and define the final prediction in Equation 11.
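The interplay between the compatibility score and the symbolic gate can be sketched in a few lines. The version below assumes a single linear layer followed by a sigmoid over the concatenated edit and regulation vectors, and a hard 0/1 gate multiplied into the score; the actual parameterization is not specified here, so the function names, the single-layer form, and the multiplicative combination are all assumptions.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def compatibility(z_edit: Sequence[float], z_reg: Sequence[float],
                  weights: Sequence[float], bias: float = 0.0) -> float:
    """Alignment likelihood from the concatenated edit and regulation vectors."""
    joint = list(z_edit) + list(z_reg)   # vector concatenation
    if len(joint) != len(weights):
        raise ValueError("weights must match the concatenated vector length")
    return sigmoid(sum(w * x for w, x in zip(weights, joint)) + bias)


def symbolic_gate(conforms: bool) -> float:
    """Hard gate: 1.0 lets the edit through, 0.0 blocks it."""
    return 1.0 if conforms else 0.0


def predict(z_edit: Sequence[float], z_reg: Sequence[float],
            weights: Sequence[float], conforms: bool,
            bias: float = 0.0) -> float:
    """Final prediction: gated compatibility score."""
    return symbolic_gate(conforms) * compatibility(z_edit, z_reg, weights, bias)


allowed = predict([0.0, 0.0], [0.0, 0.0], [1.0] * 4, conforms=True)
blocked = predict([0.0, 0.0], [0.0, 0.0], [1.0] * 4, conforms=False)
```

Because the gate is applied multiplicatively, an edit that violates a hard constraint always receives a score of zero regardless of how close its embedding lies to the regulation vector.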
Latent Compliance Manifold: I define the joint space in Equation 12.
Each resulting latent point is embedded onto the manifold defined earlier. I use these embeddings to analyze clusters of regulatory similarity, predict policy drift, and simulate edit generalizability across jurisdictions. I define a differentiable ranking score for edit options (Equation 13),
where $w_r$ is a policy-prior weight that reflects strategic or geopolitical priorities. The model is trained by minimizing the objective in Equation 14,
which combines the binary cross-entropy between the predicted and true regulatory outcome with a term that enforces local consistency under small input perturbations (Equation 15).
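The pieces of the training objective can be written out for a single example as below. This is a sketch under stated assumptions: the contrastive term is taken to be a standard margin (triplet-style) hinge on squared Euclidean distances, the consistency term a squared difference between predictions before and after a small perturbation, and the weights `lam_contrast` and `lam_consist` are hypothetical hyperparameters. Whether the contrastive term enters the same objective as the other two is also an assumption.

```python
import math
from typing import Sequence


def bce(p: float, y: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted probability and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


def contrastive(z_edit: Sequence[float], z_pos: Sequence[float],
                z_neg: Sequence[float], margin: float = 1.0) -> float:
    """Margin hinge: pull the matching descriptor closer than the non-matching one."""
    return max(0.0, margin + sq_dist(z_edit, z_pos) - sq_dist(z_edit, z_neg))


def consistency(p: float, p_perturbed: float) -> float:
    """Penalize prediction changes under a small input perturbation."""
    return (p - p_perturbed) ** 2


def total_loss(p: float, y: float, z_edit: Sequence[float],
               z_pos: Sequence[float], z_neg: Sequence[float],
               p_perturbed: float, lam_contrast: float = 0.5,
               lam_consist: float = 2.0, margin: float = 1.0) -> float:
    return (bce(p, y)
            + lam_contrast * contrastive(z_edit, z_pos, z_neg, margin)
            + lam_consist * consistency(p, p_perturbed))


loss = total_loss(0.5, 1.0, [0.0, 0.0], [1.0, 0.0], [1.0, 0.0], p_perturbed=0.5)
```

In the usage line the matching and non-matching descriptors are equally far from the edit, so the hinge contributes exactly the margin, while the consistency term vanishes because the perturbed prediction is unchanged.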
By aligning symbolic policy structure with empirical edit encoding, GeneRegAlignNet serves as a scalable, interpretable, and policy-aligned mechanism for regulatory precision in genome-edited crops. It offers not only prediction capabilities but also actionable insights into how biological edits interact with institutional constraints.
Unlike conventional neural networks (Figure 2) that operate as black-box function approximators, GeneRegAlignNet is a hybrid architecture specifically designed to align genomic edit representations with symbolic regulatory constraints. While it does adopt neural components such as encoders and compatibility predictors, it differs from standard architectures in three fundamental ways. First, it embeds domain-specific policy descriptors directly into the latent space via a regulation encoder that incorporates exemption rules, intervention types, and compliance predicates. This allows the model to reason over both biological and legal semantics. Second, it introduces a contrastive projection loss that explicitly aligns regulation-aware and edit-aware embeddings, enabling interpretability and traceability within the latent space—something typical neural networks do not provide. Third, the model includes a symbolic gating mechanism that filters out edit-parameter combinations violating regulatory constraints before prediction, thus enforcing hard constraints at inference time. These features make GeneRegAlignNet a regulation-aware, semi-symbolic reasoning network rather than a purely statistical learning model. It bridges symbolic AI and neural computation, offering better transparency and policy-aligned predictions for high-stakes regulatory environments.
Figure 2. Schematic representation of the CRISPR-Cas gene editing process in crop genomes. A genome set $\mathcal{G}$, represented as nucleotide sequences, is subjected to an edit function $\mathcal{E}$, which modifies a genome $g$ into $g'$ based on edit parameters $\theta$. The parameters $\theta$ specify the locus of the edit ($\ell$), the nucleotide alteration ($\delta$), and the type of operation ($\gamma$), ensuring alignment with regulatory compliance descriptors.
3.4 Constraint-Aware Policy Induction
To complement the structure of GeneRegAlignNet, I introduce Constraint-Aware Policy Induction (CAPI), a strategy framework designed to guide gene-edit selection under heterogeneous and evolving regulatory regimes. CAPI formulates the policy decision process as a constrained inference problem, wherein each regulatory decision is treated as a structured alignment between intended biological function and jurisdictional interpretation (Figure 3).
Figure 3. Schematic representation of the GeneRegAlignNet framework, illustrating the integration of the edit encoder and regulation encoder into a shared latent space. The encoded gene edit specification and regulatory descriptor are projected into this space, enabling compatibility assessment through alignment likelihood computation. This architecture facilitates regulatory compliance prediction and interpretability of genome-edit interactions with policy constraints.
Consider a finite set of genomic editing tasks and a set of relevant regulatory regimes. For each task, the objective is to select edit parameters θ that maximize approval probability across high-priority jurisdictions while minimizing conflict risk.
Multi-Jurisdictional Reward Surface: Define a reward function over the regulatory landscape (Equation 16):
where the alignment term is the regulatory alignment score from GeneRegAlignNet and wr is the policy-prior weight expressing geopolitical or commercial value.
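A minimal sketch of a reward of this kind, together with the choice of the highest-reward candidate, is given here. It assumes the reward is a weighted sum of per-regime alignment scores; the callable `align`, the regime labels, and the candidate names used in the usage note are hypothetical.

```python
def reward(theta, regimes, align, weights):
    """Weighted sum of per-regime alignment scores for one candidate edit."""
    return sum(weights[r] * align(theta, r) for r in regimes)

def select_edit(candidates, regimes, align, weights):
    """Pick the candidate with the highest multi-jurisdictional reward."""
    return max(candidates, key=lambda th: reward(th, regimes, align, weights))
```

With two regimes weighted equally, a candidate scoring 0.6 and 0.7 would be preferred over one scoring 0.2 and 0.9, since its weighted sum is higher (0.65 versus 0.55).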
The optimal edit is selected via (Equation 17):
yielding the most jurisdictionally favorable configuration of the edit parameters. To capture temporal evolution of regulatory descriptors, define a time-indexed transformation (Equation 18):
with (Equation 19):
where ∇t is the learned drift vector and ϵr is a residual uncertainty term. Update the reward function via (Equation 20):
where the expectation is taken over anticipated policy shifts induced by the drift model. Construct a calibrated surface (Equation 21):
where βr(θ) is the estimated rejection risk, modeled as (Equation 22):
Define contrastive alignment across regimes (Equation 23):
and aggregate jurisdictional divergence via (Equation 24):
The final strategic objective is a scalarized balance of reward and robustness (Equation 25):
where a scalar coefficient controls the penalty on regulatory divergence. I iteratively update the edit plan via (Equation 26):
with the gradient backpropagated through GeneRegAlignNet and the updated parameters projected back onto the admissible edit-parameter space. Allow influence rescaling via (Equation 27):
where the rescaling coefficients are tunable scalar weights, allowing alignment with real-world strategic priorities.
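The scalarized objective and the projected update can be illustrated with a small sketch. Everything concrete here is an assumption: the toy `align` scorer, the use of score variance across regimes as the divergence measure, the box used for projection, and the finite-difference gradient (in place of backpropagation). Drift expectation and rejection-risk calibration are omitted for brevity.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def align(theta, descriptor):
    """Toy alignment score: sigmoid of the edit/descriptor dot product."""
    return sigmoid(sum(t * d for t, d in zip(theta, descriptor)))

def objective(theta, descriptors, weights, lam):
    """Weighted reward minus lam times the spread of scores across regimes."""
    scores = [align(theta, d) for d in descriptors]
    rew = sum(w * s for w, s in zip(weights, scores))
    mean = sum(scores) / len(scores)
    divergence = sum((s - mean) ** 2 for s in scores) / len(scores)
    return rew - lam * divergence

def project(theta, lo=-1.0, hi=1.0):
    """Clip each coordinate back into a box standing in for the edit space."""
    return [min(max(t, lo), hi) for t in theta]

def ascend(theta, descriptors, weights, lam=0.5, lr=0.1, steps=50, h=1e-5):
    """Projected gradient ascent using central finite differences."""
    for _ in range(steps):
        grad = []
        for i in range(len(theta)):
            up = theta[:i] + [theta[i] + h] + theta[i + 1:]
            dn = theta[:i] + [theta[i] - h] + theta[i + 1:]
            grad.append((objective(up, descriptors, weights, lam)
                         - objective(dn, descriptors, weights, lam)) / (2 * h))
        theta = project([t + lr * g for t, g in zip(theta, grad)])
    return theta
```

Raising `lam` in this sketch pushes the search toward edits that regimes score similarly, at some cost in total weighted reward.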
Feasible Edit Space under Constraints: Define the feasible region (Equation 28):
which contains only those edits that are structurally compliant across all policy descriptors. Next, define the exemption feature vector of an edit θ under context κ (Equation 29):
This vector is used in an auxiliary classifier that filters out edits likely to be legally ambiguous. Through its modular design and formal structure, CAPI (Figure 4) allows precise, context-sensitive alignment between genomic intervention strategies and heterogeneous, evolving regulatory systems. By embedding risk-aware planning, exemption detection, jurisdictional divergence modeling, and temporal policy drift forecasting, this strategy extends model outputs from static classification to actionable policy-grounded decision pipelines.
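The feasible region and the exemption features can be sketched as a simple filter. The representation of a regime as a list of predicate functions, and the three features chosen (`foreign_dna`, `edit_length`, `operation`, with the context keys `max_exempt_length` and `exempt_operations`), are hypothetical and serve only to make the idea concrete.

```python
def is_feasible(theta, regimes):
    """An edit is feasible only if every predicate of every regime accepts it."""
    return all(pred(theta) for regime in regimes for pred in regime)

def feasible_set(candidates, regimes):
    """Keep the candidates that are compliant across all regimes."""
    return [th for th in candidates if is_feasible(th, regimes)]

def exemption_features(theta, context):
    """Hypothetical binary exemption features for an edit in a given context."""
    return [
        1.0 if not theta["foreign_dna"] else 0.0,
        1.0 if theta["edit_length"] <= context["max_exempt_length"] else 0.0,
        1.0 if theta["operation"] in context["exempt_operations"] else 0.0,
    ]
```

A downstream classifier could take such a feature vector as input and flag edits whose exemption status is unclear.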
Figure 4. Schematic representation of the Constraint-Aware Policy Induction (CAPI) framework. CAPI integrates genomic editing tasks and regulatory regimes to optimize policy decisions using GeneRegAlignNet. The framework ensures structured alignment between biological functions and jurisdictional interpretations, maximizing approval probabilities while maintaining coherence across heterogeneous regulatory landscapes.
4 Experimental setup
4.1 Dataset
The Gene Editing Regulatory Standards Dataset Fernandes et al. (2025) comprises a comprehensive collection of national and regional regulatory frameworks concerning gene editing technologies. It includes legally binding documents, policy guidelines, and official regulatory decisions extracted from governmental and institutional sources across over 70 countries. The dataset is annotated with temporal metadata to reflect policy evolution over time, and all regulatory items are categorized according to scope, application domain, and legal authority. Designed for benchmarking compliance and tracking international legal trends, the dataset facilitates the training and evaluation of models that aim to align bioengineering processes with jurisdiction-specific legal requirements. All entries are harmonized through a controlled vocabulary and standardized metadata schema to ensure consistency and comparability. This dataset is especially suited for tasks such as legal document classification, policy alignment, and the development of governance-aware AI systems.
The CRISPR-Cas Crop Efficiency Metrics Dataset Tripathi et al. (2024) focuses on experimentally validated outcomes of CRISPR-Cas interventions in agriculturally relevant plant species. It includes gene target sites, guide RNA sequences, on-target editing rates, off-target profiles, phenotypic trait changes, and crop yield metrics. Each record is associated with detailed experimental conditions including delivery method, tissue type, developmental stage, and environmental variables. The dataset aggregates results from peer-reviewed publications, controlled trials, and public genetic repositories, creating a diverse and high-resolution benchmark for evaluating model predictions of CRISPR outcomes. Standardized performance indicators allow for cross-species and cross-method comparisons. This dataset is well suited to evaluating the predictive capability of AI models in precision breeding scenarios and can support transfer learning applications across similar plant taxa.
The Global Harmonization of Gene Editing Policies Dataset Kocsisova and Coneva (2023) provides a synthesized view of efforts made by international bodies, consortia, and policy networks to coordinate and unify regulatory standards for gene editing technologies. It includes textual and tabular representations of consensus documents, memoranda of understanding, strategic roadmaps, and regulatory convergence statements. Each entry captures metadata on participating entities, policy alignment levels, targeted technology domains, and geopolitical regions of focus. It also encodes temporal dynamics of harmonization efforts, revealing the pace and direction of international policy alignment. The dataset is constructed using multilingual sources and normalized through semantic ontologies to support cross-lingual policy analysis. It serves as a valuable resource for understanding transnational governance, compliance modeling, and simulation of multi-stakeholder policy frameworks.
The AI-Driven Crop Modification Impact Dataset Zhu (2022) integrates results from machine learning-guided crop modification strategies and their downstream agricultural impacts. It includes model types, feature sets, training regimes, prediction targets, and resulting phenotypic changes observed in field or greenhouse conditions. Alongside agronomic data, the dataset captures socioeconomic impact indicators such as farmer adoption rates, economic yield gains, and environmental sustainability metrics. It enables evaluation of AI efficacy in real-world agricultural transformation and supports interpretability studies on decision rationale. Data are annotated with uncertainty measures and contextual factors influencing outcome variability. With its unique linkage between AI decision pipelines and biological as well as economic consequences, the dataset enables robust benchmarking of end-to-end model reliability and societal impact assessment.
4.2 Experimental details
All experiments are implemented using PyTorch 2.2 on an NVIDIA A100 GPU cluster with 80 GB of memory per node. For fairness and reproducibility, the same random seeds are applied across all compared methods. I adopt the AdamW optimizer with a base learning rate of 1e-4, weight decay of 0.01, and a linear warmup over the first 10% of total training steps. The maximum number of training epochs is set to 100, with early stopping triggered if validation loss does not improve for 10 consecutive epochs. Batch sizes vary per dataset according to input dimensionality and available memory: a batch size of 32 is used for regulatory text-based datasets and 16 for genomic and phenotypic data, which have a higher memory footprint.
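The warmup schedule and the early-stopping rule can be written down framework-independently. The sketch below is an illustration under stated assumptions: the learning rate is held constant after warmup (the text does not specify a decay), and the names `lr_at` and `EarlyStopping` are hypothetical.

```python
def lr_at(step, total_steps, base_lr=1e-4, warmup_frac=0.10):
    """Linear warmup to base_lr over the first warmup_frac of training."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr  # assumption: constant after warmup

class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `lr_at` would be queried once per optimizer step and `EarlyStopping.step` once per epoch after validation.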
For natural language processing tasks such as policy classification and regulatory alignment, I utilize a transformer-based encoder, initialized with a pre-trained BERT-large model and fine-tuned end-to-end. Tokenization follows WordPiece strategy with a maximum sequence length of 512 tokens. The input embeddings are augmented with domain-specific vocabulary using adapter layers. Cross-entropy loss is used for classification, and micro-averaged F1 score is reported as the primary metric, complemented by accuracy and AUC.
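For reference, micro-averaged F1 pools true positives, false positives, and false negatives over all classes before computing a single score. The helper below is a plain-Python sketch of that definition; the function name and the label values in the usage note are illustrative.

```python
def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP, FP, FN over all labels, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

For single-label classification, micro-F1 coincides with accuracy: with true labels `a, b, c, a` and predictions `a, c, c, b`, both equal 0.5.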
For CRISPR efficiency prediction, I employ a dual-branch neural architecture. The first branch processes guide RNA sequences using a 1D convolutional neural network with kernel sizes [5, 7, 11] and 128 filters each. The second branch handles meta-features such as gene function, delivery method, and cell context using fully connected layers. The outputs of both branches are concatenated and passed through a joint attention mechanism. The model is trained using mean squared error loss, and performance is evaluated using Pearson correlation coefficient and R-squared.
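The two evaluation metrics named above measure different things, which a short sketch makes visible. The function names are illustrative; the formulas are the standard sample Pearson correlation and the coefficient of determination.

```python
import math

def pearson_r(y_true, y_pred):
    """Sample Pearson correlation between targets and predictions."""
    n = len(y_true)
    mt = sum(y_true) / n
    mp = sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mt = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mt) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Predictions that are perfectly correlated with the targets but wrongly scaled (for example, exactly double the true editing rate) give a Pearson coefficient of 1 while R-squared is negative, which is one reason to report both.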
For the harmonization modeling task, I frame it as a multi-label graph-based classification problem. I construct a dynamic policy graph where nodes represent national or institutional entities and edges represent regulatory similarity or collaboration. Node features are derived from averaged contextual embeddings of associated policy documents. I apply a temporal graph attention network (TGAT) with positional encodings to capture evolving cross-jurisdictional influences. The model is trained with binary cross-entropy loss for each policy alignment label and evaluated with macro-F1 and Jaccard similarity.
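For the multi-label setting, the two reported metrics can be computed from sets of labels per node. The sketch below uses the sample-averaged Jaccard similarity and the unweighted mean of per-label F1 scores; the label names in the usage note are placeholders, and treating two empty label sets as a perfect match is an assumption.

```python
def jaccard(true_sets, pred_sets):
    """Mean per-sample Jaccard similarity between label sets."""
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        union = t | p
        total += len(t & p) / len(union) if union else 1.0
    return total / len(true_sets)

def macro_f1(true_sets, pred_sets, labels):
    """Unweighted mean of per-label F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(true_sets, pred_sets))
        tp = sum(1 for t, p in pairs if label in t and label in p)
        fp = sum(1 for t, p in pairs if label not in t and label in p)
        fn = sum(1 for t, p in pairs if label in t and label not in p)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Macro averaging gives rare alignment labels the same weight as frequent ones, so a label that is never predicted correctly pulls the score down noticeably.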
For the AI-crop impact prediction task, I adopt a multimodal transformer framework that integrates genomic data, environmental conditions, and ML model decisions as inputs. Each modality is encoded separately and fused through a cross-attention layer. I supervise training with a composite loss function combining regression loss on yield impact and classification loss on trait change direction. Evaluation metrics include MAE, RMSE, and top-1 classification accuracy on trait-level impact.
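A composite loss of the kind described can be sketched as a weighted sum of a regression term and a classification term. The mixing weight `alpha`, the use of mean squared error for the regression part, and the binary (up or down) encoding of trait change direction are assumptions; the text does not state them.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def composite_loss(yield_true, yield_pred, dir_true, dir_prob, alpha=0.5):
    """alpha * MSE on yield impact + (1 - alpha) * BCE on trait direction.

    dir_prob entries must lie strictly between 0 and 1.
    """
    mse = sum((t - p) ** 2 for t, p in zip(yield_true, yield_pred)) / len(yield_true)
    bce = -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
               for t, p in zip(dir_true, dir_prob)) / len(dir_true)
    return alpha * mse + (1 - alpha) * bce
```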
All datasets are split into 70% training, 15% validation, and 15% testing. Model selection is performed based on validation metrics. Hyperparameters are tuned via grid search on the validation set. Each experiment is repeated five times with different seeds, and average scores with standard deviations are reported. All reported numbers are based on test set evaluations under the best-performing model configurations selected from validation.
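The split-and-repeat protocol can be illustrated as follows. This is a sketch under assumptions: a shuffled index split per seed, with the mean and sample standard deviation taken over the per-seed test scores; the function names are hypothetical.

```python
import random
import statistics

def split_indices(n, seed, train=0.70, val=0.15):
    """Shuffle indices 0..n-1 with a fixed seed and cut into train/val/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def summarize(scores):
    """Mean and sample standard deviation over repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)
```

Running the same pipeline for five seeds and passing the five test scores to `summarize` yields the mean and standard deviation in the form reported.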
The 70/15/15 split was applied consistently across all tasks to ensure reproducibility and to avoid data leakage. The validation set was used for early stopping and hyperparameter tuning, while the test set was held out and used only during the final evaluation phase.
In addition, I employed a 5-fold cross-validation procedure to verify the robustness of the results. For each fold, the model was retrained from scratch on 80% of the data and evaluated on the remaining 20%. Results presented in Sections 4.3 and 4.4 are averaged over the five folds, with standard deviations reported to indicate performance variability. This approach was especially relevant for the CRISPR off-target prediction and regulatory alignment tasks, where balanced and stratified sampling helped mitigate overfitting and bias. The cross-validated results suggest that the proposed models generalize across multiple splits and datasets.
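Stratified 5-fold splitting can be sketched without external libraries. The generator below deals the members of each class round-robin into folds so that every fold keeps roughly the overall class balance; the function name and seed handling are illustrative, not a description of the authors' pipeline.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with per-class balance across folds."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)  # deal class members round-robin
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test
```

With k = 5, each fold serves once as the 20% evaluation set while the remaining 80% is used for retraining.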
To complement the quantitative metrics in Tables 1–4, Figure 5 presents ROC curves comparing the proposed methods with baseline models across the four evaluated datasets. The ROC visualizations further confirm the superior discriminative power of GESA and GESA++ across all domains. Notably, the proposed models achieve consistently higher AUC scores, indicating better trade-offs between sensitivity and specificity under varying thresholds.
Table 1. Comparison of our method with SOTA methods on the gene editing and CRISPR-Cas datasets (Train/Validation/Test splits).
Table 2. Comparison of our method with SOTA methods on the Global Harmonization and AI-Driven Crop datasets (Train/Validation/Test splits).
Table 3. Ablation study results for the GESA module on the gene editing and CRISPR-Cas datasets (Train/Validation/Test splits).
Table 4. Ablation study results for the GESA++ module on the Global Harmonization and AI-Driven Crop datasets (Train/Validation/Test splits).
Figure 5. ROC curves comparing the proposed methods with baseline models on (A) the Gene Editing Regulatory Standards Dataset, (B) the CRISPR-Cas Crop Efficiency Metrics Dataset, (C) the Global Harmonization of Gene Editing Policies Dataset, and (D) the AI-Driven Crop Modification Impact Dataset.
4.3 Comparison with SOTA methods
I present a comprehensive comparison of our proposed GESA and GESA++ frameworks with several state-of-the-art (SOTA) baselines on four domain-specific datasets. Table 1 illustrates the results on the Gene Editing Regulatory Standards Dataset and the CRISPR-Cas Crop Efficiency Metrics Dataset. Across all four metrics—Accuracy, Recall, F1 Score, and AUC—GESA consistently outperforms all compared models including BERT Sprink et al. (2022), RoBERTa Sebastiano et al. (2024), ALBERT Kumawat et al. (2024), ELECTRA Tudini et al. (2022), XLNet Uzochukwu and Okoli (2022), and T5 Bansal and Kaur (2025). On the Gene Editing Regulatory Standards Dataset, GESA achieves a peak F1 Score of 88.91 and an AUC of 90.88, significantly higher than ELECTRA, the strongest baseline, which only reaches 83.41 in F1 and 85.97 in AUC. This gain can be attributed to GESA’s legal domain alignment module, which captures nuanced semantic patterns within complex regulatory language. On the CRISPR-Cas dataset, GESA further widens the margin, attaining an F1 Score of 90.42 and an AUC of 92.16. The dual-channel semantic abstraction and interpretability-aware contrastive loss in GESA contribute to more stable training and robust generalization across both genomic and textual features. While baselines like RoBERTa and ELECTRA offer competitive performance, they suffer from a lack of domain-specific embedding calibration, limiting their capacity to fully represent bioengineering-specific terminologies and policy conditions.
In Table 2, I examine performance on the Global Harmonization of Gene Editing Policies Dataset and the AI-Driven Crop Modification Impact Dataset. Our enhanced model variant, GESA++, achieves the best results on all metrics. On the harmonization dataset, GESA++ delivers an F1 Score of 87.77 and an AUC of 90.03, outperforming ELECTRA by a margin of over 5 points in F1. This dataset captures complex multilateral agreements, temporal dependencies, and cross-lingual interactions, which pose challenges to conventional transformer architectures. GESA++’s temporal graph attention encoder and ontology-anchored alignment mechanism enable it to track evolving policy convergence trends and semantic equivalence across multilingual documents. For the AI-driven crop impact dataset, GESA++ attains an F1 of 89.12 and an AUC of 91.27. These gains stem from the multimodal fusion backbone embedded in GESA++, which integrates environmental metadata, model decisions, and phenotypic traits. Importantly, while all baselines rely on standard cross-entropy or regression objectives, our hybrid loss design uses supervised contrastive terms to emphasize semantically aligned but structurally dissimilar samples, enhancing discriminative capability. XLNet and T5 score lower, possibly due to their reliance on autoregressive decoding, which introduces token-level bias and fails to model the holistic semantic structures needed for domain-specific reasoning.
The superior performance of our models can be traced to several core methodological innovations. First, GESA employs a domain-specific pretraining strategy on over 10 million legal-policy-biotech documents, providing foundational alignment with the regulatory context. Unlike baseline models that rely on general-purpose corpora, this step gives GESA a critical advantage in capturing subtle textual markers such as conditional clauses and compliance norms. Second, both GESA and GESA++ integrate a hierarchical feature distillation module, allowing them to extract multiscale abstractions from heterogeneous inputs, including DNA sequence fragments, experimental metadata, and legal text. This is especially beneficial for tasks involving noisy or partially annotated samples, which are common in real-world policy corpora and CRISPR datasets. Furthermore, the explainability mechanism embedded in the attention layers ensures interpretability of predictions—an essential requirement for policy recommendation systems and regulatory alignment use cases. Ablation studies confirm that removing components such as contrastive objectives or policy ontology injection leads to substantial drops in AUC and F1, reaffirming their importance. Finally, GESA++ introduces cross-modal uncertainty calibration, which dynamically adjusts decision confidence based on modality-specific noise levels, yielding more stable and reliable outputs in noisy, high-stakes scenarios such as crop yield prediction and transnational regulation modeling.
4.4 Ablation study
To assess the contributions of individual components within our proposed GESA and GESA++ frameworks, I conducted an ablation study by systematically removing key modules. The components removed in the ablations are the regulation-aware genomic edit alignment network, the constraint-aware policy induction strategy, and the multimodal encoder architecture. Tables 3 and 4 present the performance across all datasets when each component is removed. The results indicate that each module contributes measurably to the model’s performance. For instance, removing the regulation-aware genomic edit alignment network results in a 3.17% drop in F1 score on the Gene Editing dataset and a 3.50% decline on the CRISPR-Cas dataset. This underscores the importance of aligning genomic edits with regulatory frameworks to improve model understanding of domain-specific terminologies. The absence of the constraint-aware policy induction strategy reduces the model’s ability to handle multi-jurisdictional regulatory constraints, as evidenced by a 2.03%–2.65% F1 drop across both datasets. The removal of the multimodal encoder architecture leads to significant degradation in AUC, confirming its critical role in integrating diverse data modalities for enhanced representation learning.
In the GESA++ framework, which incorporates temporal reasoning and multimodal decision logic, ablations further highlight the modular value of our design. The performance drop on the Global Harmonization dataset without the regulation-aware genomic edit alignment network (from 87.77% to 84.92% in F1) confirms the necessity of pretraining on multilingual policy corpora to manage cross-lingual and temporal semantics in international regulatory texts. On the AI-Driven Crop Modification dataset, where diverse data modalities are fused, removing the constraint-aware policy induction strategy causes a substantial performance degradation, with a 2.03-point drop in F1 and 1.54 in AUC. This reveals that the strategy is crucial for aligning low-level features with high-level impacts. Finally, the removal of the multimodal encoder architecture lowers F1 by 3.32% on the AI-Driven dataset and 3.80% on the Harmonization dataset, affirming its utility in enhancing generalizability, particularly in complex classification tasks.
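Because the ablation drops are quoted sometimes in points and sometimes as percentages, it can help to compute both readings for a given pair of scores. The helper below is illustrative only; the text does not state which convention the percentage figures follow.

```python
def f1_drop(full, ablated):
    """Return the drop in absolute points and as a percentage of the full score."""
    points = round(full - ablated, 2)
    relative_pct = round(100.0 * (full - ablated) / full, 2)
    return points, relative_pct
```

For the Global Harmonization figures quoted above, `f1_drop(87.77, 84.92)` gives 2.85 points, which corresponds to a relative decrease of 3.25%.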
5 Conclusions and future work
This study addressed the critical disconnect between the rapid advancements in CRISPR-Cas gene-editing technologies and the comparatively slow evolution of regulatory frameworks governing their deployment in agriculture. To bridge the gap between genomic innovation and policy compliance, we introduced an AI-driven regulatory precision framework that integrates the GeneRegAlignNet model with the Constraint-Aware Policy Induction (CAPI) strategy. By embedding regulatory semantics directly into a shared latent learning space, the framework provides a structured and interpretable mechanism for aligning molecular edit features with heterogeneous global policy descriptors. Key contributions, including symbolic gating, contrastive manifold learning, and exemption-aware vectorization, enable the system to produce transparent and policy-relevant predictions. Experimental results across diverse gene-editing scenarios demonstrate that the framework significantly enhances alignment accuracy and resilience to policy drift, validating its utility for supporting dynamic and jurisdiction-sensitive compliance in the oversight of CRISPR-edited crops.
Despite these promising results, several limitations merit discussion. First, although the model achieves strong performance in retrospective evaluations and simulated regulatory-change environments, its real-world effectiveness will depend on validation under prospective and rapidly evolving regulatory conditions. Active policy ecosystems often involve stakeholder negotiations, political shifts, and emergent biosafety concerns that cannot be fully captured through historical datasets alone. Future work should therefore incorporate real-time regulatory feedback loops and continual learning mechanisms that allow the framework to update its inferences as new policy data become available. Second, the current system relies substantially on structured policy descriptors, which can restrict its applicability in jurisdictions where regulatory language is ambiguous, inconsistently formatted, or lacking in standardized terminology. Expanding the framework to include natural language interpretation models capable of parsing unstructured and multilingual policy texts will be essential for global generalizability. Additionally, integrating multilingual legal corpora should improve cross-jurisdictional harmonization, particularly in regions with limited digitized policy resources. Future research should also explore the ethical implications of automated policy recommendation systems, develop interpretability modules tailored for regulatory auditors, and investigate online or few-shot adaptation strategies to maintain system agility as CRISPR technologies evolve. With these enhancements, the proposed framework could support transparent, equitable, and globally harmonized governance of gene-edited crops while preserving the scientific and regulatory adaptability required for next-generation agricultural biotechnology.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.
Author contributions
FZ: Data curation, Formal analysis, Funding acquisition, Conceptualization, Investigation, Software, Writing – original draft, Writing – review & editing. ZL: Methodology, Project administration, Resources, Supervision, Visualization, Validation, Writing – original draft, Writing – review & editing. ZZ: Writing – review & editing, Writing – original draft.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Conflict of interest
The authors declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Ahmad, A., Ghouri, M. Z., Munawar, N., Ismail, M., Ashraf, S., and Aftab, S. O. (2021a). “Regulatory, ethical, and social aspects of CRISPR crops,” in CRISPR crops: the future of food security (Springer), 261–287. doi: 10.1007/978-981-15-7142-8_9
Ahmad, A., Munawar, N., Khan, Z., Qusmani, A. T., Khan, S. H., Jamil, A., et al. (2021b). An outlook on global regulatory landscape for genome-edited crops. Int. J. Mol. Sci. 22, 11753. doi: 10.3390/ijms222111753
Atimango, A. O., Wesana, J., Kalule, S. W., Verbeke, W., and De Steur, H. (2024). Genome editing in food and agriculture: from regulations to consumer perspectives. Curr. Opin. Biotechnol. 87, 103127. doi: 10.1016/j.copbio.2024.103127
Bansal, P. and Kaur, N. (2025). Assessing risks associated with large-scale adoption of CRISPR gene-edited crops. J. Crop Sci. Biotechnol. 28, 155–165. doi: 10.1007/s12892-024-00273-0
Caradus, J. R. (2023). Processes for regulating genetically modified and gene edited plants. GM Crops Food 14, 1–41. doi: 10.1080/21645698.2023.2252947
Duensing, N., Sprink, T., Parrott, W. A., Fedorova, M., Lema, M. A., Wolt, J. D., et al. (2018). Novel features and considerations for ERA and regulation of crops produced by genome editing. Front. Bioeng. Biotechnol. 6, 79. doi: 10.3389/fbioe.2018.00079
Entine, J., Felipe, M. S. S., Groenewald, J.-H., Kershen, D. L., Lema, M., McHughen, A., et al. (2021). Regulatory approaches for genome edited agricultural plants in select countries and jurisdictions around the world. Transgenic Res. 30, 551–584. doi: 10.1007/s11248-021-00257-8
Fernandes, P. M., Favaratto, L., Merchán-Gaitán, J. B., Pagliarini, R. F., Zerbini, F. M., and Nepomuceno, A. L. (2024). “Regulation of CRISPR-edited plants in Latin America,” in Global regulatory outlook for CRISPRized plants (Elsevier), 197–212. Available online at: https://www.sciencedirect.com/science/article/pii/B9780443184444000120.
Fernandes, P. M. B., Fernandes, A. A. R., Maurastoni, M., and Rodrigues, S. P. (2025). Lab legends and field phantoms: The tale of virus-resistant plants. Annu. Rev. Virol. doi: 10.1146/annurev-virology-092623-101850
Fiaz, S., Khan, S. A., Noor, M. A., Ali, H., Ali, N., Alharthi, B., et al. (2021). “CRISPR/Cas9 regulations in plant science,” in CRISPR and RNAi systems (Elsevier), 33–45. Available online at: https://www.sciencedirect.com/science/article/pii/B9780128219102000357.
Freeland, L. V., Phillips, D. W., and Jones, H. D. (2024). Precision breeding and consumer safety: A review of regulations for UK markets. Agriculture 14, 1306. doi: 10.3390/agriculture14081306
Friedrichs, S., Ludlow, K., and Kearns, P. (2022). “Regulatory and policy considerations around genome editing in agriculture,” in Applications of genome modulation and editing (Springer), 327–366. doi: 10.1007/978-1-0716-2301-5_17
Ghouri, M. Z., Munawar, N., Aftab, S. O., and Ahmad, A. (2023). “Regulation of CRISPR edited food and feed: legislation and future,” in GMOs and political stance (Elsevier), 261–287. Available online at: https://www.sciencedirect.com/science/article/pii/B9780128239032000044.
Gupta, S., Kumar, A., Patel, R., and Kumar, V. (2021). Genetically modified crop regulations: scope and opportunity using the CRISPR-Cas9 genome editing approach. Mol. Biol. Rep. 48, 4851–4863. doi: 10.1007/s11033-021-06477-9
Hundleby, P. and Harwood, W. (2022). “Regulatory constraints and differences of genome-edited crops around the globe,” in Genome editing: current technology advances and applications for crop improvement (Springer), 319–341. doi: 10.1007/978-3-031-08072-2_17
Jones, M. G., Fosu-Nyarko, J., Iqbal, S., Adeel, M., Romero-Aldemita, R., Arujanan, M., et al. (2022). Enabling trade in gene-edited produce in Asia and Australasia: The developing regulatory landscape and future perspectives. Plants 11, 2538. doi: 10.3390/plants11192538
Kocsisova, Z. and Coneva, V. (2023). Strategies for delivery of CRISPR/Cas-mediated genome editing to obtain edited plants directly without transgene integration. Front. Genome Editing. 5, 1209586. doi: 10.3389/fgeed.2023.1209586
Kose, A. M., Kocadagli, O., Taştan, C., Aktan, C., Ünaldı, O. M., Güzenge, E., et al. (2024). Unveiling off-target mutations in CRISPR guide RNAs: Implications for gene region specificity. CRISPR J. 7, 168–178. doi: 10.1089/crispr.2024.0002
Kumar, A., Kumar, R., Singh, N., and Mansoori, A. (2020). “Regulatory framework and policy decisions for genome-edited crops,” in CRISPR/cas genome editing: strategies and potential for crop improvement (Springer), 193–201. doi: 10.1007/978-3-030-42022-2_9
Kumar, M., Prusty, M. R., Pandey, M. K., Singh, P. K., Bohra, A., Guo, B., et al. (2023). Application of CRISPR/Cas9-mediated gene editing for abiotic stress management in crop plants. Front. Plant Sci. 14, 1157678. doi: 10.3389/fpls.2023.1157678
Kumawat, T., Agarwal, A., Saxena, S., and Arora, S. (2024). An analysis of global policies and regulation on genome editing in plants. Gene Editing Plants: CRISPR-Cas Its Appl. 775–793. doi: 10.1007/978-981-99-8529-6_27
Li, X., Xie, Y., Zhu, Q., and Liu, Y.-G. (2017). Targeted genome editing in genes and cis-regulatory regions improves qualitative and quantitative traits in crops. Mol. Plant 10, 1368–1370. doi: 10.1016/j.molp.2017.10.009
Li, L., Zhang, Z., and Zhang, B. (2025). CRISPR meets AlphaFold: guiding SWEET10-enhanced oil production. Trends Plant Sci. doi: 10.1016/j.tplants.2025.08.001
Lokya, V., Singh, S., Chaudhary, R., Jangra, A., and Tiwari, S. (2025). Emerging trends in transgene-free crop development: insights into genome editing and its regulatory overview. Plant Mol. Biol. 115, 84. doi: 10.1007/s11103-025-01600-x
Mbinda, W. M. (2024). “Regulatory status of CRISPR-edited crops in Africa,” in Global regulatory outlook for CRISPRized plants (Elsevier), 327–341. Available online at: https://www.sciencedirect.com/science/article/pii/B9780443184444000053.
Movahedi, A., Aghaei-Dargiri, S., Li, H., Zhuge, Q., and Sun, W. (2023). CRISPR variants for gene editing in plants: biosafety risks and future directions. Int. J. Mol. Sci. 24, 16241. doi: 10.3390/ijms242216241
Munawar, N., Ahsan, K., and Ahmad, A. (2024a). “CRISPR-edited plants’ social, ethical, policy, and governance issues,” in Global regulatory outlook for CRISPRized plants (Elsevier), 367–396. Available online at: https://www.sciencedirect.com/science/article/pii/B9780443184444000119.
Munawar, N., Faheem, M., Niamat, A., Munir, A., Khan, S. H., Zahoor, M. K., et al. (2024b). “Regulatory, ethical, social, and biosafety concerns in genome-edited horticultural crops,” in CRISPRized horticulture crops (Elsevier), 421–438. Available online at: https://www.sciencedirect.com/science/article/pii/B9780443132292000260.
Niraula, S., Khanal, R., and Ghimire, N. (2024). CRISPR-Cas based precision genome editing: current advances and associated challenges in crop improvement and trait enhancement. Genet. Appl. 8. doi: 10.31383/ga.vol8iss2ga05
Sagawa, C. H. D., Assis, R. d. A. B., and Zaini, P. A. (2024). “Regulatory framework of CRISPR-edited crops in the United States,” in Global regulatory outlook for CRISPRized plants (Elsevier), 167–195. Available online at: https://www.sciencedirect.com/science/article/pii/B9780443184444000041.
Sebastiano, M. R., Hadano, S., Cesca, F., Caron, G., Lamacchia, L., Francisco, S., et al. (2024). Preclinical alternative drug discovery programs for monogenic rare diseases. Should small molecules or gene therapy be used? The case of hereditary spastic paraplegias. Drug Discov. Today 29, 104138. Available online at: https://www.sciencedirect.com/science/article/pii/S1359644624002630.
Sharma, N., Thakur, K., Zinta, R., Mangal, V., Tiwari, J. K., Sood, S., et al. (2025). Genome editing research initiatives and regulatory landscape of genome edited crops in India. Transgenic Res. 34, 1–18. doi: 10.1007/s11248-025-00432-1
Sprink, T., Wilhelm, R., and Hartung, F. (2022). Genome editing around the globe: an update on policies and perceptions. Plant Physiol. 190, 1579–1587. doi: 10.1093/plphys/kiac359
Tripathi, L., Ntui, V., and Tripathi, J. N. (2024). Application of CRISPR/Cas-based gene-editing for developing better banana. Front. Bioengineering Biotechnol. 12, 1395772. doi: 10.3389/fbioe.2024.1395772
Tudini, E., Andrews, J., Lawrence, D. M., King-Smith, S. L., Baker, N., Baxter, L., et al. (2022). Shariant platform: enabling evidence sharing across Australian clinical genetic-testing laboratories to support variant interpretation. Am. J. Hum. Genet. 109, 1960–1973. doi: 10.1016/j.ajhg.2022.10.006
Turnbull, C., Lillemo, M., and Hvoslef-Eide, T. A. (2021). Global regulation of genetically modified crops amid the gene edited crop boom–a review. Front. Plant Sci. 12, 630396. doi: 10.3389/fpls.2021.630396
Uzochukwu, S. and Okoli, A. S. (2022). “CRISPR/Cas9: regulatory considerations to ensure safety of gene-edited products in Nigeria,” in Biosafety and bioethics in biotechnology (CRC Press), 19–29. Available online at: https://www.taylorfrancis.com/chapters/edit/10.1201/9781003179177-2/crispr-cas9-sylvia-uzochukwu-arinze-okoli.
Vora, Z., Pandya, J., Sangh, C., and Vaikuntapu, P. R. (2023). The evolving landscape of global regulations on genome-edited crops. J. Plant Biochem. Biotechnol. 32, 831–845. doi: 10.1007/s13562-023-00863-z
Wang, J., Zhang, L., Wang, S., Wang, X., Li, S., Gong, P., et al. (2025). AlphaFold-guided bespoke gene editing enhances field-grown soybean oil contents. Advanced Sci. 12, 2500290. doi: 10.1002/advs.202500290
Wolt, J. D., Yang, B., Wang, K., and Spalding, M. H. (2016). Regulatory aspects of genome-edited crops. In Vitro Cell. Dev. Biology-Plant 52, 349–353. doi: 10.1007/s11627-016-9784-3
Zhang, D., Hussain, A., Manghwar, H., Xie, K., Xie, S., Zhao, S., et al. (2020). Genome editing with the CRISPR-Cas system: an art, ethics and global regulatory perspective. Plant Biotechnol. J. 18, 1651–1669. doi: 10.1111/pbi.13383
Keywords: biosafety compliance, Constraint-Aware Policy Induction (CAPI) strategy, CRISPR-Cas, gene-edited crops, GeneRegAlignNet model
Citation: Zhu F, Liu Z and Zheng Z (2026) An AI-driven framework for enhancing regulatory precision and efficiency in CRISPR-Cas gene-edited crops: challenges, opportunities, and global harmonization. Front. Plant Sci. 16:1693105. doi: 10.3389/fpls.2025.1693105
Received: 26 August 2025; Revised: 09 December 2025; Accepted: 10 December 2025; Published: 05 February 2026.
Edited by: Baohong Zhang, East Carolina University, United States
Reviewed by: Taras P. Pasternak, Miguel Hernández University of Elche, Spain; Ali Mertcan Köse, Istanbul Commerce University, Türkiye
Copyright © 2026 Zhu, Liu and Zheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Zeyu Zheng
Feng Zhu1