ORIGINAL RESEARCH article

Front. Artif. Intell.

Sec. Machine Learning and Artificial Intelligence

This article is part of the Research Topic: Intelligent Artificial Lift and Multiphase Flow in the Wellbore in the Oil and Gas Production Systems

Causality-Driven Feature Representation for Connectivity Prediction

Provisionally accepted
Bruno César de Oliveira Souza1*, Manuel Avila Castro1, Ahmed Esmin2, Leonardo Machado3, Alexandre Ferreira Melo1, Anderson de Rezende Rocha1
  • 1Artificial Intelligence Lab., Recod.ai, Universidade Estadual de Campinas Instituto de Computacao, Campinas, Brazil
  • 2Universidade Federal de Lavras Departamento de Ciencia da Computacao, Lavras, Brazil
  • 3Shell Brasil Petroleo Ltda, Rio de Janeiro, Brazil

The final, formatted version of the article will be published soon.

Causal reasoning is essential for understanding relationships and guiding decision-making across applications because it identifies cause-and-effect relationships between variables. By uncovering the underlying process that drives these relationships, causal reasoning enables more accurate predictions and controlled interventions, and makes it possible to distinguish genuine causal effects from mere correlations in complex systems. In oil field management, where interactions between injector and producer wells are inherently dynamic, it is vital to uncover causal connections to optimize recovery and minimize waste. Since controlled experiments are impractical in this setting, we must rely solely on observational data. In this paper, we develop a causality-inspired framework that leverages domain expertise to learn causal features for robust connectivity estimation. We address the challenges posed by confounding factors, latency in system responses, and the complexity of inter-well interactions, all of which complicate causal analysis. First, we frame the problem through a causal lens and propose a novel framework that generates pairwise features driven by causal theory. This method captures meaningful representations of relationships within the oil field system. By constructing independent pairwise feature representations, our method implicitly accounts for confounding signals and enhances the reliability of connectivity estimation. Furthermore, our approach requires only limited context data to train machine learning models that estimate the connectivity probability between injectors and producers. We first validate our methodology through experiments on synthetic and semi-synthetic datasets, ensuring its robustness across varied scenarios. We then apply it to the complex Brazilian Pre-Salt oil fields using public synthetic and real-world data.
Our results show that the proposed method effectively identifies injector-producer connectivity while maintaining rapid training times. This enables scalability and provides an interpretable approach for complex dynamic systems through causal theory. While previous projects have employed causal methods in the oil field context, to the best of our knowledge, this is the first work to systematically formulate the problem using causal reasoning that explicitly accounts for relevant confounders, and to develop an approach that addresses these challenges and facilitates the discovery of inter-well connections within an oil field.

Keywords: causal feature learning, connectivity estimation, inter-well interactions, oil field, injector-producer connectivity, causal reasoning, causal theory, dynamic systems

Received: 15 Aug 2025; Accepted: 15 Dec 2025.

Copyright: © 2025 Souza, Castro, Esmin, Machado, Melo and Rocha. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Bruno César de Oliveira Souza

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.