ORIGINAL RESEARCH article
Front. Remote Sens.
Sec. Remote Sensing Time Series Analysis
Volume 6 - 2025 | doi: 10.3389/frsen.2025.1555887
BERT Bi-Modal Self-Supervised Learning for Crop Classification Using Sentinel-2 and Planetscope
Provisionally accepted
- 1Julich Research Center, Helmholtz Association of German Research Centres (HZ), Jülich, Germany
- 2University of Bonn, Bonn, North Rhine-Westphalia, Germany
Crop identification and monitoring of crop dynamics are essential for agricultural planning, environmental monitoring, and ensuring food security. Recent advancements in remote sensing technology and state-of-the-art machine learning have enabled large-scale automated crop classification. However, these methods rely on labeled training data, which requires skilled human annotators or extensive field campaigns, making the process expensive and time-consuming. Self-supervised learning techniques have demonstrated promising results in leveraging large unlabeled datasets across domains. Yet, self-supervised representation learning for crop classification from remote sensing time series remains under-explored due to challenges in curating suitable pretext tasks. While bi-modal self-supervised approaches combining data from Sentinel-2 and Planetscope sensors have facilitated pre-training, existing methods primarily exploit the distinct spectral properties of these complementary data sources. In this work, we propose novel self-supervised pre-training strategies inspired by BERT that leverage both the spectral and temporal resolution of Sentinel-2 and Planetscope imagery. We carry out extensive experiments comparing our approach to existing baseline setups across nine test cases, and our method outperforms the baselines in eight of them. This pre-training thus offers an effective representation of crops for tasks such as crop classification.
Keywords: BERT, Bi-modal contrastive learning, Self-supervised learning, Remote sensing, Crop classification
Received: 05 Jan 2025; Accepted: 15 Apr 2025.
Copyright: © 2025 Patnala, Schultz and Gall. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Ankit Patnala, Julich Research Center, Helmholtz Association of German Research Centres (HZ), Jülich, Germany
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.