The Earth BioGenome Project Phase II: illuminating the eukaryotic tree of life

Explainer

Front Sci, 04 September 2025

Volume 3 - 2025 | https://doi.org/10.3389/fsci.2025.1514835

This is part of an article hub

The Earth BioGenome Project: a plan to sequence the DNA of life on Earth

From the mighty blue whale to the humble baker’s yeast, eukaryotes encompass an incredibly varied group of 1.67 million named species. This includes all animals, plants, fungi, and a wide range of single-celled organisms called protists.

Yet despite this extraordinary diversity, scientists have sequenced the genomes of only about 1% of known eukaryotic species. The Earth BioGenome Project (EBP) aims to change this, with a plan to sequence the genome of every known eukaryote within 10 years.

The project consists of three phases. While it has established a firm foundation during Phase I, the scale of its ambition as it enters Phase II presents technical challenges. In their Frontiers in Science lead article, Blaxter et al. provide an update on its progress and propose new plans for this next phase.

This explainer summarizes the article’s main points.

What are eukaryotes?

Eukaryotes form one of the major branches of life. They inhabit nearly every ecosystem, from deep-sea vents to cloud forests, and range from microscopic algae to some of the largest and longest-living species on Earth.

They are defined by how they store their DNA—inside a distinct nucleus enclosed within each cell. This contrasts with prokaryotes, such as bacteria, which lack a nucleus and have free-floating genetic material.

Around 1.67 million eukaryotic species have been formally named—though more than 10 million are thought to exist, and scientists discover new ones every day.

What is the purpose of the EBP?

The project’s overarching goal is to “sequence life for the future of life”, a biological “moonshot” in the scale of its ambition.

“Sequencing” means identifying the sequence of the four building blocks that make up the DNA of all organisms. EBP scientists sequence DNA using a common set of guidelines to ensure they are creating a standardized, high-quality record or catalog. These records are called “reference genomes.”

Today, most of the genetic diversity within eukaryotes remains unexplored, limiting efforts to understand evolution, ecological interactions, and the genetic basis of traits across species.

By generating a rich database of reference-quality genomes for all eukaryotes, EBP aims to fill this knowledge gap and lay the foundation for future research and innovation in many fields. It will help scientists understand biodiversity, support conservation, advance medicine and agriculture, discover biological innovations, inspire bioindustry, and build a reference for life on Earth.

What has the EBP achieved so far?

During the initial set-up, the project developed the methods and ethical framework required for completing the sequencing. It also established a global network of collaborators through the coordination of 60 affiliated projects.

By the end of 2024, EBP-affiliated projects had published 1,667 genomes spanning more than 500 eukaryotic families. A further 1,798 genomes meeting EBP standards were also deposited by other researchers within the network, bringing the total number of genomes to 3,465.

Blaxter et al. write that these genomes have provided “significant insights into both fundamental and applied biological questions” such as the origins and evolution of life on Earth, and the role of genetic diversity in species’ ability to adapt to climate change.

The EBP has also created open access online hubs where researchers can store and access genomes.

What are the goals for the rest of the project?

Drawing on lessons learned so far, Blaxter et al. propose updated goals for Phases II and III of the EBP.

The authors’ plan to scale up sequencing is built around three pillars:

  • adaptive sampling: the scientists aim to have sequenced 150,000 species, and collected 300,000 samples, by the end of Phase II. To achieve this within 4 years would require completing 3,000 new genomes a month—a 10-fold increase on current rates. They will prioritize species that are important to ecosystem health, food security, pandemic control, conservation, and Indigenous peoples and local communities

  • highest genome quality: with ongoing advancements in genomic technologies, their goal is to sequence as many as possible of the 150,000 genomes to reference quality

  • equitable global partnerships: much of the Earth’s biodiversity is found in the Global South. The scientists aim to ensure that a significant portion of species collection, sample management, sequencing, assembly, annotation, and analysis is delivered by local EBP partners.

These three pillars will lay the foundations for Phase III of the EBP, which aims to achieve the ultimate goal of sequencing all eukaryotic species.
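
The throughput figures in the first pillar can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes an even sequencing pace across 48 months and reads "10-fold" as exactly 10, neither of which the article states. The constant and function names are hypothetical.

```python
# Figures quoted in the explainer.
TARGET_GENOMES = 150_000      # species to be sequenced by the end of Phase II
PHASE_II_YEARS = 4            # time frame quoted for reaching the target
QUOTED_MONTHLY_RATE = 3_000   # "3,000 new genomes a month"
QUOTED_FOLD_INCREASE = 10     # "a 10-fold increase on current rates"


def required_monthly_rate(target: int, years: int) -> float:
    """Genomes per month needed to hit the target (assumes an even pace)."""
    return target / (years * 12)


def implied_current_rate(quoted_rate: int, fold_increase: int) -> float:
    """Current monthly rate implied by the quoted fold increase (assumes exactly 10x)."""
    return quoted_rate / fold_increase


if __name__ == "__main__":
    needed = required_monthly_rate(TARGET_GENOMES, PHASE_II_YEARS)
    current = implied_current_rate(QUOTED_MONTHLY_RATE, QUOTED_FOLD_INCREASE)
    print(f"Required pace: {needed:,.0f} genomes per month")
    print(f"Implied current pace: {current:,.0f} genomes per month")
```

Under these assumptions the required pace is 3,125 genomes a month, in line with the rounded figure of 3,000, and the implied current pace is roughly 300 a month.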

What are the key challenges for EBP?

To meet its Phase II goals, the Earth BioGenome Project must overcome five major technical hurdles:

  • sampling 300,000 species: finding, collecting, and storing samples at this scale is a huge task. The team intends to deploy a global workforce that can collect, taxonomically identify, store, and prepare samples for DNA and RNA sequencing

  • sequencing at speed and scale: genome sequencing needs to be faster, cheaper, and more automated. Extracting enough DNA from tiny samples, and avoiding contamination, are ongoing technical challenges

  • annotating 150,000 genomes: assigning biological meaning to each genome—such as identifying specific genes—is complex and time-consuming. New tools and methods are needed to keep pace with the enormous volume of data

  • analyzing the data: with so many genomes being produced, existing analysis methods must speed up. EBP will work with partners to explore cell diversity and other biological insights

  • reducing environmental impact: large-scale computing leaves a carbon footprint. The project will use shared tools, cloud platforms, and standardized workflows to avoid repeating analyses and limit emissions.

How much will the EBP cost?

Rapid advances in sequencing, data generation, and computer analysis have made high-quality sequencing much more achievable and less expensive than before. In 2018, Phase I was predicted to cost US$600 million. However, the latest estimate is less than half of that.

Phase I sequenced the first genomes at an average cost of US$28,000 per genome. Blaxter et al. estimate that Phase II sequencing can be completed at around US$6,100 per genome.

The overall cost for all EBP phases is estimated at US$4.42 billion over 10 years, with US$1.1 billion for Phase II and US$3.1 billion for Phase III. These figures include a US$0.5 billion Foundational Impact Project fund, which will support initiatives in the Global South to use genome sequencing in conservation, agriculture, biodiversity monitoring, and biotechnology.

However, the EBP has yet to secure full funding, although decentralized nodes have attracted funding from more than 20 countries.

The authors describe the EBP as extraordinary value for money, comparing its cost with the US$3 billion spent on the Human Genome Project and the US$10 billion invested in the James Webb Space Telescope.
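
The cost figures quoted in this section can be related to one another with a few lines of arithmetic. The sketch below is a rough cross-check, not the authors' costing model: it assumes all 150,000 Phase II genomes are sequenced at the quoted Phase II unit cost, and the constant and function names are hypothetical.

```python
# Figures quoted in the explainer (US dollars).
TOTAL_ALL_PHASES = 4.42e9
PHASE_II_TOTAL = 1.1e9
PHASE_III_TOTAL = 3.1e9
PHASE_I_COST_PER_GENOME = 28_000
PHASE_II_COST_PER_GENOME = 6_100
PHASE_II_TARGET_GENOMES = 150_000  # assumption: all counted at the Phase II unit cost


def unitemized_remainder() -> float:
    """Part of the overall total not assigned to Phase II or Phase III."""
    return TOTAL_ALL_PHASES - PHASE_II_TOTAL - PHASE_III_TOTAL


def phase_ii_sequencing_cost() -> int:
    """Sequencing-only cost of Phase II at the quoted per-genome price."""
    return PHASE_II_TARGET_GENOMES * PHASE_II_COST_PER_GENOME


def cost_reduction_factor() -> float:
    """How many times cheaper a Phase II genome is than a Phase I genome."""
    return PHASE_I_COST_PER_GENOME / PHASE_II_COST_PER_GENOME


if __name__ == "__main__":
    print(f"Unitemized remainder: US${unitemized_remainder() / 1e9:.2f} billion")
    print(f"Phase II sequencing: US${phase_ii_sequencing_cost() / 1e6:,.0f} million")
    print(f"Per-genome cost reduction: {cost_reduction_factor():.1f}x")
```

On these figures, US$0.22 billion of the overall total is not assigned to Phase II or Phase III. The explainer does not itemize it; it would be consistent with a Phase I cost below half of US$600 million, but that reading is an assumption. Sequencing 150,000 genomes at the quoted unit cost comes to about US$915 million, within the US$1.1 billion Phase II figure, and the per-genome cost falls by a factor of about 4.6.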