Edited by: Haja N. Kadarmideen, Technical University of Denmark, Denmark
Reviewed by: Christophe Klopp, Institut National de la Recherche Agronomique de Toulouse, France; Kieran G. Meade, Teagasc, The Irish Agriculture and Food Development Authority, Ireland
*Correspondence: Rachel Young,
This article was submitted to Livestock Genomics, a section of the journal Frontiers in Genetics
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
The domestic water buffalo (
The domestic water buffalo (
A draft water buffalo genome was released in 2013 and published in 2017 (
Next-generation sequencing technologies allow us to generate genome-scale transcription maps providing information on both the structure and level of expression of a gene (
In the present study, we have constructed a comprehensive atlas of gene expression encompassing 220 tissue and cell samples collected from 10 river buffaloes of three different breeds (Mediterranean, Pandharpuri, and Bhadawari). We generated over 21 billion raw sequence reads which mapped to 18,730 unique genes. The dataset was used to support annotation of transcribed sequences in the new buffalo genome assembly (
Ethics approval was obtained from The Roslin Institute’s and the University of Edinburgh’s Protocols and Ethics Committees. All animal work was carried out under the regulations of the Animals (Scientific Procedures) Act 1986.
All animals used in this study were healthy. Samples were collected from six Mediterranean water buffaloes and four Indian water buffaloes (Pandharpuri and Bhadawari breeds). Tissues from the major organ systems were dissected into small pieces (100 mg) and collected into RNAlater® or snap frozen in liquid nitrogen. Bone marrow, alveolar lavage, and peripheral blood mononuclear cells (PBMCs) were collected and cryopreserved at −155°C for subsequent culture and RNA extraction. Viable cell counts were performed using Trypan blue (Gibco). All cell viabilities were >90%. Bone marrow was flushed from the posterior ribs with RPMI-1640 containing 5 mM EDTA, filtered through a 100-µm cell strainer (Corning) then pelleted by centrifugation (400 × g for 5 min). Red blood cells were removed by lysis for 5 min at room temperature in RBC lysis buffer (BioLegend) then washed in phosphate buffered saline (PBS). Alveolar lavage was performed by removing the lungs and trachea, then flushing the lungs with PBS through an endotracheal tube. The lavage was then filtered through a 100-µm cell strainer (Corning) then pelleted by centrifugation (400 × g for 10 min). Alveolar macrophages were isolated from alveolar lavages by culturing them overnight in complete medium [RPMI-1640, 20% heat-inactivated fetal calf serum (FCS) (GE Healthcare), penicillin/streptomycin (Invitrogen), and GlutaMAX Supplement (Invitrogen)] supplemented with 104 U/ml rhCSF1 at 106 cells/ml in six-well plates. The following day, non-adherent cells were removed with the media, and remaining alveolar macrophages were collected in TRIzol (Ambion). PBMCs were isolated from whole blood by centrifuging at 1,200 × g for 15 min (no brake) to obtain buffy coats. The buffy coat was then diluted in an equal volume of PBS + 2% FCS then layered over Lymphoprep (Axis-Shield) and centrifuged at 1,200 × g for 25 min (no brake). The mononuclear cell fraction was collected and washed in PBS, then red blood cells removed by lysis as detailed above. Bone marrow-derived macrophages (BMDMs) and monocyte-derived macrophages (MDMs) were obtained by culturing bone marrow cells or PBMCs, respectively, at 106 cells/ml on sterile bacteriological plastic in the presence of recombinant human colony-stimulating factor (rhCSF1; 104 U/ml; gift from Chiron, Emeryville, CA) for 10–11 days. To capture inducible innate immune effector genes, BMDMs were stimulated with 100 ng/ml lipopolysaccharide (LPS) derived from
Total RNA was isolated from 220 tissue and cell samples (
Illumina TruSeq Stranded Total and mRNA libraries were generated and sequenced by Edinburgh Genomics using the Illumina TruSeq Stranded library protocols for total RNA library preparation (part: 15031048, revision E) and mRNA library preparation (part: 15031047, revision E). Briefly, ribosomal RNA (rRNA) was depleted from samples for total RNA-Seq, using biotinylated, target-specific oligonucleotides with Ribo-Zero rRNA removal beads. Following purification, the RNA was fragmented and first-strand cDNA synthesis performed. The RNA template was then removed, and a replacement strand was synthesized incorporating dUTP in place of dTTP to generate double-stranded (ds) cDNA. The incorporation of dUTP quenches the second strand during the subsequent PCR amplification step as the polymerase will not incorporate past this nucleotide. The ds cDNA was purified; then, the 3’ ends adenylated, and indexing adapters ligated to both ends before PCR enrichment of the library. For the TruSeq Stranded mRNA libraries, poly-A-containing mRNA was purified from total RNA using poly-T oligos attached to magnetic beads. From this point, the mRNA library protocol did not differ from the protocol for total RNA library preparation. The libraries were quality controlled using an Agilent Bioanalyzer DNA 1000 Chip and quantified by qPCR before hybridization onto a flow cell. TruSeq Stranded total RNA-Seq and mRNA-Seq libraries were sequenced using an Illumina HiSeq 2500 sequencer at depths of >100 million and >25 million 125-bp paired-end reads per sample, respectively.
The sequence data for the buffalo atlas were processed using two different methods, one alignment-free and one alignment-based, as described in
Using both methods together, we progressively revised the Kallisto index and updated expression-level estimates accordingly. This iterative “multi-pass” approach to Kallisto has been described previously (
For the “first pass,” we ran Kallisto on all samples, using as its index the complete set of 45,402 predicted transcripts for the draft
We then parsed these “first pass” data, which comprised of approximately 22 billion pseudoalignments (
This “two pass” method was previously used to create an expression atlas for the domestic sheep (
The StringTie assembly is accurate with respect to the draft annotation, reconstructing all existing exon models and 82% of the transcript models (
In the transcriptome assembly created here, thousands of new transcript models are predicted, although in the absence of experimental verification, it is not easy to determine which are plausible, as opposed to stochastic noise in RNA processing or assembly artifacts. A large number of false positive transcripts are expected as the assembly integrates both mRNA-Seq and total RNA-Seq datasets. The latter measures nascent (ongoing) transcription (
Novel transcript models were retained only if they could be robustly annotated as protein-coding. To do so, the longest ORF in each exon of its set of exon models was identified. To include this transcript in the “third pass” index, we required that a) for every exon, the longest ORF is on the same strand; b) the last ORF terminates in a stop codon, rather than simply because the ORF remains open until the end of the exon; c) although the ORF of every internal exon does not have to span the entire exon length (because there may be noise in the placement of the exon/intron boundary), no internal ORF contains a stop codon (i.e., the ORF must end when the exon does); and d) the peptide, concatenated from the set of translated ORFs, is ≥50 amino acids in length. These peptides were then aligned against the NCBI non-redundant (nr) peptide database v77 (
Conservative criteria were applied to parse these alignments. For a novel transcript model to be retained, ≥ 5 alignments were required, at least one of which is to a gene model from a ruminant genus [
Using this “third pass” index, on average 60–70% of the known buffalo (UMD_CASPUR_WB_2.0) protein-coding genes were detectably expressed (average TPM, across all replicates, >1) in all tissues (
To supplement the data generated herein, we integrated additional buffalo transcriptome data from the European Nucleotide Archive (ENA) under accession number PRJEB4351. These data were generated to provide reference RNA-Seq data as part of the International Water Buffalo Genome Project (
Expression data were represented as average transcripts per million (TPM) per gene per tissue. To visualize the data, we used the network analysis tool Graphia Professional
Sample metadata for all tissue and cell samples, prepared in accordance with Functional Annotation of Animal Genomes (FAANG) Consortium metadata standards (
The complete “third pass” expression atlas, including samples derived from (
The core of this dataset was derived from four 6-month old Mediterranean buffalo. From these animals, we collected tissues from all major organ systems and, wherever possible, collected biological replicates from each sex. These tissue samples were supplemented with immune cells from two additional animals of the same breed. Collectively, the Mediterranean buffalo contributed 164 samples to the atlas. We also collected the same set of tissues from our Indian buffalo cohorts which, due to restricted availability, were older (5–7 years old). Biological replicates (2 males, 2 females) were collected where possible. The Indian animals contributed 56 samples to the atlas. A number of immune cell types were sampled, including different subsets of macrophages and their progenitors (alveolar macrophages, MDMs, BMDMs +/− LPS, bone marrow cells, and PBMCs). Previous projects in several species have indicated that macrophages are a rich source of novel mRNAs (
Two types of library were generated to capture the expression of the largest diversity of RNA species possible, ribo-depleted total RNA, and (polyA) mRNA. These two library types were sequenced at different depths: total RNA at >100 million paired-end read depth and mRNA at >25 million paired-end read depth, generating approximately 21 billion raw reads in total.
We selected a wide range of tissues for the atlas to obtain the largest diversity of transcripts possible, in addition to integrating 30 RNA-Seq libraries from a previous study (detailed in
Methods such as weighted correlation network analysis (WGCNA) and partial correlation and information theory (PCIT) have been used by others to perform gene co-expression analysis in livestock species (
Clustered network graph of the buffalo transcriptome. The buffalo atlas data were visualised by a network graph based on Pearson correlation co-efficients for gene expression patterns. Each node represents a transcript and each edge (line) represents the correlation between individual measurements above a threshold of r = 0.80. The graph comprises 15,752 nodes connected by 1,851,403 edges. Clustering of the graph using the MCL algorithm was used to assign genes to classes or clusters based on their co-expression. The clusters can be annotated depending on the tissue specificity or cellular process by function of their contents. Plots of average expression profile for a few selected clusters are given on the right. The samples used to generate the graph are shown on the X axis ordered by organ system (colour-coded) and the Y axis shows the average TPM for the cluster. The tissue specificity of gene co-expression in three selected clusters are shown. These clusters contain genes that are highly expressed in macrophages, cellular respiration and the gastrointestinal tract.
The content of the top 50 clusters is summarized in
Fifty largest clusters from network analysis.
Cluster ID | No. of transcripts | Tissue specificity | Class | Biological process | GO term |
|
---|---|---|---|---|---|---|
1 | 3,372 | Endometrium > spleen > general, relatively even | Housekeeping | Cellular nitrogen compound metabolic process | GO:0034641 | 1.91 × 10−56 |
2 | 648 | Oviduct > fallopian tube > testis > endometrium | Reproductive system | Cilium organization |
GO:0044782 |
6.67 × 10−29
|
Cell projection assembly | GO:0030031 | 1.04 × 10−26 | ||||
3 | 636 | PBMCs > endometrium > spleen > general | Housekeeping | Macromolecule metabolic process | GO:0043170 | 6.39 × 10−22 |
4 | 542 | Testis (Bhadawari) | Male reproductive | Male gamete generation | GO:0048232 | 7.50 × 10−14 |
Spermatogenesis | GO:0007283 | 4.76 × 10−13 | ||||
5 | 381 | Cerebellum > hippocampus | CNS | nervous system development | GO:0007399 | 2.50 × 10−20 |
generation of neurons | GO:0048699 | 2.82×10−17 | ||||
6 | 337 | Embryo | Developmental | No statistically significant results | ||
7 | 314 | White blood cells (WBC) | Immune | Response to cytokines | GO:0034097 | 1.42 × 10−8 |
Immune system process | GO:0002376 | 1.43 × 10−8 | ||||
Regulation of immune system process | GO:0002682 | 5.47 × 10−7 | ||||
8 | 312 | Spleen > WBC > PBMCs > lymph nodes > immune tissues | Immune | Immune system process | GO:0002376 | 1.25 × 10−18 |
Regulation of immune system process | GO:0002682 | 6.63 × 10−13 | ||||
9 | 245 | General | Housekeeping | No statistically significant results | ||
10 | 227 | Peyer’s patches > ileum > bone marrow > thymus > spleen | Pathway | Cell cycle |
GO:0007049 |
1.14 × 10−57
|
Cell cycle process | GO:0022402 | 5.60 × 10−42 | ||||
11 | 225 | Spinal cord > obex > hippocampus > cerebellum | CNS | Axon ensheathment |
GO:0008366 |
6.92 × 10−10
|
Ensheathment of neurons | GO:0042552 | 9.8 × 10−9 | ||||
12 | 192 | Heart > left ventricle > right ventricle > right atrium > thoracic esophagus | Cardiovascular system | Muscle structure development |
GO:0061061 |
4.59 × 10−8
|
Striated muscle tissue development | GO:0006941 | 1.67 × 10−6 | ||||
13 | 191 | Kidney cortex > kidney medulla > liver | Renal/endocrine system | Organic acid metabolic process |
GO:0006082 |
3.23 × 10−19
|
Carboxylic acid metabolic process | GO:0019752 | 1.60 × 10−17 | ||||
14 | 178 | Semitendinosus muscle > thoracic esophagus > longitudinal muscle > tongue | Muscular system | Muscle system process |
GO:0003012 |
1.44 × 10−17
|
15 | 157 | Liver | Liver | Blood coagulation | GO:0007596 | 2.09 × 10−18 |
Hemostasis | GO:0007599 | 3.13 × 10−18 | ||||
Coagulation | GO:0050817 | 6.26 × 10−18 | ||||
16 | 155 | Peyer’s patches > ileum > bone marrow > thymus > spleen | Immune | DNA metabolic process | GO:0006259 | 1.07 × 10−7 |
17 | 153 | Endometrium > spleen > heart > general | Pathway | Cellular respiration |
GO:0045333 |
1.29 × 10−44
|
Energy derivation by oxidation of organic compounds | GO:0015980 | 8.91 × 10−41 | ||||
18 | 152 | Endometrium > embryo > spleen > general | Housekeeping | Macromolecule metabolic process |
GO:0043170 |
4.42 × 10−6
|
Cellular macromolecule metabolic process | GO:0044260 | 1.53 × 10−5 | ||||
19 | 148 | Alveolar macrophages > BMDM + LPS | Immune | Immune system process |
GO:0002376 |
2.12 × 10−11
|
Positive regulation of defense response | GO:0031349 | 1.63 × 10−10 | ||||
20 | 135 | Macrophages > spleen | Immune | Collagen catabolic process | GO:0030574 | 5 × 10−5 |
Regulation of immune system process | GO:0002682 | 6.67 × 10−5 | ||||
Collagen metabolic process | GO:0032963 | 1.20 × 10−3 | ||||
21 | 131 | Omasum > rumen > reticulum > abomasum > tongue > tonsil | GI tract | Epidermis development |
GO:0008544 |
3.48 × 10−10
|
Epidermal cell differentiation | GO:0009913 | 3.32 × 10−7 | ||||
22 | 112 | Spleen > lymph nodes > small and large intestine > lung | Immune | Immune system process |
GO:0002376 |
8.51 × 10−10
|
Regulation of immune system process | GO:0002682 | 1.27 × 10−4 | ||||
23 | 111 | Bone marrow > spleen > BMDM +/− LPS | Immune | Protoporphyrinogen IX metabolic process | GO:0046501 | 6.87 × 10−5 |
Porphyrin-containing compound metabolic process | GO:0006778 | 9.05 × 10−5 | ||||
Tetrapyrrole metabolic process | GO:0033013 | 9.17 × 10−5 | ||||
24 | 105 | Endometrium > oviduct > fallopian tube > testis > epididymis > general | Reproductive system | Cilium organization |
GO:0044782 |
2.79 × 10−7
|
25 | 102 | Endometrium > cerebellum > spinal cord > obex | CNS | Ion transmembrane transport |
GO:0034220 |
2.27 × 10−2
|
26 | 102 | Small and large intestine | GI tract | Brush border assembly | GO:1904970 | 2.25 × 10−4 |
Regulation of microvillus organization | GO:0032530 | 1.02 × 10−2 | ||||
Regulation of cell projection size | GO:0032536 | 1.45 × 10−2 | ||||
27 | 100 | Omasum > rumen > reticulum > rectum > abomasum > cecum | GI tract | Supramolecular fiber organization |
GO:0097435 |
3.74 × 10−2
|
Actin crosslink formation | GO:0051764 | 5.85 × 10−2 | ||||
28 | 90 | Epididymis > testis | Male reproductive system | No statistically significant results | ||
29 | 79 | Ovary follicle > ovary | Female reproductive system | Regulation of hormone levels |
GO:0010817 |
6.43 × 10−4
|
30 | 76 | Pituitary gland > endometrium | Endocrine system | Endocrine system development | GO:0035270 | 3.19 × 10−8 |
Pituitary gland development | GO:0021983 | 1.7 × 10−7 | ||||
Diencephalon development | GO:0021536 | 9.21 × 10−7 | ||||
31 | 73 | Kidney cortex > kidney medulla | Renal system | Transmembrane transport | GO:0055085 | 7.50 × 10−5 |
Ion transport | GO:0006811 | 1.11 × 10−4 | ||||
Inorganic anion transport | GO:0015698 | 1.86 × 10−4 | ||||
32 | 72 | Adrenal gland | Endocrine system | Organic hydroxy compound transport | GO:0015850 | 6.24 × 10−5 |
Monoamine transport | GO:0015844 | 1.1 × 10−4 | ||||
Serotonin uptake | GO:0051610 | 2.92 × 10−3 | ||||
33 | 62 | Tongue > rumen > reticulum > tonsil | GI tract | No statistically significant results | ||
34 | 59 | General | Housekeeping | No statistically significant results | ||
35 | 54 | Testis > Peyer’s patches > ileum | Pathway | Nuclear transport | GO:0051169 | 1.35 × 10−2 |
Protein-containing complex localization | GO:0031503 | 1.94 × 10−2 | ||||
Nucleocytoplasmic transport | GO:0006913 | 2.03 × 10−2 | ||||
36 | 49 | Embryo | Developmental | No statistically significant results | ||
37 | 48 | Thyroid | Endocrine system | No statistically significant results | ||
38 | 48 | General | Pathway | Amide transport | GO:0042886 | 8.81 × 10−6 |
Protein transport | GO:0015031 | 8.92 × 10−6 | ||||
Peptide transport | GO:0015833 | 8.94 × 10−6 | ||||
39 | 47 | Ovary | Endocrine system | Not statistically significant | ||
40 | 45 | Thyroid > salivary gland > kidney medulla > lung | Endocrine system | Glycoprotein metabolic process |
GO:0009100 |
4.13 × 10−2
|
41 | 43 | Endometrium > epididymis > testis > fallopian tube > ovary follicle | Reproductive system | Regulation of animal organ morphogenesis |
GO:2000027 |
2.29 × 10−2
|
Heart development | GO:0007507 | 3.50 × 10−2 | ||||
42 | 41 | Thoracic esophagus > tongue > semitendinous muscle > left ventricle | Musculoskeletal system | No statistically significant results | ||
43 | 41 | General | Pathway | Protein folding | GO:0006457 | 2.66 × 10−12 |
Positive regulation of protein localization to chromosome, telomeric region | GO:1904816 | 6.19 × 10−12 | ||||
Regulation of establishment of protein localization to chromosome | GO:0070202 | 6.31 × 10−12 | ||||
44 | 41 | White blood cells > endometrium > spleen | Pathway | Protein modification process |
GO:0036211 |
7.93 × 10−3
|
Cellular protein modification process | GO:0006464 | 1.19 × 10−2 | ||||
45 | 37 | Occipital lobe > hippocampus | CNS | No statistically significant results | ||
46 | 36 | Lung > lymph nodes > epididymis | No class | Cardiovascular system development |
GO:0072358 |
1.12 × 10−11
|
Blood vessel development | GO:0001568 | 1.42 × 10−11 | ||||
47 | 34 | Mammary gland | Reproductive system | Proximal/distal pattern formation involved in nephron development | GO:0072047 | 1.71 × 10−2 |
Specification of loop of Henle identity | GO:0072086 | 1.90 × 10−2 | ||||
Pattern specification involved in kidney development | GO:0061004 | 1.99 × 10−2 | ||||
48 | 33 | General | No class | No statistically significant results | ||
49 | 33 | Lymph nodes > lung | Immune | Synapse pruning | GO:0098883 | 1.78 × 10−4 |
Innate immune response | GO:0045087 | 1.34 × 10−3 | ||||
Lymphocyte-mediated immunity | GO:0002449 | 3.09 × 10−2 | ||||
50 | 31 | Lymph nodes > small and large intestine | Immune | Immune system process |
GO:0002376 |
5.87 × 10−8
|
Leukocyte activation | GO:0045321 | 5.26 × 10−5 |
A list of the 50 largest co-expression clusters from the water buffalo gene expression atlas. Clusters are numbered according to their size (the largest is cluster 1). The first two columns give the cluster ID and number of transcripts present in that cluster, the following two columns describe tissue specificity and class (where possible) and the final three columns describe the biological process of the genes co-expressed in that cluster, their GO term and the associated p-value cluster according to gene ontology enrichment analysis using PANTHER.
Most of the smaller clusters contained genes whose expression was restricted to an organ system. Some clusters were specific to a single tissue or cell type, while others were clearly associated with a biological or cellular process. In many cases, the likely function of genes within any of the clusters can be inferred from their cell type enrichment or the known function of well-annotated genes within the clusters. These include organ system and tissue-specific clusters for genes expressed predominantly in the brain (clusters 5, 11, 25, and 45), heart (clusters 12 and 66), and reproductive system (clusters 2, 4, 24, 28, 29, 39, and 41). More specifically, certain clusters were enriched for the biological process GO terms of cilium organization (
We noted that replicate samples sometimes showed different expression patterns. For example, the expression of some genes in the three testis samples were not consistent. These differences may result from the differing ages of the animals (from 6 months to more than 5 years). Variation in other tissues may result from sex-specific effects, phase of oestrus cycle in the females, different husbandry (for example diet, exercise level, ambient temperature), and other factors. These differences could be explored further with a larger set of replicates. Nevertheless, in this analysis, clear, logical associations of gene expression patterns were found in spite of some differences between replicates, as presented below.
We sampled several immune tissue and cell populations to identify genes that might be associated with disease resistance and resilience traits. We identified two main macrophage clusters (clusters 19 and 20) from the atlas data, each enriched for a particular macrophage subset. Genes in cluster 19 showed the highest expression levels in alveolar macrophages (AM) with many of the genes encoding well characterized macrophage-specific proteins. Genes in this cluster were enriched for GO terms including “immune system process” (
Cluster 23 contains genes with expression peaks in bone marrow and spleen and both LPS-stimulated and unstimulated BMDMs. Biological process GO terms enriched in this cluster include “protoporphyrinogen IX metabolic process” (
As previously observed in both the sheep and pig expression atlases (
Genes associated with oxidative phosphorylation (cluster 17).
Associated pathway | Genes |
---|---|
TCA cycle |
|
Oxidative phosphorylation complex I |
|
Oxidative phosphorylation complex II |
|
Oxidative phosphorylation complex III |
|
Oxidative phosphorylation complex V |
|
Mitochondrial membrane transport |
|
Mitochondrial RNA processing |
|
Apoptosis associated |
|
Cellular respiration |
|
Fatty acid (long chain) beta-oxidation |
|
Oxidative phosphorylation related |
|
Ubiquinone biosynthesis |
|
Genes associated with oxidative phosphorylation (Cluster 17).
Genes encoding proteins involved in oxidative phosphorylation and related pathways from Cluster 17 are shown.
Cluster 10 was enriched for genes with GO terms including “cell cycle” (
Although ruminant species have anatomically equivalent gastrointestinal tracts, we considered that GI tract gene expression may differ due to differences in diet, metabolism, or habitat. To test this hypothesis, we compared gene expression in the GI tract between buffalo and sheep, using gene expression data from the sheep atlas (
Cluster 26 was enriched for genes expressed in the small and large intestines, although with highest expression in the former. Genes expressed in this cluster are enriched for the GO terms “brush border assembly” (
We also compared levels of expression of the
The detection of long non-coding RNAs (lncRNAs) from large gene expression atlas projects has added a further layer of complexity to the genome and regulation of gene expression. The ENCODE project has annotated approximately 16,000 lncRNAs in the human genome. More recently, lncRNAs have been annotated in livestock and large animal species such as sheep, goat, cattle, pig, and horse (
lncRNA network graph. A network graph of annotated lncRNAs was generated applying a correlation threshold of
Most of the lncRNA clusters were tissue- or organ system–specific. The largest lncRNA cluster (cluster 1) showed co-expression of lncRNAs in a single buffalo embryo and embryo pool. lncRNA cluster 6, a relatively small cluster containing only 39 lncRNAs, also showed co-expression in the embryo, along with the occipital lobe and longitudinal dorsal muscle. This expression pattern reflects the involvement of lncRNAs in the regulation of gene expression during development (reviewed in
All of the RNA-Seq data generated in this project have been provided to support annotation of intron-exon boundaries in the new water buffalo genome assembly (
Ethics approval was obtained from The Roslin Institute’s and the University of Edinburgh’s Protocols and Ethics Committees. All animal work was carried out under the regulations of the Animals (Scientific Procedures) Act 1986.
RY, LL, SG, SK, AA and DH contributed conception and design of the study; SB performed data analysis and curation; RY, LL, AJ, SS, SJ, VD, ZL, DI, KS, JW and DH collected the samples. LL, RY, AJ, SS, SJ and VD performed the experiments. RY, LL, SB and DH analysed the results. DH, AA, SK and SG secured funding and supervised the project. RY wrote the first draft of the manuscript; LL, SB and DH wrote sections of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.
This work was supported by a joint Biotechnology and Biological Sciences Research Council (BBSRC) (
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We would like to thank the large number of people at the Roslin Institute, UK, Parco Tecnologico Padano (PTP) in Lodi, Italy, and BAIF in Pune, India who helped with the many aspects of the water buffalo expression atlas project. At Dryden Farm (Roslin) the post mortem was performed by Tim King, Peter Tennant and Adrian Ritchie. At PTP the post mortem was performed by Nicola Morandi. In Pune the post mortem was performed by Hemant Kadam and Asmita Kulkarni, and the tissue collection assisted by Shilpa Jayebhaye. The tissue collection and sample processing team at Roslin included Emily Clark, Kristin Sauter, Lindsey Waddell, Ailsa Carlisle, Mark Barnett, Anna Raper, Sara Clohisey, Tim Regan, Clare Pridans and Gemma Davis. The tissue collection team in Lodi included Nadia Fiandanese, Sara Botti, Raffaele Mazza and Bouabid Badaoui. Library preparation and sequencing was performed by Edinburgh Genomics, The University of Edinburgh. We would also like to thank Peter Harrison (EBI– FAANG Data Coordination Centre) for assistance with generating the sample and experimental metadata. The Roslin Institute receives core strategic funding from the Biotechnology and Biological Sciences Research Council, UK (grant numbers BB/J004235/1, BBS/E/D/20211550, BBS/E/D/20211552).
The Supplementary Material for this article can be found online at: