Cross-Species RNA-Seq Study Comparing Transcriptomes of Enriched Osteocyte Populations in the Tibia and Skull

Local site-specific differences between bones in different regions of the skeleton account for their different properties and functions. To identify mechanisms behind these differences, we have performed a cross-species study comparing RNA transcriptomes of cranial and tibial osteocytes, from bones with very different primary functions and physiological responses, collected from the same individual mouse, rat, and rhesus macaque. Bioinformatic analysis was performed to identify 32 genes changed in the same direction between sites and shared across all three species. Several well-established key genes in bone growth and remodeling were upregulated in the tibias of all three species (BMP7, DKK1, FGF1, FRZB, SOST). Many of them associate or crosstalk with the Wnt signaling pathway. These results suggest Wnt signaling-related candidates for different control of regulatory mechanisms in bone homeostasis in the skull and tibia and indicate a different balance between genetically determined structure and feedback mechanisms to strains induced by mechanical loading at the different sites.


INTRODUCTION
Patterning during embryogenesis accounts for the development of all the specialized tissues and organs in multicellular organisms (1). Skeletal patterning accounts for the shape and structure of different bones and their behavior during life. However, mechanical forces provide additional potent influences on most parts of the skeleton, both before (2,3) and after birth (4). This process of skeletal adaptation tunes bone strength to prevailing functional demands, adding material above a genetically determined baseline structure that forms and is retained even in the absence of loading stimuli.
The physical demands on bones, and therefore their optimal structures vary with their site in the skeleton. These form/function relationships vary widely throughout the skeleton (5), but we can consider examples at the opposite ends of the spectrum. Long bones need to be rigid levers for efficient locomotion, while the skull provides protection for the vulnerable brain. Structure of long bones in the limbs is regulated by the loads they experience during physiological activity (6), while the skull must retain mass and strength to resist damaging events despite the absence of any significant or regular loading.
We have shown that in a human subject, the peak strains experienced in the skull even under extreme physiological circumstances do not reach 1/10th of the magnitude or rate those in the tibia of the same person (7). Such low strains would be perceived as disuse in a long bone, yet skull bone integrity is retained even in wasting conditions that deplete material in other parts of the skeleton. As all the bones in the skeleton experience the same circulating milieu of hormones and nutrients, it seems therefore that site-specific differences in regulation of bone formation and resorption must exist to account for regional differences. It might be considered that any differences between the skull and long bones are related to the embryological origins of the bone at the two sites, rather than their functional differences. This is unlikely because the clavicle is an example of a bone that is formed partly through endochondral ossification (like the long bones) and partly through intramembranous ossification (like the skull). The clavicles are bones which are known to adapt to function and lose bone mass through disuse, like long bones and unlike the skull (8), suggesting that intramembranous ossification is not invariably associated with a lack of sensitivity to disuse. Furthermore, the lateral end of the clavicle, which is the part formed by intramembranous ossification is affected by a bone wasting condition that has no effect on bone mass in the skull (9).
In order to identify the biological basis for site specific regulation of bone mass and structure we have used RNA sequencing to compare the transcriptomes of samples of skull and tibial bones from three species: mouse, rat, and rhesus macaque. Others have tried to address this question in the past, using bone samples (that included bone marrow) from young rats (10,11) and mice (12), identifying several 100 genes that appeared regulated differently between sites. The cross-species design allowed us to focus our results on what we believe are broad biological mechanisms, and resulted in only a relatively small number of differences between sites that were shared across species and are therefore likely to reflect mechanisms in humans. The method we designed was to focus attention on enriched osteocytes populations, because osteocytes are widely accepted to be the mechanosensors in bone (13), responding rapidly to mechanical loading (14), and so influencing bone formation and resorption. More importantly, evidence has suggested that osteocytes, constituting more than 90% of bone cells, use the Wnt signaling pathway involving sclerostin and Dkk1, and crosstalk with the prostaglandin pathway in response to mechanical loading (15)(16)(17). Our current study has provided further evidence to support this notion but from a site-specific differences perspective.

Experimental Animals
Four-month old female C57BL/6 mice (Charles River, Harlow, UK), 6-month old female Wistar rats (Charles River, Harlow, UK), and 4-6 year old female rhesus macaques (Macaca mulatta) were used in this study. These ages represent bones which have undergone their periods of maximum growth rate for each species, and so transcriptome analysis should not be confounded by rapid growth of the skeleton (18). Long bones (tibias) and calvariae were dissected free of soft tissue after animals were euthanased. All procedures complied with the United Kingdom Animals (Scientific Procedures) Act 1986 (PPL 40/3499).

Bone Tissue Preparation
To obtain highly enriched populations of osteocytes, tibias of mouse and rat were cut to expose the metaphysis and centrifuged briefly to remove bone marrow (1,500 g, 30 s). Calvarial samples were obtained from the rodents with scissors, removing the sutures and avoiding any marrow spaces (diploe). Samples were obtained from the Macaque bones with a hacksaw irrigated with normal saline. All samples were dissected free of periosteum and accessible surfaces were scraped with a scalpel. They were then immersed briefly in a 1 mg/ml solution of collagenase (Sigma Aldrich) for 3-5 min at 37 • C to remove any adherent surface cells. Samples were then washed in saline before being snap frozen by complete submersion in liquid nitrogen and then stored at −80 • C. Bone samples were pulverized using a Mikro-Dismembrator-S (Braun Biotech International GmbH Melsungen, Germany), in which the bone is shaken in a robust PTFE mill chamber with an 8 mm tungsten carbide ball (both cooled in liquid nitrogen before use, and again after adding the bone pieces, before agitation). The mill with tissue sample was placed within the Mikro-Dismembrator-S and set to shake at 2,500 rpm for 45 s. The weight of the fine bone powder was recorded and it was stored for RNA extraction.

RNA Extraction From Bone
One milliliter of Trizol Reagent (Ambion) was added per 125 mg of pulverized tissue (typical tissue amounts 350-500 mg); samples were incubated in Trizol reagent for 10 min at room temperature. Samples were centrifuged at 500 g for 5 min at 4 • C, the supernatant was removed carefully to ensure no debris was transferred. 0.3 ml of chloroform was added to each 1 ml of Trizol reagent, the sample was thoroughly mixed and allowed to incubate at room temperature for 5 min. The samples were centrifuged at 12,000 g for 20 min at 4 • C. The colorless upper phase was collected and transferred to a separate tube, to which an equal volume of 70% ethanol was added and incubated at room temperature for 10 min.
The samples were then applied to spin cartridges from TRIzol Plus RNA purification kit (Ambion) following the manufactures instructions using the optional On-Column PureLink DNase (Invitrogen) treatment step. Samples were quantified using spectrophotometry (NanoDrop Thermo) and quality measured using a Bioanalyzer (Agilent). Only RNA with a RNA integrity number (RIN) above 8 was stored in −80 • C and used for RNA-Seq. Samples used for analysis were one long bone and skull pair pooled from two rats, two long bone, and skull pairs each pooled from 3 mice, and 3 long bone, and skull pairs each from individual macaques.

Library Preparation and Sequencing
cDNA library preparation and sequencing were performed by Eurofins Genomic (Ebersberg, Germany). From the total RNA sample, poly(A)+ RNA was enriched and randomized primer was used for first strand cDNA synthesis. Sequencing was performed in the 1 × 125 bp run mode, 6 samples/channel on Illumina HiSeq 2500 and generated 30 million reads/sample on average. The %Q30 of each samples were all above 86% and detailed quality control metrics were shown in Supplementary

RNA-Seq Data Processing and Functional Gene Annotation
RNA-seq data analysis pre-processing and functional gene annotation was performed by Bioinformatics Consultants (Stockholm, Sweden). Reference genome sequences and gene annotations for Rattus norvegicus (rn6) and Mus musculus (mm10) were downloaded from UCSC. For Macaca mulatta we used the MacaM genome assembly (https://www.unmc. edu/rhesusgenechip/index.htm) and its corresponding gene annotations (19). Exons of genes with multiple isoforms were merged according to their genomic coordinates using a custom Perl script. HISAT2 (v. 2.0.5) with default settings was used for mapping the sequencing reads to each respective genome (20). Sequencing reads were then counted on gene models with the HTSeq-count program (the -s flag set to "no") (21). Differential gene expression was analyzed with the R package edgeR v. 3.16.5 for each species/individual separately (22). Any gene with zero read counts in at least one sample was removed, in order to lower the chance of quantifying expression from genes with incomplete annotation in one or more genomes. The trimmed mean of M-values (TMM) normalization was applied to raw read counts prior to testing for differential expression (23). Finally, differential expression was defined as genes with a false discovery rate < 20% and an absolute fold change higher than 1.1 (24,25). In the heat-map plots, raw read counts were TMM normalized, and a variance stabilizing transformation was applied (26). Heat-maps were rendered with the heatmap.2 function in the gplots R package. Three-way ortholog pairs across the three genomes were determined through their gene symbols with the same direction of gene expression. For functional gene annotation analysis, gene Ontology enrichment analysis was conducted on all identified three-way ortholog pairs with GOrilla (Gene Ontology enRIchment anaLysis and visuaLizAtion tool) (27), PANTHER (protein annotation through evolutionary relationship) classification system (28), and DAVID (Database for Annotation, Visualization and Integrated Discovery) (29). Gene Set Enrichment Analysis (GSEA) (30) was also conducted with GSEA v.2.2.3, ran with MSigDB v. 5.2 (gene set definitions) (31).

Pipeline Design
Orthologs across the three genomes were initially determined using OrthoDB (32). A principal component analysis (PCA) was conducted on the 12 samples using normalized expression of 1:1:1 orthologs. The result of this analysis showed that samples generally clustered according to species and not by sample site (Figure 1A), which indicates that the phylogenetic divergence is stronger than site differences. Similar results can also be seen through pairwise comparisons in a heatmap where samples from the same species but different sites have more similar expression profile than samples of the same sites from a different species ( Figure 1B). This could be a consequence of analysis based on the limited 14,308 threeway orthologs that can be mapped 1:1:1 between Rat-Mouse-Macaque, due to incomplete genome annotations at the time OrthoDB was compiled. Therefore, to achieve enough power to detect transcriptome differences between different sites of skeleton across species, the differential expression (DE) analysis was carried out for each species/individual separately and orthologs were then determined using a simpler approach where the ortholog relationship is found through their gene symbols and same direction of gene expression. To this end, we propose a pipeline to determine transcriptome differences across species, which comprises of sequencing reads mapped using HISAT2, DE analysis with the R package edgeR v. 3.16.5, and gene symbols based three-way ortholog pairs determination, followed by Gene Ontology (GO) enrichment analysis conducted with GOrilla, Panther classification system, and DAVID, and Gene Set Enrichment Analysis with GSEA v.2.2.3 ( Figure 1C).

Differential Expression (DE) Analysis
DE expression was analyzed with the R package edgeR v. 3.16.5 for each species separately and differentially expressed genes were defined as genes with a false discovery rate <20% and an absolute fold change higher than 1.1 in mouse and macaque. This is to privilege sensitivity over specificity and highlight broad trends firstly (24). Therefore, 1,187 and 302 differentially expressed genes between tibias and calvariae were found in macaque and mouse, respectively (Figures 2A,B). By comparing gene symbols, there were 64 genes differentially expressed in both macaque and mouse with the same direction of change ( Figure 2C). A conventional DE analysis with edgeR was not applied to the pair of rat samples as count dispersion could not be estimated. Instead, log2([cpm calvaria]/[cpm tibia]) >2 was selected as a threshold, which gave 946 DE genes in rat. By comparing gene symbols, there were 32 genes across all three species sharing same direction of gene expression ( Figure 2D), with specifically interested genes annotated in Figures 2A,B. The list of genes with all quantitative data were ranked in a descending order in Table 1, based on fold changes of gene expression between calvariae and tibias.

Gene Ontology (GO) Enrichment Analysis
To identify enriched GO terms in this list of shared DE genes, GO enrichment analysis using the web-based GOrilla application was performed first based on the 64 DE genes shared between macaque and mouse. DE genes in rat were then taken into account for quantitative analysis. Results suggested that a group of genes (HOXA7, HOXA11, HOXC4, HOXC8, HOXC9, HOXC10, ZIC1) are enriched in pattern specification process at the level of biological process (Figures 3A,B). These enriched genes are mainly binding sequence-specific DNA as FIGURE 1 | Pipeline design to discover cross species transcriptome signature of osteocytes in tibia and calvaria using RNA-Seq. (A) A principal component analysis (PCA) suggested that samples cluster according to species but not to tissue sites (data are based on ortholog triplets determined with OrthoDB). (B) Pairwise comparisons in a heat-map suggested samples from the same species but different sites have much more similar expression profile than samples of the same sites from a different species. (C) The pipeline to determine cross-species transcriptome signature: Raw RNA-seq data was aligned to relevant genomes using HISAT2. Differential expression (DE) was analyzed with the R package edgeR v. 3.16.5. Three-way ortholog pairs were determined based on gene symbols and the same direction of expression. Gene Ontology (GO) enrichment analysis was conducted with GOrilla, Panther classification system, and DAVID, in addition to Gene Set Enrichment Analysis with GSEA. After raw RNA-Seq data was mapped to relevant genomes, DE expression was performed for each species separately. their molecular function ( Figure 3C) and located in nuclear part of the cellular component ( Figure 3D).
Gene function analysis with the PANTHER classification system suggested these DE genes mainly function in the biological processes of organismal development and morphogenesis ( Table 2), not surprisingly in skeletal system development and morphogenesis (Figures 4A,B). This result is also consistent with the GSEA analysis which suggests in the tibia that there are multiple gene sets involved in skeletal development and morphogenesis. PANTHER protein class analysis confirms the results by GOrilla analysis that many of these genes belong to the family of homeobox transcription factor and are DNA binding proteins.
Using the DAVID bioinformatics resources, functional gene annotations suggest that many of these DE genes are disease implicated genes. Bone mineral density ranks the top of the disease list and includes 10 shared DE genes (FOSB, BMP7, DKK1, FGF1, FRZB, HOXA11, HOXA7, MEPE, SOST, and TNFSF10) (Figure 4C), many of which associate with the Wnt signaling pathway. Among these DE genes, all but FOSB were up-regulated in tibias in all three species (Figure 4D).

DISCUSSION
The results of our experiments show a clearly defined set of genes that are regulated differently in the calvaria and tibia across mouse, rat, and macaque. Given the different physiological functions and responses to stimuli of bone cells at these two sites, there is a compelling case that these 32 candidates account for the bulk of this site specificity, although the mechanisms by which they do this are not completely clear. The results of the comparisons of the macaque paired samples are likely to be closer to human physiology than rodents because both are primates and therefore more related. However, the analysis of differences across species allows us to identify candidates that have a high likelihood of being broadly based regulators of mammalian site-specific differences rather than those only in any one of the separate species. Specifically, our analysis shows that samples cluster according to species more than to bone sample site ( Figure 1A). We can interpret this to mean that the impact of phylogenetic divergence on these samples is stronger than any site-specific differences. This difference could be due to fundamental regulatory differences in the skeleton of different species, e.g., different functional units within compact FIGURE 2 | Differential expression (DE) and orthologs analysis. After raw RNA-Seq data was mapped to relevant genomes, DE expression was performed for each species separately. For macaque and mouse, genes with a false discovery rate <20% and an absolute fold change higher than 1.1 between samples from tibias and calvariae were defined as having significantly differential expression. bone. However, because of species differences, we believe the genes we have identified are representative of site specific rather than other differences. Using a single species for analysis could therefore affect the power to detect important general sitespecific differences in expression, as any calvaria-tibia difference would have to be larger than the gene expression divergence between species. That inference is perhaps unsurprising, but it has important implications on likely relevance of our data when compared with studies performed in only one species (10,11).
The bioinformatic analysis of the data provided robust results within and across species. Even where a conventional differential expression (DE) analysis could not be applied to the pair of rat samples (because count dispersion could not be estimated), we were able to use a different method to assess significant DE, identifying 946 DE rat genes. The segregation of our data  into patterning genes (HOXA7, HOXA11, HOXC4, HOXC8, HOXC9, HOXC10, ZIC1) and 8 others known to be related to the regulation of bone mass and architecture (FOSB, BMP7, DKK1, FGF1, FRZB, MEPE, SOST, and TNFSF10). We show that some of these genes (i.e., HOXC8, HOXC9, HOXC10) are almost exclusively expressed in tibias, at levels >64-fold higher than in calvariae. This is consistent with other studies that suggested HOX genes are pivotal in patterning the axial and appendicular skeleton, particularly for embryonic long bone development (33)(34)(35)(36)(37)(38)(39). This is consistent with their participation in cartilage differentiation in the process of endochondral ossification which is the way that long bones grow, but is not involved in the intramembranous ossification that leads to the formation of the flat bones of the skull (39)(40)(41)(42)(43)(44)(45). The involvement of a large group of HOX transcription factors indicates that the site-specific difference in regulating response to mechanical loading by osteocytes is partly a genetically determined baseline structure since embryonic stage.
Using the DAVID bioinformatics resources to identify DE genes involved in diseases, we found bone mineral density diseases ranking the top of the list and include implicated genes: BMP7, DKK1, FGF1, FRZB, MEPE, SOST, FOSB, and TNFSF10 (or TRAIL). Most of these genes code for secreted proteins and are up-regulated in osteocytes from tibias except FOSB. This list includes not only anabolic factors that would be expected to increase bone density (BMP7, FGF1, FRZB, and FOSB) but also catabolic factors (DKK1, SOST, MEPE, TNFSF10) with roles in bone remodeling, skeletogenesis and patterning (46)(47)(48)(49)(50)(51)(52). More interestingly, many of these genes, including FRZB (53,54), BMP (55), FGF (56), DKK1 (57,58), SOST (59), and MEPE (60-62), have long been shown to regulate skeletal development in association or crosstalk with the canonical Wnt signaling (55,63). The identification of these bone mineral density and Wnt signaling-related, soluble proteins suggests osteocytes at different sites response to mechanical loading differently via potential paracrine and/or endocrine agents. Upon response to strain, osteocytes regulate bone remodeling with precisely tuned balance between anabolic and catabolic effects, targeting both osteoblasts and osteoclasts. Intriguingly, Sost −/− mice were shown to have resistance to mechanical unloading-induced bone loss and exhibited high bone mass (64), which underscores our results that Sost is less expressed in the skull (with its insensitivity to disuse) but is more expressed in the tibia which requires significant mechanical loading to maintain mass. Moreover, this is also reflected by pathological conditions caused by disrupting transcription of the SOST gene and subsequently Wnt signaling, such as sclerosteosis and the van Buchem syndrome (VBCH). Both these two closely related diseases feature generalized sclerosis, including the enlargement of the skull which endures lower physical mechanical strains (65)(66)(67). All these indicate that the canonical Wnt signaling pathway dictates skeletal site differences through regulating bone's adaptive response to mechanical loading (16,68). This could be achieved by osteocytes through a Sost-dependent feedback mechanism of maintaining quiescent bone-lining cells which are the main source of osteoblasts in adulthood (69,70). However, other possible pathways could not be overlooked, for example, zinc finger protein of cerebellum (Zic) family member   HOXA7, STC2, MAPT, TBX15, CNR1, STC1, FRZB, HOXC9, NRN1, HOXC10, TBX18, SOST, FGF1, MEIS2, DKK1, HOXB7, CLU, BMP7,  CRABP1, MECOM, TNFSF10, SEMA3E, DOK5, VLDLR, HOXB5, GAL, AK4, CECR2, ZIC1, HOXC4, BNIP3, HOXC5, MEPE, MAB21L2,  MATN3, FOSB, HAPLN1, HOXC8, HOXA11, PITX1, MYL4, ZIC2   Anatomical structure  morphogenesis   HOXA7, TBX15, STC1, FRZB, HOXC9, NRN1, HOXC10, FGF1, DKK1, HOXB7, CLU, BMP7, SEMA3E, DOK5, VLDLR, HOXB5, CECR2,  ZIC1, HOXC4, MAB21L2, HOXC8, HOXA11, PITX1   Skeletal system  development   HOXA7, TBX15, STC1, FRZB, HOXC9, HOXC10, HOXB7, BMP7, HOXB5, HOXC4, HOXC5, MEPE, MATN3, HAPLN1, Positive regulation of nucleic  acid-templated transcription   HOXA7, HOXC10, SOST, FGF1, MEIS2, HOXB7, BMP7, MECOM, HOXB5, GAL, ZIC1, EAF2, FOSB, HOXA11, PITX1, ZIC2   Positive regulation of RNA  biosynthetic process   HOXA7, HOXC10, SOST, FGF1, MEIS2, HOXB7, BMP7, MECOM, HOXB5, GAL, ZIC1, EAF2, FOSB, HOXA11, PITX1, ZIC2   Regulation of transcription  from RNA polymerase II  promoter   HOXA7, TBX15, HOXC10, TBX18, SOST, FGF1, MEIS2, DKK1, HOXB7, BMP7, VLDLR, HOXB5, GAL, ZIC1, HOXC5, EAF2,  transcription factor Zic1, one of only two down-regulated gene in tibias in our study. Although not annotated by DAVID as a bone mineral density disease implicated gene, ZIC1 has been suggested a site-specific expression pattern by osteocytes and to act as a link between mechanosensing and Wnt signaling (71). The lower expression of ZIC1 in tibias suggests a potential negative feedback pathway in osteocyte in response to mechanical stimuli. We recognize that the lack of additional technical validation and low number of biological replicates impacts on the strength of the conclusions that can be drawn from the study and the potential targets we have identified need to be confirmed and explored using in vitro models and transgenic mice in future. We can conclude that a relatively small number of genes are differentially expressed between two skeletal sites in multiple species: results that are consistent with some but not all differences found by others exploring site-specific differences in the skeleton and our data cast new light onto possible mechanisms for those differences as well as potential future clinical benefits. The approach we have used appears to have advantages over microarray studies in a single species, allowing greater refinement of data to identify key regulators of the sitespecific characteristics of the skeleton. This has been underscored by a very recent study using scRNA-Seq, suggesting that osteoblasts isolated from calvaria and cortical long bone in mice have similar transcriptomes (12), although the authors suggest that changes after isolation of the cells may have contributed to the lack of differences identified. To impact upon bone pathology, therapies to make long bones behave more like the flat bones of the skull in respect of their susceptibility to loss in response to aging, disuse, and hormonal or nutritional changes could provide powerful methods to maintain bone strength throughout life and provide a strong biological explanation for the promising development of sclerostin antibodies for the treatment of osteoporosis (72,73).

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The animal study was reviewed and approved by the local Research Ethics Committee of the University of Sheffield (Sheffield, UK). All the procedures were performed under a British Home Office project license (PPL 40/3499) and in compliance with the UK Animals (Scientific Procedures) Act 1986.

AUTHOR CONTRIBUTIONS
TS and GR: study design. NW and CN: study conduct and data collection. NW, NL, and TS: data analysis and interpretation. NW and TS: drafting manuscript. NW, CN, GR, and TS: approving final version of manuscript.