Impact Factor 6.684 | CiteScore 2.7
More on impact ›

METHODS article

Front. Cell Dev. Biol., 26 August 2021 |

A New Pipeline to Automatically Segment and Semi-Automatically Measure Bone Length on 3D Models Obtained by Computed Tomography

  • 1Australian Regenerative Medicine Institute, Monash University, Clayton, VIC, Australia
  • 2Department of Infectious Diseases and Public Health, City University of Hong Kong, Kowloon, Hong Kong, SAR China
  • 3Monash Biomedical Imaging, Monash University, Clayton, VIC, Australia
  • 4Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia

The characterization of developmental phenotypes often relies on the accurate linear measurement of structures that are small and require laborious preparation. This is tedious and prone to errors, especially when repeated for the multiple replicates that are required for statistical analysis, or when multiple distinct structures have to be analyzed. To address this issue, we have developed a pipeline for characterization of long-bone length using X-ray microtomography (XMT) scans. The pipeline involves semi-automated algorithms for automatic thresholding and fast interactive isolation and 3D-model generation of the main limb bones, using either the open-source ImageJ plugin BoneJ or the commercial Mimics Innovation Suite package. The tests showed the appropriate combination of scanning conditions and analysis parameters yields fast and comparable length results, highly correlated with the measurements obtained via ex vivo skeletal preparations. Moreover, since XMT is not destructive, the samples can be used afterward for histology or other applications. Our new pipelines will help developmental biologists and evolutionary researchers to achieve fast, reproducible and non-destructive length measurement of bone samples from multiple animal species.


Skeletal measurement is the pillar of many research applications, such as developmental studies on limb patterning (Summerbell, 1977; Galloway et al., 2009) and growth (Rosello-Diez et al., 2017; Marchini and Rolian, 2018), main axis segmentation (Casaca et al., 2014; Wong et al., 2015), evolutionary studies (Sears et al., 2006; Sheth et al., 2012; Kherdjemil et al., 2016), disease modeling (Chen et al., 1999; Li et al., 1999; Rowe et al., 2018), adult phenotyping of mutant models (Boskey et al., 2003), etc. Whereas clinical musculoskeletal research often uses non-destructive imaging as routine (Cheng and Wang, 2018), fundamental evolutionary and development (evo-devo) studies often rely on differential staining of bone and cartilage (the so-called ex vivo skeletal preparations) (Rigueur and Lyons, 2014; Mead, 2020) and subsequent two-dimensional (2D) imaging for quantitative comparisons of the models of interest. Despite being broadly used, the skeletal preparation technique is ridden by several disadvantages. First, it is a destructive technique in the sense that the samples cannot be used for further histological or molecular applications. Second, it involves lengthy staining and clearing of cadavers, followed by laborious and damage-prone dissection of the skeletal elements of interest, in their preparation for imaging. Third, accurate measurements depend heavily on the imaged sample being positioned as flat as possible; otherwise, the apparent length will be shorter than the real one due to parallax error. As a result, measurements are often prone to user error and require multiple measurements to calculate standard error. These limitations prompted us to seek alternative methods to measure bone length in a fast and reliable way, without destroying the sample.

X-ray microtomography (XMT) is a non-destructive imaging modality that uses radiographic projections taken at multiple angles to reconstruct two-dimensional tomograms (literally, slice images) whose pixel values represent the X-ray attenuation coefficient at each point in the imaged object (Elliott and Dover, 1982). It is common to arrange the tomograms into a stack to produce a three-dimensional (3D) image (Elliott and Dover, 1982; Christiansen, 2016; du Plessis et al., 2017). We reasoned that since XMT can be used to image undissected samples, it would allow us to scan multiple samples relatively fast, with the advantage of preserving their integrity in case they are needed for further processing. Moreover, computer-based image processing would in principle allow us to maximize the automation of the subsequent 3D reconstruction and measurements. Methods based on manual landmarking and measurement of the 3D models have already been developed, but we wanted to eliminate the human interaction component as much as possible. Our main goal, in summary, was to develop a pipeline to scan multiple whole-animal samples in a batch, and bulk-process the scans to extract linear measurements of the bones of interest. Minimal user intervention was the most important requirement, both to enable its use as a workhorse method in skeletal development labs, but also to eliminate any potential unconscious bias in the process. Within this general goal, we established three objectives: (1) to identify standard conditions (i.e., combination of scan resolution and analysis parameters) that yield low inter-batch variability; (2) to obtain a versatile pipeline that could be applied with minimal variation to a range of developmental stages; (3) to achieve enough precision to detect even small phenotypes, such as the 5–10% bone-length differences we have previously described with some of our models (Rosello-Diez et al., 2017, 2018).

In XMT, the ability to independently analyze distinct tissues relies on their accurate separation through so-called segmentation (Bouxsein et al., 2010; Weissheimer et al., 2012). Since bone has a high mean atomic number and linear X-ray attenuation coefficient compared to other body tissues, it generates strong contrast in X-ray imaging modalities such as XMT and can be readily segmented through threshold-based methods where grayscale values determine what is bone tissue and what is background (Campbell and Sophocleous, 2014). There are several modalities of segmentation. Manual segmentation involves the manual selection of the areas of interest section by section, and is therefore quite laborious and subjective, thus prone to user error (Rathnayaka et al., 2011). Semi-automated methods, on the other hand, use algorithms like edge detection (Rathnayaka et al., 2011) and/or local differences in gray values (Zhang et al., 2010) with some user input for initial parameters. Another common method is automated segmentation, whereby image-processing algorithms are used to segment elements of interest with minimal to no-user interaction (Šajn et al., 2007; Okada et al., 2008; Heidrich et al., 2013; Yiannakas et al., 2016). Algorithm-based automatic segmentation, however, requires the user to have programming knowledge and a thorough understanding of mathematical algorithms related to the image processing software being used (Rathnayaka et al., 2011). Deep learning-based segmentation methods may out-perform simple thresholding, but at a high cost for initial training (Galvez-Hernandez et al., 2021).

There are a wide range of software solutions that can analyze XMT data in the form of digital imaging communications in medicine (DICOM) files to segment a variety of high-contrast tissues like lungs (Weissheimer et al., 2012; Reynisson et al., 2015), liver (Okada et al., 2008; Huhdanpaa et al., 2011) and bone (Rios et al., 2014; Mehadji et al., 2019; Taghizadeh et al., 2019). After some pilot testing of both open-source and commercial solutions, we settled on the Mimics Innovation Suite (Materialize, Leuven, Belgium) as the one that most readily suited our needs. Mimics has been previously benchmarked against other programs like Syngo (An et al., 2017), OsiriX (Reynisson et al., 2015) and ITK-snap (Weissheimer et al., 2012), and some of its key features are its flexibility, ease of use, sensitive and controlled segmentations (Weissheimer et al., 2012; Reynisson et al., 2015) and the possibility to integrate Python scripting modules to further extend its automation capabilities.

Here we present a semi-automated analysis pipeline for the fast and robust characterization of long-bone length, using two solutions that can be adopted by non-experts: (1) Python scripting and segmentation tools of the commercially available software package Mimics; (2) a standardized pipeline in the ImageJ plugin BoneJ. We report the advantages and caveats of each method.

Materials and Methods

Animal Experiments

Mouse embryo and pup samples were retained as residual tissues from other experiments in the Rosello-Diez lab, approved by the Animal Ethics Committee at Monash University (protocol 17048). Wild-type E17.5 samples were obtained from Asmu:Swiss crosses. P7 samples consisted of tTA-negative littermates (phenotypically wild-type) obtained from crosses of females containing the left-lateral plate mesoderm specific Pitx2-Cre (Shiratori et al., 2006) and a cartilage-specific Col2a1-tTA (Rosello-Diez et al., 2018) with males bearing a TigreDragon–DTA allele (Ahmadzadeh et al., 2020). P3 samples consisted of pups obtained from crosses of females containing the left-lateral plate mesoderm specific Pitx2-Cre, a cartilage-specific Col2a1-rtTA (Posey et al., 2009) and 1 copy of an Egr1 null allele (JAX#012924) with males bearing a TigreDragon–DTA allele (Ahmadzadeh et al., 2020). Control and experimental animals were separated based on rtTA genotype, regardless of the presence of the Egr1 null allele. Doxycycline hyclate (Sigma, 0.5 mg/ml in the drinking water, with 0.5% sucrose to increase palatability) was given to the pregnant female from E12.3 to E13.8. Noon of the day the vaginal plug was detected was considered E0.5.

Samples Collection and Fixation

Mouse embryos/pups were decapitated, the skin, internal organs and adipose tissue were removed. The samples were then fixed in 4% paraformaldehyde overnight at 4°C and then stored in PBS until ready for XMT scans.

X-ray Microtomography Scans

A Siemens Inveon PET-SPECT-CT Small Animal scanner in CT modality was used for all experiments. Parameters: 20- and 40-μm resolution, 360 projections at 80 kV, 500 μA, 600 ms exposure with a 500 ms settling time between projections. Binning was applied to vary resolution with 2 × 2 for 20 μm and 4 × 4 for 40 μm scans and data was reconstructed using a Feldkamp algorithm. The samples (beheaded embryo and pup bodies) were placed in supine position over custom-fitted foam bedding, so that the limbs were not in contact with any hard surface. Scan time for each sample is roughly 10 min (plus 30 min of set up per imaging session). However, scan time can depend on the scan resolution, number of projections, exposure time and settling time between projections.

Data Processing

Data was reconstructed using a Feldkamp algorithm and further converted to DICOM files using Siemens software.

Note on File Size Limits

Current versions of Mimics (v24) have a file size limit of 256 TB (Materialize, email communication July 2021). In ImageJ the pixel limit for a 3D image stack is 262 (i.e., 231 in z and 231 in xy, about 4.6 EB in total.) because of how the image slices are arranged and the pixels indexed with signed 32-bit integers. These limits are still well above the typical datasets obtained from XMT instruments, which are in the order of 1–10 GB. As a reference, our typical 20-μm datasets are 5–7 GB.

BoneJ Software and Pipeline

FIJI was used to develop the analysis pipeline. See section “Results” for an overview; each step in the process is outlined in detail below.

Data Loading

The entire image set is loaded into memory, as many of the downstream actions requires to have the entire image set in memory. A virtual stack is possible as the initial action, if it is necessary to crop or reduce the size of the image, if the image is too large for the computer being used. When the crop/size reduction is performed, FIJI will load the entire image stack into main memory.


This method employs the segmentation method of maximum entropy. Segmentation divides the image into multiple parts (at least two, usually). This yields the objects of interest, and everything else. Some methods, such as Otsu’s three-class, can produce more than two layers. The choice of maximum entropy in this case is based on making the method as widely applicable as possible. Maximum entropy is implemented in ImageJ as a method to maximize the inter-class entropy (between the selected objects and everything else in the image). This involves the average gray values of the pixels present in each image (or in a reference image), the individual gray value of each pixel, and the gray values in the local neighborhood of each pixel. The initial threshold value the algorithm selects is based on the probability estimation from a histogram of all pixels in the image stack. Therefore, maximum entropy offers the best robustness, rejection of undesirable elements, acceptable performance in low-contrast images, and good noise tolerance. The method is described in Kapur et al. (1985). Threshold values of 280 to 500 HU were chosen, depending on the developmental stage, as a best compromise on the images included. This value may need to be adjusted depending on the CT scanner, reconstruction kernel, and the sample. The segmentation was further refined via the Erode and Watershed tools (2D versions).

Generation and Labeling of Aligned Boxes With Particle Analyser

BoneJ’s Particle Analyser code was altered to use the bone’s inertia tensor to define the three principal axes of each element, and then generate the minimum-size cuboidal box that fits the skeletal element and that is aligned with the principal axes (BoneJ styloid-r11). The coordinates and dimensions of these boxes are then appended to the Results window in Fiji. The Label Elements (3D) feature then allows the user to Ctrl-click the surfaces of interest and assign a label to them. These are added as a new column to the Results window. The macro below also includes the possibility of saving a screenshot of the 3D Viewer window with the desired orientation.

ImageJ Macro to Automate the Process

R-Script to Merge and Clean-Up Data Tables

Mimics Software and Pipeline

Mimics Research (v21.0; Materialize, Leuven, Belgium) equipped with the scripting module was used to develop the analysis pipeline and the Python script described here. See section “Results” for an overview and each step in the process is outlined in detail below.

Data Loading

DICOM files are imported into Mimics via the New Project wizard.


As soon as the DICOM data is uploaded into Mimics, the first step is to distinguish the bones from all the other tissue by defining a range of Hounsfield Units (HU) that corresponds to bone density. The first Python command in BASILISC creates a mask labeled “ALL,” which will segment all the skeletal elements present, creating a global threshold specific to this mask. Since the goal was to measure the developing mineralized part from end to end, this step had to detect immature trabecular bone at the ends of the growing elements. In our uncalibrated XMT scans, we realized that the custom minimum threshold for bone tissue defined by Mimics (226 HU) often over-represented the actual bone tissue in the scans as it selected a greater area of tissue. The optimal lower threshold for the developmental stages of interest had thus to be determined empirically. Although Mimics can take both gray scale values (GV) and HU units, the input in the script can only be into GV, and therefore the first step was to transform the data into GV to adequately segment all bones from the rest of the tissue. This is achieved through the “segment” attribute seen on the last line of code for this section. In principle, different optimal thresholds exist for different scanning conditions and certainly for different developmental stages, as the ratio between woven and lamellar bone decreases, and hence BASILISC was designed in such a way that the user can select among three pre-defined thresholds via a pop-up menu. This can be easily changed within the following section of the script (pre-defined values appear in orange font):


The purpose of this step is to segment and uniquely label all the bones of interest, using Mimics tools. This is achieved using a function that prompts the user (via a pop-up window) to select a landmark on the bone of interest. The first step in landmarking is to select the bones of interest to create a list of “landmarks.” This list contains the unique name of each selected bone and defines the order of segmentation during the process. The “indicate_landmark” function guides the user through each of the bones to be segmented by means of a dialog box, asking the user whether a given element is present in the scan or not and with two active buttons: “Select” and “Skip” (Figure 1C). The user has the option to skip an element if a given bone is not present in the scan, this would then be excluded from the analysis. If the “Select” button is activated, a second dialog box prompts the user to select a region (landmark) of the indicated element by simply clicking on it on one of the 2D views of the sample. BASILISC will automatically label and segment the selected element without further user interaction, through the Mimics “region_grow” function. A FOR loop has been included in the BASILISC script when executing the “indicate_landmark” function, so that the steps above are recursively followed for each of the bones of interest sequentially, using the name of each bone as an index within the FOR loop.


Figure 1. Bone Automated Segmentation and Interactive Length Interrogation of Standardized computerized scans (BASILISC). (A–E) Mimics procedure. (A) Diagram depicting the Mimics procedure followed by the script. (B–E) Representative screenshots of key steps in the process: threshold pre-selection and segmentation (B), element seeding (C), fit-to-surface line fitting (D), table export (E). (F–l) BoneJ procedure. (F) Diagram depicting the BoneJ procedure followed by the macro. (G) Typical output of the Particle Analyser module, as visualized with ImageJ’s 3D Viewer. Aligned boxes appear in yellow. (H,l) The BoneJ command Label Elements (3D) (in development, code at GitHub) allows the user to interact with the 3D viewer to select the bones of interest, select or write a label for them (H) and append that label.

Generation of 3D Models

Once all the elements of interest have been segmented and labeled accordingly, a function has been created in BASILISC that creates 3D models of each element, “create_3D.” A FOR loop in the script steps through each of the segmented bones and creates a 3D model of each at the highest possible resolution (Figure 1D). This provided the most accurate measurements possible and since a limited number of bones are analyzed, computing time to create each 3D model did not increase significantly.


Once BASILISC has automatically made 3D models, it will fit a center line to each bone within the “create_3D” function. This is achieved through the “analyze.create_line_fit_to_surface” attribute in Mimics. The script has been designed to then automatically obtain the length of the fitted line and save the measurement in a text file (Figure 1E). Since this step is included within the function described above, which includes a FOR loop, the line is fitted as each 3D element is made, and the measurement is recorded progressively. The text file created will have the name of the given part, e.g., RIGHT HUMERUS, followed by a comma and the corresponding length of the element. This step is done automatically without any user input required after the landmarking step has been finalized. As the file created is only labeled with the name of the developmental stage created, the user should change the name of the text file to be sample specific before analyzing the next sample.

Manual Corrections During Image Analysis

For E17.5 samples, the radius and ulna were segmented together at the thresholds we use, but they could be easily separated using the Split mask function of Mimics, as their interaction surface was quite reduced. This requires identification of specific areas where there was an overlap in pixels, manually selecting areas that corresponded to each bone in a 2D view in at least three areas across the length of the bone (proximal, distance and middle area), before extrapolating the selected regions to create two separate elements. This tool is not available in BoneJ.

Pipeline Benchmarking

For Figures 2, 3, each specimen was scanned in triplicate or quadruplicate (on three or four different days), at two resolutions each (20 and 40 μm), and each of the six scans was segmented at two different lower thresholds (in Mimics: 650 and 398 HU for P7, 398 and 226 HU for E17.5; in BoneJ: 500 and 385 HU for P7, 400 and 280 HU for E17.5) to perform length measurements. Humerus, radius, femur, tibia and clavicle (the latter only for E17.5) were analyzed for two (P7) or three (E17.5) different specimens.


Figure 2. Assessment of batch effect for multiple P7 bones across different scan and analysis conditions. (A–D) Measured length for the indicated bones of two P7 mouse pups, each scanned on three different days (triplicates joined by lines) at either 40 (A,B) or 20-μm resolution (C,D), and analyzed with either a high (A,C) or a low (B,D) minimum threshold. L, R: left, right. (E,G) Top: Heatmap for the Coefficient of Variability (CV, %) between the three batches of the indicated measurements, for the Mimics (E) and the BoneJ (G) pipelines. Bottom: 2-way ANOVA table showing the contribution and associated p-value of each source of variation of the experiment. (F,H) Representative examples of the generated 3D models (left and right from the same specimen) and their fitted lines/boxes, for tibia using Mimics (F) and for radius/ulna using BoneJ (H).


Figure 3. Assessment of batch effect for multiple E17.5 bones across different scan and analysis conditions. (A–D) Measured length for the indicated bones of two or three E17.5 mouse fetuses, each scanned in triplicate (or quadruplicate) at either 40 (A,B) or 20-um resolution (C,D), and analyzed with either a high (A,C) or a low threshold (B,D). (E,G) Top: Heatmap for the Coefficient of Variability (CV, %) between the batches of the indicated measurements for the Mimics (E) and BoneJ (G) pipelines. Bottom: mixed-effects model table showing the contribution and associated p-value of each source of variation of the experiment. (F,H) Representative example of the generated tibial 3D models and its fitted line (F) or aligned box (H).

Skeletal Preparations

After embryo collection, the skin, internal organs and adipose tissue were removed. The samples were then fixed in 95% EtOH overnight at room temperature. To remove excess fat, the samples were then incubated in acetone overnight at room temperature. To stain the cartilage, the samples were submerged in a glass scintillation vial containing Alcian blue solution (0.04% (w/v), 70% EtOH, 20% acetic acid) and incubated at least overnight at room temperature. The samples were de-stained by incubating them in 95% EtOH overnight, and then equilibrated in 70% EtOH, prior to being pre-cleared in 1% KOH solution for 1–10 h at room temperature (until blue skeletal elements were seen through). The KOH solution was replaced with Alizarin red solution (0.005% (w/v) in 1% KOH) for 3–4 h at room temperature. The Alizarin red solution was then replaced with 1–2% KOH until most soft tissues were cleared. For final clearing, the samples were progressively equilibrated through 20% glycerol:80% (1% KOH), then 50% glycerol:50% (1% KOH) and finally transferred to 100% glycerol for long-term storage.

Statistical Analysis

Experimental data are presented as the mean ± SD. P < 0.05 (two-way ANOVA) was considered statistically significant.


A Script for Bone-Length Measurement on XMT Scans With Minimal User Input Using Mimics

After testing multiple software solutions, we chose a commercial one (Materialize Mimics Research software) to develop the initial pipeline, based on the promising initial results and the possibility of automation via scripting. We thus developed a Python script that utilizes Mimics capabilities to segment and measure the mouse bones of interest (humerus, radius, ulna, tibia and sometimes clavicle) from CT scans. This script is called BASILISC (Bone Automated Segmentation and Interactive Length Interrogation on Standardized computerized scans). BASILISC is available in Github1, and designed to run in the Materialize Mimics Research software v.18 to 21, and hence there are attributes that are specific to this program. The script can be divided into four main sections: thresholding, landmarking, 3D modeling and measurement and export (Figure 1A). See Supplementary Video 1 for an overview of the whole procedure.

The first Python command in BASILISC segments all the skeletal elements, using a global threshold for bone tissue (Figure 1B). Since these were developmental samples, segmentation was applied to the mineralized region, not the cartilage poles. To increase its applicability to different developmental stages and scanning conditions, BASILISC was designed in such a way that the user can select among three pre-defined thresholds [Low, Medium, High, ranging from ∼200 to 500 Hounsfield Units (HU)] via a pop-up menu. These pre-defined values can be easily changed within the script (see section “Materials and Methods”). The next section of the script is landmarking, which uses Mimics tools to segment and uniquely label all bones of interest via a pop-up menu (Figure 1C). The user is prompted to select a region (landmark) of the indicated element by simply clicking on it on one of the 2D views of the sample. BASILISC will automatically label and segment the selected element without further user interaction. In the third section of the script, once all bones of interest have been segmented and labeled accordingly, a 3D model of each skeletal element is created (Figure 1D). Then BASILISC automatically fits a “line to surface” running from end to end along the center of each element. In the last section, the script automatically obtains the length of the fitted lines and saves the measurements to a comma-separated-values (csv) text file (Figure 1E).

A Pipeline for Streamlined Bone Length Measurement in the Open-Source ImageJ Plugin BoneJ

We also developed a similar pipeline to semi-automatically measure bone length using open-source software, as a broadly useful approach. We chose ImageJ2, a popular Java-based image analysis program developed at the United States. National Institutes of Health, as a platform for such a solution. Importantly, one of us (M.D.) previously developed an ImageJ plugin (BoneJ) with modules designed for analysis of bone geometry (Domander et al., 2021). ImageJ’s ability to load and segment XMT data was the starting point we required. We also developed new modules and new capabilities for existing ones to suit our purposes (see section “Materials and Methods”). These capabilities will soon be included as a BoneJ update and the code is currently available at The pipeline can be divided into loading, segmentation, fitting a minimal bounding box to each element and measurement of said box (Figure 1F). A brief overview follows.

We first loaded the whole data set (DICOM format) into FIJI [a package that includes ImageJ and multiple biology-oriented plugins for image analysis (Schindelin et al., 2012)]. We then proceeded with segmentation (i.e., the isolation of skeletal elements from everything else). As with Mimics, segmentation was applied to the mineralized region, not the cartilage poles. We chose to use a maximum entropy method because in our experience it offers the best robustness, acceptable performance with low-contrast images, and good noise tolerance (Kapur et al., 1985). The macro prompts the user to enter a value for the lower threshold, which needs to be adjusted depending on the stage being analyzed, and for our scans it ranged between 280 and 500 HU. We then used the Fill holes tool (2D version) to close solid structures such as the marrow, which facilitates downstream analysis. To clean up the data, we used the Remove outliers tool to remove small radio-dense objects that may appear in some images, followed by the Erode and the Watershed commands (2D versions), to facilitate the separation of structures that are frequently close to each other (e.g., radius and ulna). Finally, to obtain the length of the different bones without the time investment of selecting points manually on a 3D object, we first attempted to use the Particle Analyser tool of BoneJ (Doube, 2021) to calculate approximations of bone dimensions and plot them on 3D models of the segmented elements (Supplementary Figure 1A). Supplementary Figure 1B shows examples of the common measurements and fitted shapes that could be computed with BoneJ before this project was started, namely the maximum Feret’s diameter, a fitted ellipsoid and the three moments of inertia. The Feret’s diameter (also known as the maximum caliper diameter) is the longest distance between any two points on the particle’s surface. For rod-like structures with relatively flat ends, Feret’s diameter is a reasonable approximation of length. However, we noticed that in the case of real-life bones, the Feret’s diameter tends to find a diagonal that is not aligned with the main bone axis, leading to an overestimation of the length (Supplementary Figures 1B-B″, green arrowheads). Additionally, it is implemented as a brute-force method so it is computationally demanding. The fitted ellipsoid is the best-fit ellipsoid to the particle’s surface mesh. Since long bones are not ellipsoidal in shape, the ellipsoid fit is not precise and the fitted ellipsoid’s long axis tends to be longer than the bone’s (Supplementary Figure 1B, blue mesh). Finally, the moments of inertia are defined with respect to the three mean principal axes of the segmented element (longest, shortest and middle, Supplementary Figure 1B, red lines). Therefore, we reasoned that the intersections between the longest axis and the surface of the 3D model of the skeletal element could be used to define a straight line approximating bone length. However, we noticed that while the principal axis ran reasonably well along the center of bones such as the femur (e.g., Supplementary Figure 1B), this was not the case for other bones such as the humerus, where the identified principal axis formed a steep angle with the line that a human user would use to measure bone length (Supplementary Figure 1B″). In summary, BoneJ was able to automatically provide good approximations of bone length for only some bones, depending on their shape.

Given these limitations, we developed a new capability for BoneJ’s Particle Analyser, to generate and measure Aligned boxes fitting each element (Figure 1G). The new tool uses the bone’s inertia tensor to define the three principal axes of each element, and then generates the minimum-size cuboidal box that fits the skeletal element and that is aligned with the principal axes. The coordinates and dimensions of these boxes are then appended to the Results window in ImageJ. One challenge of this approach is that for whole-body scans, hundreds of elements are segmented and measured (Figure 1G), and the numeric IDs assigned to the different bones are not the same from sample to sample, precluding the streamlined export of only the measurements of interest. While filtering by volume in the Particle Analyzer tool allows reducing the number of elements to a certain extent, it is often impossible to reduce it only to the bones of interest. Therefore, we generated a new analysis module for BoneJ, Label Elements (3D), which allows the user to interact with the 3D Viewer in order to Control-click the elements of interest, subsequently selecting or writing the label for the clicked element in an iterative manner (Figure 1H). That label is appended to the Results window (Figure 1I), which is subsequently exported as a csv file. Lastly, we developed an R-script to clean up the data tables (i.e., to remove all the rows but the ones of interest), sort the remaining values in alphabetical order and finally to bind several data tables together, if required (see section “Materials and Methods”).

Standardized Conditions to Achieve Robustness to Batch Effect at Multiple Stages

A semi-automatic protocol to measure bone length would only be useful if it yielded consistent measurements for a given sample, scanned and analyzed repeatedly on different days. We thus explored different scan resolutions (20 and 40 μm) to analyze at least three technical replicates per resolution, obtained from two different postnatal day (P) seven mouse specimens (see section “Materials and Methods”), and assessed the reproducibility of the results (Figure 2). We report the Mimics and BoneJ results separately below.

In the Mimics pipeline, 40-μm scans showed relatively high inter-batch effect, especially for hindlimb bones, regardless of the threshold (Figures 2A,B, left), whereas 20-μm scans yielded more consistent measurements, including hindlimb bones, especially for the lower threshold (Figures 2C,D, left). To compare the batch effect more quantitatively, we then calculated the coefficient of variability (CV) for each bone’s measurements across the three batches, and compared the CV for the different conditions and bones. A 2-way analysis of variance (ANOVA) showed that there was a significant effect of the imaging and analysis conditions, although the extent of it was likely distinct for the different bones (Figure 2E). In summary, these results identified a 20-μm resolution and a 398-HU threshold as the optimal conditions to minimize inter-batch variability in this type of samples (i.e., P7 mouse long bones). Importantly, this protocol succeeded to separate radius from ulna in most cases, despite their proximity, with only a few scans requiring the manual use of Split Mask. However, it was not able to separate tibia from fibula (Figure 2F), potentially affecting tibial length measurement (see section “Discussion” below).

With BoneJ, the 20-μm resolution also led to somewhat less variable results, but in this case the differences between different resolution-threshold combinations were not statistically significant (Figure 2G). Moreover, 20-μm Low Threshold was the condition that yielded length values more similar to those obtained by Mimics (e.g., femur data in Figures 2C,D). As in the Mimics case, the femur and to some extent the humerus were the long bones most affected by the changes in resolution and threshold (Figure 2G). One obvious difference with the Mimics pipeline is that, in the BoneJ pipeline, radius and ulna were segmented together in most cases (Figure 2H). In the absence of a Split mask function that could be applied to separate those elements in a quick and efficient way, we concluded that radius measurements could not be included in the analysis.

In order to test the versatility of BASILISC across developmental stages, we performed a similar battery of scan and measurement analyses exploring different scan resolutions and threshold values at embryonic day (E) 17.5 (Figure 3). With Mimics, most of the conditions performed similarly in terms of reproducibility across batches, except for low resolution and threshold, for which some femora were not properly segmented and consequently their length was overestimated (Figures 3A–D). Although there was no overall difference in the CV across conditions (Figure 3E), the data trends suggested that a 398-HU threshold outperformed a 226-HU threshold in terms of inter-batch robustness. Moreover, length variation in individual bones due to differences in scan resolution were minimized with a 398-HU threshold (compare Figure 3A with Figure 3C and Figure 3B with Figure 3D). Importantly, tibia and fibula were again segmented together no matter the conditions of analysis (Figure 3F).

With the BoneJ pipeline, variability was somewhat higher, although again high resolution and low threshold tended to yield more robust results, and closer to those obtained by Mimics (Figures 3A–D,G). At this stage, radius and ulna were, as with Mimics, individually segmented in all cases. One important difference with the Mimics pipeline is that tibia and fibula were not always segmented together. Namely, low resolution and high threshold led to separated tibia and fibula in approximately half of the cases and therefore to high variability (Figures 3A,G); low resolution and low threshold led to fibula and tibia always segmented together, such that there was less variability in the measurements, although the measurement itself was somewhat overestimated (Figures 3B,G); high resolution and high threshold led to separated tibia and fibula in ∼88% of the cases but also to highly eroded skeletal elements (Figure 3H), which was associated with intermediate variability (Figures 3C,G); and high resolution and low threshold also led to separated tibia and fibula in ∼88% of the cases but with less eroded surfaces, which likely corresponds to more exact measurements (Figures 3D,G).

Internal Consistency Across Stages, Scan Resolutions and Segmentation Thresholds

Besides inter-batch variability, a good analysis pipeline should yield low intra-scan variability. In other words, the ratios between two specific bones should be very similar across different scans. One of the advantages of working with paired bones is the possibility of assessing internal consistency of the BASILISC method by measuring the left/right ratio for each bone and condition. We therefore calculated a left/right ratio for the P7 samples, including replicates, to determine how close the ratio was to the hypothetical value of 1 (i.e., equally long left and right paired bones) and how much variability there was between replicates. As shown in Figures 4A,B and Supplementary Figure 2, for P7 samples the variability was frequently quite low, with 20-μm resolution and low threshold again showing the lowest inter-batch variability and a L/R ratio remarkably close to 1. As parameters moved away from these optimal settings, there were several bones (femur for Mimics, femur and humerus for BoneJ) for which either the average value was not as close to 1 as for other bones, and/or the variability between batches was higher than 5% (Figures 4A,B and Supplementary Figure 2). Similarly, we calculated internal ratios for E17.5 bones to determine optimal scan and segmentation parameters. In this case we chose intra-limb ratios (humerus/radius and femur/tibia) as a normalization approach that could be achievable in the case that contralateral bones were not available (as it was our case for these scavenged samples). With this approach, the parameter of interest to estimate the precision of the approach was the variability of each measurement across replicates. High resolution and low threshold minimized variability, and in all conditions the Mimics pipeline was more accurate (Figures 4C,D and Supplementary Figure 2).


Figure 4. The comparison of intra-specimen ratios reveals the most reproducible conditions for scan and analysis. (A,B) Left/right ratio of bone length (mean ± SD) for the indicated bones and conditions at P7, using Mimics (A) or BoneJ (B). Hu/Ra/Fe/Ti, Humerus/Radius/Femur/Tibia. (C,D) Similar to panels (A,B), except that the ratios shown are Humerus/Radius (H/R) and Femur/Tibia (F/T). S1–S3, specimens 1 to 3.

In summary, these results suggest that, as expected, 20-μm resolution is preferred to 40-μm in order to achieve more robust measurements. In terms of threshold, a higher threshold is preferred to segment out bones that are close to each other, but if the separation is not always achieved this leads to higher variability of the measurement. We thus concluded that the exact threshold needs to be determined for each scanner and/or scanning condition. Of note, the threshold can be easily changed in the Mimics script and the ImageJ macro that we present here.

Correlation Bone Lengths Obtained via Mimics, BoneJ and Skeletal Preparations on the Same Samples

We next compared the bone lengths obtained by BASILISC (Mimics pipeline) with the lengths obtained from the same samples via skeletal preparations and digital measurement of photographed bones (Supplementary Figure 2D), a method frequently-used in developmental biology studies. We used eight long bones from two different specimens at P7. The linear relationship between both measurements was very good in all conditions (Figure 5A, p-value for Pearson correlation < 0.0001 in all cases), and the slopes were not significantly different (p = 0.2506), with an average common value of 0.9336. As expected, however, the BASILISC measurements that used lower thresholds tended to overestimate bone length (as the resulting 3D model includes less dense tissue), as indicated by the differences in the intercepts with the axes (Figure 5A). Overall, the conditions that yielded measurements with better correlation to the skeletal preparations were 20-μm resolution and low threshold. We thus restricted our next analysis to these conditions, comparing BoneJ and Mimics BASILISC with the skeletal preparations (Figure 5B). Both correlations showed very similar slopes and Y-intercepts with the P7 samples. Both BASILISC approaches led to greater measurements than the skeletal preparation, especially in the case of Mimics (Figure 5B).


Figure 5. Correlation between BASILISC methods and with measurements obtained via skeletal preparations. (A) Bone length measurements obtained via skeletal preparations (prep.) (X axis) and Mimics BASILISC (Y axis), for different combinations of imaging resolution and segmentation threshold (averages of three technical replicates are shown). Solid lines represent the regression line for each combination. The coefficient of determination (R2) is indicated. The table shows the slope and Y-intercept for each of the conditions used. (B) Bone length measurements obtained via skeletal preparations (prep.) (X axis) and via micro-CT with Mimics and BoneJ versions of BASILISC (Y axis), obtained at 20-μm resolution and low threshold. (C) Correlation between bone length measurements of 22 bones (10 from E17.5 embryos and 12 from P7 pups) obtained with our Mimics (X axis) and BoneJ (Y) pipelines. Each Mimics-BoneJ comparison was done on the same scan batch. In panels (A–C), the dashed black line represents a 1:1 correlation as a reference, and the dotted ones delimit the 95% confidence interval of the regression.

As the last test for our pipelines, we compared the measurements obtained by Mimics and BoneJ from the same samples and scan batches, covering a broader range of stages, i.e., from E17.5 to P7. We focused on the conditions that yield most robust results, namely 20-μm resolution and low threshold. As seen in Figure 5C, there is remarkable correlation between both methods (R2 = 0.9982), although BoneJ leads to smaller lengths than Mimics (which itself leads to greater lengths than the skeletal preps, Figures 5A,B).

We concluded that both pipelines were suitable for high-throughput analysis of bone length from whole-body scans, with minimal user intervention. Next, we tested these methods in proof-of-principle studies.

Application of BASILISC to the Analysis of Genetic Mouse Models With Altered Skeletogenesis

To test the utility of the BASILISC pipeline for the detection of skeletal phenotypes that affect bone length, we applied it to one of our models of limb asymmetry that we recently reported (Ahmadzadeh et al., 2020). In this model, diphtheria toxin expression (DTA) is activated in an inducible and reversible manner in the cartilage template that drives growth of the left limb bones, killing chondrocytes and mostly sparing the right limbs (Ahmadzadeh et al., 2020). While continuous expression of DTA from E12.5 generates extreme asymmetries by birth (not shown), transient activation from E12.3 to E13.8 (see section “Materials and Methods”) leads to a subtler asymmetry (Figure 6A), well suited to test the sensitivity of BASILISC. We scanned eight P3 mouse pups (four control and four experimental ones) and applied the Mimics version of BASILISC to measure left and right humerus, ulna, radius, femur and tibia. We then calculated the left/right ratio for each bone, and compared this parameter for all bones between experimental and control mice, via 2-way ANOVA (Figure 6B). This analysis revealed a significant effect of the Genotype on limb asymmetry, with asymmetries ranging from ∼5 to 20%, similar to our previous report (Ahmadzadeh et al., 2020). These results suggest that BASILISC can be used to quickly detect and characterize skeletal phenotypes affecting the long bones.


Figure 6. Application of BASILISC to a mouse model of limb asymmetry. (A) Mouse model of transient cell ablation in the left cartilage. L, R: Left, Right. (B) Left/right length ratios of the shown bones at P3. The table shows the results of repeated measures 2-way ANOVA (i.e., multiple bones analyzed per sample), and the numbers in the graph are p-values for Sidak’s multiple comparisons post hoc test.


Here we have presented a fast and easy method to determine calcified bone length from XMT scans of whole mouse samples, without the need for dissecting the limbs, skinning or eviscerating the bodies. We tested our algorithm on a range of developmental stages (E17.5 through P7) that covers 9 days of very fast growth (Sanger et al., 2011).

Advantages Over Classic Skeletal Preparations

As any developmental biologist working on limb patterning and/or growth has experienced, analyzing one litter’s worth of samples by the classic method of skeletal preparation, limb microdissection, photograph acquisition and length measurement on the 2D pictures takes at least ten days and close to 20 h of dedicated hands-on work (Rigueur and Lyons, 2014). With the BASILISC approach, decapitation and storage of the mouse bodies takes just a few minutes per litter; scan time is roughly 10 min per sample (plus 30 min of set up per imaging session); data loading and analysis takes ∼5 min per scan. On average, this amounts to 3–4 h of hands-on work per litter. Another advantage is that the measurement is three-dimensional, as opposed to two-dimensional, and therefore robust to orientation errors. Lastly, the scan is not destructive, and therefore the samples can be later on processed for histology or other procedures (Hopkins et al., 2015; Baier et al., 2019).

Comparison to Previous Automation Approaches

In principle, the ideal pipeline for the kind of analysis that we perform here would be a fully automated method that recognized each of the long bones from a full-body scan, measured their length and exported those measurements without user intervention. In fact, there have been very impressive attempts at achieving this goal, combining object-based image analysis (that utilizes shape and context-dependent information in addition to pixel intensity values) with machine learning. For example, Heidrich et al. (2013) used Cognition Network Technology to extract objects and their properties from XMT data of chicken embryos at multiple stages, and then used these data to train a machine learning tool for automatic long bone classification. BASILISC is obviously far from achieving that level of automation, and while ImageJ is progressively including more machine learning modules, Mimics does not currently allow it. However, one of the strengths of BASILISC stems from its simplicity, as it can be used directly on any set of data, with minimal modification of the script. Contrary to this simplicity, the pipeline described in Heidrich et al. (2013) required a large training set of close to 3,000 instances, and also complex iterative thresholding methods. Moreover, although the classification achieved via this complex process was remarkably accurate, it still required supervision and was only applied to a reduced developmental window.

In contrast to other automation procedures where edge detection has been used to determine optimal thresholds (Zhang et al., 2010; Rathnayaka et al., 2011), here we rely on a global threshold optimized by trial and error to find an optimal range of gray values. BASILISC could be further refined by implementation of widely used edge detection algorithms to further improve the segmentation process and potentially increase the accuracy of the measurements obtained. However, since intensity can vary across the length of long bones (Rathnayaka et al., 2011), edge detection would require the use of multiple thresholds to reduce the degree of error in segmentation. Thus, here we opt for a single global threshold to extend the capabilities of the algorithm for a range of developmental stages. With this approach the earliest stage at which we were able to detect the mineralized portion of the bone was E15.5 (not shown).

To our knowledge, our Mimics script is the first algorithm that makes use of the Python library within the Mimics software to automate the segmentation, 3D modeling and analysis of length of skeletal elements. Previously, Mimics has been complemented with other scripting languages like MATLAB (Huhdanpaa et al., 2011) for image processing before segmenting the data, or software like Creo elements (Rios et al., 2014) to analyze scans after they have been segmented. In the latter case, though, the reference points for length measurement had to be manually selected, which is a time-consuming step to do in 3D. Through BASILISC, segmentation and length measurements can all be obtained within the one program (be it Mimics or ImageJ) and extensive programming knowledge is not required. Furthermore, we provide a processing pipeline that goes from optimized scanning conditions of mouse samples across a range of developmental stages, to streamlined image processing and data analysis, making BASILISC a readily available tool for the research community.

It should be noted, however, that the BoneJ version of BASILISC is less accurate than the Mimics one, especially at young stages (Figures 2, 3), and this could lead to some limitations in its applications. We hypothesize that BoneJ is more sensitive to discretization and thresholding artifacts because it uses binary pixel values for input, while Mimics makes a mesh over grayscale data, smoothing out noisy pixels at the crucial areas on the bones’ extremities.

Comparison With “Real” Length Measurements

Strictly speaking, the “true value” of bone length cannot be obtained with absolute certainty by any method, as no measurement is devoid of error. However, given the widespread use of skeletal preparation, flat mounting and imaging to obtain 2D length estimations, we compared the measurements obtained by the BASILISC approach (Mimics and BoneJ) with the length obtained by skeletal preparations (Rigueur and Lyons, 2014). Of note, all conditions showed remarkable correlation between both methods, with 20-μm resolution and 398-HU threshold yielding measurements very well correlated to those obtained via skeletal preparations across the whole range of lengths analyzed. In general, the lengths obtained with Mimics and BoneJ were bigger than those obtained via skeletal preparations. While the threshold choice could have some effect on this comparison, the most likely explanation is that the classic skeletal preparation method involves quite a harsh procedure, including increasing gradients of glycerol that can shrink the sample up to 3–6% (Mabee et al., 1998). Another important consideration of the Mimics version is that the “line to surface” fitting method generates the longest possible distance, which in some cases is not strictly running parallel to the element’s main axis (e.g., Figure 2F). This obviously generates a small bias in the measurement, but as long as the same method is used to compare different experimental conditions, this bias will be consistent and is not expected to contribute to the observed biological effect. Similarly, the aligned box approximation used in our BoneJ version of BASILISC can also potentially lead to slight length overestimation, depending on the shape of the bone. Importantly, we also showed that the optimal Mimics and BoneJ versions of BASILISC yield highly correlated measurements when applied to the same scans.

Limitations and Future Improvements

One potential limitation of fitting lines or boxes to the skeletal elements is that the measurement is sensitive to shape distortion due to mutations. If the shape distortion is not consistent across different specimens, then the length comparison will be less reliable. In this case, accurate measurement would require manual identification of multiple points along the length of the element, as it would be done with a 2D picture of a skeletal preparation. A potential solution for future versions of the algorithm could be to identify bending points along the length of the element and then measure the line defined by the points.

Another limitation is that some long bones are often segmented together in our pipeline, most often tibia and fibula (Figures 2F, 3F), and sometimes radius and ulna. This is because their automatic separation would require too high a threshold. While radius and ulna can often be quickly separated manually using the Split_mask function in Mimics (see section “Materials and Methods”), this is not feasible for the tibia and fibula, because their interaction surface is too large. This issue has some impact on the tibial measurements, because the fibula protrudes a bit farther than the tibia on the distal end (Figures 2F, 3F). However, the effect is quite minor and we showed that under the right conditions the error is very consistent, as the left/right ratio for the tibia is quite tightly centered on 1 (Figure 4). Therefore, the slightly overestimated tibial lengths can still be used for comparison purposes between different genotypes and/or treatments. The decision to invest more time in splitting them as opposed to accepting the error is up to the user and depends on two main aspects: the degree of accuracy desired and the time investment required to correct the error in all samples. In our case, we opted not to correct this segmentation error, as the minor gain in accuracy would be outweighed by the extra time investment.

The aforementioned limitation would be corrected with an automatic classification system based on machine learning (Heidrich et al., 2013), but the implementation of these methods is not straightforward. If this capability is implemented in the future, it could speed up image processing even further, as in theory no user intervention would be required to seed landmarks and/or label the bones of interest.

Other Applications

The current BASILISC pipeline in Mimics only measures length of the elements, because it applies the “fit line to surface” tool to the 3D models of the bones. However, it could in principle be adapted to measure width, by fitting a cylinder to the model and interrogating the width of the cylinder. This approach would require careful selection of the fitting parameters, so that the surface of the cylinder coincides with the surface of the 3D model. The BoneJ version, in turn, can directly be used to measure bone width, just by using the other dimensions of the aligned box.

Author’s Note

Beltran Diaz et al. present a semi-automated pipeline for fast and versatile characterization of bone length from micro-CT images of mouse developmental samples.

Data Availability Statement

All the measurements presented in the study are included in the article/Supplementary Material/Table 1, further inquiries can be directed to the corresponding author/s.

Ethics Statement

The animal study was reviewed and approved by Monash Animal Ethics Committee.

Author Contributions

SBD, CH, and XQ acquired the raw data. SBD and MD developed the essential tools. SBD, CH, JN, and AR-D performed the measurements. AR-D conceived and supervised the study, designed experiments, analyzed and interpreted the data, and drafted the manuscript, helped by CH. MD and OP co-supervised aspects of the study. AR-D and SBD prepared the figures. MD, MV, and OP contributed to the scientific discussion and revised the manuscript. All authors critically reviewed the manuscript and approved the submitted version.


This work was supported by grants from the Human Frontiers Science Program (CDA00021/2019) to AR-D and the National Health and Medical Research Council (APP2002084) to AR-D and CH; and a NIF Facility Fellowship to MV. The Australian Regenerative Medicine Institute is supported by grants from the State Government of Victoria and the Australian Government. BoneJ development was supported by the Wellcome Trust (108442/Z/15/Z).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


We would like to thank Jonathan Bensley for initial efforts with the initial BoneJ protocol, Adrian Salavaty for help with the generation of the R script, Hyab Mehari Abraha (Panagiotopoulou laboratory) for Mimics training and Noramira Azlan (Materialise) for providing a Mimics trial license for the initial steps of the project. We would also like to thank the facilities and scientific and technical assistance of the National Imaging Facility (NIF), a National Collaborative Research Infrastructure Strategy (NCRIS) capability at Monash Biomedical Imaging (MBI), a Technology Research Platform at Monash University. We would also like to thank acknowledge the technical assistance of Tara Sepehrizadeh who was a NIF Facility Fellow.

Supplementary Material

The Supplementary Material for this article can be found online at:

Supplementary Figure 1 | (A) Representative 3D viewer result window after applying the Particle Analyser tool to a segmented and cleaned up whole-body scan (P7 mouse). (B–B″) Close-ups of femur (B), tibia [(B′), right] and humerus [(B′), left and (B″)] showing the features that can be obtained from the Particle Analyser tool, as indicated. Green arrowheads point to the ends of the maximum Feret’s diameter. The red arrowhead points to a region where the longest principal axis does not align with the skeletal element. (C) Two different samples were scanned four times at 40-μm resolution, and each of those scans were analyzed twice, with identical or nearly identical parameters. (D) Two different samples were scanned four times at 20-μm resolution, and most of those scans were analyzed once. (E) Heatmap with the average coefficient of variability for each bone and resolution.

Supplementary Figure 2 | Coefficients of variability for P7 and E17.5 analysis using Mimics and BoneJ. (A,B) Heatmaps for the CVs of the Left/Right length ratios obtained after analysis of P7 bones with Mimics (A) and BoneJ (B) pipelines. (C,D) Heatmaps for the CVs of the indicated ratios obtained after analysis of E17.5 bones with Mimics (C) and BoneJ (D) pipelines. (E) Left/Right ratios for the P7 bones used to benchmark BASILISC, measured after classic skeletal preparations.

Supplementary Table 1 | All the measurements obtained and analyzed in the manuscript.

Supplementary Video 1 | Overview of the BASILISC process in Mimics, performed on one of the scans used for this study. The messages prompted by the script were taken as screenshots and added to the video clip. Please note that an artifact of the video capture causes the mouse cursor to be shown slightly displaced from its real position.


  1. ^
  2. ^


Ahmadzadeh, E., Bayin, N. S., Qu, X., Singh, A., Madisen, L., Stephen, D., et al. (2020). A collection of genetic mouse lines and related tools for inducible and reversible intersectional misexpression. Development 147:dev186650.

Google Scholar

An, G., Hong, L., Zhou, X.-B., Yang, Q., Li, M.-Q., and Tang, X.-Y. (2017). Accuracy and efficiency of computer-aided anatomical analysis using 3D visualization software based on semi-automated and automated segmentations. Ann. Anat. Anatom. Anzeiger 210, 76–83. doi: 10.1016/j.aanat.2016.11.009

PubMed Abstract | CrossRef Full Text | Google Scholar

Baier, W., Mangham, C., Warnett, J. M., Payne, M., Painter, M., and Williams, M. A. (2019). Using histology to evaluate micro-CT findings of trauma in three post-mortem samples – First steps towards method validation. Forensic. Sci. Int. 297, 27–34. doi: 10.1016/j.forsciint.2019.01.027

PubMed Abstract | CrossRef Full Text | Google Scholar

Boskey, A. L., van der Meulen, M. C., and Wright, T. M. (2003). Guidelines for describing mouse skeletal phenotype. J. Orthop. Res. 21, 1–5. doi: 10.1016/s0736-0266(02)00102-x

CrossRef Full Text | Google Scholar

Bouxsein, M. L., Boyd, S. K., Christiansen, B. A., Guldberg, R. E., Jepsen, K. J., and Müller, R. (2010). Guidelines for assessment of bone microstructure in rodents using micro–computed tomography. J. Bone Min. Res. 25, 1468–1486. doi: 10.1002/jbmr.141

PubMed Abstract | CrossRef Full Text | Google Scholar

Campbell, G. M., and Sophocleous, A. (2014). Quantitative analysis of bone and soft tissue by micro-computed tomography: applications to ex vivo and in vivo studies. Bonekey Rep. 3, 564–564.

Google Scholar

Casaca, A., Santos, A. C., and Mallo, M. (2014). Controlling Hox gene expression and activity to build the vertebrate axial skeleton. Dev. Dyn. 243, 24–36. doi: 10.1002/dvdy.24007

PubMed Abstract | CrossRef Full Text | Google Scholar

Chen, L., Adar, R., Yang, X., Monsonego, E. O., Li, C., Hauschka, P. V., et al. (1999). Gly369Cys mutation in mouse FGFR3 causes achondroplasia by affecting both chondrogenesis and osteogenesis. J. Clin. Invest. 104, 1517–1525. doi: 10.1172/jci6690

PubMed Abstract | CrossRef Full Text | Google Scholar

Cheng, X. G., and Wang, Y. X. J. (2018). Orthopaedic imaging for translational research and clinical application. J. Orthop. Translat. 15, A1–A2.

Google Scholar

Christiansen, B. A. (2016). Effect of micro-computed tomography voxel size and segmentation method on trabecular bone microstructure measures in mice. Bone Rep. 5, 136–140. doi: 10.1016/j.bonr.2016.05.006

PubMed Abstract | CrossRef Full Text | Google Scholar

Domander, R., Felder, A., and Doube, M. (2021). BoneJ2 - refactoring established research software. Wellcome Open Res. 6:37. doi: 10.12688/wellcomeopenres.16619.2

PubMed Abstract | CrossRef Full Text | Google Scholar

Doube, M. (2021). Multithreaded two-pass connected components labelling and particle analysis in ImageJ. R. Soc. Open Sci. 8:201784.

Google Scholar

du Plessis, A., Broeckhoven, C., Guelpa, A., and le Roux, S. G. (2017). Laboratory x-ray micro-computed tomography: a user guideline for biological samples. Gigascience 6, 1–11.

Google Scholar

Elliott, J. C., and Dover, S. D. (1982). X-ray microtomography. J. Microsc. 126, 211–213.

Google Scholar

Galloway, J. L., Delgado, I., Ros, M. A., and Tabin, C. J. (2009). A reevaluation of X-irradiation-induced phocomelia and proximodistal limb patterning. Nature 460, 400–404. doi: 10.1038/nature08117

PubMed Abstract | CrossRef Full Text | Google Scholar

Galvez-Hernandez, P., Gaska, K., and Kratz, J. (2021). Phase segmentation of uncured prepreg X-Ray CT micrographs. Comp. Part A Appl. Sci. Manufact. 149:106527. doi: 10.1016/j.compositesa.2021.106527

CrossRef Full Text | Google Scholar

Heidrich, A., Schmidt, J., Zimmermann, J., and Saluz, H. P. (2013). Automated segmentation and object classification of ct images: application to in vivo molecular imaging of Avian Embryos. Int. J. Biomed. Imaging 2013:508474.

Google Scholar

Hopkins, T. M., Heilman, A. M., Liggett, J. A., LaSance, K., Little, K. J., Hom, D. B., et al. (2015). Combining micro-computed tomography with histology to analyze biomedical implants for peripheral nerve repair. J. Neurosci. Methods 255, 122–130. doi: 10.1016/j.jneumeth.2015.08.016

PubMed Abstract | CrossRef Full Text | Google Scholar

Huhdanpaa, H., Douville, C., Baum, K., Krishnamurthy, V. N., Holcombe, S., Enchakalody, B., et al. (2011). Development of a quantitative method for the diagnosis of cirrhosis. Scand. J. Gastroenterol. 46, 1468–1477. doi: 10.3109/00365521.2011.613946

PubMed Abstract | CrossRef Full Text | Google Scholar

Kapur, J. N., Sahoo, P. K., and Wong, A. K. C. (1985). A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Graph. Image Process. 29, 273–285. doi: 10.1016/0734-189x(85)90125-2

CrossRef Full Text | Google Scholar

Kherdjemil, Y., Lalonde, R. L., Sheth, R., Dumouchel, A., de Martino, G., Pineault, K. M., et al. (2016). Evolution of Hoxa11 regulation in vertebrates is linked to the pentadactyl state. Nature 539, 89–92. doi: 10.1038/nature19813

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, C., Chen, L., Iwata, T., Kitagawa, M., Fu, X. Y., and Deng, C. X. (1999). A Lys644Glu substitution in fibroblast growth factor receptor 3 (FGFR3) causes dwarfism in mice by activation of STATs and ink4 cell cycle inhibitors. Hum. Mol. Genet. 8, 35–44. doi: 10.1093/hmg/8.1.35

PubMed Abstract | CrossRef Full Text | Google Scholar

Mabee, P. M., Aldridge, E., Warren, E., and Helenurm, K. (1998). Effect of clearing and staining on fish length. Copeia 1998, 346–353. doi: 10.2307/1447428

CrossRef Full Text | Google Scholar

Marchini, M., and Rolian, C. (2018). Artificial selection sheds light on developmental mechanisms of limb elongation. Evolution 72, 825–837. doi: 10.1111/evo.13447

PubMed Abstract | CrossRef Full Text | Google Scholar

Mead, T. J. (2020). Alizarin red and alcian blue preparations to visualize the skeleton. Methods Mol. Biol. 2043, 207–212. doi: 10.1007/978-1-4939-9698-8_17

CrossRef Full Text | Google Scholar

Mehadji, B., Ahmed, Y., and Berteau, J.-P. (2019). A novel approach for computing 3D mice distal femur properties using high-resolution micro-computed tomography scanning. Micron 121, 1–7. doi: 10.1016/j.micron.2019.02.011

PubMed Abstract | CrossRef Full Text | Google Scholar

Okada, T., Shimada, R., Hori, M., Nakamoto, M., Chen, Y.-W., Nakamura, H., et al. (2008). Automated segmentation of the liver from 3D CT images using probabilistic Atlas and multilevel statistical shape model. Acad. Radiol. 15, 1390–1403. doi: 10.1016/j.acra.2008.07.008

PubMed Abstract | CrossRef Full Text | Google Scholar

Posey, K. L., Veerisetty, A. C., Liu, P., Wang, H. R., Poindexter, B. J., Bick, R., et al. (2009). An inducible cartilage oligomeric matrix protein mouse model recapitulates human pseudoachondroplasia phenotype. Am. J. Pathol. 175, 1555–1563. doi: 10.2353/ajpath.2009.090184

PubMed Abstract | CrossRef Full Text | Google Scholar

Rathnayaka, K., Sahama, T., Schuetz, M. A., and Schmutz, B. (2011). Effects of CT image segmentation methods on the accuracy of long bone 3D reconstructions. Med. Eng. Phys. 33, 226–233.

Google Scholar

Reynisson, P. J., Scali, M., Smistad, E., Hofstad, E. F., Leira, H. O., Lindseth, F., et al. (2015). Airway segmentation and centerline extraction from thoracic CT – comparison of a new method to state of the art commercialized methods. PLoS One 10:e0144282. doi: 10.1371/journal.pone.0144282

PubMed Abstract | CrossRef Full Text | Google Scholar

Rigueur, D., and Lyons, K. M. (2014). Whole-mount skeletal staining. Methods Mol. Biol. 1130, 113–121. doi: 10.1007/978-1-62703-989-5_9

CrossRef Full Text | Google Scholar

Rios, D., Jansson, K. S., Martetschlager, F., Boykin, R. E., Millett, P. J., and Wijdicks, C. A. (2014). Normal curvature of glenoid surface can be restored when performing an inlay osteochondral allograft: an anatomic computed tomographic comparison. Knee Surg. Sports Traumatol. Arthrosc. 22, 442–447. doi: 10.1007/s00167-013-2391-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Rosello-Diez, A., Madisen, L., Bastide, S., Zeng, H., and Joyner, A. L. (2018). Cell-nonautonomous local and systemic responses to cell arrest enable long-bone catch-up growth in developing mice. PLoS Biol. 16:e2005086. doi: 10.1371/journal.pbio.2005086

PubMed Abstract | CrossRef Full Text | Google Scholar

Rosello-Diez, A., Stephen, D., and Joyner, A. L. (2017). Altered paracrine signaling from the injured knee joint impairs postnatal long bone growth. eLife 6:e27210.

Google Scholar

Rowe, D. W., Adams, D. J., Hong, S. H., Zhang, C., Shin, D. G., Renata Rydzik, C., et al. (2018). Screening gene knockout mice for variation in bone mass: analysis by muCT and histomorphometry. Curr. Osteop. Rep. 16, 77–94. doi: 10.1007/s11914-018-0421-4

PubMed Abstract | CrossRef Full Text | Google Scholar

Šajn, L., Kononenko, I., and Milèinski, M. (2007). Computerized segmentation and diagnostics of whole-body bone scintigrams. Comput. Med. Imaging Graph. 31, 531–541. doi: 10.1016/j.compmedimag.2007.06.004

PubMed Abstract | CrossRef Full Text | Google Scholar

Sanger, T. J., Norgard, E. A., Pletscher, L. S., Bevilacqua, M., Brooks, V. R., Sandell, L. J., et al. (2011). Developmental and genetic origins of murine long bone length variation. J. Exp. Zool. Part B Mol. Dev. Evol. 316B, 146–161. doi: 10.1002/jez.b.21388

PubMed Abstract | CrossRef Full Text | Google Scholar

Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V., Longair, M., Pietzsch, T., et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682. doi: 10.1038/nmeth.2019

PubMed Abstract | CrossRef Full Text | Google Scholar

Sears, K. E., Behringer, R. R., Rasweiler, J. J. T., and Niswander, L. A. (2006). Development of bat flight: morphologic and molecular evolution of bat wing digits. Proc. Natl. Acad. Sci. U.S.A. 103, 6581–6586. doi: 10.1073/pnas.0509716103

PubMed Abstract | CrossRef Full Text | Google Scholar

Sheth, R., Marcon, L., Bastida, M. F., Junco, M., Quintana, L., Dahn, R., et al. (2012). Hox genes regulate digit patterning by controlling the wavelength of a Turing-type mechanism. Science 338, 1476–1480. doi: 10.1126/science.1226804

PubMed Abstract | CrossRef Full Text | Google Scholar

Shiratori, H., Yashiro, K., Shen, M. M., and Hamada, H. (2006). Conserved regulation and role of Pitx2 in situs-specific morphogenesis of visceral organs. Development 133, 3015–3025. doi: 10.1242/dev.02470

PubMed Abstract | CrossRef Full Text | Google Scholar

Summerbell, D. (1977). Regulation of the deficiencies along the proximal distal axis of the chick wing-bud: a quantitative analysis. J. Embryol. Exp. Morphol. 41, 137–159. doi: 10.1242/dev.41.1.137

CrossRef Full Text | Google Scholar

Taghizadeh, E., Terrier, A., Becce, F., Farron, A., and Büchler, P. (2019). Automated CT bone segmentation using statistical shape modelling and local template matching. Comput. Methods Biomech. Biomed. Eng. 22, 1303–1310. doi: 10.1080/10255842.2019.1661391

PubMed Abstract | CrossRef Full Text | Google Scholar

Weissheimer, A., Menezes, L. M., Sameshima, G. T., Enciso, R., Pham, J., and Grauer, D. (2012). Imaging software accuracy for 3-dimensional analysis of the upper airway. Am. J. Orthod. Dentofacial. Orthop. 142, 801–813. doi: 10.1016/j.ajodo.2012.07.015

PubMed Abstract | CrossRef Full Text | Google Scholar

Wong, S. F., Agarwal, V., Mansfield, J. H., Denans, N., Schwartz, M. G., Prosser, H. M., et al. (2015). Independent regulation of vertebral number and vertebral identity by microRNA-196 paralogs. Proc. Natl. Acad. Sci. U.S.A. 112, E4884–E4893.

Google Scholar

Yiannakas, M. C., Mustafa, A. M., De Leener, B., Kearney, H., Tur, C., Altmann, D. R., et al. (2016). Fully automated segmentation of the cervical cord from T1-weighted MRI using PropSeg: application to multiple sclerosis. NeuroImage Clin. 10, 71–77. doi: 10.1016/j.nicl.2015.11.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, J., Yan, C. H., Chui, C. K., and Ong, S. H. (2010). Fast segmentation of bone in CT images using 3D adaptive thresholding. Comput. Biol. Med. 40, 231–236. doi: 10.1016/j.compbiomed.2009.11.020

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: bone growth and development, micro-CT (computed tomography), BoneJ, MIMICS, skeletal phenotype analysis, limb evolution

Citation: Beltran Diaz S, H’ng CH, Qu X, Doube M, Nguyen JT, de Veer M, Panagiotopoulou O and Rosello-Diez A (2021) A New Pipeline to Automatically Segment and Semi-Automatically Measure Bone Length on 3D Models Obtained by Computed Tomography. Front. Cell Dev. Biol. 9:736574. doi: 10.3389/fcell.2021.736574

Received: 05 July 2021; Accepted: 06 August 2021;
Published: 26 August 2021.

Edited by:

Marie Kmita, Montreal Clinical Research Institute (IRCM), Canada

Reviewed by:

Kyle Harrington, Helmholtz Association of German Research Centers (HZ), Germany
Jaap Saers, University of Cambridge, United Kingdom

Copyright © 2021 Beltran Diaz, H’ng, Qu, Doube, Nguyen, de Veer, Panagiotopoulou and Rosello-Diez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Michael Doube,; Alberto Rosello-Diez,

These authors have contributed equally to this work