In vivo Mouse Intervertebral Disc Degeneration Models and Their Utility as Translational Models of Clinical Discogenic Back Pain: A Comparative Review

Low back pain is a leading cause of disability worldwide, and studies have demonstrated that intervertebral disc (IVD) degeneration is a major risk factor. While many in vitro models have been developed and used to study IVD pathophysiology and therapeutic strategies, the etiology of IVD degeneration is a complex multifactorial process involving crosstalk of nearby tissues and systemic effects. Thus, the use of appropriate in vivo models is necessary to fully understand the associated molecular, structural, and functional changes and how they relate to pain. Mouse models have been widely adopted due to accessibility and ease of genetic manipulation compared to other animal models. Despite their small size, mouse lumbar discs demonstrate significant similarities to the human IVD in terms of geometry, structure, and mechanical properties. While several different mouse models of IVD degeneration exist, greater standardization of the methods for inducing degeneration and the development of a consistent set of output measurements could allow mouse models to become a stronger tool for clinical translation. This article reviews current mouse models of IVD degeneration in the context of clinical translation and highlights a critical set of output measurements for studying disease pathology or screening regenerative therapies, with an emphasis on pain phenotyping. First, we summarize and categorize these models into genetic, age-related, and mechanically induced. Then, the outcome parameters assessed in these models are compared, including molecular, cellular, functional/structural, and pain assessments for both evoked and spontaneous pain. These comparisons highlight a set of potential key parameters that can be used to validate the model and inform its utility to screen potential therapies for IVD degeneration and their translation to the human condition.
As treatment of symptomatic pain is important, this review places an emphasis on critical pain-like behavior assessments in mice and explores current behavioral assessments relevant to discogenic back pain. Overall, the specific research question was determined to be essential for identifying the relevant model, with histological staining, imaging, extracellular matrix composition, mechanics, and pain as critical parameters for assessing degeneration and regenerative strategies.

INTRODUCTION
Low back pain (LBP) is the leading cause of disability worldwide and its prevalence continues to increase with enormous socioeconomic burdens exceeding $100 billion annually in the United States alone (1)(2)(3). Current clinical interventions include analgesics [i.e., non-narcotic pain medications, nonsteroidal anti-inflammatory drugs (NSAIDs), and opioids], physical therapy, epidural injections, and surgical interventions (4)(5)(6). While most of these treatments provide symptomatic pain relief, they do not target the underlying pathology, leading to recurrent pain and surgical interventions (6). This ultimately leads to the increased use of pain medications and contributes significantly to the escalating opioid crisis (7).
Epidemiological studies suggest intervertebral disc (IVD) degeneration is a major cause of LBP, accounting for 40% of all LBP cases (8,9). It is important to note here that the term "IVD degeneration" refers to the progressive disease characterized by cellular and matrix changes which can often, but not always, result in disc herniation with increasing severity. Disc herniation can also occur due to injury/trauma to the spine, resulting in mechanical compression of nerve roots (10). The IVD is a fibrocartilaginous structure connecting vertebral bodies which plays critical roles in everyday motion; however, the etiology of IVD degeneration is a complex, multifactorial process contributing to discogenic back pain (DBP) (11). While several studies have explored regenerative therapies to treat disc degeneration, their clinical translation may be hampered by a lack of in vivo animal models that combine all the cellular, structural, and functional aspects of IVD degeneration, including symptomatic pain. Individual models recapitulate different aspects of IVD degeneration, and different results are often obtained when different models are used.
Numerous pre-clinical models have been used to assess therapies for IVD regeneration, ranging from large (sheep, goats, pigs, cattle, canine) to small (rabbits, rats, mice) animals. In vivo models in particular are advantageous over in vitro models as they incorporate complex systemic interactions which are relevant for assessing pain/cognitive behaviors that are otherwise difficult to recapitulate in vitro (12). Many of these in vivo models have been discussed in detail in numerous review articles which cover advantages and disadvantages along with major findings (12)(13)(14)(15)(16)(17). The goal and novel aspect of this review is to summarize relevant similarities and differences across mouse models and human IVD degeneration, with an emphasis on evaluating the clinical translation of downstream parameters and the metrics of pain. Rodent models such as mice are widely used and are advantageous in their accessibility, affordability, tunability, and abundance of molecular tools/genomic databases compared to large animal models (12). While many studies have used mouse models of IVD degeneration to test therapies for DBP, standardization of key variables, ranging from the method of model induction to the downstream parameters measured, may be beneficial to enhance clinical translation. In particular, there is a diverse array of pain-like behavior tests available, and challenges in the interpretation of these behaviors across numerous studies also highlight the need to ground the choice of relevant assays in translation to the human condition. In addition, many studies differ in the age and sex of animals used, which critically influence IVD degeneration and the pathogenesis of pain both in mice and humans (18)(19)(20). The lack of standardization in which parameters are assessed across models may hamper the clinical translatability of using mice as a translational model for regenerative therapies.
Often the differing array of parameters used across models relates to the specific tools/skill sets investigators have readily available to them, pointing toward a need to consider multidisciplinary teams as critical components of in vivo animal model studies.
Thus, this review aims to (1) highlight current/existing mouse models of IVD degeneration for translational research as it relates to human LBP along with their advantages and limitations, (2) compare/contrast degeneration induction methods and outcomes between these different mouse models, and (3) assess the clinical relevance of these outcomes to identify a subset of critical/core parameters essential for validating both the model and its utility to screen regenerative therapies to treat LBP. The following sections review current mouse models, the different levels of assessments used across studies, and how these assessed parameters compare with the clinical human condition, and conclude with critical assessment parameters.

MICE MODELS OF INTERVERTEBRAL DISC DEGENERATION
The Human Intervertebral Disc vs. Mouse Intervertebral Disc
The human IVD is composed of three main components: the inner nucleus pulposus (NP), the surrounding annulus fibrosus (AF), and the superior and inferior cartilage endplates (CEP) (21). The healthy NP is highly gelatinous and composed of proteoglycans, namely aggrecan [i.e., glycosaminoglycan (GAG)], and collagen II, which provide load distribution and compressive force absorption (22,23). In contrast, the surrounding AF region is composed of an aligned matrix of collagen I fibers and functions to contain the NP and anchor the disc to adjacent vertebrae to resist torsional and tensional loads (24,25). While undercharacterized, the CEP plays a critical role in the diffusion of nutrients to the largely avascular IVD from the vertebral bodies as well as the removal of metabolic waste (26). However, with advancing degeneration, the IVD demonstrates decreases in cellularity and proteoglycan synthesis along with increased catabolism, inflammation, immune cell infiltration, changes in IVD structure, altered mechanics, and neurovascular invasion, all of which are critical factors to assess in IVD degeneration models (27)(28)(29)(30).
Despite their small stature, mice have IVDs that share many commonalities with the human IVD, as summarized in Figure 1. On the cellular level, human and mouse IVDs both exhibit decreased cellularity with increased apoptosis, senescence, and immune infiltration during aging or induced IVD degeneration (13,31). At a molecular level, nutrient deficiency and reduced extracellular matrix (ECM) production have been found in mouse models of IVD degeneration and degenerated human IVDs. Immune infiltration has been identified in painful degenerate human and mouse IVDs as evidenced by the presence and recruitment of mast cells and macrophages (32)(33)(34)(35)(36)(37). Structurally, human and mouse IVDs are avascular and aneural in their healthy state with similar components (NP, AF, CEP) and ECM tissue structure. Interestingly, rodent lumbar IVDs are most geometrically analogous to humans compared to other animals based on percent deviation of normalized disc height, anterior-posterior width, and NP area (38). In IVD degeneration, changes at the structure/function level in both humans and mice include decreased disc height/volume, neurovascular invasion, limited nutrition, reduced hydration, AF disorganization, and a more fibrous NP (38)(39)(40)(41)(42). In terms of pain and pain-like behaviors, humans and mice can experience impaired gait, decreased activity time, and reduced range of motion due to IVD degeneration along with mechanical and thermal hypersensitivity and changes in neuronal plasticity (43)(44)(45)(46)(47).
In contrast, drawbacks of using mice as a translational model of LBP are mainly due to their small size, low forces and reduced diffusion distances across their IVD (0.35 ± 0.09 mm²), quadrupedal nature with altered mechanical forces, higher cellularity, and retention of notochordal cells throughout adulthood (13,38). Ideally, research involving therapeutic studies should include quantification of pain, but the measurement of pain differs between human patients and mice, as animals cannot communicate pain levels and are not affected by other risk factors of LBP such as lifestyle and occupation (48). Instead, behavioral assessments are used to infer "pain-like" behaviors. Nonetheless, mouse models allow for preclinical validation of disease pathogenesis and evaluation of potential regenerative therapies and therefore remain critical in vivo research models. The existing mouse models can be categorized into aging and genetic models, and mechanically induced and puncture models.

Age-Related Spontaneous Degeneration and Genetic Models
Age-related changes representative of degeneration are not as severe in mice, as they retain their notochordal cell population throughout adulthood, compared to animals without notochordal cell retention such as primates and chondrodystrophic dogs (49). While less severe, mice do develop age-related changes at later stages of life. Specifically, wild-type C57BL/6J mice at 2 years old (old age) demonstrate features of IVD degeneration such as reduced disc height, bulging, and neurovascular invasion in both caudal and lumbar IVDs along with upregulation of the inflammatory cytokines tumor necrosis factor-alpha (TNFα) and interleukin-1 beta (IL-1β) compared to younger mice (50). Alvarez et al. also showed degeneration at 2 years in wild-type mice associated with downregulated Forkhead box O (FOXO) expression (51). Taken together, aging in wild-type mice may mimic IVD degeneration that occurs with age in humans and has high clinical relevance. However, laboratory mice typically have a lifespan of 2 years, potentially limiting the utility of aged mice for studying regenerative therapies, as they may not survive the length of treatment.
As such, genetically modified mice have been used in place of wild-type mice for "accelerated aging" or for studying specific mechanisms involved in the pathogenesis of IVD degeneration. The bile duct ligation (BDL) strain of mice, a result of an autosomal mutation in the ky gene, develops kyphoscoliosis with structural changes in the vertebrae and cervical IVDs coupled with degeneration and herniation (52). Growth differentiation factor 5 (GDF5) deficient mice also demonstrate signs of IVD degeneration with changes in collagen and proteoglycan content (53). In addition, secreted protein acidic and rich in cysteine-null (SPARC-null) transgenic mice demonstrate age-related changes in IVD composition and structure such as decreased disc height index (DHI) and GAG with increasing severity. SPARC-null mice also exhibit pain-like behaviors similar to those seen in humans, such as decreased tolerance to axial stretch, thermal sensitivity, and impaired locomotion (54). Further mouse models that simulate an "age-related" degenerative phenotype include excision repair cross-complementation group 1 (ERCC1)-mutant mice that lack important DNA repair mechanisms, suggesting aberrant DNA repair may contribute to IVD degeneration (55). Most recently, SM/J mice have been shown to exhibit spontaneous IVD degeneration due to poor cartilage healing (56). As degradation of the ECM plays a key role in the pathogenesis of IVD degeneration, many ECM genetic knockout models have also been assessed (i.e., collagen II, collagen IX, aggrecan, and biglycan) and, as expected, demonstrate significant ECM loss/disorganization with impaired disc and vertebrae development (57)(58)(59)(60)(61)(62). A review by Jin et al. highlights additional genetic knockouts with degenerative IVD phenotypes (12). While specific mutations have been identified in some populations and global deletions remain rare, genetic models demonstrate accelerated disease/aging, which is of clinical relevance and value for animal studies given how long degeneration can take to progress in humans (63)(64)(65).
FIGURE 1 | Comparison of the human vs. mouse intervertebral disc and changes in degeneration: humans are represented on the left and mice on the right, with their respective similarities in the healthy and degenerate/diseased IVD (D = occurrence in degeneration) overlapping in the middle. Top middle depicts the relative timeline of mouse age compared to humans. Depictions of human and mouse in this figure were generated using BioRender.com.
Age-related models are advantageous as they share many similarities with disease pathogenesis that occurs in degenerate human IVDs. However, it is important to note that IVD degeneration does not happen solely due to age, but results from multiple factors such as genetics, environment, and lifestyle; furthermore, IVD degeneration can also occur in the younger population (63,65,66). Limitations of using aged mice as a model for regenerative therapies include the long-term care of aging mice and the short lifespan of the mice. Genetic mouse models are excellent tools allowing the investigation of specific genes/proteins/pathways contributing to IVD degeneration. A limitation of these models is that the multifactorial disease process that occurs in humans may not be gene-specific; however, these models still allow for screening of novel regenerative therapies for IVD degeneration.

Mechanically Induced and Puncture Models
Mechanical loading is an important regulator of IVD development and homeostasis, and altered biomechanics plays a significant role in IVD degeneration in humans as evidenced by the increased risk of IVD degeneration in manual laborers and changes due to physical exercise (9). As such, a wide array of mechanically induced models have been developed that alter or modify the mechanical environment experienced by mouse IVDs to induce degeneration. Instability models such as surgical resection of posterior elements (i.e., facet joints and spinous processes), tail bending, static/dynamic compression, and axial loading models have demonstrated that altering the mechanical loading environment applied to the IVD can lead to progressive disc degeneration characterized by cell death, decreased disc height, decreased collagen II and aggrecan expression, and disorganization of AF structure (67)(68)(69)(70)(71)(72)(73)(74). Other models which alter spinal loading via induced bipedal behaviors exhibit accelerated degenerative characteristics due to abnormal mechanical stress compared to quadrupedal mice (75,76). A recent bipedal mouse model was established by placing mice in water to promote standing, as mice are averse to water; these mice demonstrated decreased DHI with increasing degeneration grade (76). Reduced spinal loading due to microgravity, investigated through space flight, has also been shown to alter the viscoelastic behaviors of caudal mouse IVDs upon return to Earth (77). A model of whole-body vibration has also induced structural changes in mouse IVDs, indicated by structural NP and AF deficits (78). Collectively, these models are capable of inducing degeneration by manipulating the external loads applied to the IVD, and they demonstrate high clinical relevance given the significant changes and enhanced stresses and loads experienced by humans with degeneration and aging.
Another common method to induce IVD degeneration is via needle puncture, which induces a direct injury to the IVD and is thought to depressurize the NP, thereby changing the normal internal mechanisms of load support (79,80). Needle puncture models have shown similar hallmarks of IVD degeneration as in humans, including decreased DHI, reduced disc hydration and cellularity, matrix disorganization, and reduced torsional and compressive stiffness (79, 81-86). The severity of degeneration can be influenced by multiple factors such as needle gauge, surgical procedure, depth of puncture, and region of injury (i.e., caudal vs. lumbar) (40). There has been considerable work in multiple animal models to determine how the size of the puncture relative to IVD height influences degeneration (79, 87-90). Across models, there appears to be a threshold injury ratio (i.e., needle diameter to disc height ratio) below which less severe degeneration is induced and above which severe degeneration is induced; however, the exact threshold varies based on the species and region of the puncture (87,90). In mice, an injury ratio of 90% (26G needle) significantly altered IVD compressive and torsional mechanics, while a 65% injury ratio (29G needle) produced no significant differences (79). It is important to note that puncture models are useful due to the acute or rapid degeneration of the punctured discs. However, the puncture itself is generally not intentionally induced in humans in the present day.
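As a minimal sketch of the injury ratio described above (needle diameter divided by disc height), the following Python snippet may be useful when planning a puncture study. The function name and the example measurements are hypothetical and chosen only for illustration; actual needle outer diameters and disc heights should be taken from the manufacturer's specification and from imaging of the animals used.

```python
def injury_ratio(needle_diameter_mm: float, disc_height_mm: float) -> float:
    """Return the needle outer diameter as a fraction of the disc height."""
    if needle_diameter_mm <= 0 or disc_height_mm <= 0:
        raise ValueError("needle diameter and disc height must be positive")
    return needle_diameter_mm / disc_height_mm


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only (not taken from the cited studies).
    ratio = injury_ratio(needle_diameter_mm=0.45, disc_height_mm=0.50)
    print(f"Injury ratio: {ratio:.0%}")
```

Reporting this ratio alongside needle gauge may make it easier to compare puncture severity across spinal levels and species, since the same gauge yields different ratios in discs of different heights.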
Mechanical and puncture-induced models are advantageous as they do not require significantly aged mice or genetic modification to induce accelerated degeneration. However, it is worth noting that the severity of degeneration relies heavily on the degree of mechanical loading or injury. To date, puncture models have primarily been induced in caudal-level IVDs due to surgical accessibility. However, lumbar-level puncture models may be more representative of LBP and make it possible to assess pain-like behaviors. In addition, injury-induced degeneration often results in post-injury inflammatory processes which may be mechanistically different from spontaneous IVD degeneration and more representative of acute trauma (91). Thus, the type and degree of degeneration that the therapeutic strategy aims to target are critical for determining the relevant animal model.

Miscellaneous Models
Other models exist, such as dietary-induced diabetic or tobacco smoke inhalation models, which demonstrate increased disc degeneration with increases in cell senescence and reduced proteoglycan synthesis compared to control mice not exposed to a high-fat diet or tobacco products (92,93). Multiple studies have also shown that diabetes and obesity are significant risk factors for IVD degeneration, so such mice may serve as models for studying diabetic/obesity-related IVD degeneration (94)(95)(96)(97)(98).

Methodology and Scope of Review
To focus the scope of this comparative review on translational mouse models of LBP for studying regenerative therapies, articles were collected via PubMed and Scopus with multiple combinations of the search terms: "Intervertebral Disc," "Low Back Pain," "Mice," "Mouse," "Model," "Discogenic back pain," and "degeneration." Articles included in the review met the following criteria: characterization of a mouse model, direct/indirect induction of IVD degeneration (lumbar or caudal), and potential for use as an IVD degeneration model for regenerative medicine (i.e., no genetic deficits that greatly alter other tissues/organs). Exclusion criteria included: studies with a focus on specific pathologies/mechanisms or spinal diseases beyond IVD degeneration (e.g., scoliosis, kyphosis, arthritis) or LBP unrelated to IVD degeneration (e.g., direct neuronal trauma without IVD degeneration). Respective articles assessing Molecular and Cellular changes (Table 1), Tissue & Structure/Function (Table 2), and Pain-like behavior (Table 3) were compared for their induction methods (aging, injury type), degeneration level (i.e., caudal vs. lumbar), length of study, and model demographics (age, sex, strain), along with study results. The following sections review different assessments to identify key parameters for translation to the human condition.

MOLECULAR AND CELLULAR ASSESSMENTS
Several molecular and cellular assessment methods can be used to characterize IVD degeneration. Gene expression analysis using quantitative reverse transcription-polymerase chain reaction (RT-qPCR) can be used to evaluate healthy phenotypic markers, matrix markers, catabolic enzymes and their inhibitors [i.e., matrix metalloproteinases (MMPs), ADAM metallopeptidases with thrombospondin motifs (ADAMTs), tissue inhibitor of metalloproteinases 1 (TIMP1)], and inflammatory cytokines (55,56,73,86,106). However, mouse IVDs are small, which limits the amount of total RNA that can be obtained per disc. Thus, the pooling of several IVD levels is often required, limiting the ability to assess level effects within each mouse. In situ hybridization can also be used to localize and quantify transcript expression (86,99). Commonly, immunohistochemistry (IHC) and immunofluorescence (IF) are used to detect target proteins within cells of histological tissue sections. This includes cellular expression of matrix proteins such as collagen, aggrecan, [...] Table 1.
[Table 1 fragment, reference (86): tail skin incised and IVD exposed; puncture of IVD (31 G) to 1 mm depth into mouse tail IVDs (noted as an "annular" puncture).]
To quantify apoptotic cells within IVD tissue, the TUNEL assay can be used. Cellularity within the different IVD compartments is commonly characterized via histological stains, namely Hematoxylin & Eosin (H&E), Safranin O (with or without Fast Green), or FAST multi-chrome staining. Additionally, the loss of a notochordal phenotype, transition to a mature NP phenotype, or cell clustering/hypertrophy can be assessed via changes in cell morphology in combination with phenotypic markers measured at either the gene or protein level (107). In addition to IVD cells, immune cells, including infiltrating macrophages and mast cells, play a large role in the pathogenesis of LBP and can be assessed via IHC/IF or green fluorescent protein-labeled transgenic mice (32)(33)(34)36).

Comparisons of Molecular and Cellular Assessments Between Models
Most mouse models that included molecular and cellular assessments described in this section have based their assessment of disc degeneration on changes in cellularity and ECM markers, with few papers examining changes in phenotypic and inflammatory markers (Table 1). This may be due to the limited size of mouse IVDs and availability of tissue for multiple assessments, as previously mentioned. Multiple strains of mice have been used including wild-type, CD-1, SPARC-null, SM/J, and Swiss Webster mice (55,56,69,73,76,78,84,86,99); interestingly, only one study in this section compared both male and female mice (56). Parameters have been evaluated at different time points ranging from pre-op (if mechanically/puncture induced) and 0 days (immediately after puncture) up to 2 years. These cell/molecular level comparisons can be categorized into cell morphology and cellularity, extracellular matrix genes, and phenotypic and inflammatory markers. Despite immune infiltration being a contributor to LBP, only expression of inflammatory cytokines was assessed in the following studies.

Cell Morphology and Cellularity
In all studies, regardless of the induction method, mouse strain, or sex, the intervention to induce degeneration increased apoptosis and decreased cellularity in NP and CEP regions of the IVD (55,56,73,76,78,86,99). Interestingly, in the bipedal mouse, the size of NP cells decreased compared to control mice whereas, in the SM/J mice, cellular hypertrophy was observed with loss of vacuolated cells (56,76). This loss of notochordal cell phenotype and transition to chondrocyte-like mature NP cells was also observed in compression and puncture models of lumbar and caudal IVDs (84,86,99).

Extracellular Matrix Genes
As maintenance of ECM homeostasis is critical for IVD structure/function, it is not surprising that most models include assessments of cellular ECM. MMP13 expression was typically assessed in mechanically induced models and found to be upregulated as early as 1 week post-induction. Interestingly, the whole-body vibration model showed no differences in MMP13 or ADAMTs4/5 compared to other mechanically induced models (78). Most studies demonstrated decreased collagen 2 and aggrecan expression in the IVD, while collagen 1 expression was inconsistent across models, time points within models, and assessment methods (i.e., RT-qPCR on entire IVD or IHC), suggesting different degenerative mechanisms between different models (73,78,86,99).

Phenotypic and Inflammatory Markers
[...] -19), CA3, syndecan 4 (SDC4), and glucose transporter 1 (GLUT-1) with increased expression of connective tissue growth factor (CTGF) and RUNX2, suggesting loss of a healthy NP phenotype (56). When assessing inflammatory cytokines in wild-type mice with punctured lumbar IVDs, increased TNFα and IL-1β expression was observed at 2 and 4 weeks, indicative of acute inflammation (84). Interestingly, in SM/J mice no differences in IL-6, ADAMTs4, or fibromodulin (FMOD) were identified, but a decrease in vascular endothelial growth factor (VEGFα) gene expression was observed (56, 108), suggesting differences in both phenotypic and inflammatory marker expression across models.

Translatability of Model Results to the Human Condition
Molecular and cellular changes that characterize IVD degeneration in humans include decreased cellularity (with increased apoptosis and senescence), immune infiltration, loss of proteoglycan/GAG and collagen 2, and increased fibrosis/collagen 1 in the NP (109). Many of the mouse models of IVD degeneration reviewed here demonstrated molecular and cellular changes similar to those in humans. However, one major difference was inconsistent changes in collagen 1 across different studies. In humans, large vacuolated notochordal cells predominate before 10 years of age and transition into chondrocyte-like cells during adolescence. In degeneration, the NP decreases in cellularity with more degenerate cells (110). This characteristic change in cell phenotype and number was observed in puncture models along with the SM/J model, providing clinical relevance to the human condition on the cellular level for using these models to study IVD degeneration and regenerative therapies. Healthy NP phenotypic markers identified in the human IVD include GLUT-1, aggrecan/collagen 2 ratios > 20, sonic hedgehog, Brachyury, KRT18/19, CA12, and CD24 (111). Markers such as GLUT-1, aggrecan, and KRT-19 were also decreased in mice with IVD degeneration. Unlike the NP, the AF and CEP are undercharacterized, and some studies have suggested COL1A1, Elastin (ELN), Mohawk (MKX), and Scleraxis (SCX) as potential markers of the healthy AF (112)(113)(114)(115). In addition, there is an increase in catabolic and pro-inflammatory markers (i.e., TNFα, IL-1β, IL-6, MMPs, ADAMTs) in degenerate human discs compared to healthy IVDs (116,117). Increased expression of TNFα, MMP13, and IL-1β in mouse IVDs was consistent with the human condition, while many mouse models found no differences in ADAMTs gene expression.
Importantly, infiltration of immune cells, notably macrophages and mast cells, has been found in degenerate human IVDs and mouse IVDs post-injury, warranting the need to assess immune cell infiltration in mouse models to account for immune-related systemic effects (32-34, 36, 37).

Critical Molecular and Cellular Assessments for Mouse Models of IVD Degeneration
Across the molecular and cellular parameters highlighted here, gene expression varied between studies and was often inconsistent with relevant human markers, while histological and IHC assessments at the protein level were more consistent with the human condition. Thus, in terms of molecular markers, TNFα and IL-1β are critical inflammatory markers, while MMP13 may be a good catabolic marker for assessment of degeneration in mouse IVDs, as these are most relevant to the human condition. However, due to size limitations, the temporal nature of gene-level expression, and difficulty isolating high-quality RNA from single-level IVDs, transcriptome-level changes may be more challenging to corroborate between mouse models. Histological stains such as H&E and Safranin O are good for the assessment of cellularity, while the TUNEL assay can give information related to apoptosis consistent with human IVD degeneration.

TISSUE AND STRUCTURAL/FUNCTIONAL ASSESSMENTS
Assays such as dimethylmethylene blue (DMMB) are used to measure GAG, and the results are typically normalized to DNA content (Hoechst or PicoGreen) or tissue weight (55,79,86,101). For collagen content, hydroxyproline or Sircol assays are used, although these cannot distinguish the specific collagen type. Structural tissue-level changes are commonly assessed via histological staining such as H&E for overall tissue organization, Safranin O (with/without Fast green) or alcian blue (AB) for proteoglycans, picrosirius red (PSR) or Elastica van Gieson for collagen, Azan trichrome and Goldner's Masson trichrome for more general cartilage and bone, and Weigert's Resorcin Fuchsin for elastin (50,73,82,102,118-121). Multichromatic staining (i.e., FAST) can enhance visualization of tissue-level changes (122,123). Histological grading schemes can be used to determine the degree or severity of IVD degeneration (82,102,123).
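As a rough illustration of the normalization step described above, the sketch below converts a DMMB absorbance reading to a GAG mass using a linear standard curve and then divides by DNA content. The function names, the linear standard curve, the units, and all numeric values are assumptions made for illustration; they are not taken from any of the cited studies.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gag_per_dna(sample_abs, std_conc_ug_ml, std_abs, digest_volume_ml, dna_ug):
    """Return GAG content normalized to DNA (ug GAG per ug DNA).

    Assumes a linear standard curve over the measured range, which is a
    simplification; real curves should be checked for linearity.
    """
    slope, intercept = fit_line(std_conc_ug_ml, std_abs)
    gag_conc_ug_ml = (sample_abs - intercept) / slope  # invert the standard curve
    gag_ug = gag_conc_ug_ml * digest_volume_ml         # total GAG in the digest
    return gag_ug / dna_ug


# Hypothetical standards and sample, for illustration only.
standards_conc = [0.0, 10.0, 20.0, 40.0]   # ug/mL of a GAG standard
standards_abs = [0.05, 0.15, 0.25, 0.45]   # absorbance readings
ratio = gag_per_dna(0.25, standards_conc, standards_abs,
                    digest_volume_ml=0.5, dna_ug=2.0)
print(f"GAG/DNA = {ratio:.2f} ug/ug")
```

Normalizing to tissue wet weight instead would only change the denominator; the standard-curve step stays the same.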
In addition to histology, in vivo and ex vivo imaging methods, including radiographs, micro-computed tomography (microCT), and magnetic resonance imaging (MRI), are widely used to characterize gross features of tissue integrity/structure such as tissue hydration, disc height, grade of degeneration, and bone mineral density (50, 54-56, 69, 73, 76, 84, 86, 101, 102). Disc height index (DHI) and disc wedging index (DWI) can be derived from image analyses, where DHI is the thickness/height of the IVD relative to vertebral length and DWI is a measure of the anterior vs. posterior angulation, or wedging, in the sagittal plane (88,124). For example, a DWI > 1 indicates that the IVD is wedged with pressure at the posterior region and an increased likelihood of pressure on the spinal cord and dorsal root ganglia (DRGs). In addition, Pfirrmann grading, an MRI-based degeneration scale for humans ranging from 1 = healthy to 5 = severe disease, can be determined by calculating an MRI index and normalizing it to both control IVDs and background signal intensity, as described in Onishi et al. (82,125).
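The two image-derived indices above can be sketched as simple ratios. The sketch assumes that DHI is the mean of several disc height measurements divided by the mean length of the two adjacent vertebral bodies, and that DWI is the ratio of anterior to posterior disc height, so that DWI > 1 corresponds to a disc that is narrower posteriorly. Individual studies may define these indices differently; the function names and measurements below are hypothetical.

```python
def disc_height_index(disc_heights, cranial_lengths, caudal_lengths):
    """Mean disc height relative to mean adjacent vertebral body length.

    Each argument is a list of repeated measurements (e.g., anterior,
    middle, posterior) in the same length unit.
    """
    mean_disc = sum(disc_heights) / len(disc_heights)
    vertebral = list(cranial_lengths) + list(caudal_lengths)
    mean_vertebral = sum(vertebral) / len(vertebral)
    return mean_disc / mean_vertebral


def disc_wedging_index(anterior_height, posterior_height):
    """Anterior-to-posterior disc height ratio (assumed definition)."""
    return anterior_height / posterior_height


def percent_of_baseline(dhi_post, dhi_pre):
    """Express a post-operative DHI as a percentage of the pre-operative DHI."""
    return 100.0 * dhi_post / dhi_pre


# Hypothetical measurements in millimetres, for illustration only.
dhi = disc_height_index([0.30, 0.32, 0.28], [3.0, 3.1, 2.9], [3.0, 3.0, 3.0])
dwi = disc_wedging_index(anterior_height=0.36, posterior_height=0.30)
print(f"DHI = {dhi:.3f}, DWI = {dwi:.2f}")
```

Reporting DHI as a percentage of each animal's own baseline, as in the last helper, is one way to reduce the effect of between-animal size differences when comparing time points.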
The mechanical behaviors of mouse motion segments have been evaluated both in vitro and in vivo, and mechanical testing is often used to inform or quantify IVD structure-function (99,126,127). Multiple testing protocols have been developed to assess in vitro behavior under loading modes such as compression/tension, torsion, and creep using a mechanical testing system (Table 2). Depending on the mode of testing, mechanical parameters can be extracted either directly from the raw data or derived from fitting the data with various mathematical models and used to quantify changes that occur with degeneration.

Comparisons of Structural and Functional Assessments Between Models
In mouse models of IVD degeneration (Table 2), most studies assessed changes in disc structure and degeneration grade via histology, and DHI via radiographs or microCT. Interestingly, despite being critical parameters of disc structure/function, few studies assessed mechanics or disc hydration. Multiple strains of mice were used, including wild-type, CD-1, SPARC-null, SM/J, ERCC1-XPF, and Swiss Webster mice, with only two studies comparing both male and female mice. Parameters were assessed from pre-op to 2 years of age and included imaging parameters (X-ray, microCT, MRI) for DHI and hydration, histology for disc structure and grade, neurovascular invasion, and mechanics.

Imaging: Radiographs, MicroCT, and MRI
Radiographs and microCT imaging are commonly used across models to assess DHI. Most models demonstrated decreased DHI in degenerative groups, except for the subcutaneous puncture model, where DHI initially decreased at 2 weeks but increased at the 4- to 6-week time points. Mice with lumbar instability demonstrated increased CEP volume and porosity along with decreased IVD volume starting at 2 weeks and bone loss at 16 weeks (100). Studies of SPARC-null mice assessed DWI and found wedging of IVDs in both SPARC-null and old wild-type mice, but with increased severity in SPARC-null groups (54). Similarly, ERCC1 mice demonstrated decreased DHI at 3 weeks, which was similar to aged mice with decreased bone mineralization (50,55). In both mouse tail compression and needle puncture models, IVD hydration decreased as early as 2 weeks post-op and even earlier in larger needle puncture models (73,101). Although MRI measurements were included, Pfirrmann grades as described by Onishi et al. were not assessed in these studies (82).

Histological Assessment, ECM, and Grading
Histological staining of mouse spines or motion segments has been widely used to assess changes in IVD structure. All models showed reduced proteoglycan in the NP, AF disorganization with serpentine-like lamellae, decreased IVD height, and reduced distinction of NP and AF boundaries indicative of collagen infiltration into the NP region. This change in structure correlates with decreased GAG content in the NP region and increases in collagen 1 in all models of degeneration. However, these changes occurred at different time points depending on the degree of puncture (i.e., increased needle size showed degeneration early on), compression (increasing AF disorganization with increased compression), and mode of mechanical disruption (direct injury to the IVD vs. indirect mechanical induction such as in bipedal mice) (78,86,99,128). Compared to wild-type mice, which often show signs of IVD degeneration at 1 year, mechanically induced/puncture models exhibit characteristics of degeneration at earlier times (∼2 weeks) and genetic models (SPARC-null, SM/J, and ERCC1 mice) show signs at ∼3 weeks (50,54-56). As expected, IVD grades increased with the severity of degeneration across all models, but the grading scheme varied across studies (Table 2). Thus, a standardized histopathology scoring system using machine learning algorithms has been proposed for the mouse model as well as the human (129,130).

Neurovascular Invasion
Lumbar IVD puncture models demonstrate increased nerve growth factor (NGF) expression starting at 2 weeks and Protein Gene Product 9.5 (PGP9.5) starting at 4 weeks similar to old wild-type mice, with increased expression of PGP9.5 and vascular marker CD31 with age (50,84). In SPARC-null mice, PGP9.5 and calcitonin gene-related peptide (CGRP) expression were both upregulated in IVDs, indicative of nerve fiber ingrowth in these different models regardless of the mechanism of IVD degeneration (103).

IVD Mechanics
One of the IVD's main functions in the spine is to facilitate motion while supporting physiologic loads and the objective of many therapies is to restore the mechanical function of the IVD. Thus, assessing the mechanical behaviors and properties of the IVD is critical to characterize disease progression and assess therapeutic efficacy. However, the small-scale mechanical testing of mice motion segments often requires custom devices and has not been universally applied. The studies that have conducted mechanical testing on mouse degeneration models have demonstrated that degeneration induced via compressive overload on caudal IVDs resulted in increased compressive stiffness with no difference in bending stiffness or strength compared to healthy controls (99). However, under creep loading degenerated IVDs from the same model demonstrated a reduction in the strain-dependence of swelling pressure determined from fitting with a fluid transport model (131). Another model of compressive force induced via suture showed increased compressive stiffness over 4 weeks (73). In caudal puncture models, compressive stiffness, torsional stiffness, and torque all increased with needle size (79). SM/J mice showed increased compressive stiffness, suggesting that the mechanical function was altered in these models compared to healthy controls (56).

Translatability of Model Results to the Human Condition
On the structural level, human IVD degeneration is characterized by dehydration of the NP due to proteoglycan loss, AF disorganization resulting in a loss of AF-NP boundaries, decreased DHI, and sclerosis of the CEP. In the case of herniated IVDs, the IVD contains AF fissures leading to NP protrusion (109,117,132,133). These changes in the NP and AF were all observed in the mouse models discussed here; however, degenerative changes in the CEP region were mostly unquantified (Table 2). Additionally, this loss of IVD architecture in degeneration creates a permissive environment for vasculature and nerve ingrowth into the usually avascular and aneural healthy IVD, which may be a source of pain in DBP (134-136). This is consistent with the increased nerve fiber ingrowth seen in degenerative mouse IVDs in the assessed studies. In the clinic, structural changes in the IVDs of living patients are often detected via MRI and CT or X-ray, which parallels the use of MRI and microCT in mouse models to detect structural changes in the IVD, spinal cord, and adjacent tissues. Furthermore, human IVDs can also be assessed macroscopically ex vivo using the Thompson grading scheme (137). However, this grading scheme is limited clinically as it is typically performed on cadaveric tissue. Instead, the Pfirrmann grading scale can be used to grade human IVDs imaged via T2-weighted MRI, assessing DHI, hydration, and NP/AF compartmentalization (125). The present mouse studies primarily utilized histological assessments, which may be attributed to the accessibility of histological techniques compared to imaging modalities such as MRI. It is also important to note that degenerate IVDs imaged as black discs on MRI do not always correlate with LBP; patients can have several degenerate IVDs and experience little or no pain (95,138). This highlights a need to also quantify measures of pain, as changes on the structural level alone cannot always be used as a proxy for diagnosis or treatment of LBP.
These structural changes that develop during degeneration induce corresponding changes in IVD mechanical behaviors and properties. Mechanical properties of tissues can vary substantially based on the size and dimensions of the tested material, similar to how a thick piece of rope is stiffer than a thin piece of rope of the same material. This concept also applies to the IVD, as the physical dimensions and associated axial and torsional properties vary between species due to IVD size (38,139,140). To account for these geometric differences, the properties can be normalized to the geometry of the specimen, similar to material properties in materials science, thereby allowing comparison of properties independent of geometric biases. When the mechanical behaviors of IVDs of various species and anatomical locations (i.e., lumbar and caudal) are normalized to their respective geometries, the differences between species and regions are reduced and in some cases become insignificant, which allows direct comparison between mouse and human IVDs (140,141). For example, torsional stiffness (K) and torque range (TR) are often used to characterize the rotational behaviors of IVDs. Comparing the raw measurements, there is a difference of ∼4 orders of magnitude between human (K_Human = 3.18 ± 0.89 Nm/degree) and mouse (K_Mouse = 1.1 × 10^−4 ± 1.83 × 10^−5 Nm/degree) IVDs. However, after normalization to the disc height and polar moment of inertia (i.e., the shape's resistance to torsional deformation), there are no significant differences between the two species (139). Studies have further compared the normalized axial and viscoelastic creep properties across species and demonstrated that normalized material properties are largely conserved across species (139,140,142). This suggests that when comparing normalized mechanical parameters, mouse models are reasonable models of the human condition.
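The geometric normalization described above can be sketched with the standard torsion relation K = G·J/h, so that an apparent shear modulus is G = K·h/J. The sketch assumes an elliptical disc cross-section, for which the polar moment of area is J = π·a·b·(a² + b²)/4 with semi-axes a and b. The elliptical approximation, the function names, and all dimensions and modulus values below are illustrative assumptions and are not measurements from the cited studies.

```python
import math


def polar_moment_ellipse(width_m, depth_m):
    """Polar moment of area (m^4) of an ellipse with the given full axes."""
    a = width_m / 2.0
    b = depth_m / 2.0
    return math.pi * a * b * (a ** 2 + b ** 2) / 4.0


def normalized_torsional_stiffness(k_nm_per_deg, height_m, width_m, depth_m):
    """Apparent shear modulus (Pa) from raw torsional stiffness in N*m/degree."""
    k_nm_per_rad = k_nm_per_deg * 180.0 / math.pi  # convert per-degree to per-radian
    return k_nm_per_rad * height_m / polar_moment_ellipse(width_m, depth_m)


def raw_stiffness_from_modulus(g_pa, height_m, width_m, depth_m):
    """Raw torsional stiffness (N*m/degree) of a disc with shear modulus g_pa."""
    k_nm_per_rad = g_pa * polar_moment_ellipse(width_m, depth_m) / height_m
    return k_nm_per_rad * math.pi / 180.0


# Two hypothetical discs made of the same material, differing only in size.
G = 1.0e6                                       # Pa, illustrative value
large = dict(height_m=0.010, width_m=0.050, depth_m=0.035)
small = {key: value / 20.0 for key, value in large.items()}

k_large = raw_stiffness_from_modulus(G, **large)
k_small = raw_stiffness_from_modulus(G, **small)
print(f"raw stiffness ratio: {k_large / k_small:.0f}")
print(f"normalized (large): {normalized_torsional_stiffness(k_large, **large):.3e} Pa")
print(f"normalized (small): {normalized_torsional_stiffness(k_small, **small):.3e} Pa")
```

In this idealized case a 20-fold difference in linear dimensions produces a raw stiffness ratio of 20³, while the normalized values coincide, which is the reason raw stiffness values from discs of very different size are not directly comparable.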
In addition to comparing normalized mechanical parameters extracted directly from mechanical testing, other studies have examined how analytically derived parameters, determined by fitting mechanical data with theoretical models under similar loading conditions, compare between regions and with degeneration. For example, the response of human and mouse IVDs to creep loading has been fitted with a fluid-transport model, and the parameter corresponding to the strain dependence of swelling pressure decreases similarly in both mouse models and degenerated human IVDs (77,99,143).

Critical Structural and Functional Assessments for Mouse Models of IVD Degeneration
For the structural and functional parameters assessed, histological staining was consistently used across most mouse models, showed limited differences between models, and correlated well with changes observed in human IVD degeneration. Thus, histological assessment of reduced proteoglycan in the NP, reduced NP/AF demarcation, and disorganization of the AF matrix are all critical factors to assess when confirming degeneration in mouse models and the regenerative potential of therapies. DMMB assessments are consistent between all studies, not only within these models but across the IVD field. Collectively, the combination of these observations with standardized histopathological scoring systems will improve comparability across studies and their translation to the human condition (129,130). T2-weighted MRI to assess Pfirrmann grade and IVD hydration, and microCT for DHI, are also strongly recommended, as these assessments are used clinically in human patients and the parameters are consistent across mouse models. Notably, assessment of mechanical stiffness is important given the consistent increase in compressive stiffness observed in diseased mouse and human IVDs, yet the expertise and access to equipment needed for these tests may warrant collaborations across disciplines.

PAIN BEHAVIORAL ASSESSMENTS
While there is a clear correlation between the severity of IVD degeneration and LBP, not all clinical cases of IVD degeneration present with painful symptoms in humans (144). Thus, assessments of pain, in addition to ex vivo IVD measures, are crucial in the evaluation of therapies aiming to treat symptomatic pain. While some pain markers can be evaluated ex vivo, pain perception is a cortical activity requiring the peripheral and central nervous systems (145). Therefore, the brain is required to perceive pain, and this is a component that in vitro models cannot recapitulate, as pain is due to the complex interplay between the different components of the nervous system and surrounding tissue. The following sections will summarize the different classifications of pain associated with LBP as well as the available behavioral assessments for mouse models, categorized into evoked vs. spontaneous pain measures. Based on these comparisons, recommended assessments will be discussed.

Types of Pain and Changes in the Sensory Nervous System
Pain can be categorized broadly into neuropathic or nociceptive pain and LBP likely has neuropathic and nociceptive contributors. Neuropathic pain results from direct injury to neuronal tissue. In the case of LBP, the pain can arise from IVD herniation which causes pressure on the nerve root and innervating DRGs, resulting in inflammation, radiculopathy, and damage to the nervous system directly. Nociceptive pain results from injury to non-neuronal tissues such as from the muscles surrounding the IVD or the IVD joint itself, where neoinnervation can occur and peripheral nociceptive neurons are stimulated/excited and transmit signals to the central nervous system (146,147). Although not widely understood, an emerging third type of pain known as "nociplastic pain" may also play a role in LBP with a distinct mechanism unlike neuropathic or nociceptive pain (148). LBP can also be classified by the duration of pain (acute or chronic). Acute LBP occurs due to tissue trauma with patient recovery within a month, is typically self-limiting, and becomes subacute in the 1-3 month range (149). Chronic LBP lasts more than 12 weeks and patients who present with acute pain may develop chronic LBP over time (150,151). Acute pain is often protective and alerts individuals to potentially damaging environmental stimuli, thereby aiding in the healing process. Meanwhile, chronic pain serves no protective role and can be debilitating (152). LBP can also be classified into spontaneous or movement evoked discomfort with localization to the lower back and spine, or, radiating pain which also affects the legs due to injury or inflammation of the nerve root (153,154).
Anatomically, the IVD is innervated bilaterally by neurons with cell bodies residing in the DRGs. DRGs are heterogeneous (including proprioceptors, nociceptors, Schwann cells, fibroblasts and satellite glial cells) and are responsible for transmitting signals from the periphery to the central nervous system via projections into the dorsal horn (155,156). Functional changes and sensitization of the sensory neurons due to IVD degeneration can lead to neuropathic or nociceptive pain via multiple mechanisms (147). During IVD degeneration, nerve endings expand from the outer AF to the inner AF and NP regions of the disc due to decreases in chondroitin-sulfated proteoglycans and elevated levels of neurotrophic factors [i.e., neurotrophins such as NGF and brain-derived neurotrophic factor (BDNF)] and this can lead to nerve sensitization (29,136,147,157-160). In addition, the degenerate IVD is largely innervated by small nociceptors that express voltage-gated sodium channels (VGSCs) or transient receptor potential cation channel subfamily V member 1 (TRPV1) that regulate neuronal activity (161). Persistent inflammation in the IVD can sensitize these neurons and induce changes, causing altered action potential duration, hyper-excitability, lowered thresholds to stimuli, and enhanced pain as observed in rodent models (50,161-163). The activation of Protease-activated receptor 2 (PAR2) on DRG sensory neurons can regulate acute and chronic pain by activating the extracellular signal-regulated protein kinase (ERK 1/2) signaling pathway (162,164,165). However, the exact mechanisms driving LBP remain unclear, with a limited number of mouse models investigating changes at the DRG level.

In vitro Pain Assessments
In vitro pain assessments in this section relate to the neuronal function/activity of innervating DRGs and spinal cord taken from mouse models ex vivo, while neurovascular invasion (nerve ingrowth and neo-angiogenesis) into IVD tissue was addressed in the previous section. Ex vivo characterization of isolated neurons has been performed to determine the role of DRGs in discogenic neuropathic/nociceptive pain. Electrophysiology and IHC have provided insight into the function and expression patterns of ion channels/receptors and have highlighted their role in the pain signaling pathway in rat models, but few ion channels have been assessed in mice (161,166). In addition, calcium imaging is a standard method to quantify changes in ion channel activity in the neurons (167). At the gene level, transcriptome analysis of the DRG neurons has been used to identify key genes associated with pain perception (168). In addition, IHC on the DRGs from a lumbar IVD puncture model demonstrated increases in the key pain markers CGRP, Tropomyosin receptor kinase A (TrKA), and NGF from 2 to 12 weeks post-op, and this is further supported by increased gene expression of CGRP, substance P, TrKA, BDNF, TRPV1, Neuropeptide Y (NPY), and the VGSCs Nav1.7/1.8 (84). In SPARC-null mice, CGRP reactivity was also elevated in addition to NPY in DRG neurons (103). IHC on the spinal cord showed increased expression of the astrocyte marker Glial fibrillary acidic protein (GFAP) and of microglia at 2 weeks in the puncture model, potentially due to inflammation from injury, while SPARC-null mice showed increasing numbers of astrocytes and microglial cells with age (84,169). Thus, the DRGs are key determinants in the induction and maintenance of both neuropathic and nociceptive pain, making their assessment in LBP models critical.

In vivo Pain Behavior Assessment
Pain-like behavioral assessments can be classified into evoked or spontaneous pain as illustrated in Figure 2. Spontaneous pain occurs in the absence of specific stimuli and is more indicative of clinical chronic pain conditions (170). Examples of spontaneous pain in mice may include audible vocalization, avoidance, and self-mutilation (171). Meanwhile, in evoked behavioral assessments, mice are presented with a stimulus representing sensory modalities that allow for the measurement of pain thresholds (172). Signs of evoked pain may include motor reflexes such as limb withdrawal from the stimulus, reduced locomotion, or agitation. Evoked pain can further be categorized into hyperalgesia and allodynia. According to the International Association for the Study of Pain (IASP), allodynia is "pain due to a stimulus that does not normally provoke pain" while hyperalgesia is "increased pain from a stimulus that normally provokes pain" (173). The following sections describe the different behavioral assessments used in rodent models of LBP and their associations with human pain.

Spontaneous Pain Behavior Assessments
Spontaneous behavioral assessments include open field, gait analyses, burrowing, and grimace scales. Open field involves placing a mouse on a square field and allowing free roaming to assess exploratory behavior, with quantification of multiple parameters such as the number of times an area was visited, movement/sedentary time, and rearing (174). Gait and weight-bearing analyses are used to measure nociception in mice, as ambulation can mechanically stimulate the spine and alter the way mice walk and bear weight. Several platforms are available for mice, as reviewed by Deuis et al. (152). A limitation of weight-bearing analysis for the study of LBP is that these assessments may be easier to interpret for unilateral injury models, comparing differences between uninjured vs. injured sides, rather than as an overall assessment of gait. Grimace (which includes orbital tightening, nose bulge, cheek bulge, and ear position) can be observed and scored on a Grimace Scale as a measure of pain intensity in mice, on a scale from 0 (normal) to 2 (severe pain, with severely altered facial features) (175). While the grimace test is considered accurate, a significant level of pain is required to detect these facial changes, and the test is better suited to pain of acute/moderate duration than to chronic pain such as LBP. In addition to grimace, paw behaviors can be assessed but can be unreliable, as the lifting behavior is not observed universally in pain models (16). Burrowing can be used to measure nociception, where burrowing material is placed in the mouse cage and the burrowed material is quantified before and after the mouse is introduced into the cage (176). Rodents experiencing pain or discomfort will burrow less compared to normal mice.

Evoked Pain Behavior Assessments
The von Frey assay is commonly used to evaluate mechanical allodynia via application of calibrated monofilaments or a handheld device to the plantar surface, with evoked behaviors quantified (152). The Tail Flick test involves a heat stimulus to the mouse tail via a direct light beam or dipping the tail in hot water (∼46-52°C) until a tail-flick is elicited (177). However, this also requires mouse restraint (which may cause significant stress), and the clinical translatability of this assessment is unclear (it may be a spinal reflex rather than a pain response) (152). A cold or hot plate can be used to determine thermal hyperalgesia and involves placing mice on a heated or cooled plate and recording the time to paw withdrawal, licking, stamping, leaning, or jumping in response to the stimulus. Alternatively, dynamic cold/hot plate tests can be used, where the mouse is placed on the plate at non-noxious temperatures, the temperature is increased/decreased, and the temperature at which the mouse responds is recorded (152). The Hargreaves test uses a radiant or infrared heat stimulus aimed at the plantar paw of the mouse through a glass platform, and the time to withdraw is recorded (178). This test is preferred to the hot plate test as it allows the measurement of individual ipsilateral and contralateral heat thresholds, while the cold or hot plate applies the stimulus to both paws at the same time (179). Similarly, if the goal is to record the temperature threshold rather than latency at a static temperature, a thermal probe test can be used, where the mouse is placed on a wire mesh, a probe of increasing temperature is applied to the paw, and the temperature at withdrawal is recorded (180). The acetone evaporation test is a measure of thermal allodynia and involves applying acetone to the mouse plantar paw surface, with the number or severity of evoked responses recorded (181). This is advantageous over the cold plate as a unilateral application can be achieved. Similarly, a cold plantar assay can be deployed via the application of dry or wet ice to the mouse plantar paw, with the latency to withdrawal recorded for quantification of cold allodynia and hyperalgesia (182).
Grip Force is used to assess neuromuscular activity in rodents and is an indicator of muscle inflammation and deep tissue pain (183). The test uses grip meters, placing the mouse on a grip or rod, then the mouse is pulled back by the tail to induce stretching, and the peak force at the point of release is recorded. Similarly, an alternative is the hanging wire test where mice are placed on a wire mesh with grip induced before platform inversion and the latency to fall recorded (184). Decreased latency to fall would imply decreased motor function and increased pain. The tail suspension assay involves the suspension of mice by the tails (taped to the edge of a ridged platform) and the immobility/mobile time, rearing, full-extension, amongst other behaviors can be analyzed. While this test has been historically used as a measure of depression in mice, Millecamps et al. have used this as an assessment of axial pain (54,185). More mobility will be observed in mice with axial stretching sensitivity due to pain compared to normal mice. FlexMaze assays were first utilized by Millecamps et al. where the mouse was placed in a maze and forced to undergo lateral flexion as they maneuvered their way through the maze (54). This allows for the measurement of lateral-flexion discomfort in mice similar to lateral human flexion. For locomotor capacity, a rotarod assessment may be used (186).

Cognitive Assessments
Although not utilized yet in mouse models of LBP, cognitive assessments such as the Barnes maze and Morris water maze may be useful to correlate DBP with psychosocial or nervous system injury/impairment (187). Although impairment in cognitive function due to LBP is rare, Schiltenwolf et al. revealed that patients with chronic LBP have slowed speeds of information processing and working memory (188).

Comparison of Pain Assessments Between Models
Unlike ex vivo assessments of DRGs, which have been widely utilized, behavioral assessments of pain in mouse models of DBP are less well characterized. Only 35% of the papers included in this review assess pain as a major outcome, and a majority of these use the SPARC-null mouse (54,103,104,189). Thus, published models measuring pain-like behaviors are limited to the ventral lumbar puncture model, SPARC-null, and aged wild-type mice, as shown in Table 3. Male or female mice were used, ranging from 6 weeks to 2 years old, with one study comparing both sexes (50) in terms of behavioral assessments. Results for spontaneous pain and locomotor capacity, mechanical and thermal sensitivities, and axial/lateral pain are described below.

Spontaneous Pain and Locomotor Capacity
Open field assessments of aged wild-type mice showed decreased rearing (time spent on hind limbs) with age and a decline in general activity similar to SPARC-null and injured mice, with the addition of decreased burrowing in the puncture model after 4 weeks (50,54,84). This suggests that SPARC-null and injured mice experience spontaneous pain similar to age-related degeneration, evidenced by decreased movement and rearing, which requires significant trunk stability and lower body strength. Locomotor function was assessed in SPARC-null mice, with differing results between studies. In a study comparing 3- and 9-month-old SPARC-null mice to age-matched wild-type controls on the rotarod, no differences between the groups were observed, which suggested no generalized nervous system dysfunction (105). In studies including male SPARC-null mice ranging from 6 to 78 weeks in age, rotarod performance improved with age in SPARC-null mice and wild-type controls but decreased drastically after 70 weeks in SPARC-null mice while wild-type mice plateaued, and the authors suggest the initial improvement may be due to learned behavior on the rotarod (54). In female mice within the same age range, rotarod activity suggested decreased physical function with age in SPARC-null and wild-type mice, with no effect of strain (169).

Mechanical and Thermal Sensitivity
No differences in mechanical sensitivity were identified by von Frey testing in SPARC-null mice compared to wild-type mice, and no effects of aging/time were found, although some slight sensitivity was observed on the lower back (54,103-105). Meanwhile, mice with multiple injured IVDs showed a significant decrease in threshold at 12 weeks, suggesting increased mechanical sensitivity and mechanical allodynia, while CD-1 mice with single-level puncture did not exhibit any difference (84,85). Cold sensitivity assessed via acetone demonstrated increased evoked behaviors in female mice with punctured IVDs, and in young male SPARC-null mice hypersensitivity increased with age. Interestingly, female SPARC-null mice exhibited cold hypersensitivity later, at 18 weeks (54,85,104). In SPARC-null mice, tail immersion was used to assess cold sensitivity but no differences were identified. Hot plate, Hargreaves, tail flick, and capsaicin tests were used as measures of heat sensitivity. Injured mice demonstrated decreased latency by 8 weeks post-op, while SPARC-null mice exhibited no difference in heat sensitivity between young and old animals or compared to wild-type mice, which highlights potential differences in pain mechanisms between injury/puncture and genetic models. This is further supported by aged wild-type mice, which demonstrated no correlation of thermal hyperalgesia with age, sex, or weight (50).

Axial and Lateral Pain
Axial discomfort assessed by grip force or tail suspension in SPARC-null mice and mice with single-level IVD injury showed decreased resistance to force along with decreased time immobile and increased time in self-supporting modes, suggesting that SPARC-null and injured mice may experience axial discomfort and attempt to mitigate it through behaviors that alleviate axial stretch. This effect has also been observed in old wild-type mice (50). The FlexMaze demonstrated decreases in physical function associated with lateral flexion and reduced exploration speeds in SPARC-null mice (54). These assessments have not been made in mechanically induced or IVD puncture mouse models.

Translatability of Model Results to the Human Condition
A major difference between pain assessments in humans and animals is the inability of animals to communicate their pain. Humans can verbally communicate pain, while for animals we rely on changes in pain-like behaviors as a proxy for pain. Unlike mice in experimental models, humans are subject to environmental influences on pain, and pain is also subjective, as pain tolerances differ across patients. Physicians have access to the clinical patient history and perform in-depth physical examinations with pain scales [Numeric Pain Rating Scale (NPRS), Pain Disability Index (PDI), Visual Analog Scale (VAS), and Oswestry index]. Humans also possess psychosocial aspects which can affect pain, while mice may not. However, some studies suggest mice have the potential for high-level emotional pain (190). As reviewed by Mogil et al., evoked assessments often fail in their translation to the clinic, and thus spontaneous pain may be more clinically relevant to the human condition (191). Closer observation of mouse behavioral assessments, especially evoked assessments, shows that variability is high and that multiple factors can contribute to differences in pain-like behavior, such as the testing environment, the force of pull in grip tests, the experimenter, and animal stress. Such factors need to be considered when assessing pain-like behaviors.
Neuropeptides such as CGRP have been identified in small DRG neurons involved in pain perception (192), and NPY is also upregulated in nerve injury/inflammation and present in the lumbar IVDs in humans and mice (193,194). In addition, sodium ion channels are phosphorylated post-injury, which allows for increased nociceptive signaling due to greater current density (195,196). In a direct molecular comparison of mouse and human DRGs, TRK receptors present in small nociceptive neurons and TRPV1 were similar between humans and mice, while human neurons have a larger average size (∼1.5-3 times larger than in the mouse) (197). The mouse models above demonstrated increased expression of neuropeptides along with increased sodium channels as assessed by IHC and RT-qPCR. Interestingly, receptor tyrosine kinase (RET), a pivotal protein in neuronal development, and Nav1.8/1.9 are significantly more abundant in TRKA+ cells of human DRGs compared to mice (197). Localization of CGRP and TRPV1 differs between humans and rodents (198), as does Nav1.8 activity (198,199). This suggests that, while changes in specific markers may translate to human pain, their localization within neuronal populations may differ between species.
Consistent with open field and rotarod assessments, human patients with radiculopathy have slower gait speed, shorter travel distances, and greater standing time, similar to aged mice (44–47,200). Patients with LBP have increased movement-evoked fatigue, decreased physical activity, and reduced flexibility (201). In humans, mechanical hyperalgesia may also be measured in a manner similar to algometers in mice, by applying force of gradually increasing intensity to the patient's lower back (43). Interestingly, mechanical allodynia is more relevant in patients with radicular pain due to nerve compression or inflammation, which may explain the lack of differences when conducting von-Frey tests on mouse models with IVD injury or genetic predisposition that lack direct neuronal trauma (202). In terms of thermal hyperalgesia, humans with LBP may experience coldness, radiating pain, and cold allodynia in one or both legs, which correlates with findings from the cold plate and acetone tests in the mouse models of IVD degeneration in this review (24,203). Grip force and tail suspension tests showed axial discomfort, which is also relevant in humans, and sensitivity to stretching in mouse models may be predictive of lumbar stiffness as perceived in humans (204). Additionally, mouse models of IVD degeneration under tail suspension tend to have decreased time immobile with more time spent mitigating pain, which is similar to the human condition, where patients with pain are more likely to modify movements to avoid pain than individuals without pain (200,205,206). It is worth noting that "pain" is complex and multifaceted and cannot be directly measured in animal models. However, indirect surrogate measures such as observations of pain-like behaviors can be used to provide insight into DBP and the potential translation of therapies for LBP.
Overall, the direct translation of pain in mice to human pain requires further interrogation, and advances such as multi-institutional collaborations involving multiple disciplines are critical before we can fully interpret the translation between models (207).

Critical Pain Assessments for Mouse Models of IVD Degeneration
Based on the review of pain-like behaviors described above, we have identified key ex vivo and in vivo pain assessments that can be utilized in mouse models. Given their similarity to humans and consistency across models, ex vivo assessments of CGRP, TRKA, TRPV1, and sodium channel levels are recommended. For in vivo assessments, the recommendations are derived from the small number of studies that assessed pain-like behaviors and are based on their consistent results; additional studies may reveal further critical behavioral parameters. In vivo assessments could include open field for spontaneous pain-like behaviors, grip force and tail suspension for axial discomfort, and cold plate/acetone tests, due to their similarity to human pain outcomes and consistency across models. Thermal hypersensitivity measures are less common in humans and have demonstrated inconsistencies across studies. While mechanical sensitivity is relevant in humans, the lack of significant difference in von-Frey testing in mice limits its clinical potential.

Criteria for Choosing the Optimal Model
In this review, we have highlighted several mouse models of IVD degeneration with the potential for understanding disease mechanisms and for screening regenerative therapies for LBP in vivo. Methods of assessment at the molecular, cellular, tissue, structure/function, and pain levels were discussed in light of their clinical translation to LBP in humans. Several prominent induction methods and downstream parameters stood out among the assessed mouse models. Mechanically induced models, such as those using compression, instability, whole-body vibration, and bipedalism, along with needle puncture models, demonstrated IVD degeneration at early time points compared to naturally aging mice and may be more clinically relevant as acute models of IVD degeneration following herniation/trauma with controlled level-specific effects. These models also indicate that the mode of puncture is critical, as different needle gauges alter the degree of degeneration (79). Genetic/aging models such as ERCC1, SM/J, and SPARC-null mice exhibited hallmarks of IVD degeneration at early ages compared to aged wild-type mice without forced trauma to the IVD and may be good models for spontaneously occurring chronic LBP. Clinical limitations to note include the fact that the cause of accelerated aging in these mice may be developmental/global defects. In addition, IVD degeneration was present across all lumbar IVDs in genetic models (varying in degree of degeneration), such as the global SPARC-null mice, while in humans IVD degeneration related to LBP is more prevalent in lower lumbar levels (L4/L5, L5/S1) (54).
Overall, model selection criteria are largely dependent on the research question and the regenerative therapy assessed. For example, critical differences between mouse and human IVDs are the small size of mouse IVDs and the reduced diffusion of nutrients across the CEP in the larger human IVDs. Therefore, mice may not be optimal for studying cell-based therapies, as cell viability in a mouse model will likely differ from that in larger human IVDs. Rather, mice may serve as a good tool for studying small molecule therapies, drug evaluation, and non-viral gene delivery (83,101). For example, the SPARC-null mouse has been used for studying potential therapies targeting the IL-8 pathway and toll-like receptor 4 inhibition (189,208). Another important limiting factor is the presence of notochordal cells in mouse models; however, accelerated aging models, such as SM/J mice, do demonstrate a shift from notochordal to NP chondrocyte-like cells (56). Another research question relates to the stage/severity of degeneration the therapy is targeting. For example, mouse models of IVD degeneration caused by mechanical manipulation or direct puncture typically develop degenerative changes relatively quickly (which can vary depending on the degree of mechanical forces/puncture, needle size, or injection of pro-inflammatory cytokines), while genetic models may be more representative of slower developing IVD degeneration as a result of aging. Thus, the target population should be considered, as in humans elderly patients present with lumbar spinal stenosis due to a decrease in NP hydration and narrowed IVD height, while younger patients are more prone to AF rupture and NP herniation (30,209). Despite some limitations, these mouse models are advantageous in their wide array of molecular assessments and genetic phenotypes and may serve as an excellent intermediate model between in vitro models and more clinically relevant but logistically challenging large animal models such as the chondrodystrophic dog (49).
They provide the ability to assess IVD structure/function and pain parameters in living animals whilst providing a more efficient way to screen regenerative therapies without using large animals, further contributing to the 3R principles (Replacement, Reduction, Refinement).

Recommended Downstream Parameters
This review highlights critical cellular/molecular, tissue structure/functional, and pain assessments for determining the validity and efficacy of regenerative therapies in the mouse model, as illustrated in Figure 3. Of significance, sex differences in pain perception and IVD degeneration have been found in humans as well as in animal models, including in articles within this review, which warrants the inclusion of both male and female animals in studies to accurately represent the clinical patient population (19,210–212). This is imperative because sex bias in animal testing has resulted in clinical limitations, yet only 10% of the studies presented here assessed sex differences, as shown in Table 4, which highlights the percentage of models in this review assessing each parameter. The most frequently assessed parameters were IVD morphology (85%) and DHI and hydration (66%), which are critical parameters used to determine IVD structure and function. Spontaneous pain, neuronal function, mechanics, and locomotor capacity were the least assessed parameters (<25%). While these measures are important, their underuse in mouse models of LBP may be a result of a lack of investigator access to the proper tools/expertise. A solution may be to enhance collaborations among research groups with differing areas of expertise so that these outcome measures can be included in more mouse models. In terms of timeline, IVD models saw effects as early as 1 week post-induction/injury or as early as 8 weeks in accelerated aging mice. However, results at earlier time points in mechanically induced or puncture models may be more related to acute inflammation and thus may not accurately represent IVD degeneration in humans. Longer time points are recommended, with incremental time points if logistically possible (e.g., 4, 8, and 12 weeks vs. a single end time point), to study the progression of the disease model over time and therapeutic outcomes in both the short and long term.
The review of specific outcome measures provides a comprehensive overview of IVD degeneration in the mouse model as well as of the potential to screen regenerative therapies. As illustrated in Figure 3, these include (1) histology and IHC to determine changes in disc structure and ECM, including nerve and immune cell infiltration, (2) molecular and biochemical changes in ECM (proteoglycan and collagen) via DMMB or hydroxyproline/Sircol assays, (3) imaging of IVD joints using microCT and MRI to assess changes in DHI, disc hydration, and grade, (4) mechanical assessments of compressive stiffness, and (5) pain-like behaviors assessed by open field, grip force, tail suspension, and cold plate/acetone tests. Histological assessments were used in most mouse models to determine histopathological degeneration scores, structural integrity, disc height, and cellularity/cell morphology. IHC allows for the spatial assessment of specific proteins of interest and can be used to quantify changes in ECM proteins, catabolic proteins, and even immune infiltration and neurovascular invasion into the IVD joint, or in neuronal structures such as innervating DRGs and spinal cord ex vivo as proxies of pain. Assays such as DMMB allow for quantitative assessment of proteoglycans in mouse IVDs, which is directly related to IVD hydration, while TUNEL assays give information on changes in cellularity and cell death. Imaging methods such as microCT and MRI can be performed in vivo or ex vivo to quantify changes in IVD joint structure, including DHI, DWI, and hydration by disc intensity, as well as to assess the health of bone adjacent to the IVDs. Lastly, as a major goal of regenerative therapies for IVD degeneration is to treat clinically relevant LBP, in vivo pain parameters are critical in addition to ex vivo assessments.
In conclusion, this review has highlighted existing models of IVD degeneration for translational research and the treatment of LBP (acute vs. chronic), compared and contrasted induction methods (mechanical, injury, genetic, age), and outlined critical methods/parameters for both characterizing disease and downstream assessment of regenerative therapies in the mouse model. We hope this review will assist the LBP research community with model selection and the choice of critical assessment parameters to further advance the translation of clinically relevant therapies.