- Department of Environmental Toxicology, Texas Tech University, Lubbock, TX, United States
Introduction: Species identification and sex determination have significant relevance in wildlife monitoring, conservation, and forensic investigations. This study explores the feasibility of using Raman spectroscopy coupled with Partial Least Squares Discriminant Analysis (PLS-DA) for sex determination of white-tailed deer (Odocoileus virginianus) based on serum and plasma samples.
Methods: A total of 720 Raman spectra (360 serum, 360 plasma) were acquired from five male and five female deer. PLS-DA models were developed using full data and four calibration-to-validation splits (1:4, 2:3, 3:2, 4:1).
Results: The serum-based PLS-DA model achieved predictive accuracies of 79.2% (1:4), 99.1% (2:3), 99.3% (3:2), and 98.6% (4:1), with balanced calibration-to-validation splits (≥2:3) demonstrating robust generalization at accuracies exceeding 98%. The plasma-based model showed high internal accuracy (∼99.2%) but reduced external accuracy depending on the split: 85.8% (1:4), 82.4% (2:3), 86.1% (3:2), and 57.0% (4:1), reflecting greater variability and susceptibility to overfitting. Spectral comparison revealed consistent Raman bands in both matrices associated with proteins, lipids, and carbohydrates, with subtle but distinct sex-specific intensity differences. Males showed enhanced signals at ∼1,315 and ∼1,445 cm-1 (lipid regions), while females displayed stronger signals at ∼1,220–1,280 and ∼1,540–1,630 cm-1 (protein regions).
Conclusion: These findings highlight the potential of Raman-PLS-DA models, particularly with serum, as a reliable, rapid, and nondestructive method for sex classification in forensic wildlife contexts, especially when morphological features are absent or degraded. This approach may enable field-adaptable identification that enhances species conservation and forensic investigations involving incomplete or decomposed biological remains.
1 Introduction
Sex identification plays an important role in wildlife biology, ecological monitoring, forensic science, and veterinary diagnostics. It informs our understanding of key biological phenomena, including reproductive strategies, hormonal regulation, immune function, disease susceptibility, behavioral ecology, and population dynamics (Sibeaux et al., 2016; Mariela et al., 2020; Lemaître et al., 2020; Slezak et al., 2023; Khorozyan, 2025). In wildlife management, sex ratios are key demographic parameters that guide conservation planning, enforcement of wildlife protection laws, development of breeding programs, and population viability analyses. Furthermore, sex-specific variation in physiology and metabolism has important implications in toxicology, pharmacokinetics, and disease susceptibility, particularly in studies involving endocrine disruptors, immune-modulating agents, or infectious diseases (Simpkins et al., 1997; Zárate et al., 2017; Takahashi et al., 2020; Lutshumba et al., 2023).
In many field-based studies and retrospective forensic analyses, visual determination of sex is frequently impossible, as blood samples are often collected from free-ranging animals during sedation or brief encounters without recording physical sex characteristics. Typically, these animals are released back into their habitat with minimal handling, leaving only biological materials such as serum or plasma for subsequent laboratory evaluation. Similarly, samples obtained post-mortem often lack critical data, including sex information, which complicates retrospective studies (Wilson and White, 1998; Brinkman and Hundertmark, 2008). This presents particular challenges for species like white-tailed deer (Odocoileus virginianus), in which males and females differ markedly in disease susceptibility, hormonal fluctuations, metabolic processes, and stress responses (Waid and Warren, 1984; Millspaugh et al., 2002; Potratz et al., 2019). Consequently, developing reliable analytical techniques for the retrospective determination of sex from blood-derived biofluids, including serum and plasma, could significantly enhance the forensic, scientific, and conservation relevance of existing and future biological datasets.
Traditional sex identification methods in wildlife research rely on physical examination, morphometric measurements, or molecular techniques such as PCR amplification of sex chromosome-linked genes (e.g., SRY, ZFX/ZFY, or Amelogenin) (Morin et al., 2001; Hedmark et al., 2004). PCR-based techniques that amplify sex-specific genes have proven effective in identifying sex from feces, blood, and muscle tissue in red deer, moose, sika deer, and other ungulates (Wilson and White, 1998; Yamauchi et al., 2000; Huber et al., 2002; Brinkman and Hundertmark, 2008; Han et al., 2009; Goro et al., 2017). While genotypic (i.e., sex-specific genetic) assays offer high accuracy, they require specialized molecular laboratories, involve invasive sampling, and are often impractical in field settings. Furthermore, DNA degradation in non-invasively collected or archived samples can reduce assay reliability (Deagle et al., 2006; Zayats et al., 2009; Ben Larbi et al., 2012; Molbert et al., 2023; Bhoyar et al., 2024). Given these limitations, there is a clear need for alternative, non-destructive, rapid, and reliable analytical methods suitable for degraded or archived samples.
Biofluids capture the overall physiological state, including sex-specific hormonal profiles like testosterone and estrogen, which significantly influence immune responses, muscle metabolism, stress reactions, and metabolomic signatures (Kerr, 1986; Lemaître et al., 2020; Takahashi et al., 2020; Jackson et al., 2023; Lutshumba et al., 2023). Consequently, determining sex from biofluids greatly improves the interpretation of endocrine, toxicological, and disease-related data, refines conservation models, supports reproductive health management, and allows for more precise veterinary interventions.
Spectroscopic techniques such as Raman spectroscopy have become increasingly popular as alternative, label-free tools for biological analysis. Raman spectroscopy, coupled with multivariate statistical analyses, provides a promising non-invasive approach for molecular profiling of biofluids. This technique utilizes inelastic scattering of monochromatic light to detect vibrational modes of chemical bonds in biomolecules, such as proteins, lipids, nucleic acids, and carbohydrates, enabling reliable identification of sex-specific biochemical differences (Movasaghi et al., 2007; Lednev, 2012; Sikirzhytskaya et al., 2017; Leal et al., 2018; Carota et al., 2022). This technique requires minimal sample preparation and can generate reproducible biochemical signatures that are sensitive to physiological changes (Gajjar et al., 2012; McLaughlin et al., 2014). Notably, Raman spectroscopy has been successfully applied in differentiating healthy and diseased states, monitoring pharmacological effects, detecting stress biomarkers, and classifying cell types in both clinical and veterinary settings (Krafft and Sergo, 2006; Sikirzhytskaya et al., 2017; Baker et al., 2016; Chen et al., 2021; Pirutin et al., 2023). Its key advantage is its reliability with archived or field-collected samples, as it remains resilient to sample degradation and still preserves meaningful biochemical information over time (Yu et al., 2012; Ratna et al., 2023). Raman spectral profiles of serum and plasma are sensitive to disease state and physiological condition (Jackson et al., 2023). For example, hormonal variations such as changes in estrogen and testosterone levels can influence lipid and protein profiles in serum, resulting in detectable differences in the Raman spectra (Simpkins et al., 1997; Zárate et al., 2017; Silveira et al., 2017; Nieuwoudt et al., 2020; Zhang et al., 2022). 
To translate these complex spectral datasets into useful identification outputs, machine learning algorithms such as Partial Least Squares Discriminant Analysis (PLS-DA) are often employed.
Despite the growing applications of Raman spectroscopy in biomedicine and veterinary science, there is still a gap in its use with PLS-DA for sex identification in cervids using blood-derived fluids. PLS-DA is a supervised multivariate technique that transforms high-dimensional Raman spectral data into a smaller set of latent variables that maximize class separation, enabling effective pattern recognition and classification (Barker and Rayens, 2003; Lee et al., 2018; Zontov et al., 2020). Its ability to handle collinearity among spectral variables and perform well with limited sample sizes makes PLS-DA particularly well suited for exploratory studies in wildlife science, where datasets are often limited, heterogeneous, or partially incomplete. More recently, hybrid approaches that combine PLS-DA with artificial neural networks (ANNs) have further improved predictive performance, achieving high-accuracy species identification from blood (Takamura et al., 2019), sex classification from nails (Mitu et al., 2023), and disease detection from serum (Li et al., 2015). Multivariate machine-learning models integrating feces, plasma, and urine have also underscored their translational potential in clinical applications (Koopman et al., 2025).
White-tailed deer (O. virginianus) represent an ecologically and economically important cervid species in North America with implications for disease ecology, chronic wasting disease, population management, and environmental toxicology. Their well-documented physiological dimorphism, including seasonal hormonal fluctuations, metabolic adaptations, and immune function differences between sexes, makes them an ideal model for evaluating sex-specific biochemical patterns (Waid and Warren, 1984; Millspaugh et al., 2002; Potratz et al., 2019; Jackson et al., 2023). These characteristics provide a strong rationale for exploring non-invasive tools for sex identification. Despite their importance in wildlife health monitoring and forensic investigations, no previous study has investigated whether Raman spectral profiles obtained from blood-derived fluids such as serum or plasma can effectively distinguish sex in this species.
In this study, we investigated the potential of Raman spectroscopy combined with PLS-DA modeling to classify sex in white-tailed deer using serum and plasma samples. Our aim was to establish whether molecular-level sex differences in these biofluids could yield distinguishable spectral signatures suitable for supervised classification. We hypothesized that the biochemical variation between males and females would be sufficient to allow statistically significant discrimination using Raman spectral data. If validated, this approach could provide a valuable tool for field-based wildlife research, forensic wildlife investigations involving poaching cases or illegal trafficking, and veterinary diagnostics where traditional sex identification methods are not feasible, particularly when morphological evidence is degraded or when only biological fluid samples are available for analysis.
2 Materials and methods
This study was conducted in compliance with the Texas Tech University Institutional Animal Care and Use Committee (IACUC) under protocol number 14039–08. A total of ten white-tailed deer (O. virginianus), five males and five females ranging in age from 1 to 6 years, were used. The animals were kept in high-fenced pens with unrestricted access to food and water. To sedate them, we used a mix of Medetomidine HCl and Ketamine HCl, each at a dose of 2.5 mg/kg body weight, delivered via darting; sedation was reversed with atipamezole HCl administered intramuscularly. Blood was collected aseptically and stored at −80 °C for subsequent analysis.
Whole blood from white-tailed deer was collected into serum-separator tubes and centrifuged at 2000 rpm for 15 min. The resulting serum supernatant was transferred to 2 mL Eppendorf Safe-Lock tubes using Fisherbrand® Elite micropipettes and stored at −80 °C. For protein precipitation, 50 µL of serum was combined with 5 µL of an internal standard (1 μg mL-1 in methanol) and 195 µL of ice-cold acetonitrile, vortexed, and centrifuged at 10,000 rpm for 10 min at 4 °C. A 150 µL portion of the supernatant was passed through a 0.22 µm filter and diluted 1:1 with H2O. For each deer, two 100 µL serum aliquots and two 100 µL plasma aliquots were prepared, yielding 20 aliquots of each matrix across the study cohort. On the day of analysis, aliquots were thawed at room temperature and vortex-mixed for 15 s to ensure homogeneity. Plasma was processed in parallel using the same protocol. From every 100 µL aliquot, two 50 µL droplets were dispensed onto microscope slides pre-covered with aluminum foil. Slides were dried under vacuum (SP Bel-Art Lab Companion desiccator, −0.05 bar) for 1 h.
Raman spectral acquisition was performed using a ThermoFisher Scientific DXR3 Raman Microscope equipped with a 785 nm excitation laser. Software-driven calibration was carried out monthly using internal reference data. The laser parameters configured in the OMNIC software included a laser power of 30 mW, a slit aperture of 50 μm, an exposure time of 25 s, and 20 accumulations per sampling location. The spectral resolution was set at 2 cm-1, covering a Raman shift range from 399 to 3,299 cm-1, and the cosmic ray threshold was set to medium. For each droplet, Raman spectra were acquired from three distinct, non-overlapping locations, with three replicate spectra obtained at each location. This approach yielded a total of nine spectra per droplet, resulting in 18 spectra per aliquot. Consequently, each individual deer provided 36 spectra for serum and 36 spectra for plasma. Overall, the spectral dataset comprised 360 serum spectra and 360 plasma spectra (36 per deer for each biological matrix), amounting to 720 spectra for subsequent multivariate analysis. The microscope stage was carefully adjusted using the joystick and fine-focus knobs to target areas without cracks and to ensure optimal focus. Spectra were labeled systematically and archived on secure drives.
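The spectra counts stated above follow directly from the acquisition design. As a quick arithmetic check, the short Python sketch below multiplies out the design factors; the variable names are illustrative only and are not part of the acquisition software.

```python
# Acquisition design as described in the text; names are illustrative.
LOCATIONS_PER_DROPLET = 3      # distinct, non-overlapping locations
REPLICATES_PER_LOCATION = 3    # replicate spectra per location
DROPLETS_PER_ALIQUOT = 2       # two 50 µL droplets per 100 µL aliquot
ALIQUOTS_PER_DEER = 2          # per matrix (serum or plasma)
N_DEER = 10                    # five males, five females
MATRICES = ("serum", "plasma")

spectra_per_droplet = LOCATIONS_PER_DROPLET * REPLICATES_PER_LOCATION
spectra_per_aliquot = spectra_per_droplet * DROPLETS_PER_ALIQUOT
spectra_per_deer = spectra_per_aliquot * ALIQUOTS_PER_DEER   # per matrix
spectra_per_matrix = spectra_per_deer * N_DEER
total_spectra = spectra_per_matrix * len(MATRICES)

print(spectra_per_droplet, spectra_per_aliquot, spectra_per_deer,
      spectra_per_matrix, total_spectra)
```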
The OMNIC spectral (.SPC) files for serum and plasma were imported into MATLAB R2023b (version 23.2.0.2428915, MathWorks, Natick, MA, USA) and analyzed using the PLS Toolbox 9.3 (Eigenvector Research Inc., Manson, WA, USA). Raw Raman spectra were preprocessed to reduce fluorescence background and correct inter-sample variability. Baseline correction was applied using an Automatic Weighted Least Squares (AWLS) algorithm with a polynomial order of 4, followed by 1-norm area normalization and mean-centering to prepare the data for multivariate analysis. Spectral modeling and evaluation were conducted separately for serum and plasma datasets. Classification performance was evaluated using classification error rate and Matthews Correlation Coefficient (MCC). Additionally, prediction coefficient of determination (R2) and root mean square error of prediction (RMSEP) were calculated as PLS-DA generates continuous discriminant scores for each sample that are subsequently classified using a threshold, providing complementary measures of model prediction quality and class separation.
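The preprocessing chain described above (baseline correction, 1-norm area normalization, mean-centering) can be illustrated with a minimal Python/NumPy sketch. This is not the PLS Toolbox implementation used in the study: the baseline step below is a simplified iterative polynomial fit used as a stand-in for AWLS, the function names are hypothetical, and the spectra are synthetic.

```python
import numpy as np

def polynomial_baseline(shift, intensity, order=4, n_iter=50):
    """Simplified stand-in for AWLS: repeatedly fit a polynomial and clip
    the spectrum to the fit so that peaks stop pulling the baseline up."""
    # Rescale the Raman shift axis to [-1, 1] for a well-conditioned fit.
    x = 2.0 * (shift - shift.min()) / (shift.max() - shift.min()) - 1.0
    work = np.asarray(intensity, dtype=float).copy()
    fit = work
    for _ in range(n_iter):
        coeffs = np.polynomial.polynomial.polyfit(x, work, order)
        fit = np.polynomial.polynomial.polyval(x, coeffs)
        work = np.minimum(work, fit)
    return fit

def baseline_and_normalize(shift, spectra, order=4):
    """Subtract the baseline, then scale each spectrum to unit 1-norm."""
    corrected = np.array(
        [s - polynomial_baseline(shift, s, order) for s in spectra]
    )
    area = np.abs(corrected).sum(axis=1, keepdims=True)
    return corrected / area

def mean_center(calibration, validation):
    """Center both sets on the calibration mean spectrum."""
    mean_spectrum = calibration.mean(axis=0)
    return calibration - mean_spectrum, validation - mean_spectrum

# Synthetic demo: one Gaussian band near 1,003 cm-1 on a curved background.
rng = np.random.default_rng(0)
shift = np.linspace(399.0, 3299.0, 600)
peak = np.exp(-0.5 * ((shift - 1003.0) / 8.0) ** 2)
background = 1e-7 * (shift - 399.0) ** 2 + 0.2
spectra = np.array([
    a * peak + background + rng.normal(0.0, 0.005, shift.size)
    for a in (1.0, 1.2, 0.8, 1.1)
])
processed = baseline_and_normalize(shift, spectra)
cal_c, val_c = mean_center(processed[:3], processed[3:])
```

Centering the validation spectra on the calibration mean, rather than on their own mean, keeps the validation set independent of the model-building step.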
Partial Least Squares Discriminant Analysis (PLS-DA), a widely utilized supervised classification method in chemometric studies, was employed to model spectral variations associated with sex differences. This technique is particularly advantageous for analyzing high-dimensional spectroscopic data due to its effectiveness in reducing data dimensionality and capturing discriminative information relevant to class separation through latent variables (Barker and Rayens, 2003; Brereton and Lloyd, 2014). PLS-DA identifies combinations of spectral features that enhance separation between predefined groups by maximizing covariance between the spectral data and class labels, thus enabling clear visualization and interpretation of classification results (Ballabio and Consonni, 2013; Lee et al., 2018).
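To make the latent-variable idea concrete, the following Python/NumPy sketch implements a two-class PLS-DA as a PLS1 regression (NIPALS algorithm) on a 0/1 class label, followed by a decision threshold. It is an illustration under stated assumptions, not the PLS Toolbox model used here: the 0/1 coding, the function names, and the synthetic data are hypothetical, and the 0.5 threshold is taken from the classification threshold shown in the score plots.

```python
import numpy as np

def plsda_fit(X, y, n_components):
    """PLS1 via NIPALS on a 0/1 class label (two-class PLS-DA)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                    # latent variable scores
        tt = t @ t
        p = E.T @ t / tt             # X loadings
        q_a = (f @ t) / tt           # y loading
        E = E - np.outer(t, p)       # deflate X
        f = f - q_a * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in X space
    return {"coef": coef, "x_mean": x_mean, "y_mean": y_mean}

def plsda_predict(model, X, threshold=0.5):
    """Return continuous discriminant scores and thresholded class labels."""
    y_hat = (X - model["x_mean"]) @ model["coef"] + model["y_mean"]
    return y_hat, (y_hat >= threshold).astype(int)

# Synthetic two-class data: class 1 has extra intensity in ten variables.
rng = np.random.default_rng(1)
profile = np.zeros(50)
profile[10:20] = 1.0

def simulate(n_per_class):
    y = np.repeat([0.0, 1.0], n_per_class)
    X = rng.normal(0.0, 0.2, (y.size, profile.size)) + y[:, None] * profile
    return X, y

X_cal, y_cal = simulate(40)
X_val, y_val = simulate(20)
model = plsda_fit(X_cal, y_cal, n_components=2)
scores, predicted = plsda_predict(model, X_val)
val_accuracy = float((predicted == y_val).mean())
```

The continuous scores returned alongside the class labels are the quantity from which R2 and RMSEP are computed, which is why those regression-style metrics can be reported for a classifier.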
Model performance was evaluated using full internal cross-validation (Venetian blinds, 10 splits, blind thickness = 1) to ensure unbiased and systematic latent variable selection, with the optimal number of latent variables determined by minimizing the root mean square error of cross-validation (RMSECV) (Westerhuis et al., 2008). Given the limited sample size (N = 10 animals, five per sex), we employed multiple calibration-to-validation splits to assess model robustness and identify optimal training conditions. External predictive performance and model generalizability were evaluated using four calibration-to-validation splits (1:4, 2:3, 3:2, and 4:1), in which the dataset of ten individual animals (five males and five females) was partitioned by animal so that all spectra from a given individual were assigned exclusively to either the calibration or the validation set. This validation approach simulated realistic, independent testing scenarios, thus strengthening confidence in the robustness and forensic applicability of the developed PLS-DA models for sex classification using Raman spectral data (Virkler and Lednev, 2008; Sikirzhytski et al., 2010; Szymańska et al., 2011; Gromski et al., 2015). In each split, a balanced number of male and female animals was used for calibration, with the remainder reserved for prediction. The following calibration-to-validation configurations were implemented: (1) 1:4 split: 2 deer (one male, one female; 72 spectra) for calibration and 8 deer (four males, four females; 288 spectra) for validation; (2) 2:3 split: 4 deer (two males, two females; 144 spectra) for calibration and 6 deer (three males, three females; 216 spectra) for validation; (3) 3:2 split: 6 deer (three males, three females; 216 spectra) for calibration and 4 deer (two males, two females; 144 spectra) for validation; and (4) 4:1 split: 8 deer (four males, four females; 288 spectra) for calibration and 2 deer (one male, one female; 72 spectra) for validation.
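The partitioning logic can be sketched in a few lines of Python. The deer identifiers (M1 to M5, F1 to F5), the random seed, and the function names are hypothetical; the sketch only illustrates animal-wise splitting with equal numbers of each sex in calibration, and the usual Venetian blinds assignment in which, with blind thickness 1, sample i goes to fold i mod 10.

```python
import random

SPECTRA_PER_DEER = 36                      # per matrix, from the acquisition design
MALES = [f"M{i}" for i in range(1, 6)]     # hypothetical animal IDs
FEMALES = [f"F{i}" for i in range(1, 6)]

def split_by_animal(n_cal_per_sex, seed=0):
    """Assign whole animals to calibration or validation, balanced by sex."""
    rng = random.Random(seed)
    cal = rng.sample(MALES, n_cal_per_sex) + rng.sample(FEMALES, n_cal_per_sex)
    val = [deer for deer in MALES + FEMALES if deer not in cal]
    return cal, val

def venetian_blinds(n_samples, n_splits=10):
    """Blind thickness 1: sample i is left out in fold i mod n_splits."""
    folds = [[] for _ in range(n_splits)]
    for i in range(n_samples):
        folds[i % n_splits].append(i)
    return folds

# Spectra counts (calibration, validation) for the four configurations.
summary = {}
for label, k in {"1:4": 1, "2:3": 2, "3:2": 3, "4:1": 4}.items():
    cal, val = split_by_animal(k)
    summary[label] = (len(cal) * SPECTRA_PER_DEER, len(val) * SPECTRA_PER_DEER)

# Internal cross-validation folds for a 72-spectrum calibration set.
cv_folds = venetian_blinds(72, n_splits=10)
print(summary)
```

One design point worth keeping in mind: if spectra are stored in acquisition order, a blind thickness of 1 places consecutive replicate spectra from the same animal in different folds, so the internal cross-validation is less independent than the animal-wise external splits.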
After calibration, each model was used to predict class membership for samples in the respective validation set. This entire analytical workflow was identically applied to the plasma dataset, which also consisted of 360 spectra from the same 10 animals, using the same preprocessing steps and calibration-to-validation partitions.
Score plots were generated to visualize latent space clustering between male and female samples. Receiver Operating Characteristic (ROC) curve analysis, along with the Area Under the Curve (AUC) metric, was employed to provide a threshold-independent evaluation of model performance. The ROC curve plots sensitivity (true positive rate) against the false positive rate (1 − specificity) across classification thresholds, offering a robust framework for assessing binary classification accuracy (Fawcett, 2006; Zou et al., 2007; Ruiz-Perez et al., 2020). This is particularly valuable in forensic and wildlife datasets, where class boundaries may be subtle and misclassifications carry significant interpretive consequences. The AUC, defined as the integral of the ROC curve, summarizes overall model performance in a single scalar value ranging from 0.5 (no discrimination) to 1.0 (perfect discrimination), with values above 0.8 typically interpreted as excellent (Hanley and McNeil, 1982; Bradley, 1997). In the context of spectroscopic classification, AUC has been widely used to evaluate chemometric models in applications such as disease diagnosis, body fluid identification, and trace evidence analysis (Virkler and Lednev, 2008; Sikirzhytski et al., 2011).
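As a small illustration of how the ROC curve and AUC are obtained from continuous discriminant scores, the pure-Python sketch below sweeps the threshold over every observed score and integrates the curve with the trapezoidal rule. The four labels and scores are toy values chosen for the example, not data from this study, and the function names are hypothetical.

```python
def roc_curve_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc_trapezoid(points):
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy example: 1 = positive class, 0 = negative class.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
roc_points = roc_curve_points(labels, scores)
auc = auc_trapezoid(roc_points)
print(roc_points, auc)  # AUC is 0.75 for these toy values
```

Here three of the four positive-negative pairs are ranked correctly, which is exactly what an AUC of 0.75 expresses.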
To assess classification robustness across different chemometric approaches, Linear Discriminant Analysis (LDA) was implemented alongside PLS-DA. LDA is a supervised dimensionality reduction and classification method that finds linear combinations of features maximizing between-class separation while minimizing within-class variance. Unlike PLS-DA, which intrinsically handles high-dimensional data through latent variable extraction, LDA requires explicit dimensionality reduction via PCA when the number of features exceeds sample size, making PCA-LDA a standard two-step approach in vibrational spectroscopy.
3 Results
3.1 Raman spectral characteristics of serum and plasma
Because of sex-specific hormone patterns, variations in biomolecules are reflected in the Raman spectra of biofluids such as serum and plasma, enabling differentiation between sexes (Nieuwoudt et al., 2020). The mean Raman spectra of serum and plasma samples from white-tailed deer displayed remarkably similar profiles (Figure 1), consistent with their shared biological origin, although subtle intensity differences were noted. Both matrices exhibited characteristic Raman peaks corresponding to fundamental vibrational modes of key biomolecules such as proteins, lipids, and carbohydrates. Prominent and well-resolved peaks were observed at ∼1,003 cm-1 (phenylalanine ring breathing), ∼1,155 cm-1 (C–C stretching in carotenoids), ∼1,315–1,330 cm-1 (CH2 twisting in lipids), ∼1,445 cm-1 (CH2/CH3 bending), and ∼1,655 cm-1 (amide I) (Movasaghi et al., 2007; Rygula et al., 2013; Czamara et al., 2014; Atkins et al., 2017; Huang et al., 2019; Parachalil et al., 2020).
Figure 1. Mean pre-processed Raman spectra of serum (A) and plasma (B) samples from white-tailed deer. Red spectra correspond to female samples and green spectra to male samples, showing consistent biochemical band patterns with subtle sex-dependent variation.
Preprocessed mean serum spectra (Figure 1A) exhibited relatively sharper and more intense amide I (∼1,655 cm-1) and phenylalanine (∼1,003 cm-1) peaks compared to plasma spectra (Figure 1B). Although the full spectral range (399–3,299 cm-1) was used for multivariate analysis, Figure 1 displays only the biologically informative fingerprint region (399–1,800 cm-1) where sex-specific differences were observable. Subtle sex-specific trends were observed, with male spectra demonstrating relatively enhanced signals in the lipid-associated regions at ∼1,315–1,330 cm-1 (CH2 twisting) and ∼1,445 cm-1 (CH2/CH3 bending), indicative of elevated lipid content (Galli et al., 2017; Liu et al., 2018; Matsumoto et al., 2024). In contrast, female spectra displayed stronger signals in protein-associated regions at ∼1,220–1,280 cm-1 (Amide III), ∼755 cm-1, and ∼1,540–1,630 cm-1 (Amide I/II), suggesting higher contributions from protein vibrational modes. These regions overlap with characteristic hormone bands: testosterone at 786, 856, 1,490, and 1,636 cm-1 (Ondieki et al., 2023), and female hormones at 682–902 and 1,291–1,625 cm-1 (Ondieki et al., 2022), supporting hormonal contributions to sex-specific spectral patterns. These findings are consistent with broader evidence of sex-linked Raman variations in biological fluids (Nieuwoudt et al., 2020; Yin et al., 2025).
Principal component analysis (PCA), an unsupervised dimensionality reduction method, revealed modest sex-based clustering with substantial overlap between male and female samples for both matrices (Supplementary Figure S1 in Supplementary Material). Mean spectra with standard deviations (Supplementary Figure S1 insets) illustrate the subtle spectral differences underlying these clustering patterns. The limited natural separation in unsupervised PCA highlights the necessity of supervised multivariate classification approaches such as PLS-DA to effectively discriminate between sexes based on these subtle spectral differences.
3.2 Serum-based PLS-DA model performance
The serum dataset yielded a highly robust PLS-DA classification model for sex determination in white-tailed deer. Using six latent variables (LVs), the full model, trained on all 360 spectra, achieved 99.4% overall accuracy, with a total error rate of just 0.56%. Female spectra were classified with a true positive rate (TPR) of 1.00 and F1-score of 0.994, while male spectra were predicted with a TPR of 0.988 and F1-score of 0.994. The model’s Matthews Correlation Coefficient (MCC) reached 0.989, confirming high reliability. In the PLS-DA score plot of the full model, male and female samples were well separated, with tight intra-class clustering and no significant outliers, indicating strong discriminative performance.
External validation demonstrated the model’s adaptability under varying calibration-to-validation configurations. The 1:4 split, using only 72 spectra (2 deer: one male, one female) for calibration and 288 spectra (8 deer: four males, four females) for validation, resulted in reduced predictive performance, with overall accuracy declining to 79.2% (prediction error = 20.8%). The MCC dropped to 0.544, and score plots exhibited partial overlap between male and female clusters, particularly near the classification boundary (Figure 2A). Despite this, the ROC analysis still yielded an AUC of 0.854, confirming moderate discriminatory power under limited training conditions.
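The reported full-model serum metrics can be cross-checked from a confusion matrix. The counts below are inferred rather than reported: they assume 180 spectra per sex (five animals at 36 spectra each) and two misclassified male spectra, which is what a 0.56% error rate on 360 spectra with a female TPR of 1.00 implies. The Python sketch treats female as the positive class purely for illustration.

```python
import math

# Inferred confusion matrix (assumption, see lead-in); female = positive class.
tp = 180   # female spectra predicted female
fn = 0     # female spectra predicted male
fp = 2     # male spectra predicted female
tn = 178   # male spectra predicted male

accuracy = (tp + tn) / (tp + tn + fp + fn)
tpr_female = tp / (tp + fn)
tpr_male = tn / (tn + fp)

def f1_score(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

f1_female = f1_score(tp, fp, fn)
f1_male = f1_score(tn, fn, fp)   # roles swap when male is the class of interest

mcc = (tp * tn - fp * fn) / math.sqrt(
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
)

print(f"accuracy={accuracy:.3f}  F1(female)={f1_female:.3f}  "
      f"F1(male)={f1_male:.3f}  MCC={mcc:.3f}")
```

Under these assumed counts the accuracy (0.994), both F1-scores (0.994), and the MCC (0.989) agree with the values in the text; the male TPR evaluates to 178/180 ≈ 0.989, against the 0.988 reported.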
Figure 2. Partial Least Squares Discriminant Analysis (PLS-DA) score plots for sex classification of white-tailed deer using serum (A–D) and plasma (E–H) Raman spectra under different calibration-to-validation splits: 1:4 (A,E), 2:3 (B,F), 3:2 (C,G), and 4:1 (D,H). Red markers indicate female samples, green markers indicate male samples, and dashed lines represent the classification threshold (Y = 0.5).
In contrast, the 2:3 split, with 144 calibration spectra (4 deer: two males, two females) and 216 validation spectra (6 deer: three males, three females), greatly improved results. Overall accuracy rose to 99.1% (error = 0.93%), and the MCC reached 0.982. ROC curve analysis showed a steep gradient with an AUC of 0.998, highlighting robust, threshold-independent discrimination. The score plot revealed well-separated male and female clusters with minimal inter-class dispersion (Figure 2B).
The 3:2 split, with 216 calibration spectra (6 deer: three males, three females) and 144 validation spectra (4 deer: two males, two females), achieved comparable outcomes, with 99.3% accuracy (error = 0.69%). Male and female TPRs remained near perfect, and the reported MCC peaked at 1.00, indicating near-flawless classification. The ROC curve was sharply defined, and the score plot revealed complete separation of sex classes with symmetrical distribution and no significant misclassifications (Figure 2C).
Finally, the 4:1 split, using 288 calibration spectra (8 deer: four males, four females) and 72 validation spectra (2 deer: one male, one female), provided stable and generalizable performance. Overall accuracy remained high at 98.6% (error = 1.39%). Female and male spectra were both classified with high accuracy (TPRs of 0.986 and 1.00, respectively), and MCC stood at 0.973. The ROC curve for this configuration reached an AUC of 1.00, consistent with excellent discriminatory ability, and score plots maintained well-separated and tightly clustered groups (Figure 2D).
3.3 Plasma-based PLS-DA model performance
The PLS-DA model constructed from plasma spectra, also consisting of 360 spectra, performed well internally but showed more variable outcomes during external validation. Using five latent variables, the full model achieved 99.2% overall accuracy, with TPRs of 0.988 (female) and 1.000 (male), both classes attaining F1-scores above 0.994, and an MCC of 0.983. As with serum, the full model score plot demonstrated clear sex separation, although class boundaries appeared slightly less distinct.
Under the 1:4 split (72 calibration, 288 validation spectra), accuracy declined to 85.8% (error = 14.2%), with MCC = 0.716. Female and male spectra yielded TPRs of 0.899 and 0.882, with corresponding F1-scores of 0.854 and 0.861. The score plot revealed some inter-class overlaps (Figure 2E). The ROC analysis returned an AUC of 0.955, indicating that plasma retained good discriminatory power despite the reduced calibration data.
In the 2:3 split, overall accuracy dropped further to 82.4% (error = 17.6%), with MCC = 0.649. Male TPR was 0.796 (F1 = 0.819), while female TPR remained at 0.851 (F1 = 0.828). The ROC curve (AUC ≈0.918) showed modest inter-class separation, with noticeable overlaps in the score plot (Figure 2F).
In the 3:2 split, performance plateaued at 86.1% accuracy (error = 13.9%) with MCC = 0.631. Male classification was perfect (TPR = 1.000), while female prediction dropped substantially (TPR = 0.569). The ROC curve confirmed uneven classification confidence, and score plots revealed asymmetrical separation of classes (Figure 2G).
Unexpectedly, the 4:1 split (288 calibration, 72 validation spectra) produced the weakest plasma performance. Accuracy declined sharply to 57.0% (error = 43.0%), and MCC fell to 0.27. Despite the large calibration set, the model failed to generalize well, suggesting overfitting. The ROC curve returned an AUC of 0.8773, and the score plot displayed diffuse clustering with significant inter-class overlap (Figure 2H).
3.4 PCA-LDA based models
To assess the generalizability of our classification approach across different chemometric methods, Linear Discriminant Analysis (LDA) was performed alongside PLS-DA using the 2:3 calibration-to-validation split. From the ten white-tailed deer (five males, five females), four animals (two per sex) were randomly selected for model calibration (144 spectra), while the remaining six animals (three per sex) were reserved for external validation (216 spectra). Given the high-dimensional nature of Raman spectral data (3,007 wavenumbers) relative to sample size, Principal Component Analysis (PCA) was performed prior to LDA to prevent rank deficiency and ensure numerical stability. PCA was fit exclusively on the calibration set, and the number of principal components was determined by retaining those that cumulatively explained 95% of the spectral variance, following standard practice in chemometric classification. For serum, seven principal components were retained (95.0% variance explained), while plasma required 12 components (95.1% variance explained). The LDA model was then trained on the PCA-transformed calibration data and evaluated on the independently selected validation set. This PCA-LDA approach is widely established in vibrational spectroscopy and analogous to the dimensionality reduction inherent in PLS-DA, where latent variables serve a similar function to principal components.
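The two-step PCA-LDA procedure can be sketched as follows in Python/NumPy. The sketch mirrors the description above in that PCA is fit on the calibration set only and components are retained until 95% of the variance is explained. Everything else is an assumption made for illustration: the two-class Fisher discriminant with pooled covariance and equal priors, the function names, and the synthetic data.

```python
import numpy as np

def fit_pca_lda(X_cal, y_cal, variance_to_keep=0.95):
    """PCA on calibration data only, then a two-class Fisher LDA on the scores."""
    mean = X_cal.mean(axis=0)
    Xc = X_cal - mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest number of components whose cumulative variance reaches the target.
    k = min(int(np.searchsorted(explained, variance_to_keep)) + 1, len(s))
    loadings = Vt[:k].T
    T = Xc @ loadings
    mu0 = T[y_cal == 0].mean(axis=0)
    mu1 = T[y_cal == 1].mean(axis=0)
    centered = np.vstack([T[y_cal == 0] - mu0, T[y_cal == 1] - mu1])
    pooled_cov = centered.T @ centered / (len(T) - 2)
    w = np.linalg.pinv(pooled_cov) @ (mu1 - mu0)   # discriminant direction
    cutoff = w @ (mu0 + mu1) / 2.0                 # midpoint rule, equal priors
    return {"mean": mean, "loadings": loadings, "w": w,
            "cutoff": cutoff, "n_components": k}

def predict_pca_lda(model, X):
    """Project new spectra with the calibration PCA, then apply the LDA rule."""
    T = (X - model["mean"]) @ model["loadings"]
    return (T @ model["w"] > model["cutoff"]).astype(int)

# Synthetic two-class data standing in for calibration and validation spectra.
rng = np.random.default_rng(2)
profile = np.zeros(40)
profile[5:15] = 1.0

def simulate(n_per_class):
    y = np.repeat([0, 1], n_per_class)
    X = rng.normal(0.0, 0.1, (y.size, profile.size)) + y[:, None] * profile
    return X, y

X_cal, y_cal = simulate(30)
X_val, y_val = simulate(20)
pca_lda = fit_pca_lda(X_cal, y_cal)
val_accuracy = float((predict_pca_lda(pca_lda, X_val) == y_val).mean())
```

Reusing the calibration mean and loadings when projecting the validation spectra is what keeps the validation set out of the dimensionality-reduction step.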
Both methods demonstrated comparable classification performance on external validation, with PLS-DA achieving 99.1% accuracy (5 LVs) and LDA achieving 92.6% accuracy (7 PCs) for serum, while plasma showed 82.4% (PLS-DA, 3 LVs) and 82.9% (LDA, 12 PCs) accuracy. The consistency across both supervised methods validates the robustness of sex-specific spectral patterns, with serum demonstrating superior discriminatory capacity regardless of classification approach (Table 1).
Table 1. External validation performance comparison of PLS-DA and LDA for sex determination in deer serum and plasma (2:3 calibration-to-validation split, n = 10 animals). LDA was performed on PCA-transformed data. Both methods show consistent performance within each matrix, validating the robustness of sex-specific spectral signatures.
4 Discussion
This study demonstrates the feasibility and robustness of Raman spectroscopy combined with Partial Least Squares Discriminant Analysis (PLS-DA) for sex classification in white-tailed deer using serum and plasma samples. By capturing subtle, sex-specific biochemical signatures embedded in vibrational spectra, this approach offers a non-invasive and label-free identification tool that holds promise for wildlife forensic applications where traditional sexing methods are not applicable.
4.1 Biochemical discrimination in Raman spectra
Both serum and plasma spectra showed characteristic protein (∼1,003, ∼1,655 cm-1), lipid (∼1,300, ∼1,445 cm-1), and carotenoid (∼1,155, ∼1,525 cm-1) bands, consistent with previous mammalian studies (Movasaghi et al., 2007; Rygula et al., 2013; Czamara et al., 2014; Atkins et al., 2017; Huang et al., 2019; Parachalil et al., 2020). Female spectra displayed stronger protein-associated peaks, whereas males showed enhanced lipid vibrations, suggesting sex-linked metabolic differences. Comparable findings have been reported in other species: chicken embryos (Galli et al., 2016), chick blood (Matsumoto et al., 2024), and plasma lipidomics (Ishikawa et al., 2013). Similar protein–lipid contrasts were also observed in human cortical bone (Nieuwoudt et al., 2020) and fingernails (Yin et al., 2025). In serum and plasma, vibrational spectroscopy has enabled quantitative protein–metabolite assessment (Byrne et al., 2020) and high identification accuracy when paired with deep learning for endocrine, cancer, and kidney disorders (Chen et al., 2020; Chen et al., 2021; Yang et al., 2023). Collectively, vibrational spectroscopy of biological fluids has gained considerable attention in recent years, especially for protein analysis in human plasma and serum samples. Our approach builds on this foundation by applying similar techniques to wildlife sex determination, a novel application that demonstrates the broader potential of spectroscopic methods beyond traditional clinical diagnostics. This has particular relevance for forensic applications, where rapid, non-destructive identification methods would be valuable for wildlife trafficking investigations or conservation enforcement cases.
4.2 Serum vs. plasma: identification stability and performance
Our serum results showed an expected relationship between training set size and model performance. When we used only 72 spectra for training (1:4 split), the model struggled: classification errors rose to 20.8%, which was not surprising given the limited data. Performance improved markedly once we provided more balanced training sets. The 2:3, 3:2, and 4:1 splits all delivered consistent results, with classification errors at or below 1.4% and AUC values consistently exceeding 0.99. This was not a marginal improvement; error rates fell by a factor of roughly 15 or more. The progression from 20.8% error with limited training data to near-perfect classification with adequate training suggests that sex-related spectral differences in serum are profound and consistent across individuals, requiring only a small, balanced calibration set to achieve reliable classification. In contrast to serum, plasma models plateaued at 82%–86% accuracy across the splits with adequate test sets (1:4, 2:3, 3:2), with no improvement from additional calibration samples, suggesting that sex-related spectral signatures in plasma are inherently less distinct or more variable. The poor 4:1 performance (57.0%) probably reflects an unreliable performance estimate with only two test samples rather than model degradation. Serum consistently outperformed plasma across all classification metrics, including MCC, AUC, and prediction R2, particularly in external validation splits. The removal of fibrinogen and other clotting factors during serum preparation creates more stable and consistent profiles for spectral analysis (Bonifacio et al., 2014; Banerjee et al., 2022; Paul and Veenstra, 2022). Comparative metabolomics and spectroscopic studies further confirm that serum provides cleaner and more reproducible data than anticoagulant-treated plasma (Sotelo-Orozco et al., 2021; Vignoli et al., 2022).
Recent diagnostic applications using serum Raman and ATR-FTIR spectroscopy also achieved high classification accuracy across clinical contexts, including pituitary adenomas and gastric cancer, reinforcing serum’s value as a diagnostic substrate (Banerjee et al., 2022; Pang et al., 2024).
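The error rates and the fold reduction discussed for serum follow directly from the reported external accuracies. The short calculation below reproduces that arithmetic; it uses only the accuracies stated in this article, and the variable names are illustrative.

```python
# Reported serum external-validation accuracies (%) per calibration:validation split.
serum_accuracy = {"1:4": 79.2, "2:3": 99.1, "3:2": 99.3, "4:1": 98.6}

# Classification error is the complement of accuracy.
errors = {split: round(100 - acc, 1) for split, acc in serum_accuracy.items()}

# Fold reduction in error relative to the smallest calibration set (1:4).
baseline = errors["1:4"]
fold = {split: baseline / err for split, err in errors.items() if split != "1:4"}

for split in ("2:3", "3:2", "4:1"):
    print(f"{split}: error {errors[split]}%, "
          f"{fold[split]:.1f}-fold lower than the 1:4 split")
```

The smallest reduction comes from the 4:1 split (20.8% versus 1.4%, just under 15-fold); the 2:3 and 3:2 splits correspond to reductions of roughly 23-fold and 30-fold.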
In contrast, plasma, while rich in molecular content, showed greater variability, particularly under the 4:1 training split, reflecting unstable performance assessment with only two test samples (Atkins et al., 2017; Kralova et al., 2024). Although plasma’s molecular richness can support broader biomarker discovery (Byrne et al., 2020), its susceptibility to anticoagulant effects introduces spectral variability (Vignoli et al., 2022; Thachil et al., 2024; Denery et al., 2011). Nonetheless, under balanced calibration strategies, plasma achieved moderate predictive reliability, consistent with previous reports highlighting its identification potential when serum is unavailable (Hu et al., 2023; Dinesh et al., 2017; Doty et al., 2017; Doty et al., 2016; Muro and Lednev, 2017). Importantly, recent findings emphasize that plasma performance can be improved with careful preprocessing and algorithm selection (Kralova et al., 2024).
Taken together, these findings indicate that serum represents the more stable and reliable substrate for Raman-based sex classification in deer. Plasma, while inherently more variable, still offers moderate identification capacity under optimized calibration and preprocessing conditions. This dual perspective suggests that serum should be prioritized for high-confidence applications, whereas plasma may serve as a practical alternative in scenarios where serum samples are inaccessible.
While this proof-of-concept study achieved robust classification with systematic validation across multiple data partitions and algorithms, the sample size (ten animals) was limited by the availability of well-characterized wildlife specimens. Future validation with larger cohorts across different populations, age classes, and physiological states will be essential to confirm generalizability. Despite these constraints, the method-independent high performance of serum-based classification provides strong preliminary evidence for spectroscopic sex determination in wildlife forensics.
4.3 Sex-specific classification patterns
Male samples consistently achieved higher classification accuracy than female samples across all calibration schemes. Similar trends have been reported in Raman studies of saliva and bloodstains, where male spectra clustered more tightly while female samples showed greater dispersion, leading to higher misclassification rates (Muro et al., 2016a; Sikirzhytskaya et al., 2017). This disparity is often attributed to lower within-group variability in males, which show more stable testosterone-related protein profiles or stronger Raman-active lipid features, whereas female spectra are influenced by hormonal fluctuations and reproductive variability, contributing to broader metabolic heterogeneity (Joshi et al., 2024). These hormonal effects may be further modulated by seasonal reproductive cycles, which could introduce additional temporal variation in female spectral signatures. Future studies should investigate whether sampling across different seasons or reproductive stages affects classification accuracy and spectral dispersion, particularly in females, which could inform optimal sampling protocols for sex discrimination applications. The PLS-DA score plots in our study similarly showed tighter male clustering and wider female spread, reinforcing that sex-specific classification outcomes reflect both concentration and structural variation in biomolecules. This is illustrated in Figure 3, where PLS score plots constructed from the complete datasets visualize the overall clustering patterns for both serum (Figure 3A) and plasma (Figure 3B). Male samples (green) form compact clusters centered near the origin, while female samples (red) display greater dispersion. Despite the larger female scatter, clear separation between sexes is maintained with minimal overlap in both matrices, confirming that sex-specific biochemical signatures provide robust discrimination even in the presence of within-sex variability.
These findings are consistent with broader assertions that Raman spectra can capture complex biochemical and metabolic phenotypes (Kumar, 2012).
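One simple way to put a number on "tighter clustering" versus "greater dispersion" in a score plot is the mean Euclidean distance of each sample to its class centroid. The sketch below illustrates this with made-up three-dimensional score coordinates; neither the metric nor the values are taken from the study, which assessed clustering visually.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def mean_dispersion(points):
    """Mean Euclidean distance from each point to the class centroid."""
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)


# Hypothetical (LV1, LV2, LV3) scores, chosen only to illustrate the metric.
male_scores = [(0.1, 0.0, 0.1), (-0.1, 0.1, 0.0), (0.0, -0.1, -0.1)]
female_scores = [(1.0, 0.5, 0.0), (2.0, -0.5, 0.5), (1.5, 1.5, -1.0)]

print("male dispersion:  ", round(mean_dispersion(male_scores), 3))
print("female dispersion:", round(mean_dispersion(female_scores), 3))
```

A smaller value indicates a more compact class. Computing it per animal as well as per sex would help separate within-individual scatter from between-individual variability.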
Figure 3. PLS score plots of the first three latent variables for (A) serum and (B) plasma. Male samples (green) cluster more compactly near the origin, while female samples (red) show greater dispersion. Clear separation between sexes is maintained despite female heterogeneity. Models were built on the complete dataset (n = 360 spectra per matrix from 10 animals) for visualization.
4.4 Calibration strategies and model generalization
We systematically moved one female and one male between calibration and validation sets, allowing us to evaluate performance across different data partitioning scenarios while maintaining class balance, a critical consideration when working with small datasets where traditional single train-test splits can yield unreliable performance estimates. Our results show that balanced calibration-to-validation splits of serum spectra (2:3, 3:2) yielded the most consistent results, with higher generalization and class separation. The 1:4 split was limited by insufficient training data. In plasma, the weak 4:1 result probably reflects higher biological variability combined with an external validation set of only two samples. Given the limited sample size in our study and the high dimensionality of Raman spectra, proper validation is critical. Gromski et al. (2015) emphasize that PLS-DA requires rigorous statistical validation to avoid misleading results, especially when variables outnumber samples. Our systematic evaluation of multiple calibration-to-validation splits (1:4, 2:3, 3:2, 4:1) addresses this concern by assessing model outcomes across different partitioning scenarios rather than relying on a single train-test split. For small datasets, leave-one-out or leave-multiple-out cross-validation is often preferred over a single train-test split. Cross-validation studies further emphasize that predictive reliability depends on both dataset size and balanced class representation (Westerhuis et al., 2008). More broadly, metabolomics research cautions that poorly designed calibration and validation schemes can inflate accuracy while undermining true predictive power, underscoring the need for careful validation to ensure robust and generalizable PLS-DA models (Worley and Powers, 2016).
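The partitioning schemes discussed here can be made concrete with a short sketch. The first part computes the calibration and validation sizes for each split, assuming 36 spectra per animal (inferred from the reported totals). The second part enumerates a sex-balanced leave-one-pair-out scheme as one possible form of the leave-multiple-out validation mentioned above; it is an illustrative alternative, not the procedure used in this study, and the animal IDs are hypothetical.

```python
from itertools import product

SPECTRA_PER_ANIMAL = 36  # inferred: 360 spectra per matrix / 10 animals
ANIMALS_PER_SEX = 5


def split_sizes(n_cal_per_sex):
    """Spectra counts (calibration, validation) for a sex-balanced split."""
    n_val_per_sex = ANIMALS_PER_SEX - n_cal_per_sex
    cal = 2 * n_cal_per_sex * SPECTRA_PER_ANIMAL
    val = 2 * n_val_per_sex * SPECTRA_PER_ANIMAL
    return cal, val


# Splits are labelled calibration:validation in animals per sex.
sizes = {f"{k}:{ANIMALS_PER_SEX - k}": split_sizes(k) for k in range(1, 5)}
for label, (cal, val) in sizes.items():
    print(f"{label}: {cal} calibration spectra, {val} validation spectra")


def leave_one_pair_out(males, females):
    """Yield (train_ids, test_ids), holding out one male and one female
    per fold so that every test set stays balanced by sex."""
    for m, f in product(males, females):
        train = [a for a in males + females if a not in (m, f)]
        yield train, [m, f]


# Hypothetical animal IDs.
males = [f"M{i}" for i in range(1, 6)]
females = [f"F{i}" for i in range(1, 6)]
folds = list(leave_one_pair_out(males, females))
print(len(folds), "folds, each with", len(folds[0][0]), "training animals")
```

Averaging performance over all 25 folds would give every animal a turn in the test set, which reduces the dependence of the estimate on which two animals happen to be held out in a single 4:1 split.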
4.5 Forensic and field applications
This approach offers significant advantages in wildlife forensic science, particularly in cases involving juvenile, decomposed, or fragmented remains where traditional sexing is infeasible. Raman spectroscopy enables rapid, non-destructive analysis with minimal sample volume, and is increasingly compatible with portable instrumentation (Virkler and Lednev, 2008; Mistek et al., 2016; Muro and Lednev, 2016; Muro et al., 2016b). The success of portable Raman systems in discriminating human and non-human blood (Fujihara et al., 2017) and their growing use in clinical diagnostics (Yang et al., 2023; Hu et al., 2023; Kralova et al., 2024) underline their readiness for deployment in wildlife and conservation contexts. With portable Raman systems, investigators could analyze dried blood spots on-site, providing immediate results that could influence charging decisions, which is especially important for CITES species where sex-specific regulations apply.
5 Conclusion
This study confirms that Raman spectroscopy, combined with chemometric analysis via PLS-DA, can reliably classify sex in white-tailed deer using both serum and plasma samples. The models developed achieved high classification performance, with serum consistently outperforming plasma in terms of predictive accuracy, robustness, and generalizability, particularly under cross-validated and externally validated conditions. Key vibrational features associated with proteins, lipids, and carbohydrates enabled reliable sex differentiation, even in the presence of inter-individual variability. The findings emphasize the importance of matrix selection and calibration strategy when developing chemometric models for forensic and wildlife applications. Serum emerged as the preferred substrate due to its reduced biochemical complexity and lower spectral variability. Nevertheless, plasma retained acceptable classification potential and may serve as a complementary or alternative matrix under certain conditions.
From a forensic science perspective, this methodology offers a rapid, non-destructive, and reagent-free approach to sex identification that can be extended to field applications and species conservation programs. As Raman instrumentation becomes more portable and data processing pipelines more automated, this strategy holds promise for routine deployment in wildlife management, ecological monitoring, and legal enforcement of sex-specific hunting regulations. Future work should explore the integration of advanced spectral feature selection, larger population datasets, and multi-species validation to further enhance the robustness and utility of vibrational spectroscopy in forensic wildlife science.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The animal study was approved by Texas Tech University Institutional Animal Care and Use Committee (IACUC) under protocol number 14039–08. The study was conducted in accordance with the local legislation and institutional requirements.
Author contributions
MM: Data curation, Formal Analysis, Investigation, Software, Writing – original draft. ES: Funding acquisition, Resources, Writing – review and editing. LH: Conceptualization, Investigation, Project administration, Supervision, Writing – review and editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This work received funding from the Deer Breeder’s Association, a not-for-profit organization.
Acknowledgements
We thank Texas Tech University for technical support.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was used in the creation of this manuscript. English language editing was performed using Grammarly. The authors take full responsibility for the final content.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/frans.2025.1727520/full#supplementary-material
References
Atkins, C. G., Buckley, K., Blades, M. W., and Turner, R. F. B. (2017). Raman spectroscopy of blood and blood components. Appl. Spectrosc. 71 (5), 767–793. doi:10.1177/0003702816686593
Baker, M. J., Hussain, S. R., Lovergne, L., Untereiner, V., Hughes, C., Lukaszewski, R. A., et al. (2016). Developing and understanding biofluid vibrational spectroscopy: a critical review. Chem. Soc. Rev. 45 (7), 1803–1818. doi:10.1039/c5cs00585j
Ballabio, D., and Consonni, V. (2013). Classification tools in chemistry. Part 1: linear models. PLS-DA. Anal. Methods 5 (16), 3790. doi:10.1039/c3ay40582f
Banerjee, A., Halder, A., Jadhav, P., Bankar, R., Pattarkine, J., Hole, A., et al. (2022). Metabolomics profiling of pituitary adenomas by Raman spectroscopy, attenuated total reflection-Fourier transform infrared spectroscopy, and mass spectrometry of serum samples. Anal. Chem. 94 (34), 11898–11907. doi:10.1021/acs.analchem.2c02487
Barker, M., and Rayens, W. (2003). Partial least squares for discrimination. J. Chemom. 17 (3), 166–173. doi:10.1002/cem.785
Ben Larbi, M., Tircazes, A., Feve, K., Tudela, F., and Bolet, G. (2012). Reliability of non-invasive tissue sampling methods for DNA extraction in rabbits (Oryctolagus cuniculus). World Rabbit Sci. 20 (2). doi:10.4995/wrs.2012.1077
Bhoyar, L., Mehar, P., and Chavali, K. (2024). An overview of DNA degradation and its implications in forensic caseworks. Egypt. J. Forensic Sci. 14 (1), 15. doi:10.1186/s41935-024-00389-y
Bonifacio, A., Marta, S. D., Riccardo, S., Cervo, S., Steffan, A., Colombatti, A., et al. (2014). Surface-enhanced Raman spectroscopy of blood plasma and serum using Ag and Au nanoparticles: a systematic study. Anal. Bioanal. Chem. 406 (9-10), 2355–2365. doi:10.1007/s00216-014-7622-1
Bradley, A. P. (1997). The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 30 (7), 1145–1159. doi:10.1016/s0031-3203(96)00142-2
Brereton, R. G., and Lloyd, G. R. (2014). Partial least squares discriminant analysis: taking the magic away. J. Chemom. 28 (4), 213–225. doi:10.1002/cem.2609
Brinkman, T. J., and Hundertmark, K. J. (2008). Sex identification of northern ungulates using low quality and quantity DNA. Conserv. Genet. 10 (4), 1189–1193. doi:10.1007/s10592-008-9747-2
Byrne, H. J., Bonnier, F., McIntyre, J., and Parachalil, D. R. (2020). Quantitative analysis of human blood serum using vibrational spectroscopy. Clin. Spectrosc. 2, 100004. doi:10.1016/j.clispe.2020.100004
Carota, A. G., Campanella, B., Carratore, R. D., Bongioanni, P., Giannelli, R., and Legnaioli, S. (2022). Raman spectroscopy and multivariate analysis as potential tool to follow Alzheimer’s disease progression. Anal. Bioanal. Chem. 414 (16), 4667–4675. doi:10.1007/s00216-022-04087-3
Chen, H., Chen, C., Wang, H., Chen, C., Guo, Z., Tong, D., et al. (2020). Serum Raman spectroscopy combined with a multi-feature fusion convolutional neural network diagnosing thyroid dysfunction. Optik 216, 164961. doi:10.1016/j.ijleo.2020.164961
Chen, C., Wu, W., Chen, C., Chen, F., Dong, X., Ma, M., et al. (2021). Rapid diagnosis of lung cancer and glioma based on serum Raman spectroscopy combined with deep learning. J. Raman Spectrosc. 52 (11), 1798–1809. doi:10.1002/jrs.6224
Czamara, K., Majzner, K., Pacia, M. Z., Kochan, K., Kaczor, A., and Baranska, M. (2014). Raman spectroscopy of lipids: a review. J. Raman Spectrosc. 46 (1), 4–20. doi:10.1002/jrs.4607
Deagle, B. E., Eveson, J. P., and Jarman, S. N. (2006). Quantification of damage in DNA recovered from highly degraded samples – a case study on DNA in faeces. Front. Zoology 3 (1), 11. doi:10.1186/1742-9994-3-11
Denery, J. R., Nunes, A. A. K., and Dickerson, T. J. (2011). Characterization of differences between blood sample matrices in untargeted metabolomics. Anal. Chem. 83 (3), 1040–1047. doi:10.1021/ac102806p
Dinesh, M., Maguire, A., Bryant, J., Armstrong, J. G., Dunne, M., Finn, M., et al. (2017). Development of a high throughput (HT) Raman spectroscopy method for rapid screening of liquid blood plasma from prostate cancer patients. Analyst 142 (8), 1216–1226. doi:10.1039/c6an02100j
Doty, K. C., McLaughlin, G., and Lednev, I. K. (2016). A Raman ‘spectroscopic clock’ for bloodstain age determination: the first week after deposition. Anal. Bioanal. Chem. 408 (15), 3993–4001. doi:10.1007/s00216-016-9486-z
Doty, K. C., Muro, C. K., and Lednev, I. K. (2017). Predicting the time of the crime: bloodstain aging estimation for up to two years. Forensic Chem. 5, 1–7. doi:10.1016/j.forc.2017.05.002
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognit. Lett. 27 (8), 861–874. doi:10.1016/j.patrec.2005.10.010
Fujihara, J., Fujita, Y., Yamamoto, T., Nishimoto, N., Kimura-Kataoka, K., Kurata, S., et al. (2017). Blood identification and discrimination between human and nonhuman blood using portable Raman spectroscopy. Int. J. Leg. Med. 131 (2), 319–322. doi:10.1007/s00414-016-1396-2
Gajjar, K., Heppenstall, L. D., Pang, W., Ashton, K. M., Trevisan, J., Patel, I. I., et al. (2012). Diagnostic segregation of human brain tumours using Fourier-transform infrared and/or Raman spectroscopy coupled with discriminant analysis. Anal. Methods 5 (1), 89–102. doi:10.1039/C2AY25544H
Galli, R., Preusse, G., Uckermann, O., Bartels, T., Krautwald-Junghanns, M.-E., Koch, E., et al. (2016). In ovo sexing of domestic chicken eggs by Raman spectroscopy. Anal. Chem. 88 (17), 8657–8663. doi:10.1021/acs.analchem.6b01868
Galli, R., Koch, E., Preusse, G., Schnabel, C., Bartels, T., Krautwald-Junghanns, M.-E., et al. (2017). Contactless in ovo sex determination of chicken eggs. Curr. Dir. Biomed. Eng. 3 (2), 131–134. doi:10.1515/cdbme-2017-0027
Goro, H., Naito, S., Erina, N., Ueda, Y., Sato, Y., Pastrana, J. A., et al. (2017). Morphometric and genetic determination of age class and sex for fecal pellets of Sika deer (Cervus nippon). Mammal. Study 42 (4), 1–8. doi:10.3106/041.042.0406
Gromski, P. S., Muhamadali, H., Ellis, D. I., Xu, Y., Correa, E., Turner, M. L., et al. (2015). A tutorial review: Metabolomics and partial least squares-discriminant analysis – a marriage of convenience or a shotgun wedding. Anal. Chim. Acta 879, 10–23. doi:10.1016/j.aca.2015.02.012
Han, S.-H., Lee, S.-S., Cho, I.-C., Oh, M.-Y., and Oh, H.-S. (2009). Species identification and sex determination of Korean water deer (Hydropotes inermis argyropus) by Duplex PCR. J. Appl. Animal Res. 35 (1), 61–66. doi:10.1080/09712119.2009.9706986
Hanley, J. A., and McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143 (1), 29–36. doi:10.1148/radiology.143.1.7063747
Hedmark, E., Flagstad, Ø., Segerström, P., Persson, J., Landa, A., and Ellegren, H. (2004). DNA-Based individual and sex identification from wolverine (Gulo gulo) faeces and urine. Conserv. Genet. 5 (3), 405–410. doi:10.1023/b:coge.0000031224.88778.f5
Hu, D., Wang, J., Cheng, T., Li, H., Zhang, F., Zhao, D., et al. (2023). Comparative analysis of serum and saliva samples using Raman spectroscopy: a high-throughput investigation in patients with polycystic ovary syndrome and periodontitis. BMC Women’s Health 23 (1), 522. doi:10.1186/s12905-023-02663-y
Huang, S., Wang, P., Tian, Y., Bai, P., Chen, D., Wang, C., et al. (2019). Blood species identification based on deep learning analysis of Raman spectra. Biomed. Opt. Express 10 (12), 6129–6144. doi:10.1364/boe.10.006129
Huber, S., Bruns, U., and Arnold, W. (2002). Sex determination of red deer using polymerase chain reaction of DNA from feces. Wildl. Soc. Bull., 1973–2006. Available online at: https://www.jstor.org/stable/3784655.
Ishikawa, M., Tajima, Y., Murayama, M., Senoo, Y., Maekawa, K., and Saito, Y. (2013). Plasma and serum from nonfasting men and women differ in their lipidomic profiles. Biol. and Pharmaceutical Bulletin 36 (4), 682–685. doi:10.1248/bpb.b12-00799
Jackson, N., Hassan, J., and Byrne, H. J. (2023). Raman spectroscopic analysis of human serum samples of convalescing COVID-19 positive patients. Clin. Spectrosc. 5, 100028. doi:10.1016/j.clispe.2023.100028
Joshi, R., Goswami, D., Saha, P., Hole, A., Mandhare, P., Wadke, R., et al. (2024). Serum Raman spectroscopy: unearthing the snapshot of distinct metabolic profile in patients with congenital heart defects (CHDs). Heliyon 10 (16), e34575. doi:10.1016/j.heliyon.2024.e34575
Kerr, K. D. (1986). Sex identification of white-tailed deer using frozen venison. J. Forensic Sci. 31 (3), 11120–11122. doi:10.1520/jfs11120j
Khorozyan, I. (2025). Conservation implications of sex-specific daily movements of leopards: a global perspective. Biol. Conserv. 302, 110928. doi:10.1016/j.biocon.2024.110928
Koopman, N., Jaspers, Y., van Leeuwen, P. T., Chronas, K., Li Yim, A. Y. F., Diederen, K., et al. (2025). Integrated multi-omics of feces, plasma and urine can describe and differentiate pediatric active Crohn’s Disease from remission. Commun. Med. 5 (1), 281. doi:10.1038/s43856-025-00984-7
Krafft, C., and Sergo, V. (2006). Biomedical applications of Raman and infrared spectroscopy to diagnose tissues. Spectroscopy 20 (5-6), 195–218. doi:10.1155/2006/738186
Kralova, K., Kral, M., Vrtelka, O., and Setnicka, V. (2024). Comparative study of Raman spectroscopy techniques in blood plasma-based clinical diagnostics: a demonstration on Alzheimer’s disease. Spectrochimica Acta Part A Mol. Biomol. Spectrosc. 304, 123392. doi:10.1016/j.saa.2023.123392
Kumar, C. S. S. R. (2012). Raman spectroscopy for nanomaterials characterization. Berlin: Springer. doi:10.1007/978-3-642-20620-7
Leal, L. B., Nogueira, M. S., Canevari, R. A., and Carvalho, L. F. C. S. (2018). Vibration spectroscopy and body biofluids: literature review for clinical applications. Photodiagnosis Photodyn. Ther. 24, 237–244. doi:10.1016/j.pdpdt.2018.09.008
Lednev, I. (2012). The author(s) shown below used federal funds provided by the U.S. Department of Justice and prepared the following final report: document title: application of Raman Spectroscopy for an Easy-to-Use, on-Field, Rapid, Nondestructive, Confirmatory Identification of Body Fluids. Available online at: https://www.ojp.gov/pdffiles1/nij/grants/239079.pdf.
Lee, L. C., Liong, C.-Y., and Jemain, A. A. (2018). Partial least squares-discriminant analysis (PLS-DA) for classification of high-dimensional (HD) data: a review of contemporary practice strategies and knowledge gaps. Analyst 143 (15), 3526–3539. doi:10.1039/C8AN00599K
Lemaître, J.-F., Ronget, V., Tidière, M., Allainé, D., Berger, V., Cohas, A., et al. (2020). Sex differences in adult lifespan and aging rates of mortality across wild mammals. Proc. Natl. Acad. Sci. 117 (15), 8546–8553. doi:10.1073/pnas.1911999117
Li, X., Yang, T., Li, S., Jin, L., Wang, D., Guan, D., et al. (2015). Noninvasive liver diseases detection based on serum surface enhanced Raman spectroscopy and statistical analysis. Opt. Express 23 (14), 18361–18372. doi:10.1364/oe.23.018361
Liu, Y., Chen, Y., Zhang, Y., Kou, Q., Zhang, Y., Wang, Y., et al. (2018). Detection and identification of estrogen based on surface-enhanced resonance Raman scattering (SERRS). Molecules 23 (6), 1330. doi:10.3390/molecules23061330
Lutshumba, J., Wilcock, D., Monson, N. L., and Stowe, A. M. (2023). Sex-based differences in effector cells of the adaptive immune system during Alzheimer’s disease and related dementias. Neurobiol. Disease 184, 106202. doi:10.1016/j.nbd.2023.106202
Mariela, G., Laura, C., and Belant, J. L. (2020). Planning for carnivore recolonization by mapping sex-specific landscape connectivity. Glob. Ecol. Conservation 21, e00869. doi:10.1016/j.gecco.2019.e00869
Matsumoto, S., Ogino, A., Onoe, K., Ukon, J., and Ishigaki, M. (2024). Chick sexing based on the blood analysis using Raman spectroscopy. Sci. Rep. 14 (1), 15999. doi:10.1038/s41598-024-65998-y
McLaughlin, G., Doty, K. C., and Lednev, I. K. (2014). Raman spectroscopy of blood for species identification. Anal. Chem. 86 (23), 11628–11633. doi:10.1021/ac5026368
Millspaugh, J. J., Washburn, B. E., Milanick, M. A., Beringer, J., Hansen, L. P., and Meyer, T. M. (2002). Non-invasive techniques for stress assessment in white-tailed deer. Wildl. Soc. Bull. 30 (3), 899–907. doi:10.2307/3784245
Mistek, E., Halámková, L., Doty, K. C., Muro, C. K., and Lednev, I. K. (2016). Race differentiation by Raman spectroscopy of a bloodstain for forensic purposes. Anal. Chem. 88 (15), 7453–7456. doi:10.1021/acs.analchem.6b01173
Mitu, B., Trojan, V., and Halámková, L. (2023). Sex determination of human nails based on attenuated total reflection Fourier transform infrared spectroscopy in forensic context. Sensors 23 (23), 9412. doi:10.3390/s23239412
Molbert, N., Hamid, R. G., Johansson, T., Mostadius, M., and Hansson, M. (2023). An evaluation of DNA extraction methods on historical and roadkill Mammalian specimen. Sci. Rep. 13 (1), 13080. doi:10.1038/s41598-023-39465-z
Morin, P. A., Chambers, K. E., Boesch, C., and Vigilant, L. (2001). Quantitative polymerase chain reaction analysis of DNA from noninvasive samples for accurate microsatellite genotyping of wild chimpanzees (Pan troglodytes verus). Mol. Ecol. 10 (7), 1835–1844. doi:10.1046/j.0962-1083.2001.01308.x
Movasaghi, Z., Rehman, S., and Rehman, I. U. (2007). Raman spectroscopy of biological tissues. Appl. Spectrosc. Rev. 42 (5), 493–541. doi:10.1080/05704920701551530
Muro, C. K., and Lednev, I. K. (2016). Identification of individual red blood cells by Raman microspectroscopy for forensic purposes: in search of a limit of detection. Anal. Bioanal. Chem. 409 (1), 287–293. doi:10.1007/s00216-016-0002-2
Muro, C. K., and Lednev, I. K. (2017). Race differentiation based on Raman spectroscopy of semen traces for forensic purposes. Anal. Chem. 89 (8), 4344–4348. doi:10.1021/acs.analchem.7b00106
Muro, C. K., de Souza Fernandes, L., and Lednev, I. K. (2016a). Sex determination based on Raman spectroscopy of saliva traces for forensic purposes. Anal. Chem. 88 (24), 12489–12493. doi:10.1021/acs.analchem.6b03988
Muro, C. K., Doty, K. C., de Souza Fernandes, L., and Lednev, I. K. (2016b). Forensic body fluid identification and differentiation by Raman spectroscopy. Forensic Chem. 1 (1), 31–38. doi:10.1016/j.forc.2016.06.003
Nieuwoudt, M. K., Rayomand, S., Patel, R., Holtkamp, H., Aguergaray, C., Watson, M., et al. (2020). Raman spectroscopy reveals age- and sex-related differences in cortical bone from people with osteoarthritis. Sci. Rep. 10 (1), 19443. doi:10.1038/s41598-020-76337-2
Ondieki, A. M., Birech, Z., Kaduki, K. A., Kaingu, C. K., Ndeke, A. N., and Namanya, L. (2022). Biomarker Raman bands of estradiol, follicle-stimulating, luteinizing, and progesterone hormones in blood. Vib. Spectrosc. 118, 103342. doi:10.1016/j.vibspec.2022.103342
Ondieki, A. M., Birech, Z., Kaduki, K. A., Mwangi, P. W., Juma, M., and Chege, B. M. (2023). Label-free assaying of testosterone and growth hormones in blood using surface-enhanced Raman spectroscopy. Vib. Spectrosc. 129, 103605. doi:10.1016/j.vibspec.2023.103605
Pang, N., Yang, W., Yang, G., Yang, C., Tong, K., Yu, R., et al. (2024). The utilization of blood serum ATR-FTIR spectroscopy for the identification of gastric cancer. Discov. Oncol. 15 (1), 350. doi:10.1007/s12672-024-01231-6
Parachalil, D. R., McIntyre, J., and Byrne, H. J. (2020). Potential of Raman spectroscopy for the analysis of plasma/serum in the liquid state: recent advances. Anal. Bioanal. Chem. 412 (9), 1993–2007. doi:10.1007/s00216-019-02349-1
Paul, J., and Veenstra, T. D. (2022). Separation of serum and plasma proteins for In-Depth proteomic analysis. Separations 9 (4), 89. doi:10.3390/separations9040089
Pirutin, S. K., Jia, S., Yusipovich, A. I., Shank, M. A., Parshina, E. Y., and Rubin, A. B. (2023). Vibrational spectroscopy as a tool for bioanalytical and biomonitoring studies. Int. J. Mol. Sci. 24 (8), 6947. doi:10.3390/ijms24086947
Potratz, E. J., Brown, J. S., Gallo, T., Anchor, C., and Santymire, R. M. (2019). Effects of demography and urbanization on stress and body condition in urban white-tailed deer. Urban Ecosyst. 22 (5), 807–816. doi:10.1007/s11252-019-00856-8
Ratna, G., Edwards, K. L., Chiarelli, T. L., Fanson, K. V., Ganswindt, A., Keeley, T., et al. (2023). Biomarkers of reproductive health in wildlife and techniques for their assessment. Theriogenology Wild 3, 100052. doi:10.1016/j.therwi.2023.100052
Ruiz-Perez, D., Guan, H., Madhivanan, P., Mathee, K., and Narasimhan, G. (2020). So you think you can PLS-DA? BMC Bioinforma. 21 (S1), 2. doi:10.1186/s12859-019-3310-7
Rygula, A., Majzner, K., Marzec, K. M., Kaczor, A., Pilarczyk, M., and Baranska, M. (2013). Raman spectroscopy of proteins: a review. J. Raman Spectrosc. 44 (8), 1061–1076. doi:10.1002/jrs.4335
Sibeaux, A., Michel, C. L., Bonnet, X., Caron, S., Fournière, K., Gagno, S., et al. (2016). Sex-specific ecophysiological responses to environmental fluctuations of free-ranging Hermann’s tortoises: implication for conservation. Conserv. Physiol. 4 (1), cow054. doi:10.1093/conphys/cow054
Sikirzhytskaya, A., Sikirzhytski, V., and Lednev, I. K. (2017). Determining gender by Raman spectroscopy of a bloodstain. Anal. Chem. 89 (3), 1486–1492. doi:10.1021/acs.analchem.6b02986
Sikirzhytski, V., Virkler, K., and Lednev, I. K. (2010). Discriminant analysis of Raman spectra for body fluid identification for forensic purposes. Sensors 10 (4), 2869–2884. doi:10.3390/s100402869
Sikirzhytski, V., Sikirzhytskaya, A., and Lednev, I. K. (2011). Multidimensional Raman spectroscopic signatures as a tool for forensic identification of body fluid traces: a review. Appl. Spectrosc. 65 (11), 1223–1232. doi:10.1366/11-06455
Silveira, L., Rita, R. N. N., Enrique Giana, H., Zângaro, R. A., Tadeu, M., Fernandes, A. B., et al. (2017). Quantifying glucose and lipid components in human serum by Raman spectroscopy and multivariate statistics. Lasers Med. Sci. 32 (4), 787–795. doi:10.1007/s10103-017-2173-2
Simpkins, J. W., Green, P. S., Gridley, K. E., Singh, M., de Fiebre, N. C., and Rajakumar, G. (1997). Role of estrogen replacement therapy in memory enhancement and the prevention of neuronal loss associated with Alzheimer’s disease. Am. J. Med. 103 (3), 19S–25S. doi:10.1016/S0002-9343(97)00260-X
Slezak, C. R., Masse, R. J., and McWilliams, S. R. (2023). Sex-specific differences and long-term trends in habitat selection of American woodcock. J. Wildl. Manag. 88 (2), e22518. doi:10.1002/jwmg.22518
Sotelo-Orozco, J., Chen, S.-Y., Hertz-Picciotto, I., and Slupsky, C. M. (2021). A comparison of serum and plasma blood collection tubes for the integration of epidemiological and metabolomics data. Front. Mol. Biosci. 8, 682134. doi:10.3389/fmolb.2021.682134
Szymańska, E., Saccenti, E., Smilde, A. K., and Westerhuis, J. A. (2011). Double-check: validation of diagnostic statistics for PLS-DA models in metabolomics studies. Metabolomics 8 (S1), 3–16. doi:10.1007/s11306-011-0330-3
Takahashi, T., Ellingson, M. K., Wong, P., Israelow, B., Lucas, C., Klein, J., et al. (2020). Sex differences in immune responses that underlie COVID-19 disease outcomes. Nature 588, 1–6. doi:10.1038/s41586-020-2700-3
Takamura, A., Halamkova, L., Ozawa, T., and Lednev, I. K. (2019). Phenotype profiling for forensic purposes: determining donor sex based on Fourier transform infrared spectroscopy of urine traces. Anal. Chem. 91 (9), 6288–6295. doi:10.1021/acs.analchem.9b01058
Thachil, A., Wang, L., Mandal, R., Wishart, D., and Blydt-Hansen, T. (2024). An overview of pre-analytical factors impacting metabolomics analyses of blood samples. Metabolites 14 (9), 474. doi:10.3390/metabo14090474
Vignoli, A., Tenori, L., Morsiani, C., Turano, P., Capri, M., and Luchinat, C. (2022). Serum or plasma (and which plasma), that is the question. J. Proteome Res. 21 (4), 1061–1072. doi:10.1021/acs.jproteome.1c00935
Virkler, K., and Lednev, I. K. (2008). Raman spectroscopy offers great potential for the nondestructive confirmatory identification of body fluids. Forensic Sci. Int. 181 (1-3), e1–e5. doi:10.1016/j.forsciint.2008.08.004
Waid, D. D., and Warren, R. J. (1984). Seasonal variations in physiological indices of adult female white-tailed deer in Texas. J. Wildl. Dis. 20 (3), 212–219. doi:10.7589/0090-3558-20.3.212
Westerhuis, J. A., Hoefsloot, H. C. J., Smit, S., Vis, D. J., Smilde, A. K., van Velzen, E. J. J., et al. (2008). Assessment of PLSDA cross validation. Metabolomics 4 (1), 81–89. doi:10.1007/s11306-007-0099-6
Wilson, P. J., and White, B. N. (1998). Sex identification of elk (Cervus elaphus canadensis), moose (Alces alces), and white-tailed deer (Odocoileus virginianus) using the polymerase chain reaction. J. Forensic Sci. 43 (3), 477–482. doi:10.1520/jfs16172j
Worley, B., and Powers, R. (2016). PCA as a practical indicator of OPLS-DA model reliability. Curr. Metabolomics 4 (2), 97–103. doi:10.2174/2213235x04666160613122429
Yamauchi, K., Hamasaki, S., Miyazaki, K., Kikusui, T., Takeuchi, Y., and Mori, Y. (2000). Sex determination based on fecal DNA analysis of the amelogenin gene in Sika deer (Cervus nippon). J. Veterinary Med. Sci. 62 (6), 669–671. doi:10.1292/jvms.62.669
Yang, J., Chen, X., Luo, C., Li, Z., Chen, C., Han, S., et al. (2023). Application of serum SERS technology combined with deep learning algorithm in the rapid diagnosis of immune diseases and chronic kidney disease. Sci. Rep. 13 (1), 15719. doi:10.1038/s41598-023-42719-5
Yin, N.-H., Griffiths, F., Mann, C., Dawes, H., van Arkel, R., Bukhari, M., et al. (2025). Raman spectroscopy identified fingernail compositional differences between sexes and age-related changes but not handedness or fingers in a healthy cohort. PLOS One 20 (8), e0329092. doi:10.1371/journal.pone.0329092
Yu, Z., Zhai, G., Singmann, P., He, Y., Xu, T., Prehn, C., et al. (2012). Human serum metabolic profiles are age dependent. Aging Cell 11 (6), 960–967. doi:10.1111/j.1474-9726.2012.00865.x
Zárate, S., Stevnsner, T., and Gredilla, R. (2017). Role of estrogen and other sex hormones in brain aging. Neuroprotection and DNA repair. Front. Aging Neurosci. 9, 430. doi:10.3389/fnagi.2017.00430
Zayats, T., Young, T. L., Mackey, D. A., Malecaze, F., Calvas, P., and Guggenheim, J. A. (2009). Quality of DNA extracted from mouthwashes. PLoS ONE 4 (7), e6165. doi:10.1371/journal.pone.0006165
Zhang, F., Tan, Y., Ding, J., Cao, D., Gong, Y., Zhang, Y., et al. (2022). Application and progress of Raman spectroscopy in male reproductive system. Front. Cell Dev. Biol. 9, 823546. doi:10.3389/fcell.2021.823546
Zontov, Y. V., Rodionova, O. Ye., Kucheryavskiy, S., and Pomerantsev, A. L. (2020). PLS-DA – a MATLAB GUI tool for hard and soft approaches to partial least squares discriminant analysis. Chemom. Intelligent Laboratory Syst. 203, 104064. doi:10.1016/j.chemolab.2020.104064
Keywords: Raman spectroscopy, PLS-DA, white-tailed deer, sex determination, chemometric classification, wildlife forensics, multivariate analysis, pattern recognition
Citation: Majumder MS, Smith E and Halámková L (2026) Sex determination of white-tailed-deer (Odocoileus virginianus) from plasma and serum samples by using Raman spectroscopy and PLS-DA method: a forensic perspective. Front. Anal. Sci. 5:1727520. doi: 10.3389/frans.2025.1727520
Received: 17 October 2025; Accepted: 03 December 2025;
Published: 05 January 2026.
Edited by:
Mohamed O. Amin, Department of Chemistry The RNA Institute, United States
Reviewed by:
Tianyi Dou, Bruker, United States
Suchita Rawat, Garden City University, India
Kelly M. Elkins, Towson University, United States
Copyright © 2026 Majumder, Smith and Halámková. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Lenka Halámková, lenka.halamkova@ttu.edu
Ernest Smith