AUTHOR=Keeper Jeremy Horst, Seto Jong, Oren Ersin Emre, Horst Orapin V., Hung Ling-Hong, Samudrala Ram TITLE=Accurate informatic modeling of tooth enamel pellicle interactions by training substitution matrices with Mat4Pep JOURNAL=Frontiers in Materials VOLUME=Volume 11 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/materials/articles/10.3389/fmats.2024.1436379 DOI=10.3389/fmats.2024.1436379 ISSN=2296-8016 ABSTRACT=Extracellular matrices direct the formation of mineral constituents into self-assembled mineralized tissues. We investigate the protein and mineral constituents to better understand the underlying mechanisms that lead to mineralized tissue formation. Specifically, we study the protein–hydroxyapatite interactions that govern the development and homeostasis of teeth and bone in the oral cavity. Characterizing these interactions would enable improvements in the design of peptides to regenerate mineralized tissues and control attachments such as ligaments and dental plaque. Progress has been limited because no available methods produce robust data for assessing organic–mineral interfaces. Using an improved version of our previously developed substitution matrix-based peptide comparison protocol, we segregate pellicle peptides from control sequences, showing that tooth enamel pellicle peptides contain subtle sequence similarities that encode hydroxyapatite binding mechanisms. Sampling diverse matrices, adding biological control sequences, and optimizing matrix refinement algorithms improve discrimination from an AUC of 0.81 to 0.99 in leave-one-out experiments. Other contemporary methods fail on this problem. We find hydroxyapatite interaction sequence patterns by applying the resulting selected refined matrix (“pellitrix”) to cluster the peptides and build subgroup alignments.
We identify putative hydroxyapatite maturation domains by applying the matrix to enamel biomineralization proteins, and we prioritize putative novel pellicle peptides identified by In-StageTip (iST) mass spectrometry. The sequence comparison protocol outperforms other contemporary options for this small and heterogeneous group and generalizes to any group of peptides. As a result, this platform has broad impacts on peptide design, with direct applications to microbiology, biomaterial design, and tissue engineering.
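The abstract describes segregating pellicle peptides from controls with a substitution matrix and measuring discrimination as AUC in leave-one-out experiments. The following is a minimal sketch of that general idea, not the Mat4Pep implementation. Everything in it is an assumption made for illustration: the toy residue groups and match/mismatch values, the ungapped local scoring function, the "maximum similarity to the remaining positives" scoring rule, and the made-up sequences, which are not real pellicle peptides or controls.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical residue groups, used only to build a toy matrix.
# A trained matrix such as the paper's "pellitrix" would replace this.
GROUPS = ["DE", "KRH", "STNQ", "AVLIM", "FWY", "C", "G", "P"]

# Made-up sequences for illustration only (not data from the paper).
TOY_PELLICLE = ["DDSEEKF", "EDSEEKL", "DESDEKF"]
TOY_CONTROLS = ["GAVLIPG", "PGAVLGP", "WGPAVLG"]


def build_toy_matrix(match=4, similar=1, mismatch=-2):
    """Return a dict mapping residue pairs to substitution scores."""
    group_of = {aa: i for i, group in enumerate(GROUPS) for aa in group}
    matrix = {}
    for a, b in product(AMINO_ACIDS, repeat=2):
        if a == b:
            matrix[a, b] = match
        elif group_of[a] == group_of[b]:
            matrix[a, b] = similar
        else:
            matrix[a, b] = mismatch
    return matrix


def best_ungapped_score(p, q, matrix):
    """Best-scoring ungapped local segment over all diagonal offsets."""
    best = 0
    for offset in range(-(len(q) - 1), len(p)):
        running = 0
        # p[i] is aligned with q[i - offset] along this diagonal.
        for i in range(max(0, offset), min(len(p), offset + len(q))):
            running = max(0, running + matrix[p[i], q[i - offset]])
            best = max(best, running)
    return best


def leave_one_out_scores(positives, controls, matrix):
    """Score each peptide by its best match to the positives, excluding itself."""
    def score(query, references):
        return max(best_ungapped_score(query, r, matrix) for r in references)

    pos_scores = [
        score(p, positives[:i] + positives[i + 1:])
        for i, p in enumerate(positives)
    ]
    ctl_scores = [score(c, positives) for c in controls]
    return pos_scores, ctl_scores


def auc(pos_scores, neg_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    matrix = build_toy_matrix()
    pos, ctl = leave_one_out_scores(TOY_PELLICLE, TOY_CONTROLS, matrix)
    print("leave-one-out AUC:", auc(pos, ctl))
```

In this sketch, refining the matrix would mean adjusting the entries of `matrix` and keeping changes that raise the leave-one-out AUC. The specific refinement algorithms the abstract mentions are not reproduced here.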