AUTHOR=Nelms Mark D., Antonijevic Todor, Ring Caroline, Harris Danni L., Bever Ronnie Joe, Lynn Scott G., Williams David, Chappell Grace, Boyles Rebecca, Borghoff Susan, Edwards Stephen W., Markey Kristan TITLE=Chemistry domain of applicability evaluation against existing estrogen receptor high-throughput assay-based activity models JOURNAL=Frontiers in Toxicology VOLUME=Volume 6 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/toxicology/articles/10.3389/ftox.2024.1346767 DOI=10.3389/ftox.2024.1346767 ISSN=2673-3080 ABSTRACT=The U.S. EPA's Endocrine Disruptor Screening Program (EDSP) Tier 1 assays are used to screen for potential endocrine system–disrupting chemicals. The current study investigated the utility of chemical clustering to evaluate the screening approach using a 4-assay model as a test case. Although the full original assay battery is no longer available, the demonstrated contribution of chemical clustering is broadly applicable and the data analysis can be applied to future evaluation of minimal assay models for consideration in screening. Chemical structures were collected for the EDSP Universe of Chemicals (UoC) and estrogen receptor (ER) model chemicals and grouped based on structural similarity. ER model chemicals with a clearly defined structure not present in the EDSP UoC were assigned to clusters using a k-nearest neighbors (k-NN) approach. Performance of a 4-assay model compared with the full ER agonist model was analyzed as related to chemical clustering. This was a case study, and a similar analysis can be performed with any subset model in which the same chemicals (or a subset of them) are screened. Overall, the best 4-assay subset ER agonist model resulted in 122 false-positive and only 2 false-negative predictions compared with the full ER agonist model. These 122 false positives were from 91 different chemical clusters. Most false positives were active in only two of the four assays, whereas all but 11 true-positive chemicals were active in at least three assays.
False-positive chemicals also tended to have lower area under the curve (AUC) values, with 110 out of 122 false positives having an AUC value below 75% of the positives as predicted by the full ER agonist model. Many false positives demonstrated borderline activity. Although this is a descriptive analysis of previous results, several concepts can be applied to any screening model used in the future. Chemical clustering provides a means of ensuring that future screening evaluations consider the broad chemical space represented by the EDSP UoC. Clusters can also assist in prioritizing future chemicals for screening based on the activity of known chemicals in those clusters. The lessons learned from this case study can be readily applied to future evaluations of model applicability and to the screening of future datasets.