OPINION article

Front. Psychol., 13 August 2024

Sec. Neuropsychology

Volume 15 - 2024 | https://doi.org/10.3389/fpsyg.2024.1452462

Performance validity testing: the need for digital technology and where to go from here

  • Department of Psychiatry and Behavioral Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, United States

Introduction

Neuropsychological testing can inform practitioners and scientists about brain-behavior relationships that guide diagnostic classification and treatment planning (Donders, 2020). However, not all examinees remain engaged throughout testing, and some may exaggerate or feign impairment, rendering their performance non-credible and uninterpretable (Roor et al., 2024). It is therefore important to regularly assess the validity of data obtained during a neuropsychological evaluation (Sweet et al., 2021). Performance validity assessment (PVA) is nonetheless a complex process. Practitioners must know when and how to use multiple performance validity tests (PVTs) while accounting for various contextual, diagnostic, and intrapersonal factors (Lippa, 2018). Furthermore, inaccurate PVA can lead to erroneous and potentially harmful judgments regarding an examinee's mental health and neuropsychological status. Although the methods used to address these complexities in PVA are evolving (Bianchini et al., 2001; Boone, 2021), improvement is still needed.

Modern digital technologies have the potential to significantly improve PVA, but such technologies have received little attention. Most PVTs used today are pencil-and-paper tests developed several decades ago (Martin et al., 2015), and digital innovations have largely been confined to computerized validity testing (see Table 1). Meanwhile, other areas of digital neuropsychology have rapidly expanded. Technologies can now capture high-dimensional data conducive to precision medicine (Parsons and Duffield, 2020; Harris et al., 2024), and digital assessment may soon become the rule rather than the exception for neuropsychology (Bilder and Reise, 2019; Germine et al., 2019). If PVA does not keep pace with other digital innovations in neuropsychology, many validity tests and methods may lose relevance.

Table 1

| Material-specificity | Performance validity test/method | References |
|---|---|---|
| Memory-focused freestanding PVTs | Memory integrated language test (MIL) | Finley et al., 2024b; Leese et al., 2024b |
| | Coin in hand–extended version | Daugherty et al., 2021 |
| | Inventory of problems – memory (IOP-M) | Giromini et al., 2020; Erdodi et al., 2024 |
| | DETECTS | Paulo and Albuquerque, 2019 |
| | Computerized forced-choice test (CFCT) | Gutiérrez and Gur, 2011 |
| | Medical symptom validity test (MSVT) | Green, 2004 |
| | Word memory test (WMT) | Green, 2003 |
| | Computerized test of memory malingering (TOMM) | Rees et al., 1998 |
| | Computerized assessment of response bias (CARB) | Allen et al., 1997 |
| | Tests of neuropsychological malingering (TNM) | Pritchard and Moses, 1992 |
| Non-memory-focused freestanding PVTs | Making change test (MCT) | Finley et al., 2024b; Leese et al., 2024a |
| | The shell game task* | Bryant et al., 2023 |
| | Multi-level pattern memory test (MPMT) | Omer and Braw, 2021 |
| | Tests of attentional distraction (TOAD) | Morey, 2019 |
| | Nonverbal medical symptom validity test (NV-MSVT) | Green, 2008 |
| | Portland digit recognition test-computerized | Rose et al., 1995 |
| | Victoria symptom validity test (VSVT) | Slick et al., 1995 |
| | Forced choice test of nonverbal ability (FCTNV) | Frederick and Foster, 1991 |
| | Multi-digit memory test (MDMT) | Bolter and Niccolls, 1991 |
| Mixed freestanding PVTs | Pediatric performance validity test suite (PdPVTS) | McCaffrey et al., 2020 |
| | Memory validity profile (MVP) | Brooks and Sherman, 2019; Brooks et al., 2019 |
| Embedded PVTs/methods | Penn computerized neurocognitive battery (PennCNB) | Scott et al., 2023 |
| | National Institutes of Health Toolbox® (NIHTB) | Abeare et al., 2021 |
| | MOXO-d-continuous performance test (CPT) | Berger et al., 2021; Winter and Braw, 2022 |
| | Conners continuous performance test (CPT; Versions 2 and 3) | Ord et al., 2010; Erdodi et al., 2014; Shura et al., 2016; Sharland et al., 2018; Lichtenstein et al., 2019; Scimeca et al., 2021; Finley et al., 2023a,b; Robinson et al., 2023 |
| | Test of variables of attention (TOVA) | Leark et al., 2002; Marshall et al., 2010; Nicholls et al., 2020 |
| | Automated neuropsychological assessment metrics (ANAM) performance validity index | Roebuck-Spencer et al., 2013; Meyers et al., 2022 |
| | Immediate post-concussion assessment and cognitive testing (ImPACT) | Erdal, 2012; Schatz and Glatts, 2013; Lovell, 2015; Siedlik et al., 2015; Gaudet and Weyandt, 2017; Higgins et al., 2017; Manderino and Gunstad, 2018; Raab et al., 2020 |
| | CNS vital signs battery | Brooks et al., 2014 |
| | NeuroTrax battery | Hegedish et al., 2012; Bar-Hen et al., 2015 |

Existing digital performance validity tests and methods.

*Presented as a professional conference poster, not a published article.

This paper aims to increase awareness of how digital technologies can improve PVA so that researchers within neuropsychology and relevant organizations have a clinically and scientifically meaningful basis for transitioning to digital platforms. Herein, I describe five ways in which digital technologies can improve PVA: (1) generating more informative data, (2) leveraging advanced analytics, (3) facilitating scalable and sustainable research, (4) increasing accessibility, and (5) enhancing efficiencies.

Generating more informative data

Generating a greater volume, variety, and velocity of data core and ancillary to validity testing may improve the detection of non-credible performance. With these data, scientists and practitioners can better understand the dimensionality of performance validity and assess it effectively, especially in cases without clear evidence of fabrication. However, capturing such varied data in PVA is challenging, as practitioners are often limited to a few PVTs within an evaluation that is completed in a single snapshot of time (Martin et al., 2015). Furthermore, many PVTs index redundant information because they have similar detection paradigms that generate only one summary cut-score (Boone, 2021). Digital technologies can address these issues by capturing additional aspects of performance validity without increasing time or effort.

Digitally recording the testing process is one way to generate more diverse data points than a summary score. Some process-based metrics are already employed in PVA, including recording response consistency and exaggeration across test items (Schroeder et al., 2012; Finley et al., 2024a). For example, Leese et al. (2024a) found that using digital software to assess discrepancies between item responses and correct answers improved the detection of non-credible performance. Using digital tools to objectively and unobtrusively record response latencies and reaction times during testing is another useful process-based approach (Erdodi and Lichtenstein, 2021; Rhoads et al., 2021). Examinees typically cannot maintain consistent rates of slowed response latencies across items when attempting to feign impairment (Gutiérrez and Gur, 2011). Various software programs can record these process-based scores (e.g., item-level indices of response time, reliable span, and exaggeration magnitude) in most existing tests if the tests are migrated to tablets/computers (Kush et al., 2012). Recording both the process and the outcome (summary scores) of test completion can index dimensions of performance validity across and within tests.
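As a rough illustration of how item-level process data could sit alongside a summary score, the following Python sketch computes accuracy together with the mean, standard deviation, and coefficient of variation of response latencies. The function name `latency_profile`, the data, and the choice of the coefficient of variation as a consistency index are assumptions made for illustration; no published PVT or validated cutoff is implied.

```python
from statistics import mean, stdev


def latency_profile(latencies_ms, correct):
    """Summarize item-level response times next to the summary score.

    latencies_ms: response time per item, in milliseconds (hypothetical data).
    correct: 1 if the item was answered correctly, else 0.
    """
    if len(latencies_ms) != len(correct) or len(latencies_ms) < 2:
        raise ValueError("need two equal-length lists with at least two items")
    avg = mean(latencies_ms)
    sd = stdev(latencies_ms)
    return {
        "n_items": len(latencies_ms),
        "accuracy": sum(correct) / len(correct),  # outcome (summary) score
        "mean_latency_ms": avg,                   # process score
        "sd_latency_ms": sd,
        "cv_latency": sd / avg,                   # relative latency inconsistency
    }


# Two invented examinees: one with steady latencies, one with erratic latencies.
steady = latency_profile([800, 820, 790, 810], [1, 1, 1, 1])
erratic = latency_profile([800, 2400, 650, 3100], [1, 0, 1, 0])
print(steady["cv_latency"], erratic["cv_latency"])
```

In this toy comparison the erratic profile yields the larger coefficient of variation; whether such an index separates credible from non-credible performance in a given population would have to be established empirically.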

Technologies can also record biometric data ancillary to validity testing. Biometrics including oculomotor, cardiovascular, body gesture, and electrodermal responses are indicators of cognitive load and are associated with deception (Ayres et al., 2021). Deception is believed to increase cognitive load because it requires more complex processing to falsify a response (Dinges et al., 2024). Although deception is different from non-credible performance, neuroimaging research suggests non-credible performance can be indicative of greater cognitive effort (Allen et al., 2007). For this reason, technologies like eye-tracking have been used to augment PVA (Braw et al., 2024). These studies are promising, but other avenues within this literature have yet to be explored due to technological limitations. Fortunately, many technologies now possess built-in cameras, accelerometers, gyroscopes, and sensors that “see,” “hear,” and “feel” at a basic level, and may be embedded within existing PVTs to record biometrics.

Technologies under development for cognitive testing may also provide informative data that have not yet been linked to PVA. For example, speech analysis software for verbal fluency tasks (Holmlund et al., 2019) could identify non-credible word choice or grammatical errors. Similarly, digital phenotyping technologies may identify novel and useful indices during validity testing, such as keystroke dynamics (e.g., slowed/inconsistent typing; Chen et al., 2022) embedded within PVTs requiring typed responses. These are among many burgeoning technologies that can generate the higher-dimensional data needed for robust PVA without adding time or labor. However, access to a greater range and depth of data requires advanced methods to analyze the data effectively and efficiently.

Leveraging advanced analytics

Fortunately, technologies can leverage advanced analytics to rapidly and accurately analyze a large influx of digital data in real time. Although several statistical approaches are described within the PVA literature (Boone, 2021; Jewsbury, 2023), machine learning (ML) and item response theory (IRT) analytics may be particularly useful for analyzing large volumes of interrelated, nonlinear, and high-dimensional data at the item level (Reise and Waller, 2009; Mohri et al., 2012).

Not only can these approaches analyze more complex data, but they can also improve the development and refinement of PVTs relative to classical measurement approaches. For example, person-fit statistics are an IRT approach that has been used to identify non-credible symptom reporting in dichotomous and polytomous data (Beck et al., 2019). This approach may also improve embedded PVTs by estimating the extent to which each item-level response deviates from one's true abilities (Bilder and Reise, 2019). Scott et al. (2023) found that using person-fit statistics helped embedded PVTs detect subtle patterns of non-credible performance. IRT is especially amenable to computerized adaptive testing, which adjusts each item's difficulty based on one's previous responses. Computerized adaptive testing systems can create shorter and more precise PVTs with psychometrically equivalent alternative forms (Gibbons et al., 2008). These systems can also detect careless responding based on unpredictable error patterns that deviate from normal difficulty curves. Detecting careless responding may be useful for PVTs embedded within digital self-paced continuous performance tests (e.g., Nicholls et al., 2020; Berger et al., 2021). Other IRT approaches can improve PVTs by scrutinizing item difficulty and discriminatory power and identifying culturally biased items. For example, differential item functioning is an IRT approach that may identify items on English-verbally mediated PVTs that are disproportionately challenging for those who do not speak English as their primary language, allowing for appropriate adjustments.
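To make the person-fit idea concrete, the sketch below computes one widely used index, the standardized log-likelihood statistic (commonly written lz), for dichotomous responses under a Rasch model. The cited studies do not necessarily use this particular statistic; the item difficulties, ability value, and response patterns are invented for illustration, and the function names are hypothetical.

```python
import math


def rasch_prob(theta, difficulty):
    """Probability of a correct response under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def lz_person_fit(responses, theta, difficulties):
    """Standardized log-likelihood person-fit statistic (lz).

    Large negative values flag response patterns that are unlikely given
    the examinee's estimated ability (e.g., failing easy items while
    passing hard ones). Assumes item difficulties are not all equal to
    theta, otherwise the variance term is zero.
    """
    log_lik = expected = variance = 0.0
    for u, b in zip(responses, difficulties):
        p = rasch_prob(theta, b)
        log_lik += u * math.log(p) + (1 - u) * math.log(1 - p)
        expected += p * math.log(p) + (1 - p) * math.log(1 - p)
        variance += p * (1 - p) * math.log(p / (1 - p)) ** 2
    return (log_lik - expected) / math.sqrt(variance)


difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]  # easy -> hard (hypothetical items)
consistent = lz_person_fit([1, 1, 1, 0, 0], theta=0.0, difficulties=difficulties)
aberrant = lz_person_fit([0, 0, 1, 1, 1], theta=0.0, difficulties=difficulties)
print(round(consistent, 2), round(aberrant, 2))
```

With these invented values, the pattern that fails the easiest items and passes the hardest produces a strongly negative lz, whereas the model-consistent pattern does not. In practice, ability and item parameters would be estimated from calibration data rather than fixed by hand.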

ML has proven useful in symptom validity test development (Orrù et al., 2021) and may function similarly for PVTs. Two studies recently investigated whether supervised ML improves PVA (Pace et al., 2019; Hirsch et al., 2022). Pace et al. (2019) found that a supervised ML model trained with various features (demographics, cognitive performance errors, response time, and a PVT score) discriminated between genuine and simulated cognitive impairment with high accuracy. Using similar features, Hirsch et al. (2022) found that their supervised models had moderate to weak prediction of PVT failure in a clinical attention-deficit/hyperactivity disorder sample. No studies have used unsupervised ML for PVA. It is possible that unsupervised ML could also identify groups of credible and non-credible performing examinees using relevant factors such as PVT scores, litigation status, medical history, and referral reasons, without explicit programming. Software can be developed to extract data for the ML models via computerized questionnaires or electronic medical records. Deep learning, a form of ML that processes data through multiple layers of representation, may also detect complex and anomalous patterns indicative of non-credible performance. Deep learning may be especially useful for analyzing response sequences over time (e.g., non-credible changes in performance across repeat medico-legal evaluations). Furthermore, deep-learning models may be effective at identifying inherent statistical dependencies and patterns of non-credible performance, and thus generating expectations of how genuine responses should appear. Combining these algorithms with other statistical techniques that assess response complexity and highly anomalous responses (e.g., Lundberg and Lee, 2017; Parente and Finley, 2018; Finley and Parente, 2020; Orrù et al., 2020; Mertler et al., 2021; Parente et al., 2021, 2023; Finley et al., 2022; Rodriguez et al., 2024) may increase the signal of non-credible performance. These algorithmic approaches can improve as we better understand cognitive phenotypes and what is improbable for certain disorders using precision medicine and bioinformatics.
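Because no unsupervised ML study of PVA is cited, the following is only a toy sketch of what grouping examinees without outcome labels might look like: a two-cluster k-means written in plain Python and applied to two invented features per examinee (proportion correct on a PVT and mean response time in seconds). The data, feature choice, and function names are all hypothetical, and a cluster label by itself says nothing about credibility without an external criterion.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, iters=20):
    """Minimal k-means with k=2 and a deterministic starting point.

    Starts from the lexicographically smallest and largest points so the
    result is reproducible; real applications would standardize features
    and use an established library implementation.
    """
    centroids = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min((0, 1), key=lambda k: sqdist(p, centroids[k])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# (proportion correct, mean response time in seconds); invented values.
examinees = [(0.98, 0.9), (0.96, 1.0), (0.94, 1.1),
             (0.55, 2.4), (0.60, 2.8), (0.50, 2.2)]
labels, centroids = two_means(examinees)
print(labels)
```

Here the algorithm separates the three high-accuracy, fast profiles from the three low-accuracy, slow profiles. Deciding whether either cluster corresponds to non-credible performance would still require validation against criterion groups.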

Facilitating scale and sustainability

To optimize the utility of these digital data, technologies can include point-of-testing acquisition software that automatically transfers data to cloud-based, centralized repositories. These repositories facilitate sustainable and scalable innovations by increasing data access and collaboration among PVA stakeholders (see Reeves et al., 2007 and Gaudet and Weyandt, 2017 for large-scale developments of digital tests with embedded PVTs). Multidisciplinary approaches are needed to make theoretical and empirical sense of the data collected via digital technologies (Collins and Riley, 2016). With more comprehensive and uniform data amenable to data mining and deep-learning analytics, collaborating researchers can address overarching issues that remain poorly understood. For example, with larger centralized datasets, researchers can directly compare different statistical approaches (e.g., chaining likelihood ratios vs. multivariable discriminant function analysis, Bayesian model averaging, or logistic regression) and evaluate the joint validity of standardized test batteries (Davis, 2021; Erdodi, 2023; Jewsbury, 2023). Such data and findings could also help determine robust criterion-grouping combinations, given that multiple PVTs assessing complementary aspects of performance across various cognitive domains may be necessary for a strong criterion-grouping combination (Schroeder et al., 2019; Soble et al., 2020). Similarly, researchers could expand upon existing decision-making models (e.g., Rickards et al., 2018; Sherman et al., 2020) by using these comprehensive data to develop algorithms that automatically generate credible/non-credible profiles based on the type and proportion or number of PVTs failed in relation to various contextual and diagnostic factors, symptom presentations, and clinical inconsistencies (across medical records, self- and informant-reports, or behavioral observations). A greater range and depth of data may further help elucidate the extent to which several putative factors (such as bona fide injury/disease, normal fluctuation and variability in testing, level of effort [either to perform well or to deceive], and symptom validity, among others) are associated with performance validity (Larrabee, 2012; Bigler, 2014). Understanding these associations could help identify the mechanisms underlying non-credible performance.
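One of the statistical approaches named above, chaining likelihood ratios, can be sketched in a few lines. The example converts a base rate to prior odds, multiplies by a likelihood ratio for each PVT result, and converts back to a probability. It assumes the PVTs are conditionally independent, which is a strong assumption that correlated PVTs can violate; the sensitivity, specificity, and base-rate values are invented, and `chained_posterior` is a hypothetical helper name.

```python
def chained_posterior(base_rate, results):
    """Posterior probability of non-credible performance after several PVTs.

    base_rate: assumed prior probability of non-credible performance (0-1).
    results: iterable of (sensitivity, specificity, failed) per PVT.
    Assumes conditional independence among PVTs (an illustration-only
    simplification).
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    odds = base_rate / (1.0 - base_rate)
    for sensitivity, specificity, failed in results:
        if failed:
            ratio = sensitivity / (1.0 - specificity)   # positive likelihood ratio
        else:
            ratio = (1.0 - sensitivity) / specificity   # negative likelihood ratio
        odds *= ratio
    return odds / (1.0 + odds)


# Two failed PVTs, each with sensitivity .50 and specificity .90 (invented values).
posterior = chained_posterior(0.30, [(0.50, 0.90, True), (0.50, 0.90, True)])
print(round(posterior, 3))
```

Centralized repositories of the kind described above would allow such chained estimates to be checked directly against multivariable models fitted to the same cases.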

Collaboration is especially needed for basic and applied sciences to coalesce unique aspects of PVA that have been studied independently, such as integrating neuropsychology and neurocognitive processing theories to develop more sophisticated stimuli/paradigms (Leighton et al., 2014). For example, models from basic cognitive science, such as memory familiarity vs. conscious recollection theories, may be applied to clinically available PVTs to reduce false-positive rates in certain neurological populations (Eglit et al., 2017). Similar areas of cognitive science have also shown that using pictorial or numerical stimuli (vs. words) across multiple learning trials can reduce false-positive errors in clinical settings (Leighton et al., 2014). Furthermore, integrating data in real time into these repositories offers a sustainable and accurate way of estimating PVT failure base rates and developing cutoffs accordingly. Finally, as proposed by the National Neuropsychology Network (Loring et al., 2022), a centralized repository for digital data that is backward-compatible with analog test data can provide a smooth transition from traditional pencil-and-paper tests to digital formats. These repositories (including those curated via the National Neuropsychology Network) thus enable sustainable innovation by supporting continuous incremental refinement of PVTs over time.

Increasing accessibility

As observed in other areas of neuropsychology (Miller and Barr, 2017), digital technologies can offer more accessible PVA. Specifically, web-based PVTs can help reach underserved and geographically restricted communities, with the understanding that disparities in access to digital technology may also exist. Although more web-based PVTs are needed, not every PVT requires digitization for telehealth (e.g., Reliable Digit Span; Kanser et al., 2021; Harrison and Davin, 2023). Digital PVTs can also increase accessibility in primary care settings where digital cognitive screeners are being developed for face-to-face evaluations and may be completed in distracting, unsupervised environments (Zygouris and Tsolaki, 2015). Validity indicators could be embedded within these screeners rather than creating new freestanding PVTs. The National Institutes of Health Toolbox® (Abeare et al., 2021) and Penn Computerized Neurocognitive Battery (Scott et al., 2023) are well-established digital screeners with embedded PVTs that offer great promise for these evaluations. In primary care, embedded PVTs could serve as preliminary screeners for atypical performance that warrants further investigation. Digital PVTs may also increase accessibility in research settings. Although research volunteers are unlikely to deliberately feign impairment, they may lose interest, doze off, or rush through testing (An et al., 2017), especially in dementia-focused research where digital testing is common. Some digitally embedded PVTs have been developed for ADHD research (Table 1) and may be used in other research focused on digital cognitive testing (Bauer et al., 2012).

Enhancing efficiencies

Finally, the application of digital technologies introduces new efficiencies; in PVA, they hold the promise of improved standardization and administration/scoring accuracy. Technologies can leverage automated algorithms to reduce time spent on scoring and routine aspects of PVA (e.g., finding/adjusting PVT cutoffs according to various contextual/intrapersonal factors). Automation would allow providers to allocate more time to case conceptualization and responding to (rather than detecting) validity issues. Greater efficiencies in PVA translate into greater cost-efficiencies as well as reduced collateral expenses for specialized training, testing support, and materials (Davis, 2023). Further, digital PVTs can automatically store, retrieve, and analyze data to generate multiple relevant scores (e.g., specificity, sensitivity, predictive power adjusted for diagnostic-specific base rates, false-positive estimates, and likelihood ratios or probability estimates for single/multivariable failure combinations). Automated scoring will likely become increasingly useful as more PVTs and data are generated.
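As a small example of the kind of score an automated system could report, the sketch below applies Bayes' rule to compute positive and negative predictive power for a single PVT cutoff at different base rates of non-credible performance. The sensitivity, specificity, and base-rate values are invented, and `predictive_values` is a hypothetical helper rather than part of any existing scoring software.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (positive predictive value, negative predictive value).

    base_rate is the assumed proportion of non-credible performance in the
    setting of interest; all inputs are proportions between 0 and 1.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same cutoff (sensitivity .50, specificity .90) in a low and a high base-rate setting.
for base_rate in (0.10, 0.40):
    ppv, npv = predictive_values(0.50, 0.90, base_rate)
    print(f"base rate {base_rate:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

The same cutoff yields a much lower positive predictive value when the base rate is low, which is why base-rate-adjusted reporting is listed among the scores worth automating.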

Limitations and concluding remarks

By no means an exhaustive review, this paper describes five ways in which digital technologies can improve PVA. These improvements can complement rather than replace the uniquely human aspects of PVA. Thus, the upfront investments required to transition to digital approaches are likely justifiable. However, several limitations deserve attention before making this transition. As described elsewhere (Miller and Barr, 2017; Germine et al., 2019), limitations to digital assessment may include variability across devices, which can impose different perceptual, motor, and cognitive demands that affect the reliability and accuracy of the tests. Variations in hardware and software within the same class of devices can affect stimulus presentation and the measurement of responses (including response latency). Individual differences in access to and familiarity with technology may further affect test performance. Additionally, the rapid advancement in technologies suggests that hardware and software can quickly become obsolete. A large influx of data and the application of “black box” ML algorithms and cloud-based repositories also raise concerns regarding data security and privacy. Addressing these issues and implementing digital methods into practice or research would require substantial technological and human infrastructure that may not be attainable in certain settings (Miller, 2019). Indeed, the utility of digital assessments likely depends on the context in which they are implemented. For example, PVA is critical in forensic evaluations, but the limitations described above could challenge compliance with the evolving standards for the admissibility of scientific evidence in these evaluations. These limitations, along with the logistical and practical considerations for a digital transition, require further discussion (see Miller, 2019; Singh and Germine, 2021). Finally, other digital opportunities, such as using validity indicators with ecological momentary assessment and virtual reality technologies, merit further discussion. Moving forward, scientists are encouraged to expand upon these digital innovations to ensure that PVA evolves alongside the broader landscape of digital neuropsychology.

Statements

Author contributions

J-CF: Conceptualization, Investigation, Methodology, Resources, Supervision, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.

Acknowledgments

I would like to thank Jason Soble and Anthony Robinson for providing their expertise and guidance during the preparation of this manuscript.

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

  • 1

    AbeareC.ErdodiL.MessaI.TerryD. P.PanenkaW. J.IversonG. L.et al. (2021). Development of embedded performance validity indicators in the NIH Toolbox Cognitive Battery. Psychol. Assess.33, 9096. 10.1037/pas0000958

  • 2

    AllenL. M.ConderR. L.GreenP.CoxD. R. (1997). CARB'97 manual for the computerized assessment of response bias.Durham, NC: CogniSyst.

  • 3

    AllenM. D.BiglerE. D.LarsenJ.Goodrich-HunsakerN. J.HopkinsR. O. (2007). Functional neuroimaging evidence for high cognitive effort on the Word Memory Test in the absence of external incentives. Brain Injury21, 14251428. 10.1080/02699050701769819

  • 4

    AnK. Y.KaplounK.ErdodiL. A.AbeareC. A. (2017). Performance validity in undergraduate research participants: A comparison of failure rates across tests and cutoffs. Clin. Neuropsychol.31, 193206. 10.1080/13854046.2016.1217046

  • 5

    AyresP.LeeJ. Y.PaasF.Van MerrienboerJ. J. (2021). The validity of physiological measures to identify differences in intrinsic cognitive load. Front. Psychol.12:702538. 10.3389/fpsyg.2021.702538

  • 6

    Bar-HenM.DonigerG. M.GolzadM.GevaN.SchweigerA. (2015). Empirically derived algorithm for performance validity assessment embedded in a widely used neuropsychological battery: validation among TBI patients in litigation. J. Clin. Exper. Neuropsychol.37, 10861097. 10.1080/13803395.2015.1078294

  • 7

    BauerR. M.IversonG. L.CernichA. N.BinderL. M.RuffR. M.NaugleR. I. (2012). Computerized neuropsychological assessment devices: joint position paper of the American Academy of Clinical Neuropsychology and the National Academy of Neuropsychology. Arch. Clin. Neuropsychol.27, 362373. 10.1093/arclin/acs027

  • 8

    BeckM. F.AlbanoA. D.SmithW. M. (2019). Person-fit as an index of inattentive responding: a comparison of methods using polytomous survey data. Appl. Psychol. Meas.43, 374387. 10.1177/0146621618798666

  • 9

    BergerC.LevA.BrawY.ElbaumT.WagnerM.RassovskyY. (2021). Detection of feigned ADHD using the MOXO-d-CPT. J. Atten. Disord.25, 10321047. 10.1177/1087054719864656

  • 10

    BianchiniK. J.MathiasC. W.GreveK. W. (2001). Symptom validity testing: a critical review. Clin. Neuropsychol.15, 1945. 10.1076/clin.15.1.19.1907

  • 11

    BiglerE. D. (2014). Effort, symptom validity testing, performance validity testing and traumatic brain injury. Brain Injury28, 16231638. 10.3109/02699052.2014.947627

  • 12

    BilderR. M.ReiseS. P. (2019). Neuropsychological tests of the future: How do we get there from here?. Clin. Neuropsychol.33, 220245. 10.1080/13854046.2018.1521993

  • 13

    BolterJ. F.NiccollsR. (1991). Multi-Digit Memory Test. Wang Neuropsychological Laboratories. Boone, K. B. (2021). Assessment of Feigned Cognitive Impairment. London: Guilford Publications.

  • 14

    BooneK. B. (2021). Assessment of Feigned Cognitive Impairment. London: Guilford Publications.

  • 15

    BrawY. C.ElbaumT.LupuT.RatmanskyM. (2024). Chronic pain: Utility of an eye-tracker integrated stand-alone performance validity test. Psychol. Injury Law.13, 139151. 10.1007/s12207-024-09507-6

  • 16

    BrooksB. L.Fay-McClymontT. B.MacAllisterW. S.VassermanM.ShermanE. M. (2019). A new kid on the block: the memory validity profile (MVP) in children with neurological conditions. Child Neuropsychol.25, 561572. 10.1080/09297049.2018.1477929

  • 17

    BrooksB. L.ShermanE. M. (2019). Using the Memory Validity Profile (MVP) to detect invalid performance in youth with mild traumatic brain injury. Appl. Neuropsychol.8, 319325. 10.1080/21622965.2018.1476865

  • 18

    BrooksB. L.ShermanE. M.IversonG. L. (2014). Embedded validity indicators on CNS Vital Signs in youth with neurological diagnoses. Arch. Clin. Neuropsychol.29, 422431. 10.1093/arclin/acu029

  • 19

    BryantA. M.PizzoniaK.AlexanderC.LeeG.Revels-StrotherO.WeekmanS.et al. (2023). 77 The Shell Game Task: Pilot data using a simulator-design study to evaluate a novel attentional performance validity test. J. Int. Neuropsychol. Soc.29, 751752. 10.1017/S1355617723009359

  • 20

    ChenM. H.LeowA.RossM. K.DeLucaJ.ChiaravallotiN.CostaS. L.et al. (2022). Associations between smartphone keystroke dynamics and cognition in MS. Digital Health8:234. 10.1177/20552076221143234

  • 21

    CollinsF. S.RileyW. T. (2016). NIH's transformative opportunities for the behavioral and social sciences. Sci. Transl. Med.8, 366ed14. 10.1126/scitranslmed.aai9374

  • 22

    DaughertyJ. C.QueridoL.QuirozN.WangD.Hidalgo-RuzzanteN.FernandesS.et al. (2021). The coin in hand–extended version: development and validation of a multicultural performance validity test. Assessment28, 186198. 10.1177/1073191119864652

  • 23

    DavisJ. J. (2021). “Interpretation of data from multiple performance validity tests,” in Assessment of feigned cognitive impairment, ed. K. B. Boone (London: Guilford Publications), 283306.

  • 24

    DavisJ. J. (2023). Time is money: Examining the time cost and associated charges of common performance validity tests. Clin. Neuropsychol.37, 475490. 10.1080/13854046.2022.2063190

  • 25

    DingesL.FiedlerM. A.Al-HamadiA.HempelT.AbdelrahmanA.WeimannJ.et al. (2024). Exploring facial cues: automated deception detection using artificial intelligence. Neural Comput. Applic.26, 127. 10.1007/s00521-024-09811-x

  • 26

    DondersJ. (2020). The incremental value of neuropsychological assessment: a critical review. Clin. Neuropsychol.34, 5687. 10.1080/13854046.2019.1575471

  • 27

    EglitG. M.LynchJ. K.McCaffreyR. J. (2017). Not all performance validity tests are created equal: the role of recollection and familiarity in the Test of Memory Malingering and Word Memory Test. J. Clin. Exp. Neuropsychol.39, 173189. 10.1080/13803395.2016.1210573

  • 28

    ErdalK. (2012). Neuropsychological testing for sports-related concussion: how athletes can sandbag their baseline testing without detection. Arch. Clin. Neuropsychol.27, 473479. 10.1093/arclin/acs050

  • 29

    ErdodiL.CalamiaM.HolcombM.RobinsonA.RasmussenL.BianchiniK. (2024). M is for performance validity: The iop-m provides a cost-effective measure of the credibility of memory deficits during neuropsychological evaluations. J. Forensic Psychol. Res. Pract.24, 434450. 10.1080/24732850.2023.2168581

  • 30

    ErdodiL. A. (2023). Cutoff elasticity in multivariate models of performance validity assessment as a function of the number of components and aggregation method. Psychol. Inj. Law16, 328350. 10.1007/s12207-023-09490-4

  • 31

    ErdodiL. A.LichtensteinJ. D. (2021). Invalid before impaired: An emerging paradox of embedded validity indicators. Clin. Neuropsychol.31, 10291046. 10.1080/13854046.2017.1323119

  • 32

    ErdodiL. A.RothR. M.KirschN. L.Lajiness-O'NeillR.MedoffB. (2014). Aggregating validity indicators embedded in Conners' CPT-II outperforms individual cutoffs at separating valid from invalid performance in adults with traumatic brain injury. Arch. Clin. Neuropsychol.29, 456466. 10.1093/arclin/acu026

  • 33

    FinleyJ. C. A.BrookM.KernD.ReillyJ.HanlonR. (2023b). Profile of embedded validity indicators in criminal defendants with verified valid neuropsychological test performance. Arch. Clin. Neuropsychol.38, 513524. 10.1093/arclin/acac073

  • 34

    FinleyJ. C. A.BrooksJ. M.NiliA. N.OhA.VanLandinghamH. B.OvsiewG. P.et al. (2023a). Multivariate examination of embedded indicators of performance validity for ADHD evaluations: a targeted approach. Appl. Neuropsychol.23, 117. 10.1080/23279095.2023.2256440

  • 35

    FinleyJ. C. A.KaddisL.ParenteF. J. (2022). Measuring subjective clustering of verbal information after moderate-severe traumatic brain injury: A preliminary review. Brain Injury36, 10191024. 10.1080/02699052.2022.2109751

  • 36

    FinleyJ. C. A.LeeseM. I.RoseberryJ. E.HillS. K. (2024b). Multivariable utility of the Memory Integrated Language and Making Change Test. Appl. Neuropsychol. Adult 1–8. 10.1080/23279095.2024.2385439

  • 37

    FinleyJ. C. A.ParenteF. J. (2020). Organization and recall of visual stimuli after traumatic brain injury. Brain Injury34, 751756. 10.1080/02699052.2020.1753113

  • 38

    FinleyJ. C. A.RodriguezC.CernyB.ChangF.BrooksJ.OvsiewG.et al. (2024a). Comparing embedded performance validity indicators within the WAIS-IV Letter-Number Sequencing subtest to Reliable Digit Span among adults referred for evaluation of attention deficit/hyperactivity disorder. Clin. Neuropsychol. 2024, 117. 10.1080/13854046.2024.2315738

  • 39

    FrederickR. I.FosterH. G. (1991). Multiple measures of malingering on a forced-choice test of cognitive ability. Psychol. Assess.3, 596602. 10.1037/1040-3590.3.4.596

  • 40

    Gaudet, C. E., Weyandt, L. L. (2017). Immediate Post-Concussion and Cognitive Testing (ImPACT): a systematic review of the prevalence and assessment of invalid performance. Clin. Neuropsychol. 31, 43–58. doi: 10.1080/13854046.2016.1220622

  • 41

    Germine, L., Reinecke, K., Chaytor, N. S. (2019). Digital neuropsychology: Challenges and opportunities at the intersection of science and software. Clin. Neuropsychol. 33, 271–286. doi: 10.1080/13854046.2018.1535662

  • 42

    Gibbons, R. D., Weiss, D. J., Kupfer, D. J., Frank, E., Fagiolini, A., Grochocinski, V. J., et al. (2008). Using computerized adaptive testing to reduce the burden of mental health assessment. Psychiatr. Serv. 59, 361–368. doi: 10.1176/ps.2008.59.4.361

  • 43

    Giromini, L., Viglione, D. J., Zennaro, A., Maffei, A., Erdodi, L. A. (2020). SVT Meets PVT: development and initial validation of the inventory of problems–memory (IOP-M). Psychol. Inj. Law 13, 261–274. doi: 10.1007/s12207-020-09385-8

  • 44

    Green, P. (2003). Manual for the Word Memory Test for Windows. Kelowna: Green's Publishing.

  • 45

    Green, P. (2004). Green's Medical Symptom Validity Test (MSVT) for Microsoft Windows: User's manual. Kelowna: Green's Publishing.

  • 46

    Green, P. (2008). Green's Nonverbal Medical Symptom Validity Test (NV-MSVT) for Microsoft Windows: User's manual 1.0. Kelowna: Green's Publishing.

  • 47

    Gutiérrez, J. M., Gur, R. C. (2011). “Detection of malingering using forced-choice techniques,” in Detection of malingering during head injury litigation, ed. C. R. Reynolds (Cham: Springer), 151–167. doi: 10.1007/978-1-4614-0442-2_4

  • 48

    Harris, C., Tang, Y., Birnbaum, E., Cherian, C., Mendhe, D., Chen, M. H. (2024). Digital neuropsychology beyond computerized cognitive assessment: Applications of novel digital technologies. Arch. Clin. Neuropsychol. 39, 290–304. doi: 10.1093/arclin/acae016

  • 49

    Harrison, A. G., Davin, N. (2023). Detecting non-credible performance during virtual testing. Psychol. Inj. Law 16, 264–272. doi: 10.1007/s12207-023-09480-6

  • 50

    Hegedish, O., Doniger, G. M., Schweiger, A. (2012). Detecting response bias on the MindStreams battery. Psychiat. Psychol. Law 19, 262–281. doi: 10.1080/13218719.2011.561767

  • 51

    Higgins, K. L., Denney, R. L., Maerlender, A. (2017). Sandbagging on the immediate post-concussion assessment and cognitive testing (ImPACT) in a high school athlete population. Arch. Clin. Neuropsychol. 32, 259–266. doi: 10.1093/arclin/acw108

  • 52

    Hirsch, O., Fuermaier, A. B., Tucha, O., Albrecht, B., Chavanon, M. L., Christiansen, H. (2022). Symptom and performance validity in samples of adults at clinical evaluation of ADHD: a replication study using machine learning algorithms. J. Clin. Exp. Neuropsychol. 44, 171–184. doi: 10.1080/13803395.2022.2105821

  • 53

    Holmlund, T. B., Cheng, J., Foltz, P. W., Cohen, A. S., Elvevåg, B. (2019). Updating verbal fluency analysis for the 21st century: applications for psychiatry. Psychiatry Res. 273, 767–769. doi: 10.1016/j.psychres.2019.02.014

  • 54

    Jewsbury, P. A. (2023). Invited commentary: Bayesian inference with multiple tests. Neuropsychol. Rev. 33, 643–652. doi: 10.1007/s11065-023-09604-4

  • 55

    Kanser, R. J., O'Rourke, J. J. F., Silva, M. A. (2021). Performance validity testing via telehealth and failure rate in veterans with moderate-to-severe traumatic brain injury: a veterans affairs TBI model systems study. NeuroRehabilitation 49, 169–177. doi: 10.3233/NRE-218019

  • 56

    Kush, J. C., Spring, M. B., Barkand, J. (2012). Advances in the assessment of cognitive skills using computer-based measurement. Behav. Res. Methods 44, 125–134. doi: 10.3758/s13428-011-0136-2

  • 57

    Larrabee, G. J. (2012). Performance validity and symptom validity in neuropsychological assessment. J. Int. Neuropsychol. Soc. 18, 625–630. doi: 10.1017/S1355617712000240

  • 58

    Leark, R. A., Dixon, D., Hoffman, T., Huynh, D. (2002). Fake bad test response bias effects on the test of variables of attention. Arch. Clin. Neuropsychol. 17, 335–342. doi: 10.1093/arclin/17.4.335

  • 59

    Leese, M. I., Finley, J. C. A., Roseberry, S., Hill, S. K. (2024a). The Making Change Test: Initial validation of a novel digitized performance validity test for tele-neuropsychology. Clin. Neuropsychol. 2024, 1–14. doi: 10.1080/13854046.2024.2352898

  • 60

    Leese, M. I., Roseberry, J. E., Soble, J. R., Hill, S. K. (2024b). The Memory Integrated Language Test (MIL test): initial validation of a novel web-based performance validity test. Psychol. Inj. Law 17, 34–44. doi: 10.1007/s12207-023-09495-z

  • 61

    Leighton, A., Weinborn, M., Maybery, M. (2014). Bridging the gap between neurocognitive processing theory and performance validity assessment among the cognitively impaired: a review and methodological approach. J. Int. Neuropsychol. Soc. 20, 873–886. doi: 10.1017/S135561771400085X

  • 62

    Lichtenstein, J. D., Flaro, L., Baldwin, F. S., Rai, J., Erdodi, L. A. (2019). Further evidence for embedded performance validity tests in children within the Conners' continuous performance test–second edition. Dev. Neuropsychol. 44, 159–171. doi: 10.1080/87565641.2019.1565535

  • 63

    Lippa, S. M. (2018). Performance validity testing in neuropsychology: A clinical guide, critical review, and update on a rapidly evolving literature. Clin. Neuropsychol. 32, 391–421. doi: 10.1080/13854046.2017.1406146

  • 64

    Loring, D. W., Bauer, R. M., Cavanagh, L., Drane, D. L., Enriquez, K. D., Reise, S. P., et al. (2022). Rationale and design of the national neuropsychology network. J. Int. Neuropsychol. Soc. 28, 1–11. doi: 10.1017/S1355617721000199

  • 65

    Lovell (2015). ImPACT test administration and interpretation manual. Available at: http://www.impacttest.com (accessed July 23, 2024).

  • 66

    Lundberg, S. M., Lee, S. I. (2017). “A unified approach to interpreting model predictions,” in Advances in neural information processing systems, eds. I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, et al. (New York: Curran Associates), 4765–4774.

  • 67

    Manderino, L., Gunstad, J. (2018). Collegiate student athletes with history of ADHD or academic difficulties are more likely to produce an invalid protocol on baseline impact testing. Clin. J. Sport Med. 28, 111–116. doi: 10.1097/JSM.0000000000000433

  • 68

    Marshall, P., Schroeder, R., O'Brien, J., Fischer, R., Ries, A., Blesi, B., et al. (2010). Effectiveness of symptom validity measures in identifying cognitive and behavioral symptom exaggeration in adult attention deficit hyperactivity disorder. Clin. Neuropsychol. 24, 1204–1237. doi: 10.1080/13854046.2010.514290

  • 69

    Martin, P. K., Schroeder, R. W., Odland, A. P. (2015). Neuropsychologists' validity testing beliefs and practices: a survey of North American professionals. Clin. Neuropsychol. 29, 741–776. doi: 10.1080/13854046.2015.1087597

  • 70

    McCaffrey, R. J., Lynch, J. K., Leark, R. A., Reynolds, C. R. (2020). Pediatric performance validity test suite (PdPVTS): Technical manual. Multi-Health Systems, Inc.

  • 71

    Mertler, C. A., Vannatta, R. A., LaVenia, K. N. (2021). Advanced and Multivariate Statistical Methods: Practical Application and Interpretation. London: Routledge. doi: 10.4324/9781003047223

  • 72

    Meyers, J. E., Miller, R. M., Vincent, A. S. (2022). A validity measure for the automated neuropsychological assessment metrics. Arch. Clin. Neuropsychol. 37, 1765–1771. doi: 10.1093/arclin/acac046

  • 73

    Miller, J. B. (2019). Big data and biomedical informatics: Preparing for the modernization of clinical neuropsychology. Clin. Neuropsychol. 33, 287–304. doi: 10.1080/13854046.2018.1523466

  • 74

    Miller, J. B., Barr, W. B. (2017). The technology crisis in neuropsychology. Arch. Clin. Neuropsychol. 32, 541–554. doi: 10.1093/arclin/acx050

  • 75

    Mohri, M., Rostamizadeh, A., Talwalkar, A. (2012). Foundations of Machine Learning. London: The MIT Press.

  • 76

    Morey, L. C. (2019). Examining a novel performance validity task for the detection of feigned attentional problems. Appl. Neuropsychol. 26, 255–267. doi: 10.1080/23279095.2017.1409749

  • 77

    Nicholls, C. J., Winstone, L. K., DiVirgilio, E. K., Foley, M. B. (2020). Test of variables of attention performance among ADHD children with credible vs. non-credible PVT performance. Appl. Neuropsychol. 9, 307–313. doi: 10.1080/21622965.2020.1751787

  • 78

    Omer, E., Braw, Y. (2021). The Multi-Level Pattern Memory Test (MPMT): Initial validation of a novel performance validity test. Brain Sci. 11, 1039–1055. doi: 10.3390/brainsci11081039

  • 79

    Ord, J. S., Boettcher, A. C., Greve, K. W., Bianchini, K. J. (2010). Detection of malingering in mild traumatic brain injury with the Conners' Continuous Performance Test–II. J. Clin. Exp. Neuropsychol. 32, 380–387. doi: 10.1080/13803390903066881

  • 80

    Orrù, G., Mazza, C., Monaro, M., Ferracuti, S., Sartori, G., Roma, P. (2021). The development of a short version of the SIMS using machine learning to detect feigning in forensic assessment. Psychol. Inj. Law 14, 46–57. doi: 10.1007/s12207-020-09389-4

  • 81

    Orrù, G., Monaro, M., Conversano, C., Gemignani, A., Sartori, G. (2020). Machine learning in psychometrics and psychological research. Front. Psychol. 10:2970. doi: 10.3389/fpsyg.2019.02970

  • 82

    Pace, G., Orrù, G., Monaro, M., Gnoato, F., Vitaliani, R., Boone, K. B., et al. (2019). Malingering detection of cognitive impairment with the B test is boosted using machine learning. Front. Psychol. 10:1650. doi: 10.3389/fpsyg.2019.01650

  • 83

    Parente, F. J., Finley, J. C. A. (2018). Using association rules to measure subjective organization after acquired brain injury. NeuroRehabilitation 42, 9–15. doi: 10.3233/NRE-172227

  • 84

    Parente, F. J., Finley, J. C. A., Magalis, C. (2021). An association rule general analytical system (ARGAS) for hypothesis testing in qualitative and quantitative research. Int. J. Quant. Qualit. Res. Methods 9, 1–13. Available online at: https://ssrn.com/abstract=3773480

  • 85

    Parente, F. J., Finley, J. C. A., Magalis, C. (2023). A quantitative analysis for non-numeric data. Int. J. Quant. Qualit. Res. Methods 11, 1–11. doi: 10.37745/ijqqrm13/vol11n1111

  • 86

    Parsons, T., Duffield, T. (2020). Paradigm shift toward digital neuropsychology and high-dimensional neuropsychological assessments. J. Med. Internet Res. 22:e23777. doi: 10.2196/23777

  • 87

    Paulo, R., Albuquerque, P. B. (2019). Detecting memory performance validity with DETECTS: a computerized performance validity test. Appl. Neuropsychol. 26, 48–57. doi: 10.1080/23279095.2017.1359179

  • 88

    Pritchard, D., Moses, J. (1992). Tests of neuropsychological malingering. Forensic Rep. 5, 287–290.

  • 89

    Raab, C. A., Peak, A. S., Knoderer, C. (2020). Half of purposeful baseline sandbaggers undetected by ImPACT's embedded invalidity indicators. Arch. Clin. Neuropsychol. 35, 283–290. doi: 10.1093/arclin/acz001

  • 90

    Rees, L. M., Tombaugh, T. N., Gansler, D. A., Moczynski, N. P. (1998). Five validation experiments of the Test of Memory Malingering (TOMM). Psychol. Assess. 10, 10–20. doi: 10.1037/1040-3590.10.1.10

  • 91

    Reeves, D. L., Winter, K. P., Bleiberg, J., Kane, R. L. (2007). ANAM® Genogram: Historical perspectives, description, and current endeavors. Arch. Clin. Neuropsychol. 22, S15–S37. doi: 10.1016/j.acn.2006.10.013

  • 92

    Reise, S. P., Waller, N. G. (2009). Item response theory and clinical measurement. Annu. Rev. Clin. Psychol. 5, 27–48. doi: 10.1146/annurev.clinpsy.032408.153553

  • 93

    Rhoads, T., Resch, Z. J., Ovsiew, G. P., White, D. J., Abramson, D. A., Soble, J. R. (2021). Every second counts: a comparison of four dot counting test scoring procedures for detecting invalid neuropsychological test performance. Psychol. Assess. 33, 133–141. doi: 10.1037/pas0000970

  • 94

    Rickards, T. A., Cranston, C. C., Touradji, P., Bechtold, K. T. (2018). Embedded performance validity testing in neuropsychological assessment: potential clinical tools. Appl. Neuropsychol. 25, 219–230. doi: 10.1080/23279095.2017.1278602

  • 95

    Robinson, A., Calamia, M., Penner, N., Assaf, N., Razvi, P., Roth, R. M., et al. (2023). Two times the charm: Repeat administration of the CPT-II improves its classification accuracy as a performance validity index. J. Psychopathol. Behav. Assess. 45, 591–611. doi: 10.1007/s10862-023-10055-7

  • 96

    Rodriguez, V. J., Finley, J. C. A., Liu, Q., Alfonso, D., Basurto, K. S., Oh, A., et al. (2024). Empirically derived symptom profiles in adults with attention-deficit/hyperactivity disorder: An unsupervised machine learning approach. Appl. Neuropsychol. 23, 1–10. doi: 10.1080/23279095.2024.2343022

  • 97

    Roebuck-Spencer, T. M., Vincent, A. S., Gilliland, K., Johnson, D. R., Cooper, D. B. (2013). Initial clinical validation of an embedded performance validity measure within the automated neuropsychological metrics (ANAM). Arch. Clin. Neuropsychol. 28, 700–710. doi: 10.1093/arclin/act055

  • 98

    Roor, J. J., Peters, M. J., Dandachi-FitzGerald, B., Ponds, R. W. (2024). Performance validity test failure in the clinical population: A systematic review and meta-analysis of prevalence rates. Neuropsychol. Rev. 34, 299–319. doi: 10.1007/s11065-023-09582-7

  • 99

    Rose, F. E., Hall, S., Szalda-Petree, A. D. (1995). Portland digit recognition test-computerized: measuring response latency improves the detection of malingering. Clin. Neuropsychol. 9, 124–134. doi: 10.1080/13854049508401594

  • 100

    Schatz, P., Glatts, C. (2013). “Sandbagging” baseline test performance on ImPACT, without detection, is more difficult than it appears. Arch. Clin. Neuropsychol. 28, 236–244. doi: 10.1093/arclin/act009

  • 101

    Schroeder, R. W., Martin, P. K., Heinrichs, R. J., Baade, L. E. (2019). Research methods in performance validity testing studies: Criterion grouping approach impacts study outcomes. Clin. Neuropsychol. 33, 466–477. doi: 10.1080/13854046.2018.1484517

  • 102

    Schroeder, R. W., Twumasi-Ankrah, P., Baade, L. E., Marshall, P. S. (2012). Reliable digit span: A systematic review and cross-validation study. Assessment 19, 21–30. doi: 10.1177/1073191111428764

  • 103

    Scimeca, L. M., Holbrook, L., Rhoads, T., Cerny, B. M., Jennette, K. J., Resch, Z. J., et al. (2021). Examining Conners continuous performance test-3 (CPT-3) embedded performance validity indicators in an adult clinical sample referred for ADHD evaluation. Dev. Neuropsychol. 46, 347–359. doi: 10.1080/87565641.2021.1951270

  • 104

    Scott, J. C., Moore, T. M., Roalf, D. R., Satterthwaite, T. D., Wolf, D. H., Port, A. M., et al. (2023). Development and application of novel performance validity metrics for computerized neurocognitive batteries. J. Int. Neuropsychol. Soc. 29, 789–797. doi: 10.1017/S1355617722000893

  • 105

    Sharland, M. J., Waring, S. C., Johnson, B. P., Taran, A. M., Rusin, T. A., Pattock, A. M., et al. (2018). Further examination of embedded performance validity indicators for the Conners' Continuous Performance Test and Brief Test of Attention in a large outpatient clinical sample. Clin. Neuropsychol. 32, 98–108. doi: 10.1080/13854046.2017.1332240

  • 106

    Sherman, E. M., Slick, D. J., Iverson, G. L. (2020). Multidimensional malingering criteria for neuropsychological assessment: A 20-year update of the malingered neuropsychological dysfunction criteria. Arch. Clin. Neuropsychol. 35, 735–764. doi: 10.1093/arclin/acaa019

  • 107

    Shura, R. D., Miskey, H. M., Rowland, J. A., Yoash-Gantz, R. E., Denning, J. H. (2016). Embedded performance validity measures with postdeployment veterans: Cross-validation and efficiency with multiple measures. Appl. Neuropsychol. 23, 94–104. doi: 10.1080/23279095.2015.1014556

  • 108

    Siedlik, J. A., Siscos, S., Evans, K., Rolf, A., Gallagher, P., Seeley, J., et al. (2015). Computerized neurocognitive assessments and detection of the malingering athlete. J. Sports Med. Phys. Fitness 56, 1086–1091.

  • 109

    Singh, S., Germine, L. (2021). Technology meets tradition: A hybrid model for implementing digital tools in neuropsychology. Int. Rev. Psychiat. 33, 382–393. doi: 10.1080/09540261.2020.1835839

  • 110

    Slick, D. J., Hoop, G., Strauss, E. (1995). The Victoria Symptom Validity Test. Odessa, FL: Psychological Assessment Resources. doi: 10.1037/t27242-000

  • 111

    Soble, J. R., Alverson, W. A., Phillips, J. I., Critchfield, E. A., Fullen, C., O'Rourke, J. J. F., et al. (2020). Strength in numbers or quality over quantity? Examining the importance of criterion measure selection to define validity groups in performance validity test (PVT) research. Psychol. Inj. Law 13, 44–56. doi: 10.1007/s12207-019-09370-w

  • 112

    Sweet, J. J., Heilbronner, R. L., Morgan, J. E., Larrabee, G. J., Rohling, M. L., Boone, K. B. (2021). American Academy of Clinical Neuropsychology (AACN) 2021 consensus statement on validity assessment: Update of the 2009 AACN consensus conference statement on neuropsychological assessment of effort, response bias, and malingering. Clin. Neuropsychol. 35, 1053–1106. doi: 10.1080/13854046.2021.1896036

  • 113

    Winter, D., Braw, Y. (2022). Validating embedded validity indicators of feigned ADHD-associated cognitive impairment using the MOXO-d-CPT. J. Atten. Disord. 26, 1907–1913. doi: 10.1177/10870547221112947

  • 114

    Zygouris, S., Tsolaki, M. (2015). Computerized cognitive testing for older adults: a review. Am. J. Alzheimer's Dis. Other Dement. 30, 13–28. doi: 10.1177/1533317514522852

Summary

Keywords

performance validity, malinger, feign, digital, artificial intelligence, technology, neuropsychology, computerized

Citation

Finley J-CA (2024) Performance validity testing: the need for digital technology and where to go from here. Front. Psychol. 15:1452462. doi: 10.3389/fpsyg.2024.1452462

Received

20 June 2024

Accepted

29 July 2024

Published

13 August 2024

Volume

15 - 2024

Edited by

Alessio Facchin, Magna Graecia University, Italy

Reviewed by

Ruben Gur, University of Pennsylvania, United States

Tyler M. Moore, University of Pennsylvania, United States, in collaboration with reviewer RG

Rachael L. Ellison, Rosalind Franklin University of Medicine and Science, United States

Copyright

*Correspondence: John-Christopher A. Finley

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
