- 1 State Key Laboratory of Cognitive Science and Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
- 2 Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
Large language models (LLMs) have shown promise in generating personality-tailored persuasive messages, yet their effectiveness remains inconsistent across contexts. This research systematically investigated how the characteristics of recommended products or actions shape the efficacy of LLM-generated personality-tailored persuasion through three experimental studies (N = 618). Study 1 revealed that personality-matching effects were limited and inconsistent when the core features of the recommended product or action were not controlled. Qualitative analysis suggested that uncontrolled semantic variation across personality framings obscured persuasive effects. Study 2 demonstrated that explicitly anchoring messages to core product or action features, while allowing stylistic variation, produced robust personality-matching effects across multiple traits and topics. Study 3 extended findings across diverse domains (health, consumer products, entertainment, prosocial behavior) and confirmed that topic-specific stereotypes systematically influence message effectiveness independent of recipient personality. Messages aligning with widely shared topic expectations (e.g., high-Extraversion framing for music festivals, high-Agreeableness framing for donations) were preferred across audiences regardless of individual traits. These findings reveal two critical boundary conditions for LLM-based personalized persuasion: stabilizing core content through semantic anchoring facilitates the emergence of personality-matching effects, and topic stereotypes create baseline preferences that may amplify or attenuate personalization benefits. Practically, effective implementation requires anchoring core content while modulating style, and evaluating topic stereotypes before applying trait customization. This work clarifies when and how generative AI can reliably enhance persuasive communication across mental health, public health, and consumer domains.
1 Introduction
Persuasive communication exerts a profound and pervasive influence on human life. Within mental health contexts, persuasive messages can motivate individuals to cultivate adaptive habits (e.g., maintaining regular mindfulness practice or attending therapy (1)), facilitate help-seeking (e.g., engaging in counseling or psychiatric care (2, 3)), and modify maladaptive cognitions that undermine psychological well-being (e.g., rumination or self-stigma (4, 5)). Comparable mechanisms underpin persuasive interventions across diverse societal domains, including public health (e.g., vaccination campaigns (6–8)), environmental sustainability (e.g., energy conservation (9, 10)), political participation (e.g., voter turnout (11, 12)), and consumer behavior (e.g., adoption of ethical products (13, 14)). Designing and delivering messages that consistently and effectively shape beliefs, attitudes, and behaviors across these contexts remains a central challenge for both researchers and practitioners.
Extensive research indicates that tailoring persuasive messages to the audience’s psychological characteristics enhances their effectiveness. Specifically, aligning message features with recipients’ enduring traits—such as personality, motivations, attitudes, or identity—can heighten perceived relevance and self-referencing, thereby facilitating attitude and behavioral change (15–17). This alignment is particularly consequential in mental health contexts, where generic interventions frequently suffer from low adherence and disengagement. Recent studies demonstrate that personalization strategies can substantially improve adherence to web-based intervention protocols (1) and foster sustained engagement with digital therapeutic resources (18, 19). Notably, personalization yields stronger effects when it targets psychologically meaningful dimensions rather than relying on superficial cues (20–22). The Big Five personality framework, the dominant taxonomy in personality psychology, has shown strong empirical robustness, cross-cultural replicability, and broad coverage of trait variation (23, 24). Specifically, prior research has shown that Big Five traits are systematically associated with individuals’ responses to persuasive messages (25–28).
The rapid emergence of large language models has opened unprecedented opportunities for automating persuasive message generation. Recent evidence suggests that LLM-produced texts can match or even surpass human-authored content in persuasive impact (29–35). This development raises an important question: Can LLMs be harnessed to generate personalized persuasive messages that account for individual differences, such as personality traits? Emerging empirical findings offer encouraging evidence. For example, Simchon et al. (36) demonstrated that ChatGPT-generated political advertisements congruent with recipients’ Openness were significantly more persuasive than incongruent versions. Extending these findings, Matz et al. (37) conducted four studies comprising seven experiments and found that GPT-generated personalized messages consistently outperformed generic baselines across consumer, political, and health domains. Personality-matched messages were perceived as more persuasive—particularly for Extraversion and Openness—and, in some cases, enhanced participants’ willingness to pay for target products.
Nonetheless, findings on the efficacy of LLM-based personalized persuasion remain mixed. Large-scale experiments have shown that dynamically targeted AI-generated messages do not always outperform non-targeted versions (38). Similarly, Xu and Zhao (39) found that personality-label-based prompting (e.g., “high openness”) elicited personality-matching effects only under limited topic conditions. Taken together, these results suggest that the success of LLM-driven personalization is not unconditional but shaped by identifiable boundary conditions. Given the expanding role of generative AI in shaping persuasive communication in diverse domains, clarifying when and how such personalization succeeds is theoretically and societally imperative (22).
Many factors shape persuasive effectiveness, and the inherent content requirements of the recommended product or action constitute a particularly consequential boundary condition. Persuasion research shows that argument quality—the strength, relevance, and evidentiary support of a message—critically determines effectiveness under conditions of thoughtful processing (40). Yet what counts as strong argumentation varies systematically across domains: health-behavior messages achieve maximal effectiveness when they successfully target perceived barriers, benefits, self-efficacy, and threat perceptions (41, 42); environmental persuasion proves most compelling when messages activate biospheric values and affirm prosocial self-concepts (43); whereas consumer persuasion is strengthened when content aligns with the brand’s dominant concept—whether functional, symbolic, or experiential (44). These domain-specific patterns indicate that each recommended product or action imposes distinct semantic and functional requirements —or “core features,” such as the necessity of highlighting protective efficacy for a vaccine or practical utility for a smartwatch—that constrain which message content can produce persuasive outcomes; thus, even personality-tailored messages should also satisfy these specific content demands to achieve meaningful impact. This observation suggests a potential solution: anchoring the key semantic features of the recommended product or action in the prompt may help stabilize personality-tailored messages while preserving topic-central content. We refer to this prompt-design principle as semantic anchoring.
Concurrently, audiences may hold preexisting stereotyped perceptions of different products or actions. Research on brand and product personality demonstrates that consumers reliably attribute human-like traits to brands and products, forming stable and measurable personality impressions (45–47). Accordingly, some topic-linked impressions can be meaningfully characterized in personality terms and may partially overlap with the Big Five framework (48). Importantly, such topic stereotypes can shape the effectiveness of personality-tailored persuasion by predisposing audiences to experience certain personality framings as more natural, appropriate, or credible than others. For example, health-prevention topics may be perceived as “responsible” or “careful,” whereas high-tech products may be viewed as “curious” or “innovative.” These implicit stereotypes may systematically advantage some personality framings of messages over others, thereby obscuring or amplifying apparent matching effects.
Despite growing evidence that perceptions of the product or action critically shape persuasive outcomes, research on LLM-based personality-tailored persuasion has paid limited attention to this factor. To address this gap, the present work systematically investigates how the characteristics of recommended products or actions influence the effectiveness of LLM-generated personality-tailored persuasion. Across three studies, we employ a stepwise design that progresses from relatively open-ended, trait-consistent prompting (i.e., instructing the LLM to generate messages aligned with a specified Big Five trait) to increasingly constrained generation grounded in topic-central features. We focus on two theoretically grounded questions: (a) whether semantically anchoring prompts by pre-specifying the core features of the recommended product or action helps stabilize personality-matching effects, and (b) whether topic-specific stereotypes systematically influence the relative persuasiveness of different personality framings. By answering these questions, the current research seeks to identify key boundary conditions and influential factors governing LLM-driven personality-tailored persuasion, offering new insights into how such models may be more reliably leveraged for persuasive communication in broad domains. Building on this foundation, we propose two hypotheses.
H1: Fixing the core features of the recommended product or action in the prompt facilitates the emergence of personality-matching effects.
H2: Topic-specific stereotypes influence how audiences respond to personality-tailored persuasive messages.
2 Study 1
Study 1 was designed to examine whether personality-tailored messages generated from previously validated prompts can produce reliable personality-matching effects when the content of the persuasive messages is not further controlled.
2.1 Materials and methods
In Study 1, we recruited 244 participants through Wenjuanxing, the largest online survey platform in China. The inclusion criteria were: (1) age between 18 and 60 years; (2) fluency in Chinese; and (3) provision of informed consent. The exclusion criterion was a self-reported history of any psychiatric disorder. All eligibility criteria were assessed based on self-report due to the online nature of the study. Prior to data analysis, 55 participants were excluded from analyses due to failure to pass at least one of two attention-check items, resulting in a final analytic sample of 189 participants. The mean age of participants was 23.89 years (SD = 6.06), and the sample was approximately balanced by gender (92 males, 97 females). Most participants (95.8%) reported having a bachelor’s degree or higher.
Using the GPT-4 API (version gpt-4-1106-preview) with temperature set to 0.7 to allow stylistic variation and seed fixed at 1234 to support reproducibility, we generated personality-tailored persuasive messages for two representative topics: influenza vaccination (public health) and smartwatches (consumer product). Each topic was paired with two Big Five traits (Openness to Experience and Conscientiousness) at both high and low levels, yielding eight messages in total. We manipulated one trait at a time to ensure analytical clarity in isolating the focal mechanisms of interest. Manipulating multiple traits simultaneously would substantially complicate interpretation, as null effects could stem from ineffective trait-level personalization, cross-trait interactions, or attenuation by topic stereotypes. Examples of the prompts and model-generated messages for the smartwatch condition are shown in Table 1 (see Supplementary Material, Section 1.2, for the complete set of prompts and generated messages). The prompting framework followed the structured, role-based approach proposed by Xu and Zhao (39), which employs personality cues to guide message generation while instructing the model to avoid explicitly referencing these traits in the output.
To enrich the personality cues within this framework, we incorporated trait descriptors inspired by Matz et al. (37)’s application of the Ten-Item Personality Inventory (TIPI; (49)) in personalized persuasion research. Specifically, GPT-4 was prompted to write messages for individuals described as open to new experiences and artistic versus down-to-earth and traditional (Openness), and as dependable and organized versus disorganized and careless (Conscientiousness). This descriptor-based strategy preserved the original prompt structure while providing clearer linguistic cues.
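The factorial structure of the Study 1 stimuli can be made explicit with a short enumeration. The following is a minimal sketch in Python; the topics and trait descriptors are taken from the text, whereas the dictionary layout and the function name `enumerate_conditions` are illustrative assumptions and do not reproduce the authors' materials.

```python
from itertools import product

# Topics and descriptors come from the text above; the data layout and
# the function name are illustrative assumptions.
TOPICS = ["influenza vaccination", "smartwatch"]
DESCRIPTORS = {
    "Openness": {
        "high": "open to new experiences and artistic",
        "low": "down-to-earth and traditional",
    },
    "Conscientiousness": {
        "high": "dependable and organized",
        "low": "disorganized and careless",
    },
}


def enumerate_conditions(topics, descriptors):
    """Return one record per topic x trait x level cell of the design."""
    return [
        {
            "topic": topic,
            "trait": trait,
            "level": level,
            "descriptor": descriptors[trait][level],
        }
        for topic, trait in product(topics, descriptors)
        for level in ("high", "low")
    ]


conditions = enumerate_conditions(TOPICS, DESCRIPTORS)
print(len(conditions))  # 2 topics x 2 traits x 2 levels
```

Because only one trait is manipulated per message, each record carries a single descriptor, which mirrors the one-trait-at-a-time rationale given above.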
To verify that the generated messages conveyed the intended personality orientations, ten graduate students majoring in psychology independently classified each stimulus as reflecting high or low Openness or Conscientiousness. Rater agreement reached 100%, indicating that the personality framings were unambiguously identifiable.
After providing informed consent, participants read paired messages within each topic, with the presentation order of high- and low-trait versions randomized to control for order effects. After each pair, participants evaluated the two messages comparatively on three dimensions: perceived persuasiveness (“This message is persuasive”), attitude change (“After reading this message, I feel more positive toward the topic”), and behavioral inclination (“After reading this message, I am more likely to take the suggested action”). Following Matz et al. (37), each comparison employed an 11-point bipolar scale ranging from 1 (“Message A is more persuasive”) to 11 (“Message B is more persuasive”), with the midpoint of 6 indicating that both messages were perceived as equally persuasive. The bipolar format minimized response substitution bias (50) and provided a direct measure of participants’ relative evaluations of message effectiveness.
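Because presentation order was randomized, raw bipolar ratings presumably need to be aligned before analysis so that higher values always indicate a preference for the high-trait message. The text does not describe this recoding step, so the sketch below is an assumption about how it could be done; the function name `high_trait_preference` is hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 11
MIDPOINT = (SCALE_MIN + SCALE_MAX) // 2  # 6 on the 11-point scale


def high_trait_preference(raw_rating, high_trait_is_message_b):
    """Recode a raw rating so that values above the midpoint favor the
    high-trait message (assumed recoding; not described in the source)."""
    if not SCALE_MIN <= raw_rating <= SCALE_MAX:
        raise ValueError("rating must lie on the 1-11 scale")
    if high_trait_is_message_b:
        return raw_rating
    # Mirror the scale around its midpoint when the high-trait message
    # was shown as Message A.
    return SCALE_MIN + SCALE_MAX - raw_rating


print(high_trait_preference(9, high_trait_is_message_b=False))
```

Mirroring leaves the midpoint unchanged, so a rating of 6 denotes indifference under either presentation order.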
Participants then completed the Chinese version of the Big Five Inventory–44 (BFI–44; (51)) to assess their personality traits on a 5-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). Internal consistency was acceptable for the focal dimensions of Openness (α = .77) and Conscientiousness (α = .85).
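For readers unfamiliar with the reliability index reported above, Cronbach's α for a scale with k items is k/(k − 1) multiplied by one minus the ratio of summed item variances to the variance of the total score. The sketch below computes it on toy data; it assumes that reverse-keyed items have already been reverse-scored, and the toy responses are invented for illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha.

    responses: one list of item scores per participant, with reverse-keyed
    items assumed to be reverse-scored beforehand.
    """
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (not from the study): two items that move together perfectly.
toy = [[1, 2], [2, 3], [3, 4]]
print(cronbach_alpha(toy))
```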
To test the personality-matching hypothesis (H1), participants were categorized into high and low groups for each focal trait using a median-split approach. This approach is a commonly used and broadly accepted analytical practice in psychological research (52–55), enabling a clear test of the theoretically specified personality-matching hypotheses. Participants whose scores fell exactly at the median were excluded from the corresponding trait analysis to ensure clear group differentiation. To examine topic stereotype effects (H2), one-sample tests were used to assess whether overall message preferences (including all participants) deviated from the scale midpoint. All statistical tests were two-tailed with an alpha level of .05, and false discovery rate (FDR) correction was applied to account for multiple comparisons.
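The median-split comparison described above can be sketched as follows. This is a minimal illustration on toy data using only the Python standard library. The unequal-variance (Welch) form of the t test and the pooled-SD definition of Cohen's d are assumptions, since the text does not name the variants used. The p value is omitted because the standard library has no t-distribution function.

```python
import math
from statistics import mean, median, variance


def median_split(scores):
    """Map participant index to 'high' or 'low'; scores exactly at the
    median are dropped, as described in the text."""
    cut = median(scores)
    return {
        i: ("high" if s > cut else "low")
        for i, s in enumerate(scores)
        if s != cut
    }


def welch_t(high, low):
    """Return Welch's t, Welch-Satterthwaite df, and Cohen's d (pooled SD)."""
    n1, n2 = len(high), len(low)
    v1, v2 = variance(high), variance(low)
    se2 = v1 / n1 + v2 / n2
    diff = mean(high) - mean(low)
    t = diff / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return t, df, diff / pooled_sd


# Toy data (not from the study): trait scores and high-trait preferences.
trait = [1.0, 2.0, 3.0, 4.0, 5.0]
preference = [4, 5, 6, 8, 9]

groups = median_split(trait)
high = [preference[i] for i, g in groups.items() if g == "high"]
low = [preference[i] for i, g in groups.items() if g == "low"]
t, df, d = welch_t(high, low)
print(round(t, 2), round(df, 2), round(d, 2))
```

A positive t here means that high-trait participants preferred the high-trait message more than low-trait participants did, which is the direction predicted by the matching hypothesis.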
2.2 Results and discussion
To test the personality-matching effect, participants were median-split on the focal traits, and independent-samples t tests were conducted on the high-trait message preference scores for each condition. Figure 1 displays the distributions and group comparisons across all trait–topic combinations; detailed statistics are reported in Supplementary Material Table S4. After false discovery rate (FDR) correction, a significant matching effect emerged only for the influenza-vaccine messages tailored to Conscientiousness. Participants high in Conscientiousness showed stronger preferences for the high-Conscientiousness message than did low-Conscientiousness participants on all three outcomes: perceived persuasiveness, t(173.75) = 4.38, p < .001, d = 0.66; attitude change, t(173.98) = 4.28, p < .001, d = 0.64; and behavioral inclination, t(173.98) = 4.79, p < .001, d = 0.72. No significant effects were observed for any other trait–topic combination.
Figure 1. Effects of trait-message congruence on persuasion outcomes in Study 1. Raincloud plots display distributions, individual data points, and boxplots for high-trait participants (green) versus low-trait participants (orange) across different message topics and personality traits. (A–F) show smartwatch messages; (G–L) show influenza vaccine messages. P-values from independent samples t-tests comparing preference for high-trait messages between groups are displayed above each comparison. *p<.05, **p<.01, ***p<.001.
To assess the possible effects of topic stereotypes, one-sample t tests compared participants’ mean preference scores for each condition against the midpoint (6) of the 11-point scale, where scores above 6 indicated a preference for the high-trait message (see Supplementary Table S5 in the Supplementary Material for full results). In the smartwatch topic, preference scores for messages tailored to Openness and Conscientiousness were significantly below the midpoint across all outcomes (all ps < .001, ds = 0.24–0.43), indicating an overall advantage for low-Openness and low-Conscientiousness messages. In contrast, in the influenza-vaccine topic, high-Conscientiousness messages showed a small but reliable advantage in perceived persuasiveness, t(188) = 2.24, p = .026, d = 0.16, whereas low-Openness messages were strongly preferred across all outcomes (all ps < .001, ds = 0.82–1.01). These patterns remained significant after FDR correction. When controlling for demographic variables including age and gender, the overall pattern of results remained consistent, providing additional evidence for the robustness of the observed effects.
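The one-sample comparison against the scale midpoint can be sketched in a few lines. For a one-sample test, Cohen's d is the mean deviation from the midpoint divided by the sample SD, and t equals d multiplied by the square root of n. The toy scores below are invented; the final check uses the reported t(188) = 2.24 with n = 189 and recovers the reported effect size of 0.16. The function names are illustrative assumptions.

```python
import math
from statistics import mean, stdev

MIDPOINT = 6  # neutral point of the 11-point bipolar scale


def one_sample_vs_midpoint(scores, midpoint=MIDPOINT):
    """Return t, df, and Cohen's d for a one-sample test against the midpoint."""
    n = len(scores)
    d = (mean(scores) - midpoint) / stdev(scores)
    return d * math.sqrt(n), n - 1, d


def d_from_t(t, n):
    """Recover the one-sample Cohen's d from a reported t and sample size."""
    return t / math.sqrt(n)


# Toy data (not from the study).
t, df, d = one_sample_vs_midpoint([7, 8, 6, 9, 5])
print(round(t, 3), df, round(d, 3))

# Consistency check against the values reported in the text.
print(round(d_from_t(2.24, 189), 2))
```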
Study 1 examined whether personality-tailored messages generated from previously validated prompts could produce reliable matching effects when the substantive features of the message content were not controlled. The expected matching patterns did not consistently emerge, with a significant effect appearing only in the Conscientiousness influenza vaccination condition. To better understand this inconsistency, we conducted qualitative inspections of the generated messages, which revealed notable variation in how different personality framings represented the product or action’s essential features. For instance, in the smartwatch condition (Table 1), the low-Openness message foregrounded concrete utilities such as health monitoring and mobile payment (e.g., “one-tap access to health data, calls, and payments”), whereas the high-Openness message shifted toward symbolic and experiential themes (e.g., “a creative extension of your individuality” and “a timepiece of art”). Although stylistically appropriate, the latter paid comparatively little attention to the functional benefits that typically anchor persuasion in this topic: consumers generally choose smartwatches for their core practical functions rather than their ability to express individuality. This example illustrates how, in the absence of content constraints, personality-tailored messages can diverge in semantic focus. Such divergence may weaken argument strength and help explain why personality-matching effects were difficult to detect.
In addition to these inconsistencies, the results also showed cross-group preferences. In the smartwatch topic, low-Openness and low-Conscientiousness framings were generally preferred, whereas in the influenza-vaccine topic, high-Conscientiousness and low-Openness framings received broader endorsement. Given that personality-matching effects were not consistently detected, it remains unclear whether these audience-wide preferences reflect topic stereotypes or result from unsuccessful personality tailoring. If personality-tailored messages inadequately conveyed the core features of the recommended product or action, this could undermine their persuasiveness and similarly produce such generalized preferences. Further experimentation is needed to distinguish between these potential sources.
Taken together, the findings of Study 1 suggest that constraining the core features of the recommended product or action may be a useful—and perhaps necessary—step for enabling LLMs to generate effective personality-tailored persuasive messages. Future research should test the efficacy of this approach and further examine whether stereotype-consistent baseline preferences reliably emerge across topics.
3 Study 2
Building on the findings of Study 1, Study 2 tested whether anchoring the core features of the recommended product or action would enable personality-tailored persuasive messages to produce stable personality-matching effects.
3.1 Materials and methods
Study 2 replicated the procedure of Study 1 with one key methodological improvement: semantic anchoring. Prior LLM-based personality-tailored persuasion research has typically allowed free message generation, which may simultaneously vary both substantive content and expressive style (37, 38). To control for substantive content and isolate stylistic variations, the high- and low-trait messages were constrained to convey identical core features of the recommended product or action, ensuring that the observed personality-targeted expressions did not alter the core points of the message. For example, in the smartwatch condition, we added the instruction “highlight concrete daily-management functions such as health monitoring, answering calls, and mobile payment” to the prompt, thereby fixing the key product features communicated across all generated persuasive messages (Table 2; see Supplementary Material, Section 2.2 for all materials).
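The anchoring principle can be illustrated with a small prompt builder. In the sketch below, the trait descriptors and the smartwatch anchoring instruction are quoted from the text, but the surrounding prompt wording and the function `build_prompt` are hypothetical and do not reproduce the authors' actual prompts (which appear in their Table 2 and Supplementary Material).

```python
def build_prompt(topic, descriptor, core_features=None):
    """Compose a generation prompt.

    All wording other than the descriptor and the anchoring instruction
    is a hypothetical placeholder, not the authors' prompt text.
    """
    lines = [
        f"Write a short persuasive message recommending {topic}.",
        f"The reader is someone who is {descriptor}.",
        "Do not mention personality traits explicitly.",
    ]
    if core_features:
        # Semantic anchoring: the same instruction is appended to every
        # trait version, so only style is free to vary.
        lines.append(f"In every version, {core_features}.")
    return "\n".join(lines)


ANCHOR = (
    "highlight concrete daily-management functions such as health "
    "monitoring, answering calls, and mobile payment"
)

high = build_prompt("a smartwatch", "open to new experiences and artistic", ANCHOR)
low = build_prompt("a smartwatch", "down-to-earth and traditional", ANCHOR)
unanchored = build_prompt("a smartwatch", "open to new experiences and artistic")
print(high)
```

Omitting `core_features` corresponds to the unconstrained generation of Study 1, whereas passing the same anchor to both trait versions corresponds to the Study 2 manipulation.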
We recruited 269 participants through Wenjuanxing. The inclusion and exclusion criteria were identical to those in Study 1. Prior to data analysis, participants who failed at least one of two attention-check items were excluded (n = 50), yielding a final analytic sample of 219 participants (116 males, 103 females; mean age = 26.37 years, SD = 4.21). Most participants (99.1%) held a bachelor’s degree or higher.
All stimuli in this study were pre-validated using the same expert-judgment procedure employed in Study 1, ensuring that each message clearly expressed its intended personality orientation.
As in Study 1, participants evaluated paired messages on an 11-point bipolar scale assessing relative persuasiveness, attitude change, and behavioral inclination. They then completed the Chinese BFI-44 (51), which demonstrated good internal consistency for Conscientiousness (α = .90) and Openness (α = .78).
Data analysis procedures followed the same approach as Study 1, using median-split categorization (excluding participants at the median) to test personality-matching effects and one-sample t tests to examine topic stereotypes, with FDR correction applied.
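The text does not state which FDR procedure was used. The sketch below assumes the Benjamini-Hochberg step-up procedure, the most common choice, and applies it to invented p values. The function name is illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed FDR method).

    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p value satisfies
    # p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    # Reject every hypothesis ranked at or below max_k, including any
    # whose own threshold was missed (the step-up property).
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Toy p values (not from the study).
print(benjamini_hochberg([0.001, 0.03, 0.035, 0.2]))
```

In the toy example the second p value (0.03) exceeds its own threshold of 0.025 but is still rejected, because the third p value (0.035) falls below its threshold of 0.0375.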
3.2 Results and discussion
Compared with Study 1, where the personality-matching effect was limited to the vaccine–Conscientiousness condition, Study 2 revealed a broader pattern once message content was standardized (Figure 2; detailed statistics in Supplementary Material Table S9). Independent-samples t tests showed significant personality-matching effects across nearly all conditions (ps < .05, ds = 0.32–0.45). The smartwatch–Conscientiousness condition also exhibited a small effect in the predicted direction (d = 0.26), although the associated test did not reach statistical significance, t(217) = 1.80, p = .072.
Figure 2. Effects of trait-message congruence on persuasion outcomes in Study 2. Raincloud plots display distributions, individual data points, and boxplots for high-trait participants (green) versus low-trait participants (orange) across different message topics and personality traits. (A–F) show smartwatch messages; (G–L) show influenza vaccine messages. P-values from independent samples t-tests comparing preference for high-trait messages between groups are displayed above each comparison. *p<.05, **p<.01, ***p<.001.
One-sample t tests further confirmed cross-group preferences for messages framed according to specific personality traits (see Supplementary Table S10 in the Supplementary Material for full results). For the smartwatch topic, participants showed an overall preference for low-Openness messages (ts(218) = to , ps < .010, ds = 0.17–0.19). For the vaccine topic, participants showed overall preferences for both low-Openness (ts(218) = to , ps < .001, ds = 0.41–0.43) and high-Conscientiousness messages (ts(218) = 7.07–7.25, ps < .001, ds = 0.48–0.49). All effects remained significant after FDR correction and were unaffected by demographic controls.
Study 2 tested H1 by examining whether fixing the core features of the recommended product or action would allow LLM-generated personality-matching effects to emerge more reliably. The results strongly supported this hypothesis: once semantic drift was minimized, stylistic variations aligned with personality traits produced stable matching effects across topics. Building on this foundation of successful personality tailoring and consistent matching effects, the cross-group preferences observed in Study 2 are more plausibly interpreted as stable, stereotype-consistent preference patterns. For instance, low-Openness framings for smartwatches were rated more favorably overall—likely reflecting the perception of smartwatches as primarily functional, utility-oriented devices. Likewise, high-Conscientiousness framings for influenza vaccination received higher ratings across recipients, consistent with the widely shared view of vaccination as a responsible, duty-oriented health behavior. These tendencies parallel the topic stereotypes identified in Study 1, suggesting that such expectations can shape how audiences respond to the messages independently of their own personality traits, even when semantic content is held constant.
As in Study 1, Study 2 focused on two representative persuasive topics from public health and consumer domains—receiving an influenza vaccine and purchasing a smartwatch—and examined two personality traits: Conscientiousness and Openness. Important questions remain for future research: Does semantic anchoring remain effective for LLM-based personality customization across other topics and traits? Do topic stereotype effects persist across a broader range of persuasion contexts? Further empirical investigation is needed to address these questions and establish the generalizability of the present findings.
4 Study 3
Study 3 was designed to examine whether topic-specific stereotypes systematically shape responses to personality-tailored persuasive messages across a broader set of topics (H2), while also providing a further test of whether personality-matching effects could be stably achieved in broader conditions when message content is anchored to the core features of the recommended product or action (H1).
4.1 Materials and methods
Study 3 employed the same design and procedure as Study 2, extending the paradigm to include multiple topics and additional personality traits. We recruited 269 participants through Wenjuanxing. The inclusion and exclusion criteria were identical to those in Study 1. Prior to data analysis, participants who failed at least one of two attention-check items were excluded (n = 59), yielding a final analytic sample of 210 participants. The final sample (102 males, 108 females; mean age = 24.18 years, SD = 2.79) was highly educated (99.5% held a bachelor’s degree or higher).
Five persuasive topics recommending different products or actions—each likely reflecting prototypical personality-related stereotypes—were identified through a priori consensus among three psychologists with expertise in personality and social psychology, who independently reviewed the topics and agreed on their associations with the Big Five traits. The selected topics included: photography workshops (high Openness), Pop Mart blind boxes (low Conscientiousness), Strawberry Music Festival (high Extraversion), disaster relief donations (high Agreeableness), and C’estbon bottled water (neutral). For each stereotype-associated topic, GPT-4 generated both high- and low-trait messages; for the neutral topic, high- and low-trait versions were created for Openness and Conscientiousness, yielding 12 messages in total (see Supplementary Material, Section S3.2 for all prompts and generated messages).
Message generation followed the same prompting framework as in Study 2, using semantic anchoring to ensure content equivalence across conditions. The core features defined by subject-matter experts included: creative use of light, composition aesthetics, and artistic vision development for photography workshops; surprise, trendy collectibles, and effortless enjoyment through the blind box mechanism for Pop Mart blind boxes; diverse atmosphere, open and free environment, and immersive musical experience for the music festival; post-disaster recovery support and concrete assistance that brings hope to affected communities for donations; and pure water quality, refreshing taste, and reliable health and safety for bottled water. All stimuli were pre-validated following the same expert-judgment procedure as in Study 1 to confirm personality-trait clarity.
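The Study 3 design yields 12 messages and, with three outcome measures per topic–trait pair, 18 analysis conditions. The short sketch below reproduces these counts from the topic list given in the text; the variable names and data layout are illustrative assumptions.

```python
OUTCOMES = ["perceived persuasiveness", "attitude change", "behavioral inclination"]

# Topic-to-trait pairings as listed in the text.
STEREOTYPE_TOPICS = {
    "photography workshops": "Openness",
    "Pop Mart blind boxes": "Conscientiousness",
    "Strawberry Music Festival": "Extraversion",
    "disaster relief donations": "Agreeableness",
}
NEUTRAL_TOPIC = "C'estbon bottled water"
NEUTRAL_TRAITS = ["Openness", "Conscientiousness"]

# One pair per stereotype topic, two pairs for the neutral topic.
pairs = list(STEREOTYPE_TOPICS.items()) + [
    (NEUTRAL_TOPIC, trait) for trait in NEUTRAL_TRAITS
]

messages = [(topic, trait, level) for topic, trait in pairs for level in ("high", "low")]
conditions = [(topic, trait, outcome) for topic, trait in pairs for outcome in OUTCOMES]

print(len(messages), len(conditions))
```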
Measures and procedures were identical to those in Study 2. Participants compared paired high- and low-trait messages on three 11-point bipolar scales assessing perceived persuasiveness, attitude change, and behavioral inclination (37, 50). They then completed the Chinese BFI–44 (51) assessing Openness (), Conscientiousness (), Extraversion (), and Agreeableness ().
Data analysis followed the same procedures as in Studies 1 and 2. Median-split categorization (excluding participants at the median) was used to test personality-matching effects, and one-sample t tests examined topic stereotype effects across all participants.
4.2 Results and discussion
Independent-samples t tests revealed significant personality-matching effects in 16 of the 18 experimental conditions (ps < .05), all of which remained significant after FDR correction (Figure 3; detailed statistics in Supplementary Material Table S14). Effect sizes ranged from to , indicating overall medium effects. The two nonsignificant conditions exhibited the same directional pattern.
Figure 3. Effects of trait-message congruence on persuasion outcomes in Study 3. Raincloud plots display distributions, individual data points, and boxplots for high-trait participants (green) versus low-trait participants (orange) across different message topics and personality traits. (A–C) show photography workshop messages; (D–F) show blind box messages; (G–I) show music festival messages; (J–L) show donation messages; (M–R) show bottled water messages. P-values from independent samples t-tests comparing preference for high-trait messages between groups are displayed above each comparison. *p<.05, **p<.01, ***p<.001.
One-sample t tests were conducted to examine the effects of topic stereotypes, testing whether message preferences significantly differed from the scale midpoint of 6 (see Supplementary Table S15 in the Supplementary Material for full results). Thirteen of the 18 conditions were significant (ps < .05) and remained so after FDR correction. Effect sizes ranged from d = 0.18 to 0.74, suggesting that certain personality-tailored messages demonstrated persuasive advantages across audiences. All four topics theoretically associated with specific personality traits, namely photography workshops (high Openness), music festivals (high Extraversion), disaster donations (high Agreeableness), and blind boxes (low Conscientiousness), showed significant preferences consistent with theoretical predictions. Specifically, high-Openness messages for photography workshops were rated more favorably (ts(209) = 4.48–5.00, ps < .001, ds = 0.31–0.35), as were high-Extraversion messages for music festivals (ts(209) = 9.37–10.67, ps < .001, ds = 0.65–0.74) and high-Agreeableness messages for disaster donations (ts(209) = 7.09–7.56, ps < .001, ds = 0.49–0.52), whereas the blind box topic favored low-Conscientiousness messages (ps < .05, – ). In contrast, message preferences for the bottled water topic showed little deviation from the midpoint (all ds < 0.20), consistent with its theoretically neutral nature. After controlling for demographic covariates (age and gender), the overall pattern of results remained unchanged, confirming the robustness of both personality-matching and topic stereotype effects.
Study 3 extended the examination to a broader set of topics and personality traits and asked whether the patterns observed in the earlier studies would generalize. Across this wider range, we again found stable personality-matching effects under semantically anchored prompts, further corroborating H1. At the same time, clear stereotype-consistent preference patterns emerged: framing styles such as high-Openness wording for photography workshops, high-Extraversion wording for music festivals, high-Agreeableness wording for disaster-relief donations, and low-Conscientiousness wording for blind boxes received more favorable evaluations across recipients regardless of their own personalities. These topic stereotype effects indicate that message effectiveness is shaped not only by personality congruence with the recipient but also by the degree to which a framing style fits widely shared expectations about the topic, providing strong support for H2.
5 General discussion
Across three studies, the present research systematically examined how the effectiveness of LLM-generated personality-tailored persuasion is shaped by content constraints and topic-specific expectations. Study 1 showed that when the core features of the recommended product or action were not controlled, some personality-targeted versions drifted away from the topic’s functional center, and the resulting cross-version variability in argument strength contributed to limited and inconsistent personality-matching effects. At the same time, Study 1 revealed cross-group preferences, though it remained unclear whether these reflected topic stereotypes or resulted from unsuccessful personality tailoring.
Study 2 addressed the content-drift problem by anchoring the core features of the recommended product or action. Under these semantically aligned conditions, robust personality-matching effects emerged across multiple traits and topics, providing clear support for H1. Study 2 also provided clear evidence of topic stereotype effects, demonstrating that topic-specific expectations systematically influence the persuasive impact of messages—independently of recipient personality—even when semantic content is held constant.
Study 3 extended the investigation to a broader and more heterogeneous set of topics. The persistence of personality-matching effects across nearly all conditions demonstrates that semantic anchoring could stabilize these effects in broader persuasive contexts, offering additional confirmation of H1. Crucially, these observed personality-matching effects reflect systematic tendencies across relative personality levels within our samples. At the same time, Study 3 showed that the stereotype effects appear to be a pervasive phenomenon in LLM-generated personality-tailored persuasion, offering strong and convergent evidence for H2.
5.1 Why semantic anchoring stabilizes personality-matching effects
A central finding of this work is that personality-matching effects emerged more reliably when message content was anchored to the core features of the recommended product or action. This result suggests that semantic anchoring operates as a methodological safeguard against LLM-specific semantic drift, helping to ensure that observed matching effects are more likely to reflect stylistic congruence rather than unintended variation in argument quality. In contrast, without such anchoring, trait-targeted message variants may differentially emphasize core benefits versus peripheral appeals, leading to inconsistent argument strength that obscures matching effects. This key improvement to the standard practice in prior LLM-based personality-tailored persuasion research (37, 38) offers an innovative way to better leverage the message generation capabilities of large language models.
Mechanistically, this drift may stem from how LLMs interpret and expand trait-descriptive prompts. The TIPI adjective pairs used in our instructions—such as “open to new experiences and artistic” versus “down-to-earth and traditional” for Openness or “dependable and organized” versus “disorganized and careless” for Conscientiousness (49)—function as compact semantic cues. Rather than treating these cues as markers of deeper motivational orientations, LLMs process them primarily as stylistic signals. Prior work shows that LLMs rely heavily on pattern completion and surface-level linguistic markers (56, 57). Thus, when prompted with adjectives such as “artistic” or “dependable,” the model expands these terms through highly patterned expressions (e.g., “explore the unknown,” “discover life’s artistry,” “pursue excellence,” “enhance efficiency”) that recur across topics regardless of their functional demands. This pattern-matching process reliably produces stylistic alignment with the trait cues but does not ensure that the content meaningfully engages with the core features of the recommended product or action.
Crucially, nothing in this process provides a mechanism for linking personality traits to the motivational routes or argument structures that would be persuasive for a given behavior. TIPI descriptors do not specify which concrete concerns, benefits, or facilitators are most compelling for people high versus low in a trait. Correspondingly, current LLMs lack an inference system capable of constructing a structured chain such as trait cue → motivational orientation → functionally strong arguments.
Instead, they rely on unambiguous stylistic markers—such as symbolic phrases like “timepiece of art” to signal Openness, or functional terms like “efficient” and “excellence” for Conscientiousness—to represent personality. This reliance helps explain why the generated messages were easily identifiable by expert raters (Study 1), yet did not yield consistent persuasive effects. In the absence of such a mapping, generation is guided by semantic proximity rather than by psychologically meaningful personality–behavior linkages.
By contrast, human-authored personalization explicitly fills in this missing layer. In Hirsh et al. (25), designers first identified the motivational priorities theoretically associated with each Big Five trait—for example, order, efficiency, and achievement for Conscientiousness, or creativity, novelty, and intellectual exploration for Openness—and then crafted arguments that expressed these priorities while preserving coherence with the product’s core features. The resulting messages differed in motivational framing but remained anchored to the topic’s functional benefits, ensuring both relevance and persuasive adequacy.
Consequently, while human-authored messages achieve motivational alignment without compromising argument strength, unconstrained LLM outputs tend to exhibit content drift and stylistic templating. The persuasive materials used in our studies illustrate this mechanism clearly. Conceptually, smartwatches are typically positioned as functional products that solve practical problems—monitoring health data, managing notifications, and enabling mobile payments. Yet the LLM-generated messages diverged in how these core features were represented: the low-Openness version foregrounded these practical utilities, whereas the high-Openness version, driven by cues such as “open to new experiences” and “artistic,” shifted toward symbolic and experiential themes with limited attention to essential functions. This semantic drift moves the message away from the functional script that usually anchors persuasion in this category, weakening argument strength and making congruence less consequential. In line with evidence that matched messages are persuasive only when they offer strong, functionally relevant arguments—because motivational congruence encourages more systematic processing of the content (58)—such stylistically personalized but substantively weak messages may fail to produce personality-matching effects.
Anchoring the core features of the recommended product or action in the prompt directly targets this problem. By first specifying the functional backbone of the message—such as health monitoring, call management, mobile payment for smartwatches, or disease prevention, personal protection for vaccination—all generated messages are forced to cover the same set of core benefits and risks, regardless of personality framing. Within this constrained semantic space, the trait descriptors primarily modulate how these shared elements are presented: high-Openness messages can describe mobile payment as “completing a payment with a light touch at a café corner,” whereas low-Openness messages can stress “a convenient payment feature that handles everyday shopping with ease”; high-Conscientiousness messages can frame vaccination as “a responsible, scientifically grounded way to prevent flu and protect both your own health and the health of others,” whereas low-Conscientiousness messages can emphasize “preventing flu with a quick shot so you can avoid unnecessary trouble in your daily routine.” In short, content anchoring reduces variance in argument quality and topic relevance, allowing personality-congruent stylistic cues to operate on a more even baseline.
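The anchoring logic described above can be sketched as a small prompt builder. The trait descriptors and core features are the ones quoted in this article; the instruction wording, the function names, and the verbatim-match drift check are hypothetical illustrations, not the prompts or procedures actually used in the studies.

```python
# Trait descriptors quoted in the article (only these two traits are spelled out).
TRAIT_DESCRIPTORS = {
    ("openness", "high"): "open to new experiences and artistic",
    ("openness", "low"): "down-to-earth and traditional",
    ("conscientiousness", "high"): "dependable and organized",
    ("conscientiousness", "low"): "disorganized and careless",
}

# Core features named in the article for two of its topics.
CORE_FEATURES = {
    "smartwatch": ["health monitoring", "call management", "mobile payment"],
    "vaccination": ["disease prevention", "personal protection"],
}


def build_anchored_prompt(topic, trait, level):
    """Assemble a prompt that fixes core content and varies only style.

    The instruction wording is hypothetical, not the authors' prompt.
    """
    features = CORE_FEATURES[topic]
    descriptor = TRAIT_DESCRIPTORS[(trait, level)]
    feature_lines = "\n".join(f"- {feature}" for feature in features)
    return (
        f"Write a short persuasive message recommending: {topic}.\n"
        f"The message must cover every one of these core features:\n"
        f"{feature_lines}\n"
        f"Do not add or drop features. Adapt only tone and wording for a "
        f"reader who is {descriptor}."
    )


def missing_features(message, topic):
    """Crude drift check: list core features not mentioned verbatim."""
    text = message.lower()
    return [f for f in CORE_FEATURES[topic] if f not in text]


prompt = build_anchored_prompt("smartwatch", "openness", "high")
```

Keeping the feature list separate from the trait descriptor mirrors the division of labor in the text: the backbone is identical across variants, and only the final style instruction changes. A verbatim check such as `missing_features` would miss paraphrases, so in practice it would only be a first-pass filter before human review.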
5.2 Topic stereotypes: a key determinant of personality-tailored persuasion
Across studies, we also observed a second form of systematic variation in message effectiveness—one that arises not from individual differences but from the shared social meaning associated with each persuasive topic. We refer to this pattern as the topic stereotype effect. By “topic stereotypes,” we mean the culturally shared expectations about how a given topic is typically talked about, and what motivational or stylistic orientation “fits” the topic (59). These expectations often resemble personality-like traits along the Big Five dimensions. For instance, disaster relief donations are widely associated with warmth, care, and prosocial concern; and music festivals are linked to energy, sociability, and excitement. When the stylistic framing of a message aligns with these socially shared stereotypes, the message tends to be perceived as more appropriate and compelling—even when such framing conflicts with the recipient’s own personality profile.
Several psychological mechanisms help explain why topic–message congruence produces reliable persuasive benefits. First, different topics carry specific functional scripts: health topics emphasize safety and responsibility, while experiential topics emphasize novelty and exploration. When messages align with these scripts using matching personality styles, audiences experience a sense of appropriateness—a “feels right” effect consistent with regulatory fit theory (60, 61). Second, topic stereotypes function as shared schemas that shape expectations about how topics should be presented. Schema-congruent messages are processed more fluently, and this ease enhances evaluations (59, 62). This also explains why neutral topics show no overall advantage: without alignable schemas, stylistic cues cannot generate fluency gains. Third, for value-laden topics such as prosocial or risk-management issues, messages framed with high Agreeableness or Conscientiousness resonate more strongly with societal value frameworks, enhancing both legitimacy and relevance (63–66). Finally, cultural value frameworks may amplify certain stereotypes. In Chinese contexts, for example, public health and collective welfare topics are evaluated through strongly normative lenses, which may further elevate the persuasiveness of high-Conscientiousness or high-Agreeableness framings (67, 68).
The origins of the topic stereotype effect are fundamentally distinct from those of personality-matching effects. Personality-matching effects arise from individual differences—namely, recipients’ differential preferences aligned with their trait dispositions, whereas topic stereotype effects stem from shared contextual schemas—culturally shaped expectations about how certain topics are typically discussed. These schemas apply uniformly across the entire audience, meaning that stylistic alignment benefits all recipients rather than a trait-defined subset. Consequently, topic stereotype effects often exert broader influence on persuasive outcomes. Indeed, across our studies, the effect sizes associated with topic stereotypes were often larger than those for personality matching, underscoring the substantial and non-negligible role of topic stereotypes in personality-tailored persuasion.
5.3 Practical implications
The present findings also clarify how topic content can be operationalized in applied LLM-based personalization systems. Anchoring the core features of the recommended product or action provides a practical solution to the semantic drift that often arises when LLMs over-extend trait-descriptive cues. In real-world applications—such as product promotion, public-service messaging or mental health communication—designers can treat content anchoring as a structural scaffold: by first articulating the topic’s essential benefits, risks, and use-cases, they ensure that all generated variants share a coherent argumentative foundation. Within this stabilized semantic space, personality cues can then adjust tone and stylistic framing without altering the message’s functional meaning. This content-anchored personalization approach offers a feasible and scalable pathway for deploying personality-tailored persuasion with current-generation LLMs, which excel at stylistic modulation but require structural guidance to maintain motivationally grounded argumentation.
Furthermore, the interplay between topic stereotypes and personality matching offers potential guidance for strategic prioritization: When might practitioners invest in personality customization versus relying on stereotype-congruent messaging? Our findings suggest that the answer may depend on the strength of the stereotypes associated with the topic.
For topics with strong, culturally shared stereotypes (e.g., music festivals), our results indicate a general baseline preference for the stereotype-congruent style across the population. In such cases, practitioners may consider prioritizing stereotype-congruent messaging as a reasonable baseline to ensure stylistic legitimacy and reduce the risk of perceived inappropriateness. For instance, in mental health contexts, therapeutic communication is normatively anchored in a stereotype of “Warmth,” emphasizing empathy, emotional support, and non-judgment. By contrast, campaigns aimed at destigmatizing mental illness or promoting social welfare may align more naturally with stereotypes of “Responsibility” or “Competence,” highlighting civic duty and factual clarification. In these settings, personalization appears most beneficial for high-congruence segments, whereas for individuals whose traits conflict with the topic stereotype, adhering to the dominant normative style (e.g., maintaining Warmth in therapeutic messaging) is likely safer than aggressive personality tailoring, which may violate expectations of care and thereby undermine trust, perceived appropriateness, and adherence.
By contrast, when topics lack a strong, culturally shared stereotype (e.g., Bottled Water), where no dominant stylistic norm exists, personality customization may play a more central role. In the absence of a universally effective “default” style, LLM-driven tailoring may offer a viable path to persuasive effectiveness by adjusting the framing to resonate with divergent audience preferences. To implement this tiered approach, practitioners might consider gauging the baseline strength of topic stereotypes—potentially through pilot testing to detect dominant stylistic preferences or through exploratory analyses of historical engagement patterns—to inform whether personalization should serve as a primary consideration (in neutral contexts) or an incremental optimization.
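One way to make this tiered approach concrete is a pilot-based decision rule, sketched below. Everything here is an assumption for illustration: the function name, the strategy labels, the bipolar preference scale with midpoint 6, and especially the effect-size cutoff, which would need to be calibrated for a given application.

```python
import math
import statistics


def choose_strategy(pilot_ratings, midpoint=6.0, d_threshold=0.2):
    """Return (strategy, d) from pilot preference ratings for one topic.

    pilot_ratings: ratings on a bipolar scale where values above the midpoint
    favor the candidate stereotype-congruent style (hypothetical setup).
    d_threshold: hypothetical cutoff for calling a stylistic norm "dominant".
    """
    mean = statistics.fmean(pilot_ratings)
    sd = statistics.stdev(pilot_ratings)
    if sd == 0:
        # No spread: any offset from the midpoint counts as a maximal effect.
        d = 0.0 if mean == midpoint else math.copysign(math.inf, mean - midpoint)
    else:
        d = (mean - midpoint) / sd
    # A large |d| in either direction signals a shared default style.
    if abs(d) >= d_threshold:
        return "stereotype_baseline", d
    return "personality_tailoring", d


strong_topic = choose_strategy([9, 8, 10, 7, 9, 8])
neutral_topic = choose_strategy([5, 7, 6, 6, 5, 7])
```

With the toy ratings above, the first topic is routed to a stereotype-congruent baseline and the second to personality tailoring. Effect sizes from small pilots are noisy, so a rule like this is better read as a screening heuristic than as a substitute for a properly powered norming study.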
At the same time, it is important to delineate the limits of this strategic framework for applied deployment, as the translational relevance of these insights may vary across domains that differ in consequentiality. For example, mental health communication spans a broad spectrum, ranging from relatively low-consequential contexts (e.g., public psychoeducation and the promotion of self-care behaviors) to more consequential situations (e.g., encouraging individuals to initiate, adhere to, or persist in a specific treatment). For the former, our findings offer relatively direct design guidance: content anchoring can ensure a coherent informational foundation, while adherence to topic-appropriate stylistic norms can help maintain perceived legitimacy and approachability. For the latter, however, effective personalization is likely to require a more holistic consideration of factors beyond audience personality alone, including clinical severity and risk, stage of change, and relational context. Accordingly, our results are best interpreted as most applicable to lowering the entry barrier for early-stage mental-health-related communication and scalable support, rather than as a standalone solution for the more complex, interactive dynamics characteristic of psychotherapy or treatment decision-making.
5.4 Limitations and future directions
The present study has several limitations that may inspire future work. First, with respect to measurement and analysis, the dependent measures relied primarily on self-reported persuasiveness and intentions, which are vulnerable to social desirability biases (69) and may not fully translate into actual behavior. Future research could strengthen external validity by incorporating behavioral field data, such as click-through or choice records. In addition, the use of median splits to categorize participants depends on the specific distribution of the current sample. While this approach captures relative congruency, supplementary analyses using continuous personality scores (see Supplementary Tables S1-S3, S6-S8, S11-S13) yielded consistent patterns, suggesting that the observed effects are not artifacts of dichotomization.
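The contrast between a median split and a continuous-score analysis can be illustrated with a short sketch. The data are invented, the helper names are hypothetical, and assigning scores strictly above the median to the "high" group is an assumption; the article does not state how ties at the median were handled.

```python
import math
import statistics


def median_split(trait_scores):
    """Label each score 'high' if strictly above the sample median, else 'low'."""
    median = statistics.median(trait_scores)
    return ["high" if s > median else "low" for s in trait_scores]


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented trait scores and preferences for the high-trait message.
trait = [2.0, 3.5, 4.0, 5.5, 6.0, 6.5]
preference = [4, 5, 5, 7, 8, 8]

groups = median_split(trait)
high = [p for p, g in zip(preference, groups) if g == "high"]
low = [p for p, g in zip(preference, groups) if g == "low"]
group_difference = statistics.fmean(high) - statistics.fmean(low)
continuous_r = pearson_r(trait, preference)
```

When the group difference and the correlation point in the same direction, as they do in this toy example, dichotomization is unlikely to be driving the result; the continuous analysis also avoids dependence on where the sample median happens to fall.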
Second, technical constraints of current LLMs limited the complexity of our experimental manipulations. We manipulated one personality trait at a time because preliminary testing revealed that current models process multi-trait prompts additively rather than holistically (37), hindering integrated personality expression. While this single-trait approach facilitated clearer interpretation of focal mechanisms, multi-trait manipulations could be explored in future work as LLM capabilities continue to evolve. Relatedly, our reliance on concise trait descriptors (e.g., TIPI-based prompts) and semantic anchoring, while providing robust content stabilization, may limit the extent to which generated messages reflect deeper, motivation-based personality mechanisms beyond surface-level stylistic features. Future work could explore more sophisticated prompt architectures, such as chain-of-thought reasoning or plan-then-write workflows (70, 71), which may support the construction of more coherent arguments while preventing semantic drift and enabling more motivationally grounded forms of personalization. Additionally, it would be informative to benchmark LLM-generated personality-tailored messages against those crafted by human experts within the same experimental framework. Such head-to-head comparisons could help clarify the extent of the “automation gap” and the respective contributions of algorithmic scalability and human expertise in personality-tailored message design.
Third, regarding the identification of topic stereotypes in Study 3, we relied on expert consensus to establish theoretical links between topics and personality traits. While this approach ensured face validity, it did not incorporate independent empirical measurement from lay audiences prior to message generation. Future research could strengthen causal inference by adopting a two-stage design: first empirically identifying and validating topic stereotypes through norming studies (e.g., quantitative assessments of topic-linked personality perceptions), and then using verified topics to test their persuasive effects.
Finally, although the present stimuli covered diverse domains and provide a foundation for LLM-supported mental health communication, they did not include topics specific to mental health communication. Future research should systematically incorporate mental-health-relevant stimuli across diverse contexts to test whether the present findings extend to that domain.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by Institutional Review Board of the Institute of Psychology, Chinese Academy of Sciences. The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.
Author contributions
SX: Conceptualization, Formal Analysis, Investigation, Methodology, Software, Visualization, Writing – original draft, Writing – review & editing. NZ: Conceptualization, Methodology, Supervision, Writing – review & editing. ZZ: Validation, Writing – review & editing.
Funding
The author(s) declared that financial support was received for this work and/or its publication. This work was financially supported by Beijing Natural Science Foundation, IS23088.
Acknowledgments
The authors sincerely thank all participants for their contributions to this study.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was used in the creation of this manuscript. Generative AI tools were used to assist with language refinement and to improve the clarity and readability of the manuscript. All conceptualization, study design, data analysis, and interpretation were conducted solely by the authors. The authors carefully reviewed, edited, and verified all AI-assisted text to ensure accuracy and scientific integrity.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpsyt.2026.1756792/full#supplementary-material
References
1. Kelders SM, Kok RN, Ossebaard HC, and Gemert-Pijnen JEV. Persuasive system design does matter: A systematic review of adherence to web-based interventions. J Med Internet Res. (2012) 14:e2104. doi: 10.2196/jmir.2104
2. Suka M, Yamauchi T, and Yanagisawa H. Responses to persuasive messages encouraging professional help seeking for depression: comparison between individuals with and without psychological distress. Environ Health Prev Med. (2019) 24:29. doi: 10.1186/s12199-019-0786-8
3. Tam MT, Wu JM, Zhang CC, Pawliuk C, and Robillard JM. A systematic review of the impacts of media mental health awareness campaigns on young people. Health Promotion Pract. (2024) 25:907–20. doi: 10.1177/15248399241232646
4. Halsall T, Garinger C, Dixon K, and Forneris T. Evaluation of a social media strategy to promote mental health literacy and help-seeking in youth. J Consumer Health Internet. (2019) 23:13–38. doi: 10.1080/15398285.2019.1571301
5. Thompson A, Hollis S, Herman KC, Reinke WM, Hawley K, and Magee S. Evaluation of a social media campaign on youth mental health stigma and help-seeking. School Psychol Rev. (2020) 50:36–41. doi: 10.1080/2372966X.2020.1838873
6. James EK, Bokemper SE, Gerber AS, Omer SB, and Huber GA. Persuasive messaging to increase COVID-19 vaccine uptake intentions. Vaccine. (2021) 39:7158–65. doi: 10.1016/j.vaccine.2021.10.039
7. Hing NYL, Woon YL, Lee YK, Kim HJ, Lothfi NM, Wong E, et al. When do persuasive messages on vaccine safety steer COVID-19 vaccine acceptance and recommendations? Behavioural insights from a randomised controlled experiment in Malaysia. BMJ Global Health. (2022) 7:e009250. doi: 10.1136/bmjgh-2022-009250
8. Steffens MS, Bullivant B, Kaufman J, King C, Danchin M, Hoq M, et al. Testing persuasive messages about booster doses of COVID-19 vaccines on intention to vaccinate in Australian adults: A randomised controlled trial. PloS One. (2023) 18:e0286799. doi: 10.1371/journal.pone.0286799
9. Midden C, McCalley L, Ham J, and Zaalberg R. Using persuasive technology to encourage sustainable behavior. In: International Conference on Pervasive Computing, Workshop on Pervasive Persuasive Technology and Environmental Sustainability. Sydney, Australia (2008). p. 83–6.
10. Miller LB. From persuasion theory to climate action: insights and future directions for increasing climate-friendly behavior. Sustainability. (2025) 17:2832. doi: 10.3390/su17072832
11. Gerber AS, Huber GA, Doherty D, Dowling CM, and Panagopoulos C. Big five personality traits and responses to persuasive appeals: results from voter turnout experiments. Political Behav. (2013) 35:687–728. doi: 10.1007/s11109-012-9216-y
12. Barton J, Castillo M, and Petrie R. Negative campaigning, fundraising, and voter turnout: A field experiment. J Economic Behav Organ. (2016) 121:99–113. doi: 10.1016/j.jebo.2015.10.007
13. Fraser JR, Chung M, and Cheon HJ. Ethical consumption in the digital age: Analyzing benefit types, temporal distance, and normative factors for Gen Z. Global Business Finance Review. Seoul, Korea: People & Global Business Association (2023) 28:50–67. doi: 10.17549/gbfr.2023.28.3.50
14. Hanss D and Böhm G. Promoting purchases of sustainable groceries: An intervention study. J Environ Psychol. (2013) 33:53–67. doi: 10.1016/j.jenvp.2012.10.002
15. Dodoo NA and Wen TJ. A path to mitigating SNS ad avoidance: tailoring messages to individual personality traits. J Interactive Advertising. (2019) 19:116–32. doi: 10.1080/15252019.2019.1573159
16. Hawkins RP, Kreuter M, Resnicow K, Fishbein M, and Dijkstra A. Understanding tailoring in communicating about health. Health Educ Res. (2008) 23:454–66. doi: 10.1093/her/cyn004
17. Rimer BK and Glassman B. Tailoring communications for primary care settings. Methods Inf Med. (1998) 37:171–7. doi: 10.1055/s-0038-1634520
18. Rouvere J, Griffith Fillipo IR, Romanelli M, Sharma A, Mosser BA, Nguyen T, et al. Personalization strategies for increasing engagement with digital mental health resources: Sequential multiple assignment randomized trial. JMIR Ment Health. (2025) 12:e73188. doi: 10.2196/73188
19. Isa AK. Exploring digital therapeutics for mental health: Ai-driven innovations in personalized treatment approaches. World J Advanced Res Rev. (2024) 24:2733–49. doi: 10.30574/wjarr.2024.24.3.3997
20. Webb MS, Simmons VN, and Brandon TH. Tailored interventions for motivating smoking cessation: using placebo tailoring to examine the influence of expectancies and personalization. Health Psychology: Off J Division Health Psychology Am psychol Assoc. (2005) 24:179–88. doi: 10.1037/0278-6133.24.2.179
21. Dijkstra A. The psychology of tailoring-ingredients in computer-tailored persuasion. Soc Pers Psychol Compass. (2008) 2:765–84. doi: 10.1111/j.1751-9004.2008.00081.
22. Teeny JD and Matz SC. We need to understand “when” not “if” generative AI can enhance personalized persuasion. Proc Natl Acad Sci. (2024) 121:e2418005121. doi: 10.1073/pnas.2418005121
23. McCrae RR and Costa PT Jr. Personality trait structure as a human universal. Am Psychol. (1997) 52:509–16. doi: 10.1037/0003-066X.52.5.509
24. John O and Srivastava S. The Big Five Trait taxonomy: History, measurement, and theoretical perspectives. New York, United States: Guilford Press (1999).
25. Hirsh J, Kang S, and Bodenhausen G. Personalized persuasion: tailoring persuasive appeals to recipients’ personality traits. Psychol Sci. (2012) 23:578–81. doi: 10.1177/0956797611436349
26. Matz SC, Kosinski M, Nave G, and Stillwell DJ. Psychological targeting as an effective approach to digital mass persuasion. Proc Natl Acad Sci United States America. (2017) 114:12714–9. doi: 10.1073/pnas.1710966114
27. Winter S, Maslowska E, and Vos AL. The effects of trait-based personalization in social media advertising. Comput Hum Behav. (2021) 114:106525. doi: 10.1016/j.chb.2020.106525
28. Alqahtani F, Meier S, and Orji R. Personality-based approach for tailoring persuasive mental health applications. User Modeling User-Adapted Interaction. (2022) 32:253–95. doi: 10.1007/s11257-021-09289-5
29. Bai H, Voelkel JG, Muldowney S, Eichstaedt JC, and Willer R. LLM-generated messages can persuade humans on policy issues. Nat Commun. (2025) 16:6037. doi: 10.1038/s41467-025-61345-5
30. Spitale G, Biller-Andorno N, and Germani F. AI model GPT-3 (dis)informs us better than humans. Sci Adv. (2023) 9:eadh1850. doi: 10.1126/sciadv.adh1850
31. Karinshak E, Liu SX, Park JS, and Hancock JT. Working with AI to persuade: examining a large language model’s ability to generate pro-vaccination messages. Proc ACM Human-Computer Interaction. (2023) 7:1–29. doi: 10.1145/3579592
32. Goldstein JA, DiResta R, Sastry G, Musser M, Gentzel M, and Sedova K. Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations. Ithaca, NY: arXiv (Cornell University Library). (2023).
33. Palmer AK and Spirling A. Large Language Models Can Argue in Convincing and Novel Ways About Politics: Evidence from Experiments and Human Judgement. London, United Kingdom: Taylor & Francis Ltd. (2023).
34. Bordia S. Using Large Language Models to Assist Content Generation in Persuasive Speaking. Intersect: The Stanford Journal of Science, Technology, and Society, (2023) 16(2).
35. Nisbett N and Spaiser V. How convincing are AI-generated moral arguments for climate action? Front Climate. (2023) 5. doi: 10.31234/osf.io/q8hsr
36. Simchon A, Edwards M, and Lewandowsky S. The persuasive effects of political microtargeting in the age of generative artificial intelligence. PNAS Nexus. (2024) 3:pgae035. doi: 10.1093/pnasnexus/pgae035
37. Matz SC, Teeny JD, Vaid SS, Peters H, Harari GM, and Cerf M. The potential of generative AI for personalized persuasion at scale. Sci Rep. (2024) 14:4692. doi: 10.1038/s41598-024-53755-0
38. Hackenburg K and Margetts H. Evaluating the persuasive influence of political microtargeting with large language models. Proc Natl Acad Sci United States America. (2024) 121:e2403116121. doi: 10.1073/pnas.2403116121
39. Xu S and Zhao N. Efficacy of Personality-Labeled Prompts in Generating Trait-Specific Persuasive Messages. In: Meen T-H, Yang C-F, and Chang C-Y, editors. Proceedings of the 7th International Conference on Knowledge Innovation and Invention, vol. 1 . Springer Nature, Singapore (2026). p. 146–54. doi: 10.1007/978-981-95-2113-5_15
40. Petty RE and Cacioppo JT. Communication and Persuasion. New York, NY: Springer (1986). doi: 10.1007/978-1-4612-4964-1
41. Champion VL and Skinner CS. The health belief model. In: Health behavior and health education: Theory, research, and practice, 4th ed. Jossey-Bass/Wiley, Hoboken, NJ, US (2008). p. 45–65.
42. Alyafei A and Easton-Carr R. The Health Belief Model of Behavior Change. In: StatPearls. StatPearls Publishing, Treasure Island (FL) (2025).
43. Bolderdijk JW, Steg L, Geller ES, Lehman PK, and Postmes T. Comparing the effectiveness of monetary versus moral motives in environmental campaigning. Nat Climate Change. (2013) 3:413–6. doi: 10.1038/nclimate1767
44. Park CW, Jaworski BJ, and MacInnis DJ. Strategic brand concept-image management. J Marketing. (1986) 50:135–45. doi: 10.1177/002224298605000401
45. Aaker JL. Dimensions of brand personality. J Marketing Res. (1997) 34:347–56. doi: 10.2307/3151897
46. Govers PCM and Schoormans JPL. Product personality and its influence on consumer preference. J Consumer Marketing. (2005) 22:189–97. doi: 10.1108/07363760510605308
47. Geuens M, Weijters B, and De Wulf K. A new measure of brand personality. Int J Res Marketing. (2009) 26:97–107. doi: 10.1016/j.ijresmar.2008.12.002
48. Caprara G, Barbaranelli C, and Guido G. Personality as metaphor: extension of the psycholexical hypothesis and the five factor model to brand and product personality description. Eur Adv Consumer Res. (1998) 3:61–9.
49. Gosling SD, Rentfrow PJ, and Swann WB. A very brief measure of the Big-Five personality domains. J Res Pers. (2003) 37:504–28. doi: 10.1016/S0092-6566(03)00046-1
50. Graham MH and Coppock A. Asking about attitude change. Public Opin Q. (2021) 85:28–53. doi: 10.1093/poq/nfab009
51. John OP, Donahue EM, and Kentle RL. Big Five Inventory. Washington, D.C., United States: American Psychological Association (2012). doi: 10.1037/t07550-000
52. Braddock K, Schumann S, Corner E, and Gill P. The moderating effects of “dark” personality traits and message vividness on the persuasiveness of terrorist narrative propaganda. Front Psychol. (2022) 13:779836. doi: 10.3389/fpsyg.2022.779836
53. Paek H-J and Hove T. How the media effects schema and the persuasion ethics schema affect audience responses to antismoking campaign messages. Health Communication. (2018) 33:526–36. doi: 10.1080/10410236.2017.1279003
54. Han E-K, Park C, and Khang H. Exploring linkage of message frames with personality traits for political advertising effectiveness. Asian J Communication. (2018) 28:247–63. doi: 10.1080/01292986.2017.1394333
55. Winter S, Maslowska E, and Vos AL. The effects of trait-based personalization in social media advertising. Comput Hum Behav. (2021) 114:106525. doi: 10.1016/j.chb.2020.106525
56. Lee J, Alvero AJ, Joachims T, and Kizilcec R. Poor alignment and steerability of large language models: evidence from college admission essays. ArXiv:2503.20062. (2025). doi: 10.48550/arXiv.2503.20062
57. Muñoz-Ortiz A, Gómez-Rodríguez C, and Vilares D. Contrasting linguistic patterns in human and LLM-generated news text. Artif Intell Rev. (2024) 57:265. doi: 10.1007/s10462-024-10903-2
58. Updegraff JA, Sherman DK, Luyster FS, and Mann TL. The effects of message quality and congruency on perceptions of tailored health communications. J Exp Soc Psychol. (2007) 43:249–57. doi: 10.1016/j.jesp.2006.01.007
59. Meyers-Levy J and Tybout AM. Schema congruity as a basis for product evaluation. J Consumer Res. (1989) 16:39–54. doi: 10.1086/209192
60. Cesario J, Grant H, and Higgins ET. Regulatory fit and persuasion: transfer from “Feeling right.” J Pers Soc Psychol. (2004) 86:388–404. doi: 10.1037/0022-3514.86.3.388
61. Cesario J, Higgins ET, and Scholer AA. Regulatory fit and persuasion: basic principles and remaining questions. Soc Pers Psychol Compass. (2008) 2:444–63. doi: 10.1111/j.1751-9004.2007.00055.x
62. Reber R, Schwarz N, and Winkielman P. Processing fluency and aesthetic pleasure: is beauty in the perceiver’s processing experience? Pers Soc Psychol Review: Off J Soc Pers Soc Psychology Inc. (2004) 8:364–82. doi: 10.1207/s15327957pspr0804_3
63. Kidwell B, Farmer A, and Hardesty DM. Getting liberals and conservatives to go green: Political ideology and congruent appeals. J Consumer Res. (2013) 40:350–67. doi: 10.1086/670610
64. Feinberg M and Willer R. From gulf to bridge: When do moral arguments facilitate political influence? Pers Soc Psychol Bull. (2015) 41:1665–81. doi: 10.1177/0146167215607842
65. Feinberg M and Willer R. Moral reframing: A technique for effective and persuasive communication across political divides. Soc Pers Psychol Compass. (2019) 13:e12501. doi: 10.1111/spc3.12501
66. Voelkel JG and Feinberg M. Morally reframed arguments can affect support for political candidates. Soc Psychol Pers Sci. (2018) 9:917–24. doi: 10.1177/1948550617729408
67. Zhang Y and Gelb BD. Matching advertising appeals to culture: The influence of products’ use conditions. J Advertising. (1996) 25:29–46. doi: 10.1080/00913367.1996.10673505
68. Uskul AK and Oyserman D. When message-frame fits salient cultural frame, messages feel more persuasive. Psychol Health. (2010) 25:321–37. doi: 10.1080/08870440902759156
69. Podsakoff PM, MacKenzie SB, Lee J-Y, and Podsakoff NP. Common method biases in behavioral research: A critical review of the literature and recommended remedies. J Appl Psychol. (2003) 88:879–903. doi: 10.1037/0021-9010.88.5.879
70. Wei J, Wang X, Schuurmans D, Bosma M, Ichter B, Xia F, et al. Chain-of-thought prompting elicits reasoning in large language models. In: Proceedings of the 36th International Conference on Neural Information Processing Systems, vol. 22. Curran Associates Inc, Red Hook, NY, USA (2022). p. 24824–37.
71. Yao L, Peng N, Weischedel R, Knight K, Zhao D, and Yan R. Plan-and-write: towards better automatic storytelling. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, vol. 33. AAAI Press, Honolulu, Hawaii, USA (2019). p. 7378–85. doi: 10.1609/aaai.v33i01.33017378
Keywords: AI-generated messages, Big Five personality, large language models, personality-tailored persuasion, personalized communication, persuasive communication, semantic anchoring, topic stereotypes
Citation: Xu S, Zhou Z and Zhao N (2026) How topic content shapes LLM personality-tailored persuasion: semantic anchoring and topic stereotype effects. Front. Psychiatry 17:1756792. doi: 10.3389/fpsyt.2026.1756792
Received: 29 November 2025; Revised: 25 December 2025; Accepted: 06 January 2026;
Published: 30 January 2026.
Edited by:
Zengda Guan, Shandong Jianzhu University, China
Reviewed by:
Klaus Mueller, Stony Brook University, United States
Zhuoran Huang, Northeastern University, United States
Copyright © 2026 Xu, Zhou and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Nan Zhao, zhaonan@psych.ac.cn
Zili Zhou1,2