SYSTEMATIC REVIEW article

Front. Commun., 19 September 2025

Sec. Culture and Communication

Volume 10 - 2025 | https://doi.org/10.3389/fcomm.2025.1645168

Deep learning in cultural imagery dissemination: a systematic scoping review of AI-driven visual transmission mechanisms

  • 1The School of Humanities, Tongji University, Shanghai, China
  • 2Faculty of Applied Sciences, Macao Polytechnic University, Macao, Macao SAR, China

Background: In an era of rapid media technology and AI advancement, deep learning (DL)-driven visual imagery (VI) is emerging as a critical mode of cultural transmission (CT). Despite the growing application of DL in the VI domain, no systematic review has comprehensively explored its transmission pathways, mechanisms of influence, and associated challenges. This study aims to systematically explore the pathways and impacts of DL-driven VI in CT and identify key trends and issues in the field through a systematic scoping review of existing literature.

Methods: This review analyzes 18 studies published between 2015 and 2024. The literature search was conducted across five databases: WOS, ScienceDirect, Scopus, ACM, and A&HCI. The research was undertaken rigorously following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines, ensuring systematic selection, extraction, and analysis of the identified studies.

Results: The study analyzed the literature from four aspects: transmission pathways, content, technology, and cultural context, identifying three main research areas: (1) the influence mechanisms of AI and social media on cultural transmission; (2) the role of VI in cross-cultural communication; and (3) the application of AI and digital technology in the conservation of Cultural Ecosystem Services (CES). The study finds that AI-driven visual technologies significantly enhance the breadth and impact of CT, particularly through DL algorithms. However, the field faces critical challenges such as algorithmic bias, cultural homogenization, and the reliability of user-generated content.

Conclusion: By systematically synthesizing the existing literature, this study provides a theoretical foundation for future research and points to emerging research directions, such as how to use DL to address ethical challenges in cultural communication and explore the differences in the application of DL and VI in different cultural contexts.

1 Introduction

Human behavior, shaped by preferences, beliefs, and norms, is partly a result of genetic evolution and partly acquired across generations via learning and other forms of social interaction (Panchanathan, 2024). This transmission occurs through both inter-generational and intra-generational social interactions and is referred to as CT (Bisin and Verdier, 2025). CT is a unique human behavior, central to which is the high-fidelity replication of cultural traits (Crema et al., 2024), accurately copying elements, values, and customs so that each generation can build upon the knowledge and practices of the previous one (Hewlett et al., 2024). This process distinguishes human culture from that of non-human primates and is a key mechanism in fostering intercultural understanding, respect, and global diversity (Legare, 2017). While traditional CT media such as language, writing, and rituals have deep historical roots, they are prone to information loss during cross-temporal and cross-spatial transmission, resulting in discontinuities in CT (Della Lena and Panebianco, 2021; Schönpflug, 2008).

Against this backdrop, VI has increasingly become a vital mode of CT. As one of the primary channels for CT, imagery has a long history (Eerkens and Lipo, 2007). From prehistoric cave paintings to medieval religious art, and from modern photography to digital media, the patterns and impacts of image dissemination have evolved alongside technological advancements (Homer, 1998; Robb, 2020). Early images were primarily used to record societal values and religious beliefs, while modern VI focuses on the immediacy and diversity of visuals, effectively conveying different cultural lifestyles, societal values, and aesthetic ideals (Heise, 2004; Mirzoeff, 1999). With the rapid development of globalization and digital technologies, the widespread use of visual media has greatly enhanced the efficiency of CT and facilitated emotional resonance between individuals, especially in cross-cultural communication, where the influence of VI now surpasses that of traditional textual communication (Fahmy et al., 2014).

Compared to textual dissemination, VI’s intuitive nature breaks down language barriers, extending the reach and scope of cultural transmission (Soreanu and German, 2022). The dissemination of urban imagery and landscapes utilizes VI to vividly depict the historical and cultural evolution of cities, making it an effective CT tool for showcasing unique cultural characteristics and historical heritage (Bai, 2023; Huang and Yang, 2016). Due to its intuitive and emotionally resonant nature, VI is regarded by scholars as an effective tool for fostering intercultural understanding (Mirzoeff, 1999). While VI demonstrates significant advantages across various CT domains, its limitations are equally evident. Despite these challenges, AI technology presents new opportunities to enhance the application of VI in CT. Combined with AI, VI has significantly accelerated the global flow of CT (Somaini, 2023). Moreover, integrating AI and digital technologies offers innovative solutions for preserving and reconstructing cultural heritage, such as image restoration driven by DL and the digital reconstruction of virtual cultural sites (Basu et al., 2023). Advances in AI technology, especially DL, address these challenges through powerful image processing and generation capabilities, giving DL a core role in the CT of VI (Li and Wang, 2022; Somaini, 2023). This study primarily focuses on DL-related techniques such as Convolutional Neural Networks (CNNs) (Xia et al., 2025), Large Language Models (LLMs) (Luo et al., 2025), and Natural Language Processing (NLP) (Jiang et al., 2023).
However, despite its breakthroughs, AI’s limitations in generating and disseminating VI remain substantial, particularly regarding dataset representativeness and the simplicity of algorithmic designs, which could introduce biases in the CT process (Laba, 2024; Shahbazi et al., 2023).

Current reviews on CT, especially those focusing on VI, still exhibit certain limitations in scope and methodological frameworks (see Appendix A). First, these studies have not sufficiently addressed the complexity and diversity of CT mechanisms, often focusing on a single perspective or partial analysis. For example, the study by Wang et al. focuses solely on gamification in cultural heritage, without considering the broader applications of VI (Wang et al., 2024). Similarly, Romanazzi et al. limit their analysis to economic evaluation methods based on a single database (Romanazzi et al., 2023). Research by Plieninger et al. and Scholte et al. lacks a systematic review and focuses solely on ecological service systems (Plieninger et al., 2015; Scholte et al., 2015), while Hegetschweiler et al. concentrate only on the European context (Hegetschweiler et al., 2017).

In the context of the AI era, exploring pathways and methods for VI in CT can promote the effective dissemination and preservation of cultural heritage across regions. The specific research questions are as follows:

1. What are the scientific outcomes and geographical distributions of studies related to VI in CT (e.g., country distribution, publication years)?

2. Which DL techniques have been used?

3. What are the CT themes and VI processing methods?

4. What platforms and dissemination paths can be used for VI-based CT?

5. What are the impacts of different forms of VI processing on CT?

This study systematically explores the pathways and impacts of VI in CT. By comprehensively addressing these questions, it contributes to understanding how VI affects CT and identifies key trends and issues in VI research on CT.

2 Methods

2.1 Search strategy

This study follows the PRISMA method for a systematic scoping review in order to enhance the transparency and reproducibility of the literature screening process. The literature search spans five databases: Web of Science, ScienceDirect, Scopus, ACM Digital Library, and A&HCI. The review adheres to the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR).

The literature search was conducted on November 20, 2024, using search terms including “image,” “image vision,” “computer vision,” “cultural,” “cultural transmission,” “cultural communication,” “public,” and “service.” The search strategy included the following steps: (1) a preliminary screening in all databases to exclude irrelevant studies; (2) removal of duplicate references using EndNote 21.4 software; (3) detailed screening of titles and abstracts to identify studies relevant to the research topic; and (4) full-text review to exclude studies that were not directly related to the research topic. The specific search formula is shown in Table 1 (The PRISMA-ScR checklist is provided in Appendix B).

Table 1. Selected databases and search formats.
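The duplicate-removal step in the search strategy can be illustrated with a short sketch. This is not EndNote's matching algorithm; it is a simplified stand-in that assumes each record is a dictionary with hypothetical `title`, `year`, `doi`, and `source` fields, and that matches records on a normalized DOI when one exists and on a normalized title plus year otherwise. The example records and DOIs are placeholders, not studies from this review.

```python
import re
from typing import Dict, List, Optional


def dedup_key(record: Dict[str, Optional[object]]) -> str:
    """Build a comparison key: normalized DOI if present, else title + year."""
    doi = str(record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    # Lowercase the title and collapse punctuation/whitespace runs to one space.
    title = re.sub(r"[^a-z0-9]+", " ", str(record.get("title") or "").lower()).strip()
    return f"title:{title}|{record.get('year')}"


def remove_duplicates(records: List[Dict[str, Optional[object]]]) -> List[Dict[str, Optional[object]]]:
    """Keep the first record seen for each key, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = dedup_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Placeholder records exported from different databases (hypothetical data).
records = [
    {"title": "Urban imagery on social media", "year": 2020, "doi": "10.0000/example.1", "source": "Scopus"},
    {"title": "Urban Imagery on Social Media.", "year": 2020, "doi": "10.0000/EXAMPLE.1", "source": "WOS"},
    {"title": "Heritage image classification", "year": 2021, "doi": None, "source": "ACM"},
    {"title": "Heritage  Image Classification", "year": 2021, "doi": "", "source": "ScienceDirect"},
]

unique_records = remove_duplicates(records)
print([r["source"] for r in unique_records])
```

A key of this kind will miss a pair in which only one copy carries a DOI, which is one reason automated deduplication is normally followed by manual screening.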

2.2 Data selection and extraction

All retrieved records were imported into the reference management software EndNote version 21.5, and duplicate entries were removed using the software. Two independent reviewers (JY and TL) screened the titles and abstracts of the articles based on predefined inclusion criteria. In cases of discrepancies between the reviewers, a third reviewer (PP) was consulted to reach a consensus. The inclusion criteria (see Table 2) for selecting studies were as follows: (1) research involving the application of VI in CT; (2) research involving the specific application of CT and inheritance; (3) VI studies based on AI technology; (4) studies presenting original research; (5) studies published between 2015 and 2024; and (6) full text available in English.

Table 2. Inclusion and exclusion criteria.
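The dual-reviewer screening rule described in this subsection can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function name `resolve_screening`, the record identifiers, and the example votes are hypothetical. Only the rule itself (two independent decisions, with a third reviewer consulted on disagreement) comes from the text.

```python
from typing import Callable, Dict


def resolve_screening(
    reviewer_a: Dict[str, bool],
    reviewer_b: Dict[str, bool],
    arbiter: Callable[[str], bool],
) -> Dict[str, bool]:
    """Combine two independent include/exclude decisions per record.

    If both reviewers agree, their shared decision stands; otherwise the
    arbiter (standing in for the third reviewer) decides.
    """
    if reviewer_a.keys() != reviewer_b.keys():
        raise ValueError("both reviewers must screen the same records")
    final = {}
    for record_id, vote_a in reviewer_a.items():
        vote_b = reviewer_b[record_id]
        final[record_id] = vote_a if vote_a == vote_b else arbiter(record_id)
    return final


# Hypothetical votes for three records (True = include).
votes_a = {"rec-1": True, "rec-2": False, "rec-3": True}
votes_b = {"rec-1": True, "rec-2": False, "rec-3": False}
third_reviewer = {"rec-3": False}  # consulted only for the disagreement

decisions = resolve_screening(votes_a, votes_b, lambda rid: third_reviewer[rid])
print(decisions)
```

Passing the arbiter as a callable keeps the third reviewer out of the loop for records on which the first two already agree.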

2.3 Data charting

A data extraction form was developed based on the scoping review methodology guidelines provided by the Joanna Briggs Institute. After conducting a preliminary trial with five articles, the form was revised to enhance its effectiveness and accuracy. The form includes the following key data extraction items: author, publication year, country, AI technology, topic, VI type, target group, dissemination platform, dissemination path, and findings. Data extraction was performed by two independent reviewers, and any discrepancies were resolved through consultation with a senior reviewer to ensure the accuracy and consistency of the extracted data.
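One way to picture the extraction form is as a typed record with one field per charted item. The sketch below is an assumption about structure only: the class name `ChartingRecord`, the field types, and the year check are illustrative, and the example values are placeholders rather than data from an included study. The ten field names follow the items listed above.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class ChartingRecord:
    """One row of the data extraction form (field names follow the text)."""
    author: str
    publication_year: int
    country: str
    ai_technology: str
    topic: str
    vi_type: str
    target_group: str
    dissemination_platform: str
    dissemination_path: str
    findings: str

    def __post_init__(self) -> None:
        # The review covers studies published between 2015 and 2024.
        if not 2015 <= self.publication_year <= 2024:
            raise ValueError(f"year {self.publication_year} is outside 2015-2024")


# Placeholder entry showing how a charted study would be stored.
example = ChartingRecord(
    author="Example et al.",
    publication_year=2022,
    country="(placeholder)",
    ai_technology="CNN",
    topic="(placeholder)",
    vi_type="(placeholder)",
    target_group="(placeholder)",
    dissemination_platform="(placeholder)",
    dissemination_path="(placeholder)",
    findings="(placeholder)",
)
print(asdict(example)["ai_technology"])
```

A fixed schema of this kind makes it straightforward for two reviewers to compare their extractions field by field.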

2.4 Collating, summarizing, and reporting the results

During this process, all three authors concurred on narrowing the scope of categories to ensure that they specifically address the research questions of this study. Descriptive statistics were employed to systematically organize, summarize, and report the results. The findings were presented in a narrative format, complemented by figures and tables to enhance clarity.

3 Results

As illustrated in Figure 1, a total of 15,743 articles were retrieved through the systematic search. After removing duplicate entries using EndNote software, 15,461 articles remained. Two reviewers independently screened the titles and abstracts, excluding 15,395 articles that were not directly related to the research topic and 17 non-English articles. Studies focusing on image usage in advertising and marketing, corporate image construction, and body image perception were excluded due to their lack of direct relevance to the CT theme. Additionally, excluded studies primarily emphasized the functionality of images in business, education, or health promotion rather than the specific analysis or application of CT. The remaining 49 articles underwent a thorough full-text evaluation by both reviewers, resulting in the exclusion of 31 articles. Ultimately, 18 articles were included in the systematic scoping review (see Table 3).

Figure 1
Flowchart detailing the identification and screening process for studies. Initially, 15,743 records were identified from databases. After removing duplicates and non-English entries, 15,454 records were screened, with 15,395 excluded. Fifty-one reports were sought, two not retrieved. Forty-nine were assessed for eligibility, excluding thirty-one for reasons like mismatched topics and focus issues. Eighteen studies were included in the review.

Figure 1. PRISMA flowchart.
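The screening counts can be cross-checked with simple arithmetic. The sketch below uses only figures reported in this section; the variable names are illustrative, and the full-text exclusion count is derived as the difference between the reports assessed and the studies included rather than read from the text.

```python
# Counts reported for the identification and screening stages.
identified = 15_743
after_deduplication = 15_461
excluded_off_topic = 15_395
excluded_non_english = 17
included = 18

# Derived quantities for each stage of the flow.
duplicates_removed = identified - after_deduplication
full_text_assessed = after_deduplication - excluded_off_topic - excluded_non_english
full_text_excluded = full_text_assessed - included

print("duplicates removed:", duplicates_removed)
print("assessed in full text:", full_text_assessed)
print("excluded at full text:", full_text_excluded)
```

The derived values are 282 duplicates removed, 49 reports assessed in full text, and 31 excluded at that stage, which matches the flowchart description.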

Table 3. Overview of study characteristics.

3.1 Publication characteristics of studies

Figure 2 illustrates the annual evolution trend of the included literature. The time series indicates that relevant research began in 2015, gradually developing thereafter, and reaching a peak in 2022 with a total of four studies published. During the COVID-19 pandemic, the closure of cultural institutions and the global shift to online spaces prompted a transition of cultural content toward virtual engagement. This increased the need to explore the role of digital and virtual media in CT, which may explain the rise in publications in 2022 and the following year (Li et al., 2022).

Figure 2
Line graph showing yearly numeric values from 2015 to 2024. Peaks at 2 in 2015, 2019, and 2024, with a significant spike to 4 in 2022. Values drop to 1 in 2016-2018, 2020-2021, and rise to 3 in 2023.

Figure 2. Annual number of publications.

Regarding the geographical distribution of the research, Figure 3 displays the number of contributions by country and region. China published six papers, accounting for 33.3% of the total, and currently leads the field. China has 56 cultural and natural heritage sites listed on the UNESCO World Heritage List, ranking second globally. China's lead is likely closely related to the rapid development of its digital cultural industry in recent years and to government support for visual culture research. At the regional level, Asia dominates with ten articles, representing 55.6%, followed by Europe with six articles (33.3%), and North America with two articles (11.1%).

Figure 3
Stacked bar chart showing counts for Asia, Europe, and North America by country. Asia has contributions from China, Singapore, Israel, and Japan. Europe includes Portugal, Italy, Germany, France, and the UK. North America solely shows the USA. A legend provides color codes for each country.

Figure 3. Regional distribution map of publications.
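The regional percentages follow directly from the article counts. The short sketch below reproduces them; the helper name `shares` is illustrative, and the counts are the ones stated in the paragraph above.

```python
from typing import Dict


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each category's share of the total, in percent, to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


regional_counts = {"Asia": 10, "Europe": 6, "North America": 2}
regional_shares = shares(regional_counts)
print(regional_shares)
```

Because each share is rounded independently, percentages computed this way can differ from 100% in total by a small rounding margin.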

3.2 AI technology

In the included studies, the majority employed DL algorithms for AI-based VI processing in CT applications, particularly CNNs. Eleven studies utilized CNNs to process and generate image data. CNNs have significant advantages in handling image data, enabling tasks such as image classification and detection. In addition, cloud computing was used for data processing in two studies. Image processing often involves textual data; for example, input text can be transformed into VI through generative AI, thereby involving LLMs and NLP. By contrast, human-computer interaction (HCI) and explainable AI (XAI) were less frequently employed (see Figure 4).

Figure 4
Pie chart showing distribution percentages of various technologies. DL-CNN dominates with sixty-one percent, followed by Cloud Computing at eleven percent. VR and LLM each hold six percent. XAI is at six percent, while HCI and NLP each have five percent.

Figure 4. AI technology type.

3.3 Research topics

The thematic classification identified 18 topics across seven major categories. CES evaluation and utilization was the most prominent, encompassing five topics, highlighting researchers’ focus on assessing and applying CES in various contexts, which is essential for urban planning and environmental management. Landscape & environmental management and urban image & perception included four and two topics, respectively, emphasizing the role of landscape management and urban image in enhancing sustainability and residents’ quality of life. Classification systems and recreation & aesthetic evaluation each comprised two topics, indicating the need for specific classification methodologies and aesthetic value assessments. Cultural heritage and technology in urban studies featured one and two topics, respectively. This reflects the increasing recognition of cultural heritage tourism and emerging technologies, such as hybrid 3D virtual museums and generative AI, in enriching cultural experiences and research. Overall, the distribution of topics underscores current academic interests in CES, landscape management, and technological applications (see Table 4).

Table 4. Research topics.

3.4 Target group

The analysis reveals that social media users (36%), tourists (28%), and urban residents (22%) are the primary target groups across the studies. The frequent combination of these groups highlights the importance of social media in urban and tourism research, as well as the need to understand the dynamics between local residents and visitors. Additionally, the inclusion of niche groups points to diverse research interests that can be further explored to enrich the understanding of cultural and urban ecosystems (see Figure 5).

Figure 5
Pie chart showing four groups: Social Media Users 36% (blue), Tourists 28% (orange), Urban Residents 22% (yellow), and Other Specific Groups 14% (green).

Figure 5. Target group.

3.5 Dissemination platforms

The analysis reveals that Flickr and social media are the predominant dissemination platforms, with usage frequencies of 41% and 35%, respectively. This emphasizes the importance of visual content and social interactions in information dissemination. The strategic combination of multiple platforms underscores the need to reach a broader and more varied audience. While emerging technologies are currently underutilized, their potential for enhancing dissemination strategies is evident, presenting opportunities for future research and application. The integration of physical and digital dissemination methods also highlights the evolving landscape of information sharing, aiming for more comprehensive and effective communication (see Table 5).

Table 5. Dissemination platform.

3.6 The path of CT

The analysis reveals that digital dissemination (50%) is the predominant pathway for CT, highlighting its central role in contemporary cultural exchange. The strategic combination of digital dissemination with localization adaptation (22%) and social media dissemination (11%) underscores the importance of contextual and interactive approaches in enhancing the effectiveness of CT. Cross-cultural dissemination (11%) also plays a notable role, reflecting the interconnectedness of global cultures. Cultural inheritance (6%) maintains a unique position, emphasizing the preservation of traditional cultural elements (see Figure 6).

Figure 6
Bar chart showing five categories for dissemination. Cultural Inheritance scores 1, Social Media Dissemination scores 2, Cross-cultural Dissemination scores 2, Localization Adaptation scores 4, and Digital Dissemination scores 9.

Figure 6. The path of CT.

3.7 Findings

The analysis of the findings reveals several key themes underpinning the mechanisms of CT within urban contexts. Social media images emerge as a predominant factor, influencing both urban image and cultural cognition across various domains such as urban parks, building functions, and coastal landscapes. Specifically, social media platforms (SMP) facilitate the dissemination and enhancement of cultural understanding through visual content, as evidenced by multiple findings.

Image classification and recognition technologies play a crucial role in supporting CES evaluation and promoting broader cultural dissemination. These technologies enhance the efficiency and accuracy of disseminating cultural landscapes and intangible cultural heritage, leveraging tools such as computer vision, open-source classifiers, and multi-modal image classification. DL further augments CT by generating immersive narratives and facilitating aesthetic dissemination. The integration of advanced technologies like LiDAR, remote sensing, and 3D models enhances the depth and reach of cultural dissemination efforts, enabling more comprehensive and interactive cultural experiences. Cultural flagship events significantly contribute to the reconstruction of urban images and the enhancement of cultural identity. The use of unlabeled images presents both advantages and limitations, indicating a need for balanced approaches in CT strategies. Overall, the findings underscore the synergistic interplay between social media, advanced image technologies, and DL in fostering effective CT.

4 Discussion

4.1 The role of AI technologies in visual imagery cultural transmission

DL enhances CT by generating immersive narratives and promoting aesthetic dissemination. Studies show that AI-generated narratives and images that visually present users in a story context can achieve good immersion and engagement (Dong, 2025). Image classification and recognition technologies play a crucial role in supporting CES evaluations and promoting broader cultural dissemination (Ju, 2024). DL improves the efficiency and accuracy of disseminating cultural landscapes and intangible cultural heritage by leveraging tools such as computer vision and multimodal image classification (Gîrbacia, 2024). Pre-trained CNNs have shown high effectiveness in accurately identifying cultural heritage elements from social media images (Belhi et al., 2021). For instance, some studies have used pre-trained VGG19 and Xception models to significantly improve the accuracy and efficiency of traditional cultural heritage image classification through transfer learning (Janković Babić, 2024). The integration of advanced technologies such as LiDAR, remote sensing, and 3D modeling enhances the depth and breadth of CT by creating interactive cultural experiences for urban populations (Y. Li et al., 2023), as such technologies enable computers to capture in-depth environmental data and generate artistic output based on sensing. A hybrid 3D virtual museum combines panoramic images and models to offer a more realistic and interactive cultural experience (Barrile et al., 2022).

4.2 The impact of social media platforms on visual imagery cultural transmission

Social media images, through a collective construction process, significantly influence cultural cognition and the formation of urban identity across different urban contexts (Loughran et al., 2015). Unlike the unidirectional dissemination of traditional media, SMP facilitate two-way cultural exchange through user-generated content and interactive functions (such as comments, reposts, and likes), encouraging users to actively participate in content creation and sharing (Eroglu, 2023). This participation also enhances the role of visual content in promoting cultural understanding (Gooding, 2004). Moreover, real-time content dissemination on SMP significantly accelerates CT relative to the slower, fixed schedules of traditional broadcast media (Chukwu, 2023). On a technical level, image classification and recognition technologies, utilizing tools like computer vision and multimodal image classification, support CES evaluations and improve the efficiency and accuracy of transmitting cultural landscapes and intangible cultural heritage through these platforms. Research indicates that platforms like Flickr and Weibo are core channels for disseminating visual content and facilitating social interaction (Liang et al., 2022).

4.3 Visual imagery and cross-cultural transmission

VI, due to its intuitive and universal nature, can transcend language barriers, simplifying complex cultural concepts and promoting understanding without the need for translation (Vishwakarma, 2023). For example, traditional Chinese New Year pictures express themes of happiness, good fortune, and prosperity through symbols like figures, animals, and plants, which helps audiences from different cultural backgrounds understand core Chinese cultural values (Welch, 2013). Algorithmic diversification strategies, such as “explore mode” and randomized recommendations (Fang et al., 2020), are effective in broadening users’ exposure to cross-cultural content by correcting the “filter bubble” effect often found in recommendation systems (Grossetti et al., 2021).

Furthermore, AI-driven visual technologies are pivotal in creating immersive and interactive cultural experiences that drive cross-cultural communication. In the context of digital cultural tourism, technologies like virtual tours and 3D modeling allow global audiences to explore cultural heritage sites and traditions regardless of physical distance (Napolitano et al., 2018). These visual mediums not only attract tourists but also serve as educational tools, enabling a deeper appreciation of diverse cultures. The application of AI in this domain transforms passive viewing into an active, engaging experience, fostering a more direct and personal connection between individuals and foreign cultures.
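The randomized-recommendation idea mentioned earlier in this section can be illustrated with a small sketch. This is an epsilon-greedy-style stand-in written for illustration, not the method of Fang et al. (2020) or Grossetti et al. (2021): the function name `recommend`, the `explore_rate` parameter, and the item names and relevance scores are all hypothetical. At each slot the recommender either takes the highest-scoring remaining item or, with probability `explore_rate`, draws a remaining item at random.

```python
import random
from typing import Dict, List, Optional


def recommend(
    scores: Dict[str, float],
    k: int,
    explore_rate: float,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Fill k slots, mixing top-ranked items with random exploratory picks."""
    rng = rng or random.Random()
    # Remaining candidates, kept in descending order of predicted relevance.
    remaining = sorted(scores, key=scores.get, reverse=True)
    slate: List[str] = []
    for _ in range(min(k, len(remaining))):
        if rng.random() < explore_rate:
            pick = rng.choice(remaining)   # exploratory pick, ignores the score
        else:
            pick = remaining[0]            # best remaining item
        slate.append(pick)
        remaining.remove(pick)
    return slate


# Hypothetical relevance scores for one user (higher = closer to past behavior).
scores = {
    "local_festival": 0.9,
    "local_food": 0.8,
    "local_architecture": 0.7,
    "foreign_festival": 0.2,
    "foreign_craft": 0.1,
}

print(recommend(scores, k=3, explore_rate=0.0))
print(recommend(scores, k=3, explore_rate=0.5, rng=random.Random(42)))
```

With `explore_rate=0.0` the slate is simply the three highest-scoring items; raising the rate trades some predicted relevance for exposure to content the ranking would otherwise suppress.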

4.4 Challenges and risks posed by AI

Despite advancements, AI technology introduces several challenges in CT. A primary concern is algorithmic bias, as many AI models are trained on datasets dominated by Western aesthetic standards, leading to biases in recognizing non-Western cultural content. This can distort cultural heritage representations and exacerbate cultural inequalities (Gosal and Ziv, 2020). AI models often lack a deep understanding of cultural symbols and contexts, resulting in misinterpretations, especially with religious or historical images (Fan et al., 2023). Another significant risk is cultural homogenization and the loss of creativity. In creative fields, reliance on repetitive data can stifle innovation, leading to homogenized cultural products (Fan et al., 2023).

Privacy and data security also present pressing concerns. Training facial recognition models requires vast amounts of personal images, raising issues of privacy infringement (Su et al., 2023). AI can be misused to create fake images or videos, which can mislead the public and undermine social trust, such as fabricated political videos used as propaganda (Shi et al., 2021). To mitigate these issues, both technological and methodological improvements are essential. Expanding AI training datasets to include diverse cultural elements such as language, music, and text can help reduce bias (Fan et al., 2023). Incorporating cultural context into algorithm design through cultural tagging or building knowledge bases can enhance the understanding of cultural nuances and reduce misinterpretations (Huai et al., 2022). Developing explainable AI algorithms is crucial for improving transparency in decision-making and identifying potential biases (Huai et al., 2022).

4.5 Future directions

Future research should expand its methodological scope to provide more balanced insights into the application of AI and SMP in diverse cultural contexts. This can be achieved by incorporating a wider range of sources from various cultural backgrounds and conducting more nuanced analyses of the practical challenges and ethical implications. Additionally, the potential of digital tools such as VR, 3D modeling, and panoramic imaging to simultaneously strengthen cultural engagement and ecological conservation should be explored, for example by creating virtual tours of heritage sites that highlight both their cultural significance and the need for ecological protection. Finally, future studies should investigate the potential of user-generated content in supporting CES by developing collaborative cultural-ecological conservation models that foster public participation and promote conservation efforts.

4.6 Limitations

This scoping review has several limitations that warrant discussion. Many of the included studies relied heavily on user-generated content from SMP, which inherently introduces biases. Factors such as sample selection, tagging practices, and subjective interpretations may distort the representativeness of cultural narratives. These limitations are compounded by the under-representation of diverse cultural contexts, particularly from regions with limited access to advanced technological tools. Addressing this imbalance would require broader geographic and demographic inclusion in future research. While this review provides valuable insights into the role of VI in CT, particularly in the era of DL, future research should address these limitations by expanding the scope of the literature review, incorporating a broader range of sources, and analyzing in more detail the challenges and practicalities of implementing AI and SMP strategies in diverse cultural contexts.

5 Conclusion

In this study, we conducted a systematic scoping review to analyze the role of DL-driven VI in CT. We found that AI-driven visual technologies, particularly those built on DL algorithms, significantly enhance the breadth and impact of CT. We also highlighted key challenges, including algorithmic bias, cultural homogenization, and the reliability of user-generated content. Future research should focus on improving the inclusivity of DL algorithms, addressing biases in cultural representation, and enhancing the accuracy and authenticity of content through advanced image recognition technologies. This research provides a foundational framework for understanding the complex interplay between AI, VI, and CT, paving the way for more nuanced and effective applications in the future.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Author contributions

JY: Conceptualization, Writing – review & editing, Investigation, Formal analysis, Data curation, Writing – original draft. TL: Investigation, Writing – review & editing, Methodology. YL: Writing – review & editing, Visualization, Methodology, Investigation. PP: Writing – review & editing, Supervision.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

Bai, Y. (2023). Historical changes of urban landscape in the field of visual culture: take the bund as an example. Lecture Notes Educ. Psychol. Pub. Media 22, 23–34. doi: 10.54254/2753-7048/22/20230209

Crossref Full Text | Google Scholar

Barrile, V., Bernardo, E., Fotia, A., and Bilotta, G. (2022). A combined study of cultural heritage in archaeological museums: 3D survey and mixed reality. Heritage 5, 1330–1349. doi: 10.3390/heritage5030069

Crossref Full Text | Google Scholar

Basu, A., Paul, S., Ghosh, S., Das, S., Chanda, B., Bhagvati, C., et al. (2023). Digital restoration of cultural heritage with data-driven computing: a survey. IEEE Access 11, 53939–53977. doi: 10.1109/ACCESS.2023.3280639

Crossref Full Text | Google Scholar

Belhi, A., Ahmed, H. O., Alfaqheri, T., Bouras, A., Sadka, A. H., and Foufou, S. (2021). Study and evaluation of pre-trained CNN networks for cultural heritage image classification. In Belhi, A., A. Bouras, A. K. Al-Ali, and A. H. Sadka Data analytics for cultural heritage: Current trends and concepts (pp. 47–69): Cham Springer.

Google Scholar

Bisin, A., and Verdier, T. (2025). Economic models of cultural transmission. Cham: Springer.

Google Scholar

Cardoso, A. S., Renna, F., Moreno-Llorca, R., Alcaraz-Segura, D., Tabik, S., Ladle, R. J., et al. (2022). Classifying the content of social media images to support cultural ecosystem service assessments using deep learning models. Ecosystem Serv. 54:101410. doi: 10.1016/j.ecoser.2022.101410

Crossref Full Text | Google Scholar

Chukwu, O. J. (2023). Interrogating the online internet-based broadcast media stations: platforms, implications and emerged paradigms. J. Manage. Sci. 13, 74–81. doi: 10.26524/jms.13.36

Crossref Full Text | Google Scholar

Crema, E. R., Bortolini, E., and Lake, M. (2024). How cultural transmission through objects impacts inferences about cultural evolution. J. Archaeol. Method Theory 31, 202–226. doi: 10.1007/s10816-022-09599-x

Crossref Full Text | Google Scholar

Della Lena, S., and Panebianco, F. (2021). Cultural transmission with incomplete information. J. Econ. Theory 198:105373. doi: 10.1016/j.jet.2021.105373

Crossref Full Text | Google Scholar

Dong, A. (2025). LUMIEA: Enhancing user engagement in storytelling: Empowering personal narratives through AI-generated environments and tactile interaction in mixed reality. Toronto, ON: OCAD University.

Google Scholar

Eerkens, J. W., and Lipo, C. P. (2007). Cultural transmission theory and the archaeological record: providing context to understanding variation and temporal changes in material culture. J. Archaeol. Res. 15, 239–274. doi: 10.1007/s10814-007-9013-z

Crossref Full Text | Google Scholar

Eizenberg, E., and Cohen, N. (2015). Reconstructing urban image through cultural flagship events: the case of bat-yam. Cities 42, 54–62. doi: 10.1016/j.cities.2014.09.003

Crossref Full Text | Google Scholar

Eroglu, D. I. (2023). Medium is the message: Unraveling the social media platforms' effects on communication and opinions. Blacksburg VI: Virginia Polytechnic Institute and State University.

Google Scholar

Fahmy, S., Bock, M., and Wanta, W. (2014). Visual communication theory and research: A mass communication perspective. Cham: Springer.

Google Scholar

Fan, T., Wang, H., and Deng, S. (2023). Intangible cultural heritage image classification with multimodal attention and hierarchical fusion. Expert Syst. Appl. 231:120555. doi: 10.1016/j.eswa.2023.120555

Crossref Full Text | Google Scholar

Fang, H., Zhang, D., Shu, Y., and Guo, G. (2020). Deep learning for sequential recommendation: algorithms, influential factors, and evaluations. ACM Trans. Inf. Syst. 39, 1–42. doi: 10.1145/3426723

Crossref Full Text | Google Scholar

Ferracani, A., Bertini, M., Pala, P., Nannotti, G., Principi, F., and Becchi, G. (2024). “Personalized generative storytelling with AI-visual illustrations for the promotion of knowledge in cultural heritage tourism,” in Paper Presented at the Proceedings of the 6th Workshop on the Analysis, Understanding and Promotion of Heritage Contents.

Google Scholar

Gîrbacia, F. (2024). An analysis of research trends for using artificial intelligence in cultural heritage. Electronics 13:3738. doi: 10.3390/electronics13183738

Crossref Full Text | Google Scholar

Gooding, D. (2004). Cognition, construction and culture: visual theories in the sciences. J. Cogn. Cult. 4, 551–593. doi: 10.1163/1568537042484896

Crossref Full Text | Google Scholar

Gosal, A., and Ziv, G. (2020). Landscape aesthetics: spatial modelling and mapping using social media images and machine learning. Ecol. Indic. 117:106638. doi: 10.1016/j.ecolind.2020.106638

Crossref Full Text | Google Scholar

Grossetti, Q., Du Mouza, C., Travers, N., and Constantin, C. (2021). Reducing the filter bubble effect on twitter by considering communities for recommendations. Int. J. Web Inf. Syst. 17, 728–752. doi: 10.1108/IJWIS-06-2021-0065

Crossref Full Text | Google Scholar

Hegetschweiler, K. T., de Vries, S., Arnberger, A., Bell, S., Brennan, M., Siter, N., et al. (2017). Linking demand and supply factors in identifying cultural ecosystem services of urban green infrastructures: a review of European studies. Urban For. Urban Green. 21, 48–59. doi: 10.1016/j.ufug.2016.11.002

Crossref Full Text | Google Scholar

Heise, D. (2004). Is visual culture becoming our canon of art? Art Educ. 57, 41–46. doi: 10.1080/00043125.2004.11653567

Crossref Full Text | Google Scholar

Hewlett, B. S., Boyette, A. H., Lew-Levy, S., Gallois, S., and Dira, S. J. (2024). Cultural transmission among hunter-gatherers. Proc. Natl. Acad. Sci. 121:e2322883121. doi: 10.1073/pnas.2322883121

PubMed Abstract | Crossref Full Text | Google Scholar

Hoffmann, E. J., Abdulahhad, K., and Zhu, X. X. (2023). Using social media images for building function classification. Cities 133:104107. doi: 10.1016/j.cities.2022.104107

Crossref Full Text | Google Scholar

Homer, W. I. (1998). Visual culture: a new paradigm. Am. Art. 12, 6–9. doi: 10.1086/424309

Crossref Full Text | Google Scholar

Hu, Q., Yu, D., Wang, S., Fu, C., Ai, M., and Wang, W. (2017). Hybrid three-dimensional representation based on panoramic images and three-dimensional models for a virtual museum: data collection, model, and visualization. Inf. Vis. 16, 126–138. doi: 10.1177/1473871616655467

Crossref Full Text | Google Scholar

Huai, S., Chen, F., Liu, S., Canters, F., and Van de Voorde, T. (2022). Using social media photos and computer vision to assess cultural ecosystem services and landscape features in urban parks. Ecosystem Serv. 57:101475. doi: 10.1016/j.ecoser.2022.101475

Crossref Full Text | Google Scholar

Huang, Y., and Yang, S. (2016). “The orientation of urban image and the strategy of cultural communication,” in Paper Presented at the 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017)

Google Scholar

Janković Babić, R. (2024). A comparison of methods for image classification of cultural heritage using transfer learning for feature extraction. Neural Comput. & Applic. 36, 11699–11709. doi: 10.1007/s00521-023-08764-x

Crossref Full Text | Google Scholar

Jiang, Y., Pang, P. C.-I., Wong, D., and Kan, H. Y. (2023). Natural language processing adoption in governments and future research directions: a systematic review. Appl. Sci. 13:12346. doi: 10.3390/app132212346

Crossref Full Text | Google Scholar

Ju, F. (2024). Mapping the knowledge structure of image recognition in cultural heritage: a scientometric analysis using CiteSpace, VOSviewer, and bibliometrix. J. Imaging 10:272. doi: 10.3390/jimaging10110272

PubMed Abstract | Crossref Full Text | Google Scholar

Laba, N. (2024). Engine for the imagination? Visual generative media and the issue of representation. Media Cult. Soc. 46, 1599–1620. doi: 10.1177/01634437241259950

Crossref Full Text | Google Scholar

Lee, H., Seo, B., Koellner, T., and Lautenbach, S. (2019). Mapping cultural ecosystem services 2.0–potential and shortcomings from unlabeled crowd sourced images. Ecol. Indic. 96, 505–515. doi: 10.1016/j.ecolind.2018.08.035

Crossref Full Text | Google Scholar

Legare, C. H. (2017). Cumulative cultural learning: development and diversity. Proc. Natl. Acad. Sci. 114, 7877–7883. doi: 10.1073/pnas.1620743114

PubMed Abstract | Crossref Full Text | Google Scholar

Li, X., Liang, X., Yu, T., Ruan, S., and Fan, R. (2022). Research on the integration of cultural tourism industry driven by digital economy in the context of COVID-19—based on the data of 31 Chinese provinces. Front. Public Health 10:780476. doi: 10.3389/fpubh.2022.780476

PubMed Abstract | Crossref Full Text | Google Scholar

Li, R., and Wang, C. (2022). Cultural and creative product design and image recognition based on deep learning. Comput. Intell. Neurosci. 2022, 1–9. doi: 10.1155/2022/7256584

PubMed Abstract | Crossref Full Text | Google Scholar

Li, Y., Zhao, L., Chen, Y., Zhang, N., Fan, H., and Zhang, Z. (2023). 3D LiDAR and multi-technology collaboration for preservation of built heritage in China: a review. Int. J. Appl. Earth Obs. Geoinf. 116:103156. doi: 10.1016/j.jag.2022.103156

Crossref Full Text | Google Scholar

Liang, X., Hua, N., Martin, J., Dellapiana, E., Coscia, C., and Zhang, Y. (2022). Social media as a medium to promote local perception expression in China’s world heritage sites. Land 11:841. doi: 10.3390/land11060841

Crossref Full Text | Google Scholar

Loughran, K., Fine, G. A., and Hunter, M. A. (2015). “Urban spaces, city cultures, and collective memories” in Routledge international handbook of memory studies. eds. A. L. Tota and T. Hagen (London: Routledge), 193–204.

Google Scholar

Luo, Y. T., Liu, T., Pang, P. C.-I., Wang, Z., and Chan, K. I. (2025). Exploring information interaction preferences in an LLM-assisted learning environment with a topic modeling framework. Appl. Sci. 15:7515. doi: 10.3390/app15137515

Crossref Full Text | Google Scholar

Mirzoeff, N. (1999). An introduction to visual culture, vol. 274. London: Routledge.

Google Scholar

Motte, E., and McInnes, R. (2019). Using artistic imagery to improve understanding of coastal landscape changes on the Rance estuary (French Channel coast). Geoheritage 11, 961–972. doi: 10.1007/s12371-018-00341-2

Crossref Full Text | Google Scholar

Napolitano, R. K., Scherer, G., and Glisic, B. (2018). Virtual tours and informational modeling for conservation of cultural heritage sites. J. Cult. Herit. 29, 123–129. doi: 10.1016/j.culher.2017.08.007

Crossref Full Text | Google Scholar

Panchanathan, K. (2024). 15 Cultural Evolution. Hum. Behav. Ecol. 92:356.

Google Scholar

Plieninger, T., Bieling, C., Fagerholm, N., Byg, A., Hartel, T., Hurley, P., et al. (2015). The role of cultural ecosystem services in landscape management and planning. Curr. Opin. Environ. Sustain. 14, 28–33. doi: 10.1016/j.cosust.2015.02.006

Crossref Full Text | Google Scholar

Richards, D. R., and Friess, D. A. (2015). A rapid indicator of cultural ecosystem service usage at a fine spatial scale: content analysis of social media photographs. Ecol. Indic. 53, 187–195. doi: 10.1016/j.ecolind.2015.01.034

Crossref Full Text | Google Scholar

Richards, D. R., and Tunçer, B. (2018). Using image recognition to automate assessment of cultural ecosystem services from social media photographs. Ecosystem Serv. 31, 318–325. doi: 10.1016/j.ecoser.2017.09.004

Crossref Full Text | Google Scholar

Robb, J. (2020). Art (pre) history: ritual, narrative and visual culture in Neolithic and bronze age Europe. J. Archaeol. Method Theory 27, 454–480. doi: 10.1007/s10816-020-09471-w

PubMed Abstract | Crossref Full Text | Google Scholar

Romanazzi, G. R., Koto, R., De Boni, A., Palmisano, G. O., Cioffi, M., and Roma, R. (2023). Cultural ecosystem services: a review of methods and tools for economic evaluation. Environ. Sustain. Indic. 20:100304. doi: 10.1016/j.indic.2023.100304

Crossref Full Text | Google Scholar

Scholte, S. S., Van Teeffelen, A. J., and Verburg, P. H. (2015). Integrating socio-cultural perspectives into ecosystem service valuation: a review of concepts and methods. Ecol. Econ. 114, 67–78. doi: 10.1016/j.ecolecon.2015

Crossref Full Text | Google Scholar

Schönpflug, U. (2008). Cultural transmission: Psychological, developmental, social, and methodological aspects. Cambridge: Cambridge University Press.

Google Scholar

Shahbazi, N., Lin, Y., Asudeh, A., and Jagadish, H. (2023). Representation bias in data: a survey on identification and resolution techniques. ACM Comput. Surv. 55, 1–39. doi: 10.1145/3588433

Crossref Full Text | Google Scholar

Shi, J., Honjo, T., Yazawa, Y., and Furuya, K. (2021). Recognition and classification of homogeneous landscape with visitor–employed photography and cloud image annotation API—an example of the Riverscape in Nihonbashi, Tokyo, Japan. Landscape Architecture Frontiers 9, 12–31. doi: 10.15302/J-LAF-1-020054

Crossref Full Text | Google Scholar

Somaini, A. (2023). Algorithmic images: artificial intelligence and visual culture. Grey Room :93, 74–115. doi: 10.1162/grey_a_00383

Crossref Full Text | Google Scholar

Soreanu, C., and German, L. (2022). Visual communication in cultural media. The Rashomon effect in the image globalization paradigm. Rev. Art Educ. 26, 177–185. doi: 10.2478/rae-2023-0025

Crossref Full Text | Google Scholar

Su, L., Chen, W., Zhou, Y., and Fan, L. (2023). Exploring city image perception in social media big data through deep learning: a case study of Zhongshan City. Sustainability 15:3311. doi: 10.3390/su15043311

Crossref Full Text | Google Scholar

Van Berkel, D. B., Tabrizian, P., Dorning, M. A., Smart, L., Newcomb, D., Mehaffey, M., et al. (2018). Quantifying the visual-sensory landscape qualities that contribute to cultural ecosystem services using social media and LiDAR. Ecosystem Serv. 31, 326–335. doi: 10.1016/j.ecoser.2018.03.022

PubMed Abstract | Crossref Full Text | Google Scholar

Vishwakarma, V. K. (2023). Translating cultural nuances: challenges and strategies. ELT Voices 13:8268531. doi: 10.5281/ZENODO.8268531

Crossref Full Text | Google Scholar

Wang, H., Gao, Z., Zhang, X., Du, J., Xu, Y., and Wang, Z. (2024). Gamifying cultural heritage: exploring the potential of immersive virtual exhibitions. Telemat. Inform. Rep. 15:100150. doi: 10.1016/j.teler.2024.100150

Crossref Full Text | Google Scholar

Welch, P. B. (2013). Chinese art: A guide to motifs and visual imagery. Vermont: Tuttle Publishing.

Google Scholar

Winder, S. G., Lee, H., Seo, B., Lia, E. H., and Wood, S. A. (2022). An open-source image classifier for characterizing recreational activities across landscapes. People Nat. 4, 1249–1262. doi: 10.1002/pan3.10382

Crossref Full Text | Google Scholar

Xia, S., Xia, Y., Liu, T., Luo, Y., and Pang, P. C.-I. (2025). Application of deep learning models in gastric cancer pathology image analysis: a systematic scoping review. BMC Cancer 25:1257. doi: 10.1186/s12885-025-14662-3

PubMed Abstract | Crossref Full Text | Google Scholar

You, S., Zheng, Q., Chen, B., Xu, Z., Lin, Y., Gan, M., et al. (2022). Identifying the spatiotemporal dynamics of forest ecotourism values with remotely sensed images and social media data: a perspective of public preferences. J. Clean. Prod. 341:130715. doi: 10.1016/j.jclepro.2022.130715

Crossref Full Text | Google Scholar

Zhao, X., Lu, Y., Huang, W., and Lin, G. (2024). Assessing and interpreting perceived park accessibility, usability and attractiveness through texts and images from social media. Sustain. Cities Soc. 112:105619. doi: 10.1016/j.scs.2024.105619

Crossref Full Text | Google Scholar

Keywords: deep learning, cultural transmission, visual images, systematic scoping review, DL

Citation: Yang J, Liu T, Luo YT and Pang PC-I (2025) Deep learning in cultural imagery dissemination: a systematic scoping review of AI-driven visual transmission mechanisms. Front. Commun. 10:1645168. doi: 10.3389/fcomm.2025.1645168

Received: 11 June 2025; Accepted: 09 September 2025; Published: 19 September 2025.

Edited by:

Ankan Bhattacharya, Hooghly Engineering and Technology College, India

Reviewed by:

Abhranil De, Hooghly Engineering and Technology College, India
Krishanu Kundu, GL Bajaj Group of Institutions, India

Copyright © 2025 Yang, Liu, Luo and Pang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Patrick Cheong-Iao Pang

ORCID: Jinhua Yang, https://orcid.org/0009-0004-4633-1830
Ting Liu, https://orcid.org/0009-0001-0331-262X
Yiming Taclis Luo, https://orcid.org/0009-0002-6117-738X
Patrick Cheong-Iao Pang, https://orcid.org/0000-0002-8820-5443
