Generative AI for Data Augmentation in Small Biomedical Datasets

  • 631

    Total views and downloads

About this Research Topic

Submission deadlines

  1. Manuscript Submission Deadline: 24 March 2026

  2. This Research Topic is currently accepting articles.

Background

Data augmentation is the process of expanding or enhancing a dataset by adding altered or synthetic representations of the original data. Generative AI can produce synthetic data that resembles real data while being anonymised, and it can also be used to fill in missing values. Data security and privacy are two major concerns when generative AI is deployed: generative models need large datasets to produce accurate and meaningful results, but managing such large volumes of sensitive data raises privacy and security issues. Generative artificial intelligence, or generative AI, is the use of AI to produce original text, images, music, audio, and video content. It is powered by large AI models known as foundation models, which can multitask and perform a wide range of tasks such as classification, question answering, and summarisation. Beyond automating tasks, generative AI can produce information that is useful to physicians. For instance, it can analyse patient data to anticipate health outcomes, detect potential health hazards, and support individualised treatment strategies.

For images, data augmentation typically means altering an image by flipping, cropping, rotating, scaling, or applying other modifications. The aim is to create more examples of a class while preserving the underlying category. Augmentation can be applied at training time, at test time, or both. A large image dataset, for instance, can be augmented by cropping, resizing, or adding noise. A text dataset for natural language processing (NLP) can instead be augmented by substituting synonyms or paraphrasing passages. Generative AI lets users produce new content rapidly from a range of inputs. These models can accept inputs and produce outputs in the form of text, images, sound, animation, 3D models, and other kinds of data. Because generative models can produce new data that is similar to their training data, they are also a useful tool for creative industries such as music and art, and they can interpret and generate data in the context of the input they are given. The legal issues surrounding AI are still developing. The main concerns include liability, intellectual property rights, and regulatory compliance. The question of accountability arises when an AI-based decision maker is involved in a system malfunction or an incident that could injure someone.
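
As a concrete illustration of the classical image transformations mentioned above (flipping, rotating, cropping, adding noise), the following is a minimal, dependency-free Python sketch. It is not taken from any submission or library: the function names, the representation of a grayscale image as nested lists with intensities in [0, 1], and the noise level of 0.05 are all assumptions made for illustration. In practice, an array library and a dedicated augmentation toolkit would normally be used instead.

```python
import random

# Assumption: a grayscale image is a list of rows of pixel intensities in [0, 1].
Image = list[list[float]]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add zero-mean Gaussian noise and clamp the result to [0, 1]."""
    return [
        [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
        for row in img
    ]


def augment(img: Image, label: str, rng: random.Random) -> list[tuple[Image, str]]:
    """Return the original plus three variants, all keeping the same label."""
    variants = [
        img,
        flip_horizontal(img),
        rotate_90(img),
        add_noise(img, 0.05, rng),  # hypothetical noise level
    ]
    return [(v, label) for v in variants]
```

Each variant keeps the label of the original image, which reflects the stated aim of creating more examples of a class while preserving the underlying category. Whether a given transformation is actually label-preserving depends on the data: a left-right flip, for example, may be inappropriate for images in which laterality is clinically meaningful.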

In the future, generative AI models are expected to integrate data from several modalities, such as text, images, and audio, seamlessly. This integration would be a significant step forward, opening the door to interactive and comprehensive generative systems. Applications already extend beyond content creation. Cybersecurity professionals, for instance, can use generative AI to simulate unsafe environments in order to evaluate security rules and controls. AI technologies can also examine historical data for patterns that indicate possible security threats, so that teams can reduce these risks and strengthen their security posture. Developers can generate new landscapes, text effects, and characters with the aid of generative AI. It can also write new code, saving engineers the time and effort required to do so by hand. This themed article collection, Generative AI for Data Augmentation in Small Biomedical Datasets, welcomes contributions from a variety of fields and viewpoints.

Potential topics include, but are not limited to, the following:

  • Methods for Enhancing Data in Biomedical Imaging
  • Generating Synthetic Data for Rare Disease Datasets
  • Self-Guided Education for Enhancing Data
  • Data Augmentation for Genomic Data using GANs
  • Reducing Data Imbalance with Generative AI
  • Healthcare Data Augmentation While Preserving Privacy
  • Producing Label-Rich Biomedical Data of Superior Quality
  • Augmenting Cross-Modality Data using Generative Models
  • Generative AI for Biomedical Image Anomaly Detection
  • Temporal Biomedical Data Augmentation
  • Assessing the Quality of Data Augmentation in Biomedical Applications
  • Combining Transfer Learning and Generative Artificial Intelligence with Biomedical Data
  • Regulatory and Ethical Issues with Synthetic Biomedical Data

Article types and fees

This Research Topic accepts the following article types, unless otherwise specified in the Research Topic description:

  • Brief Research Report
  • Clinical Trial
  • Community Case Study
  • Conceptual Analysis
  • Data Report
  • Editorial
  • FAIR² Data
  • General Commentary
  • Hypothesis and Theory

Articles that are accepted for publication by our external editors following rigorous peer review incur a publishing fee, which is charged to authors, institutions, or funders.

Keywords: Rare Disease Datasets, Gen AI, Data Augmentation, Biomedical Imaging, GANs

Important note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.

Manuscripts can be submitted to this Research Topic via the main journal or any other participating journal.
