Research Topic

Training Big Data: Fairness and Bias in the Digital Age

About this Research Topic

Online platforms such as Wikipedia, Google Photos, and social media have become commonplace, but they also serve as sources of commonly used training datasets for machine learning. Existing research indicates that, for specific platforms, such datasets, including structured data, images, texts, and videos, can be biased. Biased human-curated datasets broadly reflect the stereotypes and prejudices of our societies, and using them to train machine learning models can, in turn, bias those models. As a result, biased training datasets (encoding, for example, gender or racial stereotypes) can have dramatic consequences for the fairness of applications that use machine learning models.

When a model trained on biased data is used for decision making, unfair decisions follow. Such decisions can unjustifiably exclude members of society from certain benefits, for example getting a loan, having access to mobility as a service or to justice, or being successful in a job application. It is of paramount importance to develop techniques and frameworks for identifying and measuring the fairness of the training data fed into machine learning algorithms. By identifying and analyzing the nature of different data-driven algorithmic biases, and by making them measurable and transparent, we can raise awareness and address them appropriately. Furthermore, existing datasets and pre-trained models do not always allow for easy analysis or modification of the underlying training data. It is therefore essential to develop methods to identify and measure the biases encoded inside such more opaque datasets and models.
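To make "measurable and transparent" concrete, here is a minimal sketch of one simple, widely discussed fairness measure, the demographic parity gap, computed directly on a toy labeled dataset. It is purely illustrative, not a measure endorsed by this Research Topic; the field names "gender" and "loan_approved" and the helper demographic_parity_gap are hypothetical.

    # Minimal sketch: the largest difference in positive-label rates between
    # demographic groups in a training dataset (demographic parity gap).
    from collections import defaultdict

    def demographic_parity_gap(records, group_key, label_key):
        """Return (gap, per-group positive rates) for binary labels."""
        positives = defaultdict(int)
        totals = defaultdict(int)
        for r in records:
            totals[r[group_key]] += 1
            positives[r[group_key]] += r[label_key]
        rates = {g: positives[g] / totals[g] for g in totals}
        return max(rates.values()) - min(rates.values()), rates

    # Hypothetical toy data; equal rates across groups would yield a gap of 0.0.
    data = [
        {"gender": "f", "loan_approved": 1},
        {"gender": "f", "loan_approved": 0},
        {"gender": "m", "loan_approved": 1},
        {"gender": "m", "loan_approved": 1},
    ]
    gap, rates = demographic_parity_gap(data, "gender", "loan_approved")
    print(rates)  # {'f': 0.5, 'm': 1.0}
    print(gap)    # 0.5

A nonzero gap on training data does not by itself prove unfairness, but it flags a skew worth investigating before the data is fed into a model.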

This Research Topic welcomes contributions from practical and theoretical perspectives that address, but are not limited to, the following topics:

● Data transparency
● Applications of fairness measures to community challenges
● Methods for, and applications of, the ethical evaluation of datasets or pre-trained models
● Novel methodologies for evaluating bias in algorithmic decisions and algorithms in the context of image processing, natural language processing, and probabilistic methods (a short illustrative sketch follows this list)
● Studies that identify and characterize different types of bias, including but not limited to gender, origin or racial, religious, socioeconomic, or age bias, in existing datasets, algorithms, or models
● The impact of cultural aspects on bias in multilingual datasets
● Methods for, and applications of, fairly curating training datasets
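As one illustration of what a bias-evaluation method for natural language processing can look like, the sketch below computes a simplified WEAT-style association score over word embeddings. The tiny three-dimensional vectors stand in for a real pre-trained model and are invented purely for this example; this is one possible test among many, not a method prescribed by the Research Topic.

    # Simplified WEAT-style association: mean cosine similarity of a word to
    # one attribute set (e.g. male terms) minus its mean similarity to another.
    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    def association(word_vec, attrs_a, attrs_b):
        return (np.mean([cosine(word_vec, a) for a in attrs_a])
                - np.mean([cosine(word_vec, b) for b in attrs_b]))

    # Hypothetical 3-d embeddings standing in for a real pre-trained model.
    emb = {
        "engineer": np.array([0.9, 0.1, 0.0]),
        "nurse":    np.array([0.1, 0.9, 0.0]),
        "he":       np.array([1.0, 0.0, 0.0]),
        "she":      np.array([0.0, 1.0, 0.0]),
    }
    male, female = [emb["he"]], [emb["she"]]
    for word in ("engineer", "nurse"):
        print(word, round(association(emb[word], male, female), 3))
    # engineer  0.883  -> leans toward the "male" attribute set
    # nurse    -0.883  -> leans toward the "female" attribute set

In a real study the embeddings would come from a pre-trained model, and the target and attribute word sets would be chosen and validated carefully; the point here is only that such associations can be quantified.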

Topic Editor Prof. Dr. Mascha Kurpicz-Briki is a member of the board of directors of the IFAA cooperative in Bern, Switzerland. All other Topic Editors declare no competing interests with regard to the Research Topic subject.


Keywords: machine learning, training data, fairness, digital ethics, bias


Important Note: All contributions to this Research Topic must be within the scope of the section and journal to which they are submitted, as defined in their mission statements. Frontiers reserves the right to guide an out-of-scope manuscript to a more suitable section or journal at any stage of peer review.


About Frontiers Research Topics

With their unique mix of contributions, from Original Research to Review Articles, Research Topics bring together the most influential researchers, the latest key findings, and historical advances in a hot research area. Find out more about how to host your own Frontiers Research Topic or contribute to one as an author.

Submission Deadlines

Manuscript: 24 November 2020
Manuscript Extension: 24 December 2020
