Technology Report ARTICLE
AIRR Community Standardized Representations for Annotated Immune Repertoires
- 1 Department of Neurology, Yale School of Medicine, Yale University, United States
- 2 Department of Pathology, Yale School of Medicine, Yale University, United States
- 3 Department of Molecular Biology and Biochemistry, Simon Fraser University, Canada
- 4 Department of Biological Sciences, Simon Fraser University, Canada
- 5 Division of B Cell Immunology, Deutsches Krebsforschungszentrum, Helmholtz-Gemeinschaft Deutscher Forschungszentren (HZ), Germany
- 6 Department of Microbiology and Immunology, College of Medicine, Drexel University, United States
- 7 School of Biomedical Engineering Science and Health Systems, Drexel University, United States
- 8 Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, United States
- 9 Fred Hutchinson Cancer Research Center, United States
- 10 Vaccine Research Center, National Institute of Allergy and Infectious Diseases (NIAID), United States
- 11 Department of Clinical Sciences, University of Texas Southwestern Medical Center, United States
- 12 Icahn School of Medicine at Mount Sinai, United States
Increased interest in the immune system's involvement in pathophysiological phenomena, coupled with decreased DNA sequencing costs, has led to an explosion of antibody and T cell receptor sequencing data (collectively termed "adaptive immune receptor repertoire sequencing" or AIRR-seq). The AIRR Community has been actively working to standardize protocols, metadata, formats, APIs, and other guidelines to promote open and reproducible studies of the immune repertoire. In this paper, we describe the work of the AIRR Community's Data Representation Working Group to develop standardized data representations for storing and sharing annotated antibody and T cell receptor data. Our file format emphasizes ease of use and accessibility, scalability to large data sets, and a commitment to open and transparent science. It consists of a tab-delimited file format with a specific schema. Several popular repertoire analysis tools and data repositories already use this AIRR-seq data format. We hope that others will follow suit in the interest of promoting interoperable standards.
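As a minimal sketch of what a tab-delimited file with a fixed schema looks like in practice, the following Python snippet parses a small in-memory example and checks that a set of required columns is present. The column names, the required set, the T/F boolean encoding, the example records, and the helper name `read_annotations` are illustrative assumptions made here; the authoritative field definitions are those of the AIRR Community schema itself.

```python
import csv
import io

# Hypothetical two-record example. Column names and values are
# illustrative assumptions, not taken from the text above.
EXAMPLE_TSV = (
    "sequence_id\tv_call\tj_call\tjunction_aa\tproductive\n"
    "seq1\tIGHV1-69*01\tIGHJ4*02\tCARDYW\tT\n"
    "seq2\tIGHV3-23*01\tIGHJ6*02\tCAKGGW\tF\n"
)

# Assumed minimal set of columns that every file must provide.
REQUIRED_COLUMNS = {"sequence_id", "v_call", "j_call"}


def read_annotations(handle):
    """Read tab-delimited records and verify the required columns exist."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    return list(reader)


records = read_annotations(io.StringIO(EXAMPLE_TSV))

# Select the identifiers of records flagged as productive ("T").
productive_ids = [r["sequence_id"] for r in records if r["productive"] == "T"]
```

Because each record is one line of plain text with named columns, files of this kind can be streamed row by row and read with standard tooling in most languages, which is consistent with the emphasis on accessibility and scalability described above.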
Keywords: antibody, immunoglobulin, T cell receptor, sequencing, file formats, data representation, immunology, repertoire sequencing
Received: 30 May 2018;
Accepted: 05 Sep 2018.
Edited by: Benny Chain, University College London, United Kingdom
Reviewed by: James M. Heather, Massachusetts General Hospital, Harvard Medical School, United States
Mikael Salson, Université de Lille, France
Copyright: © 2018 Vander Heiden, Marquez, Marthandan, Bukhari, Busse, Corrie, Hershberg, Kleinstein, Matsen, Ralph, Rosenfeld, Schramm, Christley and Laserson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Dr. Scott Christley, University of Texas Southwestern Medical Center, Department of Clinical Sciences, Dallas, TX, United States, firstname.lastname@example.org
Dr. Uri Laserson, Icahn School of Medicine at Mount Sinai, New York City, NY, United States, email@example.com