Edited by: Rajan Singh, Charles R. Drew University of Medicine and Science, United States
Reviewed by: Asish Chaudhuri, Buck Institute for Research on Aging, United States; Christopher Gerner, University of Vienna, Austria
This article was submitted to Molecular Medicine, a section of the journal Frontiers in Cell and Developmental Biology
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
In the framework of the Human Proteome Project initiative, we aim to improve the mapping and characterization of the mitochondrial proteome. In this work we implemented an experimental workflow, combining classical biochemical enrichments and mass spectrometry, to pursue a much deeper definition of the mitochondrial proteome and possibly mine mitochondrial uncharacterized
In 2010, the international scientific community launched a joint effort for the comprehensive mapping of the Human Proteome. This program has been split into two sections: the Chromosome-based Human Proteome Project (C-HPP), aimed at finding high-stringency evidence for all proteins encoded by the human genome, and the Biology/Disease Human Proteome Project (B/D-HPP), whose mission is to annotate all the encoded proteins and to provide verified insights on their functions in health and disease (
The Italian Proteomics Association has dedicated its effort to extensively identifying, characterizing and quantifying the human mitochondrial proteome by mass spectrometry (MS)-based proteomic technologies (
The mitochondrial organelle is characterized by a peculiar structure composed of four sub-mitochondrial compartments (outer membrane, inter-membrane space, inner-membrane and matrix) through which mitochondrial protein carriers translocate to fulfill biochemical functions (
Several enrichment procedures are commonly used for the biochemical study of mitochondrial metabolic activity. In our previous work, we proposed how to assess the quality of a standard mitochondrial preparation in a proteomic workflow (
Data on human proteins are integrated in the human-centric platform neXtProt, which includes all manually reviewed UniProtKB entries combined with related information derived from proteomics, transcriptomics and genomics data, to report intra- and inter-individual diversity (
Although most human proteins have been annotated in the human proteome parts list, 10% of the complete human proteome lacks functional information. The terms
This study focused on improving the quality of mitochondrial proteome characterization, aiming to increase the recovery of integral membrane proteins.
We applied a combined biochemistry and MS-based experimental protocol to investigate the mitochondrial proteome of HeLa cells.
Upon a two-step membrane solubilization protocol, followed by extensive MS analysis, we identified 3,187 and 3,275 proteins in the two solubilized fractions, F1 and F2, respectively. Qualitative evaluation of these ID lists returned 4,230 proteins uniquely identified in the whole mitochondrial enrichment.
According to the specific Integrated Mitochondrial Protein Index (IMPI Q2 2018) web resource (
In this report we demonstrate that such an experimental workflow allows a deeper mapping of the mitochondrial proteome, providing new ground for depicting the mitonuclear genomic relationship. The present work draws an initial framework to search for uncharacterized mitochondrial proteins in different biological and pathological models, which might precede the development of functional studies.
This section is described in detail in
We have implemented a combined biochemistry and MS-based experimental workflow to extensively characterize the mt-proteome.
Briefly, after isolating the mitochondria from HeLa cells (
To begin with, we treated mitochondria with the detergent digitonin, which allowed us to separate a fraction of soluble proteins, named Fraction 1 (F1), from a denser pellet presumably containing the more hydrophobic mt-inner membrane. We then treated this pellet with n-dodecyl-β-(D)-maltoside to solubilize the remaining mt-membrane, and we called this protein extract Fraction 2 (F2). A bottom-up approach was then used for MS analysis, which involves the analysis of proteolytic peptides released upon enzymatic digestion. To further increase the proteome coverage, each sample was subjected to three different proteolytic digestions, and mass spectra were acquired with two different acquisition methods: Data Dependent Acquisition (DDA) and Data Independent Acquisition (DIA).
In particular, we used trypsin, chymotrypsin and Glu-C for both protein fractions F1 and F2. Digested peptides were analyzed by LC-MS DDA on an Orbitrap Elite mass spectrometer (Thermo) and by HDMSE, a DIA procedure, on a Synapt G2S
Three technical replicates were acquired for each of the twelve experimental conditions (i.e., 2 sub-fractions × 3 proteolytic digestions × 2 acquisition protocols).
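The experimental design described above can be enumerated explicitly. The following is a minimal sketch, not the authors' acquisition script; the variable names and the run tuple layout are illustrative assumptions, while the factor levels and the replicate count come from the text.

```python
from itertools import product

# Factor levels taken from the text; names are illustrative only.
fractions = ["F1", "F2"]
enzymes = ["trypsin", "chymotrypsin", "Glu-C"]
acquisitions = ["DDA", "DIA"]
technical_replicates = 3

# Every combination of fraction, enzyme and acquisition mode is one condition.
conditions = list(product(fractions, enzymes, acquisitions))

# Each condition is acquired in triplicate, giving the full list of LC-MS runs.
runs = [
    (fraction, enzyme, acquisition, replicate)
    for fraction, enzyme, acquisition in conditions
    for replicate in range(1, technical_replicates + 1)
]

print(len(conditions))  # 12 experimental conditions
print(len(runs))        # 36 LC-MS runs in total
```

Listing the runs this way makes it easy to check that every fraction, enzyme and acquisition mode is represented equally before the identification lists are merged.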
Combining the lists of protein IDs detected in the DIA and DDA experiments, we retrieved 3,187 protein IDs from F1 and 3,275 from F2.
These numbers were obtained by merging, for each fraction, the lists of proteins identified in the DDA and DIA experiments, performed in triplicate on the three different proteolytic digestions, and excluding redundant forms.
Finally, comparing the proteins identified in F1 and F2 and removing duplicates, we end up with a dataset of 4,230 protein IDs uniquely identified in the whole mitochondrial enrichment.
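The merging step amounts to a set union of identifier lists. The sketch below is an illustration under that assumption; the helper `merge_ids` and the toy accessions are hypothetical, and only the three totals (3,187, 3,275 and 4,230) come from the text. By inclusion-exclusion, those totals imply the number of IDs shared between F1 and F2.

```python
def merge_ids(*id_lists):
    """Merge several lists of protein IDs, dropping redundant entries."""
    merged = set()
    for ids in id_lists:
        merged.update(ids)
    return merged


# Toy example with hypothetical accessions (not taken from the dataset).
f1_toy = ["P00001", "P00002", "P00003"]
f2_toy = ["P00003", "P00004"]
print(sorted(merge_ids(f1_toy, f2_toy)))

# Totals reported in the text.
n_f1, n_f2, n_union = 3187, 3275, 4230

# Inclusion-exclusion: |F1 ∩ F2| = |F1| + |F2| - |F1 ∪ F2|
n_shared = n_f1 + n_f2 - n_union
n_f1_only = n_f1 - n_shared
n_f2_only = n_f2 - n_shared

print(n_shared, n_f1_only, n_f2_only)  # 2232 955 1043
```

Under this reading, roughly half of the merged IDs are found in both fractions, while each detergent step also contributes its own set of identifications.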
The list of all the proteins identified in DDA experiments is reported in
In
Venn diagram of protein identifications obtained by the multiple-enzyme digestion approach. Venn diagrams represent the overlap of the lists of protein IDs detected by LC-MS/MS experiments upon proteolytic digestion with the three enzymes trypsin, chymotrypsin and Glu-C. The number of IDs uniquely identified by trypsin, Glu-C or chymotrypsin digestion, and the number of proteins identified by two or three alternative and independent enzymatic digestions, are indicated, respectively, in F1
In both cases the highest percentage of unique identifications arises from the analysis of tryptic peptides (46% of F1 in panel A, 50.5% of F2 in panel B), while a small percentage of proteins (less than 20%) was retrieved in each of the three different batches of proteolytic peptides.
As might be expected, small groups of protein identifications are associated with peptide spectra obtained from two proteolytic digestions.
This multi-enzyme approach led to a considerably higher output of MS spectra, resulting in specific identifications which would have been lost using a traditional single-digestion protocol.
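The regions of a three-set Venn diagram such as the one described here can be computed directly from the ID lists. This is a minimal sketch with hypothetical toy identifiers; the function name `venn3_regions` and the choice of the total number of distinct IDs as the denominator for percentages are assumptions, not details stated in the text.

```python
def venn3_regions(a, b, c):
    """Count the seven exclusive regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": len(a - b - c),
        "b_only": len(b - a - c),
        "c_only": len(c - a - b),
        "a_b": len((a & b) - c),
        "a_c": len((a & c) - b),
        "b_c": len((b & c) - a),
        "a_b_c": len(a & b & c),
    }


# Hypothetical protein IDs per digestion (a = trypsin, b = chymotrypsin, c = Glu-C).
trypsin = {"P1", "P2", "P3", "P4", "P5"}
chymotrypsin = {"P4", "P5", "P6"}
gluc = {"P5", "P7"}

regions = venn3_regions(trypsin, chymotrypsin, gluc)
total = sum(regions.values())  # number of distinct IDs across all digestions

# Share of IDs seen only with trypsin, relative to all distinct IDs (assumed denominator).
pct_trypsin_only = 100 * regions["a_only"] / total
print(regions)
print(round(pct_trypsin_only, 1))
```

Because the seven regions are mutually exclusive, their sum equals the size of the merged list, which gives a simple consistency check on any reported percentages.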
The use of different MS acquisition approaches also extended the data set obtained, although in our experimental configuration the contribution of the DDA data set results in an increase of <10% in the number of identified proteins with respect to the DIA dataset, as shown in
To extract from these lists a subset of proteins whose mitochondrial localization is supported by strong experimental evidence, we interrogated the Integrated Mitochondrial Protein Index (IMPI Q2 2018) gene database (
In order to give an overall number of total mt-proteins identified in this work, we merged the mt-protein list obtained in F1 and F2 according to IMPI classification and, by excluding the redundant forms, we were able to retrieve 1,014 mt-proteins (
In order to search our datasets for mitochondrial proteins not yet annotated for their functions and belonging to the neXtProt PE1 category (dark proteins, or uncharacterized PE1, uPE1), we matched our mt-protein ID list against the neXtProt dataset query NXQ_00022, which selects only those entries. Among those proteins, we retrieved 22 of the previously defined mitochondrial proteins, listed in
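List of mt-dark proteins identified in HeLa dataset.
neXtProt accession | Protein name | IMPI evidence | Sub-mitochondrial location (IMPI) | Human Protein Atlas location | Enzyme(s) | Fraction(s) | GRAVY score |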
NX_O60941-1 | Dystrobrevin beta | Predicted | Unknown | Mitochondria (A) | Try; Glu-C | F1, F2 | −0.55 |
NX_Q3SXM5-1 | Inactive hydroxysteroid dehydrogenase-like protein 1 | Known | OM | Intracellular, Membrane (P) | Chym; Try | F2 | +0.15 |
NX_Q4VC31-1 | Coiled-coil domain-containing protein 58 | Known | IMS | Nucleoli, Mitochondria (A) | Try; Chym; Glu-C | F2 | −0.60 |
NX_Q56VL3-1 | OCIA domain-containing protein 2 | Known | Unknown | Mitochondria (E) | Try | F2 | −0.26 |
NX_Q8IYQ7-1 | Threonine synthase-like 1 | Known | Matrix | Nuclear bodies (A), Mitochondria (A), Cytosol (A) | Try | F2 | −0.13 |
NX_Q8NFV4-1 | Protein ABHD11 | Known | Matrix | Mitochondria (S) | Try; Chym | F2 | −0.09 |
NX_Q96EX1-1 | Small integral membrane protein 12 | Predicted | Unknown | Mitochondria (A) | Try | F2 | −0.53 |
NX_Q96C01-1 | Protein FAM136A | Known | IMS | Mitochondria (A) | Try; Glu-C | F2 | −0.43 |
NX_Q96ER9-1 | Coiled-coil domain-containing protein 51 | Known | Matrix | Nucleosome (S), Mitochondria (S), Centrosome (A) | Try; Glu-C | F2 | −0.38 |
NX_P56378-1 | 6.8 kDa mitochondrial proteolipid | Known | IM | Mitochondria (S), Nucleoli (S) | Try | F1, F2 | −0.02 |
NX_Q9GZT6-1 | Coiled-coil domain-containing protein 90B | Known | Matrix | Mitochondria (E) | Try; Glu-C | F2 | −0.55 |
NX_A8MTT3-1 | Protein CEBPZOS | Known | IMS | Nucleoplasm (A) | Try; Glu-C | F2 | −0.27 |
NX_Q9H4I3-1 | TraB domain-containing protein | Known | OM | Nucleus (A), Mitochondria (A) | Try | F2 | −0.21 |
NX_Q9UFN0-1 | Protein NipSnap homolog 3A | Known | Matrix | not available | Try; Glu-C | F2 | −0.37 |
NX_Q6P1X6-1 | UPF0598 protein C8orf82 | Known | Matrix | Nucleus (A) | Try; Chym | F2 | −0.23 |
NX_Q8N2U0-1 | Transmembrane protein 256 | Predicted | Unknown | Vesicles (A) | Try | F2 | +0.46 |
NX_Q8WVI0-1 | Small integral membrane protein 4 | Predicted | Unknown | Nucleoplasm (A), Mitochondria (A) | Try | F2 | −0.54 |
NX_Q8WW59-1 | SPRY domain-containing protein 4 | Known | Matrix | Nucleoplasm (A) | Try | F1, F2 | −0.07 |
NX_Q96BQ5-1 | Coiled-coil domain-containing protein 127 | Known | OM/IMS | Nucleus (S), Nucleoli (S) | Try; Chym | F1, F2 | −0.72 |
NX_Q96DB5-1 | Regulator of microtubule dynamics protein 1 | Known | OM/IMS | Centrosomes (S), Actin filaments (S) | Try | F1, F2 | −0.37 |
NX_Q96KF7-1 | Small integral membrane protein 8 | Known | OM | Vesicles (A) | Try; Chym | F2 | −0.55 |
NX_Q9NU23-1 | LYR motif-containing protein 2 | Known | Matrix | Cytosol (A) | Try | F2 | −0.70 |
The table reports the IMPI evidence for mitochondrial localization (predicted or known to be mitochondrial) and the location within mitochondria, plus the Human Protein Atlas
Most of them (20 out of 22) have a GRAVY score below 0, meaning that they are more likely hydrophilic (globular) proteins; conversely, the 2 with a score above 0 are more likely hydrophobic (membrane) proteins (
Interestingly, these last two proteins (NX_Q3SXM5-1 with GS: +0.15; NX_Q8N2U0-1 with GS: +0.46) were both identified in sample fraction F2.
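For readers unfamiliar with the GRAVY score, it is conventionally computed as the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below assumes that standard definition; the text does not state which tool produced the scores, and the example sequences are hypothetical rather than taken from the dataset.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Hypothetical peptides, for illustration only.
hydrophobic_example = "ILVLLAVFG"
hydrophilic_example = "KDERSQNDK"

print(gravy(hydrophobic_example) > 0)  # positive score: more hydrophobic
print(gravy(hydrophilic_example) < 0)  # negative score: more hydrophilic
```

A positive score therefore points to a more membrane-like composition and a negative score to a more soluble one, which is the reading applied to the table above.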
Moreover, although not all the proteins have a complete annotation, the theoretical localization inferred from IMPI confirms that we successfully dissolved both the outer membrane (OM) and the inner membrane (IM) of mitochondria.
For 5 of them the localization is still unknown, but if we look, for instance, at proteins theoretically localized in the mitochondrial matrix, we detected 6 out of 8 exclusively in the F2 fraction (e.g., NX_Q8IYQ7-1, Threonine synthase-like 1) and 2 out of 8 in both F2 and F1 (e.g., NX_Q6P1X6-1, UPF0598 protein C8orf82), suggesting that the double detergent treatment makes it possible to disaggregate the complex inner mitochondrial membrane structure.
To assess whether the information retrieved with this experimental workflow achieves a much deeper mapping of the mitochondrial proteome, we investigated the presence of these uPE1 proteins in other mitochondrial proteome datasets.
To this aim, this dark protein IDs list has been matched to the ProteomeXchange dataset PXD007053, previously deposited by our group (
Distribution of uPE1 mt-dark proteins identified in PXD007053 repository database.
NX_O60941-1 | Dystrobrevin beta | – | – | – | – | – | – | – | – | – | – |
NX_Q3SXM5-1 | Inactive hydroxysteroid dehydrogenase-like protein 1 | – | – | – | √ | √ | – | – | – | – | – |
NX_Q4VC31-1 | Coiled-coil domain-containing protein 58 | – | – | √ | √ | – | √ | – | – | ||
NX_Q56VL3-1 | OCIA domain-containing protein 2 | – | √ | √ | – | – | √ | √ | √ | √ | |
NX_Q8IYQ7-1 | Threonine synthase-like 1 | – | – | – | √ | – | – | √ | √ | – | – |
NX_Q8NFV4-1 | Protein ABHD11 | – | – | – | – | √ | – | √ | √ | – | – |
NX_Q96EX1-1 | Small integral membrane protein 12 | – | – | – | – | – | – | – | – | – | – |
NX_Q96C01-1 | Protein FAM136A | – | – | – | √ | √ | – | – | √ | – | – |
NX_Q96ER9-1 | Coiled-coil domain-containing protein 51 | – | – | – | √ | – | – | – | – | – | |
NX_P56378-1 | 6.8 kDa mitochondrial proteolipid | – | – | – | – | – | – | – | – | – | – |
NX_Q9GZT6-1 | Coiled-coil domain-containing protein 90B | – | – | √ | – | – | – | – | – | – | – |
NX_A8MTT3-1 | Protein CEBPZOS | – | – | – | – | – | – | – | – | – | – |
NX_Q9H4I3-1 | TraB domain-containing protein | – | – | √ | – | – | √ | – | √ | – | – |
NX_Q9UFN0-1 | Protein NipSnap homolog 3A | – | – | √ | – | – | – | √ | √ | √ | – |
NX_Q6P1X6-1 | UPF0598 protein C8orf82 | – | – | – | √ | √ | – | √ | √ | √ | – |
NX_Q8N2U0-1 | Transmembrane protein 256 | – | – | – | – | – | – | – | – | – | – |
NX_Q8WVI0-1 | Small integral membrane protein 4 | – | – | – | – | – | – | – | – | – | – |
NX_Q8WW59-1 | SPRY domain-containing protein 4 | – | – | – | √ | √ | – | – | – | – | – |
NX_Q96BQ5-1 | Coiled-coil domain-containing protein 127 | – | – | √ | – | – | – | – | – | – | – |
NX_Q96DB5-1 | Regulator of microtubule dynamics protein 1 | – | – | √ | √ | – | – | √ | √ | – | – |
NX_Q96KF7-1 | Small integral membrane protein 8 | – | – | – | – | – | – | – | – | – | – |
NX_Q9NU23-1 | LYR motif-containing protein 2 | – | – | – | – | – | – | – | – | – | – |
Interestingly, the majority of the uPE1 dark proteins found in the HeLa dataset with the experimental workflow described here were not detected in the same cell line in the repository dataset PXD007053. In particular, comparing these two HeLa datasets, only one uPE1 protein out of 22 was found in both. Such a difference is to be expected given the higher dimensionality of the current HeLa dataset, which was collected using a number of specific experimental features. The results are slightly different if we compare the dark uPE1 proteins of our dataset with those obtained from other cell lines, where we can retrieve a variable number of them: 6 in H28, 8 in Hek293, 7 in HepG2, 2 in HUVEC, 7 in MDA MB231, 9 in THP1, 2 in U2OS, 1 in SHSY5Y. This evidence indicates that a specific experimental workflow to sub-fractionate mitochondria is needed in order to identify a higher number of dark proteins, mainly in cell line models, such as HeLa, U2OS or SHSY5Y, whose complete proteome has been well characterized in recent years (
To confirm the low abundance of those proteins we exploited the quantitative analysis performed in our DIA experiments with the PLGS 3.0.3 software (Waters Co) (
To plot the abundance of each dark protein we normalized their concentration (in fmol) versus the concentration of Citrate Synthase (neXtProt Accession:
Relative abundance of mt-dark proteins in HeLa mitochondria enrichment. Relative label-free quantification of each mt-dark protein in comparison with Citrate Synthase (NX_O75390-1) abundance, expressed as a percentage ratio and calculated from trypsin-DIA experiments.
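The normalization behind this figure is a simple percentage ratio of each protein's amount (in fmol) to that of Citrate Synthase. The following is a minimal sketch of that arithmetic; the function name, the protein labels and all numeric values are hypothetical and are not results from this study.

```python
def percent_of_reference(protein_fmol, reference_fmol):
    """Express a protein amount as a percentage of a reference protein amount."""
    if reference_fmol <= 0:
        raise ValueError("reference amount must be positive")
    return 100.0 * protein_fmol / reference_fmol


# Hypothetical amounts in fmol, for illustration only.
citrate_synthase_fmol = 200.0
dark_protein_fmol = {"dark_protein_A": 2.0, "dark_protein_B": 0.5}

relative_abundance = {
    name: percent_of_reference(amount, citrate_synthase_fmol)
    for name, amount in dark_protein_fmol.items()
}
print(relative_abundance)  # {'dark_protein_A': 1.0, 'dark_protein_B': 0.25}
```

Normalizing to an abundant matrix enzyme puts all dark proteins on a common scale, so that values well below 100% indicate low abundance relative to the reference.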
Our previous work on mitochondrial proteomic standardization has established standard protocols for mitochondrial enrichment providing analytical key performance criteria (
With this work, using a method of higher dimensionality in terms of proteome depth, we aim to overcome this limit and increase the information about the mitochondrial proteome, possibly by increasing the detection of membrane proteins and the likelihood of finding new dark proteins associated with mitochondria.
For this purpose, we conceived an experimental workflow combining detergent-mediated membrane disruption with a deep proteomic investigation by means of LC-MS/MS label-free bottom-up experiments. Specifically, we pursued multiple enzymatic protein digestions (i.e., trypsin, chymotrypsin and Glu-C) to generate peptides, which were subsequently analyzed in both DIA and DDA MS experiments.
We set up the entire experimental procedure in HeLa cells, though we are aware that they present some limitations (
Firstly, we enriched mitochondria from HeLa cells. Then, we performed a sub-mitochondrial fractionation by means of incubation with specific detergents. We performed a first milder extraction with digitonin buffer and a subsequent solubilization of the most hydrophobic component by using n-dodecyl-β-(D)-maltoside. Both digitonin and n-dodecyl-β-(D)-maltoside are mild non-ionic detergents for the solubilization of biological membranes. In detail, digitonin is one of the mildest detergents used and for this reason we chose it to perform the first solubilization step. n-dodecyl-β-(D)-maltoside is more effective in solubilizing integral membrane proteins in native conformation and for this purpose we used it to solubilize and extract proteins from the lipophilic pellet (
The multiple-enzyme approach provided a remarkable increase in the number of identifications. Indeed, cleavage at different amino acid residues (K, R for trypsin; F, W, Y for chymotrypsin; D, E for Glu-C) makes it possible to maximize not only the protein coverage but also the number of IDs, as confirmed by the comparison with what was found previously in HeLa cells (
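To illustrate why different cleavage specificities yield complementary peptide sets, here is a deliberately simplified in silico digestion. It assumes cleavage C-terminal to the residues listed above, with no missed cleavages and no proline or other sequence-context rules, so it is a sketch of the principle rather than a model of real enzyme behavior or of the search settings used in this work. The function name and the example sequence are hypothetical.

```python
# Cleavage residues as listed in the text; cleavage is assumed to occur
# C-terminal to each of them (simplified: no missed cleavages, no exceptions).
CLEAVAGE_RESIDUES = {
    "trypsin": set("KR"),
    "chymotrypsin": set("FWY"),
    "Glu-C": set("DE"),
}


def digest(sequence, enzyme):
    """Split a protein sequence after every cleavage residue of the enzyme."""
    sites = CLEAVAGE_RESIDUES[enzyme]
    peptides = []
    start = 0
    for index, residue in enumerate(sequence):
        if residue in sites:
            peptides.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


# Hypothetical sequence, for illustration only.
protein = "AKDEFGR"
for enzyme in CLEAVAGE_RESIDUES:
    print(enzyme, digest(protein, enzyme))
```

Each enzyme cuts the same sequence into a different set of peptides, so a region that gives a peptide too short or too long for detection with one protease may be covered by another.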
The DDA mode provides selective, quantitative and sensitive analysis of peptides by defining a transition list of the most intense precursor ions for further fragmentation. The main advantage of this method is its high accuracy and sensitivity for the selected targets, but proteins excluded from this transition list are not acquired. To overcome this limit, we also operated in DIA mode, which acquires spectra of all detectable precursors and their products by combining the sequential isolation of a large precursor window with full product ion spectrum acquisition (
Given the different technologies and instrumental setups used in this work, we did not expect a 100% overlap in the identification results; on the contrary, we aimed to enlarge the information obtained. As a matter of fact, we could confirm the identity of our entries by a confident and significant overlap between the DIA and DDA IDs of 43.8% for the F1 fraction and 45.3% for F2, as shown in
Matching the total mitochondrial ID list obtained in this work (1,014 mt-proteins) with the current neXtProt database query for the Dark Proteome (query: NXQ_00022), we were able to detect and quantify 22 mitochondrial uPE1 dark proteins. As we reported in
In conclusion, this work presents a valid MS-based experimental workflow able to extensively characterize the mitochondrial proteome. It offers a confident, comprehensive and publicly available mitochondrial proteomic dataset, which could be a source for further mitochondrial investigations in different experimental models.
Moreover, this work provides important new insights into the mt-dark proteome. In this perspective, proteomic studies may provide significant contributions to pharmaceutical research and may support the development of personalized medicine applications.
The mass spectrometry datasets generated for this study can be found in the ProteomeXchange Consortium via the PRIDE (
FM, LP, and AU designed the experiments and wrote the manuscript. FM performed the cellular biology, biochemistry, and bio-informatic analysis. FM, VC, VG, MR, FI, and SP carried out the LC-MS experiments and data analysis. LP, AU, and MC revised the manuscript. All authors read and approved the final version of the manuscript.
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
This work was financially supported by the Catholic University of the Sacred Heart Intramural Research D1.1 and by MIUR PRIN 20158EB2CM to AU. We gratefully acknowledge the funding “Ricerca Corrente” (2017–2018) and the “Conto Capitale” CC-2016-2365526 from the Italian Ministry of Health to IRCCS Fondazione Santa Lucia. LP acknowledges the 5 × 1000 from the Italian Ministry of Health to IRCCS Fondazione Santa Lucia (2017). FM acknowledges ITPA, ITPA Foundations, and EuPA for travel grant support.
The Supplementary Material for this article can be found online at: