

Front. Endocrinol., 17 October 2018

Metabolomics for Prediction of Relapse in Graves' Disease: Observational Pilot Study

Tristan Struja1*, Andreas Eckart1, Alexander Kutz1, Andreas Huber2, Peter Neyer2, Marius Kraenzlin3, Beat Mueller1,4, Christian Meier3,4, Luca Bernasconi2 and Philipp Schuetz1,4
  • 1Division of Endocrinology, Diabetes and Metabolism, Medical University Department, Kantonsspital Aarau, Aarau, Switzerland
  • 2Department of Laboratory Medicine, Kantonsspital Aarau, Aarau, Switzerland
  • 3Endonet, Basel, Switzerland
  • 4Medical Faculty, University of Basel, Basel, Switzerland

Background: There is a lack of biochemical markers for early prediction of relapse in patients with Graves' disease [GD], which may help to direct treatment decisions. We assessed the prognostic ability of a high-throughput proton NMR metabolomic profile to predict relapse in a well characterized cohort of GD patients.

Methods: Observational study investigating patients presenting with GD at a Swiss hospital endocrine referral center and an associated endocrine outpatient clinic. We measured 227 metabolic markers in the blood of patients before treatment initiation. Main outcome was relapse of hyperthyroidism within 18 months of stopping anti-thyroid drugs. We used ROC analysis with AUC to assess discrimination.

Results: Of 69 included patients, 18 (26%) had a relapse of disease. The clinical GREAT score had an AUC of 0.68 (95% CI 0.63–0.70) to predict relapse. When looking at the metabolomic markers, univariate analysis revealed pyruvate and triglycerides in medium VLDL as predictors with AUCs of 0.73 (95% CI 0.58–0.84) and 0.67 (95% CI 0.53–0.80), respectively. All other metabolomic markers had lower AUCs.

Conclusion: Overall, metabolomic markers in our pilot study had low to moderate prognostic potential for prediction of relapse of GD, with pyruvate and triglycerides being candidates with acceptable discriminatory abilities. Our data need validation in future larger trials.


Graves' disease [GD] is among the leading causes of hyperthyroidism, affecting approximately 0.5% of the general population, especially younger women. It is caused by autoantibodies to the thyrotropin [TSH] receptor [TRAb], which lead to unregulated production and secretion of thyroid hormones (1).

Although treatment with thyroidectomy or radioactive iodine ablation [RAI] provides good cure rates for hyperthyroidism, both are definitive ablative procedures that render patients dependent on lifelong therapy with levothyroxine [T4] (2). Anti-thyroid drugs [ATD], on the other hand, offer a chance of cure, albeit at the cost of a very high relapse rate of approximately 40–60% (1). A more personalized approach would identify, before treatment initiation, those patients who would benefit most from ATD therapy. Various approaches have been studied in the past, such as genome-wide association studies, thyroidal blood flow assessed by sonography, numerous TRAb assays, and combinations of biochemical and epidemiological markers (3, 4). So far, none has provided enough predictive power to be widely adopted into clinical practice.

Recently, extensive mapping of the phenotypic metabolic state of an individual (i.e., the metabolome) has become feasible through advances in spectrometric techniques. Some studies have already mapped the metabolic differences of hyperthyroid GD states compared to euthyroidism (5, 6).

In other areas, the predictive qualities of metabolomics have already been assessed. One report showed that inclusion of lysophosphatidylcholine (20:4) as a marker improved recurrence risk prediction of strokes by 6% (7), whereas another report found elevated levels of decanoylcarnitine and octanoylcarnitine to be associated with a higher stroke recurrence risk (hazard ratios 3.8 and 5.5, respectively) (8). Such findings have also been observed in a rat model of ANCA positive vasculitis (immunized to human myeloperoxidase), where urinary di-methyl-glycine and trimethylamine N-oxide levels at day 56 post immunization increased relapse prediction accuracy from 90.5 to 95.2% (9). Furthermore, a Japanese group measured plasma free amino acids in patients with ulcerative colitis. They observed that lower levels of histidine were associated with an increased risk of relapse within one year (10).

We hypothesized that distinct metabolic patterns might predict outcome of ATD therapy with regard to relapse. To our knowledge, this study is the first to assess whether metabolomic differences can be used to predict relapse of hyperthyroidism after a course of ATD.


From a previous observational cohort study (11), we had roughly 320 leftover serum aliquots at our disposal. Patients were included at an endocrine outpatient clinic and one hospital-based referral center in Switzerland. Patients were treated with ATD in a titration regimen (usually carbimazole or propylthiouracil for 12–18 months). Inclusion criteria were a first episode of GD defined as suppressed TSH (<0.01 mU/l), elevated free T4, and, if available, diffusely increased uptake in scintigraphy. Patients with a follow-up period shorter than 24 months after the start of ATD treatment were excluded. We also excluded patients with an ATD treatment duration <12 months, initial ablative therapy (i.e., surgery or radioactive iodine), or a time gap of more than one month between initiation of treatment and blood sample collection. The aforementioned aliquots were analyzed in the current study. After application of the inclusion and exclusion criteria, 69 patients were left for final analysis. We collected clinical data by medical chart review and, if necessary, complemented missing follow-up data by phone calls to patients and general practitioners. The study protocol was approved by the local ethics committee (Ethikkommission Nordwest- und Zentralschweiz (EKNZ) Project No. 2015/227) and was conducted according to the principles of the Declaration of Helsinki. The need for informed consent was waived due to the retrospective nature of the analysis with no impact on health outcome.

After blood withdrawal, samples were directly centrifuged and analyzed for serum TSH, fT4, anti-thyroperoxidase antibodies [anti-TPO-Ab], and TRAb levels by standard commercially available laboratory kits (assays used are listed in Supplementary Table 1). Leftover serum aliquots were stored at −24° Celsius, and mean storage time was 46 months (median 46 months; interquartile range 17 to 70 months). Two hundred and twenty-seven metabolic biomarkers were quantified from serum using high-throughput proton NMR metabolomics (Nightingale Health Ltd., Helsinki, Finland) (12, 13) (biomarkers assessed are listed in Supplementary Table 2). This technique provides highly reproducible results and good analytical accuracy, within the limitations regarding sensitivity and resolution that apply to any NMR-based method (14, 15). Aliquots were shipped on dry ice by a professional courier service, and the temperature inside the box was monitored continuously.

To validate our storage conditions and the quality of our samples, Nightingale compared our data with their reference data (see Supplementary Figure 1). Reference values were derived from several studies in Scandinavian and UK cohorts with adjacent biobanks [e.g., most recent publication with references to previous works (16)], mainly from the Finnish National Institute of Health and Welfare Biobanks (THL) (17).

Prior to statistical analysis, data were cube root transformed, normalized by the median of each sample, and Pareto scaled to approximate a normal distribution.
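The three preprocessing steps named above can be illustrated with a minimal sketch. It assumes the steps are applied in the order listed in the text and that Pareto scaling means mean-centering each marker and dividing by the square root of its standard deviation; the function names `cube_root` and `preprocess` and the toy input are hypothetical, and the exact implementation in the software used for the study may differ.

```python
import math
import statistics

def cube_root(value):
    # Sign-preserving cube root (concentrations are normally non-negative).
    return math.copysign(abs(value) ** (1.0 / 3.0), value)

def preprocess(samples):
    """samples: list of rows, one row of marker concentrations per patient."""
    # Step 1: cube root transform every value.
    transformed = [[cube_root(v) for v in row] for row in samples]
    # Step 2: divide each sample (row) by its own median (assumed non-zero).
    normalized = []
    for row in transformed:
        median = statistics.median(row)
        normalized.append([v / median for v in row])
    # Step 3: Pareto scale each marker (column):
    # mean-center, then divide by the square root of the standard deviation.
    n_markers = len(normalized[0])
    scaled = [[0.0] * n_markers for _ in normalized]
    for j in range(n_markers):
        column = [row[j] for row in normalized]
        mean = statistics.mean(column)
        sd = statistics.stdev(column)
        divisor = math.sqrt(sd) if sd > 0 else 1.0  # constant column: center only
        for i, v in enumerate(column):
            scaled[i][j] = (v - mean) / divisor
    return scaled

# Toy data: 3 patients x 3 markers (illustrative values, not study data).
demo = [[1.0, 8.0, 27.0], [8.0, 27.0, 64.0], [27.0, 64.0, 125.0]]
result = preprocess(demo)
```

Dividing by the square root of the standard deviation, rather than the standard deviation itself, shrinks large-variance markers less aggressively than autoscaling, which is why Pareto scaling is often chosen when markers span very different concentration ranges.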

The primary outcome of this study was prediction of relapse in GD at ATD treatment initiation. Relapse had to be established by suppressed TSH and elevated peripheral hormones. First, we fitted univariate ROC models for every metabolomic marker. Second, multivariate ROC models were fitted by Monte-Carlo cross-validation using balanced sub-sampling. We used partial least squares discriminant analysis [PLS-DA] as the classification and feature ranking method. Each cross-validation used two thirds of the samples to gauge feature importance. The most important features were then used to build classification models, which were evaluated on the remaining third of the samples (18). To account for multiple testing, correction with the Benjamini–Hochberg false discovery rate was applied. The significance level was set at α = 0.05. Statistical analysis was conducted using MetaboAnalyst software version 4.0 (19, 20) and Stata software version 12.1 (Stata Corp., College Station, TX, USA).
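Because 227 markers are tested one by one, the Benjamini–Hochberg correction mentioned above matters for interpreting the univariate results. The following is a minimal sketch of the standard step-up procedure; the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # stay monotone (a smaller raw p never gets a larger adjusted p).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted

# Illustrative p-values for four hypothetical markers.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# A marker counts as significant if its adjusted p-value is below 0.05.
significant = [q < 0.05 for q in adj]
```

With only four tests the adjustment is mild; with 227 markers, a raw p-value has to be considerably smaller to survive the correction.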


Table 1 shows details of the patient population stratified by relapse, the primary endpoint. Previously, we published a validation study of the GREAT score, a combination of epidemiological (i.e., age, goiter size) and standard laboratory variables (i.e., fT4, TRAb) to predict relapse (11). It showed an AUC of 0.68 (95% CI 0.63–0.70) to predict relapse.


Table 1. Baseline characteristics according to relapse status.

Comparison of our data with the reference data set revealed relevant differences for omega-3, glutamine, pyruvate, citrate and acetate. Minor differences were observed for phosphoglycerides, phosphatidylcholines, total cholines, unsaturated fatty acids, and VLDL and LDL diameter, but not in HDL diameter (see Supplementary Figure 1).

Univariate analysis only revealed pyruvate and triglycerides in medium VLDL [MVLDLTG] as significant predictors of GD relapse with AUCs of 0.73 (95% CI 0.58–0.84) and 0.67 (95% CI 0.53–0.80), respectively.
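To make the reported AUCs and their confidence intervals more concrete, the sketch below computes an AUC as the fraction of relapse/non-relapse pairs in which the relapse patient has the higher marker value, and derives a percentile bootstrap interval. The text does not state how the intervals were obtained, so the bootstrap is an assumption, and the function names and marker values are hypothetical.

```python
import random

def auc(scores_pos, scores_neg):
    """Probability that a relapse case scores higher than a non-relapse case.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def bootstrap_ci(scores_pos, scores_neg, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling within each group."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        pos = [rng.choice(scores_pos) for _ in scores_pos]
        neg = [rng.choice(scores_neg) for _ in scores_neg]
        estimates.append(auc(pos, neg))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Illustrative marker values (not study data).
relapse = [0.9, 0.8, 0.4, 0.7]
no_relapse = [0.1, 0.5, 0.3, 0.2, 0.6]
point = auc(relapse, no_relapse)
ci_low, ci_high = bootstrap_ci(relapse, no_relapse)
```

With few relapse cases, as in a pilot cohort, such intervals are wide, which is consistent with the broad confidence intervals reported above.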

Inclusion of multiple variables by multivariate ROC analysis did not yield higher AUCs. Figure 1 provides an overview of the top six models generated. AUCs ranged from 0.53 (95% CI 0.33–0.66; 100 variables) to 0.57 (95% CI 0.36–0.80; 5 variables), each not being statistically significant. Inclusion of more variables into a model did not result in improved discriminatory power (see Figure 2). Figure 3 displays the frequency of a variable being selected by PLS-DA.
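The resampling scheme behind these multivariate models (Monte-Carlo cross-validation with balanced sub-sampling, two thirds for feature ranking and one third for evaluation) can be sketched as follows. This shows only the splitting logic under one plausible reading of "balanced", namely drawing equal numbers of relapse and non-relapse patients in every iteration; the function name and parameters are hypothetical, and the software used in the study may implement the step differently.

```python
import random

def balanced_mccv_splits(labels, n_iter=50, train_fraction=2 / 3, seed=0):
    """Yield (train_idx, test_idx) pairs with equal class sizes in each part.

    labels: list of 0/1 values, 1 = relapse, 0 = no relapse.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n_minor = min(len(pos), len(neg))
    n_train = round(n_minor * train_fraction)
    for _ in range(n_iter):
        # Balanced sub-sampling: draw the same number of patients per class.
        pos_draw = rng.sample(pos, n_minor)
        neg_draw = rng.sample(neg, n_minor)
        # Two thirds of each class rank features, one third tests the model.
        train = pos_draw[:n_train] + neg_draw[:n_train]
        test = pos_draw[n_train:] + neg_draw[n_train:]
        yield train, test

# Class sizes as in the cohort: 18 relapses among 69 patients.
labels = [1] * 18 + [0] * 51
splits = list(balanced_mccv_splits(labels, n_iter=5))
```

Under this reading, each iteration uses only 36 of the 69 patients (24 for ranking, 12 for testing), which helps explain why the resulting AUC estimates carry wide confidence intervals.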


Figure 1. Top 6 ROC models generated by PLS-DA with increasing number of variables. AUC, area under the curve; CI, 95% confidence intervals; PLS-DA, partial least squares-discriminant analysis; Var, number of variables included into model.


Figure 2. Predictive accuracies of the models with increasing number of features included.


Figure 3. Frequency of a variable being selected by PLS-DA. MVLDLTG, triglycerides in medium VLDL; XLHDLTG, triglycerides in very large HDL; XLHDLC, total cholesterol in very large HDL; LVLDLTG, triglycerides in large VLDL; Pyr, pyruvate; XLHDLFC, free cholesterol in very large HDL; XLVLDLTG, triglycerides in chylomicrons and extremely large VLDL; XLHDLPL, phospholipids in very large HDL; SVLDLTG, triglycerides in small VLDL; MVLDLPL, phospholipids in medium VLDL; XLVLDLPL, phospholipids in chylomicrons and extremely large VLDL; LHDLPL, phospholipids in large HDL; LVLDLC, total cholesterol in large VLDL; MVLDLFC, free cholesterol in medium VLDL; XLVLDLCE, cholesterol esters in chylomicrons and extremely large VLDL.

Because none of the models reached statistical significance, even though PLS-DA tends to overfit data, we abstained from validating the model in a subset of our data.


Based on this observational, secondary analysis of blood samples, we were not able to find any metabolomic markers that could predict relapse outcome before ATD treatment initiation with high accuracy. To the best of our knowledge, we are the first to apply the principle of metabolomic phenotyping on relapse prediction in GD.

Although we measured roughly 300 samples, we decided to generate a homogeneous cohort by applying stringent inclusion and exclusion criteria, which led to many exclusions. We did loosen our exclusion criteria post hoc to include more patients, but this did not change the results.

Our multivariate models showed no meaningful discriminatory ability, as reflected by AUCs around 0.55. Inclusion of more variables into a ROC model usually leads to better predictive capacity at the cost of decreasing practicability (20). In our case, the AUC tended to decrease with a growing number of variables in a model. We assume this is a chance finding, as all median values are very close to each other and the CIs overlap.

There are already some reports investigating the metabolomic phenotype of hyperthyroid GD patients (5, 6, 21, 22). Not surprisingly, distinct differences in metabolic pathways between the euthyroid and hyperthyroid state were detected. Besides histamine and nitrogen pathways, amino acid pathways were mainly involved. For instance, Piras et al. reported the changes from the hyperthyroid to the euthyroid state in 15 patients with GD compared to 26 healthy controls (22). They found that GD patients after treatment had significantly lower levels of creatinine, formate, glycerol, histamine, methylamine, and methylsuccinate in plasma as compared to the healthy controls.

Al-Majdoub and colleagues reported changes in the carnitine metabolism of 30 GD patients before treatment compared to 12 months after institution of euthyroidism (5). They observed an increase in short-chain acylcarnitines, whereas medium-chain acylcarnitines were decreased and long-chain acylcarnitines were unchanged after treatment. In general, lysophosphatidylcholines and sphingomyelins were increased in their study. The authors speculated that these changes reflect a starvation like process that was induced by hyperthyroidism.

In 2016, researchers from Singapore published their data on 24 female GD patients transitioning from hyperthyroidism to euthyroidism. In contrast to the previous report, they found a fall in medium- and long-chain acylcarnitines, whereas they observed rises in total cholesterol, LDL, and HDL. The authors postulate that the changes in cholesterol metabolism might be due to increased clearance in hyperthyroidism, whereas the changes in acylcarnitines are based on T3-induced increases in mitochondrial biogenesis and enhanced tricarboxylic acid cycle activity. They also found no changes in branched-chain amino acid concentrations (i.e., valine, isoleucine, and leucine). On the other hand, levels of phenylalanine and tyrosine were elevated, which might have been due to the increased demand for these amino acids in the synthesis of thyroid hormones (6).

So far, no investigation has looked at the relation of metabolomic profiles to relapse rates. We therefore studied all markers for their potential to predict relapse, which carries a risk of chance findings. Our data are thus rather hypothesis-generating and need to be validated in future studies. Compared to the previous studies, our laboratory assay put more emphasis on lipid pathways than on carnitine and amino acid metabolism, which might explain our negative findings (see Supplementary Figure 1 and Supplementary Table 2); alternatively, they could be due to the limitations of our study design. As our focus was prediction of relapse and we had only blood samples from before the start of treatment, we did not investigate metabolic changes during the transition from hyperthyroidism to euthyroidism.

Our study has three major limitations. First, blood samples were not drawn in a fasting state but at random times. Second, mean storage time of samples was 46 months under suboptimal conditions (i.e., −24°C instead of −80°C) (23), although other groups reported significant results after storage at −24°C (5). On the one hand, suboptimal storage conditions lead to low levels of glutamine, phenylalanine, pyruvate, and acetate. On the other hand, these metabolites are very scarce, and at least for lipid components a large coefficient of variation has been reported even under optimal conditions (24). Moreover, glycerol, lactate, and creatinine, which have been shown to be altered significantly by freeze-thaw cycles in rats (25), did not show obvious deviations in our cohort.

Furthermore, we did not observe large deviations from the reference sample in fatty acids, especially polyunsaturated fatty acids, which would be other indicators of suboptimal storage and handling (24). Additionally, metabolites such as glucose and lactate are known to be susceptible to preanalytical errors and have been proposed as a screening tool to assess preanalytical care (26). In our batch, these metabolites did not differ substantially from the manufacturer's reference sample.

Third, our samples underwent two freeze-thaw cycles during preparation before analysis. Ideally, samples would be analyzed immediately after the blood draw, but this is rarely feasible in routine practice. Furthermore, one report demonstrated that up to four freeze-thaw cycles altered samples only slightly (23).


Overall, metabolomic markers in our pilot study had only low to moderate prognostic potential for prediction of relapse of GD, with pyruvate and triglycerides being the most promising candidates with acceptable discriminatory ability. Our data need validation in future larger trials.

Author Contributions

TS analyzed data and wrote the first draft of the manuscript with primary responsibility for the final content. All authors read and approved the final manuscript.


This study was supported in part by the Swiss National Science Foundation (SNSF Professorship, PP00P3_150531/1) and the Research Council of the Kantonsspital Aarau (1410.000.044).

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at:


1. Brent GA. Clinical practice. Graves' disease. N Engl J Med. (2008) 358:2594–605. doi: 10.1056/NEJMcp0801880

2. Ross DS, Burch HB, Cooper DS, Greenlee MC, Laurberg P, Maia AL, et al. 2016 American thyroid association guidelines for diagnosis and management of hyperthyroidism and other causes of thyrotoxicosis. Thyroid (2016) 26:1343–421. doi: 10.1089/thy.2016.0229

3. Struja T, Fehlberg H, Kutz A, Guebelin L, Degen C, Mueller B, et al. Can we predict relapse in Graves' disease? Results from a systematic review and meta-analysis. Eur J Endocrinol. (2017) 176:87–97. doi: 10.1530/EJE-16-0725

4. Struja T, Kutz A, Fischli S, Meier C, Mueller B, Recher M, et al. Is Graves' disease a primary immunodeficiency? New immunological perspectives on an endocrine disease. BMC Med. (2017) 15:174. doi: 10.1186/s12916-017-0939-9

5. Al-Majdoub M, Lantz M, Spegel P. Treatment of Swedish patients with Graves' hyperthyroidism is associated with changes in acylcarnitine levels. Thyroid (2017) 27:1109–17. doi: 10.1089/thy.2017.0218

6. Chng CL, Lim AY, Tan HC, Kovalik JP, Tham KW, Bee YM, et al. Physiological and metabolic changes during the transition from hyperthyroidism to Euthyroidism in Graves' disease. Thyroid (2016) 26:1422–30. doi: 10.1089/thy.2015.0602

7. Jove M, Mauri-Capdevila G, Suarez I, Cambray S, Sanahuja J, Quilez A, et al. Metabolomics predicts stroke recurrence after transient ischemic attack. Neurology (2015) 84:36–45. doi: 10.1212/WNL.0000000000001093

8. Seo WK, Jo G, Shin MJ, Oh K. Medium-chain acylcarnitines are associated with cardioembolic stroke and stroke recurrence: a metabolomics study. Arterioscler Thromb Vasc Biol. (2018) 38:2245–53. doi: 10.1161/ATVBAHA.118.311373

9. Al-Ani B, Fitzpatrick M, Al-Nuaimi H, Coughlan AM, Hickey FB, Pusey CD, et al. Changes in urinary metabolomic profile during relapsing renal vasculitis. Sci Rep. (2016) 6:38074. doi: 10.1038/srep38074

10. Hisamatsu T, Ono N, Imaizumi A, Mori M, Suzuki H, Uo M, et al. Decreased plasma histidine level predicts risk of relapse in patients with ulcerative colitis in remission. PLoS ONE (2015) 10:e0140716. doi: 10.1371/journal.pone.0140716

11. Struja T, Kaeslin M, Boesiger F, Jutzi R, Imahorn N, Kutz A, et al. External validation of the GREAT score to predict relapse risk in Graves' disease: results from a multicenter, retrospective study with 741 patients. Eur J Endocrinol. (2017) 176:413–9. doi: 10.1530/EJE-16-0986

12. Soininen P, Kangas AJ, Wurtz P, Suna T, Ala-Korpela M. Quantitative serum nuclear magnetic resonance metabolomics in cardiovascular epidemiology and genetics. Circ Cardiovasc Genet. (2015) 8:192–206. doi: 10.1161/CIRCGENETICS.114.000216

13. Soininen P, Kangas AJ, Wurtz P, Tukiainen T, Tynkkynen T, Laatikainen R, et al. High-throughput serum NMR metabonomics for cost-effective holistic studies on systemic metabolism. Analyst (2009) 134:1781–5. doi: 10.1039/b910205a

14. Nagana Gowda GA, Raftery D. Can NMR solve some significant challenges in metabolomics? J Magn Reson (2015) 260:144–60. doi: 10.1016/j.jmr.2015.07.014

15. Gathungu RM, Kautz R, Kristal BS, Bird SS, Vouros P. The integration of LC-MS and NMR for the analysis of low molecular weight trace analytes in complex matrices. Mass Spectrom Rev. (2018). doi: 10.1002/mas.21575. [Epub ahead of print].

16. Wang Q, Wurtz P, Auro K, Makinen VP, Kangas AJ, Soininen P, et al. Metabolic profiling of pregnancy: cross-sectional and longitudinal evidence. BMC Med. (2016) 14:205. doi: 10.1186/s12916-016-0733-0

17. Puska P, Stahl T. Health in all policies-the Finnish initiative: background, principles, current issues. Annu Rev Public Health (2010) 31:315–28 3 p following 328. doi: 10.1146/annurev.publhealth.012809.103658

18. Xia J, Wishart DS. Using metaboanalyst 3.0 for comprehensive metabolomics data analysis. Curr Protoc Bioinformat. (2016) 55:14.10.1–14.10.91. doi: 10.1002/cpbi.11

19. Xia J, Wishart DS. Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst. Nat Protoc. (2011) 6:743–60. doi: 10.1038/nprot.2011.319

20. Xia J, Broadhurst DI, Wilson M, Wishart DS. Translational biomarker discovery in clinical metabolomics: an introductory tutorial. Metabolomics (2013) 9:280–99.

21. Wojtowicz W, Zabek A, Deja S, Dawiskiba T, Pawelka D, Glod M, et al. Serum and urine (1)H NMR-based metabolomics in the diagnosis of selected thyroid diseases. Sci Rep. (2017) 7:9108.

22. Piras C, Arisci N, Poddighe S, Liggi S, Mariotti S, Atzori L. Metabolomic profile in hyperthyroid patients before and after antithyroid drug treatment: correlation with thyroid hormone and TSH concentration. Int J Biochem Cell Biol. (2017) 93:119–28. doi: 10.1016/j.biocel.2017.07.024

23. Yin P, Peter A, Franken H, Zhao X, Neukamm SS, Rosenbaum L, et al. Preanalytical aspects and sample quality assessment in metabolomics studies of human blood. Clin Chem. (2013) 59:833–45. doi: 10.1373/clinchem.2012.199257

24. Zivkovic AM, Wiest MM, Nguyen UT, Davis R, Watkins SM, German JB. Effects of sample handling and storage on quantitative lipid analysis in human serum. Metabolomics (2009) 5:507–16. doi: 10.1007/s11306-009-0174-2

25. Torell F, Bennett K, Rannar S, Lundstedt-Enkel K, Lundstedt T, Trygg J. The effects of thawing on the plasma metabolome: evaluating differences between thawed plasma and multi-organ samples. Metabolomics (2017) 13:66. doi: 10.1007/s11306-017-1196-9

26. Jobard E, Tredan O, Postoly D, Andre F, Martin AL, Elena-Herrmann B, Boyault S. A Systematic evaluation of blood serum and plasma pre-analytics for metabolomics cohort studies. Int J Mol Sci. (2016) 17:2035. doi: 10.3390/ijms17122035

Keywords: Graves basedow disease, metabolomics, relapse activity, predicable results, retrospective analysis

Citation: Struja T, Eckart A, Kutz A, Huber A, Neyer P, Kraenzlin M, Mueller B, Meier C, Bernasconi L and Schuetz P (2018) Metabolomics for Prediction of Relapse in Graves' Disease: Observational Pilot Study. Front. Endocrinol. 9:623. doi: 10.3389/fendo.2018.00623

Received: 28 July 2018; Accepted: 01 October 2018;
Published: 17 October 2018.

Edited by:

Joanna Klubo-Gwiezdzinska, National Institutes of Health (NIH), United States

Reviewed by:

Onyebuchi Okosieme, Cwm Taf University Health Board, United Kingdom
Miloš Žarković, Faculty of Medicine, University of Belgrade, Serbia

Copyright © 2018 Struja, Eckart, Kutz, Huber, Neyer, Kraenzlin, Mueller, Meier, Bernasconi and Schuetz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Tristan Struja,

These authors have contributed equally to this work