<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="review-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Genet.</journal-id>
<journal-title>Frontiers in Genetics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Genet.</abbrev-journal-title>
<issn pub-type="epub">1664-8021</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fgene.2021.667358</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Genetics</subject>
<subj-group>
<subject>Mini Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Improving Genomic Prediction Using High-Dimensional Secondary Phenotypes</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Arouisse</surname> <given-names>Bader</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1237036/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Theeuwen</surname> <given-names>Tom P. J. M.</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1233855/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>van Eeuwijk</surname> <given-names>Fred A.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/755162/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Kruijer</surname> <given-names>Willem</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1124995/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Biometris, Wageningen University and Research</institution>, <addr-line>Wageningen</addr-line>, <country>Netherlands</country></aff>
<aff id="aff2"><sup>2</sup><institution>Laboratory of Genetics, Wageningen University and Research</institution>, <addr-line>Wageningen</addr-line>, <country>Netherlands</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Diego Jarquin, University of Nebraska-Lincoln, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Roberto Fritsche-Neto, International Rice Research Institute (IRRI), Philippines; Paulino P&#x000E9;rez-Rodr&#x000ED;guez, Colegio de Postgraduados (COLPOS), Mexico</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Willem Kruijer <email>willem.kruijer&#x00040;wur.nl</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Statistical Genetics and Methodology, a section of the journal Frontiers in Genetics</p></fn></author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>05</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>12</volume>
<elocation-id>667358</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>04</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>04</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Arouisse, Theeuwen, van Eeuwijk and Kruijer.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Arouisse, Theeuwen, van Eeuwijk and Kruijer</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>In the past decades, genomic prediction has had a large impact on plant breeding. Given the current advances in high-throughput phenotyping and sequencing technologies, it is increasingly common to observe a large number of traits, in addition to the target trait of interest. This raises the important question of whether these additional or &#x0201C;secondary&#x0201D; traits can be used to improve genomic prediction for the target trait. With only a small number of secondary traits, this is known to be the case, given sufficiently high heritabilities and genetic correlations. Here we focus on the more challenging situation with a large number of secondary traits, which is increasingly common since the arrival of high-throughput phenotyping. In this case, secondary traits are usually incorporated through additional relatedness matrices. This approach is, however, infeasible when secondary traits are not measured on the test set, and it cannot distinguish between genetic and non-genetic correlations. An alternative direction is to extend the classical selection indices using penalized regression. So far, penalized selection indices have not been applied in a genomic prediction setting, and they require plot-level data in order to reliably estimate genetic correlations. Here we aim to overcome these limitations, using two novel approaches. Our first approach relies on a dimension reduction of the secondary traits, using either penalized regression or random forests (LS-BLUP/RF-BLUP). We then compute the bivariate GBLUP with the dimension reduction as secondary trait. For simulated data (with available plot-level data), we also use bivariate GBLUP with the penalized selection index as secondary trait (SI-BLUP). In our second approach (GM-BLUP), we follow existing multi-kernel methods but replace secondary traits by their genomic predictions, with the advantage that genomic prediction is also possible when secondary traits are only measured on the training set. 
For most of our simulated data, SI-BLUP was most accurate, often closely followed by RF-BLUP or LS-BLUP. In real datasets, involving metabolites in Arabidopsis and transcriptomics in maize, no method could substantially improve over univariate prediction when secondary traits were only available on the training set. LS-BLUP and RF-BLUP were most accurate when secondary traits were available also for the test set.</p></abstract>
<kwd-group>
<kwd>GBLUP</kwd>
<kwd>genomic prediction</kwd>
<kwd>secondary traits</kwd>
<kwd>selection indices</kwd>
<kwd>penalized regression</kwd>
<kwd>random forest</kwd>
</kwd-group>
<counts>
<fig-count count="3"/>
<table-count count="2"/>
<equation-count count="16"/>
<ref-count count="37"/>
<page-count count="12"/>
<word-count count="8758"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Genomic prediction is increasingly applied as a standard tool in many animal and plant breeding programs. Since it was first introduced by Meuwissen et al. (<xref ref-type="bibr" rid="B19">2001</xref>), the main objective of genomic prediction has been to estimate the breeding values of unphenotyped (test) genotypes from molecular markers alone, using a training population for which both phenotypic and genotypic data are available. Applications of genomic prediction facilitate the rapid selection of superior genotypes (genomic selection) and accelerate genetic progress in crop breeding.</p>
<p>At the same time, advances in high-throughput phenotyping and cell biology technologies provide increasing amounts of phenotypic data, in addition to the &#x0201C;primary&#x0201D; or &#x0201C;target&#x0201D; traits of interest, such as yield or disease resistance. Such additional traits are typically high-dimensional, and collected using various types of technology, e.g., remote-sensing (Araus et al., <xref ref-type="bibr" rid="B1">2018</xref>), machine vision (Yang et al., <xref ref-type="bibr" rid="B35">2020</xref>), and automation technology (Sun et al., <xref ref-type="bibr" rid="B27">2019</xref>). Common situations are that secondary traits are measured either (1) in the field, on the same plants as the target trait but much earlier in the growing season, or (2) on entirely different plants, in controlled environments on phenotyping platforms. In both cases, the secondary traits are either observed only for the training set of genotypes, or also for the test set. In all cases, however, the question is whether some of the secondary traits are associated with the target traits of interest, and whether these correlations are genetic. In a genomic prediction context, the question becomes when and how secondary traits can improve prediction for the target trait. This is well understood if there is only one secondary trait: accuracy for the target trait then improves when the heritability of the target trait is lower than the heritability of the secondary trait times the squared genetic correlation (Schulthess et al., <xref ref-type="bibr" rid="B25">2016</xref>; Velazco et al., <xref ref-type="bibr" rid="B32">2019</xref>). Here we focus on the more challenging situation with a large number of secondary traits, which is increasingly common since the arrival of high-throughput phenotyping.</p>
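<p>The single-trait criterion stated above can be illustrated with a minimal sketch. The function name and the numerical values below are hypothetical and chosen only for illustration; they are not taken from any of the cited studies.</p>
<preformat>
```python
def secondary_trait_helps(h2_target, h2_secondary, rho_g):
    """Check the single-secondary-trait criterion.

    Returns True when the heritability of the target trait is lower than
    the heritability of the secondary trait times the squared genetic
    correlation between the two traits.
    """
    return h2_secondary * rho_g ** 2 > h2_target


# Illustrative values only: 0.8 * 0.7**2 is about 0.392.
print(secondary_trait_helps(0.3, 0.8, 0.7))  # target heritability 0.3 is below 0.392
print(secondary_trait_helps(0.5, 0.8, 0.7))  # target heritability 0.5 is above 0.392
```
</preformat>
<p>With these assumed values, the same secondary trait is expected to help for a target trait with heritability 0.3 but not for one with heritability 0.5.</p>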
<p>The two main approaches to incorporate high-dimensional secondary traits in genomic prediction are the use of multiple relatedness matrices, and penalized selection indices. In the former approach, the target trait is modeled as the sum of genetic effects and effects from secondary traits. Both types of effects are random, and the relative importance of these contributions is estimated either using REML-estimates for variance components or cross-validation. Predictions for the test set are the sum of the BLUPs for the different effects. Examples of this approach include Fu et al. (<xref ref-type="bibr" rid="B10">2012</xref>), who obtained a high level of accuracy for predicting hybrid yield performance using gene expression data from the hybrid parents. Similarly, Riedelsheimer et al. (<xref ref-type="bibr" rid="B22">2012</xref>) reported moderate to high accuracies for yield-related traits using 120 metabolites in maize. Schrag et al. (<xref ref-type="bibr" rid="B24">2018</xref>) and Xiang et al. (<xref ref-type="bibr" rid="B33">2019</xref>) used different relatedness matrices corresponding to different types of -omics data. Two major limitations of multiple random-effects models are that (1) they cannot be used when secondary traits are only available on the training set; (2) they cannot distinguish between genetic and residual correlations among the target and secondary traits.</p>
<p>The second approach was recently proposed by Lopez-Cruz et al. (<xref ref-type="bibr" rid="B17">2020</xref>), who extended classical selection indices by imposing a LASSO or ridge penalty on the coefficients. This achieves a dimension reduction, replacing the secondary traits by a single selection index <italic>S</italic>, which is a linear combination of the original traits. The coefficients are chosen to maximize <inline-formula><mml:math id="M1"><mml:msup><mml:mrow><mml:mi>h</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>Y</mml:mi><mml:mo>,</mml:mo><mml:mi>S</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, i.e., the heritability of <italic>S</italic> times the squared genetic correlation between <italic>S</italic> and the target trait (<italic>Y</italic>). Lopez-Cruz et al. (<xref ref-type="bibr" rid="B17">2020</xref>) found that on new data, this quantity was indeed much higher than for the classical (unpenalized) selection index. Despite this promising result, penalized selection indices have not yet been applied in a genomic prediction context. One possible reason may be that accurate estimates of genetic correlations between <italic>Y</italic> and each of the secondary traits are required, for which the availability of plant/plot-level observations is assumed.</p>
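<p>To make the quantity maximized by the selection index concrete, the following sketch evaluates the heritability of <italic>S</italic> times the squared genetic correlation between <italic>S</italic> and <italic>Y</italic> for a given coefficient vector. It is an illustration under stated assumptions, not the implementation of Lopez-Cruz et al.: the function and argument names are hypothetical, the covariances are made-up values, and the phenotypic variance of <italic>S</italic> is assumed to be the sum of its genetic and residual variance.</p>
<preformat>
```python
def index_merit(gamma, g_fs, g_ss, e_ss, g_ff):
    """Heritability of S times squared genetic correlation of S with Y.

    S is the linear index sum_j gamma[j] * Y_j over the secondary traits.
    g_fs : genetic covariances between the target and each secondary trait
    g_ss : genetic covariance matrix of the secondary traits
    e_ss : residual covariance matrix of the secondary traits
    g_ff : genetic variance of the target trait
    """
    p = len(gamma)
    cov_g = sum(g_fs[j] * gamma[j] for j in range(p))
    var_g = sum(gamma[j] * g_ss[j][k] * gamma[k]
                for j in range(p) for k in range(p))
    var_e = sum(gamma[j] * e_ss[j][k] * gamma[k]
                for j in range(p) for k in range(p))
    h2_index = var_g / (var_g + var_e)       # heritability of S (assumed form)
    rho2 = cov_g ** 2 / (g_ff * var_g)       # squared genetic correlation
    return h2_index * rho2


# Hypothetical covariances for two secondary traits (illustration only).
g_ff = 1.0
g_fs = [0.6, 0.1]
g_ss = [[1.0, 0.2], [0.2, 1.0]]
e_ss = [[0.5, 0.0], [0.0, 2.0]]

merit_first = index_merit([1.0, 0.0], g_fs, g_ss, e_ss, g_ff)   # first trait only
merit_second = index_merit([0.0, 1.0], g_fs, g_ss, e_ss, g_ff)  # second trait only
print(merit_first, merit_second)
```
</preformat>
<p>In this toy setting the index built on the first secondary trait scores far higher than the one built on the second, because the first trait is both more heritable and more strongly genetically correlated with the target. The value does not change when the coefficients are rescaled.</p>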
<p>In the present paper, we propose two new approaches to deal with large numbers of secondary traits, and compare these to the approaches described above, using simulated and real data. First, we define genomic prediction using alternative dimension reductions (LS-BLUP/RF-BLUP), relying on penalized regression (or random forest regression) of the target on the secondary traits. We then compute the bivariate GBLUP with the dimension reduction as secondary trait. Second, we extend existing multi-kernel methods by replacing the secondary traits by their genomic predictions, the main advantage being that genomic prediction for the test set is always possible, also when secondary traits are only measured on the training set. For simulated data (with available plot-level data), we will also use bivariate GBLUP with the penalized selection index as secondary trait (SI-BLUP).</p>
</sec>
<sec sec-type="materials and methods" id="s2">
<title>2. Materials and Methods</title>
<sec>
<title>2.1. Distributional Assumptions</title>
<p>To a large extent we follow the notation of Runcie and Cheng (<xref ref-type="bibr" rid="B23">2019</xref>), assuming observations on traits <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub><italic>p</italic>&#x0002B;1</sub>, where each <italic>Y</italic><sub><italic>j</italic></sub> is a column vector. The first one (<italic>Y</italic><sub>1</sub> &#x0003D; <italic>Y</italic><sub><italic>f</italic></sub>) is the focal or target trait, for which genomic predictions are required; <italic>Y</italic><sub>2</sub>, &#x02026;, <italic>Y</italic><sub><italic>p</italic>&#x0002B;1</sub> are the secondary traits. <inline-formula><mml:math id="M2"><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the column vector containing all secondary traits; similarly, <inline-formula><mml:math id="M3"><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup><mml:mo>,</mml:mo><mml:mo>&#x02026;</mml:mo><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>t</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> is the column vector containing all traits. We have in total <italic>n</italic> &#x0003D; <italic>n</italic><sub><italic>t</italic></sub> &#x0002B; <italic>n</italic><sub><italic>o</italic></sub> genotypes, including <italic>n</italic><sub><italic>o</italic></sub> genotypes for which the target trait is observed (the training set), and <italic>n</italic><sub><italic>t</italic></sub> for which it is to be predicted (the <italic>t</italic> referring to the test set). We use the subscripts <italic>t</italic> and <italic>o</italic> to denote the subsets of values for the test and training sets, respectively; for example, <italic>Y</italic><sub><italic>o</italic></sub> and <italic>Y</italic><sub><italic>f, o</italic></sub>.</p>
<p>The secondary phenotypes are either observed only on the training set (the CV1-scenario, using the terminology of Runcie and Cheng, <xref ref-type="bibr" rid="B23">2019</xref>), or also for the test genotypes (CV2). Since our focus here is on variable selection and dimension reduction (rather than different cross-validation schemes), we will refer to these simply as scenarios 1 and 2, respectively. The <italic>n</italic> &#x000D7; <italic>n</italic> genetic relatedness matrix <italic>K</italic> is partitioned as:</p>
<disp-formula id="E1"><mml:math id="M4"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>K</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the <italic>n</italic><sub><italic>t</italic></sub> &#x000D7; <italic>n</italic><sub><italic>o</italic></sub> matrix <italic>K</italic><sub><italic>to</italic></sub> defines the relatedness between new (test) and observed (training) genotypes. We will also write <italic>K</italic><sub><italic>t</italic>&#x000B7;</sub> &#x0003D; [<italic>K</italic><sub><italic>tt</italic></sub> <italic>K</italic><sub><italic>to</italic></sub>] and <italic>K</italic><sub><italic>o</italic>&#x000B7;</sub> &#x0003D; [<italic>K</italic><sub><italic>ot</italic></sub> <italic>K</italic><sub><italic>oo</italic></sub>]. Similarly, we can decompose the genetic and residual covariance matrices &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup> as</p>
<disp-formula id="E2"><mml:math id="M5"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E3"><mml:math id="M6"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="true">(</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo stretchy="true">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the scalars <inline-formula><mml:math id="M7"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M8"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> are respectively the genetic and residual variance of the focal trait, and the matrices <inline-formula><mml:math id="M9"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M10"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> contain the genetic and residual (co)variances of the secondary traits. The row-vectors <inline-formula><mml:math id="M11"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M12"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> contain the genetic and residual covariance between the focal and the secondary traits.</p>
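<p>The block partition of <italic>K</italic> introduced above can be sketched as follows. This is only an illustration: it assumes that the rows and columns of <italic>K</italic> are ordered with the <italic>n</italic><sub><italic>t</italic></sub> test genotypes first, matching the block layout shown above, and the function name and the small matrix are hypothetical.</p>
<preformat>
```python
def partition_relatedness(K, n_t):
    """Split K (test genotypes first) into the blocks K_tt, K_to, K_ot, K_oo."""
    K_tt = [row[:n_t] for row in K[:n_t]]   # test x test
    K_to = [row[n_t:] for row in K[:n_t]]   # test x training
    K_ot = [row[:n_t] for row in K[n_t:]]   # training x test
    K_oo = [row[n_t:] for row in K[n_t:]]   # training x training
    return K_tt, K_to, K_ot, K_oo


# Hypothetical relatedness matrix: one test genotype, two training genotypes.
K = [[1.0, 0.5, 0.2],
     [0.5, 1.0, 0.3],
     [0.2, 0.3, 1.0]]

K_tt, K_to, K_ot, K_oo = partition_relatedness(K, n_t=1)
print(K_to)  # relatedness between the test genotype and the training genotypes
```
</preformat>
<p>The same slicing pattern applies to &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup>, with the focal trait taking the place of the first block.</p>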
<p>The joint distribution of <italic>Y</italic> &#x0003D; (<italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub><italic>p</italic>&#x0002B;1</sub>) is assumed to be</p>
<disp-formula id="E4"><label>(1)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mi>X</mml:mi><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>U</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mi>E</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x022EE;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x022EE;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" 
class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x022EE;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x022EE;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" 
class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where</p>
<disp-formula id="E5"><label>(2)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>U</mml:mi><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02297;</mml:mo><mml:mi>K</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mtext>&#x02003;</mml:mtext><mml:mi>E</mml:mi><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The genetic covariances (<inline-formula><mml:math id="M15"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>) quantify the degree of overlap among genetic signals, based on which multivariate methods can potentially improve genomic prediction. The residual covariances (<inline-formula><mml:math id="M16"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>) are important when traits are measured on the same individuals; if the traits are measured on different individuals (typically, in a different experiment), &#x003A3;<sup><italic>e</italic></sup> can be assumed to be diagonal. &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup> are usually unknown and need to be estimated from the data. For <italic>p</italic> larger than about 5&#x02013;10, this usually requires approximations. Below we describe several dimension reduction approaches, which reduce the dimensionality of the secondary phenotypes to 1, so that exact REML estimates of &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup> can be obtained with standard software.</p>
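<p>As a small worked illustration of the distributional assumptions in Equation (2), the sketch below builds the covariance matrix of the stacked trait vector, &#x003A3;<sup><italic>u</italic></sup> &#x02297; <italic>K</italic> &#x0002B; &#x003A3;<sup><italic>e</italic></sup> &#x02297; <italic>I</italic><sub><italic>n</italic></sub>, under the additional assumption that <italic>U</italic> and <italic>E</italic> are independent. All function names and numerical values are hypothetical; plain Python lists are used only to keep the example self-contained.</p>
<preformat>
```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(A, B):
    """Element-wise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def phenotypic_covariance(sigma_u, sigma_e, K):
    """Covariance of the stacked traits: sigma_u (x) K + sigma_e (x) I_n."""
    return add(kron(sigma_u, K), kron(sigma_e, identity(len(K))))


# Hypothetical example: two traits (focal and one secondary), two genotypes.
sigma_u = [[1.0, 0.5], [0.5, 2.0]]   # genetic (co)variances
sigma_e = [[1.0, 0.0], [0.0, 1.0]]   # diagonal: traits measured on different plants
K = [[1.0, 0.25], [0.25, 1.0]]       # relatedness between the two genotypes

V = phenotypic_covariance(sigma_u, sigma_e, K)
for row in V:
    print(row)
```
</preformat>
<p>In the resulting 4 &#x000D7; 4 matrix, the entry linking the focal trait of the first genotype to the secondary trait of the second genotype is the genetic covariance times the relatedness (0.5 &#x000D7; 0.25), with no residual contribution because &#x003A3;<sup><italic>e</italic></sup> is diagonal here.</p>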
</sec>
<sec>
<title>2.2. Genomic Prediction</title>
<p>The main objective is the prediction of the genetic effect <italic>U</italic><sub>1</sub> &#x0003D; <italic>U</italic><sub><italic>f</italic></sub>, i.e., the breeding values for the focal trait, in particular for the test set (<italic>U</italic><sub><italic>f, t</italic></sub>). In our simulations we assess prediction accuracy in terms of the Pearson correlation (<italic>r</italic>) between the simulated and predicted genetic effects on the test set. For real data, we consider the correlation between the predicted genetic effects and the trait values observed on the test set. Although it is well known that this is a biased estimator of the true accuracy (i.e., the correlation with the unknown genetic effect), the bias is likely to be constant among methods, as long as the target and secondary traits are observed on different plants (Runcie and Cheng, <xref ref-type="bibr" rid="B23">2019</xref>).</p>
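<p>As a small illustration of this accuracy measure, the following Python sketch computes the Pearson correlation between simulated and predicted genetic effects. The function name <italic>pearson_r</italic> and the five values per vector are hypothetical and serve only to show the calculation.</p>
<preformat preformat-type="code">
```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical simulated and predicted genetic effects for five test genotypes.
u_true = [0.8, -1.2, 0.3, 1.5, -0.6]
u_pred = [0.5, -0.9, 0.1, 1.1, -0.2]

accuracy = pearson_r(u_true, u_pred)
print(round(accuracy, 3))
```
</preformat>
<p>Because the correlation is invariant to shifts and positive rescaling, this measure rewards a correct ranking of genotypes rather than a correct scale of the predictions.</p>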
</sec>
<sec>
<title>2.3. Univariate GBLUP</title>
<p>The univariate GBLUP for <italic>U</italic><sub><italic>f, t</italic></sub> is defined by</p>
<disp-formula id="E6"><label>(3)</label><mml:math id="M17"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">uni</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mover 
accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">uni</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">uni</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msup><mml:mrow><mml:mover 
accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M18"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">uni</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> is the GBLUP for the training set, and REML-estimates of &#x003B2;<sub><italic>f</italic></sub> and the variance components <inline-formula><mml:math id="M19"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M20"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> are obtained from a univariate mixed model for <italic>Y</italic><sub><italic>f</italic></sub>. This is the best (univariate) linear unbiased predictor, at least given the true values of the variance components.</p>
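<p>A minimal numerical sketch of equation (3) in Python with NumPy is given below. The marker data, phenotypes, and variance components are hypothetical toy values; in practice the variance components would be REML estimates. Training genotypes are placed before test genotypes, and a small ridge is added to the kinship matrix to keep it invertible; both are assumptions of this sketch.</p>
<preformat preformat-type="code">
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: n_o = 6 training and n_t = 3 test genotypes, 20 markers.
n_o, n_t, m = 6, 3, 20
Z = rng.standard_normal((n_o + n_t, m))
K = Z @ Z.T / m + 0.01 * np.eye(n_o + n_t)   # kinship with a small ridge (assumption)
K_oo, K_to = K[:n_o, :n_o], K[n_o:, :n_o]

X = np.ones((n_o, 1))                         # intercept only
y = rng.standard_normal(n_o)                  # focal-trait phenotypes Y_{f,o}
sigma_u, sigma_e = 0.6, 0.4                   # variance components, treated as known here

V = sigma_u * K_oo + sigma_e * np.eye(n_o)
Vinv = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # GLS estimate of beta_f
resid = y - X @ beta

u_test = sigma_u * K_to @ Vinv @ resid                # first form in equation (3)
u_train = sigma_u * K_oo @ Vinv @ resid               # GBLUP for the training set
u_test_alt = K_to @ np.linalg.solve(K_oo, u_train)    # second form in equation (3)
```
</preformat>
<p>Both forms of the test-set predictor agree up to rounding error, which shows that the test-set GBLUP is a kinship-weighted projection of the training-set GBLUP.</p>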
</sec>
<sec>
<title>2.4. Multivariate GBLUP in Scenarios 1 and 2</title>
<p>The multivariate GBLUP in scenario 1 is</p>
<disp-formula id="E7"><label>(4)</label><mml:math id="M21"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mover 
accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mover 
accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M22"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> is the GBLUP for the training set, and REML-estimates of &#x003B2; and the variance components (matrices) &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup> are obtained from the multivariate mixed model for <italic>Y</italic><sub><italic>f</italic></sub> and <italic>Y</italic><sub><italic>s</italic></sub>. As pointed out by Runcie and Cheng (<xref ref-type="bibr" rid="B23">2019</xref>), <inline-formula><mml:math id="M23"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>m</mml:mi><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M24"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">uni</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> have the same form, but the &#x0201C;input&#x0201D; &#x000DB;<sub><italic>f, o</italic></sub> differs.</p>
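<p>The Kronecker structure of equation (4) can be sketched in the same way. The example below assumes a single secondary trait, hypothetical 2 &#x000D7; 2 matrices for &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup>, one intercept per trait, and training phenotypes stacked trait by trait; none of these choices come from the analyses in this paper.</p>
<preformat preformat-type="code">
```python
import numpy as np

rng = np.random.default_rng(1)
n_o, n_t, m = 6, 3, 20
Z = rng.standard_normal((n_o + n_t, m))
K = Z @ Z.T / m + 0.01 * np.eye(n_o + n_t)   # kinship with a small ridge (assumption)
K_oo, K_to = K[:n_o, :n_o], K[n_o:, :n_o]

# Hypothetical covariance matrices: focal trait first, one secondary trait second.
Sigma_u = np.array([[0.6, 0.3], [0.3, 0.5]])
Sigma_e = np.array([[0.4, 0.1], [0.1, 0.5]])

Y = rng.standard_normal((n_o, 2))             # columns: Y_{f,o} and Y_{s,o}
y = Y.T.reshape(-1)                           # stacked trait by trait, length 2 * n_o
X = np.kron(np.eye(2), np.ones((n_o, 1)))     # one intercept per trait

V = np.kron(Sigma_u, K_oo) + np.kron(Sigma_e, np.eye(n_o))
Vinv = np.linalg.inv(V)
beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
resid = y - X @ beta

S_f = Sigma_u[:1, :]                          # row of Sigma_u belonging to the focal trait
u_test = np.kron(S_f, K_to) @ Vinv @ resid            # first form in equation (4)
u_train = np.kron(S_f, K_oo) @ Vinv @ resid           # multivariate GBLUP, training set
u_test_alt = K_to @ np.linalg.solve(K_oo, u_train)    # second form in equation (4)
```
</preformat>
<p>As in the univariate case, the two forms coincide; only the training-set GBLUP that enters the projection has changed.</p>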
<p>The multivariate GBLUP in scenario 2 is</p>
<disp-formula id="E8"><label>(5)</label><mml:math id="M25"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msubsup><mml:mover accent='true'><mml:mi>U</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mi>m</mml:mi><mml:mn>2</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mi>E</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>U</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>Y</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;</mml:mtext><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mtext>&#x02009;</mml:mtext><mml:msup><mml:mover 
accent='true'><mml:mi>V</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>f</mml:mi><mml:mo>,</mml:mo><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mover accent='true'><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>f</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi>Y</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>s</mml:mi></mml:msub><mml:msub><mml:mover accent='true'><mml:mi>&#x003B2;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mi>s</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mtext>&#x0200B;</mml:mtext><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;</mml:mtext><mml:mover accent='true'><mml:mi>V</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:msub><mml:mi>K</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mover 
accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msubsup><mml:mi>K</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mi>t</mml:mi></mml:msubsup></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mi>u</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:mi>K</mml:mi></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;&#x02009;</mml:mtext><mml:mo>+</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mi>e</mml:mi></mml:msubsup><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mover 
accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>f</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mi>e</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mn>0</mml:mn></mml:mtd><mml:mtd><mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mi>e</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mtable><mml:mtr><mml:mtd><mml:mrow><mml:msup><mml:mn>0</mml:mn><mml:mi>t</mml:mi></mml:msup></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:msub><mml:mi>n</mml:mi><mml:mi>o</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd><mml:mtd><mml:mrow><mml:msubsup><mml:mover accent='true'><mml:mo>&#x003A3;</mml:mo><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mi>e</mml:mi></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where 0 denotes an <italic>n</italic><sub><italic>o</italic></sub> &#x000D7; <italic>n</italic><sub><italic>t</italic></sub> matrix of zeros. This differs from the CV2 prediction in Runcie and Cheng (<xref ref-type="bibr" rid="B23">2019</xref>), who described a two-step approach.</p>
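<p>The block structure of equation (5) can be checked numerically. The sketch below assumes one secondary trait, hypothetical covariance matrices, test genotypes ordered before training genotypes, and centred phenotypes as a stand-in for the residuals. It builds the covariance matrix block by block and compares it with the corresponding rows and columns of the full covariance of all focal and secondary trait values.</p>
<preformat preformat-type="code">
```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_o, m = 3, 5, 20
n = n_t + n_o
Z = rng.standard_normal((n, m))
K = Z @ Z.T / m + 0.01 * np.eye(n)        # test genotypes first, training genotypes last
t, o = np.arange(n_t), np.arange(n_t, n)
K_oo, K_to = K[np.ix_(o, o)], K[np.ix_(t, o)]
K_o_all, K_t_all = K[o, :], K[t, :]       # K_{o.} and K_{t.}

Sigma_u = np.array([[0.6, 0.3], [0.3, 0.5]])   # hypothetical values
Sigma_e = np.array([[0.4, 0.1], [0.1, 0.5]])

# (0  I_{n_o}): selects the training genotypes among all n genotypes.
J = np.hstack([np.zeros((n_o, n_t)), np.eye(n_o)])

V = np.block([
    [Sigma_u[0, 0] * K_oo,      Sigma_u[0, 1] * K_o_all],
    [Sigma_u[1, 0] * K_o_all.T, Sigma_u[1, 1] * K],
]) + np.block([
    [Sigma_e[0, 0] * np.eye(n_o), Sigma_e[0, 1] * J],
    [Sigma_e[1, 0] * J.T,         Sigma_e[1, 1] * np.eye(n)],
])
C = np.hstack([Sigma_u[0, 0] * K_to, Sigma_u[0, 1] * K_t_all])

# Centred phenotypes stand in for the residuals (simplification).
y_f = rng.standard_normal(n_o)
y_s = rng.standard_normal(n)
resid = np.concatenate([y_f - y_f.mean(), y_s - y_s.mean()])
u_test = C @ np.linalg.solve(V, resid)    # scenario 2 predictor

# Cross-check against the full covariance of (Y_f, Y_s) on all n genotypes.
G_full = np.kron(Sigma_u, K)
V_full = G_full + np.kron(Sigma_e, np.eye(n))
obs = np.concatenate([o, n + np.arange(n)])   # observed entries: Y_{f,o} and all of Y_s
V_check = V_full[np.ix_(obs, obs)]
C_check = G_full[np.ix_(t, obs)]
```
</preformat>
<p>The agreement between the block construction and the sub-matrix of the full covariance confirms that the zero block in the residual term has <italic>n</italic><sub><italic>o</italic></sub> rows and <italic>n</italic><sub><italic>t</italic></sub> columns under this ordering.</p>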
</sec>
<sec>
<title>2.5. Dimension Reduction Using LASSO or Random Forests</title>
<p>Expressions (4) and (5) are valid regardless of whether there is a single secondary phenotype or several. However, when the dimension of the secondary phenotype (<italic>p</italic>) is larger than 5&#x02013;10, estimation of the required genetic covariances quickly becomes challenging and often infeasible (Zhou and Stephens, <xref ref-type="bibr" rid="B36">2014</xref>; Zwiernik et al., <xref ref-type="bibr" rid="B37">2017</xref>). Moreover, even if estimates of the genetic covariances are available, the resulting predictions may be prone to overfitting. Reducing the dimension of the secondary phenotype is a natural strategy to deal with these issues.</p>
<p>Here we propose the dimension reduction <italic>S</italic> &#x0003D; <italic>&#x00125;</italic>(<italic>Y</italic><sub><italic>s</italic></sub>), where <italic>&#x00125;</italic>(<italic>Y</italic><sub><italic>s</italic></sub>) is a prediction of <italic>Y</italic><sub><italic>f</italic></sub> based on <italic>Y</italic><sub><italic>s</italic></sub>, obtained with either LASSO or random forests. Genomic prediction in scenarios 1 and 2 is then performed using (4) and (5), with <italic>S</italic> &#x0003D; <italic>&#x00125;</italic>(<italic>Y</italic><sub><italic>s</italic></sub>) as the secondary trait. We will refer to the resulting genomic predictions as LS-BLUP and RF-BLUP, depending on whether the dimension reduction was achieved with LASSO or random forests, respectively. In a GWAS context, such dimension reductions have been used by van Heerwaarden et al. (<xref ref-type="bibr" rid="B31">2015</xref>) and Melandri (<xref ref-type="bibr" rid="B18">2019</xref>). The intuition behind this dimension reduction is that some of the secondary traits may have a causal effect on <italic>Y</italic><sub><italic>f</italic></sub> (<xref ref-type="fig" rid="F1">Figure 1</xref>, left). Genomic prediction with LS-BLUP and RF-BLUP may then work well if the prediction &#x00176;<sub><italic>f</italic></sub> &#x0003D; <italic>&#x00125;</italic>(<italic>Y</italic><sub><italic>s</italic></sub>) captures most of the relevant genetic correlations. In our simulations described below, we also consider the situation where genetic correlations are not the result of a causal effect of <italic>Y</italic><sub><italic>s</italic></sub> on <italic>Y</italic><sub><italic>f</italic></sub> (for example, as in <xref ref-type="fig" rid="F1">Figure 1</xref>, right panel). Because of the relatively small size of the populations considered here, the dimension reduction is computed on the same training set that is used for genomic prediction. 
This is of course not essential for this approach, and various sample splitting techniques may be of interest for larger populations; see the discussion section below.</p>
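<p>The LASSO-based reduction of <italic>Y</italic><sub><italic>s</italic></sub> to a single trait <italic>S</italic> can be sketched as follows. This is an illustrative hand-written coordinate-descent LASSO in Python with NumPy, not the implementation used for the results in this paper; the simulated data, the function name <italic>lasso_cd</italic>, and the fixed penalty are hypothetical, and in practice the penalty would typically be chosen by cross-validation.</p>
<preformat preformat-type="code">
```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    col_ss = (Xc ** 2).sum(axis=0) / n
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            partial = yc - Xc @ b + Xc[:, j] * b[j]   # residual ignoring feature j
            rho = Xc[:, j] @ partial / n
            # Soft-thresholding update for coefficient j.
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_ss[j]
    return y_mean - x_mean @ b, b

# Hypothetical training data: 40 genotypes, 10 secondary traits, 2 of them informative.
rng = np.random.default_rng(3)
n_o, p = 40, 10
Y_s = rng.standard_normal((n_o, p))
Y_f = 1.0 * Y_s[:, 0] - 0.5 * Y_s[:, 1] + 0.3 * rng.standard_normal(n_o)

b0, b = lasso_cd(Y_s, Y_f, lam=0.1)
S = b0 + Y_s @ b          # one-dimensional secondary trait S = h_hat(Y_s)

# Above the smallest penalty that zeroes every coefficient, the fit is constant.
Yc = Y_s - Y_s.mean(axis=0)
lam_max = np.max(np.abs(Yc.T @ (Y_f - Y_f.mean()))) / n_o
_, b_null = lasso_cd(Y_s, Y_f, lam=1.1 * lam_max)
```
</preformat>
<p>The vector <italic>S</italic> would then take the place of the secondary trait in (4) or (5). Because the coefficients are fitted on the same genotypes that are later used for genomic prediction, <italic>S</italic> may be somewhat overfitted to the training set.</p>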
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Causal diagrams showing different assumptions about the mechanisms underlying genetic correlations between a high-dimensional secondary phenotype <italic>Y</italic><sub><italic>s</italic></sub> and a target (focal) trait <italic>Y</italic><sub><italic>f</italic></sub>. For ease of presentation, <italic>Y</italic><sub><italic>s</italic></sub> is represented by a single node; causal relationships among some of the secondary traits might exist. Outgoing arrows from the node <italic>G</italic> to a trait represent the genetic effect of all loci combined. The arrow <italic>Y</italic><sub><italic>s</italic></sub> &#x02192; <italic>Y</italic><sub><italic>f</italic></sub> represents a causal effect of at least one of the secondary traits on the target trait. <bold>(Left)</bold> Some of the genetic correlations between <italic>Y</italic><sub><italic>s</italic></sub> and <italic>Y</italic><sub><italic>f</italic></sub> are the result of the causal effect <italic>Y</italic><sub><italic>s</italic></sub> &#x02192; <italic>Y</italic><sub><italic>f</italic></sub>; to some extent they may also be a consequence of correlation between the direct genetic effects <italic>G</italic> &#x02192; <italic>Y</italic><sub><italic>f</italic></sub> and <italic>G</italic> &#x02192; <italic>Y</italic><sub><italic>s</italic></sub> (see Kruijer et al., <xref ref-type="bibr" rid="B15">2020</xref> for more mathematical details). <bold>(Right)</bold> There is no causal effect <italic>Y</italic><sub><italic>s</italic></sub> &#x02192; <italic>Y</italic><sub><italic>f</italic></sub>, and genetic correlations between them may be induced by genetic effects on a latent trait <italic>L</italic> that affects both <italic>Y</italic><sub><italic>s</italic></sub> and <italic>Y</italic><sub><italic>f</italic></sub>. 
The <bold>LS-BLUP</bold> and <bold>RF-BLUP</bold> methods assume the left diagram, and reduce the dimension of <italic>Y</italic><sub><italic>s</italic></sub> by first making a prediction &#x00176;<sub><italic>f</italic></sub> using <italic>Y</italic><sub><italic>s</italic></sub> within the training set. The <bold>GM-BLUP</bold> method also implicitly assumes the left diagram.</p></caption>
<graphic xlink:href="fgene-12-667358-g0001.tif"/>
</fig>
<p>When using RF-BLUP in the simulations described below, we used the R package randomForest with its default settings. Often, however, a more accurate dimension reduction can be achieved by tuning hyperparameters such as the number of trees, which we explore for the real data.</p>
</sec>
<sec>
<title>2.6. Dimension Reduction Using Selection Indices</title>
<p>In addition to the notation <italic>Y</italic><sub><italic>s</italic></sub> for the column vector containing all secondary traits, we will now also use <italic>Y</italic><sub><italic>s</italic></sub>(<italic>j</italic>) for the column-vector containing the <italic>j</italic>th secondary trait, the dimension being either <italic>n</italic><sub><italic>o</italic></sub> &#x000D7; 1 (scenario 1) or <italic>n</italic> &#x000D7; 1 (scenario 2). We will use <inline-formula><mml:math id="M27"><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> for the row-vector containing all secondary traits for genotype <italic>i</italic>. Recall that the individual secondary traits are still labeled <italic>Y</italic><sub>2</sub>, &#x02026;, <italic>Y</italic><sub><italic>p</italic>&#x0002B;1</sub>, <italic>Y</italic><sub>1</sub> being the target trait.</p>
<p>A well-known alternative dimension reduction approach is to use a selection index <inline-formula><mml:math id="M28"><mml:mi>S</mml:mi><mml:mo>=</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, which is a linear combination of secondary traits, with coefficients such that the resulting index best predicts the genetic effect of the target trait (Falconer and Mackay, <xref ref-type="bibr" rid="B7">1996</xref>). Assuming independent genetic effects (i.e., ignoring population structure), the <italic>p</italic> &#x000D7; 1 vector &#x003B3; of coefficients is obtained by minimizing, for each individual <italic>i</italic>, the expectation of <inline-formula><mml:math id="M29"><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>. 
The minimizing &#x003B3; then equals the inverse of the variance-covariance matrix of <italic>Y</italic><sub><italic>s</italic></sub> times the vector of genetic covariances between <italic>Y</italic><sub><italic>s</italic></sub> and <italic>Y</italic><sub><italic>f</italic></sub>, i.e., <inline-formula><mml:math id="M30"><mml:msup><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msup><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>. To estimate &#x003B3;<sup><italic>SI</italic></sup>, one could plug in estimates <inline-formula><mml:math id="M31"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M32"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>, where <inline-formula><mml:math id="M33"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover 
accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>o</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup><mml:mo>&#x02297;</mml:mo><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:math></inline-formula> is the estimated variance-covariance matrix of the secondary traits on the training population, and <inline-formula><mml:math id="M34"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> contains estimates of genetic covariances with the target trait. 
However, when the dimension (<italic>p</italic>) is large, <inline-formula><mml:math id="M35"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M36"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> are difficult to estimate, and the selection index is likely to overfit, as some elements in <inline-formula><mml:math id="M37"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> may be large by chance, and receive too much weight.</p>
<p>To address these issues, Lopez-Cruz et al. (<xref ref-type="bibr" rid="B17">2020</xref>) proposed penalized selection indices, minimizing instead <inline-formula><mml:math id="M38"><mml:mi>E</mml:mi><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003BB;</mml:mi><mml:mi>J</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, where &#x003BB; &#x02265; 0 is the penalty parameter and <italic>J</italic>(&#x003B3;) is either <inline-formula><mml:math id="M39"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> (ridge penalty) or <inline-formula><mml:math id="M40"><mml:munderover accentunder="false" 
accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:munderover><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:math></inline-formula> (LASSO penalty). Setting &#x003BB; &#x0003D; 0 gives the classical (unpenalized) SI. With a ridge penalty, the coefficients of the penalized SI are given by</p>
<disp-formula id="E9"><label>(6)</label><mml:math id="M41"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>S</mml:mi><mml:mi>I</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003BB;</mml:mi><mml:msub><mml:mrow><mml:mi>I</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
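<p>As a minimal numerical sketch of Equation (6), assuming that the two estimated covariance terms are already available as arrays, the ridge-penalized coefficients can be obtained by solving a single linear system. The function name <italic>ridge_selection_index</italic> and the toy values are illustrative assumptions and are not taken from the SFSI package.</p>
<preformat>
```python
import numpy as np

def ridge_selection_index(sigma_s, sigma_sf_u, lam=0.0):
    """Coefficients (Sigma_s + lam * I_p)^{-1} Sigma_sf^u of Equation (6)."""
    sigma_s = np.asarray(sigma_s, dtype=float)
    sigma_sf_u = np.asarray(sigma_sf_u, dtype=float)
    p = sigma_s.shape[0]
    # Solve the linear system rather than forming an explicit inverse.
    return np.linalg.solve(sigma_s + lam * np.eye(p), sigma_sf_u)

# Hypothetical inputs for p = 3 secondary traits.
sigma_s = np.array([[1.0, 0.3, 0.1],
                    [0.3, 1.0, 0.2],
                    [0.1, 0.2, 1.0]])
sigma_sf_u = np.array([0.4, 0.1, 0.0])

gamma_classical = ridge_selection_index(sigma_s, sigma_sf_u, lam=0.0)
gamma_ridge = ridge_selection_index(sigma_s, sigma_sf_u, lam=0.5)
```
</preformat>
<p>In this sketch, lam = 0 recovers the classical index, and larger values shrink the coefficient vector toward zero.</p>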
<p>We will follow the implementation by Lopez-Cruz et al. (<xref ref-type="bibr" rid="B17">2020</xref>) in their R-package SFSI, where <inline-formula><mml:math id="M42"><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> is estimated with MANOVA on the individual plant or plot-level data, and <inline-formula><mml:math id="M43"><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is estimated using the sample covariance matrix of the secondary traits. We emphasize that no multi-trait mixed model of the form (1)&#x02013;(2) is fitted. Moreover, the regularization only controls how <inline-formula><mml:math id="M44"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> affects <inline-formula><mml:math id="M45"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula>; the estimates <inline-formula><mml:math id="M46"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="M47"><mml:msubsup><mml:mrow><mml:mover 
accent="true"><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>u</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> themselves are not regularized.</p>
<p>Again following Lopez-Cruz et al. (<xref ref-type="bibr" rid="B17">2020</xref>), we use internal cross-validation within the training set to choose the value of &#x003BB; that maximizes <italic>h</italic>(<italic>S</italic>)&#x003C1;<sub><italic>G</italic></sub>(<italic>S, Y</italic><sub><italic>f</italic></sub>). After selecting &#x003BB;, genomic prediction in scenarios 1 and 2 is performed using (4) and (5) with a single secondary trait, namely the selection index <inline-formula><mml:math id="M48"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:munderover><mml:msubsup><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>&#x003BB;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>j</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. We refer to the resulting genomic prediction as SI-BLUP.</p>
</sec>
<sec>
<title>2.7. Genomic Prediction Using Multiple Relatedness Matrices</title>
<p>Another alternative to selection indices is to model the secondary traits using random effects (see e.g., Riedelsheimer et al., <xref ref-type="bibr" rid="B22">2012</xref>; Van De Wiel et al., <xref ref-type="bibr" rid="B30">2016</xref>; Xu et al., <xref ref-type="bibr" rid="B34">2016</xref>; Schrag et al., <xref ref-type="bibr" rid="B24">2018</xref>; Xiang et al., <xref ref-type="bibr" rid="B33">2019</xref>; Azodi et al., <xref ref-type="bibr" rid="B3">2020</xref>). Besides the genetic relatedness matrix <italic>K</italic>, these models use a second relatedness matrix <italic>M</italic> derived from the secondary phenotypes, and assume that</p>
<disp-formula id="E10"><label>(7)</label><mml:math id="M49"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M50"><mml:msubsup><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>K</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M51"><mml:msubsup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mi>M</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. We will call this the Multi-BLUP model (not to be confused with Speed and Balding, <xref ref-type="bibr" rid="B26">2014</xref>, where the same type of model is used, but where genomic regions are represented by different relatedness matrices). 
The variance components <inline-formula><mml:math id="M52"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, <inline-formula><mml:math id="M53"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula>, and <inline-formula><mml:math id="M54"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> can be estimated with REML or with cross-validation. For simplicity we consider only one type of secondary phenotype. Similar to the equivalence between GBLUP and SNP-BLUP, if <italic>M</italic> &#x0003D; <italic>Y</italic><sub><italic>s</italic></sub><italic>Y</italic><sub><italic>s</italic></sub><sup><italic>t</italic></sup>/<italic>p</italic>, the effects <inline-formula><mml:math id="M55"><mml:msubsup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> can be written as <italic>Y</italic><sub><italic>s</italic></sub><italic>b</italic><sub><italic>s</italic></sub>, for a vector <italic>b</italic><sub><italic>s</italic></sub> of independent random effects, each with a <inline-formula><mml:math id="M56"><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> distribution. 
Hence, similar to the LS-BLUP and RF-BLUP, the Multi-BLUP approach implicitly assumes a causal effect of <italic>Y</italic><sub><italic>s</italic></sub> on <italic>Y</italic><sub><italic>f</italic></sub> (<xref ref-type="fig" rid="F1">Figure 1</xref>, left), which is assumed to be linear, with random coefficients. The usual &#x0201C;genomic&#x0201D; prediction based on model (7) is</p>
<disp-formula id="E11"><label>(8)</label><mml:math id="M57"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mtext class="textrm" mathvariant="normal">Multi</mml:mtext></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>i.e., the sum of the BLUPs for the genetic and secondary trait effects. We put &#x0201C;genomic&#x0201D; in quotes because (8) is partly a phenotypic prediction: instead of the genetic component of the secondary traits, it relies directly on the traits themselves, which are assumed to be available on the test set. As a consequence, the use of (8) is limited to scenario 2.</p>
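<p>The prediction in (8) can be sketched numerically as follows. This is only an illustration under stated assumptions: the variance components are treated as known (in practice they would be estimated, e.g., with REML), <italic>M</italic> is taken to be <italic>Y</italic><sub><italic>s</italic></sub><italic>Y</italic><sub><italic>s</italic></sub><sup><italic>t</italic></sup>/<italic>p</italic>, and the function name <italic>multi_blup</italic>, its arguments and the toy data are hypothetical.</p>
<preformat>
```python
import numpy as np

def multi_blup(y_tr, X_tr, K_oo, K_to, Ys_tr, Ys_te, s2_K, s2_M, s2_E):
    """BLUPs under model (7) for known variance components.

    Assumes M = Ys Ys^t / p, so the secondary-trait effect equals Ys @ b_s.
    Returns (u_gen_te, v_sec_te, b_s) for the test genotypes.
    """
    n_o, p = Ys_tr.shape
    M_oo = Ys_tr @ Ys_tr.T / p
    V = s2_K * K_oo + s2_M * M_oo + s2_E * np.eye(n_o)
    # Generalized least squares estimate of the fixed effects.
    Vinv_X = np.linalg.solve(V, X_tr)
    beta = np.linalg.solve(X_tr.T @ Vinv_X, Vinv_X.T @ y_tr)
    alpha = np.linalg.solve(V, y_tr - X_tr @ beta)
    u_gen_te = s2_K * K_to @ alpha        # genetic BLUP for the test genotypes
    b_s = (s2_M / p) * Ys_tr.T @ alpha    # predicted random coefficients
    v_sec_te = Ys_te @ b_s                # requires Ys observed on the test set
    return u_gen_te, v_sec_te, b_s

# Toy data (all values hypothetical).
rng = np.random.default_rng(1)
n, n_o, p = 30, 20, 5
Z = rng.standard_normal((n, 50))
K = Z @ Z.T / 50                          # toy relatedness matrix
Ys = rng.standard_normal((n, p))          # toy secondary traits
y = rng.standard_normal(n)
tr, te = np.arange(n_o), np.arange(n_o, n)
X_tr = np.ones((n_o, 1))                  # intercept only

u_te, v_te, b_s = multi_blup(y[tr], X_tr, K[np.ix_(tr, tr)], K[np.ix_(te, tr)],
                             Ys[tr], Ys[te], s2_K=0.5, s2_M=0.3, s2_E=0.2)
u_multi_te = u_te + v_te                  # Equation (8) for the test genotypes
```
</preformat>
<p>The term v_te cannot be computed without the argument Ys_te, i.e., the secondary traits of the test genotypes, which is the restriction to scenario 2 noted above.</p>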
<p>To overcome these limitations we propose the GM-BLUP:</p>
<disp-formula id="E12"><label>(9)</label><mml:math id="M58"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>G</mml:mi><mml:mi>M</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M59"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the vector of predicted random coefficients obtained from the Multi-BLUP model, and <inline-formula><mml:math id="M60"><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">gen</mml:mtext></mml:mstyle></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:math></inline-formula> is the matrix of GBLUPs for the secondary traits (either univariate or multivariate). These GBLUPs can of course also be computed in scenario 1. Apart from being the &#x0201C;genomic analogue&#x0201D; of (8), (9) can also be motivated by a causal model of the form</p>
<disp-formula id="E13"><label>(10)</label><mml:math id="M61"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mi>h</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>as considered by T&#x000F6;pner et al. (<xref ref-type="bibr" rid="B29">2017</xref>) and Grotzinger et al. (<xref ref-type="bibr" rid="B13">2019</xref>). In contrast to the Multi-BLUP, GM-BLUP only depends on the genetic components of the secondary traits.</p>
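<p>A minimal sketch of how (9) could be assembled is given below. It assumes univariate GBLUPs for the secondary traits with a known ratio of residual to genetic variance per trait, and it centers each trait by its training mean instead of estimating fixed effects; both are simplifications. The function names <italic>gblup_secondary</italic> and <italic>gm_blup</italic> are hypothetical, and the genetic BLUP of the target trait and the coefficients are replaced by placeholders that would in practice come from the fitted Multi-BLUP model.</p>
<preformat>
```python
import numpy as np

def gblup_secondary(Ys_tr, K_oo, K_to, delta):
    """Univariate GBLUPs of each secondary trait for the test genotypes.

    delta[j] is the assumed-known ratio of residual to genetic variance of
    trait j; traits are centered by their training mean (a simplification).
    """
    n_o, p = Ys_tr.shape
    centered = Ys_tr - Ys_tr.mean(axis=0)
    U_s_te = np.empty((K_to.shape[0], p))
    for j in range(p):
        alpha_j = np.linalg.solve(K_oo + delta[j] * np.eye(n_o), centered[:, j])
        U_s_te[:, j] = K_to @ alpha_j
    return U_s_te

def gm_blup(u_f_gen, U_s_gen, b_s):
    """Equation (9): genetic BLUP of the target trait plus the GBLUPs of the
    secondary traits weighted by the Multi-BLUP coefficients b_s."""
    return u_f_gen + U_s_gen @ b_s

# Toy data (all values hypothetical).
rng = np.random.default_rng(2)
n_o, n_t, p = 20, 8, 4
Z = rng.standard_normal((n_o + n_t, 40))
K = Z @ Z.T / 40
K_oo, K_to = K[:n_o, :n_o], K[n_o:, :n_o]
Ys_tr = rng.standard_normal((n_o, p))

U_s_te = gblup_secondary(Ys_tr, K_oo, K_to, delta=np.full(p, 0.5))
u_f_te = rng.standard_normal(n_t)   # placeholder for the genetic BLUP of Y_f
b_s = rng.standard_normal(p)        # placeholder for the Multi-BLUP coefficients
u_gm_te = gm_blup(u_f_te, U_s_te, b_s)
```
</preformat>
<p>Only the training-set secondary traits and the relatedness between test and training genotypes enter this computation, so the sketch applies in scenario 1 as well. The secondary traits should be on the same scale as the one used when the coefficients were obtained.</p>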
<p>Finally, following many other authors (e.g., Riedelsheimer et al., <xref ref-type="bibr" rid="B22">2012</xref>; Xu et al., <xref ref-type="bibr" rid="B34">2016</xref>) we will also compute a prediction based on the secondary traits alone, using the model</p>
<disp-formula id="E14"><label>(11)</label><mml:math id="M62"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>and define the MBLUP</p>
<disp-formula id="E15"><label>(12)</label><mml:math id="M63"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>U</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>V</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mtext class="textrm" mathvariant="normal">sec</mml:mtext></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>Y</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Again, this is to some degree a phenotypic prediction, and since the direct effects of the SNPs are ignored, the estimated effects <inline-formula><mml:math id="M64"><mml:msub><mml:mrow><mml:mover accent="true"><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> will differ from those obtained from model (7).</p>
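<p>Under the assumption that <italic>M</italic> &#x0003D; <italic>Y</italic><sub><italic>s</italic></sub><italic>Y</italic><sub><italic>s</italic></sub><sup><italic>t</italic></sup>/<italic>p</italic>, the coefficients of model (11) coincide with those of a ridge regression of the target trait on the secondary traits, with penalty <italic>p</italic> times the ratio of the residual variance to the variance of the secondary-trait effect. The sketch below checks this numerically on toy data, treating the variance components as known and the fixed effects as already removed; the function names are hypothetical.</p>
<preformat>
```python
import numpy as np

def mblup_coefficients(r_tr, Ys_tr, s2_M, s2_E):
    """BLUP of b_s under model (11); r_tr is y minus the fitted fixed effects."""
    n_o, p = Ys_tr.shape
    V = s2_M * Ys_tr @ Ys_tr.T / p + s2_E * np.eye(n_o)
    return (s2_M / p) * Ys_tr.T @ np.linalg.solve(V, r_tr)

def ridge_coefficients(r_tr, Ys_tr, kappa):
    """Ridge regression of r_tr on Ys_tr with penalty kappa."""
    p = Ys_tr.shape[1]
    return np.linalg.solve(Ys_tr.T @ Ys_tr + kappa * np.eye(p), Ys_tr.T @ r_tr)

# Toy data (all values hypothetical).
rng = np.random.default_rng(3)
n_o, p = 25, 6
Ys_tr = rng.standard_normal((n_o, p))
r_tr = rng.standard_normal(n_o)
s2_M, s2_E = 0.6, 0.4

b_blup = mblup_coefficients(r_tr, Ys_tr, s2_M, s2_E)
b_ridge = ridge_coefficients(r_tr, Ys_tr, kappa=p * s2_E / s2_M)
```
</preformat>
<p>The MBLUP in (12) for test genotypes is then the product of their secondary-trait matrix with b_blup.</p>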
</sec>
<sec>
<title>2.8. Simulations</title>
<p>We first compare the different methods on simulated data, with <italic>p</italic> &#x0003D; 300 secondary traits. We used existing genotypic data from the Arabidopsis RegMap, containing 1,307 accessions genotyped with 214,051 SNPs (Horton et al., <xref ref-type="bibr" rid="B14">2012</xref>). For each data-set we randomly selected 500 accessions, from which we randomly sampled a test set of 100 accessions. We randomly selected 1,500 SNPs with a minor allele frequency of at least 0.3. For each data-set we first simulated direct genetic effects (<italic>g</italic><sub><italic>i</italic></sub>) and residuals (<italic>r</italic><sub><italic>i</italic></sub>) for each accession <italic>i</italic>, and the final trait values were obtained using a structural equation model, describing functional relations between traits. More specifically, for each individual <italic>i</italic>, the 1 &#x000D7; (<italic>p</italic> &#x0002B; 1) vector of trait values is defined by <italic>y</italic><sub><italic>i</italic></sub> &#x0003D; <italic>y</italic><sub><italic>i</italic></sub>&#x0039B; &#x0002B; <italic>g</italic><sub><italic>i</italic></sub> &#x0002B; <italic>r</italic><sub><italic>i</italic></sub>, &#x0039B; being the (<italic>p</italic> &#x0002B; 1) &#x000D7; (<italic>p</italic> &#x0002B; 1) matrix of structural coefficients. The (<italic>k, l</italic>)th entry of &#x0039B; contains the effect of trait <italic>k</italic> on trait <italic>l</italic>, and the vectors <italic>g</italic><sub><italic>i</italic></sub> and <italic>r</italic><sub><italic>i</italic></sub> have zero mean Gaussian distributions with covariance matrices &#x003A3;<sup><italic>g</italic></sup> and &#x003A3;<sup><italic>r</italic></sup>, respectively. 
The joint distribution of all <italic>n</italic>(<italic>p</italic> &#x0002B; 1) trait values is then as in (1), with &#x003A3;<sup><italic>u</italic></sup> &#x0003D; &#x00393;<sup><italic>t</italic></sup>&#x003A3;<sup><italic>g</italic></sup>&#x00393; and &#x003A3;<sup><italic>e</italic></sup> &#x0003D; &#x00393;<sup><italic>t</italic></sup>&#x003A3;<sup><italic>r</italic></sup>&#x00393;, where &#x00393; &#x0003D; (<italic>I</italic> &#x02212; &#x0039B;)<sup>&#x02212;1</sup> (Gianola and Sorensen, <xref ref-type="bibr" rid="B12">2004</xref>; T&#x000F6;pner et al., <xref ref-type="bibr" rid="B29">2017</xref>; Kruijer et al., <xref ref-type="bibr" rid="B15">2020</xref>).</p>
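<p>The structural equation can be solved for the trait values in closed form, which the following sketch illustrates on a small toy example. The direct genetic effects and residuals are taken as given arrays here (independent draws, for illustration only), and the function name <italic>simulate_sem_traits</italic> is hypothetical.</p>
<preformat>
```python
import numpy as np

def simulate_sem_traits(G, R, Lam):
    """Solve y_i = y_i Lam + g_i + r_i for every row i.

    G and R are n x d arrays of direct genetic effects and residuals
    (d = p + 1 traits); Lam[k, l] is the effect of trait k on trait l.
    """
    d = Lam.shape[0]
    Gamma = np.linalg.inv(np.eye(d) - Lam)
    return (G + R) @ Gamma

# Toy example with d = 3 traits: the second and third trait affect the first.
Lam = np.zeros((3, 3))
Lam[1, 0] = 0.5
Lam[2, 0] = 0.5
rng = np.random.default_rng(4)
G = rng.standard_normal((10, 3))
R = rng.standard_normal((10, 3))
Y = simulate_sem_traits(G, R, Lam)
```
</preformat>
<p>Each row of Y satisfies the structural equation, so the first trait equals 0.5 times the sum of the other two plus its own direct genetic effect and residual.</p>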
<p>The target trait is defined as <italic>Y</italic><sub><italic>f</italic></sub> &#x0003D; <italic>Y</italic><sub>1</sub> &#x0003D; &#x003BB;(<italic>Y</italic><sub>2</sub> &#x0002B; <italic>Y</italic><sub>3</sub> &#x0002B; <italic>Y</italic><sub>4</sub>) &#x0002B; <italic>G</italic><sub>1</sub> &#x0002B; <italic>R</italic><sub>1</sub>, and we do not assume any functional relations among the secondary traits. Here &#x003BB; is a structural coefficient and should not be confused with the penalty of the penalized selection index. Hence, if &#x003BB; &#x02260; 0, there is a causal effect of <italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, and <italic>Y</italic><sub>4</sub> on <italic>Y</italic><sub>1</sub>, but the algorithms under consideration do not know which of the 300 secondary traits are the actual causal ones. We consider &#x003BB; values on the grid {&#x02212;1, &#x02212;0.5, 0, 0.5, 1}. The matrix &#x003A3;<sup><italic>g</italic></sup> has diagonal elements (0.2, 0.7, &#x02026;, 0.7), i.e., the variances of the direct genetic effects are 0.2 for <italic>Y</italic><sub><italic>f</italic></sub> and 0.7 for each of the secondary traits. The off-diagonal elements corresponding to <italic>Y</italic><sub>1</sub> vs. (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) are <inline-formula><mml:math id="M65"><mml:msub><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:msqrt><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x000B7;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>7</mml:mn></mml:mrow></mml:msqrt></mml:math></inline-formula>, where we choose &#x003C1;<sub><italic>G</italic></sub> &#x02208; {&#x02212;0.5, 0, 0.5}. 
Similarly, &#x003A3;<sup><italic>r</italic></sup> has diagonal elements 0.8 for <italic>Y</italic><sub><italic>f</italic></sub> and 0.3 for the secondary traits, and the off-diagonal elements between <italic>Y</italic><sub>1</sub> and (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) are <inline-formula><mml:math id="M66"><mml:msub><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>E</mml:mi></mml:mrow></mml:msub><mml:msqrt><mml:mrow><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>8</mml:mn><mml:mo>&#x000B7;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>3</mml:mn></mml:mrow></mml:msqrt></mml:math></inline-formula>, with &#x003C1;<sub><italic>E</italic></sub> &#x02208; {&#x02212;0.5, 0, 0.5}. The other off-diagonal elements in &#x003A3;<sup><italic>g</italic></sup> and &#x003A3;<sup><italic>r</italic></sup> are zero.</p>
<p>For the special case &#x003BB; &#x0003D; 0 we have &#x00393; &#x0003D; <italic>I</italic>, &#x003A3;<sup><italic>u</italic></sup> &#x0003D; &#x003A3;<sup><italic>g</italic></sup> and &#x003A3;<sup><italic>e</italic></sup> &#x0003D; &#x003A3;<sup><italic>r</italic></sup>, and <italic>Y</italic><sub><italic>f</italic></sub> will have a heritability of 0.2. The secondary traits will have heritability 0.7, and there is no causal effect of (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) on <italic>Y</italic><sub>1</sub>. Genomic prediction for <italic>Y</italic><sub>1</sub> can however still benefit from the genetic correlation between these traits (which is present when &#x003C1;<sub><italic>G</italic></sub> &#x02260; 0). When &#x003BB; &#x02260; 0, the causal effect of (<italic>Y</italic><sub>2</sub> &#x0002B; <italic>Y</italic><sub>3</sub> &#x0002B; <italic>Y</italic><sub>4</sub>) on <italic>Y</italic><sub>1</sub> will introduce additional genetic and residual covariance in &#x003A3;<sup><italic>u</italic></sup> and &#x003A3;<sup><italic>e</italic></sup>.</p>
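<p>The covariance structure of this design can be checked numerically. The sketch below builds the matrices described above for one parameter combination and evaluates the resulting heritabilities; the function names <italic>scenario_covariances</italic> and <italic>heritability</italic> are hypothetical, and the argument lam denotes the structural coefficient &#x003BB;.</p>
<preformat>
```python
import numpy as np

def scenario_covariances(lam, rho_G, rho_E, p=300):
    """Return (Sigma_u, Sigma_e) for the simulation design; trait 0 is Y_f."""
    d = p + 1
    Sigma_g = np.diag(np.r_[0.2, np.full(p, 0.7)])
    Sigma_r = np.diag(np.r_[0.8, np.full(p, 0.3)])
    causal = [1, 2, 3]                      # Y_2, Y_3, Y_4 (0-based indices)
    Sigma_g[0, causal] = Sigma_g[causal, 0] = rho_G * np.sqrt(0.2 * 0.7)
    Sigma_r[0, causal] = Sigma_r[causal, 0] = rho_E * np.sqrt(0.8 * 0.3)
    Lam = np.zeros((d, d))
    Lam[causal, 0] = lam                    # effect of Y_2, Y_3, Y_4 on Y_1
    Gamma = np.linalg.inv(np.eye(d) - Lam)
    return Gamma.T @ Sigma_g @ Gamma, Gamma.T @ Sigma_r @ Gamma

def heritability(Sigma_u, Sigma_e, j):
    """Genetic variance of trait j divided by its total variance."""
    return Sigma_u[j, j] / (Sigma_u[j, j] + Sigma_e[j, j])

# No causal effect: heritabilities equal those of the direct effects.
Su0, Se0 = scenario_covariances(lam=0.0, rho_G=0.5, rho_E=0.0)
h2_target_0 = heritability(Su0, Se0, 0)
h2_secondary_0 = heritability(Su0, Se0, 1)

# Causal effect without correlated direct effects.
Su1, Se1 = scenario_covariances(lam=0.5, rho_G=0.0, rho_E=0.0)
```
</preformat>
<p>In the second call the direct genetic effects are uncorrelated, yet the entries of Su1 and Se1 linking the target trait to the three causal traits are nonzero, which is the additional covariance induced by the causal effect.</p>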
<p>For each of the 45 combinations of &#x003BB;, &#x003C1;<sub><italic>G</italic></sub> and &#x003C1;<sub><italic>E</italic></sub> (5 &#x000D7; 3 &#x000D7; 3) we simulated 50 data-sets; for each of them we predicted the simulated genetic effects for the test set with the different methods.</p>
<sec>
<title>2.8.1. Benchmark</title>
<p>In addition to the methods described above, we evaluate a benchmark prediction, by computing (4) and (5) for the four-dimensional mixed model with <italic>Y</italic><sub>1</sub>&#x02013;<italic>Y</italic><sub>4</sub>, using the true (simulated) variance components.</p>
</sec>
</sec>
<sec>
<title>2.9. Data</title>
<p>To test the methods on real data, we consider four data-sets with various target and secondary phenotypes. To assess accuracy, each data set was randomly split into training (70%) and test (30%) genotypes. This was repeated 160 times, and we report accuracy averaged over the 160 test sets. Because of the required computing time, only 50 test sets were analyzed for RF-BLUP with hyper-parameter optimization (for the Arabidopsis data-sets), and 30 test sets for the maize data (for all methods). With one exception (mentioned below), the target and secondary phenotypes were measured on different plants; therefore, all bivariate mixed models were fitted with diagonal residual covariance (i.e., diagonal &#x003A3;<sup><italic>e</italic></sup> in Equations 4 and 5).</p>
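<p>The repeated random splitting can be sketched as follows. The rounding of the test-set size, the random seed and the function name <italic>random_splits</italic> are assumptions made for illustration; the example uses the 388 maize lines and 30 repetitions mentioned in this section.</p>
<preformat>
```python
import numpy as np

def random_splits(n, n_rep=160, test_frac=0.3, seed=0):
    """Yield (train_idx, test_idx) for n_rep random splits of n genotypes."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * n))
    for _ in range(n_rep):
        perm = rng.permutation(n)
        yield np.sort(perm[n_test:]), np.sort(perm[:n_test])

# 388 genotypes, 30 repetitions (the setting used for the maize data).
splits = list(random_splits(n=388, n_rep=30))
```
</preformat>
<p>Each method would then be fitted on the training indices of a split and evaluated on the corresponding test indices, and the accuracies averaged over the splits.</p>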
<p>The first two data sets were measured on the <italic>A. thaliana</italic> HapMap population, where 36 metabolites from Fusari et al. (<xref ref-type="bibr" rid="B11">2017</xref>) were used as secondary phenotypes and the kinship matrix was estimated based on one million imputed SNPs (Arouisse et al., <xref ref-type="bibr" rid="B2">2020</xref>). Dataset 1 contains three target traits related to biotic and abiotic stress, from Thoen et al. (<xref ref-type="bibr" rid="B28">2017</xref>). In dataset 2, the target is the rosette fresh weight, measured in one of the experiments of Fusari et al. (<xref ref-type="bibr" rid="B11">2017</xref>). This is the only dataset for which the residual covariance is non-diagonal.</p>
<p>In the third data set, we predicted the grain yield, plant height (PH) and flowering time (FT) of 388 inbred maize lines (<italic>Z. mays</italic>), using 5,760 transcripts (Azodi et al., <xref ref-type="bibr" rid="B3">2020</xref>) as secondary traits. In this case, we selected for each data-set a subset of transcripts using the LASSO on the training set, following Azodi et al. (<xref ref-type="bibr" rid="B3">2020</xref>). In other words, the transcripts selected by LS-BLUP were also used for the other methods.</p>
</sec>
<sec>
<title>2.10. Data Availability</title>
<p>The data that support the findings of this study are available at:</p>
<p><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1105/tpc.19.00332">https://doi.org/10.1105/tpc.19.00332</ext-link> (Maize data)</p>
<p><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1105/tpc.17.00232">https://doi.org/10.1105/tpc.17.00232</ext-link> (<italic>A. thaliana</italic> Metabolite data)</p>
<p><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1111/nph.14220">https://doi.org/10.1111/nph.14220</ext-link> (<italic>A. thaliana</italic> Phenotypes)</p>
<p><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1111/tpj.14659">https://doi.org/10.1111/tpj.14659</ext-link> (<italic>A. thaliana</italic> SNP data)</p>
<p>All data-sets (except the maize transcriptomics) are included in an Rdata file available at: <ext-link ext-link-type="uri" xlink:href="https://figshare.com/s/5d01062711ce33bb327e">https://figshare.com/s/5d01062711ce33bb327e</ext-link>.</p>
</sec>
<sec>
<title>2.11. Software and Computing Time</title>
<p>The required computing time is mainly driven by the complexity of fitting either a bivariate mixed model with a single relatedness matrix, or univariate mixed models with either one or two relatedness matrices. For the datasets considered here, each bivariate mixed model took between 20 and 50 s to fit, whereas the univariate mixed models took at most a few seconds. For the complexity as a function of <italic>n</italic> and <italic>p</italic> we refer to Zhou and Stephens (<xref ref-type="bibr" rid="B36">2014</xref>).</p>
<p>R-code for all methods is available at <ext-link ext-link-type="uri" xlink:href="https://figshare.com/s/5d01062711ce33bb327e">https://figshare.com/s/5d01062711ce33bb327e</ext-link>, where we mostly relied on asreml-R (Butler et al., <xref ref-type="bibr" rid="B4">2009</xref>). Several open source alternatives are however available; in particular sommer (Covarrubias-Pazaran, <xref ref-type="bibr" rid="B5">2016</xref>) for bivariate mixed models, and gaston for univariate mixed models. Using gaston&#x00027;s lmm.diago.likelihood function, the (univariate) GBLUP for large numbers of traits can be computed in only a few seconds, which is useful for the GM-BLUP method. For the dimension reduction in LS- and RF-BLUP we used the R-packages glmnet (Friedman et al., <xref ref-type="bibr" rid="B9">2010</xref>), caret (<ext-link ext-link-type="uri" xlink:href="https://cran.r-project.org/package=caret">https://cran.r-project.org/package=caret</ext-link>), and randomForest (Liaw and Wiener, <xref ref-type="bibr" rid="B16">2002</xref>). For the maize data, LASSO and random-forest regression were performed in Python, using the scikit-learn package.</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. Simulations</title>
<p><xref ref-type="fig" rid="F2">Figures 2</xref> and <xref ref-type="fig" rid="F3">3</xref> show the estimated accuracy as a function of &#x003BB;, i.e., the size of the causal effects of <italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, and <italic>Y</italic><sub>4</sub> on the target trait <italic>Y</italic><sub><italic>f</italic></sub> (i.e., <italic>Y</italic><sub>1</sub>). We focus on three cases, with different values for the correlations between the direct genetic effects on <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub>4</sub>, as well as the corresponding residuals (see section 2): (A) &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5 and &#x003C1;<sub><italic>E</italic></sub> &#x0003D; &#x02212;0.5, (B) &#x003C1;<sub><italic>G</italic></sub> &#x0003D; &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0, and (C) &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5 and &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0.5. In scenario 1 (<xref ref-type="fig" rid="F2">Figure 2</xref>) as well as scenario 2 (<xref ref-type="fig" rid="F3">Figure 3</xref>), accuracies are generally higher when &#x003BB; moves away from zero. This is expected, as the total genetic variance and heritability increase due to the causal effect, especially when &#x003C1;<sub><italic>G</italic></sub> and &#x003BB; have the same sign. When they have opposite sign, the lowest accuracy can occur at an intermediate value of &#x003BB; [e.g., at &#x003BB; &#x0003D; &#x02212;0.5 in case (A)].</p>
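<p>The dependence of the total genetic variance on the signs of &#x003BB; and &#x003C1;<sub><italic>G</italic></sub> can be illustrated with a small calculation. The sketch below rests on assumptions that are not taken from the simulation code: the total genetic effect on <italic>Y</italic><sub>1</sub> is written as its direct genetic effect plus &#x003BB; times the sum of the genetic effects on <italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, and <italic>Y</italic><sub>4</sub>; the direct genetic variances are set to 0.2 and 0.7 (unit phenotypic variance at &#x003BB; &#x0003D; 0); and &#x003C1;<sub><italic>G</italic></sub> is a common correlation between all pairs of direct genetic effects. The function name and the grid of &#x003BB; values are hypothetical.</p>
<preformat>
```python
import math


def total_genetic_variance(lam, rho_g, var_f=0.2, var_s=0.7, n_causal=3):
    """Genetic variance of Y1 when Y1 = direct effects + lam * (Y2 + Y3 + Y4).

    Assumes (for illustration only) direct genetic variances var_f for the
    target trait and var_s for each causal secondary trait, with a common
    correlation rho_g between every pair of direct genetic effects.
    """
    cov_fs = rho_g * math.sqrt(var_f * var_s)  # target vs. one secondary trait
    cov_ss = rho_g * var_s                     # between two secondary traits
    var_sum = n_causal * var_s + n_causal * (n_causal - 1) * cov_ss
    # Var(g1 + lam * sum_s g_s) = Var(g1) + 2*lam*Cov(g1, sum) + lam^2*Var(sum)
    return var_f + 2 * lam * n_causal * cov_fs + lam ** 2 * var_sum


lambdas = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical grid of causal effect sizes
for rho_g in (0.0, 0.5):
    values = [round(total_genetic_variance(lam, rho_g), 3) for lam in lambdas]
    print(rho_g, values)
```
</preformat>
<p>Under these assumptions the cross term 2&#x003BB; Cov(<italic>g</italic><sub>1</sub>, <italic>g</italic><sub>2</sub> &#x0002B; <italic>g</italic><sub>3</sub> &#x0002B; <italic>g</italic><sub>4</sub>) adds to the genetic variance when &#x003BB; and &#x003C1;<sub><italic>G</italic></sub> have the same sign and partly cancels it when they have opposite sign. The calculation covers the genetic variance only; accuracy also depends on the residual covariance.</p>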
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Accuracy of genomic prediction methods in scenario 1, which for each value of &#x003BB; is estimated from 50 simulated data-sets (standard errors between 0.011 and 0.042). &#x0201C;GBLUP&#x0201D; is the univariate GBLUP, and the benchmark is the multivariate GBLUP based on <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub>4</sub>, using the true (simulated) values of the variance components (see section 2.8.1). Acronyms of the other methods are given in section 2; they use all secondary traits (<italic>Y</italic><sub>2</sub>, &#x02026;, <italic>Y</italic><sub>301</sub>), without knowledge of (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) being causal. &#x003BB; is the size of the causal effect of (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) on <italic>Y</italic><sub>1</sub>. &#x003C1;<sub><italic>G</italic></sub> is the correlation between the direct genetic effects on <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub>4</sub>; similarly, &#x003C1;<sub><italic>E</italic></sub> is the correlation between the non-genetic effects. The total genetic correlation is a function of &#x003BB; and &#x003C1;<sub><italic>G</italic></sub>. <bold>(A)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; &#x02212;0.5, <bold>(B)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0, and <bold>(C)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0.5.</p></caption>
<graphic xlink:href="fgene-12-667358-g0002.tif"/>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Accuracy of genomic prediction methods in scenario 2, which for each value of &#x003BB; is estimated from 50 simulated data-sets (standard errors between 0.014 and 0.051). &#x0201C;GBLUP&#x0201D; is the univariate GBLUP, and the benchmark is the multivariate GBLUP based on <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub>4</sub>, using the true (simulated) values of the variance components (see section 2.8.1). Acronyms of the other methods are given in section 2; they use all secondary traits (<italic>Y</italic><sub>2</sub>, &#x02026;, <italic>Y</italic><sub>301</sub>), without knowledge of (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) being causal. &#x003BB; is the size of the causal effect of (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) on <italic>Y</italic><sub>1</sub>. &#x003C1;<sub><italic>G</italic></sub> is the correlation between the direct genetic effects on <italic>Y</italic><sub>1</sub>, &#x02026;, <italic>Y</italic><sub>4</sub>; similarly, &#x003C1;<sub><italic>E</italic></sub> is the correlation between the non-genetic effects. The total genetic correlation is a function of &#x003BB; and &#x003C1;<sub><italic>G</italic></sub>. <bold>(A)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; &#x02212;0.5, <bold>(B)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0, and <bold>(C)</bold> &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5, &#x003C1;<sub><italic>E</italic></sub> &#x0003D; 0.5.</p></caption>
<graphic xlink:href="fgene-12-667358-g0003.tif"/>
</fig>
<p>The multi-trait benchmark with perfect information on the genetic and residual covariance between the target trait <italic>Y</italic><sub><italic>f</italic></sub> and secondary traits <italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, and <italic>Y</italic><sub>4</sub> always outperforms univariate GBLUP, except when &#x003C1;<sub><italic>G</italic></sub> &#x0003D; &#x003BB; &#x0003D; 0, in which case accuracies are equal. When &#x003C1;<sub><italic>G</italic></sub> &#x02260; 0, the benchmark always benefits from the genetic correlations between the target trait and the secondary traits, even if the latter do not have a causal effect on <italic>Y</italic><sub><italic>f</italic></sub>.</p>
<p>The accuracy of univariate GBLUP varied between <italic>r</italic> &#x0003D; 0.44 and <italic>r</italic> &#x0003D; 0.70, while the benchmark had accuracies of 0.50&#x02013;0.70 in scenario 1 and 0.50&#x02013;0.92 in scenario 2. The difference between scenario 2 (secondary traits observed on the test set) and scenario 1 (secondary traits only observed on the training set) was larger for large values of |&#x003BB;|. This is because for large |&#x003BB;|, the total genetic correlation (which is also a function of &#x003C1;<sub><italic>G</italic></sub>) between <italic>Y</italic><sub><italic>f</italic></sub> and the causal secondary traits (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, and <italic>Y</italic><sub>4</sub>) is larger.</p>
<p>In the absence of a causal effect <italic>Y</italic><sub><italic>s</italic></sub> &#x02192; <italic>Y</italic><sub><italic>f</italic></sub> (&#x003BB; &#x0003D; 0), and with genetic and residual correlations having opposite sign (case A), our simulation setup appeared to be too challenging, and none of the methods performed better than univariate GBLUP. Something similar occurred in case C, for &#x003BB; &#x0003D; &#x02212;0.5. On the positive side, for large values of |&#x003BB;|, both SI-BLUP and LS-BLUP have near-benchmark accuracy, where the latter did not rely on plot-level observations. In scenario 2, RF-BLUP appeared to be an interesting alternative, with somewhat lower accuracy at the extreme values of &#x003BB;, but relatively good performance at unfavorable values of &#x003BB;.</p>
<p>Prediction based on the secondary traits only (M-BLUP; only available in scenario 2) is generally one of the least successful. The multi-kernel methods (Multi-BLUP and GM-BLUP) are somewhere in between, GM-BLUP often having an accuracy similar to that of RF-BLUP. GM-BLUP appears to be slightly better than Multi-BLUP, but in most cases the difference is smaller than the standard errors of the accuracy estimates.</p>
</sec>
<sec>
<title>3.2. Arabidopsis and Maize Data</title>
<p><xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref> contain the accuracies for datasets 1&#x02013;4 described above, averaged over randomly sampled test sets (see section 2). Because the original individual plant (or plot) data were not available, we could not compute the SI-BLUP here.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Prediction accuracy in scenario 1, for various target and secondary traits in Maize and Arabidopsis.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Data sets</bold></th>
<th valign="top" align="left"><bold>Target trait</bold></th>
<th valign="top" align="left"><bold>Secondary phenotypes</bold></th>
<th valign="top" align="center"><bold>GBLUP</bold></th>
<th valign="top" align="center"><bold>GM-BLUP</bold></th>
<th valign="top" align="center"><bold>LS-BLUP</bold></th>
<th valign="top" align="center"><bold>RF-BLUP</bold></th>
<th valign="top" align="center"><bold>RF-BLUP*</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Number of spreading lesions</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">0.22</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">0.21</td>
<td valign="top" align="center">0.21</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under fungus stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Fresh weight of the rosette</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">0.00</td>
<td valign="top" align="center">0.07</td>
<td valign="top" align="center">0.09</td>
<td valign="top" align="center">0.09</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under Salt_5 stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Number of spreading lesions</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.19</td>
<td valign="top" align="center">0.18</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.15</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under Drought_and_fungus stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Number of damaged leaves and</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.09</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.10</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">feeding sites under Caterpillar_3 stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Fresh weight</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.29</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.30</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Flowering time (FT) [4]</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.55</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Plant height (PH)</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.51</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Yield</td>
<td valign="top" align="left">Transcripts &#x0002B; FT&#x0002B;PH</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.52</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Yield</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Acronyms of the methods are as in <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>. For RF-BLUP*, we used the randomForest package with the default settings; for RF-BLUP, hyper-parameters were optimized using the caret package (data-sets 1 and 2) or scikit-learn (data-set 3). For data-sets 1 and 2, reported accuracies are averages over 160 test sets (standard errors between 0.006 and 0.007), except for RF-BLUP, where 50 sets were used (SE between 0.010 and 0.014). In dataset 3, 30 test sets were used for all methods (SE between 0.006 and 0.03)</italic>.</p>
</table-wrap-foot>
</table-wrap>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Prediction accuracy in scenario 2, for various target and secondary traits in Maize and Arabidopsis.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Data sets</bold></th>
<th valign="top" align="left"><bold>Target trait</bold></th>
<th valign="top" align="left"><bold>Secondary phenotypes</bold></th>
<th valign="top" align="center"><bold>GBLUP</bold></th>
<th valign="top" align="center"><bold>M-BLUP</bold></th>
<th valign="top" align="center"><bold>Multi-BLUP</bold></th>
<th valign="top" align="center"><bold>GM-BLUP</bold></th>
<th valign="top" align="center"><bold>LS-BLUP</bold></th>
<th valign="top" align="center"><bold>RF-BLUP</bold></th>
<th valign="top" align="center"><bold>RF-BLUP*</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Number of spreading lesions</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">&#x02212;0.04</td>
<td valign="top" align="center">0.21</td>
<td valign="top" align="center">0.22</td>
<td valign="top" align="center">0.31</td>
<td valign="top" align="center">0.28</td>
<td valign="top" align="center">0.28</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under fungus stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Fresh weight of the rosette</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.03</td>
<td valign="top" align="center">0.09</td>
<td valign="top" align="center">0.08</td>
<td valign="top" align="center">0.07</td>
<td valign="top" align="center">0.23</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">0.19</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under Salt_5 stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Number of spreading lesions</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.19</td>
<td valign="top" align="center">&#x02212;0.02</td>
<td valign="top" align="center">0.16</td>
<td valign="top" align="center">0.17</td>
<td valign="top" align="center">0.27</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">0.23</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">under Drought_and_fungus stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td/>
<td valign="top" align="left">Number of damaged leaves and</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.05</td>
<td valign="top" align="center">0.06</td>
<td valign="top" align="center">0.07</td>
<td valign="top" align="center">0.14</td>
<td valign="top" align="center">0.12</td>
<td valign="top" align="center">0.11</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">feeding sites under Caterpillar_3 stress</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Fresh weight</td>
<td valign="top" align="left">Metabolites</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.00</td>
<td valign="top" align="center">0.29</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.32</td>
<td valign="top" align="center">0.30</td>
<td valign="top" align="center">0.28</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Flowering time (FT) [4]</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.66</td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">0.54</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Plant height (PH)</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.66</td>
<td valign="top" align="center">0.64</td>
<td valign="top" align="center">0.53</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Yield</td>
<td valign="top" align="left">Transcripts &#x0002B; FT&#x0002B;PH</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.49</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.72</td>
<td valign="top" align="center">0.71</td>
<td valign="top" align="center">0.49</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Yield</td>
<td valign="top" align="left">Transcripts</td>
<td valign="top" align="center">0.55</td>
<td valign="top" align="center">0.52</td>
<td valign="top" align="center">0.53</td>
<td valign="top" align="center">0.54</td>
<td valign="top" align="center">0.64</td>
<td valign="top" align="center">0.65</td>
<td valign="top" align="center">0.51</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Acronyms of the methods are as in <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>. For RF-BLUP*, we used the randomForest package with the default settings; for RF-BLUP, hyper-parameters were optimized using the caret package (data-sets 1 and 2) or scikit-learn (data-set 3). For data-sets 1 and 2, reported accuracies are averages over 160 test sets (standard errors between 0.006 and 0.012), except for RF-BLUP, where 50 sets were used (SE between 0.010 and 0.014). In dataset 3, 30 test sets were used for all methods (SE between 0.006 and 0.03)</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>In scenario 1 (<xref ref-type="table" rid="T1">Table 1</xref>), none of the multi-trait methods performed consistently better than univariate GBLUP. For the second trait in data-set 1 (Salt_5), RF-BLUP had accuracy 0.09, vs. 0.03 for univariate GBLUP; the latter had the highest accuracy for the first and third trait in data-set 1 (fungus, and drought and fungus stress combined).</p>
<p>In the remainder of this section we focus on scenario 2 (<xref ref-type="table" rid="T2">Table 2</xref>), in which there were more substantial differences among methods. For all datasets, methods based on multiple relatedness matrices (Multi-BLUP and GM-BLUP) had accuracies similar to single-trait GBLUP. As in the simulations, GM-BLUP gave only a minor (if any) improvement over Multi-BLUP. The approaches based on dimension reduction of the secondary traits (LS-BLUP and RF-BLUP) appeared to give a substantial improvement over univariate GBLUP, e.g., from <italic>r</italic> &#x0003D; 0.03 to <italic>r</italic> &#x0003D; 0.23 (LS-BLUP) for the Salt_5 trait in data-set 1, or from <italic>r</italic> &#x0003D; 0.55 to <italic>r</italic> &#x0003D; 0.65 (RF-BLUP) for maize yield in data-set 3, with transcriptomics as secondary traits.</p>
<p>LS-BLUP had the highest accuracy in all Arabidopsis datasets, with a small but consistent improvement over RF-BLUP (0.02&#x02013;0.03 higher), also when optimized with the caret/scikit-learn packages. This hyperparameter optimization appeared to be rather important for the Maize data; using the default settings from the randomForest package (as in the simulations), accuracy was considerably lower (for yield and the transcripts for example, <italic>r</italic> &#x0003D; 0.65 vs. <italic>r</italic> &#x0003D; 0.51).</p>
<p>For the maize data, RF-BLUP and LS-BLUP improved accuracy for yield from around 0.64&#x02013;0.65 to 0.71&#x02013;0.72 when plant height and flowering time were included as secondary phenotypes, together with the transcriptome data. None of the other methods could exploit the additional data, and accuracies were similar to those obtained with the transcripts alone. Prediction based on the secondary traits alone (M-BLUP) had around zero accuracy in all Arabidopsis data-sets, but <italic>r</italic> &#x0003D; 0.49&#x02013;0.54 for the maize data, similar to GBLUP and Multi-BLUP.</p>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>Given the importance of genomic selection in plant breeding and the rapid development of phenotyping technology, it becomes increasingly important to know if and how the availability of additional phenotypic traits can improve prediction accuracy for a target trait. Here we proposed new methods to incorporate large numbers of such additional traits in genomic prediction, and compared these to existing methods, in simulated and real data. In many of the simulated data-sets, some of our methods indeed greatly improved on univariate genomic prediction. In these cases, the accuracy was often close to that of penalized selection indices, without requiring plot-level data. In other cases, none of the methods did much better than univariate prediction, while the multi-trait benchmark indicated that there is in fact scope for improvement. This happens especially when genetic and residual correlations have opposite sign. Moreover, our study indicates that current methods do not perform well when the secondary traits are available only on the training set (i.e., in scenario 1): while there was some improvement in many of the simulations, accuracy in scenario 1 was hardly improved for any of the real data-sets.</p>
<p>While scenario 1 is probably most common, scenario 2 (secondary traits being also observed for the test set) may arise in a number of applications. In particular, it has become increasingly common to screen large collections for metabolites or other types of -omics data, and scenario 2 may also arise in a biomedical context when biomarkers could be used to predict disease. Our results for various stress traits in Arabidopsis showed that metabolites can indeed improve accuracy, even if they were measured in a different study. While Multi-BLUP and the LS- and RF-BLUP require balanced data, the GM-BLUP is more flexible, and can also handle an intermediate scenario where only some of the secondary traits are measured for all (or some of) the test genotypes.</p>
<p>Except SI-BLUP, all methods implicitly assume a causal relationship between the secondary traits and the target trait. In our simulations, accuracy was indeed suboptimal when this relationship was weak or absent. However, in these cases the SI-BLUP often performed poorly as well. The accuracy of LS-BLUP and RF-BLUP may be improved if one could successfully address the following two artifacts. First, the dimension reduction and genomic prediction should ideally be carried out on different subsets of the training set. In the populations we considered here, this however led to poor estimation of variance components and lower accuracies, because of the relatively small population size. We therefore used the whole training set for both dimension reduction and genomic prediction. The advantage of a larger training set seems to outweigh the incurred overfitting, but this may be different for larger populations, in which case sub-sampling strategies like bootstrap aggregation (bagging) might be useful. Second, specifically for LS-BLUP, the cross-validation in the first (dimension reduction) step appears to select too many variables. Often, this may still result in an accurate prediction &#x00176;<sub><italic>s</italic></sub> on the training set, but for the prediction of breeding values on the test set that leads to overfitting. The methodology implemented in the hdi-package (Dezeure et al., <xref ref-type="bibr" rid="B6">2015</xref>) might resolve this issue, by first assessing significance of secondary traits. Such improvements should at least guarantee an accuracy that is never (much) below that of univariate GBLUP. Finally, a remaining limitation of RF-BLUP and LS-BLUP is that the dimension reduction relies on phenotypic rather than genetic values, which is likely to stay sub-optimal in case genetic and residual correlations have opposite sign.</p>
<p>We attempted to improve existing multi-kernel methods with our GM-BLUP approach, replacing secondary traits by their genomic predictions. Unfortunately, this led to only minor improvements. In case secondary traits have high heritability, there is little shrinkage and genomic predictions and trait values are highly correlated, leading to similar accuracies. In case secondary traits have lower heritabilities, the methods may potentially differ more, but at the same time, in such a scenario there is much less scope for improvement with multi-trait methods in the first place. Both Multi-BLUP and GM-BLUP were often less accurate than competing methods. To some extent this may be explained by the absence of variable selection, or, compared to RF-BLUP, the assumed linearity. Nonetheless, GM-BLUP extended the use of Multi-BLUP to scenario 1, without ever being less accurate.</p>
<p>For the case of a single secondary trait, Runcie and Cheng (<xref ref-type="bibr" rid="B23">2019</xref>) studied the bias in accuracy estimates, when these are based on the correlation with the observed phenotype, rather than with the (unobserved) genetic effect. This can become problematic when traits are measured on the same plants, in which case the amount of bias is likely to vary among methods, in particular when residual correlations between the target and secondary traits are large. For the Arabidopsis and maize data considered here, the bias should be constant, as all target and secondary traits were measured on different plants. No bias occurred for the simulated data, where we used the true genetic values to assess accuracy. Nevertheless, further work is needed to extend the methods presented here with reliable estimates of accuracy, also in the case of traits measured on the same plants. For the LS-BLUP, RF-BLUP and SI-BLUP, the parametric and semi-parametric accuracy estimates of Runcie and Cheng (<xref ref-type="bibr" rid="B23">2019</xref>) can in principle be computed, since all these methods reduce the dimension of the secondary traits to one. This would however require the sample-splitting or bagging schemes mentioned above, and it is an open question how the different accuracy estimates should be aggregated.</p>
<p>Statistical methods for high-dimensional data often benefit from initial screening, for example by removing variables with very low marginal correlation (see e.g., Fan and Lv, <xref ref-type="bibr" rid="B8">2008</xref>). In the present context, screening should be based on heritability and genetic correlation with the target trait. This is, however, difficult for several reasons. First, as pointed out before, reliable estimates of these correlations require plot-level data, at least for the population sizes considered here. Moreover, bivariate mixed models need to be fitted for each secondary trait, increasing computation time. A more fundamental problem is that even if accurate estimates were available, it would be difficult to formulate an appropriate criterion and threshold. The well-known criterion for a single secondary trait (whose heritability times the squared genetic correlation with the target trait should exceed the heritability of the latter) cannot be generalized directly to multiple secondary traits. For example, in one of our simulation settings (i.e., with &#x003BB; &#x0003D; 0 and &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5), each of the three relevant secondary traits (<italic>Y</italic><sub>2</sub>, <italic>Y</italic><sub>3</sub>, <italic>Y</italic><sub>4</sub>) has heritability 0.7, the heritability of the target trait being 0.2.
Consequently, we have <inline-formula><mml:math id="M67"><mml:mn>0.7</mml:mn><mml:mo>&#x000D7;</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C1;</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0003C;</mml:mo><mml:mn>0.2</mml:mn></mml:math></inline-formula> for each secondary trait individually, while at the same time genomic prediction using a mixed model for <italic>Y</italic><sub>1</sub> to <italic>Y</italic><sub>4</sub> jointly is more accurate than with a mixed model for <italic>Y</italic><sub>1</sub> alone.</p>
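<p>The arithmetic behind this example can be written out using only the values stated above, assuming that &#x003C1;<sub><italic>G</italic></sub> &#x0003D; 0.5 is the genetic correlation of each relevant secondary trait with the target trait: <disp-formula><mml:math><mml:mn>0.7</mml:mn><mml:mo>&#x000D7;</mml:mo><mml:msup><mml:mn>0.5</mml:mn><mml:mn>2</mml:mn></mml:msup><mml:mo>&#x0003D;</mml:mo><mml:mn>0.7</mml:mn><mml:mo>&#x000D7;</mml:mo><mml:mn>0.25</mml:mn><mml:mo>&#x0003D;</mml:mo><mml:mn>0.175</mml:mn><mml:mo>&#x0003C;</mml:mo><mml:mn>0.2</mml:mn></mml:math></disp-formula> Each of the three traits therefore narrowly fails the single-trait criterion when considered on its own, so a screening step applying that criterion trait by trait would discard all of them.</p>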
<p>More generally, the methods presented here could be extended in several ways. First, for all of them, prediction relies on GBLUP: either bivariate GBLUP, or univariate GBLUP extended with additional relatedness matrices. This corresponds to a Gaussian prior on the marker effects, which could be generalized to a mixture of Gaussians and a point mass at 0, as for example in Bayes-R (Moser et al., <xref ref-type="bibr" rid="B21">2015</xref>). Another extension would be the prediction of sensitivities to environmental covariates, which could then be used to predict performance in new environments, as in Millet et al. (<xref ref-type="bibr" rid="B20">2019</xref>). In the LS- and RF-BLUP methods, a wider range of prediction methods could be considered to achieve the dimension reduction, such as elastic nets or gradient tree boosting. Ideally, this reduction is driven by genetic rather than phenotypic effects, and the dimension should not necessarily be reduced to one (as we did here), but to a data-driven number. Finally, it would be of interest to relax the linearity assumption on which most methods (except RF-BLUP) rely. Deep learning with feedforward or convolutional neural networks seems of particular interest here, especially for the relationship between target and secondary traits.</p>
</sec>
<sec id="s5">
<title>Author Contributions</title>
<p>BA performed the research. WK, BA, and FE designed the research. BA and WK wrote the paper, with input from TT and FE. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Araus</surname> <given-names>J. L.</given-names></name> <name><surname>Kefauver</surname> <given-names>S. C.</given-names></name> <name><surname>Zaman-Allah</surname> <given-names>M.</given-names></name> <name><surname>Olsen</surname> <given-names>M. S.</given-names></name> <name><surname>Cairns</surname> <given-names>J. E.</given-names></name></person-group> (<year>2018</year>). <article-title>Translating high-throughput phenotyping into genetic gain</article-title>. <source>Trends Plant Sci</source>. <volume>23</volume>, <fpage>451</fpage>&#x02013;<lpage>466</lpage>. <pub-id pub-id-type="doi">10.1016/j.tplants.2018.02.001</pub-id><pub-id pub-id-type="pmid">29555431</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arouisse</surname> <given-names>B.</given-names></name> <name><surname>Korte</surname> <given-names>A.</given-names></name> <name><surname>van Eeuwijk</surname> <given-names>F.</given-names></name> <name><surname>Kruijer</surname> <given-names>W.</given-names></name></person-group> (<year>2020</year>). <article-title>Imputation of 3 million SNPs in the Arabidopsis regional mapping population</article-title>. <source>Plant J</source>. <volume>102</volume>, <fpage>872</fpage>&#x02013;<lpage>882</lpage>. <pub-id pub-id-type="doi">10.1111/tpj.14659</pub-id><pub-id pub-id-type="pmid">31856318</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Azodi</surname> <given-names>C. B.</given-names></name> <name><surname>Pardo</surname> <given-names>J.</given-names></name> <name><surname>VanBuren</surname> <given-names>R.</given-names></name> <name><surname>de los Campos</surname> <given-names>G.</given-names></name> <name><surname>Shiu</surname> <given-names>S.-H.</given-names></name></person-group> (<year>2020</year>). <article-title>Transcriptome-based prediction of complex traits in maize</article-title>. <source>Plant Cell</source> <volume>32</volume>, <fpage>139</fpage>&#x02013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.19.00332</pub-id><pub-id pub-id-type="pmid">31641024</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Butler</surname> <given-names>D. G.</given-names></name> <name><surname>Cullis</surname> <given-names>B. R.</given-names></name> <name><surname>Gilmour</surname> <given-names>A. R.</given-names></name> <name><surname>Gogel</surname> <given-names>B. J.</given-names></name></person-group> (<year>2009</year>). <article-title>ASReml-R reference manual</article-title>. <source>Release 3.0. Technical Report</source>, Queensland Department of Primary Industries, Australia.</citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Covarrubias-Pazaran</surname> <given-names>G.</given-names></name></person-group> (<year>2016</year>). <article-title>Genome-assisted prediction of quantitative traits using the R package sommer</article-title>. <source>PLoS ONE</source> <volume>11</volume>:<fpage>e0156744</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0156744</pub-id><pub-id pub-id-type="pmid">27271781</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dezeure</surname> <given-names>R.</given-names></name> <name><surname>B&#x000FC;hlmann</surname> <given-names>P.</given-names></name> <name><surname>Meier</surname> <given-names>L.</given-names></name> <name><surname>Meinshausen</surname> <given-names>N.</given-names></name></person-group> (<year>2015</year>). <article-title>High-dimensional inference: Confidence intervals, <italic>p</italic>-values and R-software HDI</article-title>. <source>Stat. Sci</source>. <volume>30</volume>, <fpage>533</fpage>&#x02013;<lpage>558</lpage>. <pub-id pub-id-type="doi">10.1214/15-STS527</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Falconer</surname> <given-names>D. S.</given-names></name> <name><surname>Mackay</surname> <given-names>T. F. C.</given-names></name></person-group> (<year>1996</year>). <source>Introduction to Quantitative Genetics, 4th Edn</source>. <publisher-loc>Harlow</publisher-loc>: <publisher-name>Prentice Hall</publisher-name>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.worldcat.org/title/introduction-to-quantitative-genetics/oclc/422852955">https://www.worldcat.org/title/introduction-to-quantitative-genetics/oclc/422852955</ext-link></citation></ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fan</surname> <given-names>J.</given-names></name> <name><surname>Lv</surname> <given-names>J.</given-names></name></person-group> (<year>2008</year>). <article-title>Sure independence screening for ultrahigh dimensional feature space</article-title>. <source>J. R. Stat. Soc. Ser. B</source> <volume>70</volume>, <fpage>849</fpage>&#x02013;<lpage>911</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9868.2008.00674.x</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friedman</surname> <given-names>J.</given-names></name> <name><surname>Hastie</surname> <given-names>T.</given-names></name> <name><surname>Tibshirani</surname> <given-names>R.</given-names></name></person-group> (<year>2010</year>). <article-title>Regularization paths for generalized linear models via coordinate descent</article-title>. <source>J. Stat. Softw</source>. <volume>33</volume>, <fpage>1</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.18637/jss.v033.i01</pub-id><pub-id pub-id-type="pmid">20808728</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fu</surname> <given-names>J.</given-names></name> <name><surname>Falke</surname> <given-names>K. C.</given-names></name> <name><surname>Thiemann</surname> <given-names>A.</given-names></name> <name><surname>Schrag</surname> <given-names>T. A.</given-names></name> <name><surname>Melchinger</surname> <given-names>A. E.</given-names></name> <name><surname>Scholten</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Partial least squares regression, support vector machine regression, and transcriptome-based distances for prediction of maize hybrid performance with gene expression data</article-title>. <source>Theor. Appl. Genet</source>. <volume>124</volume>, <fpage>825</fpage>&#x02013;<lpage>833</lpage>. <pub-id pub-id-type="doi">10.1007/s00122-011-1747-9</pub-id><pub-id pub-id-type="pmid">22101908</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fusari</surname> <given-names>C. M.</given-names></name> <name><surname>Kooke</surname> <given-names>R.</given-names></name> <name><surname>Lauxmann</surname> <given-names>M. A.</given-names></name> <name><surname>Annunziata</surname> <given-names>M. G.</given-names></name> <name><surname>Enke</surname> <given-names>B.</given-names></name> <name><surname>Hoehne</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Genome-wide association mapping reveals that specific and pleiotropic regulatory mechanisms fine-tune central metabolism and growth in Arabidopsis</article-title>. <source>Plant Cell</source> <volume>29</volume>, <fpage>2349</fpage>&#x02013;<lpage>2373</lpage>. <pub-id pub-id-type="doi">10.1105/tpc.17.00232</pub-id><pub-id pub-id-type="pmid">28954812</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gianola</surname> <given-names>D.</given-names></name> <name><surname>Sorensen</surname> <given-names>D.</given-names></name></person-group> (<year>2004</year>). <article-title>Quantitative genetic models for describing simultaneous and recursive relationships between phenotypes</article-title>. <source>Genetics</source> <volume>167</volume>, <fpage>1407</fpage>&#x02013;<lpage>1424</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.103.025734</pub-id><pub-id pub-id-type="pmid">15280252</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grotzinger</surname> <given-names>A. D.</given-names></name> <name><surname>Rhemtulla</surname> <given-names>M.</given-names></name> <name><surname>de Vlaming</surname> <given-names>R.</given-names></name> <name><surname>Ritchie</surname> <given-names>S. J.</given-names></name> <name><surname>Mallard</surname> <given-names>T. T.</given-names></name> <name><surname>Hill</surname> <given-names>W. D.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits</article-title>. <source>Nat. Hum. Behav</source>. <volume>3</volume>, <fpage>513</fpage>&#x02013;<lpage>525</lpage>. <pub-id pub-id-type="doi">10.1038/s41562-019-0566-x</pub-id><pub-id pub-id-type="pmid">30962613</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Horton</surname> <given-names>M. W.</given-names></name> <name><surname>Hancock</surname> <given-names>A. M.</given-names></name> <name><surname>Huang</surname> <given-names>Y. S.</given-names></name> <name><surname>Toomajian</surname> <given-names>C.</given-names></name> <name><surname>Atwell</surname> <given-names>S.</given-names></name> <name><surname>Auton</surname> <given-names>A.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel</article-title>. <source>Nat. Genet</source>. <volume>44</volume>, <fpage>212</fpage>&#x02013;<lpage>216</lpage>. <pub-id pub-id-type="doi">10.1038/ng.1042</pub-id><pub-id pub-id-type="pmid">22231484</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kruijer</surname> <given-names>W.</given-names></name> <name><surname>Behrouzi</surname> <given-names>P.</given-names></name> <name><surname>Bustos-Korts</surname> <given-names>D.</given-names></name> <name><surname>Rodr&#x000ED;guez-&#x000C1;lvarez</surname> <given-names>M. X.</given-names></name> <name><surname>Mahmoudi</surname> <given-names>S. M.</given-names></name> <name><surname>Yandell</surname> <given-names>B.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Reconstruction of networks with direct and indirect genetic effects</article-title>. <source>Genetics</source> <volume>214</volume>, <fpage>781</fpage>&#x02013;<lpage>807</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.119.302949</pub-id><pub-id pub-id-type="pmid">32015018</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Liaw</surname> <given-names>A.</given-names></name> <name><surname>Wiener</surname> <given-names>M.</given-names></name></person-group> (<year>2002</year>). <article-title>Classification and regression by randomForest</article-title>. <source>R News</source> <volume>2</volume>, <fpage>18</fpage>&#x02013;<lpage>22</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.researchgate.net/publication/228451484_Classification_and_Regression_by_RandomForest">https://www.researchgate.net/publication/228451484_Classification_and_Regression_by_RandomForest</ext-link></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lopez-Cruz</surname> <given-names>M.</given-names></name> <name><surname>Olson</surname> <given-names>E.</given-names></name> <name><surname>Rovere</surname> <given-names>G.</given-names></name> <name><surname>Crossa</surname> <given-names>J.</given-names></name> <name><surname>Dreisigacker</surname> <given-names>S.</given-names></name> <name><surname>Mondal</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Regularized selection indices for breeding value prediction using hyper-spectral image data</article-title>. <source>Sci. Rep</source>. <volume>10</volume>, <fpage>1</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-020-65011-2</pub-id><pub-id pub-id-type="pmid">32424224</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="thesis"><person-group person-group-type="author"><name><surname>Melandri</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <source>Understanding drought tolerance in rice by the dissection and genetic analysis of leaf metabolism, oxidative stress status and stomatal behavior</source> (Ph.D. thesis). <publisher-name>Wageningen University</publisher-name>, <publisher-loc>Wageningen, Netherlands</publisher-loc>.</citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meuwissen</surname> <given-names>T. H. E.</given-names></name> <name><surname>Hayes</surname> <given-names>B. J.</given-names></name> <name><surname>Goddard</surname> <given-names>M. E.</given-names></name></person-group> (<year>2001</year>). <article-title>Prediction of total genetic value using genome-wide dense marker maps</article-title>. <source>Genetics</source> <volume>157</volume>, <fpage>1819</fpage>&#x02013;<lpage>1829</lpage>. <pub-id pub-id-type="doi">10.1093/genetics/157.4.1819</pub-id><pub-id pub-id-type="pmid">11290733</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Millet</surname> <given-names>E. J.</given-names></name> <name><surname>Kruijer</surname> <given-names>W.</given-names></name> <name><surname>Coupel-Ledru</surname> <given-names>A.</given-names></name> <name><surname>Alvarez Prado</surname> <given-names>S.</given-names></name> <name><surname>Cabrera-Bosquet</surname> <given-names>L.</given-names></name> <name><surname>Lacube</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Genomic prediction of maize yield across European environmental conditions</article-title>. <source>Nat. Genet</source>. <volume>51</volume>, <fpage>952</fpage>&#x02013;<lpage>956</lpage>. <pub-id pub-id-type="doi">10.1038/s41588-019-0414-y</pub-id><pub-id pub-id-type="pmid">31110353</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moser</surname> <given-names>G.</given-names></name> <name><surname>Lee</surname> <given-names>S. H.</given-names></name> <name><surname>Hayes</surname> <given-names>B. J.</given-names></name> <name><surname>Goddard</surname> <given-names>M. E.</given-names></name> <name><surname>Wray</surname> <given-names>N. R.</given-names></name> <name><surname>Visscher</surname> <given-names>P. M.</given-names></name></person-group> (<year>2015</year>). <article-title>Simultaneous discovery, estimation and prediction analysis of complex traits using a Bayesian mixture model</article-title>. <source>PLoS Genet</source>. <volume>11</volume>:<fpage>e1004969</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1004969</pub-id><pub-id pub-id-type="pmid">25849665</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Riedelsheimer</surname> <given-names>C.</given-names></name> <name><surname>Czedik-Eysenberg</surname> <given-names>A.</given-names></name> <name><surname>Grieder</surname> <given-names>C.</given-names></name> <name><surname>Lisec</surname> <given-names>J.</given-names></name> <name><surname>Technow</surname> <given-names>F.</given-names></name> <name><surname>Sulpice</surname> <given-names>R.</given-names></name> <etal/></person-group>. (<year>2012</year>). <article-title>Genomic and metabolic prediction of complex heterotic traits in hybrid maize</article-title>. <source>Nat. Genet</source>. <volume>44</volume>, <fpage>217</fpage>&#x02013;<lpage>220</lpage>. <pub-id pub-id-type="doi">10.1038/ng.1033</pub-id><pub-id pub-id-type="pmid">22246502</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Runcie</surname> <given-names>D.</given-names></name> <name><surname>Cheng</surname> <given-names>H.</given-names></name></person-group> (<year>2019</year>). <article-title>Pitfalls and remedies for cross validation with multi-trait genomic prediction methods</article-title>. <source>G3</source> <volume>9</volume>, <fpage>3727</fpage>&#x02013;<lpage>3741</lpage>. <pub-id pub-id-type="doi">10.1534/g3.119.400598</pub-id><pub-id pub-id-type="pmid">31511297</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schrag</surname> <given-names>T. A.</given-names></name> <name><surname>Westhues</surname> <given-names>M.</given-names></name> <name><surname>Schipprack</surname> <given-names>W.</given-names></name> <name><surname>Seifert</surname> <given-names>F.</given-names></name> <name><surname>Thiemann</surname> <given-names>A.</given-names></name> <name><surname>Scholten</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2018</year>). <article-title>Beyond genomic prediction: combining different types of omics data can improve prediction of hybrid performance in maize</article-title>. <source>Genetics</source> <volume>208</volume>, <fpage>1373</fpage>&#x02013;<lpage>1385</lpage>. <pub-id pub-id-type="doi">10.1534/genetics.117.300374</pub-id><pub-id pub-id-type="pmid">29363551</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schulthess</surname> <given-names>A. W.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Miedaner</surname> <given-names>T.</given-names></name> <name><surname>Wilde</surname> <given-names>P.</given-names></name> <name><surname>Reif</surname> <given-names>J. C.</given-names></name> <name><surname>Zhao</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>Multiple-trait- and selection indices-genomic predictions for grain yield and protein content in rye for feeding purposes</article-title>. <source>Theor. Appl. Genet</source>. <volume>129</volume>, <fpage>273</fpage>&#x02013;<lpage>287</lpage>. <pub-id pub-id-type="doi">10.1007/s00122-015-2626-6</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Speed</surname> <given-names>D.</given-names></name> <name><surname>Balding</surname> <given-names>D. J.</given-names></name></person-group> (<year>2014</year>). <article-title>MultiBLUP: improved SNP-based prediction for complex traits</article-title>. <source>Genome Res</source>. <volume>24</volume>, <fpage>1550</fpage>&#x02013;<lpage>1557</lpage>. <pub-id pub-id-type="doi">10.1101/gr.169375.113</pub-id><pub-id pub-id-type="pmid">24963154</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>J.</given-names></name> <name><surname>Poland</surname> <given-names>J. A.</given-names></name> <name><surname>Mondal</surname> <given-names>S.</given-names></name> <name><surname>Crossa</surname> <given-names>J.</given-names></name> <name><surname>Juliana</surname> <given-names>P.</given-names></name> <name><surname>Singh</surname> <given-names>R. P.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>High-throughput phenotyping platforms enhance genomic selection for wheat grain yield across populations and cycles in early stage</article-title>. <source>Theor. Appl. Genet</source>. <volume>132</volume>, <fpage>1705</fpage>&#x02013;<lpage>1720</lpage>. <pub-id pub-id-type="doi">10.1007/s00122-019-03309-0</pub-id><pub-id pub-id-type="pmid">30778634</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thoen</surname> <given-names>M. P. M.</given-names></name> <name><surname>Davila Olivas</surname> <given-names>N. H.</given-names></name> <name><surname>Kloth</surname> <given-names>K. J.</given-names></name> <name><surname>Coolen</surname> <given-names>S.</given-names></name> <name><surname>Huang</surname> <given-names>P.-P.</given-names></name> <name><surname>Aarts</surname> <given-names>M. G. M.</given-names></name> <etal/></person-group>. (<year>2017</year>). <article-title>Genetic architecture of plant stress resistance: multi-trait genome-wide association mapping</article-title>. <source>New Phytol</source>. <volume>213</volume>, <fpage>1346</fpage>&#x02013;<lpage>1362</lpage>. <pub-id pub-id-type="doi">10.1111/nph.14220</pub-id><pub-id pub-id-type="pmid">27699793</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>T&#x000F6;pner</surname> <given-names>K.</given-names></name> <name><surname>Rosa</surname> <given-names>G. J. M.</given-names></name> <name><surname>Gianola</surname> <given-names>D.</given-names></name> <name><surname>Sch&#x000F6;n</surname> <given-names>C.-C.</given-names></name></person-group> (<year>2017</year>). <article-title>Bayesian networks illustrate genomic and residual trait connections in maize (<italic>Zea mays</italic> L.)</article-title>. <source>G3</source> <volume>7</volume>, <fpage>2779</fpage>&#x02013;<lpage>2789</lpage>. <pub-id pub-id-type="doi">10.1534/g3.117.044263</pub-id><pub-id pub-id-type="pmid">28637811</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van De Wiel</surname> <given-names>M. A.</given-names></name> <name><surname>Lien</surname> <given-names>T. G.</given-names></name> <name><surname>Verlaat</surname> <given-names>W.</given-names></name> <name><surname>van Wieringen</surname> <given-names>W. N.</given-names></name> <name><surname>Wilting</surname> <given-names>S. M.</given-names></name></person-group> (<year>2016</year>). <article-title>Better prediction by use of co-data: adaptive group-regularized ridge regression</article-title>. <source>Stat. Med</source>. <volume>35</volume>, <fpage>368</fpage>&#x02013;<lpage>381</lpage>. <pub-id pub-id-type="doi">10.1002/sim.6732</pub-id><pub-id pub-id-type="pmid">26365903</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Heerwaarden</surname> <given-names>J.</given-names></name> <name><surname>van Zanten</surname> <given-names>M.</given-names></name> <name><surname>Kruijer</surname> <given-names>W.</given-names></name></person-group> (<year>2015</year>). <article-title>Genome-wide association analysis of adaptation using environmentally predicted traits</article-title>. <source>PLoS Genet</source>. <volume>11</volume>:<fpage>e1005594</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1005594</pub-id><pub-id pub-id-type="pmid">26496492</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Velazco</surname> <given-names>J. G.</given-names></name> <name><surname>Jordan</surname> <given-names>D. R.</given-names></name> <name><surname>Mace</surname> <given-names>E. S.</given-names></name> <name><surname>Hunt</surname> <given-names>C. H.</given-names></name> <name><surname>Malosetti</surname> <given-names>M.</given-names></name> <name><surname>van Eeuwijk</surname> <given-names>F. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Genomic prediction of grain yield and drought-adaptation capacity in sorghum is enhanced by multi-trait analysis</article-title>. <source>Front. Plant Sci</source>. <volume>10</volume>:<fpage>997</fpage>. <pub-id pub-id-type="doi">10.3389/fpls.2019.00997</pub-id><pub-id pub-id-type="pmid">31417601</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiang</surname> <given-names>R.</given-names></name> <name><surname>Berg</surname> <given-names>I. v. d.</given-names></name> <name><surname>MacLeod</surname> <given-names>I. M.</given-names></name> <name><surname>Hayes</surname> <given-names>B. J.</given-names></name> <name><surname>Prowse-Wilkins</surname> <given-names>C. P.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Quantifying the contribution of sequence variants with regulatory and evolutionary significance to 34 bovine complex traits</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>116</volume>, <fpage>19398</fpage>&#x02013;<lpage>19408</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1904159116</pub-id><pub-id pub-id-type="pmid">31501319</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>S.</given-names></name> <name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Gong</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name></person-group> (<year>2016</year>). <article-title>Metabolomic prediction of yield in hybrid rice</article-title>. <source>Plant J</source>. <volume>88</volume>, <fpage>219</fpage>&#x02013;<lpage>227</lpage>. <pub-id pub-id-type="doi">10.1111/tpj.13242</pub-id><pub-id pub-id-type="pmid">27311694</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>W.</given-names></name> <name><surname>Feng</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Doonan</surname> <given-names>J. H.</given-names></name> <name><surname>Batchelor</surname> <given-names>W. D.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Crop phenomics and high-throughput phenotyping: past decades, current challenges, and future perspectives</article-title>. <source>Mol. Plant</source> <volume>13</volume>, <fpage>187</fpage>&#x02013;<lpage>214</lpage>. <pub-id pub-id-type="doi">10.1016/j.molp.2020.01.008</pub-id><pub-id pub-id-type="pmid">31981735</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>X.</given-names></name> <name><surname>Stephens</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Efficient multivariate linear mixed model algorithms for genome-wide association studies</article-title>. <source>Nat. Methods</source> <volume>11</volume>, <fpage>407</fpage>&#x02013;<lpage>409</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.2848</pub-id><pub-id pub-id-type="pmid">24531419</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zwiernik</surname> <given-names>P.</given-names></name> <name><surname>Uhler</surname> <given-names>C.</given-names></name> <name><surname>Richards</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Maximum likelihood estimation for linear Gaussian covariance models</article-title>. <source>J. R. Stat. Soc. Ser. B</source> <volume>79</volume>, <fpage>1269</fpage>&#x02013;<lpage>1292</lpage>. <pub-id pub-id-type="doi">10.1111/rssb.12217</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported by the Netherlands Scientific Organization for Research NWO-STW project 11145 Learning from Nature, and the EU project H2020 731013 (EPPN2020).</p>
</fn>
</fn-group>
</back>
</article> 