The Construction of Efficient Portfolios: A Verification of Risk Models for Investment Making

Various statistical models have been used to estimate inputs for mean-variance efficient portfolio construction since the mid-1960s. One can argue about how many factors are necessary, but there is substantial evidence that statistical risk models outperform fundamental models for several expected returns models, such as the one we test in this analysis. In this paper, we show that tracking portfolios constructed with expected return rankings based on earnings forecasting and price momentum composite alpha strategies produce statistically significant excess returns and increased Sharpe ratios when optimized with a 3-factor statistical risk model.


INTRODUCTION
In this paper, we study the construction of US mean-variance efficient portfolios during the period 1999-2017. We construct mean-variance portfolios by maximizing the alphas of the 10-factor U.S. Expected Returns (USER) stock selection model and constraining tracking error with respect to the S&P 500 benchmark using the 3-factor risk model of Blume et al. [1]¹. The main finding of this paper is that the mean-variance efficient portfolios produce statistically significant portfolio excess returns in the US market.
The organization of the paper is as follows. The first section describes the construction of efficient portfolios, the estimation of the covariance matrix with multi-factor models, and the data used in the construction of size-ranked portfolios for estimating common risk factors. The second section describes the expected excess return model used in the study, the statistical estimation method, and the data. The third section describes the construction of tracking portfolios and presents portfolio statistics. The final section contains concluding remarks.

¹ An assumption underlying many studies is that the market model, or more generally a model with one factor common to all securities, generates realized returns. In such a one-factor model, realized returns are the sum of an asset's response to a stochastic factor common to all assets and a factor unique to the individual asset. In the last decade, there has been much interest in models with more than one common stochastic factor, using either pre-specified factors, such as the Fama and French [2] 3-factor model, or factors identified through factor analysis or similar multivariate techniques. Factor analysis and similar factor-analytic techniques have on occasion played an important role in the analysis of returns on common stocks and other types of financial assets. Farrar [3] may have been the first to use factor analysis in conjunction with principal component analysis to assign securities to homogeneous correlation groups. King [4] used factor analysis to evaluate the role of market and industry factors in explaining stock returns. These two studies sparked an interest in multi-index models, and a rich body of empirical work soon emerged. Examples include Elton and Gruber [5,6], Meyer [7], Farrell [8], and Livingston [9], among others. The major goal of these earlier studies was to establish the smallest number of "indexes" needed to construct efficient sets. Factor models have also been used in tests of arbitrage pricing theory and its variants; see, for example, Ross and Roll [10] and Dhrymes et al. [11-13], to cite a few from the large literature.

CONSTRUCTING EFFICIENT PORTFOLIOS
The Markowitz portfolio construction approach is based on the premise that the mean and variance of future outcomes are sufficient for rational decision making under uncertainty: the investor identifies the best opportunity set, the efficient frontier, where return is maximized for a given level of risk, or risk is minimized for a given level of return. The reader is referred to Markowitz [14,15] for the seminal discussion of portfolio construction and management. Two parameters are needed: the portfolio expected return, E(R_p), calculated as the sum of the security weights multiplied by their respective expected returns, and the portfolio variance, the sum of the weighted covariances:

E(R_p) = W'E    (1)

σ_p² = W'ΣW    (2)

W'1 = 1    (3)

where E = {μ_1, μ_2,...,μ_N} is the N × 1 vector of expected security returns (N is the number of candidate securities), Σ is the N × N covariance matrix, W = {w_1, w_2,..., w_N} is the vector of weights, and 1 is the unit column vector. The sum of the weights in (3) indicates that the portfolio is fully invested. One can construct an infinite number of mean-variance efficient portfolios; the optimal portfolio choice is determined by an investor's risk tolerance². Following Markowitz [14,16-18], the general portfolio optimization objective function is:

max_W  W'E − λW'ΣW    (4)

where λ is the coefficient of relative risk aversion of the investor³.
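The two Markowitz inputs described above can be illustrated numerically. The following toy example (all weights, returns, and covariances are hypothetical, not from the paper) computes the portfolio expected return W'E, the variance W'ΣW, and the risk-aversion objective for three assets:

```python
# Toy numerical illustration of the Markowitz inputs: portfolio expected
# return, portfolio variance, and the risk-aversion objective.
# All numbers are hypothetical.

mu = [0.08, 0.05, 0.03]                  # E: expected security returns
cov = [[0.040, 0.006, 0.002],            # Sigma: covariance matrix
       [0.006, 0.020, 0.001],
       [0.002, 0.001, 0.010]]
w = [0.5, 0.3, 0.2]                      # W: portfolio weights
lam = 2.0                                # lambda: relative risk aversion

assert abs(sum(w) - 1.0) < 1e-12         # fully invested: W'1 = 1

exp_ret = sum(wi * mi for wi, mi in zip(w, mu))        # W'E
var = sum(w[i] * cov[i][j] * w[j]
          for i in range(3) for j in range(3))          # W'(Sigma)W
utility = exp_ret - lam * var                           # Markowitz objective
```

Scanning candidate weight vectors for the highest `utility` at a given λ traces out one point on the efficient frontier; varying λ traces the frontier itself.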
Accurate characterization of portfolio risk requires an accurate estimate of the covariance matrix of security returns.
Estimation of the covariance structure is almost always based on a linear return-generating multi-factor model (MFM) of the form:

R̃_{j,t} = α_j + β_{j1}f̃_{1,t} + ... + β_{jK}f̃_{K,t} + ẽ_{j,t}    (5)

The non-factor, or asset-specific, return on security j, ẽ_{j,t}, is the residual return of the security after removing the estimated impacts of the finite number of K factors, where 1 ≤ K ≤ N. The term f̃_{k,t} is the rate of return of factor k, which is independent of the securities and affects security j's return through its exposure coefficient β_{jk}. Under the assumption that the residual returns ẽ_{j,t} are uncorrelated across securities, the covariance matrix of the securities reduces to the form:

Σ = BθB' + Δ    (6)

B = [β_{jk}] (N × K)    (7)

θ = cov(f̃) (K × K)    (8)

Δ = diag(var(ẽ_1),..., var(ẽ_N))    (9)

where B in (7) is the matrix of exposure coefficients, also referred to as "loadings" in the literature, θ in (8) is the covariance matrix of the factors, and Δ in (9) is the diagonal covariance matrix of the residuals. The very first model used in the literature is Treynor's market model, which led to the development of the Capital Asset Pricing Model (CAPM)⁴. There is a rich volume of research covering multi-factor models, starting with King [4].
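The factor-model covariance structure above can be sketched in a few lines. This is a minimal illustration, with made-up loadings and residual variances for four securities on K = 2 factors (none of the numbers come from the paper):

```python
# Minimal sketch of the multi-factor covariance structure: the security
# covariance matrix assembled from loadings B, factor covariance theta,
# and diagonal residual variances Delta. All numbers are hypothetical.

B = [[1.1, 0.2],        # N x K matrix of exposure coefficients ("loadings")
     [0.9, -0.1],
     [0.7, 0.5],
     [1.0, 0.0]]
theta = [[1.0, 0.0],    # K x K factor covariance (identity: orthogonal, unit-variance factors)
         [0.0, 1.0]]
delta = [0.02, 0.03, 0.025, 0.015]   # diagonal of Delta: residual variances

N, K = len(B), len(B[0])

def sigma(i, j):
    # systematic covariance between securities i and j: B[i] theta B[j]'
    s = sum(B[i][k] * theta[k][l] * B[j][l] for k in range(K) for l in range(K))
    # residuals are uncorrelated across securities, so Delta adds only on the diagonal
    return s + (delta[i] if i == j else 0.0)

cov = [[sigma(i, j) for j in range(N)] for i in range(N)]
```

The payoff of this structure is parsimony: instead of estimating N(N+1)/2 free covariances, one estimates N·K loadings, a K × K factor covariance, and N residual variances.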
In this paper, we use the statistical risk model developed by Blume et al. [1] (BGG). Statistical factor models deduce the appropriate factor structure by analyzing the sample covariance matrix of asset returns. There is no need to pre-define factors and compute exposures, as required by fundamental factor models; the only inputs are a time series of asset returns and the number of desired factors. BGG have shown that a return-generating model estimated by factor analysis is superior to the multi-factor models commonly used in the literature.

Data and Estimation Methodology
The empirical analyses to estimate the factors use monthly returns of 444 sets of size-ranked portfolios of NYSE stocks constructed from the CRSP file. The first set consists of all securities in the CRSP files with complete data for the 5 years 1980 through 1984. These securities are ranked by their market value as of December 1979 and then partitioned into 30 equally weighted size-ranked portfolios with as close to an equal number of securities as possible. This process is repeated for each rolling 5-year period every month through December 2017, with each set consisting of 30 monthly portfolio return series of 60 observations each.
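The partitioning step described above can be sketched as follows; this is a generic illustration of dealing ranked securities into 30 near-equal groups (the function name and the toy market values are hypothetical, not from the paper):

```python
# Sketch of the size-ranking step: sort securities by market value and
# partition them into 30 equally weighted portfolios with as close to an
# equal number of securities as possible. All inputs are hypothetical.

def size_rank_portfolios(market_values, n_portfolios=30):
    # indices of securities, sorted by market value (smallest first)
    order = sorted(range(len(market_values)), key=lambda i: market_values[i])
    portfolios = [[] for _ in range(n_portfolios)]
    for rank, idx in enumerate(order):
        # assign each ranked security to its proportional bucket
        portfolios[rank * n_portfolios // len(order)].append(idx)
    return portfolios

# 90 hypothetical securities -> 30 portfolios of 3 securities each
groups = size_rank_portfolios(list(range(1, 91)))
```

Each group's equally weighted average return then forms one of the 30 monthly portfolio return series used in the factor extraction.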
We use the maximum likelihood method (MLM) to estimate the factor models. The usual way to assess the number of required factors is to rerun the procedure, successively increasing the number of factors, until the χ² goodness-of-fit test developed by Bartlett [25] indicates that the number of factors is sufficient to explain returns. To use this criterion, one must specify the level of significance, often arbitrarily set at 1 or 5 percent. The level of significance is important, since there is a direct relation between the level of significance and the number of significant factors. BGG's findings indicate that the number of required factors varies over time. Their analysis of the required number of factors reveals a positive relation between the number of factors and the variability of returns during the estimation period. A rationale for this finding is that during periods of relatively low volatility, most of the volatility is firm specific and it is difficult to identify the common factors; in more volatile times, the common factors are relatively more important than the firm-specific factors, making them easier to identify. Their findings indicate that the median number of factors required to explain the returns at the 5 percent significance level is three. In Figure 1 we plot the standard deviation of portfolio 1 (small-cap), portfolio 30 (large-cap), and the number of factors required at the 5 percent significance level. The number of factors needed during the study period is between two and four. In this paper, we set the number of factors to three rather than varying it over time based on Bartlett's goodness-of-fit criterion.
For each security in our universe and the benchmark (S&P 500), we estimated the factor loadings in (5) over the same period, with three factors extracted from the 30 size-ranked portfolio returns. We then estimated the covariance matrix for each month, based on the previous 5 years of monthly data for the securities in our universe and the benchmark, as:

Σ_t = B_t B_t' + Δ_t    (10)

Note that the MLM estimation extracts orthogonal factors, and the variance of the factors is set to unity by default; that is, θ in (8) reduces to the K × K identity matrix. In this paper, we assume that factor loadings are stationary over the month (B_{t+1,k} = B_{t,k}) in estimating the weight of each security in the tracking portfolio.

ESTIMATION OF EXPECTED RETURNS
There are many approaches to security valuation and the creation of expected returns. We believe that asset managers use security analysis and stock selection models consisting of reported earnings, forecasted earnings, and financial data⁵. Graham and Dodd [27] recommended that stocks be purchased on the basis of the price-earnings (P/E) ratio. The "low" PE investment strategy was discussed in Williams [28], the monograph that influenced Harry Markowitz and his thinking on portfolio construction. Bloch et al. [29] and Haugen and Baker [30,31] advocated models incorporating earnings-to-price (EP), book-value-to-price (BP), cash-flow-to-price (CP), sales-to-price (SP), and other fundamental data. Guerard et al. [32,33] added price momentum (PM), the price at time t-1 divided by the price 12 months earlier (t-12), and the consensus temporary earnings forecast (CTEF) to expected returns modeling. They denoted the stock selection model as United States Expected Returns (USER). They reported, among other results, that: (1) the EP variable had a larger average weight than the BP variable; (2) the relative PE, denoted RPE, the EP relative to its 60-month average, had a higher average weight than the PE variable; and (3) the composite earnings forecast variable, CTEF, had a larger weight than the RPE variable. In fact, in the USER model, only the price momentum variable, PM, had a higher weight than the CTEF variable (and only by one percent, at that)⁶.
In this paper, we use the same USER model:

TR_{t+1} = a_0 + a_1 EP_t + a_2 BP_t + a_3 CP_t + a_4 SP_t + a_5 REP_t + a_6 RBP_t + a_7 RCP_t + a_8 RSP_t + a_9 CTEF_t + a_10 PM_t + e_t    (11)

where:
EP = [earnings per share]/[price per share] = earnings-price ratio;
BP = [book value per share]/[price per share] = book-price ratio;
CP = [cash flow per share]/[price per share] = cash-price ratio;
SP = [net sales per share]/[price per share] = sales-price ratio;
REP = [current EP ratio]/[average EP ratio over the past 60 months];
RBP = [current BP ratio]/[average BP ratio over the past 60 months];
RCP = [current CP ratio]/[average CP ratio over the past 60 months];
RSP = [current SP ratio]/[average SP ratio over the past 60 months];
CTEF = consensus earnings-per-share forecast, revisions, and breadth;
PM = price momentum; and
e = randomly distributed error term.

⁶ Wall Street practitioners have embraced the "low PE" approach for well over 50 years. The low PE strategy is a form of the contrarian investment approach associated with Bernard [34] and Dreman [35,36]. The authors believe in the low PE strategy, but not as the exclusive strategy. There is an extensive literature on the impact of individual value ratios on the cross-section of stock returns. We go beyond using just one or two of the standard value ratios (EP and BP) to include the cash-price ratio (CP) and/or the sales-price ratio (SP). Several major papers on combinations of value ratios to predict stock returns (that include at least CP and/or SP) are Fama and French [2,37,38], Bloch et al. [29], Chan et al. [39], Blin et al. [40], Guerard, Gültekin, and Stone [41], and Haugen and Baker [30,31]. Given concerns about both outlier distortion and multicollinearity, Bloch et al. [29] tested the relative explanatory and predictive merits of alternative regression estimation procedures: OLS; robust regression using the Beaton and Tukey [42] bi-square criterion to mitigate the impact of outliers; latent root regression to address the issue of multicollinearity [see [43]]; and weighted latent root regression, denoted WLRR, a combination of robust and latent root. The Guerard et al. [33] USER model test substantiated the Bloch et al. [29] approach, techniques, and conclusions that WLRR works best among the alternative linear predictive models.

Data and Estimation
For each security, we use monthly total stock returns and prices from the CRSP files; earnings, book value, cash flow, and net sales from the quarterly COMPUSTAT files; and consensus earnings-per-share forecasts, revisions, and breadth from the I/B/E/S files. We construct the variables used in (11) for each month starting in January 1980. The USER model is estimated using WLRR analysis over a 60-month (5-year) moving window for each period to identify variables statistically significant at the 10% level. The model uses the normalized coefficients over the past 12 months as weights, with the Beaton-Tukey outlier adjustment. We use the statistically significant coefficients to estimate the next month's expected return rank, E_i, for each security. The USER estimation conditions are virtually identical to those described in Guerard et al. [32,33,44].
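The Beaton-Tukey adjustment mentioned above down-weights outlying residuals in the robust regression step. The sketch below shows the generic bi-square weight function; it is not code from the paper, and the tuning constant c = 4.685 is the conventional choice for 95% efficiency under normal errors, assumed here rather than reported by the authors:

```python
# Generic sketch of the Beaton-Tukey bi-square weight function used in
# robust (and weighted latent root) regression: scaled residuals inside
# [-c, c] get weight (1 - (u/c)^2)^2; gross outliers get weight 0.
# c = 4.685 is the conventional tuning constant (an assumption here).

def bisquare_weight(u, c=4.685):
    """Weight for a scaled residual u."""
    if abs(u) >= c:
        return 0.0
    t = (u / c) ** 2
    return (1.0 - t) ** 2

# small residuals keep near-full weight; gross outliers are excluded entirely
weights = [bisquare_weight(u) for u in (0.0, 2.0, 10.0)]
```

In an iteratively reweighted least-squares loop, these weights shrink the influence of extreme observations on the estimated coefficients, which is the outlier mitigation Bloch et al. [29] sought.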

PORTFOLIO CONSTRUCTION
We construct monthly long-only (i.e., w_i ≥ 0) portfolios that track the S&P 500 index with minimum tracking error by solving the following mixed-integer optimization problem:

max_{W_t}  E'W_t − λ(W_t − W_b)'Σ_t(W_t − W_b) − c·1'Z_t
subject to:  1'W_t = 1;  0 ≤ w_{i,t} ≤ x_{i,t};  1'X_t ≤ M;  (1/2)Σ_i |w_{i,t} − w_{i,t−1}| ≤ p    (12)

where λ is the relative risk aversion, 1 is the unit vector, W_b is the vector of benchmark (S&P 500) weights, X_t = {x_{1,t}, x_{2,t},..., x_{N,t}} is the vector of binary variables that indicate whether security i is included in the portfolio in month t, Z_t = {x_{1,t} − x_{1,t−1}, x_{2,t} − x_{2,t−1},..., x_{N,t} − x_{N,t−1}} is the vector of binary variables that account for security turnover, c is the transaction cost, p is the portfolio turnover limit percentage, and M is the maximum number of securities allowed in the portfolio. The variance of the S&P 500 for the period in Equation (12) is estimated with Equation (10) using the factor loadings of the 3-factor model. We specifically solve Equation (12) with a relatively small number of securities (M), set at 50 and 100; transactions cost (c), set at 150 basis points each way; and portfolio turnover (p), set at 8 percent or less⁷.
In Table 1, we present portfolio statistics for each year for relative risk aversion levels of 0.01, 0.05, and 0.10. Excess return is the annual portfolio return, net of transactions costs, in excess of the annual return of the S&P 500. The relative Sharpe ratio is the portfolio Sharpe ratio divided by the Sharpe ratio of the S&P 500. The active average excess returns of the USER model are statistically significant. Tracking error is not statistically significant for the 100-security portfolio.
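The statistics reported in Table 1 can be sketched from return series as follows. The monthly portfolio and benchmark returns and the risk-free rate below are hypothetical placeholders, not the paper's data:

```python
# Minimal sketch of the reported statistics: active (excess) return over the
# benchmark, tracking error (standard deviation of active returns), and the
# relative Sharpe ratio. All return series are hypothetical.

import statistics

port = [0.012, -0.004, 0.020, 0.008, -0.010, 0.015]   # portfolio returns
bench = [0.010, -0.006, 0.018, 0.007, -0.012, 0.014]  # S&P 500 returns
rf = 0.002                                            # monthly risk-free rate (assumed)

active = [p - b for p, b in zip(port, bench)]
excess_return = sum(active)                # cumulative active return (simple sum)
tracking_error = statistics.stdev(active)  # monthly tracking error

def sharpe(returns):
    # Sharpe ratio of a return series relative to the risk-free rate
    excess = [r - rf for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

relative_sharpe = sharpe(port) / sharpe(bench)
```

A relative Sharpe ratio above 1 indicates better risk-adjusted performance than the benchmark, the criterion used alongside excess return in Table 1.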

SUMMARY AND CONCLUSIONS
Investing with fundamental, expectations, and momentum variables is a good investment strategy over the long run. The use of multi-factor risk control significantly improves portfolio performance relative to the benchmark. We considered long-only portfolio construction in this study. Construction of realistic long-short portfolios is not feasible under these settings unless one assumes that securities are always available to borrow for short selling. However, there are various actively traded derivative securities based on the S&P 500 index, the benchmark used in this study. The portfolios constructed in the study track the S&P 500 index with reasonably low tracking error. With the use of these derivative securities, it is possible to expand the opportunity set for investors.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.