The “out-of-sample” performance of long-run risk models

The long-run risk model developed by Bansal-Yaron (JF 2004) has been a phenomenal success. One central feature of the model is that consumption and dividend growth contain a small long-run predictable component. A burgeoning literature shows that the model can explain various asset market phenomena, including the equity premium puzzle, size and book-to-market effects, momentum, long-term return reversal in stock prices, risk premiums in bond markets, real exchange rate movements, and more. However, the evidence to date is mostly based either on calibration exercises, in which researchers examine whether prices and returns generated by a calibrated model match actual prices and returns, or on in-sample estimation, in which researchers choose model parameters to fit asset returns within the sample. For an asset pricing model to be useful in practical applications, it should also be able to fit returns out-of-sample, because most practical applications are, in some sense, out of sample. For example, firms want to estimate the cost of capital for future projects, portfolio and risk managers want to know the expected compensation for future risk, and academic researchers want to make risk adjustments to expected returns in future data. From this perspective, this paper provides an empirical analysis of the out-of-sample performance of long-run risk models.

We study the “out-of-sample” performance of long-run risk models for explaining the equity premium puzzle, size and book-to-market effects, momentum, reversal, and bond returns of different maturity and credit quality. We examine both stationary and cointegrated versions of the models using annual data for 1931–2009. To evaluate out-of-sample fit, we use a traditional “rolling” estimation. We estimate the model parameters over an initial period and predict returns for the next year. We then roll the estimation period forward by one year and repeat the process.
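The rolling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the window is taken to have a fixed length (the text does not say whether the window is fixed or expanding), the function and variable names are hypothetical, the return series is made-up toy data, and a historical-mean forecast stands in for the long-run risk model.

```python
from statistics import mean

def rolling_forecast_errors(returns, fit, predict, initial_window):
    """Out-of-sample forecast errors from a fixed-length rolling window.

    `fit` maps an estimation sample to parameters; `predict` maps the
    parameters to a forecast of the next period's return.
    """
    errors = []
    for end in range(initial_window, len(returns)):
        window = returns[end - initial_window:end]  # data up to year end-1 only
        params = fit(window)                        # estimate on the window
        forecast = predict(params)                  # predict the next year
        errors.append(returns[end] - forecast)      # realized minus predicted
    return errors

# Stand-in model (an assumption, not the paper's): forecast with the window mean.
def fit_mean(window):
    return mean(window)

def predict_mean(params):
    return params

# Toy annual returns, purely illustrative.
returns = [0.10, -0.05, 0.20, 0.08, -0.12, 0.15, 0.03]
errors = rolling_forecast_errors(returns, fit_mean, predict_mean, initial_window=4)
```

Because each forecast uses only data from before the year being predicted, the resulting errors are genuinely out-of-sample; any asset pricing model could be substituted for the `fit` and `predict` pair.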

Our stationary model specification follows Bansal-Yaron (JF 2004). In the model, consumption growth contains a small, highly persistent long-run risk component. The conditional volatility of consumption varies with time. There are three shocks: shocks to short-run consumption growth, shocks to long-run consumption growth, and shocks to consumption volatility.
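For reference, the consumption dynamics with these three shocks are usually written as follows in Bansal-Yaron (JF 2004). The notation here is that of the original article and is an assumption about how the summary above maps to symbols; the paper's own notation may differ.

```latex
\begin{aligned}
g_{t+1} &= \mu + x_t + \sigma_t \eta_{t+1}
  && \text{(consumption growth, short-run shock } \eta\text{)} \\
x_{t+1} &= \rho x_t + \varphi_e \sigma_t e_{t+1}
  && \text{(long-run component, long-run shock } e\text{)} \\
\sigma_{t+1}^2 &= \sigma^2 + \nu_1 \left(\sigma_t^2 - \sigma^2\right) + \sigma_w w_{t+1}
  && \text{(conditional variance, volatility shock } w\text{)}
\end{aligned}
```

Here $g_{t+1}$ is log consumption growth, $x_t$ is the small persistent component (with $\rho$ close to one), and $\eta_{t+1}$, $e_{t+1}$, $w_{t+1}$ are i.i.d. standard normal shocks.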

Our cointegrated model follows Bansal-Dittmar-Lundblad (JF 2005) and Bansal-Dittmar-Kiku (RFS 2009). In the cointegrated model, the natural logarithms of aggregate consumption and dividend levels are cointegrated. This means that both consumption growth and dividend growth contain a highly persistent long-run risk component, but the two can't wander too far apart, so a weighted difference between them is stationary. The conditional volatility of consumption is time varying in this model as well. There are three shocks: shocks to the cointegrating relation between dividends and consumption, shocks to long-run consumption growth, and shocks to consumption volatility.
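In symbols, the cointegration restriction can be sketched as below. The symbols $c_t$, $d_t$, $\delta$, $\tau_0$, and $\varepsilon_{d,t}$ are introduced here for illustration only, and the paper's specification may include additional terms such as a deterministic trend.

```latex
d_t - \delta\, c_t = \tau_0 + \varepsilon_{d,t},
\qquad \varepsilon_{d,t} \ \text{stationary},
```

where $c_t$ and $d_t$ are the log levels of aggregate consumption and dividends, each non-stationary on its own, and $\delta$ is the cointegration parameter that defines the weighted difference.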

For an asset pricing model to be useful in practical applications, the model should be able to fit returns out-of-sample. We use the mean squared pricing error (MSE) as the criterion to evaluate models. A model that delivers the correct (conditional) mean of the future return minimizes the (conditional) MSE. We also compare the two components of the MSE (the squared expected pricing error and the error variance) across these models. We compare the out-of-sample performance of long-run risk models with that of two simple models: a simple consumption beta model (CCAPM) and the classic Capital Asset Pricing Model of Sharpe (JF 1964). To evaluate the factors that drive the fit of the long-run risk models, we compare long-run risk models with models that suppress the consumption-related shocks. We also compare long-run risk models with models that restrict the risk premiums to identify structural parameters.
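The two statements about the MSE above follow from a standard decomposition, sketched here with illustrative notation ($r_{t+1}$ for the realized return, $f$ for a forecast formed with information $I_t$, and $e = r_{t+1} - f$ for the pricing error; these symbols are not taken from the paper).

```latex
\begin{aligned}
\mathrm{MSE} &= E\!\left[e^2\right] = \left(E[e]\right)^2 + \mathrm{Var}(e), \\
E\!\left[(r_{t+1} - f)^2 \mid I_t\right]
  &= \mathrm{Var}\!\left(r_{t+1} \mid I_t\right)
   + \left(E[r_{t+1} \mid I_t] - f\right)^2 .
\end{aligned}
```

The first line splits the MSE into the squared average pricing error and the error variance. In the second line the variance term does not depend on $f$, so the conditional MSE is smallest when $f = E[r_{t+1} \mid I_t]$, that is, when the model delivers the correct conditional mean.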

We have the following main results. First, the cointegrated long-run risk model beats the original, stationary version in terms of out-of-sample performance. Thus, we argue that cointegrated models should have a more prominent role in future research. Second, we find that the long-run risk models trump other models in explaining the momentum effect. At the same time, they perform poorly in explaining the relative returns to low-grade corporate and long-term government bonds. These results suggest that there are important missing factors in the simple long-run risk models. Third, we find that both the short-run consumption risk factor and the long-run risk factor are important ingredients in the models' performance. Our evidence is that models that suppress the consumption-related shocks perform poorly (which indicates that the consumption shocks are important risk factors), and that the long-run risk models also perform better than the simple consumption-based model (CCAPM). Fourth, the mean squared errors of the long-run risk models are not substantially better than those of the classical CAPM, except for momentum. Fifth, when we restrict the risk premiums on the long-run risk models' factors to identify structural parameters of the model, we find that the restricted model's overall performance is inferior to that of the classical CAPM, as the restriction increases the average out-of-sample pricing errors substantially, while sometimes reducing the pricing error variances.

Francisco Goya (1746–1828): Sleep of Reason Produces Monsters. Spain, 1799. Goya was the last and perhaps the coolest of the old masters. This famous etching was censored in its time, because it was considered a biting critique of 18th-century Spanish society, full of corruption and superstition. In this work, the sleeping “reason” is haunted by monsters and evil spirits. But Goya was not promoting reason alone. His caption reads “Imagination abandoned by reason produces impossible monsters; united with her, she is the mother of the arts and source of their wonders.” Many of the other etchings in the series are both amusing and disturbing.