- The "equal weights puzzle" in forecast combination
- Random forests, and ensemble averaging algorithms more generally
- Bootstrap aggregation ("bagging")
- Boosting
- Best subset averaging
- Survey averages
- k-nearest-neighbor forecasts
- Amisano-Geweke equally-weighted prediction pools
- "1/N" portfolios
- Bayesian model averaging
- Bates-Granger-Ramanathan frequentist model averaging
- Any forecasts extracted from markets (the ultimate information aggregator), ranging from "standard" markets (e.g., volatility forecasts extracted from options prices, interest rate forecasts extracted from the current yield curve, etc.), to explicit so-called "prediction markets" (e.g., sports betting markets).

## Sunday, January 21, 2018

### Averaging for Prediction in Econometrics and ML

Random thought. At the risk of belaboring the obvious, it's worth stepping back to appreciate just how pervasive averaging is in forecasting, particularly in forecast combination. Some averages are weighted, and some are not. Most are linear; some are not.
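To make the point concrete, here is a minimal simulated sketch (hypothetical data, not from any particular paper) comparing the simple "1/N" average with Bates-Granger-Ramanathan-style regression weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a target and three noisy unbiased forecasts of it
# (purely illustrative data).
T = 500
y = rng.normal(size=T)
forecasts = y[:, None] + rng.normal(scale=[1.0, 1.5, 2.0], size=(T, 3))

# (1) Simple average: the "1/N" combination.
avg = forecasts.mean(axis=1)

# (2) Bates-Granger-Ramanathan regression weights: regress y on the
# individual forecasts (here without an intercept, for simplicity).
w, *_ = np.linalg.lstsq(forecasts, y, rcond=None)
bgr = forecasts @ w


def mse(f):
    return np.mean((y - f) ** 2)


print(mse(forecasts[:, 0]), mse(avg), mse(bgr))
```

In-sample, the regression-weighted combination cannot do worse than the equal-weight average, since equal weights lie in its feasible set; out of sample, of course, equal weights often win anyway -- the "equal weights puzzle."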

## Sunday, January 14, 2018

### Comparing Interval Forecasts

Here's a new one, "On the Comparison of Interval Forecasts".

You'd think that interval forecast evaluation would be easy. After all, point forecast evaluation is (more or less) well understood and easy, and density forecast evaluation is (more or less) well understood and easy, and interval forecasts seem somewhere in between, so by some sort of continuity argument you'd think that their evaluation would also be well understood and easy. But no. In fact it's quite difficult, maybe impossible...
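The simplest first-pass diagnostic -- emphatically not a resolution of the difficulties just noted -- is to check an interval's empirical coverage against its nominal level, as in Christoffersen's (1998) likelihood-ratio test of correct unconditional coverage. A minimal sketch with hypothetical data:

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: nominal 90% intervals for a standard normal target.
T = 1000
p = 0.90
y = rng.normal(size=T)
lo, hi = -1.645, 1.645                  # the correct 90% interval here

hits = (y >= lo) & (y <= hi)            # outcome falls inside the interval
coverage = hits.mean()                  # empirical coverage; should be near p

# Likelihood-ratio test of correct unconditional coverage.
n1, n0 = int(hits.sum()), int((~hits).sum())
phat = n1 / T
lr = -2 * (n1 * math.log(p) + n0 * math.log(1 - p)
           - n1 * math.log(phat) - n0 * math.log(1 - phat))
lr = max(lr, 0.0)                       # guard against tiny negative rounding
pvalue = math.erfc(math.sqrt(lr / 2))   # chi-square(1) survival function
```

This checks only unconditional coverage of a single interval; the hard part, which this does nothing to solve, is *comparing* competing interval forecasts.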

## Monday, January 8, 2018

### Yield-Curve Modeling

Happy New Year to all!

Riccardo Rebonato's *Bond Pricing and Yield-Curve Modeling: A Structural Approach* will soon appear from Cambridge University Press. It's very well done -- a fine blend of theory, empirics, market sense, and good prose. And not least, endearing humility, well-captured by a memorable sentence from the acknowledgements: "My eight-year-old son has forgiven me, I hope, for not playing with him as much as I would have otherwise; perhaps he has been so understanding because he has had a chance to build a few thousand paper planes with the earlier drafts of this book."

TOC below. Pre-order here.

Contents

Acknowledgements
Symbols and Abbreviations

Part I The Foundations
1. What This Book Is About
2. Definitions, Notation and a Few Mathematical Results
3. Links among Models, Monetary Policy and the Macroeconomy
4. Bonds: Their Risks and Their Compensations
5. The Risk Factors in Action
6. Principal Components: Theory
7. Principal Components: Empirical Results

Part II The Building Blocks: A First Look
8. Expectations
9. Convexity: A First Look
10. A Preview: A First Look at the Vasicek Model

Part III The Conditions of No-Arbitrage
11. No-Arbitrage in Discrete Time
12. No-Arbitrage in Continuous Time
13. No-Arbitrage with State Price Deflators
14. No-Arbitrage Conditions for Real Bonds
15. The Links with an Economics-Based Description of Rates

Part IV Solving the Models
16. Solving Affine Models: The Vasicek Case
17. First Extensions
18. A General Pricing Framework
19. The Shadow Rate: Dealing with a Near-Zero Lower Bound

Part V The Value of Convexity
20. The Value of Convexity
21. A Model-Independent Approach to Valuing Convexity
22. Convexity: Empirical Results

Part VI Excess Returns
23. Excess Returns: Setting the Scene
24. Risk Premia, the Market Price of Risk and Expected Excess Returns
25. Excess Returns: Empirical Results
26. Excess Returns: The Recent Literature – I
27. Excess Returns: The Recent Literature – II
28. Why Is the Slope a Good Predictor?
29. The Spanning Problem Revisited

Part VII What the Models Tell Us
30. The Doubly Mean-Reverting Vasicek Model
31. Real Yields, Nominal Yields and Inflation: The D’Amico–Kim–Wei Model
32. From Snapshots to Structural Models: The Diebold–Rudebusch Approach
33. Principal Components as State Variables of Affine Models: The PCA Affine Approach
34. Generalizations: The Adrian–Crump–Moench Model
35. An Affine, Stochastic-Market-Price-of-Risk Model
36. Conclusions

Bibliography
Index

## Monday, December 18, 2017

### Holiday Haze

[Photo credit: Public domain, by Marcus Quigmire, from Florida, USA (Happy Holidays, uploaded by Princess Mérida) [CC-BY-SA-2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons]

## Sunday, December 10, 2017

### More on the Problem with Bayesian Model Averaging

I blogged earlier on a problem with Bayesian model averaging (BMA) and gave some links to new work that chips away at it. The interesting thing about that new work is that it stays very close to traditional BMA while acknowledging that all models are misspecified.

But there are also other Bayesian approaches to combining density forecasts, such as prediction pools formed to optimize a predictive score. (See, e.g. Amisano and Geweke, 2017, and the references therein. Ungated final draft, and code, here.)
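For intuition, here is a minimal sketch (simulated data, not the Amisano-Geweke code) of a two-model linear pool whose weight is chosen to maximize the average log predictive score:

```python
import numpy as np

rng = np.random.default_rng(2)


def norm_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


# Hypothetical realized outcomes, and two predictive densities evaluated at them.
T = 400
y = rng.normal(0.0, 1.0, size=T)
f1 = norm_pdf(y, 0.0, 1.3)   # over-dispersed forecaster
f2 = norm_pdf(y, 0.5, 1.0)   # biased forecaster

# Log-score-optimal pool weight: maximize the average log predictive score
# of the mixture w*f1 + (1-w)*f2 over a grid (the objective is concave in w).
grid = np.linspace(0.0, 1.0, 1001)
scores = [np.mean(np.log(w * f1 + (1 - w) * f2)) for w in grid]
w_star = grid[int(np.argmax(scores))]
```

Because w = 0 and w = 1 recover the individual forecasters, the optimized pool's in-sample log score can never fall below the better of the two.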

Another relevant strand of new work, less familiar to econometricians, is "Bayesian predictive synthesis" (BPS), which builds on the expert opinion analysis literature. The framework, which traces to Lindley et al. (1979), concerns a Bayesian faced with multiple priors coming from multiple experts, and explores how to get a posterior distribution utilizing all of the information available. Earlier work by Genest and Schervish (1985) and West and Crosse (1992) develops the basic theory, and new work (McAlinn and West, 2017) extends it to density forecast combination.

Thanks to Ken McAlinn for reminding me about BPS. Mike West gave a nice presentation at the FRBSL forecasting meeting. [Parts of this post are adapted from private correspondence with Ken.]


## Sunday, December 3, 2017

### The Problem With Bayesian Model Averaging...

The problem is that one of the models considered is traditionally assumed true (explicitly or implicitly) since the prior model probabilities sum to one. Hence all posterior weight gets placed on a single model asymptotically -- just what you don't want when constructing a portfolio of surely-misspecified models. The earliest paper I know that makes and explores this point is one of mine, here. Recent and ongoing research is starting to address it much more thoroughly, for example here and here. (Thanks to Veronika Rockova for sending.)
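A tiny simulation illustrates the point (both candidate models deliberately false; all numbers hypothetical): the posterior weight on whichever model is closest to the truth in Kullback-Leibler distance goes to one as the sample grows, even though that model is misspecified.

```python
import numpy as np

rng = np.random.default_rng(3)

# Two fully specified -- and both misspecified -- models for data from N(0.1, 1):
#   M1: N(0.2, 1),   M2: N(-0.2, 1).
# With equal prior model probabilities, posterior odds are likelihood ratios.


def log_lik(x, mu):
    return np.sum(-0.5 * (x - mu) ** 2)   # Gaussian log-likelihood, up to a constant


for n in (10, 100, 10000):
    x = rng.normal(0.1, 1.0, size=n)
    l1, l2 = log_lik(x, 0.2), log_lik(x, -0.2)
    post1 = 1.0 / (1.0 + np.exp(l2 - l1))  # P(M1 | data)
    print(n, post1)
```

As n grows, essentially all posterior weight lands on M1 -- exactly the degenerate "portfolio" the post warns about.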

## Sunday, November 26, 2017

### Modeling With Mixed-Frequency Data

Here's a bit more related to the FRB St. Louis conference.

The fully-correct approach to mixed-frequency time-series modeling is: (1) write out the state-space system at the highest available data frequency or higher (e.g., even if your highest frequency is weekly, you might want to write the system daily to account for different numbers of days in different months), (2) appropriately treat most of the lower-frequency data as missing and handle it optimally using the appropriate filter (e.g., the Kalman filter in the linear-Gaussian case). My favorite example (no surprise) is here.
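As a toy version of that prescription (a univariate local-level sketch, not the model in the linked paper), here is a Kalman filter that writes the system at the high frequency and simply skips the update step when an observation is missing:

```python
import numpy as np

rng = np.random.default_rng(4)

# Local-level model at the high (say, monthly) frequency:
#   state:       alpha_t = alpha_{t-1} + eta_t,   eta_t ~ N(0, q)
#   observation: y_t     = alpha_t + eps_t,       eps_t ~ N(0, r)
# Lower-frequency sampling is mimicked by observing y only every third period.
T, q, r = 120, 0.1, 0.5
alpha = np.cumsum(rng.normal(0, np.sqrt(q), size=T))
y = alpha + rng.normal(0, np.sqrt(r), size=T)
y[np.arange(T) % 3 != 0] = np.nan        # treat intervening periods as missing

a, P = 0.0, 10.0                          # diffuse-ish initial state
filtered = np.empty(T)
for t in range(T):
    P = P + q                             # prediction step
    if not np.isnan(y[t]):                # update only when y_t is observed
        K = P / (P + r)                   # Kalman gain
        a = a + K * (y[t] - a)
        P = (1 - K) * P
    filtered[t] = a
```

Between observations the filter just propagates the state forward with growing uncertainty, which is the "handle missing data optimally" step in miniature.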

Until recently, however, the prescription above was limited in practice to low-dimensional linear-Gaussian environments, and even there it can be tedious to implement if one insists on MLE. Hence the well-deserved popularity of the MIDAS approach to approximating the prescription, recently also in high-dimensional environments.

But now the sands are shifting. Recent work enables exact Bayesian posterior mixed-frequency analysis even in high-dimensional structural models. I've known Schorfheide-Song (2015, *JBES*; 2013 working paper version here) for a long time, but I never fully appreciated the breakthrough that it represents -- that is, how straightforward exact mixed-frequency estimation is becoming -- until I saw the stimulating Justiniano presentation at FRBSL (older 2016 version here). And now it's working its way into important substantive applications, as in Schorfheide-Song-Yaron (2017, forthcoming in *Econometrica*).

## Monday, November 20, 2017

### More on Path Forecasts

I blogged on path forecasts yesterday. A reader just forwarded this interesting paper, of which I was unaware. Lots of ideas and up-to-date references.

## Thursday, November 16, 2017

### Forecasting Path Averages

Consider two standard types of \(h\)-step forecast:

(a). \(h\)-step forecast, \(y_{t+h,t}\), of \(y_{t+h}\)

(b). \(h\)-step path forecast, \(p_{t+h,t}\), of \(p_{t+h} = \{ y_{t+1}, y_{t+2}, \ldots, y_{t+h} \}\).

Clive Granger used to emphasize the distinction between (a) and (b).

As regards path forecasts, lately there's been some focus not on forecasting the entire path \(p_{t+h}\), but rather on forecasting the path average:

(c). \(h\)-step path average forecast, \(a_{t+h,t}\), of \(a_{t+h} = \frac{1}{h} \left( y_{t+1} + y_{t+2} + \cdots + y_{t+h} \right)\)

The leading case is forecasting "average growth", as in Mueller and Watson (2016).

Forecasting path averages (c) never resonated thoroughly with me. After all, (b) is sufficient for (c), but not conversely -- the average is just one aspect of the path, and additional aspects (overall shape, etc.) might be of interest.

Then, listening to Ken West's FRB SL talk, my eyes opened. Of course the path average is insufficient for the whole path, but it's surely the most important aspect of the path -- if you could know just one thing about the path, you'd almost surely ask for the average. Moreover -- and this is important -- it might be much easier to provide credible point, interval, and density forecasts of \(a_{t+h}\) than of \(p_{t+h}\).
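A trivial AR(1) sketch (illustrative numbers only) makes the sufficiency direction concrete: the path forecast (b) immediately delivers the path-average forecast (c), but not conversely.

```python
import numpy as np

# For a zero-mean AR(1), y_t = phi * y_{t-1} + eps_t, the point path forecast is
#   E[y_{t+i} | y_t] = phi**i * y_t,  i = 1, ..., h,
# and the path-average forecast is just the mean of those path points.
phi, y_t, h = 0.8, 2.0, 4

path_forecast = np.array([phi**i * y_t for i in range(1, h + 1)])  # (b)
avg_forecast = path_forecast.mean()                                # (c)

# Closed form for the average: (1/h) * phi * (1 - phi**h) / (1 - phi) * y_t.
avg_closed = (phi * (1 - phi**h) / (1 - phi)) * y_t / h
```

Knowing only the scalar average, by contrast, one cannot recover the path's shape -- hump, decay, or oscillation all average out.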

So I still prefer full path forecasts when feasible/credible, but I'm now much more appreciative of path averages.


## Wednesday, November 15, 2017

### FRB St. Louis Forecasting Conference

Got back a couple days ago. Great lineup. Wonderful to see such sharp focus. Many thanks to FRBSL and the organizers (Domenico Giannone, George Kapetanios, and Mike McCracken). I'll hopefully blog on one or two of the papers shortly. Meanwhile, the program is here.
