Slide 1: Modeling and Forecasting Volatility
Presenter
Charis Christofides
Joint Vienna Institute / IMF ICD
Macro-econometric Forecasting and Analysis
JV16.12, L10, Vienna, Austria, May 24, 2016
This training material is the property of the International Monetary Fund (IMF) and is intended for use
in IMF Institute courses. Any reuse requires the permission of the IMF Institute.
Slide 2: Outline
Introduction: Why ARCH?
ARCH Models
Extensions: GARCH, T-GARCH, Q-GARCH, GARCH-M, Box-Cox GARCH
Estimation
Multivariate GARCH
Models: Diagonal Vech, BEKK and CCC
Application: Value-at-Risk (VaR)
Appendix
Slide 4: Why ARCH?
ARMA and VAR models are based on the conditional mean of the distribution, where conditioning is on lagged values of the dependent variable.
The conditional variance of the distribution is assumed to be time-invariant (i.e., homoskedastic).
In addition, if the error term is assumed to be normal, the conditional distribution (and hence the marginal and joint distributions) is Gaussian.
Are these properties supported by real data?
Slide 6: Dow Jones
Symmetric Shocks?
Homoskedastic?
Slide 7: U.S. Unemployment rate vs. stock market volatility, 1929-2010
Slide 8: U.S. Realized Volatility (kernel based), 1997-2009
Slide 9: An example
Let us apply Box-Jenkins methods to a real time series, namely, weekly returns on the S&P 500 from April 1, 1986 to December 14, 2007.
Slide 10: Example (cont.)
Note:
Tranquil period
Volatile period
Slide 11: Example (cont.)
Homoskedasticity?
Symmetry?
Tranquil period
Volatile period
Slide 12: Example (cont.)
Both ACF and PACF are flat, suggesting p=0 and q=0 if we stay in the domain of ARMA.
Slide 13: Example (cont.)
Look at the histogram and some summary statistics of the data:
Asymmetry
Fat tails
Skewness $= E[(y-\mu)^3] / \mathrm{Var}[y]^{3/2}$,
Kurtosis $= E[(y-\mu)^4] / \mathrm{Var}[y]^{2}$
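The two formulas above translate directly into code. The following is a minimal sketch using population (1/n) moments; the `returns` list is made-up data for illustration, not the S&P 500 series shown on the slides.

```python
def skewness_kurtosis(y):
    """Return (skewness, kurtosis) from population (1/n) moments."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n
    third = sum((v - mu) ** 3 for v in y) / n
    fourth = sum((v - mu) ** 4 for v in y) / n
    return third / var ** 1.5, fourth / var ** 2


# Hypothetical returns with one large negative outlier (a "crash" week).
returns = [0.5, -0.3, 0.2, 0.1, -0.4, 0.3, -6.0, 0.4, -0.1, 0.2]
skew, kurt = skewness_kurtosis(returns)
print(f"skewness = {skew:.2f}, kurtosis = {kurt:.2f}")
```

A single large negative observation is enough to produce negative skewness and kurtosis above 3, the value for a normal distribution.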
Slide 14: Skewness
The shape of a uni-modal distribution can be symmetric or skewed to one side.
If the bulk of the data is at the left and the right tail is longer, the distribution is positively skewed; if the peak is toward the right and the left tail is longer, the distribution is negatively skewed.
If skewness is less than −1 or greater than +1, the distribution is highly skewed.
If skewness is between −1 and −½ or between +½ and +1, the distribution is moderately skewed.
If skewness is between −½ and +½, the distribution is approximately symmetric.
Slide 15: Kurtosis
Kurtosis measures the height and sharpness of the peak relative to the rest of the data.
Higher values indicate a higher, sharper peak; lower values indicate a lower, less distinct peak.
Increasing kurtosis is associated with a movement of the probability mass from the shoulders of a distribution into its center and tails.
Slide 16: Remarks
Gaussian ARMA models are not able to generate asymmetric or fat-tailed behavior.
The previous time series plot shows that there are turbulent periods where there is a sequence of very large movements in returns and tranquil periods where the magnitude of movements is relatively small.
This phenomenon is known as volatility clustering, which highlights the property that the volatility of financial returns is not constant over time, but appears to come in bursts.
Slide 17: Example
Variance of financial returns is often referred to as volatility.
To understand the dynamics of volatility, we can examine the time series behavior of the squared returns.
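To make the point about squared returns concrete, here is a minimal sketch that computes the lag-1 sample autocorrelation of a return series and of its squares. The series is stylized, made-up data: the signs follow a + + - - pattern so that the levels are nearly uncorrelated, while the magnitudes switch from a calm to a volatile regime.

```python
def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return numer / denom


# Stylized (made-up) returns: 40 calm observations, then 40 volatile ones.
signs = [1, 1, -1, -1] * 20
sizes = [0.1] * 40 + [2.0] * 40
returns = [s * m for s, m in zip(signs, sizes)]
squared = [r * r for r in returns]

rho_returns = autocorr(returns, 1)  # close to zero: no structure in the mean
rho_squared = autocorr(squared, 1)  # close to one: volatility clustering
print(f"returns: {rho_returns:.3f}, squared returns: {rho_squared:.3f}")
```

A flat ACF for returns combined with a strongly positive ACF for squared returns is the pattern that motivates ARCH-type models.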
Slide 18: EViews Example – Daily S&P 500 Returns
Slide 20: We’ll be able to make squared residuals white noise
Slide 21: Quality of TGARCH predictions: 1% quantiles, VaR(0.01), from August 1, 2007
Slide 24: ARCH(q)
AR(J)-ARCH(q)
Steady-State
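Slide 25: A special case: ARCH(1)
Properties [$I_{t-1} = \{y_1, \ldots, y_{t-1}\}$]
The squared error follows an AR(1) process with the AR coefficient $\gamma_1$
If $0 \le \gamma_1 < 1$, the ARCH(1) is covariance stationary
Kurtosis = $3(1-\gamma_1^2)/(1-3\gamma_1^2) > 3$ under conditionally normal errors, provided $3\gamma_1^2 < 1$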
Slide 26: Testing for the ARCH effects
Regress the squared residuals $\hat{\varepsilon}_t^2$ on a constant and their own lags $\hat{\varepsilon}_{t-1}^2, \ldots, \hat{\varepsilon}_{t-q}^2$.
Calculate $T R^2$ from this regression, which is an LM statistic.
Under the null hypothesis of no ARCH effect, its asymptotic distribution is chi-square with q degrees of freedom.
If there exists a value of q such that the LM statistic is larger than the critical value of the chi-square with q degrees of freedom, we reject the null hypothesis of no ARCH effect.
In practice, a large q may be needed.
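The test procedure can be sketched for the simplest case q = 1, where the auxiliary regression has a single lag and its R-squared equals the squared correlation between the squared residual and its first lag. This is an illustrative sketch only: the residual series is made-up data, T is taken as the number of observations used in the auxiliary regression, and 3.841 is the 5% critical value of the chi-square distribution with 1 degree of freedom.

```python
def arch_lm_stat_q1(residuals):
    """LM statistic T * R^2 from regressing e_t^2 on a constant and e_{t-1}^2."""
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]          # dependent variable and its first lag
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r_squared = sxy ** 2 / (sxx * syy)   # R^2 of a one-regressor OLS fit
    return n * r_squared


# Made-up residuals: a calm regime followed by a volatile regime.
signs = [1, 1, -1, -1] * 20
sizes = [0.1] * 40 + [2.0] * 40
residuals = [s * m for s, m in zip(signs, sizes)]

lm = arch_lm_stat_q1(residuals)
CHI2_1_CRIT_5PCT = 3.841
reject_no_arch = lm > CHI2_1_CRIT_5PCT
print(f"LM = {lm:.2f}, reject H0 of no ARCH effect: {reject_no_arch}")
```

For q greater than 1 the same idea applies, but the auxiliary regression needs a multiple-regression routine and the critical value changes with q.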
Slide 28: GARCH(p,q)
AR(J)-ARCH(q)
AR(J)-GARCH(p,q)
Steady-State
Additivity
No negativity
Slide 29: GARCH(1,1)
The most popular ARCH-type model
Volatility (σ)
VaR = 1.645σ
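Slide 30: Properties of GARCH(1,1)
1. The squared error follows an ARMA(1,1) with the AR coefficient equal to the sum of the ARCH and GARCH coefficients, and the MA coefficient equal to minus the GARCH coefficient
2. If the ARCH and GARCH coefficients sum to less than one, then the process is covariance stationary.
3. The volatility persistence is determined by the sum of the ARCH and GARCH coefficients, which empirically is often close to one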
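The role of persistence can be illustrated with a short sketch. It assumes the GARCH(1,1) variance equation is written as sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}; the names `omega`, `alpha` and `beta` and all parameter values are this example's own notation and are hypothetical. Multi-step variance forecasts then revert to the long-run variance omega / (1 - alpha - beta) at a geometric rate alpha + beta.

```python
def garch11_variance_forecasts(omega, alpha, beta, sigma2_next, horizon):
    """h-step-ahead conditional variance forecasts, h = 1..horizon."""
    persistence = alpha + beta
    long_run = omega / (1.0 - persistence)   # requires persistence < 1
    return [long_run + persistence ** (h - 1) * (sigma2_next - long_run)
            for h in range(1, horizon + 1)]


# Hypothetical parameters with high persistence (alpha + beta = 0.95).
omega, alpha, beta = 0.05, 0.10, 0.85
long_run_variance = omega / (1.0 - alpha - beta)

# Start from a variance four times the long-run level and watch it decay.
path = garch11_variance_forecasts(omega, alpha, beta, sigma2_next=4.0, horizon=50)
print(f"long-run variance = {long_run_variance:.2f}")
print(f"forecast at h=1: {path[0]:.2f}, h=10: {path[9]:.2f}, h=50: {path[-1]:.2f}")
```

The closer alpha + beta is to one, the more slowly the forecasts return to the long-run level.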
Slide 31: I-GARCH
If the coefficients of the GARCH model sum to 1, then the model has “integrated” volatility.
This is similar to having a random walk, but in volatility instead of the variable itself.
Model itself remains stationary (if constant variance model is stationary)
Likelihood-based inference remains valid (Lumsdaine, 1996 Econometrica)
Slide 32: Impulse response functions (IRFs) of GARCH(1,1)
The speed of decrease in the IRFs is determined by the volatility persistence, i.e. the sum of the ARCH and GARCH coefficients.
Slide 33: News impact curve (NIC)
NIC: the conditional variance as a function of the lagged shock, holding other variables constant.
The NIC of GARCH(1,1) is a parabola in the lagged shock, centred at zero.
It is symmetric.
Slide 34: Student t -- GARCH(1,1)
Compared to the Gaussian GARCH, the Student t-GARCH can generate fatter tails.
Slide 35: T-GARCH (Threshold GARCH): Asymmetric Volatility
NIC is asymmetric.
If the coefficient on the threshold (negative-shock) term is positive, bad news has a larger impact on future volatility than good news of the same magnitude
IRF depends on the type of news as well
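The difference between a symmetric and an asymmetric news impact curve can be sketched as follows. The threshold model is written here in one common form, in which an extra coefficient `gamma` applies to the squared shock only when the shock is negative; this parameterization and every parameter value are assumptions of the example, not taken from the slides.

```python
def nic_garch(eps, omega, alpha, beta, sigma2_bar):
    """News impact curve of a symmetric GARCH(1,1), lagged variance held fixed."""
    return omega + beta * sigma2_bar + alpha * eps ** 2


def nic_tgarch(eps, omega, alpha, gamma, beta, sigma2_bar):
    """News impact curve of a threshold GARCH: negative shocks get alpha + gamma."""
    extra = gamma if eps < 0 else 0.0
    return omega + beta * sigma2_bar + (alpha + extra) * eps ** 2


# Hypothetical parameter values; sigma2_bar is the fixed lagged variance.
omega, alpha, gamma, beta, sigma2_bar = 0.05, 0.05, 0.10, 0.85, 1.0

for eps in (-2.0, -1.0, 0.0, 1.0, 2.0):
    sym = nic_garch(eps, omega, alpha, beta, sigma2_bar)
    asym = nic_tgarch(eps, omega, alpha, gamma, beta, sigma2_bar)
    print(f"shock {eps:+.1f}: GARCH {sym:.3f}, T-GARCH {asym:.3f}")
```

With a positive `gamma`, a negative shock raises next-period variance by more than a positive shock of the same size, while the symmetric curve treats both equally.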
Slide 37: Q(uadratic)-GARCH: Asymmetric Volatility
NIC is asymmetric as long as the asymmetry (shift) parameter is nonzero
Slide 38: NIC of Quadratic GARCH vs. Symmetric GARCH
Slide 39: GARCH-M (GARCH in Mean)
An important application of the ARCH-type models is in modeling the trade-off between the mean and the volatility.
In financial economics, this is known as the risk-return trade-off.
The GARCH-M model includes the conditional variance (or standard deviation) as a regressor in the mean equation.
Slide 40: Box-Cox GARCH(1,1)
We model the power transformation of volatility.
Depending on the parameter values, the NIC is asymmetric.
This is a non-linear model
Slide 41: Summary: NICs of Alternative ARCHs
Inflation Volatility
Slide 42: Summing up (see Appendix for an expanded list)
Asymmetric Models
Non linear
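Slide 44: Maximum Likelihood
Maximize $L(y, \Phi)$ with respect to $\Phi$
[Figure: likelihood $L$ plotted against $\Phi$, with the maximum marked $L^*$]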
Slide 45: Maximum Likelihood (continued)
The log-likelihood decomposes into a “mean” and a “variance” component. Estimation has to be done numerically.
Parameters for the mean can be estimated consistently by OLS, but the estimates will not be as efficient as those that take account of heteroskedasticity.
Note: we could have a non-normal error (e.g., Student-t or GED-density)
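As an illustration of what is being maximized, here is a minimal sketch of the Gaussian log-likelihood for a GARCH(1,1) variance equation applied to a series of mean-equation residuals. The parameter names `omega`, `alpha`, `beta`, the choice to initialize the variance recursion at the sample variance, and the data are all assumptions of this example. A numerical optimizer would search over the parameters to maximize this function.

```python
import math


def garch11_loglik(params, eps):
    """Gaussian log-likelihood of residuals eps under a GARCH(1,1) variance."""
    omega, alpha, beta = params
    # Assumption: start the recursion at the sample variance of the residuals.
    sigma2 = sum(e * e for e in eps) / len(eps)
    loglik = 0.0
    for e in eps:
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma2) + e * e / sigma2)
        sigma2 = omega + alpha * e * e + beta * sigma2   # next period's variance
    return loglik


# Made-up residuals with unit magnitude.
eps = [1.0, -1.0, 1.0, -1.0]

ll_good = garch11_loglik((0.1, 0.1, 0.8), eps)   # implied variance stays near 1
ll_bad = garch11_loglik((10.0, 0.0, 0.0), eps)   # implied variance far too large
print(f"log-likelihood: {ll_good:.4f} vs {ll_bad:.4f}")
```

The parameter set whose implied variance matches the size of the residuals attains the higher log-likelihood, which is the comparison a hill-climbing or Newton-type routine repeats at each step.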
Slide 46: Optimization
Newton’s Method
Stochastic Newton Method
Gradient and Hill Climbing Techniques
Slide 47: Multiple Solutions
Monte Carlo
Genetic Algorithms
Slide 49: Multivariate GARCH Models
A natural extension of the time-varying variance models based on the univariate GARCH framework is the multivariate version, whereby both variances and covariances are modelled.
This class of models is known as Multivariate GARCH.
The variance-covariance matrix needs to be restricted to be positive definite for all t
The number of unknown parameters governing the behavior of the variances and covariances cannot be too large
Slide 50: Vech Model (2 variables)
Each conditional variance and covariance depends on the lagged conditional variances and covariance, on the lagged squared errors, and on the cross-product of the lagged errors.
A large number of parameters (in this case, 21)
Restrictions to ensure that the conditional covariance matrix is positive definite are complicated.
Slide 51: BEKK Model
C is an $N \times N$ lower triangular matrix of unknown parameters
A and B are $N \times N$ matrices, each containing $N^2$ unknown parameters, associated with the lagged disturbances and the lagged conditional covariance matrix
This formulation ensures that all variances (the diagonal elements of the conditional covariance matrix) are positive
It also allows shocks to variances of one variable to affect variances of the other variables (spillovers)
Still, a large number of parameters
Slide 52: Diagonal Vech Model (2 variables)
Variances and covariances are GARCH(1,1)
There are now 9 parameters instead of the 21 of the Vech model.
Restrictions imply that there are no interactions among variances
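The parameter counts quoted for the two-variable case follow from simple counting, sketched below for first-order models with `n` variables. The function names are hypothetical, and the BEKK count and the counts for more than two variables are standard formulas rather than figures from the slides.

```python
def vech_params(n):
    """Constants plus two full coefficient matrices on the stacked elements."""
    m = n * (n + 1) // 2      # distinct variances and covariances
    return m + 2 * m * m


def diagonal_vech_params(n):
    """Each variance/covariance has its own constant, ARCH and GARCH coefficient."""
    m = n * (n + 1) // 2
    return 3 * m


def bekk_params(n):
    """Lower-triangular C plus unrestricted n x n matrices A and B."""
    return n * (n + 1) // 2 + 2 * n * n


for n in (2, 3, 5):
    print(n, vech_params(n), diagonal_vech_params(n), bekk_params(n))
```

For two variables this reproduces the 21 parameters of the Vech model and the 9 of the Diagonal Vech model, and shows how quickly the unrestricted Vech count grows with the number of variables.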
Slide 53: CCC (Constant Conditional Correlation) Model
3 variables
The correlation coefficients are all time invariant
Slide 54: An extension: VAR + CCC
3 variables
Slide 55: A further extension: VAR + CCC + GARCH-M
Interactions between Markets
Contagion
Slide 56: An example of volatility “contagion”
Slide 57: 5. Application: Value-at-Risk (VaR)
Slide 58: VaR
What is the most I can lose on an investment?
VaR tries to provide an answer.
It is used most often by commercial and investment banks to capture the potential loss in value of their traded portfolios from adverse market movements over a specified period.
This potential loss can then be compared to their available capital and cash reserves to ensure that the losses can be covered without putting the firms at risk.
VaR is applied widely in capital regulation (Basel)
Slide 59: Value-at-Risk (VaR)
VaR summarizes the expected maximum loss over a time horizon at a given confidence level
The VaR approach tries to estimate the level of losses that will be exceeded over a given time period only with a certain (small) probability
For example, the 95% VaR loss is the amount of loss that will be exceeded only 5% of the time
Slide 60: Value-at-Risk (VaR) - Continued
The simplest assumption: daily gains/losses are normally distributed and independent.
Calculate VaR from the standard deviation of the portfolio change, σ, assuming the mean change in the portfolio value is 0:
1-day VaR = $N^{-1}(X)\,\sigma$, where $N^{-1}$ is the inverse of the standard normal CDF and $X$ is the confidence level.
The N-day VaR equals $\sqrt{N}$ times the 1-day VaR.
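The formula and the square-root-of-time rule can be sketched in a few lines. This is a minimal illustration under the stated assumptions (normal, independent daily changes with zero mean); the function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def normal_var(sigma, confidence, horizon_days=1):
    """VaR under normality: inverse normal CDF times sigma, scaled by sqrt(days)."""
    z = NormalDist().inv_cdf(confidence)   # e.g. about 1.645 for 95%
    return z * sigma * math.sqrt(horizon_days)


sigma_daily = 1.0   # hypothetical daily standard deviation, in $ million
var_95_1d = normal_var(sigma_daily, 0.95)
var_99_10d = normal_var(sigma_daily, 0.99, horizon_days=10)
print(f"1-day 95% VaR: {var_95_1d:.3f}, 10-day 99% VaR: {var_99_10d:.3f}")
```

The square-root scaling relies on the independence assumption; with volatility clustering, as in a GARCH model, it is only an approximation.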
Slide 61: Measuring VaR with historical data
[Figure: histogram of historical data; axis label: Probability]
Slide 62: Assuming a Normal distribution
Mean Return (μ)
Standard Deviation (σ)
Assume that asset returns are normally distributed
Their behavior can be fully described in terms of mean and standard deviation
Slide 63: VaR with Normally Distributed Returns
The probability of the return falling below a certain threshold depends on how many standard deviations the threshold is below the mean return
99% confidence level
Slide 64: Portfolio VaR
When we have more than one asset in our portfolio we can exploit the gains from diversification.
There are gains from diversification whenever the VaR for the portfolio is smaller than the sum of the stand-alone VaRs (i.e., the VaRs on the single assets).
The VaR for the portfolio equals the sum of the stand-alone VaRs if and only if the securities’ returns are perfectly positively correlated.
Slide 65: An Example
Let us consider the following investment
US$200 million invested in a 5-year zero-coupon U.S. Treasury bond
Examine VaR using a daily horizon
Assume that the mean daily return is 0.01%
Based on past several years of actual returns, the standard deviation is σ = 0.295%.
Slide 66: An Example (cont.)
Suppose we want to compute the 95% VaR.
The critical threshold is 1.65 standard deviations below the mean, i.e.,
0.0001 - 1.65 × 0.00295 = -0.00477
VaR = 0.00477 × 200m = 0.95m
Expect to lose $0.95 million or more on 1 day in 20
Slide 67: An Example of Portfolio VaR
Two securities
30-year zero-coupon U.S. Treasury bond
5-year zero-coupon U.S. Treasury bond
For simplicity assume that the expected return is zero
Invest US$100 million in the 30-year bond
Daily return volatility (std dev) σ1 = 1.409%
Invest US$200 million in the 5-year bond
Daily return volatility (std dev) σ2 = 0.295%
Slide 68: An Example of Portfolio VaR
95% confidence level
30 year zero VaR
1.65 * 0.01409 * 100m = $2,325,000
5 year zero VaR
1.65 * 0.00295 * 200m = $974,000
Sum of individual VaRs = US$ 3.299m
But US$3.299 million is not the VaR for the portfolio...why?
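Slide 69: VaR of the Portfolio
Suppose the correlation between the two bonds is ρ12 = 0.88
Remember that $\sigma_p^2 = (w_1\sigma_1)^2 + (w_2\sigma_2)^2 + 2\rho_{12}(w_1\sigma_1)(w_2\sigma_2)$, where $w_i$ is the amount invested in bond $i$
Portfolio variance:
(100*0.01409)^2 + (200*0.00295)^2 + 2(100*0.01409)(200*0.00295) * 0.88 = 3.797
Portfolio standard deviation:
σp = $1.948m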
Portfolio VaR = 1.65 * 1.948m = $3.214m
This is different from the sum of VaRs
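The two-bond calculation can be reproduced with a short sketch that uses the positions, volatilities, correlation and 1.65 multiplier from the example. The function name is hypothetical. The code carries full precision, so the portfolio VaR comes out near 3.215 rather than the 3.214 obtained after rounding the standard deviation to 1.948.

```python
import math


def two_asset_var(positions, vols, rho, z=1.65):
    """Portfolio VaR for two assets under normality, in the units of positions."""
    s1 = positions[0] * vols[0]   # dollar volatility of asset 1
    s2 = positions[1] * vols[1]   # dollar volatility of asset 2
    variance = s1 ** 2 + s2 ** 2 + 2.0 * rho * s1 * s2
    return z * math.sqrt(variance)


positions = (100.0, 200.0)    # US$ million in the 30-year and 5-year bonds
vols = (0.01409, 0.00295)     # daily return volatilities

standalone_sum = 1.65 * positions[0] * vols[0] + 1.65 * positions[1] * vols[1]
portfolio = two_asset_var(positions, vols, rho=0.88)
perfectly_correlated = two_asset_var(positions, vols, rho=1.0)

print(f"sum of stand-alone VaRs: {standalone_sum:.3f}m")
print(f"portfolio VaR (rho=0.88): {portfolio:.3f}m")
print(f"portfolio VaR (rho=1.00): {perfectly_correlated:.3f}m")
```

The portfolio VaR falls below the sum of the stand-alone VaRs for any correlation below one, and the two coincide only at a correlation of exactly one.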
Slide 70: The problem with Normality: Kurtosis
Extreme asset price changes occur more often than the normal distribution predicts.
Excess kurtosis (fat tails)
Slide 71: Fat Tails and underestimation of VaR
If we assume that returns are normally distributed when they are not, we underestimate the VaR
VaR with actual return distribution
VaR with normal returns
Slide 72: Backtesting
Model backtesting involves systematic comparisons of the calculated VaRs with the subsequent realized profits and losses.
With a 95% VaR bound, expect 5% of losses greater than the bound
Example: Approximately 12 days out of 250 trading days
If the actual number of exceptions is “significantly” higher than the number expected at the chosen confidence level, the model may be inaccurate.
Therefore, in addition to the risk predicted by the VaR, there is also “model risk”
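One simple way to judge whether the number of exceptions is "significantly" too high is a binomial tail probability, sketched below. This is an illustrative approach that assumes exceptions are independent across days; the function names and the small profit-and-loss sample are hypothetical.

```python
import math


def count_exceptions(pnl, var_forecasts):
    """Number of days on which the loss exceeded the VaR forecast."""
    return sum(1 for p, v in zip(pnl, var_forecasts) if p < -v)


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


# Tiny made-up sample: daily P&L against a constant VaR of 1.0.
exceptions_in_sample = count_exceptions([-2.0, 1.0, -0.5], [1.0, 1.0, 1.0])

# With a 95% VaR over 250 days, 12.5 exceptions are expected on average.
n_days, p_exceed = 250, 0.05
p_value_13 = binomial_tail(13, n_days, p_exceed)   # unremarkable outcome
p_value_25 = binomial_tail(25, n_days, p_exceed)   # far too many exceptions
print(f"P(X >= 13) = {p_value_13:.3f}, P(X >= 25) = {p_value_25:.5f}")
```

A very small tail probability for the observed count suggests that the VaR model understates risk.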
Slide 73: Relevance: Basel VaR Guidelines
VaR computed daily, holding period is 10 days.
The confidence level is 99 percent
Banks are required to hold capital in proportion to the losses that can be expected to occur more often than once every 100 periods
At least 1 year of data to calculate parameters
Parameter estimates updated at least quarterly
Capital provision is the greater of:
(a) the previous day’s VaR
(b) 3 times the average of the daily VaR for the preceding 60 business days, plus a factor based on backtesting results
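The capital rule can be sketched as a one-line comparison. The sketch reads the backtesting factor as an add-on to the multiplier of 3, which is an interpretation made for this example; the function name, the `plus_factor` argument and the VaR figures are hypothetical.

```python
def capital_requirement(var_history, plus_factor=0.0):
    """Greater of yesterday's VaR and (3 + plus_factor) times the 60-day average.

    var_history: daily VaR figures, oldest first, at least 60 entries.
    """
    previous_day = var_history[-1]
    average_60 = sum(var_history[-60:]) / 60.0
    return max(previous_day, (3.0 + plus_factor) * average_60)


# Stable VaR of 1.0: the 60-day average term dominates.
stable = capital_requirement([1.0] * 60)

# A spike on the last day: the previous day's VaR dominates.
spike = capital_requirement([1.0] * 59 + [5.0])

print(f"stable: {stable:.2f}, after spike: {spike:.2f}")
```

In calm periods the multiplied average is the binding term, so the previous day's VaR matters only after a sharp jump in measured risk.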
Slide 74: Summing up
A host of research has examined
a. how best to compute VaR with assumptions other than the standardized normal
b. how to obtain more reliable variance and covariance values to use in the VaR calculations.
Here Multivariate GARCH models play an important role in assessing both portfolio risk and diversification benefits.
We will see this in the forthcoming workshop
Slide 76: Appendix – GARCH univariate families
Slide 77: Source: Bollerslev 2010, Engle Festschrift