library(reshape2)
source("common/functions.r", chdir = TRUE)
library(nloptr)
suppressPackageStartupMessages(library(rugarch))
suppressPackageStartupMessages(library(lubridate))
suppressPackageStartupMessages(library(zoo))

18 Manual estimation
When estimating the volatility models in Chapter 16, we used the rugarch library. It is very comprehensive and provides almost all the functionality needed to estimate volatility models.
However, it can be beneficial to have a deeper understanding of how estimation actually works, and that is what we provide in this chapter. We manually implement two features that rugarch provides automatically: the programming of the likelihood function itself, and the use of optimisation algorithms to maximise that likelihood.
The benefit of such manual implementation is that we gain familiarity with optimisation procedures, which are common in quantitative finance. In many cases, we do not have the benefit of high-quality libraries and instead have to implement the estimation procedure ourselves. Throughout, this chapter builds on the foundation laid in Chapter 16.
18.1 Data and libraries
data = ProcessRawData()
y = data$sp500$y
y = y - mean(y)

18.2 Optimisers
Maximum likelihood estimation requires numerical optimisation to find the parameter values that best fit the data. Understanding optimisation algorithms helps with diagnosing estimation problems and choosing appropriate solvers for different situations.
Recall the discussion in Chapter 16 about maximising the likelihood function. The algorithm used to do that maximisation is called an optimiser.
The base R distribution provides a function called optim() for optimising functions. It offers several algorithms, including the well-established and robust Nelder-Mead. That may not be the fastest algorithm, but it has the advantages of not depending on derivatives and of reliably converging to a nearby optimum. Nelder-Mead also has at least two disadvantages: it requires more steps to find the optimum than some other algorithms, and it only performs local optimisation, meaning it finds the nearest peak rather than the global one.
There is an alternative library called NLopt that provides a large number of state-of-the-art optimisers. NLopt can be used with almost every programming language, and the R implementation is nloptr.
Using nloptr involves more setup steps than the built-in optim().
18.2.1 Optimisers minimise
One thing to keep in mind is that optimisers minimise functions by default, while here we need to maximise the likelihood. This means that when we return a function value, we have to invert its sign before returning it.
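As a minimal sketch with a made-up objective: to maximise a function g, we hand the optimiser its negation.

g = function(x) -(x - 2)^2                              # Toy function we want to maximise
res = optim(0, fn = function(x) -g(x), method = "BFGS") # Minimise -g to maximise g
res$par                                                 # Maximiser, approximately 2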
18.2.2 Ensuring positive parameters
In our case, the parameters that enter the likelihood function must be positive. There are at least two ways to ensure positivity. The easiest is to take the absolute value of the parameters inside the likelihood function, but we can often also impose bound constraints when calling the optimiser. We show both below.
18.2.3 Optimisation in R
Suppose we want to minimise some function f(par) of three parameters.
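The calls below are written against a generic f. To make them runnable, one could use a hypothetical stand-in such as this simple quadratic, whose minimum lies inside the bounds used later; any function of three parameters would do.

# Hypothetical stand-in for f: a quadratic with minimum at (0.1, 0.3, 0.5)
f = function(par) sum((par - c(0.1, 0.3, 0.5))^2)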
18.2.3.1 Built-in optim
The easiest way is to use the built-in optim() function, which by default uses the Nelder-Mead algorithm. It returns a list that contains the parameter estimates, the function value at the optimum and some other diagnostics.
res = optim(c(0.000001, 0.1, 0.85), fn = f) # Optimisation
18.2.3.2 NLopt
NLopt is a bit more complicated. We first have to set up a list of control parameters for the estimation, like one of the following:
opts = list("algorithm" = "NLOPT_GN_CRS2_LM", "xtol_rel" = 1.0e-8, maxeval = 1e5)
opts = list("algorithm" = "NLOPT_GN_DIRECT_L", "xtol_rel" = 1.0e-8, maxeval = 1e5)
opts = list("algorithm" = "NLOPT_LN_NELDERMEAD", "xtol_rel" = 1.0e-8, maxeval = 1e5)

Here we see three different algorithm choices. We also set the relative tolerance xtol_rel and the maximum number of times the function can be called, maxeval.
After that, we call the actual optimisation nloptr(). We set the starting values x0, the function name eval_f, the lower and upper bounds on the parameters lb and ub, and the control list.
res = nloptr(x0 = c(0.000001, 0.1, 0.85),
             eval_f = f,
             lb = c(0, 0, 0),
             ub = c(1, 1, 1),
             opts = opts)

It returns a list that contains the parameter estimates, the function value at the optimum and some other diagnostics.
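For example, the estimates are returned in res$solution, the function value at the optimum in res$objective, and a convergence code in res$status:

cat("Estimates:", res$solution, "\n",
    "Objective:", res$objective, "\n",
    "Status:", res$status, "\n")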
18.3 Coding the likelihood
The likelihood function forms the foundation of GARCH estimation. By coding it manually, we gain insight into the mathematical structure of these models and develop skills needed to implement custom specifications or modifications.
18.3.1 Issues
Several issues arise when coding the likelihood:
- Value of \(\sigma_1^2\): We need to assign a value to \(\sigma_1^2\), the initial value of the conditional variance. In general, we choose the unconditional variance of the data for this;
- Initial values of the parameters: Optimisation works iteratively, so we must assign initial values to the parameters, making sure they comply with the function's restrictions and the parameter bounds. This choice has little effect on the final parameter values but can have a significant effect on computing time.
We also need to pick an optimisation method and algorithm.
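For instance, under the choices just described, one might write the following (the variable names here are illustrative; the estimation code later in this chapter inlines these values):

sigma2.init = var(y)          # sigma_1^2 set to the unconditional variance of the data
par0 = c(0.000001, 0.1, 0.85) # initial values for omega, alpha and beta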
18.3.2 Normal GARCH
The likelihood function and its derivation can be found in Chapter 2 of the book and slides.
\[ \begin{split} \log\mathcal{L} &= \underbrace{-\frac{T-1}{2}\log(2\pi)}_{\text{Constant}} \\ &\quad - \frac{1}{2}\sum_{t=2}^{T}\left(\log(\omega + \alpha y^2_{t-1} + \beta \hat\sigma^2_{t-1}) + \frac{y^2_t}{\omega + \alpha y^2_{t-1} + \beta \hat\sigma^2_{t-1}}\right) \end{split} \]
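The loop-based implementation of this likelihood appears in Section 18.4 below. Since the variance recursion is a linear filter, the same likelihood can also be computed without an explicit loop. The following is a vectorised sketch under our own function name, assuming y holds the de-meaned returns loaded above:

# Vectorised normal GARCH(1,1) log-likelihood, a sketch using stats::filter()
LL.GARCH.vec = function(par) {
  y2 = y^2
  T = length(y2)
  # sigma_t^2 for t = 2,...,T from sigma_t^2 = omega + alpha y_{t-1}^2 + beta sigma_{t-1}^2,
  # started at the unconditional variance
  s2 = stats::filter(par[1] + par[2] * y2[-T], filter = par[3],
                     method = "recursive", init = var(y))
  # Returns the log-likelihood itself; negate it before handing it to a minimiser
  -0.5 * ((T - 1) * log(2 * pi) + sum(log(s2) + y2[-1] / s2))
}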
18.3.3 tGARCH
The Student-t density is given by \[ \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\Gamma\left(\frac{\nu}{2}\right)} \left(1+\frac{x^2}{\nu}\right)^{-\frac{\nu+1}{2}} \]
We want to make two alterations:
- Make it standardised;
- Write it in terms of a time-changing volatility model.
This gives us
\[ \frac{1}{\sigma_t} \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{(\nu-2)\pi}\Gamma\left(\frac{\nu}{2}\right)} \left(1+\frac{y_t^2}{\sigma_t^2(\nu-2)}\right)^{-\frac{\nu+1}{2}} \]
We then take logs, which splits the expression into two parts: the constant and the time-changing part.
The constant is
\[ \log\Gamma\left(\frac{\nu+1}{2}\right) - \frac{1}{2}\log((\nu-2)\pi) - \log\Gamma\left(\frac{\nu}{2}\right) \] while the time-changing part is
\[ -\log\sigma_t - \frac{\nu+1}{2}\left(\log\left(\sigma_t^2(\nu-2)+y_t^2\right) - \log\left(\sigma_t^2(\nu-2)\right)\right) \]
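As a quick sanity check of the standardised density, we can compare the formula above (with \(\sigma_t = 1\)) against the built-in dt(), rescaled to unit variance. The function names here are our own:

# Standardised Student-t density built from dt(), rescaled to unit variance
dstd.t = function(x, nu) dt(x * sqrt(nu / (nu - 2)), df = nu) * sqrt(nu / (nu - 2))
# The same density written out from the formula above, with sigma_t = 1
dstd.manual = function(x, nu) gamma((nu + 1) / 2) / (sqrt((nu - 2) * pi) * gamma(nu / 2)) *
  (1 + x^2 / (nu - 2))^(-(nu + 1) / 2)
dstd.t(1.5, 6) - dstd.manual(1.5, 6) # effectively zero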
18.4 Implementation
We now combine the likelihood functions with optimisation algorithms to create complete estimation procedures. This section demonstrates how to handle practical issues like parameter constraints, computational efficiency, and result validation.
18.4.1 Issues
Three issues arise.
First, we must ensure the parameters are positive, which we do below both by par = abs(par) and by setting lb and ub in nloptr. We do not need both approaches, but since we are comparing two ways of doing the optimisation, we keep both here.
Second, we want to avoid repeating calculations. The likelihood function may be called a large number of times, so we do not want to square the returns, count the observations, or compute the variance more often than necessary. We therefore pre-calculate everything we can, so that only necessary calculations occur inside the loop.
Finally, we need to retain the last conditional variance. While we could write more sophisticated code for that, here we use the simple approach of writing it into global memory with <<-. This is not the recommended approach, but it can be quite convenient; a cleaner alternative is sketched below.
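One such cleaner alternative, sketched here with our own helper name, is to re-run the variance recursion once at the estimated parameters and return the final conditional variance directly, avoiding the global side effect:

# Re-run the GARCH(1,1) variance recursion at given parameters and
# return the last conditional variance, avoiding the global <<- assignment
GARCH.lastvar = function(par, y2, s2init) {
  s2 = s2init
  for (i in 2:length(y2)) s2 = par[1] + par[2] * y2[i - 1] + par[3] * s2
  return(s2)
}
# e.g. GARCH.lastvar(res$solution, y^2, var(y))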
18.4.2 Normal GARCH
y2 = y^2        # Pre-calculate squared returns
sigma2 = var(y) # Initial value sigma_1^2, the unconditional variance
T = length(y2)

LL.GARCH = function(par) {
  par = abs(par) # Ensure positivity of omega, alpha and beta
  omega = par[1]
  alpha = par[2]
  beta = par[3]
  loglikelihood = 0
  # The first pass uses the global sigma2 = var(y); later passes use the local update
  for (i in 2:T) {
    sigma2 = omega + alpha * y2[i - 1] + beta * sigma2
    loglikelihood = loglikelihood + log(sigma2) + y2[i] / sigma2
  }
  loglikelihood = -0.5 * loglikelihood - (T - 1) / 2 * log(2 * pi)
  sigma2Last <<- sigma2  # Keep the last conditional variance
  return(-loglikelihood) # Negate, since the optimisers minimise
}
res = optim(c(0.000001, 0.1, 0.85), fn = LL.GARCH)
res$par = abs(res$par)

opts = list("algorithm" = "NLOPT_LN_NELDERMEAD", "xtol_rel" = 1.0e-8, maxeval = 1e4)
res = nloptr(x0 = c(0.000001, 0.1, 0.85),
             eval_f = LL.GARCH,
             lb = c(0, 0, 0),
             ub = c(1, 1, 1),
             opts = opts)
spec.2 = ugarchspec(
  variance.model = list(garchOrder = c(1, 1)),
  mean.model = list(armaOrder = c(0, 0), include.mean = FALSE)
)
Res = ugarchfit(spec = spec.2, data = y)

Warning in .sgarchfit(spec = spec, data = data, out.sample = out.sample, :
ugarchfit-->warning: solver failer to converge.

Note that the rugarch solver fails to converge here, so only our manual estimates appear in the comparison below.
cat("Omega:", res$solution[1], coef(Res)[1],"\n",
"Alpha:", res$solution[2], coef(Res)[2], "\n",
"Beta:", res$solution[3], coef(Res)[3],"\n")Omega: 1.8976e-06
Alpha: 0.105267
Beta: 0.8793471
18.4.3 tGARCH
y2 = y^2        # Pre-calculate squared returns
sigma2 = var(y) # Initial value sigma_1^2, the unconditional variance
T = length(y2)

LL.tGARCH = function(par) {
  par = abs(par) # Ensure positivity
  omega = par[1]
  alpha = par[2]
  beta = par[3]
  nu = par[4]
  n1 = nu + 1.0
  n2 = nu - 2.0
  loglikelihood = 0
  for (i in 2:T) {
    sigma2 = omega + alpha * y2[i - 1] + beta * sigma2
    loglikelihood = loglikelihood + log(sigma2) +
      n1 * (log(n2 * sigma2 + y2[i]) - log(n2 * sigma2))
  }
  loglikelihood = -loglikelihood / 2.0
  # lgamma() is numerically safer than log(gamma()) for large nu
  loglikelihood = loglikelihood + (T - 1) * (lgamma(n1 / 2) -
    lgamma(nu / 2) - 0.5 * log(pi * n2))
  sigma2Last <<- sigma2  # Keep the last conditional variance
  return(-loglikelihood) # Negate, since the optimisers minimise
}
res = optim(c(0.000001, 0.1, 0.85, 6), fn = LL.tGARCH) # Parameter estimation
res$par = abs(res$par)

# Alternative optimiser options:
# opts = list("algorithm" = "NLOPT_GN_CRS2_LM", "xtol_rel" = 1.0e-6, maxeval = 1e6)
opts = list("algorithm" = "NLOPT_LN_NELDERMEAD", "xtol_rel" = 1.0e-8, maxeval = 1e5)
res_t = nloptr(x0 = c(0.000001, 0.1, 0.85, 6),
               eval_f = LL.tGARCH,
               lb = c(0, 0.02, 0.6, 2),
               ub = c(1e-4, 0.2, 1, 1e3),
               opts = opts)
cat("sigma2:", sigma2Last, "\n")sigma2: 6.464978e-05
spec.3 = ugarchspec(
  variance.model = list(garchOrder = c(1, 1)),
  mean.model = list(armaOrder = c(0, 0), include.mean = FALSE),
  distribution.model = "std"
)
Res = ugarchfit(spec = spec.3, data = y)
cat("Omega:", res_t$solution[1], coef(Res)[1], "\n",
    "Alpha:", res_t$solution[2], coef(Res)[2], "\n",
    "Beta:", res_t$solution[3], coef(Res)[3], "\n",
    "nu:", res_t$solution[4], coef(Res)[4], "\n")

Omega: 1.156451e-06 1.16291e-06
Alpha: 0.09978952 0.1000315
Beta: 0.8949667 0.8946775
nu: 6.304793 6.31163

The manual estimates are very close to those reported by rugarch.
18.5 Exercise
Implement manual estimation of a GJR-GARCH(1,1) model with asymmetric effects. Modify the likelihood function above to include the leverage parameter γ, and compare your results with rugarch's gjrGARCH specification. Discuss the additional optimisation challenges introduced by the asymmetric term.
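As a reminder, the GJR-GARCH(1,1) conditional variance takes the standard form \[ \sigma_t^2 = \omega + \left(\alpha + \gamma\, \mathbb{1}_{\{y_{t-1}<0\}}\right) y_{t-1}^2 + \beta \sigma_{t-1}^2 \] so the only change to the likelihood loop is the indicator term on the lagged squared return.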