7 Gaussian HMM

Univariate Gaussian HMMs with TMB are very similar to the previous Poisson HMM. The main changes are the distribution parameters being changed from \({\boldsymbol\lambda}\) to \({\boldsymbol\mu}\) and \({\boldsymbol\sigma}\). In turn, these cause the density function to change from dpois to dnorm (and from rpois to rnorm for simulations).

Many functions available in functions/utils.R have been adapted for the Gaussian case in functions/norm_utils.R.

We detail below an example of a Gaussian HMM with TMB.

7.1 Dataset

The dataset we show in this example contains log-returns of the SP500 dataset pulled from Yahoo finance. These log-returns are generated with the following code.

library(tidyverse)
# Loading the sp500 dataset
SP500 <- read_csv("data/S&P500.csv") %>%
  mutate(lag = lag(`Adj Close`),
         lr = log(`Adj Close`/lag)) %>%
  select(lr) %>%
  drop_na()
SP500 <- 100 * SP500$lr
save(SP500, file = "data/SP500.RData")

7.2 Likelihood function

The Gaussian negative log-likelihood in TMB can be coded as

#include <TMB.hpp>
#include "../functions/norm_utils.cpp"

// Likelihood for a normal hidden markov model. 
template<class Type>
Type objective_function<Type>::operator() ()
{
  
  // Data
  DATA_VECTOR(x);          // timeseries vector
  DATA_INTEGER(m);         // Number of states m
  
  // Parameters
  PARAMETER_VECTOR(tmu);            // conditional means
  PARAMETER_VECTOR(tsigma);         // conditional log_sd's
  PARAMETER_VECTOR(tgamma);         // m(m-1) working parameters of TPM
  // PARAMETER_VECTOR(tdelta);      // m-1 working parameters of initial distribution
  
 
  // Transform working parameters to natural parameters:
  vector<Type> mu = tmu;
  vector<Type> sigma = tsigma.exp();
  matrix<Type> gamma = gamma_w2n(m, tgamma);
  // vector<Type> delta = delta_w2n(m, tdelta);
  
  // Construct stationary distribution 
  vector<Type> delta = stat_dist(m, gamma);
  
  // Get number of timesteps (n)
  int n = x.size();
  
  // Evaluate conditional distribution: Put conditional probabilities
  // of observed x in n times m matrix (one column for each state)
  matrix<Type> emission_probs(n, m);
  matrix<Type> row1vec(1, m);
  row1vec.setOnes();
  for (int i = 0; i < n; i++) {
    if (x[i] != x[i]) { // f != f returns true if and only if f is NaN. 
      // Replace missing values (NA in R, NaN in C++) with 1
      emission_probs.row(i) = row1vec;
    }
    else {
      emission_probs.row(i) = dnorm(x(i), mu, sigma, false);
    }
  }
  
  // Corresponds to (Zucchini et al., 2016, p 333)
  matrix<Type> foo, P;
  Type mllk, sumfoo, lscale;

  foo = (delta * vector<Type>(emission_probs.row(0))).matrix();
  sumfoo = foo.sum();
  lscale = log(sumfoo);
  foo.transposeInPlace();
  foo /= sumfoo;
  for (int i = 2; i <= n; i++) {
    P = emission_probs.row(i - 1);
    foo = ((foo * gamma).array() * P.array()).matrix();
    sumfoo = foo.sum();
    lscale += log(sumfoo);
    foo /= sumfoo;
  }
  mllk = -lscale;
  
  // Use adreport on variables for which we want standard errors
  ADREPORT(mu);
  ADREPORT(sigma);
  ADREPORT(gamma);
  ADREPORT(delta);
  
  // Variables we need for local decoding and in a convenient format
  REPORT(mu);
  REPORT(sigma);
  REPORT(gamma);
  REPORT(delta);
  REPORT(n);
  // REPORT(emission_probs);
  REPORT(mllk);
  
  return mllk;
}

and requires the following utility functions.

// Function transforming working parameters in initial distribution
// to natural parameters
template<class Type>
vector<Type> delta_w2n(int m, vector<Type> tdelta) {

  vector<Type> delta(m);
  vector<Type> foo(m);
  
  if (m == 1)
    return Type(1);
  
  // set first element to one.
  // Fill in the last m - 1 elements with working parameters
  // and take exponential
  foo << Type(1), tdelta.exp();

  // normalize
  delta = foo / foo.sum();

  return delta;
}

// Function transforming the working parameters in TPM to
// natural parameters (w2n)
template<class Type>
matrix<Type> gamma_w2n(int m, vector<Type> tgamma) {

  // Construct m x m identity matrix
  matrix<Type> gamma(m, m);
  gamma.setIdentity();
  
  if (m == 1)
    return gamma;

  // Fill offdiagonal elements with working parameters column-wise:
  int idx = 0;
  for (int i = 0; i < m; i++){
    for (int j = 0; j < m; j++){
      if (j != i){
        // Fill gamma according to mapping and take exponential
        gamma(j, i) = tgamma.exp()(idx);
        idx++;
      }
    }
  }
  // Normalize each row:
  vector<Type> cs = gamma.rowwise().sum();
  for (int i = 0; i < m; i++) gamma.row(i) /= cs[i];

  return gamma;
}

// Function computing the stationary distribution of a Markov chain
template<class Type>
vector<Type> stat_dist(int m, matrix<Type> gamma) {
  
  // Construct stationary distribution
  matrix<Type> I(m, m);
  matrix<Type> U(m, m);
  matrix<Type> row1vec(1, m);
  U = U.setOnes();
  I = I.setIdentity();
  row1vec.setOnes();
  matrix<Type> A =  I - gamma + U;
  matrix<Type> Ainv = A.inverse();
  matrix<Type> deltamat = row1vec * Ainv;
  vector<Type> delta = deltamat.row(0);
  
  return delta;
}

7.3 Estimation

The following R code illustrates how to compute the estimates and standard errors, using the objective function written above.

# Load packages and utility functions
source("code/packages.R")
source("functions/norm_utils.R")

# Compile and load the objective function for TMB
TMB::compile("code/norm_hmm.cpp")
## [1] 0
dyn.load(dynlib("code/norm_hmm"))

# Parameters and covariates
load("data/SP500.RData")
m <- 2
# Means
mu <- c(-2 * mean(SP500), 2 * mean(SP500))
# Standard deviations
sigma <- c(0.5 * sd(SP500), 2 * sd(SP500))
# TPM
gamma <- matrix(c(0.9, 0.1,
                  0.1, 0.9), nrow = 2, byrow = TRUE)

# Parameters & covariates for TMB
working_params <- norm.HMM.pn2pw(m = m, mu = mu, sigma = sigma, gamma = gamma)
TMB_data <- list(x = SP500, m = m)
obj <- MakeADFun(TMB_data, working_params, DLL = "norm_hmm", silent = TRUE)

# Estimation
obj_tmb <- MakeADFun(data = TMB_data,
                     parameters = working_params,
                     DLL = "norm_hmm",
                     silent = TRUE)
mod_tmb <- nlminb(start = obj_tmb$par,
                  objective = obj_tmb$fn,
                  gradient = obj_tmb$gr,
                  hessian = obj_tmb$he)

Note that the script above makes a simple educated guess on the true parameters in order to choose initial parameters.

7.4 Results

Estimates can be shown here along with their standard errors, as was done previously in the Poisson case. As before, the matrix results are displayed in a column-wise format. This can be seen by comparing estimates with a more readable view.

summary(sdreport(obj_tmb), "report")
##          Estimate  Std. Error
## mu     0.06116848 0.005401313
## mu    -0.12288814 0.032944675
## sigma  0.69860750 0.005201810
## sigma  2.24762988 0.029916757
## gamma  0.98850975 0.001067611
## gamma  0.04374985 0.004126772
## gamma  0.01149025 0.001067611
## gamma  0.95625015 0.004126772
## delta  0.79199446 0.016439163
## delta  0.20800554 0.016439163
# Readable estimates
obj_tmb$report()
## $mu
## [1]  0.06116848 -0.12288814
## 
## $sigma
## [1] 0.6986075 2.2476299
## 
## $delta
## [1] 0.7921573 0.2078427
## 
## $n
## [1] 23349
## 
## $gamma
##            [,1]       [,2]
## [1,] 0.98852111 0.01147889
## [2,] 0.04374985 0.95625015
## 
## $mllk
## [1] 31516.82

An intuitive interpretation of these results is that on average, market prices either grow slowly (positive low mean) but surely (low variance) as people invest carefully, or the market prices decrease (negative larger mean) as people sell in panic for fear of losing too much (higher variance). For more details on this, see e.g. and get some information on bull and bear markets.

Note that the TPM is displayed column-wise.