4 Using TMB
We are now ready to present the main TMB
code for calculating the negative log-likelihood of an HMM.
4.1 Likelihood function
The function is defined in code/poi_hmm.cpp and imports the utility functions defined in functions/utils.cpp.
#include <TMB.hpp>
#include "../functions/utils.cpp"
// Likelihood for a poisson hidden markov model.
template<class Type>
<Type>::operator() ()
Type objective_function{
// Data
(x); // timeseries vector
DATA_VECTOR(m); // Number of states m
DATA_INTEGER
// Parameters
(tlambda); // conditional log_lambdas's
PARAMETER_VECTOR(tgamma); // m(m-1) working parameters of TPM
PARAMETER_VECTOR
// Uncomment only when using a non-stationary distribution
//PARAMETER_VECTOR(tdelta); // transformed stationary distribution,
// Transform working parameters to natural parameters:
<Type> lambda = tlambda.exp();
vector<Type> gamma = gamma_w2n(m, tgamma);
matrix
// Construct stationary distribution
<Type> delta = stat_dist(m, gamma);
vector// If using a non-stationary distribution, use this instead
//vector<Type> delta = delta_w2n(m, tdelta);
// Get number of timesteps (n)
int n = x.size();
// Evaluate conditional distribution: Put conditional
// probabilities of observed x in n times m matrix
// (one column for each state, one row for each datapoint):
<Type> emission_probs(n, m);
matrix<Type> row1vec(1, m);
matrix.setOnes();
row1vecfor (int i = 0; i < n; i++) {
if (x[i] != x[i]) { // f != f returns true if and only if f is NaN.
// Replace missing values (NA in R, NaN in C++) with 1
.row(i) = row1vec;
emission_probs}
else {
.row(i) = dpois(x[i], lambda, false);
emission_probs}
}
// Corresponds to (Zucchini et al., 2016, p 333)
<Type> foo, P;
matrix, sumfoo, lscale;
Type mllk
= (delta * vector<Type>(emission_probs.row(0))).matrix();
foo = foo.sum();
sumfoo = log(sumfoo);
lscale .transposeInPlace();
foo/= sumfoo;
foo for (int i = 2; i <= n; i++) {
= emission_probs.row(i - 1);
P = ((foo * gamma).array() * P.array()).matrix();
foo = foo.sum();
sumfoo += log(sumfoo);
lscale /= sumfoo;
foo }
= -lscale;
mllk
// Use adreport on variables for which we want standard errors
(lambda);
ADREPORT(gamma);
ADREPORT(delta);
ADREPORT
// Variables we need for local decoding and in a convenient format
(lambda);
REPORT(gamma);
REPORT(delta);
REPORT(n);
REPORT(emission_probs);
REPORT(mllk);
REPORT
return mllk;
}
Note that we use similar names as to the TMB side.
4.2 Optimization
Before we can fit the model, we need to load some necessary packages and data files. We also need to compile the C++
code and load the functions into our working environment in R
.
- We start by compiling the
C++
code for computing the likelihood and its gradients. Once it is compiled, we can load the TMB likelihood object intoR
– making it available fromR
.
# Load TMB and optimization packages
library(TMB)
# Run the C++ file containing the TMB code
::compile("code/poi_hmm.cpp")
TMB# Load it
dyn.load(dynlib("code/poi_hmm"))
Next, we need to load the necessary packages and the utility R
functions.
# Load the parameter transformation function
source("functions/utils.R")
- The data are part of a large data set collected with the “Track Your Tinnitus” (TYT) mobile application, a detailed description of which is presented in Pryss, Reichert, Langguth, et al. (2015) and Pryss, Reichert, Herrmann, et al. (2015). We analyze 87 successive days of the “arousal” variable, which is measured on a discrete scale. Higher values correspond to a higher degree of excitement, lower values to a more calm emotional state (for details, see Probst et al. 2016). The data can be loaded by a simple call.
load("data/tinnitus.RData")
Figure 4.1 presents the raw data in the Table below, which are also available for download at data/tinnitus.RData.
- Initialization of the number of states and starting (or initial) values for the optimization.
First, the number of states needs to be determined. As explained by Pohle et al. (2017a), Pohle et al. (2017b), and Zucchini, MacDonald, and Langrock (2016, sec. 6) (to name only a few), usually one would first fit models with a different number of states. Then, these models are evaluated e.g. by means of model selection criteria (as carried out by Leroux and Puterman 1992) or prediction performance (Celeux and Durand 2008).
As shown in the appendix, the model selection procedure shows that both AIC and BIC prefer a two-state model over a Poisson regression and over a model with three or four states. Consequently, we focus on the two-state model in the following.
The list object TMB_data
contains the data and the number of states.
# Model with 2 states
<- 2
m <- list(x = tinn_data, m = m) TMB_data
Secondly, initial values for the optimization procedure need to be defined. Although we will apply unconstrained optimization, we initialize the natural parameters, because this is much more intuitive and practical than handling the working parameters.
# Generate initial set of parameters for optimization
<- c(1, 3)
lambda <- matrix(c(0.8, 0.2,
gamma 0.2, 0.8), byrow = TRUE, nrow = m)
- Transformation from natural to working parameters
The previously created initial values are transformed and stored in the list parameters
for the optimization procedure.
# Turn them into working parameters
<- pois.HMM.pn2pw(m, lambda, gamma) parameters
- Creation of the
TMB
negative log-likelihood function with its derivatives
This object, stored as obj_tmb
requires the data, the initial values, and the previously created DLL as input.
Setting argument silent = TRUE
disables tracing information and is only used here to avoid excessive output.
<- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
This object also contains the previously defined initial values as a vector (par
) rather than a list.
The negative log-likelihood (fn
), its gradient (gr
), and Hessian (he
) are functions of the parameters (in vector form) while the data are considered fixed.
These functions are available to the user and can be evaluated for instance at the initial parameter values:
$par
obj_tmb## tlambda tlambda tgamma tgamma
## 0.000000 1.098612 -1.386294 -1.386294
$fn(obj_tmb$par)
obj_tmb## [1] 228.3552
$gr(obj_tmb$par)
obj_tmb## [,1] [,2] [,3] [,4]
## [1,] -3.60306 -146.0336 10.52832 -1.031706
$he(obj_tmb$par)
obj_tmb## [,1] [,2] [,3] [,4]
## [1,] 1.902009 -5.877900 -1.3799682 2.4054017
## [2,] -5.877900 188.088247 -4.8501589 2.3434284
## [3,] -1.379968 -4.850159 9.6066700 -0.8410438
## [4,] 2.405402 2.343428 -0.8410438 0.7984216
- Execution of the optimization
For this step we rely again on the optimizer implemented in the nlminb
function.
The arguments, i.e.~ initial values for the parameters and the function to be optimized, are extracted from the previously created TMB object.
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn)
mod_tmb # Check that it converged successfully
$convergence == 0
mod_tmb## [1] TRUE
It is noteworthy that various alternatives to nlminb
exist.
Nevertheless, we focus on this established optimization routine because of its high speed of convergence.
- Obtaining ML estimates
Obtaining the ML estimates of the natural parameters together with their standard errors is possible by using the previously introduced command sdreport
.
Recall that this requires the parameters of interest to be treated by the ADREPORT
statement in the C++
part. It should be noted that the presentation of the set of parameters gamma
below results from a column-wise representation of the TPM.
summary(sdreport(obj_tmb, par.fixed = mod_tmb$par), "report")
## Estimate Std. Error
## lambda 1.63641070 0.27758294
## lambda 5.53309626 0.31876141
## gamma 0.94980192 0.04374682
## gamma 0.02592209 0.02088689
## gamma 0.05019808 0.04374682
## gamma 0.97407791 0.02088689
## delta 0.34054163 0.23056401
## delta 0.65945837 0.23056401
Note that the table above also contains estimation results for \({\boldsymbol\delta}\) and accompanying standard errors, although \({\boldsymbol\delta}\) is not estimated, but derived from \({\boldsymbol\Gamma}\).
We provide further details on this aspect in Confidence intervals.
The value of the nll function in the minimum found by the optimizer can also be extracted directly from the object mod_tmb
by accessing the list element objective
:
$objective
mod_tmb## [1] 168.5361
- Use exact gradient and Hessian
In the optimization above we already benefited from an increased speed due to the evaluation of the nll in C++
compared to the forward algorithm being executed entirely in R
.
However, since TMB
computes the gradient and hessian of the likelihood, we can provide this information to the optimizer
This is in general advisable, because TMB
provides an exact value of both gradient and Hessian up to machine precision, which is superior to approximations used by the optimizing procedure.
Similar to the nll, both quantities can be extracted directly from the TMB
object obj_tmb
:
# The negative log-likelihood is accessed by the objective
# attribute of the optimized object
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn,
mod_tmb gradient = obj_tmb$gr, hessian = obj_tmb$he)
$objective
mod_tmb## [1] 168.5361
Note that passing the exact gradient and Hessian as provided by TMB
to nlminb
leads to the same minimum, i.e. value of the nll function, here.
Is it noteworthy that inconsistencies can happen in the estimates due to computer approximations.
The stationary distribution is a vector of probabilities and should sum to 1. However, it doesn’t always behave as expected.
<- summary(sdreport(obj_tmb), "report")
adrep <- adrep[rownames(adrep) == "delta", "Estimate"]
estimate_delta sum(estimate_delta)
## [1] 1
sum(estimate_delta) == 1 # The sum is displayed as 1 but is not 1
## [1] FALSE
As noted on Zucchini, MacDonald, and Langrock (2016, 159–60), ``the row sums of \({\boldsymbol\Gamma}\) will only approximately equal 1, and the components of the vector \({\boldsymbol\delta}\) will only approximately total 1. This can be remedied by scaling the vector \({\boldsymbol\delta}\) and each row of \({\boldsymbol\Gamma}\) to total 1’’. We do not remedy this issue because it provides no benefit to us, but this may lead to surprising results when checking equality.
This is likely due to machine approximations when numbers far apart from each other interact together. In R, a small number is not 0 but is treated as 0 when added to a much larger number.
This can result in incoherent findings when checking equality between 2 numbers.
1e-100 == 0 # Small numbers are 0
## [1] FALSE
1 + 1e-100) == 1
(## [1] TRUE
4.3 Basic nested model
specification
4.3.1 Principle
As a first step in building a nested model, we arbitrarily fix \(\lambda_1\) to the value 1.
# Get the previous values, and fix some
<- lambda
fixed_par_lambda 1] <- 1
fixed_par_lambda[# Transform them into working parameters
<- pois.HMM.pn2pw(m = m,
new_parameters lambda = fixed_par_lambda,
gamma = gamma)
Then, we instruct TMB
to read these custom parameters.
We indicate fixed values by mapping them to NA values, whereas the variable values need to be mapped to different factor variables.
Lastly, we specify this mapping with the map
argument when making the TMB
object.
<- list(tlambda = as.factor(c(NA, 1)),
map tgamma = as.factor(c(2, 3)))
<- MakeADFun(TMB_data, new_parameters,
fixed_par_obj_tmb DLL = "poi_hmm",
silent = TRUE,
map = map)
Estimation of the model and displaying the results is performed as usual.
<- nlminb(start = fixed_par_obj_tmb$par,
fixed_par_mod_tmb objective = fixed_par_obj_tmb$fn,
gradient = fixed_par_obj_tmb$gr,
hessian = fixed_par_obj_tmb$he)
summary(sdreport(fixed_par_obj_tmb), "report")
## Estimate Std. Error
## lambda 1.00000000 0.00000000
## lambda 5.50164872 0.30963641
## gamma 0.94561055 0.04791050
## gamma 0.02655944 0.02133283
## gamma 0.05438945 0.04791050
## gamma 0.97344056 0.02133283
## delta 0.32810136 0.22314460
## delta 0.67189864 0.22314460
Note that the standard error of \(\lambda_1\) is zero, because it is no longer considered a variable parameter and does not enter the optimization procedure.
4.3.2 Limits
This method cannot work in general for working parameters which are not linked to a single natural parameter. This is because only working parameters can be fixed with this method, but the working parameters of the TPM are not each linked to a single natural parameter. As an example, fixing the natural parameter \(\gamma_{11}\) is not equivalent to fixing any working parameter \(\tau_{ij}\). Hence, the TPM cannot in general be fixed, except perhaps via constrained optimization.
However, if conditions on the natural parameters can be translated to conditions on the working parameters, then there should not be any issue. We refer to the next section for more details.
4.3.3 Parameter equality
constraints
More complex constraints are also possible. For example, imposing equality constraints (such as \(\gamma_{11} = \gamma_{22}\)) requires the corresponding factor level to be identical for the concerned entries.
As a reminder, we defined the working parameters via \[ \tau_{ij} = \log(\frac{\gamma_{ij}}{1 - \sum_{k \neq i} \gamma_{ik}}) = \log(\gamma_{ij}/\gamma_{ii}), \text{ for } i \neq j \]
With a two-state Poisson HMM, the constraint \(\gamma_{11} = \gamma_{22}\) is equivalent to \(\gamma_{12} = \gamma_{21}\). Thus, the constraint can be transformed into \(\tau_{12} = \log(\gamma_{12}/\gamma_{11}) = \log(\gamma_{21}/\gamma_{22}) = \tau_{21}\).
The mapping parameters must be set to a common factor to be forced equal.
<- list(tlambda = as.factor(c(1, 2)),
map tgamma = as.factor(c(3, 3)))
<- MakeADFun(TMB_data, parameters,
fixed_par_obj_tmb DLL = "poi_hmm",
silent = TRUE,
map = map)
<- nlminb(start = fixed_par_obj_tmb$par,
fixed_par_mod_tmb objective = fixed_par_obj_tmb$fn,
gradient = fixed_par_obj_tmb$gr,
hessian = fixed_par_obj_tmb$he)
# Results + check that the constraint is respected
<- summary(sdreport(fixed_par_obj_tmb), "report")
results <- matrix(results[rownames(results) == "gamma", "Estimate"],
tpm nrow = m,
ncol = m,
byrow = FALSE) # Transformations are column-wise by default, be careful!
tpm## [,1] [,2]
## [1,] 0.96759216 0.03240784
## [2,] 0.03240784 0.96759216
1, 1] == tpm[2, 2]
tpm[## [1] TRUE
Similar complex constraints on the TPM can also be setup for HMMs with three or more states. However, it appears this can only be achieved in general when the constraint involves natural parameters of the same row (with the exception of a two-state model). We have not found a way to easily implement equality constraints between natural TPM parameters of different rows. A solution might be constrained optimization.
As an example, we will look at a three-state HMM with the constraint \(\gamma_{12} = \gamma_{13}\).
# Model with 2 states
<- 3
m <- list(x = tinn_data, m = m)
TMB_data
# Initial set of parameters
<- c(1, 3, 5)
lambda <- matrix(c(0.8, 0.1, 0.1,
gamma 0.1, 0.8, 0.1,
0.1, 0.1, 0.8), byrow = TRUE, nrow = m)
# Turn them into working parameters
<- pois.HMM.pn2pw(m, lambda, gamma)
parameters
# Build the TMB object
<- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
# Optimize
<- nlminb(start = obj_tmb$par,
mod_tmb objective = obj_tmb$fn,
gradient = obj_tmb$gr,
hessian = obj_tmb$he)
# Check convergence
$convergence == 0
mod_tmb## [1] TRUE
# Results
summary(sdreport(obj_tmb), "report")
## Estimate Std. Error
## lambda 8.281290e-01 5.418824e-01
## lambda 1.705082e+00 2.949769e-01
## lambda 5.514886e+00 3.080238e-01
## gamma 5.516919e-01 3.132490e-01
## gamma 4.596623e-09 5.115944e-05
## gamma 2.749570e-02 2.164849e-02
## gamma 1.989160e-01 2.698088e-01
## gamma 9.772963e-01 3.436716e-02
## gamma 2.453591e-10 2.730433e-06
## gamma 2.493922e-01 2.546636e-01
## gamma 2.270369e-02 3.436717e-02
## gamma 9.725043e-01 2.164849e-02
## delta 3.836408e-02 3.757898e-02
## delta 3.361227e-01 3.144598e-01
## delta 6.255132e-01 3.063220e-01
The transformed constraint becomes \(\tau_{12} = \log(\gamma_{12}/\gamma_{11}) = \log(\gamma_{13}/\gamma_{11}) = \tau_{13}\).
We need to be careful how we specify the constraint, because the vector tgamma
will be converted into a matrix column-wise since this is R’s default way to handle matrix-vector conversions.
The tgamma
matrix looks naturally like
\[\begin{pmatrix}
&\tau_{12}&\tau_{13}\\
\tau_{21}& &\tau_{23}\\
\tau_{31}&\tau_{32}&
\end{pmatrix}\]
As a vector in R
, it becomes \(\left(\tau_{21}, \tau_{31}, \tau_{12}, \tau_{32}, \tau_{13}, \tau_{23}\right)\).
Therefore, the constraint needs to be placed on the \(3^{rd}\) and \(5^{th}\) vector parameter in the same way as we did when the Poisson mean \(\lambda_1\) was fixed.
<- list(tlambda = as.factor(c(1, 2, 3)),
map tgamma = as.factor(c(4, 5, 6, 7, 6, 8)))
<- MakeADFun(TMB_data, parameters,
fixed_par_obj_tmb DLL = "poi_hmm",
silent = TRUE,
map = map)
<- nlminb(start = fixed_par_obj_tmb$par,
fixed_par_mod_tmb objective = fixed_par_obj_tmb$fn,
gradient = fixed_par_obj_tmb$gr,
hessian = fixed_par_obj_tmb$he)
# Results + check that the constraint is respected
<- summary(sdreport(fixed_par_obj_tmb), "report")
results <- matrix(results[rownames(results) == "gamma", "Estimate"],
tpm nrow = m,
ncol = m,
byrow = FALSE) # Transformations are column-wise by default, be careful!
tpm## [,1] [,2] [,3]
## [1,] 5.447127e-01 2.276437e-01 0.22764365
## [2,] 2.718007e-09 9.753263e-01 0.02467374
## [3,] 2.710078e-02 2.005408e-10 0.97289922
1, 2] == tpm[1, 3]
tpm[## [1] TRUE
Equality constraints involving a diagonal member of the TPM are simpler to specify: the constraint \(\gamma_{i,j} = \gamma_{i,i}\) becomes transformed to \(\tau_{i,j} = 1\) and this can be specified in the same way the Poisson mean \(\lambda_1\) was fixed.
4.4 State inference and forecasting
The following code requires executing the code presented in the Section Forward algorithm and backward algorithm, as the log-forward and log-backward probabilities are needed. We reuse the TYT data set from above along with the HMM that was estimated on it.
# Retrieve the objects at ML value
<- obj_tmb$report(obj_tmb$env$last.par.best)
adrep <- adrep$delta
delta <- adrep$gamma
gamma <- adrep$emission_probs
emission_probs <- adrep$n
n <- length(delta)
m <- adrep$mllk
mllk # Compute log-forward probabilities (scaling used)
<- matrix(NA, m, n)
lalpha <- delta * emission_probs[1, ]
foo <- sum(foo)
sumfoo <- log(sumfoo)
lscale <- foo / sumfoo
foo 1] <- log(foo) + lscale
lalpha[, for (i in 2:n) {
<- foo %*% gamma * emission_probs[i, ]
foo <- sum(foo)
sumfoo <- lscale + log(sumfoo)
lscale <- foo / sumfoo
foo <- log(foo) + lscale
lalpha[, i]
}# Compute log-backwards probabilities (scaling used)
<- matrix(NA, m, n)
lbeta <- rep(0, m)
lbeta[, n] <- rep (1 / m, m)
foo <- log(m)
lscale for (i in (n - 1):1) {
<- gamma %*% (emission_probs[i + 1, ] * foo)
foo <- log(foo) + lscale
lbeta[, i] <- sum(foo)
sumfoo <- foo / sumfoo
foo <- lscale + log(sumfoo)
lscale }
4.4.1 Local decoding
The smoothing probabilities are defined in Zucchini, MacDonald, and Langrock (2016, 87 and p.336) as \[ \text{P}(C_t = i \vert X^{(n)} = x^{(n)}) = \frac{\alpha_t(i) \beta_t(i)}{L_n} \]
# Compute conditional state probabilities, smoothing probabilities
<- matrix(NA, ncol = length(tinn_data), nrow = m)
stateprobs <- - mllk
llk for(i in 1:n) {
<- exp(lalpha[, i] + lbeta[, i] - llk)
stateprobs[, i]
}# stateprobs contains n=87 columns, so we only display 4 for readability
# Each row corresponds to a state.
# For example, stateprobs[2, 3] is the probability that the third data
# is associated with state 2
1:4]
stateprobs[, ## [,1] [,2] [,3] [,4]
## [1,] 1.977715e-05 7.408176e-05 2.674435e-03 1.150174e-05
## [2,] 6.490109e-04 1.548449e-04 9.322969e-05 1.097237e-05
## [3,] 9.993312e-01 9.997711e-01 9.972323e-01 9.999775e-01
Local decoding is a straightforward maximum of the smoothing probabilities.
# Most probable states (local decoding)
<- rep(NA, n)
ldecode for (i in 1:n) {
<- which.max(stateprobs[, i])
ldecode[i]
}
ldecode## [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 3 3 3
## [58] 3 3 3 3 3 3 3 3 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
4.4.2 Forecast
The forecast distribution or h-step-ahead-probabilities as well as its implementation in R is detailed in Zucchini, MacDonald, and Langrock (2016, 83 and p.337)
Then, \[ \text{P}(X_{n+h} = x \vert X^{(n)} = x^{(n)}) = \frac{{\boldsymbol\alpha}_n {\boldsymbol\Gamma}^h \textbf{P}(x) {\boldsymbol 1}'}{{\boldsymbol\alpha}_n {\boldsymbol 1}'} = {\boldsymbol\Phi}_n {\boldsymbol\Gamma}^h \textbf{P}(x) {\boldsymbol 1}' \] An implementation of this using a scaling scheme is
# Number of steps
<- 1
h # Values for which we want the forecast probabilities
<- 0:50
xf
<- length(xf)
nxf <- matrix(0, nrow = h, ncol = nxf)
dxf <- delta * emission_probs[1, ]
foo <- sum(foo)
sumfoo <- log(sumfoo)
lscale <- foo / sumfoo
foo for (i in 2:n) {
<- foo %*% gamma * emission_probs[i, ]
foo <- sum( foo)
sumfoo <- lscale + log(sumfoo)
lscale <- foo / sumfoo
foo
}<- get.emission.probs(xf, lambda)
emission_probs_xf for (i in 1:h) {
<- foo %*% gamma
foo for (j in 1:m) {
<- dxf[i, ] + foo[j] * emission_probs_xf[, j]
dxf[i, ]
}
}# dxf contains n=87 columns, so we only display 4 for readability
1:4]
dxf[, ## [1] 0.04892114 0.14674114 0.22073452 0.22199873
4.4.3 Global decoding
The Viterbi algorithm is detailed in Zucchini, MacDonald, and Langrock (2016, 88 and p.334). It calculates the sequence of states \((i_1^*, \ldots, i_n^*)\) which maximizes the conditional probability of all states simultaneously, i.e. \[ (i_1^*, \ldots, i_n^*) = \mathop{\mathrm{argmax}}_{i_1, \ldots, i_n \in \{1, \ldots, m \}} \text{P}(C_1 = i_1, \ldots, C_n = i_n \vert X^{(n)} = x^{(n)}). \] An implementation of it is
<- matrix(0, n, m)
xi <- delta * emission_probs[1, ]
foo 1, ] <- foo / sum(foo)
xi[for (i in 2:n) {
<- apply(xi[i - 1, ] * gamma, 2, max) * emission_probs[i, ]
foo <- foo / sum(foo)
xi[i, ]
}<- numeric(n)
iv <- which.max(xi[n, ])
iv[n] for (i in (n - 1):1){
<- which.max(gamma[, iv[i + 1]] * xi[i, ])
iv[i]
}
iv## [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 1 1 1 3 3 3
## [58] 3 3 3 3 3 3 3 3 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
4.5 Appendix
Model selection tinnitus via AIC and BIC (calculation defined in Zucchini, MacDonald, and Langrock (2016)).
- AIC and BIC of a Poisson regression
<- 1
m <- list(x = tinn_data, m = m)
TMB_data <- 2
lambda <- 1
gamma <- pois.HMM.pn2pw(m, lambda, gamma)
parameters <- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn)
mod_tmb <- mod_tmb$objective
mllk <- length(unlist(parameters))
np <- 2 * (mllk + np)
AIC_1 <- sum(!is.na(TMB_data$x))
n <- 2 * mllk + np * log(n)
BIC_1 $convergence == 0
mod_tmb## [1] TRUE
AIC_1## [1] 394.197
BIC_1## [1] 396.6629
- AIC and BIC of a two-state Poisson HMM
<- 2
m <- list(x = tinn_data, m = m)
TMB_data <- seq(from = 1, to = 3, length.out = m)
lambda # 0.8 on the diagonal, and 0.2 split along the rest of each line, size m
<- matrix(0.2 / (m - 1),
gamma nrow = m,
ncol = m)
diag(gamma) <- 0.8
<- pois.HMM.pn2pw(m, lambda, gamma)
parameters <- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn)
mod_tmb <- mod_tmb$objective
mllk <- length(unlist(parameters))
np <- 2 * (mllk + np)
AIC_2 <- sum(!is.na(TMB_data$x))
n <- 2 * mllk + np * log(n)
BIC_2 $convergence == 0
mod_tmb## [1] TRUE
AIC_2## [1] 345.0721
BIC_2## [1] 354.9357
- AIC and BIC of a three-state Poisson HMM
<- 3
m <- list(x = tinn_data, m = m)
TMB_data <- seq(from = 1, to = 3, length.out = m)
lambda # 0.8 on the diagonal, and 0.2 split along the rest of each line, size m
<- matrix(0.2 / (m - 1),
gamma nrow = m,
ncol = m)
diag(gamma) <- 0.8
<- pois.HMM.pn2pw(m, lambda, gamma)
parameters <- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn)
mod_tmb <- mod_tmb$objective
mllk <- length(unlist(parameters))
np <- 2 * (mllk + np)
AIC_3 <- sum(!is.na(TMB_data$x))
n <- 2 * mllk + np * log(n)
BIC_3 $convergence == 0
mod_tmb## [1] TRUE
AIC_3## [1] 353.0463
BIC_3## [1] 375.2395
- AIC and BIC of a four-state Poisson HMM
<- 4
m <- list(x = tinn_data, m = m)
TMB_data <- seq(from = 1, to = 3, length.out = m)
lambda # 0.8 on the diagonal, and 0.2 split along the rest of each line, size m
<- matrix(0.2 / (m - 1),
gamma nrow = m,
ncol = m)
diag(gamma) <- 0.8
<- pois.HMM.pn2pw(m, lambda, gamma)
parameters <- MakeADFun(TMB_data, parameters,
obj_tmb DLL = "poi_hmm", silent = TRUE)
<- nlminb(start = obj_tmb$par, objective = obj_tmb$fn)
mod_tmb <- mod_tmb$objective
mllk <- length(unlist(parameters))
np <- 2 * (mllk + np)
AIC_4 <- sum(!is.na(TMB_data$x))
n <- 2 * mllk + np * log(n)
BIC_4 $convergence == 0
mod_tmb## [1] TRUE
AIC_4## [1] 365.6644
BIC_4## [1] 405.1189
- Summary
Models AIC | BIC | |
---|---|---|
Poisson regression | 394.1969962 | 396.6629043 |
Two-state Poisson HMM | 345.0721117 | 354.9357442 |
Three-state Poisson HMM | 353.0463375 | 375.2395106 |
Four-state Poisson HMM | 365.6643762 | 405.1189061 |
AIC and BIC prefer a two-state model over a Poisson regression and over three and four-state HMMs.