3. Model diagnostics

Rceattle provides several layers of diagnostics: convergence diagnostics attached automatically to every fit, S3 methods (residual, logLik, etc), fit plots to visually inspect how well the model reproduces observed data, retrospective analysis (peels) to detect patterns of systematic over- or under-estimation, jitter testing to check that the optimiser has found a global minimum, self-testing (simulation–estimation) to check that the model can recover its own estimates from simulated data, and likelihood profiling to check how informative the data are about a particular parameter.

Setup and plotting data

library(Rceattle)

# Fit a single-species model for the 2022 Northern Rockfish assessment.
data("NorthernRockfish2022")
nrdata <- NorthernRockfish2022
nrdata$fleet_control$proj_F_prop <- 1
plot_data(nrdata)

Fitting the model

model_1 <- Rceattle::fit_mod(
  data_list = nrdata,
  estimateMode = 0,  # Estimate
  initMode = 1,      # Assume unfished equilibrium
  M1Fun = build_M1(
    updateM1 = TRUE,
    M1_model = 1,
    M1_use_prior = TRUE,
    M_prior = 0.06,
    M_prior_sd = 0.05),
  fit_control = fit_control(phase = TRUE, verbose = 0)
)
summary(model_1)

S3 Methods

summary(model_1)              # same compact summary
coef(model_1)                 # estimated fixed-effect parameters
logLik(model_1)               # logLik with df attribute (AIC works)
AIC(model_1)
vcov(model_1)                 # fixed-effect covariance fro
residuals(model_1)            # data.frame of residuals
as.data.frame(model_1)

Convergence diagnostics

fit_mod() runs a battery of convergence checks after optimization and attaches the result as model$convergence. Each check is one record with a severity ("OK", "NOTE", "WARN", or "FAIL"); the object’s status is the worst severity present. Non-OK checks are surfaced via message() during the fit (a non-converged model is never turned into an error — the fit is always returned with its diagnostics attached), and print(model) shows the overall status.

model_1$convergence            # overall status + any non-OK checks
print(model_1$convergence, all = TRUE)   # show every check, including OK ones

Re-run the battery on any fit with convergence_diagnostics():

convergence_diagnostics(model_1)

The checks cover:

  • max_gradient — maximum absolute marginal gradient and the parameter carrying it (WARN > 1e-3, FAIL > 1).
  • pdHess — whether the Hessian is positive definite (FAIL if not).
  • sdreport_failedFAIL when an sdreport was requested but the Hessian could not be inverted.
  • hessian_conditioning — the Hessian condition number and, when poorly conditioned, the parameters loading on the least-identified direction (the unidentified linear combination). Complements TMBhelper::check_estimability (a per-parameter verdict) with a continuous severity and direction.
  • parameters_on_bounds — parameters that hit a configured build_bounds() limit (often unidentified or mis-scaled).
  • phasing — phases that ended with a high gradient, localizing which parameter block is hard to fit.
  • estimability — surfaces TMBhelper::check_estimability when it ran.

Each record carries a data element with the underlying numbers (e.g. the condition number and loadings for hessian_conditioning).

Fit plots

Composition data

plot_comp() overlays observed (bars) and predicted (line) age or length compositions for every fleet.

plot_comp(model_1)

Survey indices

plot_index() and plot_logindex() show observed vs. predicted survey biomass on the natural and log scales, respectively. plot_indexresidual() plots the log-scale residuals — a useful first check for time trends or heteroscedasticity.

plot_index(model_1)
plot_logindex(model_1)
plot_indexresidual(model_1)

Catch

plot_catch(model_1)

Retrospective analysis

A retrospective analysis systematically removes the most recent years of data one peel at a time and re-fits the model. Bias in the final estimates relative to earlier peels is summarised by Mohn’s rho: values outside ±0.2 (for SSB) generally indicate a problem worth investigating.

model_1_retro <- retrospective(Rceattle = model_1, peels = 5)

# Mohn's rho for each quantity
model_1_retro$mohns

# Plot historical trajectories across peels
plot_biomass(model_1_retro$Rceattle_list)

# Include the projection period to see how the forecast changes
plot_biomass(model_1_retro$Rceattle_list, incl_proj = TRUE)

The retrospective() function returns a list with two elements:

Element Description
Rceattle_list List of fitted models, ordered from full run to most-peeled
mohns Data frame with columns Object (quantity, e.g. "Biomass", "SSB"), Forecast year (0 = terminal year bias; 1+ = forecast skill), N (number of peels), and one column per species with mean relative error (Mohn’s rho)

The nyrs_forecast argument (default 3) additionally evaluates rho for years projected beyond the terminal peel, making it possible to quantify forecast skill alongside retrospective bias in a single call.

Jitter testing

Jitter testing re-fits the model from many randomly perturbed starting values to check whether the optimizer consistently returns the same minimum negative log-likelihood (NLL). A spread of NLL values across jitters suggests the likelihood surface has multiple local minima and results should be interpreted cautiously.

jitters <- jitter(Rceattle = model_1, njitter = 10, phase = TRUE)  # default njitter = 50

# Histogram of NLL differences relative to the best run
hist(log(jitters$nll - min(jitters$nll)),
     main = "Jitter NLL spread (log scale)",
     xlab = "log(NLL - min NLL)")

# Overlay biomass trajectories — tight overlap indicates a stable optimum
plot_biomass(jitters$Rceattle_list)

# Number of runs that converged
length(jitters$Rceattle_list)

Non-converging runs are silently dropped from Rceattle_list, so if length(jitters$Rceattle_list) is much less than njitter, convergence may be fragile.

retrospective() and jitter() both fit peels / starts in parallel by default. Pass cores = 1 to force sequential execution; otherwise the call uses a PSOCK cluster sized at parallel::detectCores() - 6 (capped at 2 under R CMD check).

Self-test (simulation–estimation)

self_test() simulates nsim datasets from a fitted model — keeping its estimated parameters fixed — and re-fits the model to each simulated dataset. If estimates from the refits cluster around the parameter values of the fit they were simulated from, the model is at least self-consistent (it can recover its own estimates from data it generated). Persistent bias or wide spread in a quantity is a sign that the data are not informative about it.

sims <- self_test(model_1, nsim = 10)            # default nsim = 50

# Number of simulations that converged (non-converged runs are dropped)
length(sims)

# Overlay biomass / SSB trajectories across simulations — the original fit's
# trajectory should sit inside the spread of the refits.
plot_biomass(c(list(model_1), sims), model_names = c("fit", names(sims)))
plot_ssb(c(list(model_1), sims),     model_names = c("fit", names(sims)))

Each simulation uses seed + i as its RNG seed, so results are reproducible whether or not cores > 1. Set simulate = FALSE to refit against the model’s expected values (no observation error) rather than draws from the observation likelihood — useful for confirming that the estimator returns the generating parameters in the noise-free limit. cores behaves the same way as in retrospective() and jitter().

Likelihood profile

profile() re-fits the model across a grid of fixed values for one or more cells of a chosen parameter and returns the resulting NLL surface. A flat profile means the data carry little information about the parameter; a sharp minimum away from the MLE means the fit has not actually settled there. The function generalises the legacy profile_rsigma() helper from examples/Profiling_sigmaR.R and works for any parameter slot in Rceattle$estimated_params; it is tested for R_log_sd (recruitment sigma), rec_pars (stock-recruit parameters), and log_M1 (residual natural mortality).

Two ways to specify param:

  • Natural-scale alias (recommended for the three documented parameters). The alias implies transform = "log" so you pass natural-scale values like sigmaR or M directly. For the rec_pars aliases ("R0", "alpha", "beta") the column is filled in from the alias name, so slots only needs the species index.
    • "sigmaR" / "R_sd"R_log_sd
    • "M1"log_M1
    • "R0"rec_pars[, 1]
    • "alpha"rec_pars[, 2]
    • "beta"rec_pars[, 3]
  • Raw parameter slot. Pass the storage name ("R_log_sd", "rec_pars", "log_M1", etc.); slots must index the full array and transform controls the scale.

slots is a list of integer index vectors — one entry per cell to hold fixed — and (under raw names) its entries must match the dimensionality of the parameter array (length 1 for vectors, length 2 for matrices, length 3 for 3-D arrays). values is a list of value grids parallel to slots; the full grid of fits is expand.grid(values), so multiple slots produce an N-D cross-profile.

# 1-D profile: sigmaR for species 1 (natural-scale alias)
prof_sigmaR <- profile(
  fitted = model_1,
  param    = "sigmaR",
  slots    = list(1),
  values   = list(seq(0.1, 1.5, by = 0.1))
)

plot(prof_sigmaR$grid$slot_1,
     prof_sigmaR$nll - min(prof_sigmaR$nll, na.rm = TRUE),
     type = "l", xlab = "sigmaR", ylab = "dNLL")

# 1-D profile: SRR alpha for species 1
# (alias fills in the rec_pars column; slot is just the species index)
prof_alpha <- profile(
  fitted = model_1,
  param    = "alpha",
  slots    = list(1),
  values   = list(seq(2, 80, length.out = 20))
)

# 2-D cross-profile: M1 across sex for species 1, age 1
prof_M_sex <- profile(
  fitted = model_1,
  param    = "M1",
  slots    = list(c(1, 1, 1), c(1, 2, 1)),  # males and females
  values   = list(seq(0.10, 0.40, by = 0.05),
                  seq(0.10, 0.40, by = 0.05))
)

profile() returns Rceattle_list (one fit per grid row, NULL where the fit failed), grid (the user-scale value grid), nll (joint NLL aligned with grid, NA for non-converged fits), and echoes of param and slots. To cross-profile across multiple species, supply one slot per species (e.g. slots = list(1, 2, 3) with param = "sigmaR"); to cross-profile M1 across sex, supply one slot per sex as in the third example above. cores behaves the same way as in the other diagnostic functions.

Comparing single- and multi-species trajectories

Plotting sensitivity runs together is itself a useful diagnostic — large divergences in biomass or mortality deserve scrutiny.

mod_list  <- list(model_1, model_1_retro$Rceattle_list$Year_2019)
mod_names <- 1:length(mod_list)

plot_biomass(Rceattle = mod_list, model_names = mod_names)
plot_recruitment(Rceattle = mod_list, model_names = mod_names, add_ci = TRUE)
plot_depletionSSB(Rceattle = mod_list, model_names = mod_names)

Model average

For model averaging across model variants, see ?model_average.

mod_avg <- model_average(Rceattle = list(model_1, model_1), weights = c(1,1))
plot_biomass(mod_avg)