--- title: "3. Model diagnostics" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{3. Model diagnostics} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE, fig.width = 7, fig.height = 5 ) ``` Rceattle provides several layers of diagnostics: **convergence diagnostics** attached automatically to every fit, **S3 methods** (`residual`, `logLik`, etc), **fit plots** to visually inspect how well the model reproduces observed data, **retrospective analysis** (peels) to detect patterns of systematic over- or under-estimation, **jitter testing** to check that the optimiser has found a global minimum, **self-testing** (simulation–estimation) to check that the model can recover its own estimates from simulated data, and **likelihood profiling** to check how informative the data are about a particular parameter. ## Setup and plotting data ```{r} library(Rceattle) # Fit a single-species model for the 2022 Northern Rockfish assessment. data("NorthernRockfish2022") nrdata <- NorthernRockfish2022 nrdata$fleet_control$proj_F_prop <- 1 plot_data(nrdata) ``` Fitting the model ```{r} model_1 <- Rceattle::fit_mod( data_list = nrdata, estimateMode = 0, # Estimate initMode = 1, # Assume unfished equilibrium M1Fun = build_M1( updateM1 = TRUE, M1_model = 1, M1_use_prior = TRUE, M_prior = 0.06, M_prior_sd = 0.05), fit_control = fit_control(phase = TRUE, verbose = 0) ) summary(model_1) ``` ## S3 Methods ```{r} summary(model_1) # same compact summary coef(model_1) # estimated fixed-effect parameters logLik(model_1) # logLik with df attribute (AIC works) AIC(model_1) vcov(model_1) # fixed-effect covariance fro residuals(model_1) # data.frame of residuals as.data.frame(model_1) ``` ## Convergence diagnostics `fit_mod()` runs a battery of convergence checks after optimization and attaches the result as `model$convergence`. Each check is one record with a `severity` (`"OK"`, `"NOTE"`, `"WARN"`, or `"FAIL"`); the object's `status` is the worst severity present. Non-OK checks are surfaced via `message()` during the fit (a non-converged model is never turned into an error — the fit is always returned with its diagnostics attached), and `print(model)` shows the overall status. ```{r} model_1$convergence # overall status + any non-OK checks print(model_1$convergence, all = TRUE) # show every check, including OK ones ``` Re-run the battery on any fit with `convergence_diagnostics()`: ```{r} convergence_diagnostics(model_1) ``` The checks cover: - **`max_gradient`** — maximum absolute marginal gradient and the parameter carrying it (`WARN` > 1e-3, `FAIL` > 1). - **`pdHess`** — whether the Hessian is positive definite (`FAIL` if not). - **`sdreport_failed`** — `FAIL` when an `sdreport` was requested but the Hessian could not be inverted. - **`hessian_conditioning`** — the Hessian condition number and, when poorly conditioned, the parameters loading on the least-identified direction (the unidentified linear combination). Complements `TMBhelper::check_estimability` (a per-parameter verdict) with a continuous severity and direction. - **`parameters_on_bounds`** — parameters that hit a configured `build_bounds()` limit (often unidentified or mis-scaled). - **`phasing`** — phases that ended with a high gradient, localizing which parameter block is hard to fit. - **`estimability`** — surfaces `TMBhelper::check_estimability` when it ran. Each record carries a `data` element with the underlying numbers (e.g. the condition number and loadings for `hessian_conditioning`). ## Fit plots ### Composition data `plot_comp()` overlays observed (bars) and predicted (line) age or length compositions for every fleet. ```{r} plot_comp(model_1) ``` ### Survey indices `plot_index()` and `plot_logindex()` show observed vs. predicted survey biomass on the natural and log scales, respectively. `plot_indexresidual()` plots the log-scale residuals — a useful first check for time trends or heteroscedasticity. ```{r} plot_index(model_1) plot_logindex(model_1) plot_indexresidual(model_1) ``` ### Catch ```{r} plot_catch(model_1) ``` ## Retrospective analysis A retrospective analysis systematically removes the most recent years of data one peel at a time and re-fits the model. Bias in the final estimates relative to earlier peels is summarised by Mohn's rho: values outside ±0.2 (for SSB) generally indicate a problem worth investigating. ```{r} model_1_retro <- retrospective(Rceattle = model_1, peels = 5) # Mohn's rho for each quantity model_1_retro$mohns # Plot historical trajectories across peels plot_biomass(model_1_retro$Rceattle_list) # Include the projection period to see how the forecast changes plot_biomass(model_1_retro$Rceattle_list, incl_proj = TRUE) ``` The `retrospective()` function returns a list with two elements: | Element | Description | |---|---| | `Rceattle_list` | List of fitted models, ordered from full run to most-peeled | | `mohns` | Data frame with columns `Object` (quantity, e.g. `"Biomass"`, `"SSB"`), `Forecast year` (0 = terminal year bias; 1+ = forecast skill), `N` (number of peels), and one column per species with mean relative error (Mohn's rho) | The `nyrs_forecast` argument (default 3) additionally evaluates rho for years projected beyond the terminal peel, making it possible to quantify forecast skill alongside retrospective bias in a single call. ## Jitter testing Jitter testing re-fits the model from many randomly perturbed starting values to check whether the optimizer consistently returns the same minimum negative log-likelihood (NLL). A spread of NLL values across jitters suggests the likelihood surface has multiple local minima and results should be interpreted cautiously. ```{r} jitters <- jitter(Rceattle = model_1, njitter = 10, phase = TRUE) # default njitter = 50 # Histogram of NLL differences relative to the best run hist(log(jitters$nll - min(jitters$nll)), main = "Jitter NLL spread (log scale)", xlab = "log(NLL - min NLL)") # Overlay biomass trajectories — tight overlap indicates a stable optimum plot_biomass(jitters$Rceattle_list) # Number of runs that converged length(jitters$Rceattle_list) ``` Non-converging runs are silently dropped from `Rceattle_list`, so if `length(jitters$Rceattle_list)` is much less than `njitter`, convergence may be fragile. `retrospective()` and `jitter()` both fit peels / starts in parallel by default. Pass `cores = 1` to force sequential execution; otherwise the call uses a PSOCK cluster sized at `parallel::detectCores() - 6` (capped at 2 under `R CMD check`). ## Self-test (simulation–estimation) `self_test()` simulates `nsim` datasets from a fitted model — keeping its estimated parameters fixed — and re-fits the model to each simulated dataset. If estimates from the refits cluster around the parameter values of the fit they were simulated from, the model is at least self-consistent (it can recover its own estimates from data it generated). Persistent bias or wide spread in a quantity is a sign that the data are not informative about it. ```{r} sims <- self_test(model_1, nsim = 10) # default nsim = 50 # Number of simulations that converged (non-converged runs are dropped) length(sims) # Overlay biomass / SSB trajectories across simulations — the original fit's # trajectory should sit inside the spread of the refits. plot_biomass(c(list(model_1), sims), model_names = c("fit", names(sims))) plot_ssb(c(list(model_1), sims), model_names = c("fit", names(sims))) ``` Each simulation uses `seed + i` as its RNG seed, so results are reproducible whether or not `cores > 1`. Set `simulate = FALSE` to refit against the model's *expected* values (no observation error) rather than draws from the observation likelihood — useful for confirming that the estimator returns the generating parameters in the noise-free limit. `cores` behaves the same way as in `retrospective()` and `jitter()`. ## Likelihood profile `profile()` re-fits the model across a grid of fixed values for one or more cells of a chosen parameter and returns the resulting NLL surface. A flat profile means the data carry little information about the parameter; a sharp minimum away from the MLE means the fit has not actually settled there. The function generalises the legacy `profile_rsigma()` helper from `examples/Profiling_sigmaR.R` and works for any parameter slot in `Rceattle$estimated_params`; it is tested for `R_log_sd` (recruitment sigma), `rec_pars` (stock-recruit parameters), and `log_M1` (residual natural mortality). Two ways to specify `param`: - **Natural-scale alias** (recommended for the three documented parameters). The alias implies `transform = "log"` so you pass natural-scale values like sigmaR or M directly. For the `rec_pars` aliases (`"R0"`, `"alpha"`, `"beta"`) the column is filled in from the alias name, so `slots` only needs the species index. - `"sigmaR"` / `"R_sd"` → `R_log_sd` - `"M1"` → `log_M1` - `"R0"` → `rec_pars[, 1]` - `"alpha"` → `rec_pars[, 2]` - `"beta"` → `rec_pars[, 3]` - **Raw parameter slot**. Pass the storage name (`"R_log_sd"`, `"rec_pars"`, `"log_M1"`, etc.); `slots` must index the full array and `transform` controls the scale. `slots` is a list of integer index vectors — one entry per cell to hold fixed — and (under raw names) its entries must match the dimensionality of the parameter array (length 1 for vectors, length 2 for matrices, length 3 for 3-D arrays). `values` is a list of value grids parallel to `slots`; the full grid of fits is `expand.grid(values)`, so multiple slots produce an N-D cross-profile. ```{r} # 1-D profile: sigmaR for species 1 (natural-scale alias) prof_sigmaR <- profile( fitted = model_1, param = "sigmaR", slots = list(1), values = list(seq(0.1, 1.5, by = 0.1)) ) plot(prof_sigmaR$grid$slot_1, prof_sigmaR$nll - min(prof_sigmaR$nll, na.rm = TRUE), type = "l", xlab = "sigmaR", ylab = "dNLL") # 1-D profile: SRR alpha for species 1 # (alias fills in the rec_pars column; slot is just the species index) prof_alpha <- profile( fitted = model_1, param = "alpha", slots = list(1), values = list(seq(2, 80, length.out = 20)) ) # 2-D cross-profile: M1 across sex for species 1, age 1 prof_M_sex <- profile( fitted = model_1, param = "M1", slots = list(c(1, 1, 1), c(1, 2, 1)), # males and females values = list(seq(0.10, 0.40, by = 0.05), seq(0.10, 0.40, by = 0.05)) ) ``` `profile()` returns `Rceattle_list` (one fit per grid row, `NULL` where the fit failed), `grid` (the user-scale value grid), `nll` (joint NLL aligned with `grid`, `NA` for non-converged fits), and echoes of `param` and `slots`. To cross-profile across multiple species, supply one slot per species (e.g. `slots = list(1, 2, 3)` with `param = "sigmaR"`); to cross-profile M1 across sex, supply one slot per sex as in the third example above. `cores` behaves the same way as in the other diagnostic functions. ## Comparing single- and multi-species trajectories Plotting sensitivity runs together is itself a useful diagnostic — large divergences in biomass or mortality deserve scrutiny. ```{r} mod_list <- list(model_1, model_1_retro$Rceattle_list$Year_2019) mod_names <- 1:length(mod_list) plot_biomass(Rceattle = mod_list, model_names = mod_names) plot_recruitment(Rceattle = mod_list, model_names = mod_names, add_ci = TRUE) plot_depletionSSB(Rceattle = mod_list, model_names = mod_names) ``` ## Model average For model averaging across model variants, see `?model_average`. ```{r} mod_avg <- model_average(Rceattle = list(model_1, model_1), weights = c(1,1)) plot_biomass(mod_avg) ```