The FishSET R package currently features six statistical functions for maximum likelihood estimation, detailed in section 8.3 of the user manual. These functions offer a solid foundation for a variety of analyses. However, we recognize that these built-in functions may not suit your specific modeling needs. We encourage users with more specialized modeling requirements to develop their own likelihood functions that can be integrated into the FishSET R package. By doing so, users will not only be able to take advantage of other FishSET features such as policy simulation tools but also contribute to a richer toolkit for the FishSET user community.
Here we provide a template for developing a likelihood function that can be seamlessly integrated into the FishSET package. Feel free to reach out to our team at [email protected] with any questions about developing your own likelihood function.
Fork the FishSET repo and clone the forked repository to your computer. More information on “forking” repositories and the GitHub workflow can be found here.
Open the FishSET_RPackage.Rproj file.
Create a new script for your likelihood function following the template below. Save it in the R folder.
Commit and push changes.
Review and test code.
If you have gone through the steps above but it has been a while since you worked on the code, pull changes from the original FishSET repo to your clone.
Submit a pull request.
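As a rough guide, a template for such a script might look like the sketch below. It assumes only the argument names and order described in this document; the body is a placeholder, and `likelihood_function` is the name you replace with your own.

```r
# Sketch of a likelihood-function template. Argument names and order follow
# the descriptions in this document; everything in the body is a placeholder.
likelihood_function <- function(starts3, dat, otherdat, alts,
                                project, expname, mod.name) {
  # 1. Unpack parameters from starts3 and data from dat / otherdat.
  # 2. Compute the likelihood contributions and store them in ldchoice.
  ldchoice <- rep(NA_real_, nrow(dat))  # placeholder
  # 3. Combine them into the negative log-likelihood, ld.
  ld <- NA_real_                        # placeholder
  # 4. Insert the logging code described later in this document here.
  return(ld)
}
```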
Give your likelihood function an informative name (replace likelihood_function in the code chunk above) and save the script with the same name. Only include code for the likelihood function in the R script.
The names and order of input arguments must match the code above. These inputs are created within the make_model_design() and discretefish_subroutine() functions and are described in detail below.
starts3: Numeric vector that contains starting parameter values. The length and order of parameter values will depend on your model structure. For example, the order of starts3 for the conditional logit model is c([alternative-specific parameters], [travel-distance parameters]), and the length of each depends on the number of variables included in the model design.
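Inside a likelihood function, starts3 is typically split back into its parameter groups. The sketch below illustrates this for the conditional logit ordering described above; the group sizes and values are made up for illustration and would in practice follow from your model design.

```r
# Hypothetical example: 2 alternative-specific parameters followed by
# 1 travel-distance parameter (the counts depend on your model design).
n_alt_specific <- 2
n_distance <- 1
starts3 <- c(0.5, -0.2, 1.1)

# Check that the vector has the length the model structure implies.
stopifnot(length(starts3) == n_alt_specific + n_distance)

# Split the vector by position: alternative-specific parameters first,
# travel-distance parameters second.
alt_specific_params <- starts3[seq_len(n_alt_specific)]
distance_params <- starts3[n_alt_specific + seq_len(n_distance)]
```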
dat: Numeric matrix generated by the shift_sort_x() function, with one row per observation. The first column contains catch; the second column indicates the zone fished; columns 3 through the square of the number of alternatives plus 2 (for example, columns 3-18 when there are 4 alternatives/zones) contain a flattened identity matrix that has been shifted and sorted so that the selected zone is moved to the first column position of the matrix (see example below); and the last x columns, where x equals the number of alternatives/zones, contain the distances from the starting location to each alternative (distances are also shifted so that the distance to the selected zone comes first; see example below).
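Example of a single row of the dat matrix (three alternatives):

$$\begin{bmatrix} 17&2&0&0&1&1&0&0&0&1&0&5&10&15 \\ \end{bmatrix}$$

Columns 3 to 11 of that row, arranged as a 3 x 3 matrix:

$$\begin{bmatrix} 0&0&1 \\ 1&0&0 \\ 0&1&0 \\ \end{bmatrix}$$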
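The column layout can be unpacked with base R indexing. The sketch below uses the example row above; reshaping with byrow = TRUE is an assumption chosen because it reproduces the 3 x 3 matrix shown, and the variable names are illustrative only.

```r
# Example row from above: catch, zone fished, flattened shifted identity
# matrix (alts^2 values), then distances (alts values).
alts <- 3
dat <- matrix(c(17, 2, 0, 0, 1, 1, 0, 0, 0, 1, 0, 5, 10, 15), nrow = 1)

# The total column count implied by the layout is 2 + alts^2 + alts.
stopifnot(ncol(dat) == 2 + alts^2 + alts)

catch <- dat[, 1]
zone <- dat[, 2]

# Columns 3 to alts^2 + 2 hold the flattened identity matrix.
# Assumption: filling by row matches the 3 x 3 matrix shown above.
shifted_identity <- matrix(dat[1, 3:(alts^2 + 2)], nrow = alts, byrow = TRUE)

# The last alts columns hold distances, with the selected zone first.
distances <- dat[, (ncol(dat) - alts + 1):ncol(dat), drop = FALSE]
```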
otherdat: A list that contains variables used in the model. For example, otherdat for conditional logit models contains intdat variables (as a list object) that interact with travel distance and griddat variables (as a list object) that vary across alternatives. The otherdat list is generated in the make_model_design() function. If your input variables are not compatible with the current version of make_model_design(), reach out to our team at [email protected] and we will do our best to update the function to accommodate your modeling requirements.
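To make the nesting concrete, the sketch below builds a mock otherdat for a conditional logit model. Only the griddat and intdat element names come from the description above; the variable names inside them (expected_catch, vessel_length), their values, and their matrix shapes are hypothetical and may differ from what make_model_design() actually produces.

```r
n_obs <- 4
alts <- 3

# Mock otherdat: a list holding two lists of variables.
otherdat <- list(
  # Variables that vary across alternatives (hypothetical: one column per zone).
  griddat = list(
    expected_catch = matrix(seq_len(n_obs * alts), nrow = n_obs, ncol = alts)
  ),
  # Variables that interact with travel distance (hypothetical: one column).
  intdat = list(
    vessel_length = matrix(c(30, 45, 28, 52), nrow = n_obs, ncol = 1)
  )
)

# Inside a likelihood function the pieces can be pulled out by name.
griddat_vars <- otherdat$griddat
intdat_vars <- otherdat$intdat
```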
alts: Integer representing the total number of zones included in the model.
project: Name of project.
expname: Name of expected catch table(s). This input is used for conditional logit models.
mod.name: Name of the model, which is designated in the make_model_design() function.
Use base R functions to calculate the negative log-likelihood (nll), and document the code comprehensively.
Save the nll value to a variable named ld, and insert the following code at the end of your function to log the inputs and outputs of the function call:
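# Replace a NaN result with the largest representable double
if (is.nan(ld)) {
  ld <- .Machine$double.xmax
}
# Collect the nll, the parameters, and ldchoice for logging;
# ldchoice is not created here and must already be defined earlier in your function
ldsumglobalcheck <- ld
paramsglobalcheck <- starts3
LDGlobalCheck <- unlist(as.matrix(ldchoice))
LDGlobalCheck <- list(model = paste0(project, expname, mod.name),
                      ldsumglobalcheck = ldsumglobalcheck,
                      paramsglobalcheck = paramsglobalcheck,
                      LDGlobalCheck = LDGlobalCheck)
# Store the log in the first environment on the search path (the global environment)
pos <- 1
envir <- as.environment(pos)
assign("LDGlobalCheck", value = LDGlobalCheck, envir = envir)
return(ld)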
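Putting the pieces together, the sketch below shows a deliberately simple likelihood function that follows the required signature and ends with the logging code. It is not one of the built-in FishSET models: toy_distance_logit is a hypothetical name, the model uses a single travel-distance parameter, it assumes the distances occupy the last alts columns of dat with the selected zone first, and the data are made up. Calling it through optim() is likewise only an illustration of how an optimizer can drive such a function.

```r
# Hypothetical toy model: utility = beta * distance, chosen zone listed first.
toy_distance_logit <- function(starts3, dat, otherdat, alts,
                               project, expname, mod.name) {
  beta <- starts3[1]
  n_col <- ncol(dat)
  distances <- dat[, (n_col - alts + 1):n_col, drop = FALSE]

  # Utilities relative to the chosen alternative (first distance column).
  rel_util <- beta * (distances - distances[, 1])
  # Log-probability of the chosen alternative for each observation.
  ldchoice <- -log(rowSums(exp(rel_util)))
  # Negative log-likelihood.
  ld <- -sum(ldchoice)

  # Logging code from this document, condensed.
  if (is.nan(ld)) {
    ld <- .Machine$double.xmax
  }
  LDGlobalCheck <- list(model = paste0(project, expname, mod.name),
                        ldsumglobalcheck = ld,
                        paramsglobalcheck = starts3,
                        LDGlobalCheck = unlist(as.matrix(ldchoice)))
  assign("LDGlobalCheck", value = LDGlobalCheck, envir = as.environment(1))
  return(ld)
}

# Made-up data: two observations, three zones; the toy model reads only
# the last three columns (distances).
dat <- rbind(
  c(17, 2, 0, 0, 1, 1, 0, 0, 0, 1, 0, 5, 10, 15),
  c(9, 2, 0, 0, 1, 1, 0, 0, 0, 1, 0, 12, 6, 20)
)

# Evaluate once at beta = 0, then minimize the nll over beta.
nll_at_zero <- toy_distance_logit(0, dat, list(), 3, "demo", "", "toy")
fit <- optim(par = 0, fn = toy_distance_logit, dat = dat, otherdat = list(),
             alts = 3, project = "demo", expname = "", mod.name = "toy",
             method = "BFGS")
```

After any call, the most recent inputs and outputs are available in the global LDGlobalCheck object, which can help when debugging a failed optimization.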
Before submitting a Pull Request, we ask contributors to thoroughly review their code and test likelihood functions, ideally on multiple datasets. Also, verify that the function is well documented and remove unnecessary or redundant code that might have been left behind during development. Please reach out to the FishSET team ([email protected]) if any questions come up during the review and testing process.
Give the Pull Request the same name as the likelihood function you just created and provide a brief description. Once the FishSET team receives the Pull Request, we will do our best to review the function in a timely manner and notify you if we merge it into the main package or request any changes to the function prior to merging.
The Pull Request for your likelihood function should not change any other code in the FishSET package. If changes to the package outside of your function are necessary, please contact the FishSET development team ([email protected]) to resolve any issues.