Scallop Conditional Logit Model Example


This is an example of a conditional logit model using the scallop data from the FishSET package.



Load Data

This analysis uses three of FishSET’s example datasets: scallop, a dataframe of anonymized scallop data from the Northeast; scallop_ports, a table of ports and their locations; and tenMNSQR, a spatial dataframe of Northeastern ten-minute squares.

The scallop dataframe contains a random sample of 10,000 trips by vessels in the Limited Access Days-at-Sea fleet that were declared into either an Access Area or an Open Area Days-at-Sea fishing trip.

Noise was added to fishing locations, landing quantities, and the value of catch. Permit, operator, and trip identifiers were also anonymized.


vars_to_keep <- c('TRIPID', 'PERMIT.y', 'DATE_TRIP', 'DDLON', 'DDLAT', 'ZoneID', 
                  'LANDED_thousands', 'DOLLAR_2020_thousands', 'port_lon', 'port_lat',
                  'previous_port_lon', 'previous_port_lat')
scallop <- scallop[vars_to_keep]

Load each dataset into FishSET. Rescale the landed weight to thousands of meat pounds and the dollar value to thousands of real dollars. Note: when running this chunk, you may get a pop-up asking you to identify the location of a new or existing FishSET folder.

load_maindata(scallop, project = "scallopMod", over_write = TRUE)

summary_stats() is a useful way to find NAs in the data.

summary_stats(scallopModMainDataTable, project = "scallopMod")

Create Centroids

A centroid table is needed to create the distance matrix. Centroids can be used as the choice occasion or as the alternative choice. The simplest way to create a zonal centroid table is to pass the spatial table to create_centroid(), which saves the centroids to the FishSET database.

create_centroid(spat = scallopModTenMnSqrSpatTable,
                project = "scallopMod",
                spatID = "TEN_ID",
                type = "zonal centroid",
                output = "centroid table")
## # A tibble: 4,537 × 3
##    ZoneID cent.lon cent.lat
##     <dbl>    <dbl>    <dbl>
##  1      0    -73.3     45.8
##  2 336411    -64.9     33.9
##  3 336412    -64.7     33.9
##  4 336413    -64.6     33.9
##  5 336414    -64.4     33.9
##  6 336415    -64.2     33.9
##  7 336416    -64.1     33.9
##  8 336421    -64.9     33.7
##  9 336422    -64.7     33.7
## 10 336423    -64.6     33.7
## # ℹ 4,527 more rows
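
To illustrate what a zonal centroid is, the sketch below computes the area-weighted centroid of a single polygon in base R using the shoelace formula. It treats longitude and latitude as planar coordinates and is not a description of how create_centroid() works internally. The helper polygon_centroid() and the example square are hypothetical.

```r
# Hypothetical helper: area-weighted centroid of a simple polygon whose
# vertices are given in order, without repeating the first vertex.
polygon_centroid <- function(lon, lat) {
  n <- length(lon)
  nxt <- c(2:n, 1)                          # index of the next vertex (wraps around)
  cross <- lon * lat[nxt] - lon[nxt] * lat  # shoelace cross terms
  area <- sum(cross) / 2                    # signed polygon area
  c(cent.lon = sum((lon + lon[nxt]) * cross) / (6 * area),
    cent.lat = sum((lat + lat[nxt]) * cross) / (6 * area))
}

# A made-up ten-minute square (10 minutes = 1/6 of a degree)
sq_lon <- c(-70, -70 + 1/6, -70 + 1/6, -70)
sq_lat <- c(41, 41, 41 + 1/6, 41 + 1/6)
sq_cent <- polygon_centroid(sq_lon, sq_lat)
print(sq_cent)
```

For a square, the result is simply the midpoint of the cell; the same function handles irregular zone shapes.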

Alternative Choice

For this example, the alternative choice list will use the longitude and latitude of the disembarking port (previous_port_lon and previous_port_lat) as the choice occasion and the zonal centroid of the fishing areas as the alternative. The minimum haul requirement is set to 90.

create_alternative_choice(dat = scallopModMainDataTable,
                          project = "scallopMod",
                          occasion = "lon-lat",
                          occasion_var = c("previous_port_lon", "previous_port_lat"),
                          alt_var = "zonal centroid",
                          zoneID = "ZoneID",
                          zone.cent.name = "scallopModZoneCentroid",
                          min.haul = 90)
## Alternative choice list saved to FishSET database
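
As a rough picture of the distance matrix behind the alternative choice list, the sketch below computes great-circle distances from each occasion (a port location) to each alternative (a zone centroid). Using the haversine formula is an assumption made for illustration; FishSET may calculate distances differently. The helper haversine_miles() and the ports and cents tables are hypothetical.

```r
# Hypothetical helper: great-circle distance in miles (haversine formula)
haversine_miles <- function(lon1, lat1, lon2, lat2, radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Made-up occasions (previous ports) and alternatives (zone centroids)
ports <- data.frame(lon = c(-70.9, -74.9), lat = c(41.6, 38.9))
cents <- data.frame(ZoneID = c("A", "B", "C"),
                    cent.lon = c(-69.0, -73.5, -70.9),
                    cent.lat = c(41.0, 39.5, 41.6))

# Rows are occasions, columns are alternatives
dist_mat <- outer(seq_len(nrow(ports)), seq_len(nrow(cents)),
                  function(i, j) haversine_miles(ports$lon[i], ports$lat[i],
                                                 cents$cent.lon[j], cents$cent.lat[j]))
colnames(dist_mat) <- cents$ZoneID
print(round(dist_mat, 1))
```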

The plot below visualizes zone frequency after accounting for the minimum haul requirement from the alternative choice list.

z_ind <- which(alt_choice_list('scallopMod')$dataZoneTrue == 1)

zOut <- 
  zone_summary(scallopModMainDataTable[z_ind, ], 
               spat = scallopModTenMnSqrSpatTable, 
               project = "scallopMod",
               zone.dat = "ZoneID",
               zone.spat = "TEN_ID",
               output = "tab_plot")

ZoneID n
416956 242
387323 229
406962 225
387313 223
387332 198
387322 193
387341 188
406923 177
406961 172
387463 160
387464 160
387333 137
416966 134
387455 130
416816 121
416826 115
406951 113
406933 112
397346 111
406952 110
387456 108
387454 102
387331 99
406963 94
387462 91
397223 91
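
The effect of the minimum haul requirement can be sketched in base R. The example assumes that zones with at least min.haul observations are retained, which is consistent with the table above, where the least-visited zone has 91 trips. The data and the threshold of 3 are made up.

```r
# Made-up trip records: zone "Z1" has 5 trips, "Z2" has 2, "Z3" has 1
trips <- data.frame(TRIPID = 1:8,
                    ZoneID = c("Z1", "Z1", "Z2", "Z1", "Z3", "Z1", "Z2", "Z1"))
min_haul <- 3   # stand-in for min.haul = 90 in the example above

zone_n <- table(trips$ZoneID)                     # observations per zone
keep_zones <- names(zone_n)[zone_n >= min_haul]   # zones meeting the minimum
in_choice_set <- trips$ZoneID %in% keep_zones     # per-trip inclusion flag

print(zone_n)
print(trips[in_choice_set, ])
```

The logical vector in_choice_set plays a role similar to dataZoneTrue in the chunk above: it marks the trips that remain once sparse zones are dropped.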

Expected Catch

This code chunk creates two expected catch matrices: one using a seven-day window with a lag of one day, and one using a 14-day window with a lag of two days. They will be named user1 and user2, respectively.

# user1 expected catch matrix
create_expectations(dat = scallopModMainDataTable,
                    project = "scallopMod",
                    catch = "LANDED_thousands", 
                    temp.var = "DATE_TRIP",
                    temp.window = 7,
                    temp.lag = 1,
                    year.lag = 0,
                    temporal = 'daily',
                    empty.catch = NA,
                    empty.expectation = 1e-04,
                    default.exp = FALSE,
                    replace.output = TRUE)
## Expected catch/revenue matrix saved to FishSET database
# user2 expected catch matrix
create_expectations(dat = scallopModMainDataTable,
                    project = "scallopMod",
                    catch = "LANDED_thousands", 
                    temp.var = "DATE_TRIP",
                    temporal = "daily",
                    temp.window = 14,
                    temp.lag = 2,
                    empty.catch = NA,
                    empty.expectation = 1e-04,
                    default.exp = FALSE,
                    replace.output = FALSE)
## Expected catch/revenue matrix saved to FishSET database
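
The idea behind a lagged moving-window expectation can be sketched as follows. This is an illustration under stated assumptions, not FishSET's implementation: the window is taken to cover the given number of days ending lag days before the trip, and a small fallback value is returned when the window contains no observations (mirroring empty.expectation = 1e-04). The helper lagged_window_mean() and the data are hypothetical.

```r
# Hypothetical helper: mean catch in one zone over a lagged window of days.
# For a trip on `day`, the window covers `window` days ending `lag` days earlier.
lagged_window_mean <- function(dates, catch, day, window, lag, empty = 1e-04) {
  end <- day - lag
  start <- end - window + 1
  in_win <- dates >= start & dates <= end
  if (!any(in_win)) return(empty)   # no catch observed in the window
  mean(catch[in_win])
}

# Made-up daily catch for a single zone
dates <- as.Date("2020-01-01") + 0:9
catch <- c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)

# 7-day window, 1-day lag: averages 2020-01-03 through 2020-01-09
e1 <- lagged_window_mean(dates, catch, as.Date("2020-01-10"), window = 7, lag = 1)
# Nothing observed before 2020-01-01, so the fallback value is returned
e2 <- lagged_window_mean(dates, catch, as.Date("2020-01-01"), window = 7, lag = 1)
print(c(e1, e2))
```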

The data must be checked for common data quality issues before it can be used in the modeling functions (i.e. make_model_design() and discretefish_subroutine()). check_model_data() saves a new version of the primary data with the suffix _final added to indicate that the table is in its “final” state and ready to be used for modeling.

check_model_data(dat = scallopModMainDataTable,
                 project = "scallopMod", 
                 uniqueID = "TRIPID",
                 latlon = c("DDLON","DDLAT"))
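
The kinds of issues such a check looks for can be sketched in base R. The list below is an assumption about typical data quality checks (missing values, infinite values, duplicated identifiers, coordinates outside valid decimal-degree ranges) and is not a reproduction of check_model_data(). The data frame is made up.

```r
# Hypothetical pre-modeling checks on a made-up data frame
trips <- data.frame(TRIPID = c(1, 2, 2, 4),
                    DDLON  = c(-70.5, -69.8, -69.8, -71.2),
                    DDLAT  = c(41.2, NA, 40.9, 95))

checks <- list(
  has_na       = anyNA(trips),                                  # any missing values?
  has_inf      = any(vapply(trips, function(x) any(is.infinite(x)), logical(1))),
  ids_unique   = !anyDuplicated(trips$TRIPID),                  # one row per trip?
  lon_in_range = all(abs(trips$DDLON) <= 180, na.rm = TRUE),    # valid longitudes?
  lat_in_range = all(abs(trips$DDLAT) <= 90, na.rm = TRUE)      # valid latitudes?
)
str(checks)
```

In this made-up example the data would fail on three counts: a missing latitude, a duplicated TRIPID, and a latitude of 95.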

Model Design

The model design file below will run two conditional logit models, each using one of the expected catch matrices created earlier (this is specified by using 'individual' in the expectcatchmodels argument).

make_model_design(project = "scallopMod",
                  catchID = "LANDED_thousands",
                  likelihood = "logit_c",
                  initparams = c(0, 0),
                  vars1 = NULL,
                  vars2 = NULL,
                  mod.name = 'lz', 
                  expectcatchmodels = list('individual'))
## Model design file done
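
For reference, a standard conditional logit gives the probability that trip $i$ chooses zone $j$ from choice set $C$ as shown below. Writing the utility with one expected catch term $E_{ij}$ and one travel distance term $d_{ij}$ is an assumption based on the description in this example (no additional variables in vars1 or vars2, and two starting values in initparams); the symbols $\beta_E$ and $\beta_D$ are introduced here for illustration, and FishSET's exact parameterization of logit_c may differ.

```latex
P_{ij} = \frac{\exp(V_{ij})}{\sum_{k \in C} \exp(V_{ik})},
\qquad
V_{ij} = \beta_E \, E_{ij} + \beta_D \, d_{ij}
```

Under this form, a positive $\beta_E$ makes zones with higher expected catch more likely to be chosen, and a negative $\beta_D$ makes more distant zones less likely.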

Run Models

Use discretefish_subroutine() to run all models in the model design file.

discretefish_subroutine(project = "scallopMod", explorestarts = FALSE)

Use model_params() to see the model output. user1 and user2 (labeled exp1 and exp2 in the output below) are the expected catch parameters and V1 is the travel distance parameter. A reasonably specified model should find positive coefficients for user1 and user2 and a negative coefficient for V1.

model_params("scallopMod", output = 'print')
  • lz:

         estimate std_error t_value
    exp1    0.047     0.003   17.32
    exp2    0.064     0.003   23.82
    V1     -0.005         0  -25.93
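
To see what coefficients of this size imply, the sketch below turns one expected catch coefficient and the distance coefficient into choice probabilities for three made-up zones using the conditional logit (softmax) form. Pairing 0.047 with -0.005 in a single utility function is an assumption for illustration, and the expected catches and distances are invented.

```r
# Coefficients taken from the output above; combining them in one utility
# function is an assumption made for illustration.
beta_exp  <- 0.047
beta_dist <- -0.005

# Made-up alternatives: expected catch and travel distance for each zone
exp_catch <- c(Z1 = 10, Z2 = 10, Z3 = 20)
distance  <- c(Z1 = 50, Z2 = 200, Z3 = 200)

v <- beta_exp * exp_catch + beta_dist * distance   # systematic utility per zone
p <- exp(v - max(v)) / sum(exp(v - max(v)))        # softmax, shifted for stability
print(round(p, 3))
```

Z1 and Z2 have the same expected catch, so the closer zone (Z1) is more likely to be chosen; Z2 and Z3 are equally distant, so the zone with the higher expected catch (Z3) is more likely.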

Compare model fit.

Model_name AIC AICc BIC PseudoR2
lz -21307 -21307 -21288 0.987
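
For readers unfamiliar with these measures, the sketch below computes them from a log-likelihood using their textbook definitions, with McFadden's formula for the pseudo R-squared. Whether FishSET uses exactly these definitions is not stated here, so treat this as an assumption. The helper fit_stats() and its input values are hypothetical and are not taken from the model above.

```r
# Hypothetical helper: textbook fit statistics from log-likelihoods
# loglik: fitted model, loglik_null: null model, k: parameters, n: observations
fit_stats <- function(loglik, loglik_null, k, n) {
  aic <- 2 * k - 2 * loglik
  c(AIC = aic,
    AICc = aic + (2 * k * (k + 1)) / (n - k - 1),   # small-sample correction
    BIC = k * log(n) - 2 * loglik,
    PseudoR2 = 1 - loglik / loglik_null)            # McFadden's pseudo R-squared
}

# Made-up values for illustration only
fs <- fit_stats(loglik = -1200, loglik_null = -1500, k = 3, n = 1000)
print(round(fs, 3))
```

Lower AIC, AICc, and BIC values indicate a better trade-off between fit and complexity when comparing models estimated on the same data.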