Package 'FishSET' reference manual

Title:	Spatial Economics Toolbox for Fisheries
Description:	The Spatial Economics Toolbox for Fisheries (FishSET) is a set of tools for organizing data; developing, improving and disseminating modeling best practices.
Authors:	Lisa Pfeiffer [aut, cre], Paul G Carvalho [aut], Anna Abelman [aut], Min-Yang Lee [aut], Melanie Harsch [aut], Bryce McManus [aut], Alan Haynie [aut]
Maintainer:	Lisa Pfeiffer <[email protected]>
License:	MIT + file LICENSE
Version:	1.1.0
Built:	2025-03-21 18:29:17 UTC
Source:	https://github.com/noaa-nwfsc/FishSET

Add removed variables back into dataset - non-interactive version

Description

Add columns that have been removed from the primary dataset back into the primary dataset.

Usage

add_vars(working_dat, raw_dat, vars, project)

Arguments

`working_dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`raw_dat`	Unmodified raw version of the primary dataset. Should be a character string specifying a table from the FishSET database whose name contains the string 'MainDataTable' and the date the table was created.
`vars`	Character string, variables from `raw_dat` to add back into `working_dat`.
`project`	Character, name of project. Parameter is used to generate meaningful table names in FishSET database.

Details

Add variables back into the dataset that were removed. The removed variables are obtained from the raw_dat and merged into the working data based on a row identifier. The row identifier is created when a variable is removed using the select_vars function. The row identifier is used to match the raw data variables to working_dat.
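
The matching step described above can be pictured with a small base R sketch. It is an illustration only, not the FishSET implementation: the identifier column name `row_id` and the toy data frames are assumptions, since the manual does not name the row identifier column.

```r
# Toy raw data with a hypothetical row identifier column `row_id`
raw_dat <- data.frame(
  row_id = 1:4,
  HAUL = c(10, 12, 9, 15),
  PORT = c("A", "B", "A", "C"),
  stringsAsFactors = FALSE
)

# Working data after PORT was removed and one row was filtered out
working_dat <- raw_dat[raw_dat$row_id != 3, c("row_id", "HAUL")]

# Add PORT back by matching on the row identifier
vars <- "PORT"
restored <- merge(working_dat, raw_dat[, c("row_id", vars)],
                  by = "row_id", all.x = TRUE)
restored
```

Because the join is on the identifier rather than on row position, rows dropped from the working data do not shift the restored values.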

Examples

## Not run: 
add_vars(pcodMainDataTable, "pcodMainDataTable20200410", "pollock")

## End(Not run)

Add removed variables back into dataset

Description

Add columns that have been removed from the primary dataset back into the primary dataset.

Usage

add_vars_gui(working_dat, raw_dat, project)

Arguments

`working_dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`raw_dat`	Unmodified raw version of the primary dataset. Should be a character string specifying a table from the FishSET database whose name contains the string 'MainDataTable' and the date the table was created.
`project`	String, name of project.

Details

Opens an interactive table that allows users to select which variables to add back into the working dataset.
The removed variables are obtained from the raw_dat and merged into the working data based on a row identifier. The row identifier is created when the variable is removed using the select_vars function. The row identifier is used to match the raw data variables to working_dat.

Examples

## Not run: 
select_vars_gui(pcodMainDataTable)
add_vars_gui(pcodMainDataTable, 'pcodMainDataTable20100101', 'pcod')

## End(Not run)

Aggregating function

Description

Aggregating function

Usage

agg_helper(
  dataset,
  value,
  period = NULL,
  group = NULL,
  within_group = NULL,
  fun = "sum",
  count = FALSE,
  format_tab = "decimal"
)

Arguments

`dataset`	'MainDataTable' to aggregate.
`value`	String, name of variable to aggregate.
`period`	String, name of period variable to aggregate by. Primarily for internal use. Places temporal variables to the right-end of the summary table.
`group`	String, name of grouping variable(s) to aggregate by.
`within_group`	String, name of grouping variable(s) for calculating within group percentages. `fun = "percent"` and `period` or `group` are required.
`fun`	String, function name to aggregate by. Also accepts anonymous functions. To calculate percentage, set `fun = "percent"`; this will return the percent of total when `within_group = NULL`.
`count`	Logical, if `TRUE` then returns the number of observations by `period` and/or `group`.
`format_tab`	String. Options include `"decimal"` (default), `"scientific"`, and `"PrettyNum"` (rounds to two decimal places and uses commas).
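
The difference between percent of total and within-group percent can be sketched in base R. This is not how agg_helper is implemented; the data frame and its column names (`PORT`, `GEAR`, `CATCH`) are made up for illustration.

```r
# Toy catch records; column names are hypothetical
dat <- data.frame(
  PORT = c("A", "A", "B", "B"),
  GEAR = c(1, 2, 1, 2),
  CATCH = c(10, 30, 20, 40)
)

# Sum of catch by port and gear (comparable in spirit to fun = "sum")
agg <- aggregate(CATCH ~ PORT + GEAR, data = dat, FUN = sum)

# Percent of total across all groups
agg$pct_total <- agg$CATCH / sum(agg$CATCH) * 100

# Within-group percent: each row's share of its port's total
agg$pct_within_port <- agg$CATCH / ave(agg$CATCH, agg$PORT, FUN = sum) * 100
agg
```

Percent of total sums to 100 over the whole table, while the within-group percentages sum to 100 inside each port.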

Examples

## Not run: 

# total catch by port
agg_helper(pollockMainDataTable, value = "OFFICIAL_TOTAL_CATCH_MT", 
           group = "PORT_CODE", fun = "sum")

# count permits
agg_helper(pollockMainDataTable, value = "PERMIT", count = TRUE, fun = NULL)

# count permits by gear type
agg_helper(pollockMainDataTable, value = "PERMIT", group = "GEAR_TYPE",
           count = TRUE, fun = NULL)

# percent of total by gear type
agg_helper(pollockMainDataTable, value = "PERMIT", group = "GEAR_TYPE",
           count = TRUE, fun = "percent")
 
# within group percentage          
agg_helper(pollockMainDataTable, value = "OFFICIAL_TOTAL_CATCH_MT", 
           fun = "percent", group = c("PORT_CODE", "GEAR_TYPE"), 
           within_group = "PORT_CODE")

## End(Not run)

Get Alternative Choice List

Description

Returns the Alternative Choice list from the FishSET database.

Usage

alt_choice_list(project, name = NULL)

Arguments

`project`	Name of project.
`name`	Name of Alternative Choice list in the FishSET database. The table name will contain the string "AltMatrix". If `NULL`, the default table is returned. Use `tables_database` to see a list of FishSET database tables by project.

Set x-axis labels to 45 degrees

Description

Set x-axis labels to 45 degrees

Usage

angled_theme()

Assign observations to fishing zones

Description

Assign each observation in the primary dataset to a fishery management or regulatory zone. This function is primarily called by other functions that require zone assignment but can also be used on its own.

Usage

assignment_column(
  dat,
  project,
  spat,
  lon.dat,
  lat.dat,
  cat,
  name = "ZoneID",
  closest.pt = FALSE,
  bufferval = NULL,
  lon.spat = NULL,
  lat.spat = NULL,
  hull.polygon = FALSE,
  epsg = NULL,
  log.fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project.
`spat`	Spatial data containing information on fishery management or regulatory zones. `sf` objects are recommended, but `sp` objects can be used as well. If using a spatial table read from a csv file, then arguments `lon.spat` and `lat.spat` are required. To upload your spatial data to the FishSETFolder see `load_spatial`.
`lon.dat`	Longitude variable in `dat`.
`lat.dat`	Latitude variable in `dat`.
`cat`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class `sf`, `cat` should be name of list containing information on zones.
`name`	The name of the new assignment column. Defaults to `"ZoneID"`.
`closest.pt`	Logical, if `TRUE`, observations that fall outside zones are classed as the closest zone polygon to the point.
`bufferval`	Maximum buffer distance, in meters, for assigning observations to the closest zone polygon. If the observation is not within the defined `bufferval`, then it will not be assigned to a zone polygon. Required if `closest.pt = TRUE`.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for spatial tables read from csv files. Leave as `NULL` if `spat` is an `sf` or `sp` object.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for spatial tables read from csv files. Leave as `NULL` if `spat` is an `sf` or `sp` object.
`hull.polygon`	Logical, if `TRUE`, creates convex hull polygon. Use if spatial data creating polygon are sparse or irregular.
`epsg`	EPSG code. Manually set the epsg code, which will be applied to `spat` and `dat`. If epsg is not specified but is defined for `spat`, then the `spat` epsg will be applied to `dat`. In addition, if epsg is not specified and epsg is not defined for `spat`, then a default epsg value will be applied to `spat` and `dat` (`epsg = 4326`). See http://spatialreference.org/ to help identify optimal epsg number.
`log.fun`	Logical, whether to log function call (for internal use).

Details

Function uses the specified latitude and longitude from the primary dataset to assign each row of the primary dataset to a zone. Zone polygons are defined by the spatial dataset. Set hull.polygon to TRUE if spatial data is sparse or irregular. Function is called by other functions if a zone identifier does not exist in the primary dataset.

Value

Returns primary dataset with new assignment column.

Examples

## Not run: 
pollockMainDataTable <- 
     assignment_column(pollockMainDataTable, "pollock", spat = pollockNMFSSpatTable,
                       lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT")

## End(Not run)

Creates numeric variables divided into equal sized groups

Description

Creates numeric variables divided into equal sized groups

Usage

bin_var(dat, project, var, br, name = "bin", labs = NULL, ...)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`var`	Numeric variable in `dat` to bin into a factor.
`br`	Numeric vector. If a single number, the range of `var` is divided into `br` even groups. If two or more values are given, `var` is divided into intervals.
`name`	Variable name to return. Defaults to 'bin'.
`labs`	A character string of category labels.
`...`	Additional arguments passed to `cut`.

Details

Function adds a new factor variable, labeled by name, to the primary dataset. The numeric variable is divided into equal sized groups if the length of br is equal to one and into intervals if the length of br is greater than one.
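
The two modes of `br` can be illustrated with base R's cut(), which the `...` argument is passed to. The sketch below assumes `br` maps onto the `breaks` argument of cut(); the vector `haul` is toy data.

```r
haul <- c(1, 4, 7, 12, 18, 20)

# A single number: the range of the variable is split into that many
# equal-width intervals
bin_equal <- cut(haul, breaks = 4)

# Two or more values: the values are used as interval boundaries;
# observations outside the boundaries become NA
bin_breaks <- cut(haul, breaks = c(0, 5, 10, 20),
                  labels = c("low", "mid", "high"))

table(bin_equal)
table(bin_breaks, useNA = "ifany")
```

With a single break count the intervals have equal width, not necessarily equal numbers of observations.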

Value

Returns the primary dataset with binned variable added.

Examples

## Not run: 
 pollockMainDataTable <- bin_var(pollockMainDataTable, 'pollock', 'HAUL', 10, 'HAULCAT')
 pollockMainDataTable <- bin_var(pollockMainDataTable, 'pollock', 'HAUL', c(5,10), 'HAULCAT')

## End(Not run)

Compare bycatch CPUE and total catch/percent of total catch for one or more species

Description

Compare bycatch CPUE and total catch/percent of total catch for one or more species

Usage

bycatch(
  dat,
  project,
  cpue,
  catch,
  date,
  period = "year",
  names = NULL,
  group = NULL,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  value = "stc",
  combine = FALSE,
  scale = "fixed",
  output = "tab_plot",
  format_tab = "wide"
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Name of project.
`cpue`	A string of CPUE variable names. The function outputs the mean CPUE by period. The variable names must match the order of variable names in `catch` and `names`.
`catch`	A character string of names of catch variables to aggregate. The function outputs the total catch or share of total catch by period depending on the value argument. The order of the catch variable string must match those of the `cpue` and `names` arguments.
`date`	A variable containing dates to aggregate by.
`period`	Period to aggregate by. Options include 'year', 'month', and 'weeks'.
`names`	An optional string of species names that will be used in the plot. If `NULL`, then species names from `catch` will be used.
`group`	A categorical variable in `dat` to group by.
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start and end date into `date_value` as strings: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period type: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. Accepts up to two variables. Faceting by `"year"`, `"month"`, or `"week"` is available if a date variable is added to `sub_date`.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function entered as a string. Defaults to `"none"` for no conversion.
`tran`	A function to transform the y-axis. Options include log, log2, log10, sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`value`	Whether to return raw catch (`"raw"`) or share of total catch (`"stc"`).
`combine`	Logical, whether to combine variables listed in `group`.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`. Other options include `"free_y"`, `"free_x"`, and `"free_xy"`.
`output`	Output type. Options include 'table', 'plot', or 'tab_plot' (the default) to return both.
`format_tab`	How table output should be formatted. Options include `'wide'` (the default) and `'long'`.

Details

Returns a plot and/or table of the mean CPUE and share of total catch or raw count for each species entered. For optimal plot size in an R Notebook/Markdown document, we recommend including no more than four species. The order of variables in the cpue and catch arguments must be in the same order as in the names argument. The names argument is used to join the catch and cpue variables together.

Value

bycatch() compares the average CPUE and catch total/share of total catch between one or more species. The data can be filtered by date and/or by a variable. filter_date specifies the type of date filter to apply: by date range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value. Only one grouping variable will be displayed; however, any number of variables can be combined by using combine = TRUE, but no more than three is recommended. For faceting, any variable in the dataset can be used, but "year" and "month" are also available provided a date variable is added to sub_date. Generally, no more than four species should be compared, and even fewer when faceting due to limited plot space. A list containing a table and plot is printed to the console and viewer by default. For optimal plot size in an R Notebook/Markdown document, use the chunk option fig.asp = 1.
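
The year-period filter format can be pictured with a base R sketch. It mimics what `filter_date = "year-month"` with `date_value = list(2011:2013, 5:7)` is documented to select; it is not the package's own filtering code, and the toy data frame is an assumption.

```r
# Toy data with a date column; values are made up
dat <- data.frame(
  FISHING_START_DATE = as.Date(c("2010-06-01", "2011-05-15", "2012-07-04",
                                 "2013-08-20", "2011-01-10")),
  catch = c(5, 8, 3, 9, 4)
)

# list(year, period): years 2011-2013, months May through July
date_value <- list(2011:2013, 5:7)
yr <- as.integer(format(dat$FISHING_START_DATE, "%Y"))
mo <- as.integer(format(dat$FISHING_START_DATE, "%m"))

# Keep rows that match both the year and the month component
filtered <- dat[yr %in% date_value[[1]] & mo %in% date_value[[2]], ]
filtered
```

Only observations that satisfy both components are kept, so a 2011 trip in January and a 2010 trip in June are both dropped.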

Examples

## Not run: 
cpue(pollockMainDataTable, "myproject", xWeight = "f1Weight",
  xTime = "Hour", name = "f1_cpue"
)

bycatch(pollockMainDataTable, "myproject", 
        cpue = c("f1_cpue", "f2_cpue", "f3_cpue", "f4_cpue"),
        catch = c("f1", "f2", "f3", "f4"), date = "FISHING_START_DATE",
        names = c("fish_1", "fish_2", "fish_3", "fish_4"), period = "month",
        filter_date = "year", date_value = 2011, value = "stc",
        output = "table")

## End(Not run)

Linear Model for Catch

Description

First stage regression model for catch.

Usage

catch_lm(
  dat,
  project,
  catch.formula,
  zoneID = NULL,
  exp.name = NULL,
  new.name = NULL,
  date,
  output = "matrix"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`catch.formula`	A formula object specifying the linear model. See `stats::lm()`.
`zoneID`	Variable in `dat` that identifies the individual zones or areas. Required if merging expected catch into `dat` using `exp.name`, or when creating a new expected catch matrix and `exp.name` is `NULL` (see `output` below).
`exp.name`	Name(s) of expected catch matrix to merge into `dat`.
`new.name`	Optional, string. When `output = 'matrix'`, `new.name` will become the name of the new expected catch matrix saved to the FishSET DB expected catch list. When `output = 'dataset'`, `new.name` will become the name of the new expected catch variable added to the primary dataset.
`date`	Date variable from `dat` used to create expected catch matrix.
`output`	Whether to output `dat` with the expected catch variable added (`'dataset'`) or to save an expected catch matrix to the expected catch FishSET DB table (`'matrix'`). Defaults to `output = 'matrix'`.

Details

catch_lm() can merge an expected catch matrix into the primary dataset before running the linear model. This is done by passing exp.name and zoneID to merge_expected_catch() and is for convenience. Users can instead do this separately using merge_expected_catch(); in that case, leave exp.name empty when running catch_lm(). Merging expected catch in a separate step is useful for creating tables and plots before running a first stage linear regression.

Value

catch_lm() has two output options: dataset and matrix. When output == 'dataset', the primary dataset will be returned with the fitted values from the model added as a new column. The new column is named using new.name.

When output == 'matrix' an expected catch matrix is created and saved to the FishSET DB expected catch list (it is not outputted to the console). There are two ways to create an expected catch matrix: by using an existing expected catch matrix in catch.formula, or by using a zone-identifier column (i.e. zoneID) in the catch.formula. For example, if you have created an expected catch matrix named 'user1' using create_expectations(), catch.formula could equal catch ~ vessel_length * user1. In this case exp.name would equal 'user1'. Alternatively, you could create an expected catch matrix by specifying catch.formula as catch ~ vessel_length * zone. In this case, exp.name = NULL and zoneID = 'zone'.
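
The 'dataset' style of output can be sketched with stats::lm() directly. This is an illustration on simulated data, not the internals of catch_lm(); the column names (`vessel_length`, `zone`, `catch`, `exp_catch`) are hypothetical.

```r
set.seed(1)

# Simulated primary data with a zone-identifier column
dat <- data.frame(
  vessel_length = runif(20, 20, 60),
  zone = factor(rep(c("Z1", "Z2"), each = 10))
)
dat$catch <- 2 + 0.1 * dat$vessel_length + (dat$zone == "Z2") * 3 + rnorm(20)

# Fit a linear model with the zone identifier in the formula
catch.formula <- catch ~ vessel_length * zone
fit <- lm(catch.formula, data = dat)

# Analogous to output = 'dataset': fitted values added as a new column
dat$exp_catch <- fitted(fit)
head(dat)
```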

Save Primary Table's Centroid Columns to FishSET Database

Description

Save the unique centroid values from the primary table to the FishSET Database. Use this function if zone ID and centroid longitude/latitude are included in the primary table.

Usage

centroid_to_fsdb(
  dat,
  spat.name = NULL,
  project,
  zoneID,
  cent.lon,
  cent.lat,
  type = "zone"
)

Arguments

`dat`	Required, main data frame containing data on hauls or trips. Table in FishSET database should contain the string `MainDataTable`.
`spat.name`	Optional, a name to associate with the centroid table.
`project`	Name of project.
`zoneID`	Variable in `dat` that identifies the individual zones or areas.
`cent.lon`	Required, variable in `dat` that identifies the centroid longitude of zones or areas.
`cent.lat`	Required, variable in `dat` that identifies the centroid latitude of zones or areas.
`type`	The type of centroid. Options include `"zone"` for zonal centroids and `"fish"` for fishing centroids.

Details

In certain cases, the user may have the necessary spatial variables to run a discrete choice model included in the primary table when uploaded to FishSET, and does not need a spatial table to assign observations to zones or find centroids (e.g. by using create_centroid()). However, a centroid table must be saved to the FishSET Database if a centroid option is used to define alternative choice (see create_alternative_choice()). centroid_to_fsdb() allows users to save a zonal or fishing centroid table provided they have the required variables: a zone ID (zoneID), a centroid longitude (cent.lon), and a centroid latitude (cent.lat) column.
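
Extracting the unique centroid values from a primary table can be sketched in base R as follows. The data frame is toy data, and the step that writes the table to the FishSET database is deliberately left out because it is specific to the package.

```r
# Toy primary table that already carries zone IDs and centroid coordinates
dat <- data.frame(
  zone = c(1, 1, 2, 2, 3),
  cent.lon = c(-170.5, -170.5, -168.0, -168.0, -165.2),
  cent.lat = c(55.1, 55.1, 56.3, 56.3, 57.0),
  catch = c(4, 6, 2, 9, 5)
)

# Keep one row per zone with its centroid coordinates
centroid_tab <- unique(dat[, c("zone", "cent.lon", "cent.lat")])

# Each zone should map to exactly one centroid
stopifnot(!anyDuplicated(centroid_tab$zone))
centroid_tab
```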

Change variable data class

Description

View data class for each variable and call appropriate functions to change data class as needed.

Usage

change_class(dat, project, x = NULL, new_class = NULL, save = FALSE)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Name of project.
`x`	A character string of variable(s) in `dat` that will be changed to `new_class`. One or more variables may be included. Defaults to NULL.
`new_class`	A character string of data classes that `x` should be changed to. Length of `new_class` should match the length of `x` unless all variables in `x` should be the same `new_class`. Defaults to NULL. Options are "numeric", "factor", "date", "character". Must be in quotes.
`save`	Logical. Should the data table be saved in the FishSET database, replacing the working data table in the database? Defaults to FALSE.

Details

Returns a table with data class for each variable in dat and changes variable classes. To view variable classes run the function with default settings, specifying only dat and project. If variable class should be changed, run the function again, specifying the variable(s) (x) to be changed and the new_class(es) (new_class). Set save to TRUE to save modified data table.
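
The view-then-change workflow can be sketched in base R. The lookup list `converters` is a hypothetical helper for illustration and is not part of FishSET; the sketch only shows how `x` and `new_class` pair up by position.

```r
dat <- data.frame(
  HAUL = c("1", "2", "3"),
  PORT = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# View the class of each variable
sapply(dat, class)

# Change classes; x and new_class are matched by position
x <- c("HAUL", "PORT")
new_class <- c("numeric", "factor")
converters <- list(numeric = as.numeric, factor = as.factor,
                   character = as.character, date = as.Date)
for (i in seq_along(x)) {
  dat[[x[i]]] <- converters[[new_class[i]]](dat[[x[i]]])
}
sapply(dat, class)
```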

Value

Table with data class for each variable and the working data with modified data class as specified.

Examples

## Not run: 
#View table without changing class or saving
change_class(pollockMainDataTable, "myproject")

#Change class for a single variable and save data table to FishSET database
change_class(pollockMainDataTable, "myproject", x = "HAUL", new_class = 'numeric', save=TRUE)

#Change class for multiple variables and save data table to FishSET database
change_class(pollockMainDataTable, "myproject", x = c("HAUL","DISEMBARKED_PORT"),
 new_class = c('numeric', 'factor'), save=TRUE)

## End(Not run)

Check for common data quality issues affecting modeling functions

Description

Check the primary dataset for NAs, NaNs, Inf, and that each row is a unique choice occurrence

Usage

check_model_data(dat, project, uniqueID, latlon = NULL, save.file = TRUE)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`uniqueID`	Variable in `dat` containing unique occurrence identifier.
`latlon`	Vector of names for variables with lat, lon coordinates to be checked if using 'lat-lon' as starting location.
`save.file`	Logical, if TRUE and no data issues are identified, the dataset is saved to the FishSET database. Defaults to `TRUE`.

Details

It is best to check the data for NAs, NaNs and Inf, and that each row is a unique choice occurrence after data creation functions have been run but before making the model design file (make_model_design). These steps should be taken even if the data passed earlier data verification checks, as data quality issues can arise in the creation or modification of data. Model functions may fail or return inaccurate results if data quality issues exist. The integrated data will not save if any of these issues are in the dataset. If data passes all tests, then data will be saved in the FishSET database with the prefix ‘final’. The data index table will also be updated and saved.
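
The kinds of checks described above can be sketched in base R. This is a simplified illustration on a toy data frame, not the code check_model_data() runs.

```r
dat <- data.frame(
  uniqueID_Code = c(1, 2, 2, 4),
  catch = c(5, NA, 3, Inf),
  effort = c(1, 2, NaN, 4)
)

# Columns containing NA (is.na() is also TRUE for NaN)
has_na <- names(dat)[sapply(dat, function(col) any(is.na(col)))]

# Columns containing NaN specifically
has_nan <- names(dat)[sapply(dat, function(col) any(is.nan(col)))]

# Columns containing infinite values
has_inf <- names(dat)[sapply(dat, function(col) any(is.infinite(col)))]

# Each row should be a unique choice occurrence
id_is_unique <- !anyDuplicated(dat$uniqueID_Code)
```

In this toy example every check flags a problem, so under the rule described above the data would not be saved.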

Value

Returns statements of data quality issues in the data. Saves table to FishSET database.

Examples

## Not run: 
check_model_data(MainDataTable, uniqueID = "uniqueID_Code", save.file = TRUE)

## End(Not run)

Check and correct spatial data format

Description

Converts spatial data to a sf object

Usage

check_spatdat(spatdat, lon = NULL, lat = NULL, id = NULL)

Arguments

`spatdat`	Spatial data containing information on fishery management or regulatory zones.
`lon`	Longitude variable in `spatdat`. This is required for csv files or if `spatdat` is a dataframe (i.e. is not a `sf` or `sp` object).
`lat`	Latitude variable in `spatdat`. This is required for csv files or if `spatdat` is a dataframe (i.e. is not a `sf` or `sp` object).
`id`	Polygon ID column. This is required for csv files or if `spatdat` is a dataframe (i.e. is not a `sf` or `sp` object).

Details

This function checks whether spatdat is a sf object and attempts to convert it if not. It also applies clean_spat which fixes certain spatial issues such as invalid or empty polygons, whether a projected CRS is used (converts to WGS84 if detected), and if longitude should be shifted to Pacific view (0-360 format) to avoid splitting the Alaska region during plotting.

Retrieve closure scenario names

Description

A helper function used to display the names of currently saved closure scenarios.

Usage

close_names(project)

Arguments

`project`	Name of project.

Details

To retrieve the complete closure scenario file, use get_closure_scenario.

Combine zone and closure area

Description

Creates a new spatial dataset that merges regulatory zones with closure areas.

Usage

combine_zone(spat, closure, grid.nm, closure.nm, recast = TRUE)

Arguments

`spat`	Spatial file containing regulatory zones.
`closure`	Closure file containing closure areas.
`grid.nm`	Character, column name containing grid ID.
`closure.nm`	Character, column name containing closure ID.
`recast`	Logical, if `TRUE` `combined` is passed to `recast_multipoly`.

Details

To combine zones with closure areas, this function performs the following steps:

1. Create the union of the closure area
2. Take the difference between the closure union and the zone file
3. Take the intersection of zone and the closure union
4. Combine the difference and intersection objects into one spatial dataframe
5. Assign new zone IDs to intersecting polygons

The result is a single spatial dataset containing all polygons from both spat and closure with overlapping (intersecting) polygons receiving new IDs (see new_zone_id). This allows users to partially close regulatory zones during the model design stage.

Confidentiality cache exists

Description

Returns TRUE if confidentiality cache file is found in the project output folder.

Usage

confid_cache_exists(project)

Arguments

`project`	Name of project.

Examples

## Not run: 
confid_cache_exists("pollock")

## End(Not run)

View correlation coefficients between numeric variables

Description

Correlation coefficients can be displayed between all numeric variables or selected numeric variables. Defaults to the Pearson correlation coefficient. To change the method, set 'method' to 'kendall' or 'spearman'. Both a plot and table output are generated and saved to the 'output' folder.

Usage

corr_out(
  dat,
  project,
  variables = "all",
  method = "pearson",
  show_coef = FALSE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, project name.
`variables`	A character string of variables to include. Defaults to `"all"` numeric variables.
`method`	A character string indicating which correlation coefficient is to be computed. One of "pearson" (default), "kendall", or "spearman".
`show_coef`	Logical, whether to include the correlation coefficients on the correlation plot. Only coefficients with a p-value of less than or equal to 0.05 are shown.

Details

Returns the correlation coefficients between numeric variables (Pearson's by default) in plot and table format. Output is saved to the output folder.
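
The underlying calculation can be sketched with stats::cor() and stats::cor.test(). The simulated data frame is an assumption, and the sketch leaves out the plotting and file output that corr_out() adds.

```r
set.seed(42)

# Simulated data: b depends on a, c is independent, port is non-numeric
dat <- data.frame(a = rnorm(30))
dat$b <- dat$a * 2 + rnorm(30, sd = 0.5)
dat$c <- rnorm(30)
dat$port <- sample(c("A", "B"), 30, replace = TRUE)

# Keep numeric variables only, then compute the coefficient matrices
num_dat <- dat[sapply(dat, is.numeric)]
corr_pearson <- cor(num_dat, method = "pearson")
corr_spearman <- cor(num_dat, method = "spearman")

# p-value for one pair, the quantity that show_coef compares against 0.05
p_ab <- cor.test(num_dat$a, num_dat$b, method = "pearson")$p.value
round(corr_pearson, 2)
```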

Examples

## Not run: 
corr_out(pollockMainDataTable, 'pollock', 'all')

## End(Not run)

Create catch or revenue per unit effort variable

Description

Add catch per unit effort (CPUE) or revenue per unit effort variable to the primary dataset. Catch should be a weight variable but can be a count. Effort should be in a duration of time, such as days, hours, or minutes.

Usage

cpue(dat, project, xWeight = NULL, xTime, price = NULL, name = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`xWeight`	Catch variable in `dat`. Variable should be a measure of weight (pounds, metric tons, etc) but can also be count. If calculating revenue per unit effort (RPUE) and a revenue column exists in `dat`, then add the revenue column to `price` and set `xWeight = NULL`.
`xTime`	Duration of time variable in `dat` representing effort, such as weeks, days, hours, or minutes.
`price`	Optional, variable from `dat` containing price/value data. Price is multiplied against the catch variable, `xWeight`, to generate revenue. If revenue exists in `dat` and you wish to use this revenue instead of price, then `xWeight` must be `NULL`. Defaults to `NULL`.
`name`	String, name of created variable. Defaults to "cpue", or to "rpue" if `price` is not `NULL`.

Details

Creates the catch or revenue per unit effort variable. Catch variable should be in weight (lbs, mts). Effort variable should be a measurement of duration in time. New variable is added to the primary dataset with the column name defined by the name argument. CPUE for individual species should be calculated separately.
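
The arithmetic described above can be sketched in base R. This is an illustration under stated assumptions rather than the function's actual implementation: the data frame `hauls` and its column names are hypothetical, and the handling of missing or zero durations is not shown.

```r
# Hypothetical haul records; names and values are illustrative only.
hauls <- data.frame(
  catch_mt     = c(12, 30, 0),
  duration_min = c(60, 120, 45),
  price_per_mt = c(500, 450, 480)
)

# Catch per unit effort: catch divided by the effort duration.
hauls$cpue <- hauls$catch_mt / hauls$duration_min

# Revenue per unit effort: price times catch, divided by the effort duration.
hauls$rpue <- hauls$catch_mt * hauls$price_per_mt / hauls$duration_min

print(hauls)
```

Because the result is a rate, the choice of time unit for the effort variable (minutes here) sets the unit of the new variable.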

Value

Returns primary dataset with CPUE variable added.

Examples

## Not run: 
pollockMainDataTable <- cpue(pollockMainDataTable, 'pollock', 
                             xWeight = 'OFFICIAL_TOTAL_CATCH_MT', 
                             xTime = 'DURATION_IN_MIN', name = 'cpue')

## End(Not run)

Define alternative fishing choice

Description

Required step. Creates a list identifying how alternative fishing choices should be defined. Output is saved to the FishSET database. Run this function before running models. dat must have a zone assignment column (see assignment_column()). In certain cases a centroid table must be saved to the FishSET Database, see occasion_var for details.

Usage

create_alternative_choice(
  dat,
  project,
  occasion = "zonal centroid",
  occasion_var = NULL,
  alt_var = "zonal centroid",
  dist.unit = "miles",
  min.haul = 0,
  zoneID,
  zone.cent.name = NULL,
  fish.cent.name = NULL,
  spatname = NULL,
  spatID = NULL,
  outsample = FALSE
)

Arguments

`dat`	Required, Primary data frame containing data on hauls or trips. Table in FishSET database should contain the string `MainDataTable`.
`project`	Required, name of project.
`occasion`	String, determines the starting point when calculating the distance matrix. Options are `"zonal centroid"`, `"fishing centroid"`, `"port"`, or `"lon-lat"`. See `occasion_var` for requirements.
`occasion_var`	Identifies an ID column or set of lon-lat variables needed to create the distance matrix. Possible options depend on the value of `occasion`:
Centroid:	When `occasion` is `"zonal centroid"` or `"fishing centroid"`, the possible options are `NULL`, the name of a zone ID variable, or a set of coordinate variables (in lon-lat order).
    NULL:	Merges centroid lon-lat data to the primary table using the column entered in `zoneID`. A centroid table must be saved to the FishSET Database.
    Zone ID:	Specifies the zone ID variable to merge the centroid table to. For example, a column containing the previous zonal area. A centroid table must be saved to the FishSET Database.
    Lon-Lat:	A string vector of length two containing the longitude and latitude of an existing set of centroid variables in `dat`.
Port:	When `occasion = "port"`, the possible options include the name of a port ID variable or a set of lon-lat variables describing the location of the port. A value of `NULL` will return an error.
    Port ID:	The name of a port ID variable in `dat` that will be used to join the port table to the primary table. A port table is required (see `load_port()`) which contains the port name and the longitude and latitude of each port.
    Lon-Lat:	A string vector of length two containing a port's longitude and latitude in `dat`.
Lon-Lat:	When `occasion = "lon-lat"`, `occasion_var` must contain a string vector of length two containing the longitude and latitude of a vessel's location in `dat`. For example, the current or previous haul location.
`alt_var`	Determines the alternative choices used to calculate the distance matrix. `alt_var` may be the centroid of zonal assignment (`"zonal centroid"`), `"fishing centroid"`, or the closest point in fishing zone (`"nearest point"`). The centroid options require that the appropriate centroid table has been saved to the project's FishSET Database. See `create_centroid()` to create and save centroids. List existing centroid tables by running `list_tables("project", type = "centroid")`.
`dist.unit`	String, how distance measure should be returned. Choices are `"meters"` or `"m"`, `"kilometers"` or `"km"`, `"miles"`, or `"nmiles"` (nautical miles). Defaults to `"miles"`.
`min.haul`	Required, numeric, minimum number of hauls. Zones with fewer hauls than the `min.haul` value will not be included in model data.
`zoneID`	Variable in `dat` that identifies the individual zones or areas.
`zone.cent.name`	The name of the zonal centroid table to use when `occasion` or `alt_var` is set to `"zonal centroid"`. Use `list_tables("project", type = "centroid")` to view existing centroid tables. See `create_centroid()` to create centroid tables or `centroid_to_fsdb()` to create a centroid table from columns found in `dat`.
`fish.cent.name`	The name of the fishing centroid table to use when `occasion` or `alt_var` is set to `"fishing centroid"`. Use `list_tables("project", type = "centroid")` to view existing centroid tables. See `create_centroid()` to create centroid tables or `centroid_to_fsdb()` to create a centroid table from columns found in `dat`.
`spatname`	Required when `alt_var = 'nearest point'`. `spat` is a spatial data file containing information on fishery management or regulatory zone boundaries. `sf` objects are recommended, but `sp` objects can be used as well. See `dat_to_sf()` to convert a spatial table read from a csv file to an `sf` object. To upload your spatial data to the FishSETFolder see `load_spatial()`. If `spat` should come from the FishSET database, it should be the name of the original file, in quotes. For example, `"pollockNMFSZonesSpatTable"`. Use `tables_database()` or `list_tables("project", type = "spat")` to view the names of spatial tables in the FishSET database.
`spatID`	Required when `alt_var = 'nearest point'`. Variable in `spat` that identifies the individual zones or areas.
`outsample`	Logical, indicating whether this is for primary data or out-of-sample data.

Details

Defines the alternative fishing choices. These choices are used to develop the matrix of distances between observed and alternative fishing choices (where they could have fished but did not). The distance matrix is calculated by the make_model_design() function. occasion defines the observed fishing location and alt_var the alternative fishing location. occasion_var identifies an ID column or set of lon-lat variables needed to create the distance matrix.

Parts of the alternative choice list are pulled by the create_expectations(), make_model_design(), and model run (discretefish_subroutine()) functions. These outputs include choices of which variable to use for catch and which zones to include in analyses based on a minimum number of hauls per trip within a zone. Note that if the alternative choice list is modified, the create_expectations() and make_model_design() functions should also be rerun before rerunning models.
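
The zone inclusion rule controlled by min.haul can be sketched in base R. This is only an illustration of the idea (count hauls per zone, keep zones that meet the threshold), not the function's internal code; `zone`, `min_haul`, and `in_model` are hypothetical names, and `in_model` is meant to be analogous in spirit to the dataZoneTrue element described under Value.

```r
# Hypothetical zone assignments for nine hauls.
zone <- c("A", "A", "A", "B", "C", "C", "A", "C", "B")
min_haul <- 3

# Count hauls per zone.
haul_counts <- table(zone)

# Zones with fewer hauls than min_haul are dropped.
kept_zones <- names(haul_counts)[haul_counts >= min_haul]

# 0/1 flag per haul indicating whether its zone stays in the model data.
in_model <- as.integer(zone %in% kept_zones)

print(haul_counts)
print(kept_zones)
```

Here zone B has only two hauls, so with a threshold of three it is excluded and its hauls are flagged 0.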

Value

Saves the alternative choice list to the FishSET database as a list. Output includes:

dataZoneTrue:	Vector of 0/1 indicating whether the data from that zone is to be included in the model
greaterNZ:	Zones which pass the numOfNecessary test
numOfNecessary:	Minimum number of hauls for zone to be included
altChoiceUnits:	Set to miles
altChoiceType:	Set to distance
occasion:	Identifies how to find latitude and longitude for starting point
occasion_var:	ID column or set of lon-lat variables used to locate the starting point
alt_var:	Identifies how to find latitude and longitude for alternative choice
zoneRow:	Zones and choices array
zone_cent:	Geographic centroid for each zone. Generated from `find_centroid()`
fish_cent:	Fishing centroid for each zone. Generated from `find_fishing_centroid()`
zone_cent_name:	Name of the zonal centroid table
fish_cent_name:	Name of the fishing centroid table
spat:	Spatial data object
spatID:	Variable in spat that identifies individual zones

Create Centroid Table

Description

Create a zonal or fishing centroid table. The centroid can be joined with the primary data if output = "dataset". The centroid table is automatically saved to the FishSET Database.

Usage

create_centroid(
  spat = NULL,
  dat = NULL,
  project,
  spatID = NULL,
  zoneID = NULL,
  lon.dat = NULL,
  lat.dat = NULL,
  weight.var = NULL,
  type = "zonal centroid",
  names = NULL,
  cent.name = NULL,
  output = "dataset"
)

Arguments

`spat`	Spatial data containing information on fishery management or regulatory zones. Required for `type = "zonal centroid"`, not required for `type = "fishing centroid"`. `spat` will be included in centroid table name.
`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'. `dat` is not required if `type = "zonal centroid"` and `output = "centroid table"`.
`project`	Name of project.
`spatID`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `spatID` should be name of list containing information on zones. Ignored if `type = "fishing centroid"`.
`zoneID`	Variable in `dat` that identifies zonal assignments. `zoneID` is not required if `type = "zonal centroid"` and `output = "centroid table"`.
`lon.dat`	Longitude variable in `dat`. Required for `type = "fishing centroid"`.
`lat.dat`	Latitude variable in `dat`. Required for `type = "fishing centroid"`.
`weight.var`	Variable from `dat` for weighted average (for `type = "fishing centroid"` only). If `weight.var` is defined, the centroid is defined by the latitude and longitude of fishing locations in each zone weighted by `weight.var`.
`type`	The type of centroid to create. Options include `"zonal centroid"` and `"fishing centroid"`. See other arguments for `type` requirements.
`names`	Character vector of length two containing the names of the fishing centroid columns. The order should be `c("lon_name", "lat_name")`. The default names are `c("weight_cent_lon", "weight_cent_lat")` for weighted fishing centroid and `c("fish_cent_lon", "fish_cent_lat")` for unweighted fishing centroid.
`cent.name`	A string to include in the centroid table name. Table names take the form of `"projectNameZoneCentroid"` for zonal centroids and `"projectNameFishCentroid"` for fishing centroids.
`output`	Options are `"centroid table"`, `"dataset"`, or `"both"`. `"centroid table"` returns a table containing the zone name and the longitude and latitude of the centroid. `"dataset"` returns the primary table joined with the centroid table. `"both"` returns a list containing the merged primary table and the centroid table.
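
To make the difference between weighted and unweighted fishing centroids concrete, here is a minimal base R sketch. It assumes a fishing centroid is the (optionally weighted) mean of haul coordinates within each zone; the data frame `hauls` and its columns are hypothetical, and the package's own calculation may differ (for example in how it treats zones that span the antimeridian). The output column names follow the defaults listed for `names`.

```r
# Hypothetical haul locations with a catch column used as the weight.
hauls <- data.frame(
  zone  = c("A", "A", "B", "B"),
  lon   = c(-170, -168, -165, -163),
  lat   = c(55, 57, 58, 60),
  catch = c(1, 3, 2, 2)
)

# Unweighted fishing centroid: mean haul position per zone.
fish_cent <- aggregate(cbind(lon, lat) ~ zone, data = hauls, FUN = mean)
names(fish_cent) <- c("zone", "fish_cent_lon", "fish_cent_lat")

# Weighted fishing centroid: haul positions weighted by catch.
weight_cent <- do.call(rbind, lapply(split(hauls, hauls$zone), function(z) {
  data.frame(
    zone            = z$zone[1],
    weight_cent_lon = weighted.mean(z$lon, z$catch),
    weight_cent_lat = weighted.mean(z$lat, z$catch)
  )
}))

print(fish_cent)
print(weight_cent)
```

In zone A the weighted centroid is pulled toward the haul with the larger catch, while in zone B the equal weights leave the centroid unchanged.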

Interactive application to create distance between points variable

Description

Adds a variable for distance between two points to the primary dataset. There are two versions of this function. The difference between the two versions is how additional arguments specific to start and end locations are added. This version requires only five arguments to be specified before running. Additional arguments specific to identifying the lat/lon of start or end points are added through prompts. This function is designed for an interactive session. The create_dist_between_for_gui function requires all necessary arguments to be specified before running and is best used in a non-interactive session. Both versions of the distance between function require that the start and end points be different vectors. If the start or ending points are from a port then PortTable must be specified to obtain lat/lons. If the start or ending points are the center of a fishing zone or area then spat, lon.dat, lat.dat, cat, lon.spat, and lat.spat must be specified to obtain latitude and longitude.

Usage

create_dist_between(
  dat,
  project,
  start,
  end,
  units = c("miles", "meters", "km", "midpoint"),
  zoneid = NULL,
  name = "distBetween"
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`start`, `end`	Starting and ending location. Should be a port, lat/lon location, or the fishery management zone/area centroid. If a port is desired, `start` should be the column name in `dat` containing the port names. Latitude and longitude for the port are extracted from the port table. If a lat/lon location is desired then `start` should be a character string of column names from `dat`. The order must be lon, lat. If a fishery management centroid is used then set `start="centroid"` or `end="centroid"`. `find_centroid` and `assignment_column` will be called to identify the latitude and longitude if the centroid table does not exist in the FishSET database.
`units`	Unit of measurement for calculated distance between start and ending points. Can be in `"miles"`, `"meters"`, `"kilometers"`, or `"midpoint"` location.
`zoneid`	Variable in `dat` that identifies the individual zones or areas. Define if it exists in `dat` and is not named 'ZoneID'.
`name`	String, output variable name. Defaults to 'distBetween'.

Details

Additional arguments.
Further arguments are required to identify the latitude and longitude of the starting or ending location if start or end is defined as zonal centroid or a column from primary dataset containing port information, such as departing or embarking port. Prompts will appear asking for required arguments.

Port arguments required:

portTable: Port table from FishSET database. Required if start or end is a port vector.

Centroids arguments required:

spat:	Spatial data set containing information on fishery management or regulatory zones. Can be a shape file, json, geojson, data frame, or list. Required if `start` or `end` is centroid.
lon.dat:	Longitude variable from `dat`.
lat.dat:	Latitude variable from `dat`.
lon.spat:	Variable or list from `spat` containing longitude data. Required if `start` or `end` is centroid. Leave as NULL if `spat` is a shape or json file.
lat.spat:	Variable or list from `spat` containing latitude data. Required if `start` or `end` is centroid. Leave as NULL if `spat` is a shape or json file.
cat:	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `cat` should be the name of list containing information on zones.

Value

Returns primary data set with distance between variable.
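
For intuition about the distance calculation, the sketch below computes a great-circle distance between two lon/lat points with the haversine formula and converts it to the documented units. This is a swapped-in, self-contained technique for illustration; the geodesic method used internally by the package is not stated here and may differ. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions.

```r
# Great-circle distance in kilometers (haversine formula, spherical Earth).
# Inputs are in decimal degrees; radius_km is an assumed mean Earth radius.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding slightly above 1 before asin().
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# One degree of latitude along the same meridian.
d_km     <- haversine_km(-170, 55, -170, 56)
d_meters <- d_km * 1000
d_miles  <- d_km / 1.609344

print(c(km = d_km, meters = d_meters, miles = d_miles))
```

The function is vectorized, so whole columns of start and end coordinates can be passed at once.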

Examples

## Not run: 
pollockMainDataTable <- create_dist_between(pollockMainDataTable, 'pollock', 'centroid',
 'EMBARKED_PORT', units = 'miles', name = 'DistCentPort')

pollockMainDataTable <- create_dist_between(pollockMainDataTable, 'pollock', c('LonLat_START_LON',
 'LonLat_START_LAT'), c('LonLat_END_LON','LonLat_END_LAT'), units='midpoint', name = 'DistLocLock')
 
pollockMainDataTable <- create_dist_between(pollockMainDataTable, 'pollock', 'DISEMBARKED_PORT',
  'EMBARKED_PORT', units='meters', name = 'DistPortPort')

## End(Not run)

Create distance between points variable - non-interactive version

Description

Adds distance between two points to the primary data set. There are two versions of this function. The difference between the two versions is how additional arguments specific to start and end locations are added. This version requires all necessary arguments to be specified before running and is best used in a non-interactive session. The create_dist_between version requires only five arguments to be specified before running. Additional arguments specific to identifying the lat/long of start or end points are added through prompts. This function is designed for an interactive session. Both versions of the distance between function require that the start and end points be different vectors. If the start or ending points are from a port, then PortTable must be specified to obtain lat/lons. If the start or ending points are the center of a fishing zone or area then spat, lon.dat, lat.dat, cat, lon.spat, and lat.spat must be specified to obtain latitude and longitude.

Usage

create_dist_between_for_gui(
  dat,
  project,
  start = c("lat", "lon"),
  end = c("lat", "lon"),
  units,
  name = "DistBetwen",
  portTable = NULL,
  zoneid,
  spat = NULL,
  lon.dat = NULL,
  lat.dat = NULL,
  cat = NULL,
  lon.spat = NULL,
  lat.spat = NULL
)

Arguments

dat

Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.

project

Project name

start, end

Starting and ending locations. Each should be a port, lat/lon location, or centroid of regulatory zone/area.

port:	`start` should be the column name in `dat` containing the port names. Latitude and longitude for the port are extracted from the port table.
lat/lon:	`start` should be a character string of column names from `dat`. The order must be `lat` then `lon`, for example `start = c('lat', 'lon')`.

units

Unit of distance. Choices are "miles", "kilometers", or "midpoint".

name

String, name of new variable. Defaults to 'DistBetwen' (spelled as shown in Usage).

portTable

Data table containing port data. Required if start or end are a vector from the dat containing port names.

zoneid

Variable in dat that identifies the individual zones or areas. Required if zone identifier variable exists and is not 'ZoneID'. Defaults to NULL.

spat

Spatial data containing information on fishery management or regulatory zones. Shape, json, geojson, and csv formats are supported. Required if start or end are "centroid" and a centroid table doesn't exist in the FishSET database.

lon.dat

Longitude variable from dat. Required if start or end are ‘centroid’.

lat.dat

Latitude variable from dat. Required if start or end are ‘centroid’.

cat

Variable or list in spat that identifies the individual areas or zones. If spat is class sf, cat should be name of list containing information on zones. Required if start or end are "centroid".

lon.spat

Variable or list from spat containing longitude data. Required for csv files. Leave as NULL if spat is a shape or json file. Required if start or end are "centroid".

lat.spat

Variable or list from spat containing latitude data. Required for csv files. Leave as NULL if spat is a shape or json file. Required if start or end are "centroid".

Value

Primary data set with distance between points variable added.

Create duration of time variable

Description

Create duration of time variable based on start and ending dates in desired temporal units.

Usage

create_duration(
  dat,
  project,
  start,
  end,
  units = c("week", "day", "hour", "minute"),
  name = "create_duration"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`start`	Date variable from `dat` indicating start of time period.
`end`	Date variable from `dat` indicating end of time period.
`units`	String, unit of time for calculating duration. Must be `"week"`, `"day"`, `"hour"`, or `"minute"`.
`name`	String, name of created vector. Defaults to name of the function if not defined.

Details

Calculates the duration of time between two temporal variables based on defined time unit. The new variable is added to the dataset. A duration of time variable is required for other functions, such as cpue.
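
The calculation can be approximated with base R's `difftime()`. This is a minimal sketch on hypothetical data, not the function's internal code; note that `difftime()` spells its units `"mins"`, `"hours"`, `"days"`, and `"weeks"`, which differs from the `units` values this function accepts.

```r
# Hypothetical trip start and end times (UTC to avoid time zone surprises).
trips <- data.frame(
  TRIP_START = as.POSIXct(c("2020-01-01 08:00", "2020-01-03 06:30"), tz = "UTC"),
  TRIP_END   = as.POSIXct(c("2020-01-01 09:30", "2020-01-04 06:30"), tz = "UTC")
)

# Duration in minutes, stored as a plain numeric column.
trips$TripDur <- as.numeric(
  difftime(trips$TRIP_END, trips$TRIP_START, units = "mins")
)

# The same durations expressed in days.
trips$TripDurDays <- as.numeric(
  difftime(trips$TRIP_END, trips$TRIP_START, units = "days")
)

print(trips)
```

Converting the `difftime` result with `as.numeric()` keeps the new column usable as the effort variable in later ratio calculations such as CPUE.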

Value

Returns primary dataset with duration of time variable added.

Examples

## Not run: 
pollockMainDataTable <- create_duration(pollockMainDataTable, 'pollock', 'TRIP_START', 'TRIP_END',
  units = 'minute', name = 'TripDur')

## End(Not run)

Create expected catch/expected revenue matrix

Description

Create expected catch or expected revenue matrix. The matrix is required for the logit_c model. Multiple user-defined matrices can be saved by setting replace.output = FALSE and re-running the function.

Usage

create_expectations(
  dat,
  project,
  catch,
  price = NULL,
  defineGroup = NULL,
  temp.var = NULL,
  temporal = "daily",
  calc.method = "standardAverage",
  lag.method = "simple",
  empty.catch = NULL,
  empty.expectation = 1e-04,
  temp.window = 7,
  temp.lag = 0,
  year.lag = 0,
  dummy.exp = FALSE,
  default.exp = FALSE,
  replace.output = TRUE,
  weight_avg = FALSE,
  outsample = FALSE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`catch`	Variable from `dat` containing catch data.
`price`	Optional, variable from `dat` containing price/value data. Price is multiplied against `catch` to generate revenue. If revenue exists in `dat` and you wish to use this revenue instead of price, then `catch` must be a vector of 1s with length equal to the number of rows in `dat`. Defaults to `NULL`.
`defineGroup`	Optional, variable from `dat` that defines how to split the fleet. Defaults to treating entire dataframe `dat` as a fleet.
`temp.var`	Optional, temporal variable from `dat`. Set to `NULL` if temporal patterns in catch should not be considered.
`temporal`	String, choices are `"daily"` or `"sequential"`. Should time, if `temp.var` is defined, be included as a daily timeline or sequential order of recorded dates. For daily, catch on dates with no record are filled with `NA`. The choice affects how the rolling average is calculated. If temporal is daily then the window size for average and the temporal lag are in days. If sequential, then averaging will occur over the specified number of observations, regardless of how many days they represent.
`calc.method`	String, how catch values are averaged over the window size. Select standard average (`"standardAverage"`), simple lag regression of means (`"simpleLag"`), or weights of regressed groups (`"weights"`).
`lag.method`	String, use regression over entire group (`"simple"`) or for grouped time periods (`"grouped"`).
`empty.catch`	String, replace empty catch with `NA`, `0`, mean of all catch (`"allCatch"`), or mean of grouped catch (`"groupCatch"`).
`empty.expectation`	Numeric, how to treat empty expectation values. Choices are to not replace (`NULL`) or replace with 0.0001 or 0.
`temp.window`	Numeric, temporal window size. If `temp.var` is not `NULL`, set the window size to average catch over. Defaults to 7 (7 days if `temporal` is `"daily"`).
`temp.lag`	Numeric, temporal lag time. If `temp.var` is not `NULL`, how far back to lag `temp.window`.
`year.lag`	If expected catch should be based on catch from previous year(s), set `year.lag` to the number of years to go back.
`dummy.exp`	Logical, should a dummy variable be created? If `TRUE`, output dummy variable for originally missing value. If `FALSE`, no dummy variable is outputted. Defaults to `FALSE`.
`default.exp`	Whether to run default expectations. Defaults to `FALSE`. Alternatively, a character string containing the names of default expectations to run can be entered. Options include "recent", "older", "oldest", and "logbook". The logbook expectation is only run if `defineGroup` is used. "recent" will not include `defineGroup`. Setting `default.exp = TRUE` will include all four options. See Details for how default expectations are defined.
`replace.output`	Logical, replace existing saved expected catch data frame with new expected catch data frame? If `FALSE`, new expected catch data frames are appended to previously saved expected catch data frames. Default is `TRUE`.
`weight_avg`	Logical, if `TRUE` then all observations for a given zone on a given date will be included when calculating the mean, thus giving more weight to days with more observations in a given zone. If `FALSE`, then the daily mean for a zone will be calculated prior to calculating the mean across the time window.
`outsample`	Logical, if `TRUE` then generate expected catch matrix for out-of-sample data. If `FALSE` generate for primary data table. Defaults to `outsample = FALSE`

Details

Function creates an expectation of catch or revenue for alternative fishing zones (zones where they could have fished but did not). The output is saved to the FishSET database and called by the make_model_design function. create_alternative_choice must be called first as observed catch and zone inclusion requirements are defined there.
The primary choices are whether to treat data as a fleet or to group the data (defineGroup) and the time frame of catch data for calculating expected catch. Catch is averaged along a daily or sequential timeline (temporal) using a rolling average. temp.window and temp.lag determine the window size and temporal lag of the window for averaging. Use temp_obs_table before using this function to assess the availability of data for the desired temporal moving window size. Sparse data is not suited for shorter moving window sizes. For very sparse data, consider setting temp.var to NULL and excluding temporal patterns in catch.
Empty catch values are considered to be times of no fishing activity. Values of 0 in the catch variable are considered times when fishing activity occurred but with no catch. These points are included in the averaging and dummy creation as points in time when fishing occurred.
Four default expected catch cases will be run:

recent: Moving window size of two days. In this case, there is no grouping, and catch for entire fleet is used.
older: Moving window size of seven days and lag of two days. In this case, vessels are grouped (or not) based on defineGroup argument.
oldest: Moving window of seven days and lag of eight days. In this case, vessels are grouped (or not) based on defineGroup argument.
logbook: Moving window size of 14 days and lag of one year, seven days. Only used if fleet is defined in defineGroup.
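
The rolling average over a daily timeline can be sketched as follows. This is a simplified illustration for a single zone, not the package's implementation: the function name `rolling_expectation` is hypothetical, and the alignment of the window is an assumption (here the window covers the `window` days ending `lag` days before the current day). Daily means are computed first and then averaged across the window, which corresponds in spirit to `weight_avg = FALSE`.

```r
# Rolling expected catch on a complete daily timeline (single zone).
# ASSUMPTION: for day i the window spans days (i - lag - window + 1) to (i - lag).
rolling_expectation <- function(dates, catch, window = 7, lag = 0) {
  # Complete daily timeline; days with no record become NA.
  timeline <- seq(min(dates), max(dates), by = "day")
  day_factor <- factor(as.character(dates), levels = as.character(timeline))
  daily <- as.numeric(tapply(catch, day_factor, mean))

  expected <- vapply(seq_along(timeline), function(i) {
    idx <- seq(i - lag - window + 1, i - lag)
    idx <- idx[idx >= 1]
    vals <- daily[idx]
    # No usable observations in the window: leave the expectation empty.
    if (length(vals) == 0 || all(is.na(vals))) NA_real_ else mean(vals, na.rm = TRUE)
  }, numeric(1))

  data.frame(date = timeline, expected = expected)
}

# Hypothetical catch records with a gap on 2020-01-03.
dates <- as.Date("2020-01-01") + c(0, 1, 3)
catch <- c(10, 20, 40)

out <- rolling_expectation(dates, catch, window = 2, lag = 1)
print(out)
```

The first day has no earlier data and stays `NA`, which is the kind of empty expectation that `empty.expectation` is meant to handle.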

Value

Function saves a list of expected catch matrices to the FishSET database as projectExpectedCatch. The list includes the expected catch matrix from the user-defined choices, recent fine grained information, older fine grained information, oldest fine grained information, and logbook level information. Additional expected catch cases can be added to the list by specifying replace.output = FALSE. The list is automatically saved to the FishSET database and is called in make_model_design. The expected catch output does not need to be loaded when defining or running the model.

newGridVar, newDumV

Examples

## Not run: 
create_expectations(pollockMainDataTable, "pollock", "OFFICIAL_TOTAL_CATCH_MT",
  price = NULL, defineGroup = "fleet", temp.var = "DATE_FISHING_BEGAN",
  temporal = "daily", calc.method = "standardAverage", lag.method = "simple",
  empty.catch = "allCatch", empty.expectation = 0.0001, temp.window = 4,
  temp.lag = 2, year.lag = 0, dummy.exp = FALSE, replace.output = FALSE,
  weight_avg = FALSE, outsample = FALSE
)

## End(Not run)


Creates haul midpoint latitude and longitude variables

Description

Calculates latitude and longitude of the haul midpoint and adds two variables to the primary data set: the midpoint latitude and the midpoint longitude.

Usage

create_mid_haul(
  dat,
  project,
  start = c("lon", "lat"),
  end = c("lon", "lat"),
  name = "mid_haul"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`start`	Character string, variables in `dat` defining the longitude and latitude of the starting location of haul. Must be in decimal degrees.
`end`	Character string, variables in `dat` defining the longitude and latitude of the ending location of haul. Must be in decimal degrees.
`name`	String, name of new variable. Defaults to 'mid_haul'.

Details

Each row of data must be a unique haul. Requires a start and end point for each observation.

Value

Returns primary dataset with two new variables added: latitude and longitude of haul midpoint.
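
A haul midpoint can be illustrated with the standard great-circle midpoint formula on a sphere. This is a sketch under stated assumptions: the function name `great_circle_midpoint` is hypothetical, the Earth is treated as a sphere, and the method used internally by the package is not stated here and may differ. The returned longitude is not wrapped to the -180 to 180 range.

```r
# Great-circle midpoint between two points given in decimal degrees.
great_circle_midpoint <- function(lon1, lat1, lon2, lat2) {
  to_rad <- pi / 180
  phi1 <- lat1 * to_rad
  phi2 <- lat2 * to_rad
  dlam <- (lon2 - lon1) * to_rad

  # Components of the second point relative to the first meridian.
  bx <- cos(phi2) * cos(dlam)
  by <- cos(phi2) * sin(dlam)

  phi_m <- atan2(sin(phi1) + sin(phi2), sqrt((cos(phi1) + bx)^2 + by^2))
  lam_m <- lon1 * to_rad + atan2(by, cos(phi1) + bx)

  c(lon = lam_m / to_rad, lat = phi_m / to_rad)
}

# Hypothetical haul along the equator and one along a meridian.
mid_equator  <- great_circle_midpoint(-170, 0, -160, 0)
mid_meridian <- great_circle_midpoint(10, 20, 10, 40)

print(mid_equator)
print(mid_meridian)
```

For short hauls the result is very close to the simple average of the start and end coordinates; the two diverge as hauls get longer or move to higher latitudes.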

Examples

## Not run: 
pollockMainDataTable <- create_mid_haul(pollockMainDataTable, 'pollock', 
    start = c('LonLat_START_LON', 'LonLat_START_LAT'), 
   end = c('LonLat_END_LON', 'LonLat_END_LAT'), name = 'mid_haul')

## End(Not run)

Create fishery season identifier variable

Description

Create fishery season identifier variable

Usage

create_seasonal_ID(
  dat,
  project,
  seasonal.dat,
  use.location = c(TRUE, FALSE),
  use.geartype = c(TRUE, FALSE),
  sp.col,
  target = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`seasonal.dat`	Table containing date of fishery season(s). Can be pulled from the FishSET database.
`use.location`	Logical, should fishery season dates depend on fishery location? Column names containing location in `dat` and `seasonal.dat` must match.
`use.geartype`	Logical, should fishery season dates depend on gear type? Column names containing gear type in `dat` and `seasonal.dat` must match.
`sp.col`	Variable in `seasonal.dat` containing species names.
`target`	Name of target species. If `target` is NULL, runs through fisheries in order listed in `seasonal.dat`

Details

Uses a table of fishery season dates to create fishery season identifier variables. Output is a SeasonID variable and/or multiple SeasonID*fishery variables. If fishery season dates vary by location or gear type, then use.location and use.geartype should be TRUE.

The function matches fishery season dates provided in seasonal.dat to the earliest date variable in dat. The 'SeasonID' variable is a vector of fishery seasons whereas the 'SeasonID*fishery' variables are 1/0 depending on whether the fishery was open on the observed date.

If target is not defined, then each row of 'SeasonID' is defined as the earliest fishery listed in seasonal.dat for which the fishery season dates encompass the date variable in the primary dataset. If a target fishery is defined, then 'SeasonID' is defined by whether the target fishery or a different fishery is open on the date in the primary dataset. The vector is filled with 'target' or 'other'.

'SeasonID*fishery' variables are a 1/0 vector for each fishery (labeled by SeasonID and fishery) where 1 indicates that the date for a given row in the primary data table falls within the season dates for that fishery.

Value

Returns the primary dataset with the variable SeasonID, or a series of variables identifying the individual fisheries included (SeasonID*fishery).

Examples

## Not run: 
pcodMainDataTable <- create_seasonal_ID("pcodMainDataTable", "pcod", seasonal_dat,
  use.location = TRUE, use.geartype = TRUE, sp.col = "SPECIES", target = "POLLOCK"
)

## End(Not run)
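
The matching logic in Details can be sketched in base R as follows. This is a minimal illustration, not the package's code: the season table `seasons`, its `start` and `end` columns, and the observation dates are hypothetical, and location and gear type are ignored.

```r
# Hypothetical season table: one row per fishery with opening and closing dates.
seasons <- data.frame(
  SPECIES = c("POLLOCK", "PCOD"),
  start   = as.Date(c("2011-01-20", "2011-03-01")),
  end     = as.Date(c("2011-04-30", "2011-06-10")),
  stringsAsFactors = FALSE
)

# Hypothetical observation dates from the primary data.
obs_date <- as.Date(c("2011-02-15", "2011-03-15", "2011-05-20", "2011-08-01"))

# Rows = observations, columns = fisheries; TRUE when the season is open.
open <- sapply(seq_len(nrow(seasons)), function(i) {
  obs_date >= seasons$start[i] & obs_date <= seasons$end[i]
})
colnames(open) <- seasons$SPECIES

# target = NULL: first fishery listed whose season contains the date.
first_open <- apply(open, 1, function(r) {
  if (any(r)) seasons$SPECIES[which(r)[1]] else NA_character_
})

# target defined: 'target' if the target fishery is open, otherwise 'other'.
target <- "POLLOCK"
season_id_target <- ifelse(open[, target], "target", "other")

# 1/0 indicator column for each fishery.
indicators <- open * 1L
print(indicators)
```

The second observation falls inside both seasons, so the order of rows in the season table decides which fishery is assigned when no target is given.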

Create starting location variable

Description

Creates a variable containing the zone/area location of a vessel when choice of where to fish next was made. This variable is required for data with multiple sets or hauls in a single trip and for the full information model with Dahl's correction (logit_correction).

Usage

create_startingloc(
  dat,
  project = NULL,
  spat,
  port,
  port_name,
  port_lon,
  port_lat,
  trip_id,
  haul_order,
  starting_port,
  zoneID,
  spatID,
  name = "startingloc"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project
`spat`	Spatial data. Required if `zoneID` does not exist in `dat`. Shape, json, geojson, and csv formats are supported.
`port`	Port data. Contains columns: Port_Name, Port_Long, Port_Lat. Table is generated using the `load_port` function and saved in the FishSET database as the project and port table, for example 'pollockPortTable'.
`port_name`	Character string indicating the column in the port table that contains the port name.
`port_lon`	Character string indicating the column in the port table that contains the port longitude.
`port_lat`	Character string indicating the column in the port table that contains the port latitude.
`trip_id`	Variable in `dat` that identifies unique trips.
`haul_order`	Variable in `dat` containing information on the order that hauls occur within a trip. Can be time, coded variable, etc.
`starting_port`	Variable in `dat` to identify port at start of trip.
`zoneID`	Variable in `dat` that identifies the individual zones or areas.
`spatID`	Variable in `spat` that identifies the individual zones or areas.
`name`	String, name of created variable. Defaults to 'startingloc'.

Details

Function creates the startingloc vector that is required for the full information model with Dahl's correction (logit_correction). The vector is the zone location of a vessel when the decision of where to fish next was made. Generally, the first zone of a trip is the departure port. The assignment_column function is called to assign starting port locations and haul locations to zones. If zoneID exists in dat, assignment_column is not called and the spatial arguments spat and spatID are not required.

Value

Primary data set with starting location variable added.

Examples

## Not run: 
pcodMainDataTable <- create_startingloc(pcodMainDataTable, 'pcod',
    map2, "pcodPortTable", "TRIP_SEQ", "HAUL_SEQ", "DISEMBARKED_PORT", 
 "START_LON", "START_LAT", "NMFS_AREA", "STARTING_LOC"
)

## End(Not run)
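
The idea behind the starting location vector can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the columns `zone` and `port_zone` are hypothetical and stand in for zone assignments that have already been made for hauls and departure ports.

```r
# Hypothetical haul records: trip identifier, haul order, and fishing zone.
hauls <- data.frame(
  TRIP_SEQ  = c(1, 1, 1, 2, 2),
  HAUL_SEQ  = c(1, 2, 3, 1, 2),
  zone      = c("A", "B", "B", "C", "A"),
  port_zone = c("P1", "P1", "P1", "P2", "P2"),  # zone of the departure port
  stringsAsFactors = FALSE
)

# Order hauls within each trip.
hauls <- hauls[order(hauls$TRIP_SEQ, hauls$HAUL_SEQ), ]

# The first haul of a trip starts from the port zone; later hauls start
# from the zone of the previous haul in the same trip.
first_haul <- !duplicated(hauls$TRIP_SEQ)
prev_zone  <- c(NA, head(hauls$zone, -1))
hauls$startingloc <- ifelse(first_haul, hauls$port_zone, prev_zone)

print(hauls)
```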

Create trip centroid variable

Description

Create latitude and longitude variables containing the centroid of each trip

Usage

create_trip_centroid(dat, project, lon, lat, tripID, weight.var = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`lon`	Variable in `dat` containing longitudinal data.
`lat`	Variable in `dat` containing latitudinal data.
`tripID`	Variable in `dat` containing trip identifier. If trip identifier should be defined by more than one variable then list as `c('var1', 'var2')`.
`weight.var`	Variable in `dat` for computing the weighted average.

Details

Computes the average longitude and latitude for each trip. Specify weight.var to calculate the weighted centroid. Unique trips are defined by tripID, which can combine more than one variable.

Value

Returns the primary dataset with centroid latitude and centroid longitude variables added.

Examples

## Not run: 
pollockMainDataTable <- create_trip_centroid(pollockMainDataTable, 'pollock',
  lon = 'LonLat_START_LON', lat = 'LonLat_START_LAT',
  tripID = c('DISEMBARKED_PORT', 'EMBARKED_PORT'), weight.var = NULL)

## End(Not run)
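
A minimal base R sketch of the unweighted and weighted centroid follows. The data frame, the `trip` identifier, and the `catch` weighting variable are hypothetical, and the output column names are chosen for illustration only.

```r
# Hypothetical haul data with a trip identifier and a weighting variable.
hauls <- data.frame(
  trip  = c("t1", "t1", "t2", "t2"),
  lon   = c(-170, -168, -165, -163),
  lat   = c(56, 58, 54, 55),
  catch = c(1, 3, 2, 2),
  stringsAsFactors = FALSE
)

# One row per trip: plain mean and catch-weighted mean of the coordinates.
centroids <- do.call(rbind, lapply(split(hauls, hauls$trip), function(d) {
  data.frame(
    trip      = d$trip[1],
    cent_lon  = mean(d$lon),
    cent_lat  = mean(d$lat),
    wcent_lon = weighted.mean(d$lon, d$catch),
    wcent_lat = weighted.mean(d$lat, d$catch)
  )
}))

# Attach the trip centroid to every haul of that trip.
hauls <- merge(hauls, centroids, by = "trip")
print(hauls)
```

For trip `t1` the weighted centroid is pulled toward the second haul because it carries three times the weight of the first.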

Create haul level trip distance variable

Description

Create haul level trip distance variable

Usage

create_trip_distance(
  dat,
  project,
  port,
  trip_id,
  starting_port,
  starting_haul = c("Lon", "Lat"),
  ending_haul = c("Lon", "Lat"),
  ending_port,
  haul_order,
  name = "TripDistance",
  a = 6378137,
  f = 1/298.257223563
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`port`	Port data frame. Contains columns: Port_Name, Port_Long, Port_Lat. Table is generated using the `load_port` function and saved in the FishSET database as the project and port, for example 'pollockPortTable'.
`trip_id`	Unique trip identifier in `dat`.
`starting_port`	Variable in `dat` containing ports at the start of the trip.
`starting_haul`	Character string, variables in `dat` containing the longitude and latitude at start of haul.
`ending_haul`	Character string, variables in `dat` containing the longitude and latitude at end of haul.
`ending_port`	Variable in `dat` containing ports at the end of the trip.
`haul_order`	Variable in `dat` that identifies haul order within a trip. Can be time, coded variable, etc.
`name`	String, name of created variable. Defaults to 'TripDistance'.
`a`	Numeric, major (equatorial) radius of the ellipsoid. The default value is for WGS84 ellipsoid.
`f`	Numeric, ellipsoid flattening. The default value is for WGS84 ellipsoid.

Details

Summation of distance across a trip based on starting and ending ports and hauls in between. The function uses distGeo from the geosphere package to calculate distances between hauls. Inputs are the trips, ports, and hauls from the primary dataset, and the latitude and longitude of ports from the port table. The ellipsoid arguments, a and f, are numeric and can be changed if an ellipsoid other than WGS84 is appropriate. See the geosphere R package for more details (https://cran.r-project.org/web/packages/geosphere/geosphere.pdf).

Value

Returns the primary dataset with a trip distance variable added.

Examples

## Not run: 
pcodMainDataTable <- create_trip_distance(pcodMainDataTable, "pcod", "pcodPortTable", 
  "TRIP_SEQ", "DISEMBARKED_PORT", c("LonLat_START_LON", "LonLat_START_LAT"),
  c("LonLat_END_LON", "LonLat_END_LAT"), "EMBARKED_PORT", "HAUL_SEQ", "TripDistance"
)

## End(Not run)
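
The summation over a trip can be sketched in base R. Note the substitution: the function described above uses geodesic distances from geosphere's distGeo, whereas this sketch uses a haversine (spherical) approximation so that it runs without extra packages. The helper `haversine` and the coordinates are hypothetical.

```r
# Great-circle distance in meters between points given in decimal degrees.
# Assumption: a sphere with radius equal to the WGS84 equatorial radius.
haversine <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  h <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(h)))
}

# Hypothetical single trip, in order of travel:
# port, haul 1 start, haul 1 end, haul 2 start, haul 2 end, port.
path <- data.frame(
  lon = c(-166.5, -167.0, -167.0, -168.0, -168.0, -166.5),
  lat = c(53.9, 54.5, 54.6, 55.0, 55.1, 53.9)
)

# Distance of each leg between consecutive points, then the trip total.
legs <- haversine(head(path$lon, -1), head(path$lat, -1),
                  tail(path$lon, -1), tail(path$lat, -1))
trip_distance <- sum(legs)
print(trip_distance)
```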

Create numeric variable using arithmetic expression

Description

Creates a new variable based on the arithmetic operation between two variables. Function is useful for creating rate variables or the summation of two related variables.

Usage

create_var_num(dat, project, x, y, method, name = "create_var_num")

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Variable in `dat`. Variable will be the numerator if `method` is division.
`y`	Variable in `dat` or numeric value. Variable will be the denominator if `method` is division.
`method`	String, arithmetic expression. Options include: `"sum"`, addition (`"add"`), subtraction (`"sub"`), multiplication (`"mult"`), and division (`"div"`).
`name`	String, name of created vector. Defaults to name of the function if not defined.

Details

Creates a new numeric variable based on the defined arithmetic expression method. New variable is added to the primary dataset.

Value

Returns primary dataset with new variable added.

Examples

## Not run: 
pollockMainDataTable <- create_var_num(pollockMainDataTable, 'pollock', x = 'HAUL_CHINOOK',
    y = 'HAUL_CHUM', method = 'sum', name = 'tot_salmon')

## End(Not run)
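
The arithmetic options can be sketched in base R as below. The helper `combine_vars` is hypothetical, and treating `"sum"` and `"add"` as the same operation is an assumption based on the option list in Arguments.

```r
# Hypothetical data: salmon counts and haul duration in hours.
dat <- data.frame(
  HAUL_CHINOOK = c(2, 0, 5),
  HAUL_CHUM    = c(1, 4, 0),
  HOURS        = c(2, 4, 5)
)

# Hypothetical helper mirroring the method options.
# Assumption: "sum" and "add" both mean element-wise addition.
combine_vars <- function(x, y, method) {
  switch(method,
    sum  = ,
    add  = x + y,
    sub  = x - y,
    mult = x * y,
    div  = x / y,
    stop("unknown method: ", method)
  )
}

# Summation of two related variables, and a rate variable (x is the numerator).
dat$tot_salmon   <- combine_vars(dat$HAUL_CHINOOK, dat$HAUL_CHUM, "sum")
dat$chinook_rate <- combine_vars(dat$HAUL_CHINOOK, dat$HOURS, "div")
print(dat)
```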

K-fold cross validation

Description

K-fold cross validation for estimating model performance

Usage

cross_validation(
  project,
  mod.name,
  zone.dat,
  groups,
  k = NULL,
  time_var = NULL,
  use.scalers = FALSE,
  scaler.func = NULL
)

Arguments

`project`	Name of project
`mod.name`	Name of saved model to use. Argument can be the name of the model or can pull the name of the saved "best" model. Leave `mod.name` empty to use the saved "best" model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`zone.dat`	Variable in primary data table that identifies the individual zones or areas.
`groups`	String, how to split the dataset into groups for training and testing, for example `'Observations'` or `'Years'`.
`k`	Integer, value required if `groups = 'Observations'` to determine the number of groups for splitting data into training and testing datasets. The value of `k` should be chosen to balance bias and variance and values of `k = 5 or 10` have been found to be efficient standard values in the literature. Note that higher k values will increase runtime and the computational cost of `cross_validation`. Leave-one-out cross validation is a type of k-fold cross validation in which `k = n`, the number of observations, which can be useful for small datasets.
`time_var`	Name of column for time variable. Required if `groups = 'Years'`.
`use.scalers`	Input for `create_model_input()`. Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Input for `create_model_input()`. Function to calculate rescaling factors.

Details

K-fold cross validation is a resampling procedure for evaluating the predictive performance of a model. First the data are split into k groups, which can be split randomly across observations (e.g., 5-fold cross validation where each group is randomly assigned across observations) or split based on a particular variable (e.g., split groups based on gear type). Each group takes a turn being the 'hold-out' or 'test' dataset, while the remaining groups are the training dataset (parameters are estimated for the training dataset). Finally, the predictive performance of each iteration is calculated as the percent absolute prediction error.

Examples

## Not run: 

model_design_outsample("scallop", "scallopModName")


## End(Not run)
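
The resampling procedure in Details can be sketched in base R. This is a generic illustration, not the package's routine: a linear model stands in for the discrete choice model, the data are simulated, and the error formula is an assumed form of percent absolute prediction error.

```r
set.seed(1)

# Hypothetical data: predict y from x.
n <- 50
dat <- data.frame(x = runif(n))
dat$y <- 2 + 3 * dat$x + rnorm(n, sd = 0.2)

# Randomly assign each observation to one of k folds of equal size.
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# Each fold is held out once; the model is fit to the remaining folds.
pct_abs_error <- sapply(seq_len(k), function(i) {
  train <- dat[fold != i, ]
  test  <- dat[fold == i, ]
  fit   <- lm(y ~ x, data = train)
  pred  <- predict(fit, newdata = test)
  # Assumed metric: mean absolute error relative to the observed value.
  100 * mean(abs(pred - test$y) / abs(test$y))
})

print(round(pct_abs_error, 2))
```

Setting `k` equal to `n` in this sketch gives leave-one-out cross validation, at the cost of fitting the model `n` times.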

Convert dataframe to sf

Description

Used to convert spatial data with no spatial class to an sf object. This is useful if the spatial data was read from a non-spatial file type, e.g. a CSV file.

Usage

dat_to_sf(dat, lon, lat, id, cast = "POLYGON", multi = FALSE, crs = 4326)

Arguments

`dat`	Spatial data containing information on fishery management or regulatory zones.
`lon`	Longitude variable in `dat`.
`lat`	Latitude variable in `dat`.
`id`	Spatial feature ID column.
`cast`	Spatial feature type to create. Commonly used options are `"POINT"`, `"LINESTRING"`, and `"POLYGON"`. See `st_cast` for details.
`multi`	Logical, use if needing to convert to a multi-featured (grouped) `sf` object, e.g. `MULTIPOLYGON` or `MULTILINESTRING`.
`crs`	Coordinate reference system to assign to `dat`. Defaults to WGS 84 (EPSG: 4326).

Check for common data quality issues

Description

Check primary data for common data quality issues, such as NaNs, NAs, outliers, unique rows, and empty variables.

Usage

data_check(dat, project, x)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`x`	Variable in `dat` to check for outliers. Must be in quotes if called from the FishSET database.

Details

Prints summary stats for all variables in dat. Prints column names that contain NaNs or NAs. Checks for outliers for specified variable x. Checks that all column names are unique, whether any columns in dat are empty, whether each row is a unique choice occurrence at the haul or trip level, and whether data for either lat/lon or fishing area are included. The function is also called by other functions.

Examples

## Not run: 
data_check(pcodMainDataTable, "pcod", "OFFICIAL_TOTAL_CATCH_MT")

## End(Not run)
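
Several of the checks listed in Details can be reproduced by hand in base R, which may help when interpreting the function's messages. The data frame below is hypothetical, and the three-standard-deviation outlier rule is an assumption for illustration, not necessarily the rule the package applies.

```r
# Hypothetical data with missing values, a duplicated row, and an empty column.
dat <- data.frame(
  haul_id = c(1, 2, 2, 4),
  catch   = c(10.5, 20, 20, NA),
  cpue    = c(0.5, 1.0, 1.0, NaN),
  notes   = c(NA, NA, NA, NA)
)

# Columns containing NA (is.na() is also TRUE for NaN) and columns with NaN.
na_cols  <- names(dat)[sapply(dat, function(v) any(is.na(v)))]
nan_cols <- names(dat)[sapply(dat, function(v) is.numeric(v) && any(is.nan(v)))]

# Columns that are entirely empty.
empty_cols <- names(dat)[sapply(dat, function(v) all(is.na(v)))]

# Duplicated rows and duplicated column names.
n_dup_rows <- sum(duplicated(dat))
dup_names  <- names(dat)[duplicated(names(dat))]

# Assumed outlier rule: more than 3 standard deviations from the mean.
x <- dat$catch
outliers <- which(abs(x - mean(x, na.rm = TRUE)) > 3 * sd(x, na.rm = TRUE))

print(list(na = na_cols, nan = nan_cols, empty = empty_cols,
           duplicated_rows = n_dup_rows, outliers = outliers))
```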

Upload data from file, FishSET DB, or working environment

Description

Helper function that can read data from file, from FishSET DB, or a dataframe in the working environment. Used for data upload functions: load_maindata, load_port, load_aux, load_grid, load_spatial.

Usage

data_upload_helper(dat, type, ...)

Arguments

`dat`	Reference to a dataframe. This can be a filepath, the name of an existing FishSET table, or a dataframe object in the working environment.
`type`	The type of data to upload. Options include `"main"`, `"port"`, `"grid"`, `"aux"`, and `"spat"`.
`...`	Additional arguments passed to `read_dat`.

Examples

## Not run: 
dataset <- data_upload_helper(dat, type = "main")

## End(Not run)

Check and convert lat/lon to decimal degrees

Description

Check that latitude and longitude are in decimal degrees and the variable sign is correct. Correct lat/lon if required.

Usage

degree(
  dat,
  project,
  lat = NULL,
  lon = NULL,
  latsign = FALSE,
  lonsign = FALSE,
  replace = TRUE
)

Arguments

`dat`	Dataset containing latitude and longitude data.
`project`	Project name.
`lat`	Variable(s) containing latitude data. If `NULL` the function will attempt to search for all latitude variables by name (e.g. by matching "lat" or "LAT").
`lon`	Variable(s) containing longitude data. If `NULL` the function will attempt to search for all longitude variables by name (e.g. by matching "lon" or "LON").
`latsign`	How should the sign value of `lat` be changed? Choices are `NULL` for no change, `"neg"` to convert all positive values to negative, `"pos"` to convert all negative values to positive, and `"all"` to change all values.
`lonsign`	How should the sign value of `lon` be changed? Choices are `NULL` for no change, `"neg"` to convert all positive values to negative, `"pos"` to convert all negative values to positive, and `"all"` to change all values.
`replace`	Logical, should `lat` and `lon` in `dat` be converted to decimal degrees? Defaults to `TRUE`. Set to `FALSE` if checking for compliance.

Details

First checks whether any variables containing 'lat' or 'lon' in their names are numeric. Returns a message on results. To convert a variable to decimal degrees, identify the lat or lon variable(s) and set replace = TRUE. To change the sign, set latsign (for lat) or lonsign (for lon) to one of the options listed under Arguments. FishSET requires that latitude and longitude be in decimal degrees.

Value

Returns the primary dataset with the latitudes and longitudes converted to decimal degrees if replace = TRUE or if changing the sign. Otherwise, returns a message indicating whether the selected longitude and latitude variables are in the correct format.

Examples

## Not run: 
# check format
degree(pollockMainDataTable, 'pollock', lat = 'LatLon_START_LAT',
       lon = 'LatLon_START_LON')

# change signs and convert to decimal degrees
pollockMainDataTable <- degree(pollockMainDataTable, 'pollock', 
                               lat = 'LatLon_START_LAT', 
                               lon = 'LatLon_START_LON', latsign = FALSE, 
                               lonsign = FALSE, replace = TRUE)

## End(Not run)
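
The two operations described above, conversion to decimal degrees and sign changes, can be sketched in base R. Both helpers are hypothetical, the degrees-minutes-seconds input format is an assumption (the input formats the package accepts are not stated here), and reading `"all"` as flipping the sign of every value is likewise an assumption.

```r
# Hypothetical helper: degrees, minutes, seconds -> decimal degrees.
dms_to_decimal <- function(deg, minutes = 0, seconds = 0) {
  sign_deg <- ifelse(deg < 0, -1, 1)
  sign_deg * (abs(deg) + minutes / 60 + seconds / 3600)
}

# Hypothetical helper mirroring the sign options listed under Arguments.
change_sign <- function(x, sign = c("none", "neg", "pos", "all")) {
  sign <- match.arg(sign)
  switch(sign,
    none = x,        # no change
    neg  = -abs(x),  # convert all positive values to negative
    pos  = abs(x),   # convert all negative values to positive
    all  = -x        # assumed: flip the sign of every value
  )
}

# 170 deg 30 min 0 sec and 165 deg 15 min 36 sec.
lon <- dms_to_decimal(c(170, 165), c(30, 15), c(0, 36))

# Longitudes west of the prime meridian are negative in decimal degrees.
lon_west <- change_sign(lon, "neg")
print(lon_west)
```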


Delete table meta data or project meta file

Description

Delete table meta data or project meta file

Usage

delete_meta(project, tab.name = NULL, delete_file = FALSE)

Arguments

`project`	Project name.
`tab.name`	String, table name.
`delete_file`	Logical, whether to delete project meta file.

Delete models from FishSET Database

Description

Delete models from the model design file (MDF) and the model output table (MOT).

Usage

delete_models(project, model.names, delete.nested = FALSE)

Arguments

`project`	String, name of project.
`model.names`	String, name of models to delete. Use `model_names()` to see model names from the model design file.
`delete.nested`	Logical, whether to delete a model containing nested models. Defaults to `FALSE`.

Details

Nested models are conditional logit models that include more than one expected catch/revenue model. For example, if a conditional logit model named 'logit_c_mod1' was saved to the MDF with the argument expectcatchmodels = list('exp1', 'recent', 'older'), then 'logit_c_mod1' will include three separate models, each using a different expected catch matrix. To delete all three models, enter model.names = 'logit_c_mod1' and set delete.nested = TRUE. To delete one or more specific nested models, use model.names = 'logit_c_mod1.exp1', i.e. the original model name, a period, and the name of the expected catch matrix used in the model.

Create KDE, CDF, or empirical CDF plots

Description

Creates a kernel density estimate, empirical cumulative distribution function, or cumulative distribution function plot of selected variable. Grouping, filtering, and several plot options are available.

Usage

density_plot(
  dat,
  project,
  var,
  type = "kde",
  group = NULL,
  combine = TRUE,
  date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  scale = "fixed",
  bw = 1,
  position = "identity",
  pages = "single"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`var`	String, name of variable to plot.
`type`	String, type of density plot. Options include `"kde"` (kernel density estimate), `"ecdf"` (empirical cdf), `"cdf"` (cumulative distribution function), or "all" (all plot types). Two or more plot types can be chosen.
`group`	Optional, string names of variables to group by. If two or more grouping variables are included, the default for `"cdf"` and `"ecdf"` plots is to not combine groups. This can be changed using `combine = TRUE`. `"kde"` plots always combine two or more groups. `"cdf"` and `"ecdf"` plots can use up to two grouping variables if `combine = FALSE`: the first variable is represented by color and second by line type.
`combine`	Logical, whether to combine the variables listed in `group` for plot.
`date`	Date variable from `dat` used to subset and/or facet the plot by.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start- and end-date into `date_value` as a string: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by.
`facet_by`	Variable name to facet by. This can be a variable that exists in `dat` or a variable created by `density_plot()` such as `"year"`, `"month"`, or `"week"`. `date` is required if faceting by period.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function entered as a string. Defaults to `"none"` for no conversion.
`tran`	String; name of function to transform variable, for example `"log"` or `"sqrt"`.
`format_lab`	Formatting option for x-axis labels. Options include `"decimal"` or `"scientific"`.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`. Other options include `"free_y"`, `"free_x"`, and `"free"`.
`bw`	Adjusts KDE bandwidth. Defaults to 1.
`position`	The position of the grouped variable for KDE plot. Options include `"identity"`, `"stack"`, and `"fill"`.
`pages`	Whether to output plots on a single page (`"single"`, the default) or multiple pages (`"multi"`).

Details

The data can be filtered by date or by variable (see filter_date and filter_by). If type contains "kde" or "all" then grouping variables are automatically combined. Any variable in dat can be used for faceting, but "year", "month", or "week" are also available if date is provided.

Value

density_plot() can return up to three plots in a single call. When pages = "single" all plots are combined and stacked vertically. pages = "multi" will return separate plots.
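
For readers unfamiliar with the plot types, the sketch below computes a kernel density estimate and an empirical CDF in base R and stacks the two plots vertically. It is not how density_plot() draws its output: the simulated `catch` values are hypothetical, and mapping the `bw` argument onto the `adjust` multiplier of stats::density is an assumption.

```r
set.seed(42)

# Hypothetical catch values.
catch <- rlnorm(200, meanlog = 2, sdlog = 0.5)

# Kernel density estimate; `adjust` scales the default bandwidth and is
# assumed here to play the role of the `bw` argument.
kde <- density(catch, adjust = 1)

# Empirical CDF: proportion of observations less than or equal to a value.
emp_cdf <- ecdf(catch)

# Stack the two plots vertically, similar in spirit to pages = "single".
op <- par(mfrow = c(2, 1))
plot(kde, main = "KDE")
plot(emp_cdf, main = "Empirical CDF")
par(op)
```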

Examples

## Not run: 

density_plot(pollockMainDataTable, "pollock", var = "OFFICIAL_TOTAL_CATCH_MT",
             type = c("kde", "ecdf"))

# facet 
density_plot(pollockMainDataTable, "pollock", var = "OFFICIAL_TOTAL_CATCH_MT",
             type = c("kde", "ecdf"), facet_by = "GEAR_TYPE")

# filter by period
density_plot(pollockMainDataTable, "pollock", var = "OFFICIAL_TOTAL_CATCH_MT", 
             type = "kde", date = "FISHING_START_DATE", filter_date = "year-month", 
             date_value = list(2011, 9:11))

## End(Not run)
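
The year-period and date-range filters described under `date_value` can be sketched in base R to show which rows they keep. The data frame is hypothetical, and treating the date range as inclusive at both ends is an assumption.

```r
# Hypothetical data with a date column.
dat <- data.frame(
  FISHING_START_DATE = as.Date(c("2011-02-10", "2011-05-01", "2012-07-31",
                                 "2013-08-01", "2014-06-15")),
  catch = c(5, 7, 3, 9, 4)
)

yr <- as.integer(format(dat$FISHING_START_DATE, "%Y"))
mo <- as.integer(format(dat$FISHING_START_DATE, "%m"))

# Equivalent of filter_date = "year-month", date_value = list(2011:2013, 5:7):
# keep May through July of 2011-2013.
date_value <- list(2011:2013, 5:7)
kept_period <- dat[yr %in% date_value[[1]] & mo %in% date_value[[2]], ]

# Equivalent of filter_date = "date_range" with a start and end date
# (assumed inclusive at both ends).
rng <- as.Date(c("2011-01-01", "2011-03-15"))
kept_range <- dat[dat$FISHING_START_DATE >= rng[1] &
                  dat$FISHING_START_DATE <= rng[2], ]

print(kept_period)
print(kept_range)
```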

Run discrete choice model

Description

Subroutine to run chosen discrete choice model. Function pulls necessary data generated in make_model_design and loops through model design choices and expected catch cases. Output is saved to the FishSET database.

Usage

discretefish_subroutine(
  project,
  run = "new",
  select.model = FALSE,
  explorestarts = TRUE,
  max.iterations = 500,
  breakearly = TRUE,
  space = NULL,
  dev = NULL,
  use.scalers = FALSE,
  scaler.func = NULL,
  CV = FALSE
)

Arguments

`project`	String, name of project.
`run`	String, how models should be run. `'new'` will only run models that exist in the model design file but not in the model output table. `'all'` will run all models in the model design file, replacing existing model output. The third option is to enter a vector of model names to run (use `model_names()` to see current model names). If the specified model already has output it will be replaced.
`select.model`	Return an interactive data table that allows users to select and save table of best models based on measures of fit.
`explorestarts`	Logical, should starting parameters value space be explored? Set to `TRUE` if unsure of the number of starting parameter values to include or of reasonable starting parameters values. Better starting parameter values can help with model convergence.
`max.iterations`	If `explorestarts = TRUE`, max.iterations indicates the maximum number of iterations to run in search of valid starting parameter values. If the maximum is reached before valid parameter values are found (i.e., likelihood = Inf) the loop will terminate and an error message will be reported for that model.
`breakearly`	Logical, if `explorestarts = TRUE`, should the first set of starting parameter values that returns a valid (numeric) loglikelihood value be returned (`TRUE`) or should the entire parameter space be considered and the set of starting parameter values that return the lowest loglikelihood value be returned (`FALSE`).
`space`	Specify if `explorestarts = TRUE`. List of length 1 or length equal to the number of models to be evaluated. `space` is the number of starting value permutations to test (the size of the space to explore). The greater the `dev` argument, the larger the `space` argument should be.
`dev`	Specify if `explorestarts = TRUE`. List of length 1 or length equal to the number of models to be evaluated. `dev` refers to how far to deviate from the average parameter values when exploring (random normal deviates). The less certain the average parameters are, the greater the `dev` argument should be.
`use.scalers`	Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Function to calculate rescaling factors. Can be a generic function, such as mean, or a user-defined function. User-defined functions must be specified as `scaler.func = function(x, FUN = sd) 2*FUN(x)`. This example returns two times the standard deviation of `x`.
`CV`	Logical, `CV = TRUE` when running `discretefish_subroutine` for k-fold cross validation, and the default value is `CV = FALSE`.
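
To make the rescaling arguments concrete, the sketch below applies the default (mean) and the user-defined scaler from the `scaler.func` description to a numeric vector. The vector `distance` is hypothetical, and dividing the data by the rescaling factor is an assumption about how the factor is applied.

```r
# User-defined scaler in the form given above: two times the standard deviation.
scaler.func <- function(x, FUN = sd) 2 * FUN(x)

# Hypothetical numeric vector, e.g. travel distances.
distance <- c(12, 40, 95, 130, 210)

# Assumption: each numeric vector is divided by its rescaling factor.
default_scaled <- distance / mean(distance)         # default factor: the mean
custom_scaled  <- distance / scaler.func(distance)  # factor: 2 * sd

print(round(default_scaled, 3))
print(round(custom_scaled, 3))
```

Under this assumption the default leaves the vector with a mean of one, while the user-defined scaler leaves it with a standard deviation of one half.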

Details

Runs through model design choices generated by make_model_design and stored as 'ModelInputData' in FishSET database. Data matrix is created in create_model_input. Required data, optional data, and details on likelihood functions are outlined in make_model_design.

Likelihood-specific initial parameter estimates:

Conditional logit likelihood (logit_c)
Starting parameter values take the order of: c([alternative-specific parameters], [travel-distance parameters]). The alternative-specific parameters and travel-distance parameters are of length (# of alternative-specific variables) and (# of travel-distance variables) respectively.
Zonal logit with area specific constants (logit_zonal)
Starting parameter values take the order of: c([average-catch parameters], [travel-distance parameters]). The average-catch and travel-distance parameters are of length (# of average-catch variables)*(k-1) and (# of travel-distance variables) respectively, where (k) equals the number of alternative fishing choices.
Full information model with Dahl's correction function (logit_correction)
Starting parameter values take the order of: c([marginal utility from catch], [catch-function parameters], [polynomial starting parameters], [travel-distance parameters], [catch sigma]). The number of polynomial interaction terms is currently set to 2, so given the chosen degree 'polyn' there should be "(((polyn+1)*2)+2)*(k)" polynomial starting parameters, where (k) equals the number of alternative fishing choices. The marginal utility from catch and catch sigma are each of length equal to unity. The catch-function and travel-distance parameters are of length (# of catch variables)*(k) and (# of cost variables) respectively.
Expected profit model with normal catch function (epm_normal)
Starting parameter values take the order of: c([catch-function parameters], [travel-distance parameters], [catch sigma(s)], [scale parameter]). The catch-function and travel-distance parameters are of length (# of catch-function variables)*(k) and (# of travel-distance variables) respectively, where (k) equals the number of alternative fishing choices. The catch sigma(s) are either of length equal to unity or length (k) if the analyst is estimating location-specific catch sigma parameters. The scale parameter is of length equal to unity.
Expected profit model with Weibull catch function (epm_weibull)
Starting parameter values take the order of: c([catch-function parameters], [travel-distance parameters], [catch sigma(s)], [scale parameter]). The catch-function and travel-distance parameters are of length (# of catch-function variables)*(k) and (# of travel-distance variables) respectively, where (k) equals the number of alternative fishing choices. The catch sigma(s) are either of length equal to unity or length (k) if the analyst is estimating location-specific catch sigma parameters. The scale parameter is of length equal to unity.
Expected profit model with log-normal catch function (epm_lognormal)
Starting parameter values takes the order of: c([catch-function parameters], [travel-distanceparameters], [catch sigma(s)], [scale parameter]). The catch-function and travel-distance parameters are of length (# of catch-function variables)*(k) and (# of travel-distance variables) respectively, where (k) equals the number of alternative fishing choices. The catch sigma(s) are either of length equal to unity or length (k) if the analyst is estimating location-specific catch sigma parameters. The scale parameter is of length equal to unity.
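
The length rules above can be checked before a model run with a small helper. The sketch below is not part of FishSET; `n_start_params` and its argument names are hypothetical, and it only restates the counts given in the text (with `n_dist` standing in for the number of travel-distance or cost variables).

```r
# Hypothetical helper: expected length of the starting-value vector for each
# likelihood, following the orderings described in the text.
n_start_params <- function(likelihood, k, n_catch = 0, n_dist = 0,
                           n_alt = 0, polyn = NULL, sigma_by_zone = FALSE) {
  # Catch sigma is a single value unless it is estimated per alternative.
  n_sigma <- if (sigma_by_zone) k else 1
  switch(likelihood,
    logit_c          = n_alt + n_dist,
    logit_zonal      = n_catch * (k - 1) + n_dist,
    # marginal utility (1) + catch params + polynomial params + cost params + sigma (1)
    logit_correction = 1 + n_catch * k + (((polyn + 1) * 2) + 2) * k + n_dist + 1,
    epm_normal       = ,
    epm_weibull      = ,
    epm_lognormal    = n_catch * k + n_dist + n_sigma + 1,
    stop("unknown likelihood: ", likelihood)
  )
}

n_start_params("logit_c", k = 4, n_alt = 1, n_dist = 1)
n_start_params("logit_zonal", k = 4, n_catch = 1, n_dist = 1)
n_start_params("epm_normal", k = 4, n_catch = 1, n_dist = 1, sigma_by_zone = TRUE)
n_start_params("logit_correction", k = 4, n_catch = 1, n_dist = 1, polyn = 3)
```

Comparing the returned count with `length()` of the intended starting-value vector is a quick way to catch a mis-sized vector before optimization starts.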

Model output is saved to the FishSET database and can be loaded to the console with:

`model_out_view`:	model output including optimization information, standard errors, coefficients, and t-statistics.
`model_params`:	model estimates and standard errors
`model_fit`:	model comparison metrics
`globalcheck_view`:	model error message

The catch, choice, distance, and otherdat data generated by the make_model_design function are pulled from the ModelInputData table in the FishSET database.

Value

OutLogit:	[outmat1 se1 EPM2] (coefficients, standard errors, t-statistics)
optoutput:	optimization information
seoumat2:	standard errors
MCM:	Model Comparison metrics

Examples

## Not run: 
results <- discretefish_subroutine("pcod", run = 'all', select.model = TRUE)

## End(Not run)


Create dummy matrix from a coded ID variable

Description

Create dummy matrix from a coded ID variable

Usage

dummy_matrix(dat, project, x)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Variable in `dat` used to generate dummy matrix.

Details

Creates a dummy matrix of 1/0 with dimensions [(number of observations in dataset) x (number of factors in x)] where each column is a unique factor level. Values are 1 if the value in the column matches the column factor level and 0 otherwise.
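
The structure of the resulting matrix can be illustrated in base R. This is a minimal sketch with hypothetical port codes, not the internals of dummy_matrix.

```r
# Toy coded ID variable (hypothetical port codes).
port <- factor(c("A", "B", "A", "C"))

# One column per factor level; 1 where the observation matches the level.
dummy <- sapply(levels(port), function(lvl) as.integer(port == lvl))

dim(dummy)       # observations x factor levels
colnames(dummy)  # one column per unique level
```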

Examples

## Not run: 
PortMatrix <- dummy_matrix(pollockMainDataTable, 'pollock', 'PORT_CODE')

## End(Not run)

Create a binary vector from numeric, date, and character or factor vectors.

Description

Create a binary vector from numeric, date, and character or factor vectors.

Usage

dummy_num(dat, project, var, value, opts = "more_less", name = "dummy_num")

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`var`	Variable in `dat` to create dummy variable from.
`value`	String, value to set dummy variable by. If `var` is a date, value should be a year. If `var` is a factor, value should be a factor level. If `var` is numeric, value should be a single number or range of numbers [use c(1,5)].
`opts`	String, how dummy variable should be defined. Choices are `"x_y"` and `"more_less"`. For `"x_y"`, each element of `var` is set to 1 if the element matches `value`, otherwise 0. For `"more_less"`, each element of `var` less than `value` is set to 0 and all elements greater than `value` are set to 1. If `var` is a factor, then elements that match value will be set to 1 and all other elements set to 0. Default is set to `"more_less"`.
`name`	String, name of created dummy variable. Defaults to name of the function if not defined.

Details

For date variables, the dummy variable is defined by a date (year) and may be either year x versus all other years ("x_y") or before vs after year x ("more_less"). Use this function to create a variable defining whether or not a policy action had been implemented.
Example: before vs. after a 2008 amendment:
dummy_num(pollockMainDataTable, 'pollock', 'Haul_date', 2008, 'more_less', 'amend08')

For factor variables, both choices in opts compare selected factor level(s) against all other factor levels.
Example: Fishers targeting pollock vs. another species:
dummy_num(pollockMainDataTable, 'pollock', 'GF_TARGET_FT', c('Pollock - bottom', 'Pollock - midwater'), 'x_y', 'pollock_target')

For numeric variables, value can be a single number or a range of numbers. The dummy variable is the selected value(s) against all others (x_y) or less than the selected value versus more than the selected value (more_less). For more_less, the mean is used as the critical value if a range of values is provided.
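
The rules above can be sketched in base R. The helper `make_dummy` is hypothetical and is not the package's implementation; it assumes dates have already been reduced to a numeric year, that values equal to the cutoff map to 0, and that "x_y" is used with explicit values rather than a range.

```r
# Hypothetical helper mirroring the "x_y" and "more_less" rules.
make_dummy <- function(x, value, opts = "more_less") {
  if (is.numeric(x) && opts == "more_less") {
    cutoff <- mean(value)      # a range such as c(1, 5) collapses to its mean
    as.integer(x > cutoff)     # assumption: values equal to the cutoff map to 0
  } else {
    as.integer(x %in% value)   # "x_y", and both options for factors
  }
}

years  <- c(2006, 2007, 2008, 2009)
target <- factor(c("Pollock - bottom", "Cod", "Pollock - midwater"))

make_dummy(years, 2008, "more_less")   # before vs. after year x
make_dummy(years, 2008, "x_y")         # year x vs. all other years
make_dummy(target, c("Pollock - bottom", "Pollock - midwater"), "x_y")
make_dummy(c(1, 3, 4, 6), c(1, 5), "more_less")   # cutoff is mean(c(1, 5))
```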

Value

Returns primary dataset with dummy variable added.

Examples

## Not run: 
pollockMainDataTable <- dummy_num(pollockMainDataTable, 'pollock', 'Haul_date', 2008, 
  'more_less', 'amend80')

## End(Not run)

Create dummy variable

Description

Create dummy variable

Usage

dummy_var(dat, project, DumFill = 1, name = "dummy_var")

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`DumFill`	Fill the dummy variable with 1 or 0
`name`	String, name of created dummy variable. Defaults to name of the function if not defined.

Details

Creates a dummy variable of either 0 or 1 with length of the number of rows of the data set.

Value

Primary dataset with dummy variable added.

Examples

## Not run: 
pollockMainDataTable <- dummy_var(pollockMainDataTable, 'pollock', DumFill=1, 'dummyvar')

## End(Not run)

Check variables are not empty

Description

Check for and remove empty variables from dataset. Empty variables are columns in the data that contain all NAs and/or empty strings.

Usage

empty_vars_filter(dat, project, remove = FALSE)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`remove`	Logical, whether to remove empty variables. Defaults to `FALSE`.

Details

Function checks for empty variables and prints an outcome message to the console. If empty variables are present and remove = TRUE, then empty variables will be removed from the dataset. Empty variables are columns in the dataset that contain all NAs or empty strings.
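
The definition of an empty variable can be illustrated in base R. This sketch uses a hypothetical toy data frame and does not reproduce the function's console messages or logging.

```r
# Toy data: 'gear' holds only NAs and empty strings, 'note' holds only NAs.
dat <- data.frame(
  catch = c(10, 12, NA),
  gear  = c("", NA, ""),
  note  = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

# A column is empty when every element is NA or an empty string.
is_empty <- vapply(dat, function(col) all(is.na(col) | col == ""), logical(1))

names(dat)[is_empty]                       # empty variables found
cleaned <- dat[, !is_empty, drop = FALSE]  # what remove = TRUE would keep
```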

Value

Returns the dataset with empty variables removed if remove = TRUE.

Examples

## Not run: 
# check for empty vars
empty_vars_filter(pollockMainDataTable, 'pollock')

# remove empty vars from data
mod.dat <- empty_vars_filter(pollockMainDataTable, 'pollock', remove = TRUE)

## End(Not run)

Expected profit model with log-normal catch function

Description

Calculate the negative log-likelihood of the expected profit model (EPM) with log-normal catch function. For more information on the EPM lognormal model see section 8.4.5 in the FishSET user manual. https://docs.google.com/document/d/1dzXsVt5iWcAQooDDXRJ3XyMoqnSmpZOqirU_f_PnQUM/edit#heading=h.ps7td88zo4ge

Usage

epm_lognormal(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting parameter values as a numeric vector. The order of parameters in the vector is: c([catch-function params], [travel-dist params], [stdev], [common scale param]), where the length of catch-function parameters is the # of alternatives * # of catch-function variables, length of travel-distance parameters is the # of travel-distance variables, length of standard deviation defaults to 1 but alternative-specific standard deviation values can be specified (length = # of alternatives), and the common scale parameter is a single value.
`dat`	Data matrix, see output from `shift_sort_x`, alternatives with distance.
`otherdat`	List that contains other data used in the model, see section 8.4.5 in the FishSET user manual for more details (link in the description above): (1) 'griddat': catch-function variables that interact with alternative-specific catch-function parameters and do not vary across alternatives (e.g., vessel gross tonnage). (2) 'intdat': travel-distance variables that interact with travel-distance parameters and the distance matrix and do not vary across alternatives. (3) 'prices': price in terms of $/landings units. This is typically a vector with prices for each observation, but can be a single value representing price for the entire dataset.
`alts`	Number of alternative choices in model
`project`	Name of project
`expname`	Expected catch table (optional)
`mod.name`	Name of model run for model result output table

Details

This function is called in discretefish_subroutine when running an EPM model with a log-normal catch function.
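
Because the function is called with a single starting-value vector, the pieces of `starts3` must be concatenated in the documented order. The sketch below assembles such a vector; all counts and values are hypothetical placeholders, not recommended starting values.

```r
k       <- 3   # number of alternatives (hypothetical)
n_catch <- 1   # number of catch-function variables
n_dist  <- 1   # number of travel-distance variables

catch_params <- rep(0.5, k * n_catch)  # one per alternative per catch variable
dist_params  <- rep(-0.1, n_dist)      # one per travel-distance variable
stdev        <- rep(1, k)              # alternative-specific; a single value also works
scale_param  <- 1                      # common scale parameter

# Order: catch-function, travel-distance, standard deviation, scale.
starts3 <- c(catch_params, dist_params, stdev, scale_param)
length(starts3)
```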

Value

ld: negative log likelihood

Expected profit model with normal catch function

Description

Calculate the negative log-likelihood of the expected profit model (EPM) with a normal catch function. For more information on the EPM normal model see section 8.4.3 in the FishSET user manual. https://docs.google.com/document/d/1p8mK65uG8yp-HbzCeBgtO0q6DSpKV1Zyk_ucNskt5ug/edit#heading=h.mrt9b1ee2yb8

Usage

epm_normal(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting values as a numeric vector. The order of parameters in the vector is: c([catch-function params], [travel-dist params], [stdev], [common scale param]), where the length of catch-function parameters is the # of alternatives * # of catch-function variables, length of travel-distance parameters is the # of travel-distance variables, length of standard deviation defaults to 1 but alternative-specific standard deviation values can be specified (length = # of alternatives), and the common scale parameter is a single value.
`dat`	Data matrix, see output from `shift_sort_x`, alternatives with distance.
`otherdat`	List that contains other data used in the model, see section 8.4.3 in the FishSET user manual for more details (link in the description above): (1) 'griddat': catch-function variables that interact with alternative-specific catch-function parameters and do not vary across alternatives (e.g., vessel gross tonnage). (2) 'intdat': travel-distance variables that interact with travel-distance parameters and the distance matrix and do not vary across alternatives. (3) 'prices': price in terms of $/landings units. This is typically a vector with prices for each observation, but can be a single value representing price for the entire dataset.
`alts`	Number of alternative choices in model
`project`	Name of project
`expname`	Expected catch table (optional)
`mod.name`	Name of model run for model result output table

Details

This function is called in discretefish_subroutine when running an EPM model with a normal catch function.

Value

ld: negative log likelihood

Expected profit model with Weibull catch function

Description

Calculate the negative log-likelihood of the expected profit model (EPM) with Weibull catch function. For more information on the EPM Weibull model see section 8.4.4 in the FishSET user manual. https://docs.google.com/document/d/1dzXsVt5iWcAQooDDXRJ3XyMoqnSmpZOqirU_f_PnQUM/edit#heading=h.gh3zw8f9nsdi

Usage

epm_weibull(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting parameter values as a numeric vector. The order of parameters in the vector is: c([catch-function params], [travel-dist params], [shape params], [common scale param]), where the length of catch-function parameters is the # of alternatives * # of catch-function variables, length of travel-distance parameters is the # of travel-distance variables, length of shape parameters defaults to 1 but alternative-specific shape parameters can be specified (length = # of alternatives), and the common scale parameter is a single value.
`dat`	Data matrix, see output from `shift_sort_x`, alternatives with distance.
`otherdat`	List that contains other data used in the model, see section 8.4.4 in the FishSET user manual for more details (link in the description above): (1) 'griddat': catch-function variables that interact with alternative-specific catch-function parameters and do not vary across alternatives (e.g., vessel gross tonnage). (2) 'intdat': travel-distance variables that interact with travel-distance parameters and the distance matrix and do not vary across alternatives. (3) 'prices': price in terms of $/landings units. This is typically a vector with prices for each observation, but can be a single value representing price for the entire dataset.
`alts`	Number of alternative choices in the model
`project`	Name of project
`expname`	Expected catch table (optional)
`mod.name`	Name of model run for model result output table

Details

This function is called in discretefish_subroutine when running an EPM model with a Weibull catch function.

Value

ld: negative log likelihood

Return names of expected catch matrices

Description

Return the names of expected catch matrices saved to the FishSET database.

Usage

exp_catch_names(project)

Arguments

`project`	Name of project.

Get Expected Catch List

Description

Returns the Expected Catch list from the FishSET database.

Usage

expected_catch_list(project, name = NULL)

Arguments

`project`	Name of project.
`name`	Name of expected catch table from the FishSET database. The table name will contain the string "ExpectedCatch". If `NULL`, the default table is returned. Use `tables_database` to see a list of FishSET database tables by project.

Explore starting value parameter space

Description

Shotgun method to find better parameter starting values by exploring starting value parameter space.

Usage

explore_startparams(project, space, dev, startsr = NULL)

Arguments

`project`	String, name of project.
`space`	List of length 1 or length equal to the number of models to be evaluated. `space` is the number of starting value permutations to test (the size of the space to explore). The greater the `dev` argument, the larger the `space` argument should be.
`dev`	List of length 1 or length equal to the number of models to be evaluated. `dev` refers to how far to deviate from the average parameter values when exploring (random normal deviates). The less certain the average parameters are, the greater the `dev` argument should be.
`startsr`	Optional. List, average starting value parameters for revenue/location-specific covariates then cost/distance. The best guess at what the starting value parameters should be (e.g. all ones). Specify starting value parameters for each model if values should be different from ones. The number of starting value parameters should correspond to the likelihood and data that you want to test.

Details

Function is used to identify better starting parameters when convergence is an issue. For more details on the likelihood functions or data, see make_model_design. Function calls the model design file and should be used after the make_model_design function is called.
If more than one model is defined in the model design file, then starting parameters must be defined for each model.
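
The shotgun idea can be sketched in base R with a toy objective. This is an illustration under stated assumptions, not the function's implementation: `toy_nll` is a hypothetical stand-in for a FishSET likelihood, and the best start is taken to be the permutation with the smallest negative log-likelihood.

```r
set.seed(42)

# Toy negative log-likelihood with its minimum at c(2, -1).
toy_nll <- function(par) sum((par - c(2, -1))^2)

startsr <- c(1, 1)  # best guess at the average starting values
space   <- 15       # number of starting-value permutations to test
dev     <- 3        # standard deviation of the random normal deviates

# Each row is one permutation drawn around the average starting values.
savestarts <- matrix(rnorm(space * length(startsr), mean = startsr, sd = dev),
                     nrow = space, byrow = TRUE)

# Evaluate the objective at every permutation and keep the best one.
saveLLstarts <- apply(savestarts, 1, toy_nll)
newstart <- savestarts[which.min(saveLLstarts), ]
```

A larger `dev` spreads the draws more widely, which is why a larger `space` is then needed to cover the region adequately.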

Value

Returns three data frames.

newstart:	Chosen starting values with smallest likelihood
saveLLstarts:	Likelihood values for each starting value permutation
savestarts:	Starting value permutations (corresponding to each saved likelihood value)

Examples

## Not run: 
# Example with only one model specified
results <- explore_startparams('myproject', 15, 3, rep(1,17))

# Example with three models specified
results <- explore_startparams('myproject', space = list(15,10,100),
   dev=list(3,3,1), startsr=list(c(1,2,3), c(1,0, -1), c(0,0,.5)))

# View results
results$startsOut

## End(Not run)

Remove rows based on filter expressions defined in 'filterTable'

Description

Remove rows based on filter expressions defined in 'filterTable'

Usage

filter_dat(dat, project, exp, filterTable = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`exp`	How to filter. May be a row in the filter table generated by `filter_table` that contains a filter expression or the filter expression to apply to the data. If the filter expression is supplied, it should take on the form of `"x < 100"` or `"is.na(x) == FALSE"`.
`filterTable`	Name of filter table in FishSET database. Name should contain the phrase 'filterTable'.

Details

Filter data frame based on a predefined filter expression from filter_table or a filter expression. We recommend creating a filter table using filter_table so that filter expressions are stored and easily accessed in the future.
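
How a filter expression supplied as a string selects rows can be illustrated in base R. The helper `apply_filter` is hypothetical, and the sketch assumes that rows for which the expression evaluates to TRUE are retained; it does not reproduce the database lookup or logging.

```r
# Toy data with one missing value (hypothetical).
dat <- data.frame(
  PERFORMANCE_Code = c(1, 2, 1, NA),
  SEASON = c("A", "B", "A", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: evaluate a filter expression string against the data.
apply_filter <- function(dat, exp) {
  keep <- eval(parse(text = exp), envir = dat)
  # Rows where the expression is NA are dropped rather than kept.
  dat[!is.na(keep) & keep, , drop = FALSE]
}

apply_filter(dat, "PERFORMANCE_Code == 1")
apply_filter(dat, "SEASON == 'A'")
apply_filter(dat, "is.na(PERFORMANCE_Code) == FALSE")
```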

Value

Filtered data frame

Examples

## Not run: 
newdat <- filter_dat(pcodMainDataTable, 'pcod', exp = 3, 
                     filterTable = 'pcodfilterTable01012011')
                     
newdat <- filter_dat(pcodMainDataTable, 'pcod', 
                     exp = 'PERFORMANCE_Code == 1', filterTable = NULL)
                     
newdat <- filter_dat(pcodMainDataTable, "pcod", exp = "SEASON == 'A'",
                     filterTable = NULL)

## End(Not run)


Filter out-of-sample data for model predictions

Description

Filter the out-of-sample dataset and prepare for predictions of fishing probability.

Usage

filter_outsample(
  dat,
  project,
  mod.name,
  spatial_outsample = FALSE,
  zone.dat = NULL,
  spat = NULL,
  zone.spat = NULL,
  outsample_zones = NULL,
  lon.spat = NULL,
  lat.spat = NULL,
  use.scalers = FALSE,
  scaler.func = NULL
)

Arguments

`dat`	Out-of-sample data
`project`	Name of project
`mod.name`	Name of saved model to use. Argument can be the name of the model or can pull the name of the saved "best" model. Leave `mod.name` empty to use the saved "best" model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`spatial_outsample`	Logical, indicate whether the data are out-of-sample spatially or not. Note that models with zone-specific coefficients (e.g., zonal logit) cannot be used to predict data that are out-of-sample spatially. `spatial_outsample = FALSE` can represent data out-of-sample temporally or out-of-sample based on another variable (e.g., vessel tonnage, gear type, etc.)
`zone.dat`	Variable in `dat` that identifies the individual areas or zones.
`spat`	Required, data file or character. `spat` is a spatial data file containing information on fishery management or regulatory zone boundaries. Shape, json, geojson, and csv formats are supported. geojson is the preferred format. json files must be converted into geojson. This is done automatically when the file is loaded with `read_dat` with `is.map` set to true. `spat` cannot, at this time, be loaded from the FishSET database.
`zone.spat`	Variable in `spat` that identifies the individual areas or zones.
`outsample_zones`	Vector of out-of-sample zones to filter `dat`. Only provided as input when running this function in the main app.
`lon.spat`	Required for csv files. Variable or list from `spat` containing longitude data. Leave as NULL if `spat` is a shape or json file.
`lat.spat`	Required for csv files. Variable or list from `spat` containing latitude data. Leave as NULL if `spat` is a shape or json file.
`use.scalers`	Input for `create_model_input()`. Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Input for `create_model_input()`. Function to calculate rescaling factors.

Details

This function filters the out-of-sample data. If the data are out-of-sample spatially, set spatial_outsample = TRUE and provide a spatial file (spat) and the zone ID variable in the spatial file (zone.spat). An interactive map is used for selecting out-of-sample zones. If the data are not spatially out-of-sample, the data are simply filtered for the zones included in the selected model. Note that models with zone-specific coefficients (e.g., zonal logit) cannot predict spatial out-of-sample data. Upon successful execution of filter_outsample(), the filtered dataset is saved to an RDS file in the outputs folder. This function overwrites the existing RDS file each time it is run.
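
For the non-spatial case, the filtering step amounts to keeping observations whose zone appears in the selected model. The base-R sketch below illustrates that step and the overwrite-on-save behavior; the data, the zone list, and the file name are hypothetical, and a temporary directory stands in for the project outputs folder.

```r
# Toy out-of-sample data (hypothetical).
outsample <- data.frame(
  haul = 1:5,
  ZoneID = c(10, 11, 12, 10, 13)
)
model_zones <- c(10, 11)  # zones included in the selected model (hypothetical)

# Keep only observations in zones the model knows about.
filtered <- outsample[outsample$ZoneID %in% model_zones, , drop = FALSE]

# saveRDS() replaces any existing file at the same path.
out_file <- file.path(tempdir(), "filtered_outsample.rds")  # hypothetical file name
saveRDS(filtered, out_file)
nrow(readRDS(out_file))
```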

Value

Returns probability of logit model by choice

Define and store filter expressions

Description

Define and store filter expressions

Usage

filter_table(dat, project, x, exp)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`x`	Variable in `dat` over which filter will be applied.
`exp`	Filter expression. Should take on the form of `"x < 100"` or `"is.na(x) == FALSE"`.

Details

This function allows users to define and store data filter expressions, which can then be applied to the data. The filter table is saved in the FishSET database under the project name and 'filterTable'. New filter expressions are added each time the function is run, and the table is automatically updated in the FishSET database. The function call is logged in the log file.

Value

Filter expressions saved as a table to the FishSET database.

Examples

## Not run: 
filter_table(pcodMainDataTable, 'pcod', x = 'PERFORMANCE_Code',
             exp = 'PERFORMANCE_Code == 1')

## End(Not run)


Identify geographic centroid of fishery management or regulatory zone

Description

Identify geographic centroid of fishery management or regulatory zone

Usage

find_centroid(
  spat,
  project,
  spatID,
  lon.spat = NULL,
  lat.spat = NULL,
  cent.name = NULL,
  log.fun = TRUE
)

Arguments

`spat`	Spatial data containing information on fishery management or regulatory zones. Can be shape file, json, geojson, data frame, or list.
`project`	Name of project
`spatID`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `spatID` should be name of list containing information on zones.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.
`cent.name`	String, name to include in centroid table. Centroid table names take the form of '"projectNameZoneCentroid"'. Defaults to 'NULL' (e.g. '"projectZoneCentroid"').
`log.fun`	Logical, whether to log function call (for internal use).

Details

Returns the geographic centroid of each area/zone in spat. The centroid table is saved to the FishSET database. Function is called by the create_alternative_choice and create_dist_between functions.
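
To show what a zone centroid is, the sketch below computes the area-weighted centroid of simple polygons in base R using the shoelace formula. It is an illustration only: `polygon_centroid` is a hypothetical helper, the zones are toy rectangles, coordinates are treated as planar, and the method used by find_centroid itself is not assumed.

```r
# Hypothetical helper: planar, area-weighted centroid of one polygon ring.
polygon_centroid <- function(lon, lat) {
  # Close the ring if the first vertex is not repeated at the end.
  n <- length(lon)
  if (lon[1] != lon[n] || lat[1] != lat[n]) {
    lon <- c(lon, lon[1])
    lat <- c(lat, lat[1])
  }
  i <- seq_len(length(lon) - 1)
  cross <- lon[i] * lat[i + 1] - lon[i + 1] * lat[i]
  area <- sum(cross) / 2  # signed area (shoelace formula)
  c(lon = sum((lon[i] + lon[i + 1]) * cross) / (6 * area),
    lat = sum((lat[i] + lat[i + 1]) * cross) / (6 * area))
}

# Two toy rectangular zones given as vertex lists.
zones <- data.frame(
  zone = c(rep("Z1", 4), rep("Z2", 4)),
  lon = c(-170, -168, -168, -170, -168, -166, -166, -168),
  lat = c(54, 54, 56, 56, 54, 54, 56, 56)
)

# One row per zone: zone ID plus centroid longitude and latitude.
centroids <- do.call(rbind, lapply(split(zones, zones$zone), function(z) {
  data.frame(zone = z$zone[1], t(polygon_centroid(z$lon, z$lat)))
}))
centroids
```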

Value

Returns a data frame where each row is a unique zone and columns are the zone ID and the latitude and longitude defining the centroid of each zone.

Create fishing or weighted fishing centroid

Description

Create fishing or weighted fishing centroid

Usage

find_fishing_centroid(
  dat,
  project,
  zoneID,
  weight.var = NULL,
  lon.dat,
  lat.dat,
  names = NULL,
  cent.name = NULL,
  log.fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project
`zoneID`	Variable in `dat` that identifies zonal assignments. If `spat` is class sf, `zoneID` should be the name of the list containing information on zones.
`weight.var`	Variable from `dat` for weighted average. If `weight.var` is defined, the centroid is defined by the latitude and longitude of fishing locations in each zone weighted by `weight.var`.
`lon.dat`	Required. Longitude variable in `dat`.
`lat.dat`	Required. Latitude variable in `dat`.
`names`	The names of the fishing centroid columns to be added. A vector of length two in the order of `c("lon", "lat")`. The default is `c("fish_cent_lon", "fish_cent_lat")` and `c("weight_cent_lon", "weight_cent_lat")` if `weight.var` is used.
`cent.name`	A string to include in the centroid table name. Table names take the form of '"projectNameFishCentroid"' for fishing centroids.
`log.fun`	Logical, whether to log function call (for internal use).

Details

Fishing centroid defines the centroid by the mean latitude and longitude of fishing locations in each zone. Weighted centroid defines the centroid by the mean latitude and longitude of fishing locations in each zone weighted by the weight.var. The fishing and weighted centroid variables can be used anywhere latitude/longitude variables appear. Each observation in dat must be assigned to a fishery or regulatory area/zone. If the zone identifier exists in dat and is not called 'ZoneID', then zoneID should be the variable name containing the zone identifier. If a zone identifier variable does not exist in dat, spat must be specified and zoneID must be the zone identifier in spat. The assignment_column function will be run and a zone identifier variable added to dat.
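
The two centroid definitions can be compared in base R. The sketch below uses a hypothetical toy dataset and the default column names listed in the `names` argument; it is not the function's implementation and skips the database and logging steps.

```r
# Toy haul data (hypothetical); 'catch' plays the role of weight.var.
dat <- data.frame(
  ZoneID = c(1, 1, 1, 2, 2),
  lon = c(-170, -169, -168, -165, -164),
  lat = c(55, 56, 57, 58, 59),
  catch = c(1, 1, 2, 3, 1)
)

# Per-zone mean and catch-weighted mean of the fishing locations.
cent_by_zone <- do.call(rbind, lapply(split(dat, dat$ZoneID), function(z) {
  data.frame(
    ZoneID = z$ZoneID[1],
    fish_cent_lon = mean(z$lon),
    fish_cent_lat = mean(z$lat),
    weight_cent_lon = weighted.mean(z$lon, z$catch),
    weight_cent_lat = weighted.mean(z$lat, z$catch)
  )
}))

# Attach the centroids to every observation in the matching zone.
dat_cent <- merge(dat, cent_by_zone, by = "ZoneID")
```

In zone 1 the weighted centroid is pulled toward the haul with the largest catch, which is the practical difference between the two definitions.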

Value

Returns primary dataset with fishing centroid and, if weight.var is specified, the weighted fishing centroid.

Compare imported data table to the previously saved version of the data table

Description

Compare imported data table to the previously saved version of the data table

Usage

fishset_compare(x, y, compare = c(TRUE, FALSE), project)

Arguments

`x`	Updated data table to be saved.
`y`	Previously saved version of data table.
`compare`	Logical, if TRUE, compares `x` to `y` before saving `x` to FishSET database.
`project`	Name of project

Details

Function is optional. It is designed to check for consistency between versions of the same data frame so that the logged functions can be used to rerun the previous analysis on the updated data. The column names, including spelling and capitalization, must match the previous version to use the logged functions to rerun code after data has been updated (i.e., new year of data). The function is called by the data import functions (load_maindata, load_port, load_aux, load_grid). Set the compare argument to TRUE to compare column names of the new and previously saved data tables. The new data tables will be saved to the FishSET database if column names match. Set the compare argument to FALSE if no previous versions of the data table exist in the FishSET database. No comparison will be made and the new file will be saved to the database.
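
The column-name check described above can be approximated in base R. This sketch uses two hypothetical data frames and assumes that column order does not matter; the exact comparison performed by fishset_compare is not assumed.

```r
# Previously saved table and updated table (hypothetical).
old_dat <- data.frame(VESSEL_ID = 1:2, Haul_date = c("2019-01-01", "2019-01-02"))
new_dat <- data.frame(VESSEL_ID = 3:4, haul_date = c("2020-01-01", "2020-01-02"))

# Names are compared exactly, so capitalization differences count as mismatches.
names_match <- setequal(names(new_dat), names(old_dat))

setdiff(names(new_dat), names(old_dat))  # columns only in the updated table
setdiff(names(old_dat), names(new_dat))  # columns only in the saved table
```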

Show all SQL Tables in FishSET Folder

Description

Returns a data frame containing all tables from each project by project name and table type.

Usage

fishset_tables(project = NULL)

Arguments

`project`	Project name. If NULL, tables from all available projects will be displayed.

Examples

## Not run: 
# return all tables for all projects
fishset_tables()

# return all tables for a specific project
fishset_tables("pollock")

## End(Not run)

Create fleet variable using fleet definition table

Description

Add a fleet ID column to the primary data using a fleet table (see fleet_table for details).

Usage

fleet_assign(
  dat,
  project,
  fleet_tab,
  assign = NULL,
  overlap = FALSE,
  format_var = "string"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`fleet_tab`	String, name of the fleet table stored in FishSET database. Should contain the string 'FleetTable'.
`assign`	Integer, a vector of row numbers from `fleet_tab`. Only fleet definitions in these rows will be used and added to 'MainDataTable'. If `assign = NULL` (the default), all fleet definitions in the table will be used.
`overlap`	Logical; whether overlapping fleet assignments are allowed. Defaults to `FALSE`.
`format_var`	String. If `format_var = "string"` (the default), a single column named "fleet" is added to 'MainDataTable'; if `overlap = TRUE`, observations with multiple fleet assignments are duplicated. If `format_var = "dummy"`, a binary column is added for each fleet in the fleet table.

Value

Returns the primary dataset with added fleet variable(s).

Examples

## Not run: 
fleet_assign(pollockMainDataTable, 'pollock', fleet_tab = 'pollockFleetTable', 
             overlap = TRUE)

## End(Not run)

Define and store fleet expressions

Description

fleet_table saves a table of fleet expressions to the FishSET database, which can then be applied to a dataset with fleet_assign. The table must contain a 'condition' column and a 'fleet' column, with each row corresponding to a set of expressions that will be used to assign observations to fleets. A table can be created with the cond and fleet_val arguments or by uploading an existing table that matches the format requirements. See 'Details' below for examples of how tables can be formatted.

Usage

fleet_table(
  dat,
  project,
  cond = NULL,
  fleet_val = NULL,
  table = NULL,
  save = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`cond`	String; a vector containing valid R expressions saved as strings. Must be used with the `fleet_val` argument. Each expression should be in quotes (double or single) with nested quotes indicated with escaped quotes (\') or with the opposite quote used to contain the expression. For example, "species == 'cod'" and "species == \'cod\'" are both valid.
`fleet_val`	String; a vector of fleet names to be assigned. Must be used with the `cond` argument.
`table`	A data frame that has one condition column and one fleet column. See 'Details' for table formatting.
`save`	Logical; whether to save the current fleet_table to the FishSET database. Defaults to `TRUE`. Tables are saved in the format of 'projectFleetTable'. Each project can only have one fleet table. New fleet definitions are appended to the existing fleet table. See `table_remove` to delete a table.

Details

Below is a simple example of a fleet table. For a fleet table to be created, it must contain one "condition" column and one "fleet" column. Each fleet definition can be as long as necessary. For example, the first expression in the condition column example could also be "GEAR == 8 & species == 'pollock'". Use the '&' operator when combining expressions.

condition	fleet
'GEAR == 8'	'A'
'species == "cod"'	'B'
'area %in% c(640, 620)'	'C'
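
The way condition strings select rows can be sketched in base R. This is an illustration under stated assumptions, not how fleet_assign is implemented; the toy `dat` and the use of `eval(parse())` are choices made for this example only.

```r
# Toy data; column names mirror the example table above.
dat <- data.frame(
  GEAR = c(8, 6, 8, 6),
  species = c("pollock", "cod", "cod", "pollock"),
  area = c(610, 620, 630, 640),
  stringsAsFactors = FALSE
)

fleet_tab <- data.frame(
  condition = c("GEAR == 8", "species == 'cod'", "area %in% c(640, 620)"),
  fleet = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Evaluate each condition string against the data; each result is a
# logical vector with one element per row of `dat`.
matches <- lapply(fleet_tab$condition, function(cond) {
  eval(parse(text = cond), envir = dat)
})
names(matches) <- fleet_tab$fleet

matches$A                     # rows meeting "GEAR == 8"
which(matches$A & matches$B)  # rows meeting both A and B (an overlap)
```

Rows that satisfy more than one condition are the "overlapping" assignments that the overlap argument of fleet_assign controls.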

Value

Returns a table of fleet conditions that is saved to the FishSET database with the name 'projectFleetTable'.

Examples

## Not run:  
fleet_table("MainDataTable", "myProject", 
            cond = c("GEAR == 8", "species == 'cod'", "area %in% c(640, 620)"),
            fleet_val = c("A", "B", "C"), save = TRUE
            ) 

## End(Not run)

Format Gridded Data

Description

Change the format of a gridded dataset from wide to long (or vice versa) and remove any unmatched area/zones from grid. This is a necessary step for including gridded variables in the conditional logit (logit_c()) model.

Usage

format_grid(
  grid,
  dat,
  project,
  dat.key,
  area.dat,
  area.grid = NULL,
  id.cols,
  from.format = "wide",
  to.format = "wide",
  val.name = NULL,
  save = FALSE
)

Arguments

`grid`	Gridded dataset to format.
`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Name of project.
`dat.key`	String, name of column(s) in MainDataTable to join by. The number of columns must match `id.cols`.
`area.dat`	String, the name of the area or zone column in `dat`.
`area.grid`	String, the name of the area or zone column in `grid` if `from.format = "long"`. Ignored if `from.format = "wide"`.
`id.cols`	String, the names of columns from `grid` that are neither area (`area.grid`) nor value (`val.name`) columns, for example date or period column(s).
`from.format`	The original format of `grid`. Options include `"long"` or `"wide"`. Use `"long"` if a single area column exists in `grid`. Use `"wide"` if `grid` contains a column for each area.
`to.format`	The desired format of `grid`. Options include `"long"` or `"wide"`. Use `"long"` if you want a single area column with a corresponding value column. Use `"wide"` if you would like each area to have its own column.
`val.name`	Required if converting from wide to long or long to wide format. When `from.format = "wide"` and `to.format = "long"`, `val.name` will be the name of the new value variable associated with the area column. When `from.format = "long"` and `to.format = "wide"`, `val.name` will be the name of the existing value variable associated with the area column.
`save`	Logical, whether to save formatted `grid`. When `TRUE`, the table will be saved with the string `"Wide"` or `"Long"` appended depending on the value of `to.format`.
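
The difference between the wide and long layouts can be shown with a small base R sketch. It only illustrates the two shapes and is not format_grid's implementation; the toy values and the column names `date`, `area`, and `sst` are assumptions for this example.

```r
# Toy wide-format grid: one row per date, one column per zone.
grid_wide <- data.frame(
  date = c("2020-01-01", "2020-01-02"),
  `610` = c(4.1, 4.3),
  `620` = c(5.0, 5.2),
  check.names = FALSE
)

id_col <- "date"                                   # plays the role of id.cols
zone_cols <- setdiff(names(grid_wide), id_col)

# Wide to long: stack the zone columns into one area column and one
# value column ("sst" plays the role of val.name).
grid_long <- data.frame(
  date = rep(grid_wide[[id_col]], times = length(zone_cols)),
  area = rep(zone_cols, each = nrow(grid_wide)),
  sst = unlist(grid_wide[zone_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

grid_long
```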

Reformat out-of-sample model coefficients

Description

Reformat out-of-sample model coefficients by removing zones not included in the out-of-sample dataset

Usage

format_outsample_coefs(in_zones, out_zones, Eq, likelihood)

Arguments

`in_zones`	Vector of zoneIDs in the in-sample dataset
`out_zones`	Vector of zoneIDs in the out-of-sample dataset
`Eq`	Tibble containing estimated model coefficients (including standard errors and t-values)
`likelihood`	Character, name of the likelihood

Value

Returns a list with (1) a vector of coefficients (zones not in the out-of-sample dataset removed) and (2) a flag indicating whether the first alt (in-sample dataset) is not included in the out-of-sample dataset.

Display summary of function calls

Description

Display summary of function calls

Usage

function_summary(project, date = NULL, type = "dat_load", show = "all")

Arguments

`project`	Project name.
`date`	Character string; the date of the log file to retrieve. If `NULL`, the most recent log is pulled.
`type`	The type of function to display. Options are "dat_load", "dat_quality", "dat_create", "dat_exploration", "fleet", and "model".
`show`	Whether to display `"all"` calls, the `"last"` (most recent) call, or the `"first"` (oldest) function call from the log file.

Details

Displays a list of functions by type and their arguments from a log file. If no date is entered the most recent log file is pulled.

Examples

## Not run: 
function_summary("pollock")

## End(Not run)

Retrieve closure scenario by project

Description

Retrieve closure scenario by project

Usage

get_closure_scenario(project)

Arguments

project

Name of project.

Examples

## Not run: 
get_closure_scenario("pollock")

## End(Not run)

Return cached confidentiality tables

Description

This function lists the confidentiality "check" tables used to suppress values.

Usage

get_confid_cache(project, show = "all")

Arguments

`project`	Name of project
`show`	Output `"all"` tables, `"last"` table, or `"first"` table.

Value

A list of tables containing suppression conditions.

Return the confidentiality settings

Description

This function returns the confidentiality settings from the project settings file.

Usage

get_confid_check(project)

Arguments

project

Name of project

Value

A list containing the confidentiality parameters: check, v_id, rule, and value.

Retrieve grid log file

Description

Retrieves the grid log file for a project. The grid log shows which grid files are currently saved to the project data folder.

Usage

get_grid_log(project)

Arguments

project

Name of project.

Details

The grid log is a list containing information about the grid files currently saved to the project data folder. Each grid entry contains three fields: grid_name, closure_name, and combined_areas. grid_name is the name of the original grid object. If the other two fields are empty, the grid file has not been altered and is the same as the original. closure_name is the name of a second grid file containing closure areas that were combined with grid_name. combined_areas are the names/IDs of the closure areas from the closure grid file that were combined with grid_name.

Examples

## Not run: 
get_grid_log("pollock")

## End(Not run)

Pull data from latest project file

Description

Pull data from latest project file

Usage

get_latest_projectfile(project, mod.name)

Arguments

`project`	Project name
`mod.name`	Model name

Examples

## Not run: 
get_latest_projectfile("pollock", "logit_mod1")

## End(Not run)

Retrieve project settings

Description

Retrieve project settings

Usage

get_proj_settings(project, format = FALSE)

Arguments

`project`	Name of project.
`format`	Logical, output project settings using `pander`. Useful for markdown documents.

Details

The project settings file includes confidentiality settings, the user output folder directory, and the default plot saving size.

Calculate and view Getis-Ord statistic

Description

Wrapper function to calculate global and local Getis-Ord by discrete area

Usage

getis_ord_stats(
  dat,
  project,
  varofint,
  zoneid,
  spat,
  cat,
  lon.dat = NULL,
  lat.dat = NULL,
  lon.spat = NULL,
  lat.spat = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`varofint`	Numeric variable in `dat` to test for spatial high/low clustering.
`zoneid`	Variable in `dat` that identifies the individual zones or areas. Define if exists in `dat` and is not named 'ZoneID'. Defaults to NULL.
`spat`	Spatial data containing information on fishery management or regulatory zones. See `load_spatial`.
`cat`	Variable in `spat` defining the individual areas or zones.
`lon.dat`	Longitude variable in `dat`. Required if `zoneid` is not defined.
`lat.dat`	Latitude variable in `dat`. Required if `zoneid` is not defined.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.

Details

Calculates the degree, within each zone, that high or low values of the varofint cluster in space. Function utilizes the localG and knearneigh functions from the spdep package. The spatial input is a row-standardized spatial weights matrix for computed nearest neighbor matrix, which is the null setting for the nb2listw function. Requires a data frame with area as a factor, the lon/lat centroid for each area, the lat/lon outlining each area, and the variable of interest (varofint) or a map file with lat/lon defining boundaries of area/zones and variable of interest for weighting. Also required is the lat/lon defining the center of a zone/area. If the centroid is not included in the map file, then find_centroid can be called to calculate the centroid of each zone. If the variable of interest is not associated with an area/zone then the assignment_column function can be used to assign each observation to a zone. Arguments to identify centroid and assign variable of interest to area/zone are optional and default to NULL.

Value

Returns a plot and table. Both are saved to the output folder.

Examples

## Not run: 
getis_ord_stats(pcodMainDataTable, project = 'pcod', varofint = 'OFFICIAL_MT_TONS',
  spat = spatdat, lon.dat = 'LonLat_START_LON', lat.dat = 'LonLat_START_LAT', cat = 'NMFS_AREA')

## End(Not run)

View error output from discrete choice model for the defined project

Description

Returns error output from running the discretefish_subroutine function. The table argument must be the full name of the table in the FishSET database. Use tables_database to view table names in the FishSET database.

Usage

globalcheck_view(table, project)

Arguments

`table`	Table name in FishSET database. Should contain the project, the phrase 'LDGlobalCheck', and a date in YMD format (20200101). Table name must be in quotes.
`project`	Name of project

Examples

## Not run: 
globalcheck_view('pcodLDGlobalCheck20190604', 'pcod')

## End(Not run)

Create a within-group running sum variable

Description

Create a within-group running sum variable

Usage

group_cumsum(
  dat,
  project,
  group,
  sort_by,
  value,
  name = "group_cumsum",
  create_group_ID = FALSE,
  drop_total_col = FALSE
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	String, project name.
`group`	String, the grouping variable(s) to sum `value` by. Used to create the "group_total" variable.
`sort_by`	String, a date variable to order 'MainDataTable' by.
`value`	String, the value variable used to calculate cumulative sum. Must be numeric.
`name`	String, the name for the new variable. Defaults to "group_cumsum".
`create_group_ID`	Logical, whether to create a group ID variable using `ID_var`. Defaults to `FALSE`.
`drop_total_col`	Logical, whether to remove the "group_total" variable created to calculate the cumulative sum. Defaults to `FALSE`.

Details

group_cumsum sums value by group, then cumulatively sums within groups. For example, a running sum by trip variable can be made by entering variables that identify unique vessels and trips into group and a numeric variable (such as catch or # of hauls) into value. Each vessel's trip total is calculated then cumulatively summed. The "group_total" variable gives the total value by group and can be dropped by setting drop_total_col = TRUE. A group ID column can be created using the variables in group by setting create_group_ID = TRUE.
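
The two steps described above (sum by group, then running sum) can be sketched in base R. This is an illustrative approximation, not group_cumsum's implementation; the toy data and the column name `catch` are assumptions, and trips are ordered by TRIP_ID here rather than by a date variable.

```r
dat <- data.frame(
  PERMIT = c(1, 1, 1, 2, 2),
  TRIP_ID = c(1, 1, 2, 1, 2),
  catch = c(10, 5, 20, 7, 3)
)

# Step 1: total catch per vessel and trip ("group_total").
trip <- aggregate(catch ~ PERMIT + TRIP_ID, data = dat, FUN = sum)
names(trip)[names(trip) == "catch"] <- "group_total"

# Step 2: order trips within each vessel, then take the running sum of
# the trip totals by vessel.
trip <- trip[order(trip$PERMIT, trip$TRIP_ID), ]
trip$group_cumsum <- ave(trip$group_total, trip$PERMIT, FUN = cumsum)

trip
```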

Examples

## Not run: 
group_cumsum(pollockMainDataTable, "pollock", group = c("PERMIT", "TRIP_ID"),
             sort_by = "HAUL_DATE", value = "OFFICIAL_TOTAL_CATCH")

## End(Not run)

Create a within-group lagged difference variable

Description

Create a within-group lagged difference variable

Usage

group_diff(
  dat,
  project,
  group,
  sort_by,
  value,
  name = "group_diff",
  lag = 1,
  create_group_ID = FALSE,
  drop_total_col = FALSE
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	String, project name.
`group`	String, the grouping variable(s) to sum `value` by. Used to create the "group_total" variable.
`sort_by`	String, a date variable to order 'MainDataTable' by.
`value`	String, the value variable used to calculate lagged difference. Must be numeric.
`name`	String, the name for the new variable. Defaults to "group_diff".
`lag`	Integer, adjusts lag length. Defaults to 1.
`create_group_ID`	Logical, whether to create a group ID variable using `ID_var`. Defaults to `FALSE`.
`drop_total_col`	Logical, whether to remove the "group_total" variable created to calculate the lagged difference. Defaults to `FALSE`.

Details

group_diff creates a grouped lagged difference variable. value is first summed by the variable(s) in group, then the difference within-group is calculated. The "group_total" variable gives the total value by group and can be dropped by setting drop_total_col = TRUE. A group ID column can be created using the variables in group by setting create_group_ID = TRUE.
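
A grouped lagged difference of this kind can be sketched in base R. The sketch is illustrative only and is not group_diff's implementation; the toy data are assumptions, and filling the first observation of each group with NA is a choice made for this example (the text above does not say how FishSET fills it).

```r
dat <- data.frame(
  PERMIT = c(1, 1, 1, 2, 2),
  TRIP_ID = c(1, 1, 2, 1, 2),
  catch = c(10, 5, 20, 7, 3)
)

# Sum the value by group first ("group_total").
trip <- aggregate(catch ~ PERMIT + TRIP_ID, data = dat, FUN = sum)
names(trip)[names(trip) == "catch"] <- "group_total"
trip <- trip[order(trip$PERMIT, trip$TRIP_ID), ]

# Lag-1 difference of trip totals within each vessel; the first trip of
# each vessel has no earlier trip, so it is set to NA in this sketch.
trip$group_diff <- ave(trip$group_total, trip$PERMIT,
                       FUN = function(x) c(NA, diff(x, lag = 1)))

trip
```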

Examples

## Not run: 
group_diff(pollockMainDataTable, "pollock", group = c("PERMIT", "TRIP_ID"),
           sort_by = "HAUL_DATE", value = "HAUL")

## End(Not run)

Create a within-group percentage variable

Description

Create a within-group percentage variable

Usage

group_perc(
  dat,
  project,
  id_group,
  group = NULL,
  value,
  name = "group_perc",
  create_group_ID = FALSE,
  drop_total_col = FALSE
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	String, project name.
`id_group`	String, primary grouping variable(s). Used to create the "total_value" variable which sums `value` by `id_group`. If `group = NULL`, then `value` is divided by "total_value".
`group`	String, secondary grouping variable(s). Used to create the "group_total" variable which sums `value` by `id_group` and `group`. Percentage is calculated by dividing "group_total" by "total_value". Defaults to `NULL`.
`value`	String, the value variable used to calculate percentage. Must be numeric.
`name`	String, the name for the new variable. Defaults to "group_perc".
`create_group_ID`	Logical, whether to create a group ID variable using `ID_var`. Defaults to `FALSE`.
`drop_total_col`	Logical, whether to remove the "total_value" and "group_total" variables created to calculate percentage. Defaults to `FALSE`.

Details

group_perc creates a within-group percentage variable using a primary group ID (id_group) and secondary group (group). The total value of id_group is stored in the "total_value" variable, and the within-group total is stored in "group_total". The group percentage is calculated using these two function-created variables. "total_value" and "group_total" can be dropped by setting drop_total_col = TRUE. A group ID column can be created using the variables in id_group and group by setting create_group_ID = TRUE.
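
The relationship between "total_value", "group_total", and the percentage can be sketched in base R. This is an illustration, not group_perc's implementation; the toy data are assumptions, and multiplying by 100 is a choice made for this example.

```r
dat <- data.frame(
  PERMIT = c(1, 1, 1, 2, 2),
  PORT = c("A", "A", "B", "A", "B"),
  catch = c(10, 5, 15, 6, 2)
)

# "total_value": sum of catch by the primary group (PERMIT).
dat$total_value <- ave(dat$catch, dat$PERMIT, FUN = sum)
# "group_total": sum of catch by PERMIT and the secondary group (PORT).
dat$group_total <- ave(dat$catch, dat$PERMIT, dat$PORT, FUN = sum)
# Within-group percentage (scaled to 0-100 in this sketch).
dat$group_perc <- 100 * dat$group_total / dat$total_value

dat
```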

Examples

## Not run: 
group_perc(pollockMainDataTable, "pollock", id_group = "PERMIT", group = NULL, 
           value = "OFFICIAL_TOTAL_CATCH_MT")
           
group_perc(pollockMainDataTable, "pollock", id_group = "PERMIT",
           group = "DISEMBARKED_PORT", value = "HAUL")

## End(Not run)

Collapse data frame from haul to trip

Description

Collapse data frame from haul to trip

Usage

haul_to_trip(
  dat,
  project,
  fun.numeric = mean,
  fun.time = mean,
  tripID,
  haul_count = TRUE,
  log_fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`fun.numeric`	How to collapse numeric or temporal data. For example, `min`, `mean`, `max`, `sum`. Defaults to `mean`.
`fun.time`	How to collapse temporal data. For example, `min`, `mean`, `max`. Cannot be `sum` for temporal variables.
`tripID`	Column(s) that identify the individual trip.
`haul_count`	Logical, whether to return a column of the number of hauls per trip.
`log_fun`	Logical, whether to log function call (for internal use).

Details

Collapses primary dataset from haul to trip level. Unique trips are defined based on selected column(s), for example, landing permit number and disembarked port. This id column is used to collapse the data to trip level. fun.numeric and fun.time define how multiple observations for a trip are collapsed. For variables that are not numeric or dates, the first observation is used.
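
The collapsing rules described above can be sketched with base R's aggregate. This is an approximation for illustration, not haul_to_trip's implementation; the toy `hauls` data and its column names are assumptions.

```r
hauls <- data.frame(
  PERMIT = c(1, 1, 2),
  PORT = c("A", "A", "B"),
  catch = c(10, 20, 5),
  gear = c("trawl", "trawl", "pot"),
  stringsAsFactors = FALSE
)

trip_id <- c("PERMIT", "PORT")  # columns that define a unique trip

# Numeric columns are collapsed with a summary function (mean here).
num <- aggregate(hauls["catch"], by = hauls[trip_id], FUN = mean)
# Non-numeric columns keep the first observation of each trip.
chr <- aggregate(hauls["gear"], by = hauls[trip_id],
                 FUN = function(x) x[1])
# Number of hauls per trip.
n <- aggregate(hauls["catch"], by = hauls[trip_id], FUN = length)
names(n)[names(n) == "catch"] <- "haul_count"

# Join the pieces back together: one row per trip.
trips <- Reduce(function(a, b) merge(a, b, by = trip_id), list(num, chr, n))
trips
```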

Value

Returns the primary dataset where each row is a trip.

Examples

## Not run: 
pollockMainDataTable <- haul_to_trip("pollockMainDataTable", "pollock",
    fun.numeric = min, fun.time = mean, tripID = c("PERMIT", "DISEMBARKED_PORT")
    )

## End(Not run)

Create ID variable

Description

Create ID variable from one or more variables

Usage

ID_var(
  dat,
  project,
  vars,
  name = NULL,
  type = "string",
  drop = FALSE,
  sep = "_",
  log_fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`vars`	Character string, additional column(s) in `dat` that define unique observations.
`name`	String, name of new ID column.
`type`	String, the class type of the new ID column. Choices are 'string' or 'integer'. 'string' returns a character vector where each column in `vars` is combined and separated by `sep`. 'integer' returns an integer vector where each value corresponds to a unique group in `vars`.
`drop`	Logical, whether to drop columns in `vars`.
`sep`	Symbol used to combine variables.
`log_fun`	Logical, whether to log function call (for internal use).

Details

The ID variable can be based on a single variable or multiple variables. Use drop = TRUE to drop the variables used to create the ID variable.
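
The two ID types can be sketched in base R. This is an illustration, not ID_var's implementation; the toy data and the column names `id_string` and `id_integer` are assumptions, and numbering integer IDs by order of first appearance is a choice made for this example.

```r
dat <- data.frame(
  GEAR_TYPE = c("trawl", "trawl", "pot", "trawl"),
  TRIP_SEQ = c(1, 2, 1, 1),
  stringsAsFactors = FALSE
)
vars <- c("GEAR_TYPE", "TRIP_SEQ")

# String ID: paste the columns together with a separator.
dat$id_string <- do.call(paste, c(dat[vars], sep = "_"))

# Integer ID: one integer per unique combination, numbered in order of
# first appearance.
dat$id_integer <- match(dat$id_string, unique(dat$id_string))

dat
```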

Value

Returns the 'MainDataTable' with the ID variable included.

Examples

## Not run: 
pcodMainDataTable <- ID_var(pcodMainDataTable, "pcod", name = "PermitID", 
        vars = c("GEAR_TYPE", "TRIP_SEQ"), type = 'integer')
pcodMainDataTable <- ID_var(pcodMainDataTable, "pcod", name = "PermitID", 
        vars = c("GEAR_TYPE", "TRIP_SEQ"), type = 'string', sep="_")

## End(Not run)

Insert plot from user folder

Description

Insert plot from user folder

Usage

insert_plot(out, project)

Arguments

`out`	String, plot file name.
`project`	Name of project.

Examples

## Not run: 
insert_plot("pollock_plot.png", project = "pollock")

## End(Not run)

Insert table from user folder

Description

Insert table from user folder

Usage

insert_table(out, project)

Arguments

`out`	String, table file name.
`project`	Name of project.

Examples

## Not run: 
insert_table("pollock_table.csv", project = "pollock")

## End(Not run)

Jitter longitude and latitude variables

Description

Jitter longitude and latitude variables

Usage

jitter_lonlat(dat, project, lon, lat, factor = 1, amount = NULL)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`lon`	String, variable name containing longitude.
`lat`	String, variable name containing latitude.
`factor`	Numeric, see `jitter` for details.
`amount`	Numeric, see `jitter` for details. Default (NULL): factor * d/5 where d is about the smallest difference between x values.

Details

This is one of the FishSET confidentiality functions. It "jitters" longitude and latitude using the base R function jitter.
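
The effect of the base R jitter function on coordinates can be seen in a short sketch. The coordinate values are made up for illustration, and this shows only jitter itself, not any extra processing jitter_lonlat may do.

```r
set.seed(1)  # make the random noise reproducible
lon <- c(-165.20, -165.20, -164.85)
lat <- c(55.10, 55.10, 54.90)

# `amount` bounds the noise: each value moves by at most 0.01 degrees.
lon_jit <- jitter(lon, amount = 0.01)
lat_jit <- jitter(lat, amount = 0.01)

max(abs(lon_jit - lon))  # never larger than 0.01
```

A larger amount gives stronger masking of true locations at the cost of spatial accuracy.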

Examples

## Not run: 
jitter_lonlat(pollockMainDataTable, "pollock",
              lon = "LonLat_START_LON", lat = "LonLat_START_LAT")

## End(Not run)

View list of all log files

Description

View list of all log files

Usage

list_logs(project = NULL, chron = FALSE, modified = FALSE)

Arguments

`project`	Project name. Displays all logs if NULL.
`chron`	Logical, whether to display logs in chronological order (TRUE) or reverse chronological order (FALSE).
`modified`	Logical, whether to include date modified.

Display FishSET database tables by type

Description

Show project table names by table type. To see all tables for all projects in the FishSETFolder, use fishset_tables.

Usage

list_tables(project, type = "main")

Arguments

project

Project name to show tables for.

type

The type of fishset_db table to search for. Options include "main" (MainDataTable), "port" (PortTable), "spat" (SpatTable), "grid" (GridTable), "aux" (AuxTable), "ec" (ExpectedCatch), "altc" (AltMatrix), "info" (MainDataTableInfo), "gc" (ldglobalcheck), "fleet" (FleetTable), "filter" (FilterTable), "centroid" (Centroid or FishCentroid), "model" (ModelOut), "model data" or "model design" (ModelInputData), and "outsample" (OutSampleDataTable).

Examples

## Not run: 
list_tables("pollock", type = "main")
list_tables("pollock", "ec")

## End(Not run)

Import, parse, and save auxiliary data to FishSET database

Description

Auxiliary data is additional data that connects to the primary dataset. The function pulls the data, parses it, and then saves it to the FishSET database. A project must exist before running load_aux(). See load_maindata to create a new project.

Usage

load_aux(dat, aux, name, over_write = TRUE, project = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`aux`	File name, including path of auxiliary data.
`name`	Name auxiliary data should be saved as in FishSET database.
`over_write`	Logical, If TRUE, saves data over previously saved data table in the FishSET database.
`project`	String, name of project.

Details

Auxiliary data is any additional data beyond the primary data and the port data. Auxiliary data can be any data that can be merged with the primary dataset (e.g., prices by date, vessel characteristics, or fishery season). The auxiliary data does not have to be at a haul or trip level but must contain a variable to connect the auxiliary data to the primary dataset. The function checks that at least one column name of the auxiliary data matches a column name in the primary dataset. The function checks that each row is unique, that no variables are empty, and that column names are case-insensitive unique. These data issues are resolved before the data is saved to the database. The data is saved in the FishSET database as the raw data and the working data. The naming convention for auxiliary tables is "projectNameAuxTable". The date is also added to the name for the raw data. See table_view to view/load auxiliary tables into the working environment.

Examples

## Not run: 
load_aux(pcodMainDataTable, name = 'FisherySeason', over_write = TRUE, 
         project = 'pcod')

## End(Not run)

Load data from FishSET database into the R environment

Description

Load data from FishSET database into the R environment

Usage

load_data(project, name = NULL)

Arguments

`project`	String, name of project.
`name`	Optional, name of table in FishSET database. Use this argument if pulling raw or dated table (not the working table).

Details

Pulls the primary data table from the FishSET database and loads it into the working environment as the project and MainDataTable. For example, if the project was pollock, then data would be saved to the working environment as 'pollockMainDataTable'.

Value

Data loaded to working environment as the project and ‘MainDataTable’.

Examples

## Not run: 
load_data('pollock')

load_data('pollock', 'pollockMainDataTable20190101')

## End(Not run)

Import, parse, and save gridded data to FishSET database

Description

Gridded data is data that varies by two dimensions. Column names must be zone names. Load, parse, and save gridded data to FishSET database. A project must exist before running load_grid(). See load_maindata to create a new project.

Usage

load_grid(grid, name, project, over_write = TRUE)

Arguments

`grid`	File name, including path, of gridded data.
`name`	Name gridded data should be saved as in FishSET database.
`project`	String, name of project.
`over_write`	Logical, If TRUE, saves data over previously saved data table in the FishSET database.

Details

Grid data is an optional data frame that contains a variable that varies by the map grid (e.g., sea surface temperature, wind speed). Data can also vary by a second dimension (e.g., date/time). Both dimensions in the gridded data file need to be variables included in the primary dataset. The grid locations (zones) must define the columns and the optional second dimension defines the rows. The row variable must have the exact name as the variable in the primary data frame that it will be linked to. The function DOES NOT check that column and row variables match a variable in the primary dataset. The function checks that each row is unique, that no variables are empty, and that column names are case-insensitive unique. These data issues are resolved before the data is saved to the database. The data is saved in the FishSET database as the raw data and the working data. In both cases, the table name is the project and the name argument. The date is attached to the name for the raw data. The naming convention for gridded tables is "projectNameGridTable". See table_view to view/load gridded tables into the working environment.
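
Because load_grid() does not check that the grid dimensions match the primary data, a manual check before loading may be useful. The sketch below shows the expected layout and two such checks in base R; the toy tables and the names `sst_grid`, `main`, `HAUL_DATE`, and `ZoneID` are assumptions for illustration.

```r
# Toy gridded table: zone IDs define the value columns; the row variable
# (HAUL_DATE here) is the optional second dimension.
sst_grid <- data.frame(
  HAUL_DATE = c("2020-01-01", "2020-01-02"),
  `610` = c(4.1, 4.3),
  `620` = c(5.0, 5.2),
  check.names = FALSE
)

# Toy primary data containing the same row variable and a zone column.
main <- data.frame(
  HAUL_DATE = c("2020-01-01", "2020-01-02", "2020-01-02"),
  ZoneID = c("610", "620", "610"),
  stringsAsFactors = FALSE
)

row_var <- "HAUL_DATE"
zone_cols <- setdiff(names(sst_grid), row_var)

# Manual checks on the two dimensions.
row_var %in% names(main)         # row variable exists in the primary data
all(main$ZoneID %in% zone_cols)  # every observed zone has a grid column
```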

Examples

## Not run: 
load_grid(grid = 'pcodMainDataTable', name = 'SeaSurfaceTemp', 
          over_write = TRUE, project = 'pcod')

## End(Not run)

Import, parse, and save data to the FishSET Database

Description

load_maindata() saves the primary dataset to the FishSET Database (located in the FishSETFolder) and is a required step. The primary data will also be loaded into the working environment as a dataframe named "projectMainDataTable". Running load_maindata() creates a new project directory in the FishSETFolder. To see a list of existing projects run projects() or open the FishSETFolder.

Usage

load_maindata(dat, project, over_write = FALSE, compare = FALSE, y = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. This can be the full path to the file, the name of a main table in the FishSET database, or a dataframe object in the working environment. Main tables in the FishSET database contain the string 'MainDataTable'. A complete list of FishSET tables can be displayed by running `fishset_tables()`.
`project`	String, name of project. Cannot contain spaces.
`over_write`	Logical, If `TRUE`, saves data over previously saved data table in the FishSET database. Defaults to `FALSE`.
`compare`	Logical, whether to compare new dataframe to previously saved dataframe `y`. See `fishset_compare`.
`y`	Name of previously saved table in FishSET Database. `y` must be defined if `compare = TRUE`.

Details

The dataset is saved in the FishSET database as raw and working tables. The table name is the project and the table type, 'MainDataTable'. The raw table is the original, unedited table. The working table contains any changes made to the table after uploading. An eight-digit date string is included in the name of the raw table (e.g. "pollockMainDataTable20220210"). The primary data is loaded into the working environment as 'projectMainDataTable'. If compare = TRUE, fishset_compare compares dat to the existing FishSET table named in y and returns a message noting basic differences between the two. The column names are checked to confirm they are unique when case is ignored.
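
As a rough illustration of the naming convention and the column-name check, the following base R sketch builds the working and raw table names for a project. The helper `table_names()` and the column names are hypothetical and are not part of FishSET.

```r
# Build the table names described above for a hypothetical project.
# table_names() is an illustrative helper, not a FishSET function.
table_names <- function(project, date = Sys.Date()) {
  working <- paste0(project, "MainDataTable")
  raw     <- paste0(working, format(date, "%Y%m%d"))  # eight-digit date string
  c(working = working, raw = raw)
}

tn <- table_names("pollock", as.Date("2022-02-10"))
tn

# Column names that differ only by case are not unique once case is ignored.
cols <- c("Haul", "HAUL", "Vessel_ID")
dup_cols <- cols[duplicated(tolower(cols))]
dup_cols
```

Here the raw table name is "pollockMainDataTable20220210", matching the example in the text, and "HAUL" is flagged as a case-insensitive duplicate of "Haul".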

Examples

## Not run: 
# upload data from filepath
load_maindata(dat = "PATH/TO/DATA", project = "pollock")

# upload from dataframe in working environment
load_maindata(dat = Mydata, project = 'pollock', over_write = TRUE, 
              compare = TRUE, y = 'MainDataTable01012011')
              
# upload from an existing FishSET primary data table
load_maindata(dat = "pollockMainDataTable", project = "pollock2020")

## End(Not run)

Import, parse, and save out-of-sample data to FishSET database

Description

load_outsample() saves an out-of-sample dataset to the FishSET Database (located in the FishSETFolder). The structure of the out-of-sample data must match the primary dataset. A project must exist before running load_outsample(). See load_maindata to create a new project. Note: if the data are out-of-sample temporally, upload a new data file; if the data are only out-of-sample spatially, upload the primary data file in this function.

Usage

load_outsample(dat, project, over_write = FALSE, compare = FALSE, y = NULL)

Arguments

`dat`	Out-of-sample data containing information on hauls or trips with the same structure as the primary data table. This can be the full path to the file, the name of an out-of-sample table in the FishSET database, or a dataframe object in the working environment. Out-of-sample tables in the FishSET database contain the string 'OutSampleDataTable'. A complete list of FishSET tables can be viewed by running `fishset_tables()`.
`project`	String, name of project.
`over_write`	Logical, If `TRUE`, saves data over previously saved data table in the FishSET database. Defaults to `FALSE`.
`compare`	Logical, whether to compare new dataframe to previously saved dataframe `y`. See `fishset_compare`.
`y`	Name of previously saved table in FishSET Database. `y` must be defined if `compare = TRUE`.

Details

The out-of-sample dataset is saved in the FishSET database as raw and working tables. The table name is the project and the table type, 'OutSampleDataTable'. The raw table is the original, unedited table. The working table contains any changes made to the table after uploading. An eight-digit date string is included in the name of the raw table (e.g. "pollockOutSampleDataTable20220210"). The out-of-sample data is loaded into the working environment as 'projectOutSampleDataTable'. If compare = TRUE, fishset_compare compares dat to the existing FishSET table named in y and returns a message noting basic differences between the two. The column names are checked to confirm they are unique when case is ignored.

Examples

## Not run: 
# upload data from filepath
load_outsample(dat = "PATH/TO/DATA", project = "pollock")

# upload from dataframe in working environment
load_outsample(dat = MyData, project = 'pollock', over_write = TRUE, 
              compare = TRUE, y = 'OutSampleDataTable01012011')
              
# upload from an existing FishSET out-of-sample data table
load_outsample(dat = "pollockOutSampleDataTable", project = "pollock")

## End(Not run)

Import, parse, and save port data to FishSET database

Description

A project must exist before running load_port(). See load_maindata to create a new project.

Usage

load_port(
  dat,
  port_name,
  project,
  over_write = TRUE,
  compare = FALSE,
  y = NULL
)

Arguments

`dat`	Dataset containing port data. At a minimum, it must include three columns: the port names, and the latitude and longitude of the ports. `dat` can be a filepath, an existing FishSET table, or a dataframe in the working environment.
`port_name`	Variable containing port names. Names should match port names in primary dataset.
`project`	String, name of project.
`over_write`	Logical, if TRUE, saves over data table previously saved in the FishSET database.
`compare`	Logical, should new data be compared to previously saved dataframe `y`.
`y`	Name of previously saved table in FishSET database. `y` must be defined if `compare` is TRUE.

Details

Runs a series of checks on the port data. The function checks that each row is unique, that no variables are empty, and that column names are unique when case is ignored. These data issues are resolved before the data is saved to the database. If checks pass, runs the fishset_compare function and saves the new data frame to the FishSET database. The data is saved in the FishSET database as the raw data and the working data. The naming convention for port tables is "projectPortTable". The date is also appended to the name of the raw data table. See table_view to view/load port tables into the working environment.
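
The sketch below shows, in base R, what a minimal port table could look like and how one might confirm beforehand that its port names match those in the primary dataset. The column names, coordinates, and port lists are illustrative assumptions; none of this is FishSET code.

```r
# Hypothetical port table with the three required columns
# (coordinates are approximate and for illustration only).
ports <- data.frame(
  Port_Name = c("Dutch Harbor", "Kodiak", "Kodiak"),
  Port_Lat  = c(53.89, 57.79, 57.79),
  Port_Long = c(-166.54, -152.41, -152.41)
)

# Drop repeated rows, mirroring the uniqueness check.
ports <- unique(ports)

# Port names used in a (hypothetical) primary dataset.
primary_ports <- c("Kodiak", "Dutch Harbor", "Sand Point")

# Names in the primary data with no match in the port table.
missing_ports <- setdiff(primary_ports, ports$Port_Name)
missing_ports
```

In this example "Sand Point" appears in the primary data but not in the port table, so it would need to be added or renamed before the two tables can be linked by port name.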

Examples

## Not run: 
load_port(PortTable, over_write = TRUE, project = 'pollock',
          compare = TRUE, y = 'pollockPortTable01012011')

## End(Not run)

Import, parse, and save spatial data

Description

Saves a spatial table to the FishSETFolder as a geojson file. A project must exist before running load_spatial(). See load_maindata to create a new project.

Usage

load_spatial(
  spat,
  name = NULL,
  over_write = TRUE,
  project,
  data.type = NULL,
  lon = NULL,
  lat = NULL,
  id = NULL,
  ...
)

Arguments

`spat`	File name, including path, of spatial data.
`name`	Name spatial data should be saved as in FishSET project folder. Cannot be empty or contain spaces.
`over_write`	Logical, If `TRUE`, saves `spat` over previously saved data table in the FishSET project folder.
`project`	String, name of project.
`data.type`	Data type argument passed to `read_dat`. If reading from a shape folder use `data.type = "shape"`.
`lon`	Variable or list from `spat` containing longitude data. Required for csv files. Leave as `NULL` if `spat` is a shape or json file.
`lat`	Variable or list from `spat` containing latitude data. Required for csv files. Leave as `NULL` if `spat` is a shape or json file.
`id`	Polygon ID column. Required for csv files. Leave as `NULL` if `spat` is a shape or json file.
`...`	Additional argument passed to `read_dat`.

Details

Imports, parses, and saves spatial data to the project folder in the 'FishSETFolder' directory. To export as a shapefile, use write_dat with type = 'shp'. load_spatial() performs basic quality checks before saving spatial tables to the project data folder as a geojson file. To be saved, the spatial table must pass the checks in check_spatdat. The spatial table is converted to an sf object and checked for unique rows and empty columns. The naming convention for spatial tables is "projectNameSpatTable". See table_view to view/load spatial tables into the working environment.

Examples

## Not run: 
# upload from filepath
load_spatial(spat = "FILE/PATH/TO/SPAT", name = 'tenMinSqr', 
             over_write = TRUE, project = 'pcod')

# upload from object in working environment
load_spatial(spat = NMFSAreas, name = "NMFS", project = "pcod")

# upload from an existing FishSET spatial table
load_spatial(spat = "pcodNMFSSpatTable", name = "NMFS", project = "pcod2020")

## End(Not run)

Log user-created functions or models

Description

Log user-created functions or models

Usage

log_func_model(x, project)

Arguments

`x`	Name of function.
`project`	Project name.

Details

Logs the function name, arguments, and call. Use this function to log user-defined likelihood functions.

Examples

## Not run: 
my_func <- function(a, b) {
  a + b
}
log_func_model(my_func, project = "pollock")

## End(Not run)

Console function for rerunning project log

Description

Console function for rerunning project log

Usage

log_rerun(
  log_file,
  dat = NULL,
  portTable = NULL,
  aux = NULL,
  gridfile = NULL,
  spat = NULL,
  ind = NULL,
  run = FALSE
)

Arguments

`log_file`	String, name of the log file starting with the date (YYYY-MM-DD) and ending in ".json".
`dat`	String, name of a new primary data table to rerun the log with. Defaults to NULL.
`portTable`	String, name of port table. Defaults to NULL.
`aux`	String, name of auxiliary table. Defaults to NULL.
`gridfile`	String, name of gridded data table. Defaults to NULL.
`spat`	String, name of spatial data table. Defaults to NULL.
`ind`	Numeric, indices of function calls to rerun.
`run`	Logical, whether to run the logged function calls (TRUE) or simply list all function calls (FALSE).

Examples

## Not run: 
log_rerun("pollock_2020-10-23.json", run = TRUE) # reruns entire log with original data table
# runs log with new data table
log_rerun("pollock_2020-10-23.json", dat = "pollockMainDataTable", run = TRUE) 

## End(Not run)

Interactive function for rerunning project log

Description

Interactive function for rerunning project log

Usage

log_rerun_gui()

Examples

## Not run: 
log_rerun_gui()

## End(Not run)

Reset log file

Description

Reset log file

Usage

log_reset(project, over_write = FALSE)

Arguments

`project`	Project name.
`over_write`	Logical, whether to overwrite an existing log file. This only applies if a log was created and reset on the same day for the same project. See "Details".

Details

Logs are saved by project name and date (date created, not date modified). For example, "pollock_2021-05-12.json". Calls to log functions are automatically appended to the existing project log file. Resetting the log file will create a new project log file with the current date. A log will not be reset if log_reset() is run the same day the log was created (or if the log is reset two or more times in a single day), unless over_write = TRUE. This will replace that day's log file.
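
The naming and reset rules above can be sketched in base R as follows. Both helpers (`log_name()` and `can_reset()`) are hypothetical illustrations of the described behavior, not functions exported by FishSET.

```r
# Illustrative helpers; neither is part of FishSET.
log_name <- function(project, date = Sys.Date()) {
  paste0(project, "_", format(date, "%Y-%m-%d"), ".json")
}

# Returns TRUE when a reset would create or replace a log file.
can_reset <- function(project, existing_logs, today = Sys.Date(),
                      over_write = FALSE) {
  todays_log <- log_name(project, today)
  !(todays_log %in% existing_logs) || over_write
}

logs <- c("pollock_2021-05-12.json")

# Same day as the existing log: blocked unless over_write = TRUE.
same_day       <- can_reset("pollock", logs, today = as.Date("2021-05-12"))
same_day_force <- can_reset("pollock", logs, today = as.Date("2021-05-12"),
                            over_write = TRUE)
# A later day: a new log file with the new date would be created.
next_day       <- can_reset("pollock", logs, today = as.Date("2021-05-13"))

c(same_day = same_day, same_day_force = same_day_force, next_day = next_day)
```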

Examples

## Not run: 
log_reset("pollock")

## End(Not run)

Conditional logit likelihood

Description

Conditional logit likelihood

Usage

logit_c(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting values as a vector (num). For this likelihood, the order takes: c([alternative-specific parameters], [travel-distance parameters]). The alternative-specific parameters and travel-distance parameters are of length (# of alternative-specific variables) and (# of travel-distance variables) respectively.
`dat`	Data matrix, see output from shift_sort_x, alternatives with distance.
`otherdat`	Other data used in model (as a list containing objects 'intdat' and 'griddat'). For this likelihood, 'intdat' contains the 'travel-distance variables', which are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter. 'griddat' contains the 'alternative-specific variables', which vary across alternatives, e.g. catch rates. Each variable name therefore corresponds to data with dimensions (number of observations) by (number of alternatives), and returns a single parameter for each variable (e.g. the marginal utility from catch). For both objects any number of variables are allowed, as a list of matrices. Note the variables (each as a matrix) within 'griddat' and 'intdat' have no naming restrictions. 'Alternative-specific variables' may correspond to catches that vary by location, and 'travel-distance variables' may be vessel characteristics that affect how much disutility is suffered by traveling a greater distance. Note in this likelihood 'alternative-specific variables' vary across alternatives because each variable may have been estimated in a previous procedure (i.e. a construction of expected catch). If there are no other data, the user can set 'griddat' as ones with dimension (number of observations) by (number of alternatives) and 'intdat' variables as ones with dimension (number of observations) by (unity).
`alts`	Number of alternative choices in model as length equal to unity (as a numeric vector).
`project`	Name of project
`expname`	Expected catch table
`mod.name`	Name of model run for model result output table

Value

ld: negative log likelihood
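
For intuition, the following base R sketch computes a conditional logit negative log likelihood on simulated data, with one alternative-specific variable and one travel-distance variable. It is a simplified illustration under stated assumptions, not the logit_c() source: the function `clogit_nll()`, the simulated inputs, and the parameter layout `c(beta, gamma)` are all hypothetical.

```r
set.seed(1)
n <- 50   # observations
k <- 4    # alternatives

# Simulated inputs (all hypothetical).
catch_exp  <- matrix(runif(n * k), n, k)         # alternative-specific (n x k)
vessel_len <- matrix(runif(n), n, 1)             # travel-distance variable (n x 1)
distance   <- matrix(runif(n * k, 1, 10), n, k)  # distance to each alternative
choice     <- sample(seq_len(k), n, replace = TRUE)

# Negative log likelihood; params = c(beta_catch, gamma_distance).
clogit_nll <- function(params, griddat, intdat, distance, choice) {
  # Utility: alternative-specific term plus variable-by-distance cost term.
  v <- params[1] * griddat + params[2] * as.vector(intdat) * distance
  v <- v - apply(v, 1, max)                      # guard against overflow
  log_p <- v - log(rowSums(exp(v)))              # log choice probabilities
  -sum(log_p[cbind(seq_along(choice), choice)])  # chosen alternative only
}

# With all parameters at zero every alternative is equally likely,
# so the value equals n * log(k).
nll0 <- clogit_nll(c(0, 0), catch_exp, vessel_len, distance, choice)

fit <- optim(c(0.5, -0.5), clogit_nll, griddat = catch_exp, intdat = vessel_len,
             distance = distance, choice = choice, method = "BFGS")
fit$par
```

The two starting values mirror the ordering described for `starts3`: alternative-specific parameters first, then travel-distance parameters.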

Graphical examples

Figure: logit_c_grid.png
Figure: logit_c_travel.png

Examples

## Not run: 
data(zi)
data(catch)
data(choice)
data(distance)
data(si)

optimOpt <- c(1000,1.00000000000000e-08,1,0)

methodname <- 'BFGS'

kk <- 4

si2 <- matrix(sample(1:5,dim(si)[1]*kk,replace=TRUE),dim(si)[1],kk)
zi2 <- sample(1:10,dim(zi)[1],replace=TRUE)

otherdat <- list(griddat=list(predicted_catch=as.matrix(predicted_catch),
    si2=as.matrix(si2)), intdat=list(zi=as.matrix(zi),
    zi2=as.matrix(zi2)))

initparams <- c(2.5, 2, -1, -2)

func <- logit_c

results <- discretefish_subroutine(catch,choice,distance,otherdat,
    initparams,optimOpt,func,methodname)

## End(Not run)

Full information model with Dahl's correction function

Description

Full information model with Dahl's correction function

Usage

logit_correction(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting values as a vector (num). For this likelihood, the order takes: c([marginal utility from catch], [catch-function parameters], [polynomial starting parameters], [travel-distance parameters], [catch sigma]). The number of polynomial interaction terms is currently set to 2, so given the chosen degree 'polyn' there should be (((polyn+1)*2) + 2)*(k) polynomial starting parameters, where (k) equals the number of alternatives. The marginal utility from catch and catch sigma are each of length equal to unity. The catch-function and travel-distance parameters are of length (# of catch variables)*(k) and (# of cost variables) respectively.
`dat`	Data matrix, see output from shift_sort_x, alternatives with distance.
`otherdat`	Other data used in model (as a list containing objects 'griddat', 'intdat', 'startloc', 'polyn', and 'distance'). Catch-function variables ('griddat') are alternative-invariant variables that are interacted with zonal constants to form the catch portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k) parameters where (k) equals the number of alternatives. Travel-distance variables ('intdat') are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter. Any number of catch-function and travel-distance variables are allowed, as a list of matrices. Note the variables (each as a matrix) within 'griddat' and 'intdat' have no naming restrictions. Catch-function variables may correspond to variables that affect catches across locations, and travel-distance variables may be vessel characteristics that affect how much disutility is suffered by traveling a greater distance. Note in this likelihood the catch-function variables vary across observations but not for each location: they are allowed to affect catches across locations due to the location-specific coefficients. If there are no other data, the user can set catch-function variables as ones with dimension (number of observations) by (number of alternatives) and travel-distance variables as ones with dimension (number of observations) by (unity). The variable startloc is a matrix of dimension (number of observations) by (unity) that corresponds to the starting location when the agent decides between alternatives. The variable polyn is a vector of length equal to unity corresponding to the chosen polynomial degree. The variable distance is a matrix of dimension (number of observations) by (number of alternatives) corresponding to the distance to each alternative.
`alts`	Number of alternative choices in model as length equal to unity (as a numeric vector).
`project`	Name of project
`expname`	Expected catch table
`mod.name`	Name of model run for model result output table

Value

ld: negative log likelihood
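
Because `starts3` has several blocks of differing length, it can help to count the expected starting values before calling the model. The base R sketch below does this using the same `(((polyn+1)*2) + 2)*kk` expression that appears in the example for this function. The helper `n_starts_correction()` is hypothetical and not part of FishSET.

```r
# Illustrative helper (not a FishSET function) that counts the starting
# values expected by the ordering described for starts3.
n_starts_correction <- function(n_catch_vars, n_cost_vars, polyn, k) {
  c(
    marginal_utility = 1,
    catch_function   = n_catch_vars * k,
    polynomial       = (((polyn + 1) * 2) + 2) * k,
    travel_distance  = n_cost_vars,
    catch_sigma      = 1
  )
}

# Two catch-function variables, two travel-distance variables,
# polynomial degree 3, and four alternatives.
counts <- n_starts_correction(n_catch_vars = 2, n_cost_vars = 2,
                              polyn = 3, k = 4)
counts
sum(counts)  # 1 + 8 + 40 + 2 + 1 = 52
```

With these settings the total is 52, which is the length of the `initparams` vector built in the example for this function.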

Graphical examples

Figure: logit_correction_grid.png
Figure: logit_correction_travel.png
Figure: logit_correction_poly.png

Examples

## Not run: 
data(zi)
data(catch)
data(choice)
data(distance)
data(si)
data(startloc)

optimOpt <- c(1000,1.00000000000000e-08,1,0)

methodname <- 'BFGS'

polyn <- 3
kk <- 4

si2 <- sample(1:5,dim(si)[1],replace=TRUE)
zi2 <- sample(1:10,dim(zi)[1],replace=TRUE)

otherdat <- list(griddat=list(si=as.matrix(si),si2=as.matrix(si2)),
    intdat=list(zi=as.matrix(zi),zi2=as.matrix(zi2)),
    startloc=as.matrix(startloc),polyn=polyn,
    distance=as.matrix(distance))

initparams <- c(3, 0.5, 0.4, 0.3, 0.2, 0.55, 0.45, 0.35, 0.25,
    rep(0, (((polyn+1)*2) + 2)*kk), -0.3,-0.4, 3)

func <- logit_correction

results <- discretefish_subroutine(catch,choice,distance,otherdat,
    initparams,optimOpt,func,methodname)

## End(Not run)

Zonal logit with area-specific constants procedure

Description

Zonal logit with area-specific constants procedure

Usage

logit_zonal(starts3, dat, otherdat, alts, project, expname, mod.name)

Arguments

`starts3`	Starting values as a vector (num). For this likelihood, the order takes: c([area-specific parameters], [travel-distance parameters]). The area-specific parameters and travel-distance parameters are of length (# of area-specific parameters)*(k-1) and (# of travel-distance variables) respectively, where (k) equals the number of alternatives.
`dat`	Data matrix, see output from shift_sort_x, alternatives with distance.
`otherdat`	Other data used in model (as a list containing objects 'intdat' and 'griddat'). For this likelihood, 'intdat' contains the 'travel-distance variables', which are alternative-invariant values that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter. 'griddat' contains the 'area-specific parameters', which do not vary across alternatives, e.g. vessel gross tonnage. Each constant name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k-1) parameters where (k) equals the number of alternatives, as a normalization of parameters is needed because the probabilities sum to one. Interpretation is therefore relative to the first alternative. For both objects any number of variables are allowed, as a list of matrices. Note the variables (each as a matrix) within 'griddat' and 'intdat' have no naming restrictions. 'Area-specific parameters' may correspond to variables that impact average catches by location, and 'travel-distance variables' may be vessel characteristics that affect how much disutility is suffered by traveling a greater distance. Note in this likelihood the 'area-specific parameters' vary across observations but not for each location: they are allowed to affect alternatives differently due to the location-specific coefficients. If there are no other data, the user can set 'griddat' as ones with dimension (number of observations) by (unity) and 'intdat' variables as ones with dimension (number of observations) by (unity).
`alts`	Number of alternative choices in model as length equal to unity (as a numeric vector).
`project`	Name of project
`expname`	Expected catch table
`mod.name`	Name of model run for model result output table

Value

ld: negative log likelihood
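
The (k-1) normalization can be seen in a small base R sketch of a zonal logit negative log likelihood. This is a simplified illustration on simulated data, assuming one area-specific variable and one travel-distance variable. The function `zonal_nll()` and all inputs are hypothetical and do not reproduce the logit_zonal() source.

```r
set.seed(2)
n <- 40   # observations
k <- 4    # alternatives

tonnage  <- runif(n)                              # area-specific variable (n x 1)
crew     <- runif(n)                              # travel-distance variable (n x 1)
distance <- matrix(runif(n * k, 1, 10), n, k)
choice   <- sample(seq_len(k), n, replace = TRUE)

# params = c(k - 1 area-specific coefficients, 1 travel-distance coefficient)
zonal_nll <- function(params, x, z, distance, choice) {
  k     <- ncol(distance)
  beta  <- c(0, params[seq_len(k - 1)])           # first alternative fixed at 0
  gamma <- params[k]
  v     <- outer(x, beta) + gamma * z * distance  # n x k utilities
  log_p <- v - log(rowSums(exp(v)))
  -sum(log_p[cbind(seq_along(choice), choice)])
}

start <- c(rep(0, k - 1), 0)
length(start)   # (1 area-specific variable) * (k - 1) + 1 travel-distance = 4

# At all-zero parameters every alternative is equally likely: n * log(k).
nll_start <- zonal_nll(start, tonnage, crew, distance, choice)
nll_start
```

Fixing the first alternative's coefficient at zero is what makes the remaining coefficients interpretable relative to the first alternative.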

Graphical examples

Figure: logit_zonal_grid.png
Figure: logit_zonal_travel.png

Examples

## Not run: 
data(zi)
data(catch)
data(choice)
data(distance)
data(si)

optimOpt <- c(1000,1.00000000000000e-08,1,0)

methodname <- 'BFGS'

si2 <- sample(1:5,dim(si)[1],replace=TRUE)
zi2 <- sample(1:10,dim(zi)[1],replace=TRUE)

otherdat <- list(griddat=list(si=as.matrix(si),si2=as.matrix(si2)),
    intdat=list(zi=as.matrix(zi),zi2=as.matrix(zi2)))

initparams <- c(1.5, 1.25, 1.0, 0.9, 0.8, 0.75, -1, -0.5)

func <- logit_zonal

results <- discretefish_subroutine(catch,choice,distance,otherdat,
    initparams,optimOpt,func,methodname)

## End(Not run)

Assign longitude and latitude points to zonal centroid

Description

Assign longitude and latitude points to zonal centroid

Usage

lonlat_to_centroid(dat, project, lon, lat, spat, zone)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`lon`	String, variable name containing longitude.
`lat`	String, variable name containing latitude.
`spat`	Spatial data table containing regulatory zones. This can be a "spatial feature" or sf object.
`zone`	String, name of the column containing the assigned zone. Must be the same for both the spatial data table and MainDataTable.

Details

This is one of the FishSET confidentiality functions. It replaces the selected longitude and latitude columns with the zonal centroid derived from a spatial data table.
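
The replacement step can be pictured with a base R sketch that swaps each point for the centroid of its assigned zone. The data frames, centroid coordinates, and the `cent_lon`/`cent_lat` column names are illustrative assumptions. In FishSET the centroids are derived from the spatial data table rather than supplied by hand.

```r
# Hypothetical primary data with confidential haul locations.
hauls <- data.frame(
  NMFS_AREA        = c(509, 517, 509),
  LonLat_START_LON = c(-165.2, -168.9, -164.7),
  LonLat_START_LAT = c(55.1, 56.4, 55.6)
)

# Hypothetical centroid table, one row per zone.
centroids <- data.frame(
  NMFS_AREA = c(509, 517),
  cent_lon  = c(-164.0, -169.5),
  cent_lat  = c(56.0, 56.5)
)

# Replace each point with the centroid of its assigned zone.
idx <- match(hauls$NMFS_AREA, centroids$NMFS_AREA)
hauls$LonLat_START_LON <- centroids$cent_lon[idx]
hauls$LonLat_START_LAT <- centroids$cent_lat[idx]
hauls
```

After the replacement, hauls in the same zone share identical coordinates, so individual fishing locations can no longer be recovered from the table.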

Examples

## Not run: 
lonlat_to_centroid(pollockMainDataTable, "pollock", spatdat, 
                  lon = "LonLat_START_LON", lat = "LonLat_START_LAT",
                  zone = "NMFS_AREA")

## End(Not run)

View list of MainDataTables in FishSET database

Description

View list of MainDataTables in FishSET database

Usage

main_tables(project, show_all = TRUE)

Arguments

`project`	A project name to filter main tables by.
`show_all`	Logical, whether to show all main tables (including raw and final tables) or just editable tables.

Examples

## Not run: 
main_tables("pollock")

## End(Not run)

Make model design file

Description

Create a list containing the likelihood function, parameters, and data to be passed to the model call function.

Usage

make_model_design(
  project,
  catchID,
  likelihood = NULL,
  initparams = NULL,
  optimOpt = c(100, 1e-08, 1, 1),
  methodname = "BFGS",
  mod.name = NULL,
  vars1 = NULL,
  vars2 = NULL,
  priceCol = NULL,
  expectcatchmodels = list("all"),
  startloc = NULL,
  polyn = NULL,
  crs = NULL,
  outsample = FALSE,
  CV_dat = NULL
)

Arguments

project

String, name of project.

catchID

String, variable from dat that contains catch data.

likelihood

String, name of likelihood function. A description of explanatory variables for each likelihood is provided below in the Details section. Information on likelihood-specific initial parameter specification can be found in the discretefish_subroutine() documentation.

logit_c:	Conditional logit likelihood
logit_zonal:	Zonal logit with area-specific constants procedure
logit_correction:	Full information model with Dahl's correction function
epm_normal:	Expected profit model with normal catch function
epm_weibull:	Expected profit model with Weibull catch function
epm_lognormal:	Expected profit model with lognormal catch function

initparams

Vector or list, initial parameter estimates for revenue/location-specific covariates then cost/distance. The number of parameter estimates varies by likelihood function. See the Details section for more information. The initial parameters will be set to 1 if initparams = NULL. If initparams is a single numeric value, it will be used for each parameter. If using parameter estimates from a previous model, initparams should be the name of the model the parameter estimates should come from. Examples: initparams = 'epm_mod1', initparams = list('epm_mod1', 'epm_mod2').

optimOpt

Numeric vector, optimization options: (max function evaluations, max iterations, (reltol) tolerance of x, trace). See stats::optim() for details on these options.

methodname

String, optimization method (see stats::optim() options). Defaults to "BFGS".

mod.name

String, name of model run for model result output table.

vars1

Character string, additional ‘travel-distance’ variables to include in the model. These depend on the likelihood. See the Details section for how to specify for each likelihood function.

vars2

Character string, additional variables to include in the model. These depend on the likelihood. See the Details section for how to specify for each likelihood function. For likelihood = 'logit_c', vars2 should be the name of the gridded table saved to the FishSET Database, and should contain the string "GridTableWide". See format_grid() for details.

priceCol

Variable in dat containing price information. Required if specifying an expected profit model for the likelihood (epm_normal, epm_weibull, epm_lognormal).

expectcatchmodels

List, names of expected catch models to include in the model run. Defaults to all models. Each list item should be a character vector of expected catch models to include in a model. For example, list(c('recent', 'older'), c('user1')) would run one model with the recent and older expected catch matrices, and one model with just the user-defined expected catch matrix. Choices are "recent", "older", "oldest", "logbook", "all", and "individual". See create_expectations() for details on the different models. Option "all" will run all expected catch matrices jointly. Option "individual" will run the model for each expected catch matrix separately. The final option is to select one or more expected catch matrices to run jointly.

startloc

Variable in dat identifying the location when choice of where to fish next was made. Required for logit_correction likelihood. Use the create_startingloc() function to create the starting location vector.

polyn

Numeric, correction polynomial degree. Required for the logit_correction likelihood.

crs

Coordinate reference system to be assigned when creating the distance matrix. Passed on to create_dist_matrix().

outsample

Logical, indicates whether the model design is for primary data (FALSE) or out-of-sample data (TRUE). The default is outsample = FALSE.

CV_dat

Dataframe that contains training or testing data for k-fold cross validation. Defaults to CV_dat = NULL.

Details

Function creates the model matrix list that contains the data and modeling choices. The model design list is saved to the FishSET database and called by the discretefish_subroutine(). Alternative fishing options come from the Alternative Choice list, generated from the create_alternative_choice() function, and the expected catch matrices from the create_expectations() function. The distance from the starting point to alternative choices is calculated.

Variable names details:

	vars1	vars2

logit_c:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"alternative-specific variables" vary across alternatives, e.g. catch rates. Each variable name therefore corresponds to data with dimensions (number of observations) by (number of alternatives), and returns a single parameter for each variable (e.g. the marginal utility from catch).

logit_zonal:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"average-catch variables" are alternative-invariant variables, e.g. vessel gross tonnage. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k-1) parameters where (k) equals the number of alternatives, as a normalization of parameters is needed as the probabilities sum to one. Interpretation is therefore relative to the first alternative.

epm_normal:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"catch-function variables" are alternative-invariant variables that are interacted with zonal constants to form the catch portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k) parameters where (k) equals the number of alternatives.

epm_lognormal:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"catch-function variables" are alternative-invariant variables that are interacted with zonal constants to form the catch portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k) parameters where (k) equals the number of alternatives.

epm_weibull:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"catch-function variables" are alternative-invariant variables that are interacted with zonal constants to form the catch portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k) parameters where (k) equals the number of alternatives.

logit_correction:	"travel-distance variables" are alternative-invariant variables that are interacted with travel distance to form the cost portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns a single parameter.	"catch-function variables" are alternative-invariant variables that are interacted with zonal constants to form the catch portion of the likelihood. Each variable name therefore corresponds to data with dimensions (number of observations) by (unity), and returns (k) parameters where (k) equals the number of alternatives.
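
The table above implies how many parameters each listed variable contributes. The following is a minimal sketch of that bookkeeping in base R. `count_var_params()` is a hypothetical helper, not part of FishSET, and it counts only the parameters attributable to vars1 and vars2, not any other parameters a likelihood may estimate.

```r
# Hypothetical helper (not a FishSET function): number of parameters
# contributed by vars1 and vars2 according to the table above.
#   n_vars1: number of travel-distance variables (one parameter each)
#   n_vars2: number of additional variables
#   k:       number of alternatives
count_var_params <- function(likelihood, n_vars1, n_vars2, k) {
  per_vars2 <- switch(likelihood,
    logit_c = 1,              # single parameter per variable
    logit_zonal = k - 1,      # normalized relative to the first alternative
    epm_normal = ,
    epm_lognormal = ,
    epm_weibull = ,
    logit_correction = k,     # one parameter per alternative
    stop("unknown likelihood: ", likelihood)
  )
  n_vars1 * 1 + n_vars2 * per_vars2
}

count_var_params("logit_zonal", n_vars1 = 1, n_vars2 = 2, k = 5)
count_var_params("epm_weibull", n_vars1 = 1, n_vars2 = 2, k = 5)
```

A count like this may help when sizing the part of initparams that corresponds to vars1 and vars2.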

Value

Function creates the model matrix list that contains the data and modeling choices. The model design list is saved to the FishSET database and called by the discretefish_subroutine(). Alternative fishing options come from the Alternative Choice list, generated from the create_alternative_choice() function, and the expected catch matrices from the create_expectations() function. The distance from the starting point to alternative choices is calculated.

Model design list:

likelihood:	Name of likelihood function
catch:	Data corresponding to actual zonal catch
catchID:	Character for the name of the variable with catch data
choice:	Data corresponding to actual zonal choice
initparms:	Initial parameter values
optimOpt:	Optimization options
methodname:	Optimization method
mod.name:	Model name for referencing
vars1:	Character vector for variables with 'travel-distance' variables
vars2:	Character vector for additional variables
priceCol:	Variable in dat with price information
mod.date:	Date the model was designed
startingloc:	Starting locations
scales:	Scale vectors to put catch data, zonal data, and other data on same scale
distance:	Data corresponding to distance
instances:	Number of observations
alts:	Number of alternative zones
epmDefaultPrice:	Price data
dataZoneTrue:	Vector of 0/1 indicating whether the data from that zone is to be included based on the minimum number of hauls.
typeOfNecessary:	Whether data is at haul or trip level
altChoiceType:	Function choice. Set to distance
altChoiceUnits:	Units of distance
occasion:	The choice occasion
occasion_var:	Character for variable with choice occasion
alt_choice:	Alternative choice matrix
bCHeader:	Variables to include in the model that do not vary by zone. Includes independent variables and interactions
gridVaryingVariables:	Variables to include in the model that do vary by zone such as expected catch (from `create_expectations()` function)
startloc:	Variable in dat identifying location when choice of where to fish next was made
polyn:	Numeric, correction polynomial degree
spat:	A spatial data file
spatID:	Variable in spat that identifies areas or zones
crs:	Coordinate reference system
gridVaryingVariables:	Area-specific variables
expectcatchmodels:	List of expected catch matrices

Examples

## Not run: 
make_model_design("pollock", catchID= "OFFICIAL_TOTAL_CATCH",  
  likelihood='logit_zonal', 
  vars1=NULL, vars2=NULL, initparams=c(-0.5,0.5),
  optimOpt=c(100000, 1.0e-08, 1, 1), methodname = "BFGS", mod.name = "logit4"
)

## End(Not run)

Add an area/polygon to spatial data

Description

Add an area/polygon to spatial data

Usage

make_spat_area(spat, project, coord, spat.id, new.id, combine)

Arguments

`spat`	Spatial dataset to add polygon to.
`project`	Name of project.
`coord`	Longitude and latitude coordinates forming a polygon. Can be a numeric vector of even length or a numeric matrix with two columns.
`spat.id`	The ID column in `spat`
`new.id`	The ID for new polygon.
`combine`	Whether to use `combine_zone`. This will turn the intersections between the new polygon and `spat` into new polygons. Note that the new polygon IDs will be derived from `spat` and `new.id` will not be used.

Details

Adds an area/polygon to a spatial area.

Kernel density (hotspot) plot

Description

Kernel density (hotspot) plot

Usage

map_kernel(
  dat,
  project,
  latlon,
  type = "contours",
  group = NULL,
  facet = FALSE,
  date = NULL,
  filter_date = NULL,
  filter_value = NULL,
  minmax = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`latlon`	Character string, specified as latitude then longitude, in decimal degrees.
`type`	String, plot type. Choices are `"point"`, `"contours"`, or `"gradient"`. Note if you have a group, you must facet when choosing `"gradient"` (cannot overlap polygons clearly).
`group`	Optional group argument. Should be a factor with length of (# of observations), where each observation corresponds to the latlon coordinate of the same index. Recall that the legend will output the names of factor levels as you have named them (see `?factor`).
`facet`	Optional facet parameter. TRUE if mapping each group as a separate facet. Defaults to FALSE.
`date`	Optional date variable to filter data by.
`filter_date`	Whether to filter data table by `"year"`, `"month"`, or `"year-month"`. `date` and `filter_value` must be provided. Defaults to `NULL`.
`filter_value`	Integer (4 digits if year, 1-2 if month). The year, month, or year-month to filter data table by. Use a list if using `"year-month"`, with the format: list(year(s), month(s)). For example, `list(2011:2013, 5:7)` will filter the data table from May to July, 2011-2013.
`minmax`	Optional map extent argument, a vector (num) of length 4 corresponding to c(minlat, maxlat, minlon, maxlon).
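
To make the `"year-month"` filter concrete, here is a minimal base R sketch of the selection that `filter_value = list(2011:2013, 5:7)` describes. The dates are made up for illustration, and this shows the described behavior only; it is not FishSET's internal implementation.

```r
# Hypothetical haul dates for illustration
dates <- as.Date(c("2010-06-15", "2011-05-01", "2012-08-20",
                   "2013-07-31", "2014-06-01"))

# list(year(s), month(s)), as described for filter_value
filter_value <- list(2011:2013, 5:7)

yr <- as.integer(format(dates, "%Y"))
mo <- as.integer(format(dates, "%m"))

# Keep rows whose year AND month both fall in the requested sets
keep <- yr %in% filter_value[[1]] & mo %in% filter_value[[2]]
dates[keep]
```

Only the 2011-05-01 and 2013-07-31 observations satisfy both conditions.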

Value

Returns ggplot2 object. Map plot saved to Output folder.

Examples

## Not run: 
map_kernel(pollockMainDataTable, project = 'pollock', type = 'contours',
latlon = c('LonLat_START_LAT', 'LonLat_START_LON'), group = 'PORT_CODE',
facet = TRUE, minmax = NULL, date = 'FISHING_START_DATE',
filter_date = 'year-month', filter_value = list(2011, 2:4))

## End(Not run)

Map observed vessel locations

Description

Plot observed locations on a map. For large datasets, it is best to plot a subset of points. Use percshown to randomly subset the number of points. If the predefined map extent needs adjusting, use minmax.

Usage

map_plot(dat, project, lat, lon, minmax = NULL, percshown = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, project name.
`lat`	Variable in `dat` that defines latitude, in decimal degrees.
`lon`	Variable in `dat` that defines longitude, in decimal degrees.
`minmax`	Optional map extent argument, a vector (num) of length four corresponding to c(minlat, maxlat, minlon, maxlon).
`percshown`	Whole number, percent of points to show. Use this option if there are a lot of data points.
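
As a rough illustration of what `percshown` and `minmax` represent, the base R sketch below draws a random 10 percent subset of simulated points and builds an extent vector in the documented order c(minlat, maxlat, minlon, maxlon). The data and column names are invented, and the sampling shown is an assumption about the general idea, not the code map_plot() runs.

```r
set.seed(42)

# Simulated observation locations (hypothetical column names)
n <- 1000
dat <- data.frame(lat = runif(n, 50, 60), lon = runif(n, -180, -160))

# Randomly keep percshown percent of the rows
percshown <- 10
idx <- sample(nrow(dat), size = ceiling(nrow(dat) * percshown / 100))
sub <- dat[idx, ]

# Extent in the order expected by minmax
minmax <- c(min(sub$lat), max(sub$lat), min(sub$lon), max(sub$lon))
```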

Value

mapout: ggplot2 object

Examples

## Not run: 
map_plot(pollockMainDataTable, 'pollock', 'LonLat_START_LAT', 'LonLat_START_LON', percshown=10)

## End(Not run)

Interactive vessel locations and fishery zones map

Description

View vessel locations and fishery zones on interactive map.

Usage

map_viewer(
  dat,
  project,
  spat,
  avd,
  avm,
  num_vars,
  temp_vars,
  id_vars,
  lon_start,
  lat_start,
  lon_end = NULL,
  lat_end = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`spat`	Spatial data containing information on fishery management or regulatory zones. Shape, json, geojson, and csv formats are supported.
`avd`	Variable name in the primary data file that gives the unique ID associated to the polygon.
`avm`	The name of the property in the GeoJson file that identifies the polygon to cross reference to `dat`. Variable name in the spatial file that represents the unique ID.
`num_vars`	List, name of numeric variable(s) in `dat` to include for plotting.
`temp_vars`	List, name of temporal variable(s) in `dat` to include for plotting.
`id_vars`	List, name of categorical variable(s) in `dat` to group by.
`lon_start`	String, variable in `dat` that identifies a single longitude point or starting longitude decimal degrees.
`lat_start`	String, variable in `dat` that identifies a single latitude point or starting latitude decimal degrees.
`lon_end`	String, variable in `dat` that identifies ending longitude decimal degrees.
`lat_end`	String, variable in `dat` that identifies ending latitude decimal degrees.

Details

The map_viewer function creates the files required to run the MapViewer program. Users can map points or trip paths. To plot points, leave lon_end and lat_end as NULL. After creating the inputs, a map with zones is opened in the default web browser. To close the server connection, run servr::daemon_stop() in the console. Lines on the map represent the starting and ending lat/long for each observation in the data set, color coded based on the selected variable. It can take up to a minute for the data to be loaded onto the map. At this time, the map can only be saved by taking a screenshot.

Examples

## Not run: 
# Plot trip path
map_viewer(scallopMainDataTable, 'scallop', "scallopTMSSpatTable", 
           avd = 'ZoneID', avm = 'TEN_ID', num_vars = 'LANDED_thousands', 
           temp_vars = 'DATE_TRIP', lon_start = 'previous_port_lon', 
           lat_start = 'previous_port_lat', lon_end = 'DDLON', 
           lat_end = 'DDLAT')
   
# Plot observed fishing locations        
map_viewer(scallopMainDataTable, 'scallop', "scallopTMSSpatTable", 
           avd = 'ZoneID', avm = 'TEN_ID', num_vars = 'LANDED_thousands', 
           temp_vars = 'DATE_TRIP', lon_start = 'DDLON', lat_start = 'DDLAT')

#Plot haul path
map_viewer(pollockMainDataTable, 'pollock', spat=spatdat, avd='NMFS_AREA',
avm='NMFS_AREA', num_vars=c('HAUL','OFFICIAL_TOTAL_CATCH'),
temp_vars='HAUL_DATE', id_vars=c('GEAR_TYPE', 'PORT'), 
       'Lon_Start', 'Lat_Start', 'Lon_End', 'Lat_End')

#Plot haul midpoint
map_viewer(pollockMainDataTable, 'pollock', spat=spatdat, avd='NMFS_AREA',
avm='NMFS_AREA', num_vars=c('HAUL','OFFICIAL_TOTAL_CATCH'),
temp_vars='HAUL_DATE', id_vars=c('GEAR_TYPE', 'PORT'), 'Lon_Mid', 'Lat_Mid')

## End(Not run)

Merge data tables using a left join

Description

Merge data tables using a left join

Usage

merge_dat(
  dat,
  other,
  project,
  main_key,
  other_key,
  other_type,
  merge_type = "left"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`other`	A second data table to join to MainDataTable. Use string if referencing a table saved in the FishSET database.
`project`	Project name.
`main_key`	String, name of column(s) in MainDataTable to join by. The number of columns must match `other_key`.
`other_key`	String, name of column(s) in the `other` table to join by. The number of columns must match `main_key`.
`other_type`	String, the type of secondary data being merged. Options include "aux" (auxiliary), "grid" (gridded), "spat" (spatial), and "port".
`merge_type`	String, the type of merge to perform. `"left"` keeps all rows from `dat` and merges shared rows from `other`. `"full"` keeps all rows from each table.

Details

By default, this function merges two datasets using a left join: all columns and rows from the MainDataTable are kept while only matching columns and rows from the secondary table are joined. Setting merge_type = "full" keeps all rows from each table.
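
The difference between the two merge types can be sketched with base R's merge(). The tables and column values below are invented for illustration; merge_dat() itself works on FishSET tables and may differ in details such as column handling.

```r
# Hypothetical primary and port tables
main <- data.frame(PORT_CODE = c("A", "B", "C"), catch = c(10, 20, 30))
port <- data.frame(PORT_CODE = c("A", "B", "D"),
                   port_lat = c(57.1, 58.3, 60.0))

# Left join: every row of `main` is kept; unmatched rows get NA
left <- merge(main, port, by.x = "PORT_CODE", by.y = "PORT_CODE",
              all.x = TRUE)

# Full join: rows from both tables are kept
full <- merge(main, port, by.x = "PORT_CODE", by.y = "PORT_CODE",
              all = TRUE)
```

In the left join, port "C" has no match and receives NA for port_lat; in the full join, port "D" also appears, with NA for catch.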

Examples

## Not run: 
 pollockMainDataTable <- 
    merge_dat("pollockMainDataTable", "pollockPortTable", "pollock", 
              main_key = "PORT_CODE", other_key = "PORT_CODE",
              other_type = "port")

## End(Not run) 

Merge expected catch

Description

Merge expected catch matrices to the primary dataset.

Usage

merge_expected_catch(
  dat,
  project,
  zoneID,
  date,
  exp.name,
  new.name = NULL,
  ec.table = NULL,
  log_fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`zoneID`	Variable in `dat` that identifies the individual zones or areas.
`date`	Date variable used to create the expected catch matrix.
`exp.name`	Name(s) of expected catch matrix to merge into `dat`.
`new.name`	Optional, new name for `exp.name`. These should be in the same order as `exp.name`.
`ec.table`	Optional, the name of a specific expected catch table to use. Defaults to `projectnameExpectedCatch`.
`log_fun`	For internal use. Whether to log the function call.

Value

Merges an expected catch matrix created using create_expectations() to the primary dataset, dat.

Check if meta file exists for a project

Description

Check if meta file exists for a project

Usage

meta_file_exists(project)

Arguments

project

Project name.

Value

TRUE if project meta file exists, FALSE if not.

Print meta tables by project and/or type

Description

Print meta tables by project and/or type

Usage

meta_tables(project, tab.type = NULL)

Arguments

`project`	Name of project.
`tab.type`	String, table type. Optional, used to filter output. Options include "main", "spat" (spatial), "port", "grid" (gridded), and "aux" (auxiliary).

Import, create, and edit metadata

Description

metadata_gui allows users to import metadata from various file types, create and save new metadata, and edit metadata in a shiny application. Metadata is stored in the user's project folder.

Usage

metadata_gui()

Details

The app has two tabs: "Create" and "Edit". The Create tab allows users to create new metadata for a selected FishSET table. When a table is loaded, the app creates several text boxes that the user can fill. There are four metadata sections: About, Column Description, Contact Info, and Other.

About

Author The author of the data.
Date created The date data was created.
Date modified The last date the data was modified.
Version The current version of the data.
Confidentiality Whether the data contains confidential information.

Column Description

A text box for each column in the data. Include the data type, unit, and values (if categorical).

Contact Info

Person The primary contact.
Organization The primary contact's organization.
Address The primary contact's and/or organization's address.
Phone The primary contact's work phone number.
Email The primary contact's work email.

Other

License License for data.
Citation Citation for data.
Other Other relevant information.

Users can also import a metadata file from the Create tab, for example, an XML, CSV, or JSON file. This gets saved as "raw" metadata and is separate from the user-created metadata. To see a comprehensive list of accepted file types, see parse_meta and read_dat. To extract metadata from a data file (i.e. the data and metadata are both in the same file, but the metadata is not contained within the data itself), use the Reader parameters text box to selectively parse the file (see parse_meta for details).

The Edit tab allows users to view, edit, and/or delete metadata saved to FishSET.

Examples

## Not run: 
metadata_gui()

## End(Not run)

Get Model Design List

Description

Returns the Model Design list from the FishSET database.

Usage

model_design_list(project, name = NULL)

Arguments

`project`	Name of project.
`name`	Name of Model Design list in the FishSET database. The table name will contain the string "ModelInputData". If `NULL`, the default table is returned. Use `tables_database` to see a list of FishSET database tables by project.

Design hold-out model

Description

Use selected model design settings to create a model design for hold-out data. The hold-out data can be out-of-sample data or subsetted data for k-fold cross validation.

Usage

model_design_outsample(
  project,
  mod.name,
  outsample.mod.name = NULL,
  CV = FALSE,
  CV_dat = NULL,
  use.scalers = FALSE,
  scaler.func = NULL
)

Arguments

`project`	Name of project
`mod.name`	Name of saved model to use. Argument can be the name of the model or can pull the name of the saved "best" model. Leave `mod.name` empty to use the saved "best" model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`outsample.mod.name`	Name assigned to out-of-sample model design. Must be unique and not already exist in model design list. If `outsample.mod.name = NULL` (the default), a default name based on `mod.name` is used.
`CV`	Logical, indicates whether the model design is being created for cross validation (`TRUE`) or for a simple out-of-sample dataset (`FALSE`). Defaults to `CV = FALSE`.
`CV_dat`	Training or testing dataset for k-fold cross validation.
`use.scalers`	Input for `create_model_input()`. Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Input for `create_model_input()`. Function to calculate rescaling factors.

Details

This function automatically pulls model settings from the selected model and creates an alternative choice matrix, expected catch/revenue matrices, and model design for a hold-out dataset. The hold-out dataset can be an out-of-sample dataset or a subset of the primary data for cross validation. If running out-of-sample data, this function requires that a filtered out-of-sample data file (.rds file) exists in the output folder. For cross validation, this function is called in the cross_validation() function. Note: the out-of-sample functions only work with a single selected model at a time. To run out-of-sample functions on a new out-of-sample dataset, start with load_outsample() (for an entirely new dataset) or filter_outsample().

Examples

## Not run: 

# For out-of-sample dataset
model_design_outsample("scallop", "scallopModName")


## End(Not run)

Load model comparison metrics to console for the defined project

Description

Load model comparison metrics to console. Metrics are displayed for each model that was run. Metrics are produced by discretefish_subroutine.

Usage

model_fit(project, CV = FALSE)

Arguments

`project`	String, name of project.
`CV`	Logical, `CV = TRUE` to get model fit for training data in k-fold cross validation routine.

Examples

## Not run: 
model_fit('pollock')

## End(Not run)

Return model names

Description

Returns model names saved to the model design file.

Usage

model_names(project)

Arguments

project

Name of project.

Load discrete choice model output to console for the defined project

Description

Returns output from running the discretefish_subroutine function. The table parameter must be the full name of the table in the FishSET database.

Usage

model_out_view(project, CV = FALSE)

Arguments

`project`	Name of project
`CV`	Logical, `CV = TRUE` when viewing model output from training data in k-fold cross validation

Details

Returns output from running discretefish_subroutine. The table argument must be the full name of the table in the FishSET database. Output includes information on model convergence, standard errors, t-stats, etc.

Examples

## Not run: 
model_out_view('pcod')

## End(Not run)

Load model parameter estimates, standard errors, and t-statistic to console for the defined project

Description

Returns parameter estimates, standard errors, and t-statistics from running the discretefish_subroutine function. The table parameter must be the full name of the table in the FishSET database.

Usage

model_params(project, output = "list")

Arguments

`project`	Name of project
`output`	Options include "list", "table", or "print".

Details

Returns parameter estimates from running discretefish_subroutine. The table argument must be the full name of the table in the FishSET database.

Examples

## Not run: 
model_params('pcod')

## End(Not run)

Calculate and view Moran's I statistic

Description

Wrapper function to calculate global and local Moran's I by discrete area.

Usage

moran_stats(
  dat,
  project,
  varofint,
  zoneid,
  spat,
  cat,
  lon.dat = NULL,
  lat.dat = NULL,
  lon.spat = NULL,
  lat.spat = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`varofint`	Numeric variable from `dat` to test for spatial autocorrelation.
`zoneid`	Variable in `dat` that identifies the individual zones or areas. Define if exists in `dat` and is not named 'ZoneID'. Defaults to NULL.
`spat`	Spatial data containing information on fishery management or regulatory zones. Shape, json, geojson, and csv formats are supported.
`cat`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `cat` should be name of list containing information on zones.
`lon.dat`	Longitude variable from `dat`.
`lat.dat`	Latitude variable from `dat`.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file.

Details

Measures the degree of spatial autocorrelation. The function utilizes the localmoran and knearneigh functions from the spdep package. The spatial input is a row-standardized spatial weights matrix for the computed nearest neighbor matrix, which is the default setting for the nb2listw function. The function requires a map file with lat/lon defining boundaries of areas/zones and varofint to test for spatial autocorrelation. If the zonal centroid is not included in the map file, then the find_centroid function is called to calculate the centroid of each zone. If the variable of interest is not associated with an area/zone, then assignment_column is called to assign each observation to a zone. Arguments to identify the centroid and assign the variable of interest to an area/zone are optional and default to NULL.
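
For intuition, global Moran's I with row-standardized k-nearest-neighbor weights can be computed by hand in base R. The sketch below uses made-up zone centroids and values and does not call spdep; moran_stats() relies on spdep and also reports local statistics, so treat this only as an illustration of the underlying quantity.

```r
# Hypothetical zone centroids along a line and one value per zone
cent <- cbind(lon = c(0, 1, 2, 3, 4, 5), lat = rep(0, 6))
x <- c(1, 2, 3, 4, 5, 6)
k <- 2                      # number of nearest neighbors per zone

# Pairwise distances; a zone is never its own neighbor
d <- as.matrix(dist(cent))
diag(d) <- Inf

# Row-standardized weights: each of the k neighbors gets weight 1/k
n <- nrow(cent)
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  nn <- order(d[i, ])[seq_len(k)]
  W[i, nn] <- 1 / k
}

# Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2
z <- x - mean(x)
moran_I <- (n / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
moran_I
```

Because neighboring zones have similar values in this toy example, the statistic is positive, indicating positive spatial autocorrelation.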

Value

Returns a plot and map of Moran’s I. Output is saved to the Output folder.

Examples

## Not run: 
moran_stats(pcodMainDataTable, project='pcod', varofint='OFFICIAL_MT_TONS',
spat=spatdat, lon.dat='LonLat_START_LON', lat.dat ='LonLat_START_LAT', cat='NMFS_AREA')

## End(Not run)

Identify, remove, or replace NAs and NaNs

Description

Replaces NAs and NaNs in the primary data with the chosen value or removes rows containing NAs and NaNs.

Usage

na_filter(
  dat,
  project,
  x = NULL,
  replace = FALSE,
  remove = FALSE,
  rep.value = "mean",
  over_write = FALSE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Character string. Column(s) in `dat` in which to remove or replace NAs.
`replace`	Logical, if `TRUE`, replaces NAs in a vector with `rep.value`. Defaults to `FALSE`.
`remove`	Logical, if `TRUE`, removes every row of `dat` where an NA is present in `x`. Defaults to `FALSE`.
`rep.value`	Value to replace all NAs in a numeric column. Defaults to the mean value of the column. Other options include `"median"` or a numeric value, e.g. `rep.value = 0`.
`over_write`	Logical, If `TRUE`, saves data over previously saved data table in the FishSET database.

Details

To check for NAs across dat, run the function specifying only dat and project (na_filter(dataset, project)). The function will return a statement of which variables, if any, contain NAs. To remove NAs, use remove = TRUE. All rows containing NAs in x will be removed from dat. To replace NAs, use replace = TRUE. If replace = TRUE and rep.value is not defined, then NAs are replaced with the mean value. The modified dataset will be returned if replace = TRUE or remove = TRUE. If both replace and remove are TRUE, then replace is used. Save the modified data table to the FishSET database by setting over_write = TRUE.
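
The replace and remove behaviors described above can be sketched in base R as follows. The data frame is invented, and the code only mirrors the documented outcome (mean replacement versus row removal); it is not the body of na_filter().

```r
# Hypothetical data with both NA and NaN in the column of interest
dat <- data.frame(catch = c(10, NA, 30, NaN), value = c(1, 2, NA, 4))
x <- "catch"

# is.na() is TRUE for both NA and NaN
missing <- is.na(dat[[x]])

# Replace: fill missing entries with the column mean (the default rep.value)
replaced <- dat
replaced[[x]][missing] <- mean(dat[[x]], na.rm = TRUE)

# Remove: drop every row where x is missing
removed <- dat[!missing, , drop = FALSE]
```

Note that only the column named in x is considered, so the NA in `value` is left untouched in both results.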

Value

If replace and remove are FALSE then a statement of whether NAs are found is returned. If either replace or remove is TRUE the modified primary dataset is returned.

Examples

## Not run: 
na_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT')

mod.dat <- na_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT', 
                     replace = TRUE)
                     
mod.dat <- na_filter(pcodMainDataTable,'pcod', 'OFFICIAL_TOTAL_CATCH_MT',
                     replace = TRUE, rep.value = 0)
                     
mod.dat <- na_filter(pcodMainDataTable, 'pcod',
                     c('OFFICIAL_TOTAL_CATCH_MT', 'CATCH_VALUE'), 
                     remove = TRUE)

## End(Not run)

Check for unique and syntactic column names

Description

Used for creating new columns.

Usage

name_check(dat, names, repair = FALSE)

Arguments

`dat`	Dataset that will contain new columns.
`names`	New names to be added to dataset.
`repair`	Logical, whether to return repaired column names (`repair = TRUE`) or just check for unique column names (`repair = FALSE`).

Details

name_check() first checks whether the new column names are unique and returns an error if they are not. When repair = TRUE, name_check() checks for unique column names and returns new column names that are unique and syntactic (see vec_as_names for details).
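
The two steps (detect duplicates, then repair) can be approximated in base R as below. This is a rough analogue using make.names(); the data frame and names are hypothetical, and the repaired names produced by vec_as_names may look different.

```r
# Hypothetical existing data and proposed new column names
dat <- data.frame(haul = 1:3, catch = c(5, 7, 9))
new_names <- c("catch", "catch per hour")

# Check: do the proposed names collide with existing names (or each other)?
all_names <- c(names(dat), new_names)
has_duplicates <- anyDuplicated(all_names) > 0

# Repair: make the combined names syntactic and unique, keep the new ones
repaired <- make.names(all_names, unique = TRUE)
repaired_new <- tail(repaired, length(new_names))
```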

Identify, remove, or replace NaNs

Description

Replaces NaNs in the primary data with the chosen value or removes rows containing NaNs.

Usage

nan_filter(
  dat,
  project,
  x = NULL,
  replace = FALSE,
  remove = FALSE,
  rep.value = "mean",
  over_write = FALSE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Character string of variables to remove or replace NaNs.
`replace`	Logical, If `TRUE`, NaNs are replaced. Defaults to `FALSE`.
`remove`	Logical, if `TRUE`, removes the entire row of the dataset where NaN is present. Defaults to `FALSE`.
`rep.value`	Value to replace all NaNs in a numeric column. Defaults to the mean value of the column. Other options include `"median"` or a numeric value, e.g. `rep.value = 0`.
`over_write`	Logical, If `TRUE`, saves data over previously saved data table in the FishSET database. Defaults to `FALSE`.

Details

To check for NaNs across dat, run the function specifying only dat and project (nan_filter(dataset, project)). The function will return a statement of which variables, if any, contain NaNs. To remove NaNs, use remove = TRUE; all rows containing NaNs in x will be removed from dat. To replace NaNs, use replace = TRUE. If both replace and remove are TRUE, replace is used. If replace = TRUE and rep.value is not defined, NaNs are replaced with the column mean. The modified dataset is returned if replace = TRUE or remove = TRUE. To save the modified data table to the FishSET database, set over_write = TRUE.
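
In R, NaN ("not a number", e.g. the result of 0/0) is distinct from NA, although is.na() reports both. The short base R sketch below, on a hypothetical vector, shows the distinction and a NaN-only replacement; it is not the FishSET implementation.

```r
x <- c(1, NA, NaN, 4)    # hypothetical numeric column

na_flags  <- is.na(x)    # TRUE for both NA and NaN
nan_flags <- is.nan(x)   # TRUE only for NaN

# Replace only the NaN entries with the mean of the non-missing values
x_fixed <- x
x_fixed[is.nan(x_fixed)] <- mean(x, na.rm = TRUE)
```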

Value

If replace and remove are FALSE then a statement of whether NaNs are found is returned. If either replace or remove is TRUE the modified primary dataset is returned.

Examples

## Not run: 
nan_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT')

mod.dat <- nan_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT', 
                      replace = TRUE)
                      
mod.dat <- nan_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT',
                      replace = TRUE, rep.value = 0)
                      
mod.dat <- nan_filter(pcodMainDataTable, 'pcod', 'OFFICIAL_TOTAL_CATCH_MT', 
                      remove = TRUE)

## End(Not run)

Identify NaNs and NAs

Description

Check whether any columns in the primary dataset contain NAs or NaNs. Returns column names containing NAs or NaNs.

Usage

nan_identify(dat, project)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.

Details

Check whether any columns in the primary dataset contain NAs or NaNs.

Value

Returns names of columns containing NAs or NaNs, if any.

Examples

## Not run: 
nan_identify(pcodMainDataTable, "pcod")

## End(Not run)

Create one or more binned frequency tables

Description

Create one or more binned frequency, relative frequency, or density tables.

Usage

nfreq_table(
  dataset,
  var,
  group = NULL,
  bins = 30,
  type = "dens",
  v_id = NULL,
  format_lab = "decimal",
  format_tab = "wide"
)

Arguments

`dataset`	Primary data containing information on hauls or trips. Table in FishSET database should contain the string 'MainDataTable'.
`var`	String, name of numeric variable to bin.
`group`	String, name of variable(s) to group `var` by.
`bins`	Integer, the number of bins to create.
`type`	String, the type of binned frequency table to create. `"freq"` creates a frequency table, `"perc"` creates a relative frequency table, and `"dens"` creates a density table.
`v_id`	String, name of vessel ID column (used to detect confidential information).
`format_lab`	Formatting option for bin labels. Options include `"decimal"` or `"scientific"`.
`format_tab`	Format of the output table. Options include `"wide"` (the default) or `"long"`.
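
The three table types can be illustrated with base R on a hypothetical numeric vector. The sketch assumes equal-width bins, relative frequency expressed as a proportion, and density defined as relative frequency divided by bin width; the definitions used by nfreq_table() may differ.

```r
x <- c(2, 4, 4, 5, 7, 9, 10, 12)   # hypothetical numeric variable

# Cut the range of x into equal-width bins
n_bins <- 5
breaks <- seq(min(x), max(x), length.out = n_bins + 1)
bin <- cut(x, breaks = breaks, include.lowest = TRUE)

freq <- as.vector(table(bin))   # counts per bin
perc <- freq / sum(freq)        # relative frequency (proportion)
dens <- perc / diff(breaks)     # density: sums to 1 when weighted by bin width

binned <- data.frame(bin = levels(bin), freq = freq, perc = perc, dens = dens)
```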

Shape file for NMFS fishing zones

Description

Simple feature collection with 25 features and 2 fields

Format

shape file

Boxplot to assess outliers

Description

Boxplot to assess outliers

Usage

outlier_boxplot(dat, project, x = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Variables in `dat` to check for outliers. Leave as `x = NULL` to plot all numeric variables. To specify multiple variables use `c('var1', 'var2')`

Details

Creates a visual representation of five summary statistics: the median, two hinges (the first and third quartiles), and two whiskers (each extending up to 1.5*IQR from its hinge, where IQR is the distance between the first and third quartiles). "Outlying" points, those beyond the two whiskers, are shown individually.
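
The statistics behind a box-and-whisker plot can be inspected directly with boxplot.stats() from base R (grDevices). The example below uses a hypothetical vector and is meant only to show what the hinges, whiskers, and outlying points are; it does not reproduce the plot saved by outlier_boxplot().

```r
x <- c(1:10, 50)   # hypothetical values with one extreme point

bs <- boxplot.stats(x, coef = 1.5)
bs$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
bs$out     # points beyond the whiskers

# Similar fences computed by hand from the quartiles
q <- quantile(x, c(0.25, 0.75))
fences <- c(q[1] - 1.5 * IQR(x), q[2] + 1.5 * IQR(x))
outliers <- x[x < fences[1] | x > fences[2]]
```

Note that boxplot.stats() uses hinges, which can differ slightly from the quartiles returned by quantile() for some sample sizes.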

Value

Box and whisker plot for all numeric variables. Saved to 'output' folder.

Evaluate outliers in plot format

Description

Visualize spread of data and measures to identify outliers.

Usage

outlier_plot(
  dat,
  project,
  x,
  dat.remove = "none",
  sd_val = NULL,
  x.dist = "normal",
  date = NULL,
  group = NULL,
  pages = "single",
  output.screen = FALSE,
  log_fun = TRUE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`x`	Variable in `dat` to check for outliers.
`dat.remove`	Outlier measure. Values outside the measure are removed. Users can use one of the predefined values (see below) or a user-defined distance from the mean. For user-defined values, `dat.remove` should be a numeric value. For example, `dat.remove = 6` would result in values more than 6 SD from the mean being classed as outliers. User-defined standard deviations from the mean can also be applied using `sd_val`. Predefined choices: `"none"`, `"5_95_quant"`, `"25_75_quant"`, `"mean_2SD"`, `"median_2SD"`, `"mean_3SD"`, `"median_3SD"`. See the Details section for more information.
`sd_val`	Optional. Number of standard deviations from mean defining outliers. Example, `sd_val = 6` would mean values outside +/- 6 SD from the mean would be outliers.
`x.dist`	Distribution of the data. Choices include: `"normal"`, `"lognormal"`, `"exponential"`, `"Weibull"`, `"Poisson"`, `"negative binomial"`.
`date`	(Optional) date variable to group the histogram by year.
`group`	(Optional) additional variable to group the histogram by.
`pages`	Whether to output plots on a single page (`"single"`, the default) or multiple pages (`"multi"`).
`output.screen`	Logical, if `TRUE`, returns plots to the screen. If `FALSE`, saves the plot to the 'output' folder as a png file.
`log_fun`	Logical, whether to log function call (for internal use).

Details

The function returns three plots: the data, a probability plot, and a Q-Q plot. The data plot shows x against row number. Red points are data points that would be removed based on dat.remove. Blue points are data points within the bounds of dat.remove. If dat.remove is "none", then only blue points will be shown. The probability plot is a histogram of the data, after applying dat.remove, with the fitted probability distribution based on x.dist. group groups the histogram by a variable from dat; date groups the histogram by year. The Q-Q plot shows sample quantiles against theoretical quantiles, after applying dat.remove.
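
To see what a Q-Q comparison measures, the sample and theoretical quantiles can be computed without plotting using qqnorm() from base R. The data below are simulated, and the correlation is used here only as an informal indicator of how closely the points follow a straight line; this is not part of outlier_plot().

```r
set.seed(42)
x <- rlnorm(200, meanlog = 1, sdlog = 0.5)   # simulated, right-skewed data

# Theoretical normal quantiles ($x) and sample quantiles ($y), no plotting
qq_raw <- qqnorm(x, plot.it = FALSE)
qq_log <- qqnorm(log(x), plot.it = FALSE)

# A correlation near 1 means the points lie close to a straight line
cor_raw <- cor(qq_raw$x, qq_raw$y)
cor_log <- cor(qq_log$x, qq_log$y)
```

For lognormal data such as this, the log-transformed values would be expected to track the normal reference line more closely than the raw values.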

The dat.remove choices are:

numeric value: Remove data points outside +/- 'x'SD of the mean
none: No data points are removed
5_95_quant: Removes data points outside the 5th and 95th quantiles
25_75_quant: Removes data points outside the 25th and 75th quantiles
mean_2SD: Removes data points outside +/- 2SD of the mean
median_2SD: Removes data points outside +/- 2SD of the median
mean_3SD: Removes data points outside +/- 3SD of the mean
median_3SD: Removes data points outside +/- 3SD of the median

The distribution choices are:

normal
lognormal
exponential
Weibull
Poisson
negative binomial

Value

Plot of the data

Examples

## Not run: 

outlier_plot(pollockMainDataTable, 'pollock', x = 'Haul', dat.remove = 'mean_2SD', 
             x.dist = 'normal', output.screen = TRUE)
# user-defined outlier        
outlier_plot(pollockMainDataTable, 'pollock', x = 'Haul', dat.remove = 6, 
             x.dist = 'lognormal', output.screen = TRUE)

## End(Not run)

Remove outliers from data table

Description

Remove outliers based on outlier measure.

Usage

outlier_remove(
  dat,
  project,
  x,
  dat.remove = "none",
  sd_val = NULL,
  over_write = FALSE
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Variable in `dat` containing potential outliers.
`dat.remove`	Defines measure to subset the data. Users can use one of the predefined values (see below) or user-defined standard deviations from the mean. For user-defined values, `dat.remove` should be a numeric value. For example, `dat.remove = 6` would result in values more than 6 SD from the mean being classed as outliers. User-defined standard deviations from the mean can also be applied using `sd_val`. Predefined choices: `"none"`, `"5_95_quant"`, `"25_75_quant"`, `"mean_2SD"`, `"median_2SD"`, `"mean_3SD"`, `"median_3SD"`.
`sd_val`	Optional. Number of standard deviations from mean defining outliers. For example, `sd_val=6` would mean values outside +/- 6 SD from the mean would be outliers.
`over_write`	Logical, If `TRUE`, saves data over previously saved data table in the FishSET database.

Details

The dat.remove choices are:

numeric value: Remove data points outside +/- 'x'SD of the mean
none: No data points are removed
5_95_quant: Removes data points outside the 5th and 95th quantiles
25_75_quant: Removes data points outside the 25th and 75th quantiles
mean_2SD: Removes data points outside +/- 2SD of the mean
median_2SD: Removes data points outside +/- 2SD of the median
mean_3SD: Removes data points outside +/- 3SD of the mean
median_3SD: Removes data points outside +/- 3SD of the median
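
The choices listed above can be read as lower and upper bounds on x. The base R sketch below is one possible reading, not the FishSET implementation; the helper `outlier_bounds()` is hypothetical, and using the overall standard deviation around the median for the "median" options is an assumption.

```r
# Hypothetical helper: lower/upper bounds for a given outlier measure.
# Assumes sd(x) is used for both the mean- and median-centred options.
outlier_bounds <- function(x, measure) {
  x <- x[!is.na(x)]
  if (is.numeric(measure)) {
    # user-defined number of standard deviations from the mean
    return(mean(x) + c(-1, 1) * measure * sd(x))
  }
  switch(measure,
    "none"        = c(-Inf, Inf),
    "5_95_quant"  = unname(quantile(x, c(0.05, 0.95))),
    "25_75_quant" = unname(quantile(x, c(0.25, 0.75))),
    "mean_2SD"    = mean(x) + c(-2, 2) * sd(x),
    "median_2SD"  = median(x) + c(-2, 2) * sd(x),
    "mean_3SD"    = mean(x) + c(-3, 3) * sd(x),
    "median_3SD"  = median(x) + c(-3, 3) * sd(x),
    stop("unknown measure: ", measure)
  )
}

x <- c(rep(10, 9), 100)   # hypothetical data with one extreme value
b <- outlier_bounds(x, "mean_2SD")
kept <- x[x >= b[1] & x <= b[2]]
```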

Value

Returns the modified primary dataset. Modified dataset will be saved to the FishSET database.

Examples

## Not run: 
pollockMainDataTable <- outlier_remove(pollockMainDataTable, 'pollock', 'dist', 
   dat.remove = 'mean_2SD', over_write = TRUE)

## End(Not run)

Evaluate outliers in Data

Description

outlier_table() returns a summary table which shows summary statistics of a variable after applying several outlier filters.

Usage

outlier_table(dat, project, x, sd_val = NULL, log_fun = TRUE)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`x`	Variable or column number in `dat` to check for outliers.
`sd_val`	Optional. Number of standard deviations from mean defining outliers. For example, `sd_val = 4` would mean values outside +/- 4 SD from the mean would be outliers.
`log_fun`	Logical, whether to log function call (for internal use).

Details

Returns a table of summary statistics (mean, median, standard deviation, minimum, maximum, number of NAs, and skew of the data) for x after values outside the outlier measure have been removed. Outlier measures include 5-95% quantiles, 25-75% quantiles, mean +/-2SD, mean +/-3SD, median +/-2SD, and median +/-3SD. Only one variable can be checked at a time. Table is saved to the Output folder.
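
A reduced version of such a table can be assembled in base R as follows. This sketch covers only three filters and omits the NA count; the helper names, the toy vector, and the moment-based skewness estimator are assumptions rather than the definitions used by outlier_table().

```r
# Moment-based skewness (assumed estimator)
skew <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# One row of summary statistics for a vector
summarise_x <- function(x) {
  data.frame(N = length(x), mean = mean(x), median = median(x), SD = sd(x),
             min = min(x), max = max(x), skew = skew(x))
}

x <- c(5, 7, 8, 9, 10, 11, 12, 13, 15, 100)   # hypothetical variable
q <- quantile(x, c(0.05, 0.95))

# Logical vectors marking the values kept under each outlier measure
filters <- list(
  none         = rep(TRUE, length(x)),
  `5_95_quant` = x >= q[1] & x <= q[2],
  mean_2SD     = abs(x - mean(x)) <= 2 * sd(x)
)

tab <- do.call(rbind, lapply(filters, function(keep) summarise_x(x[keep])))
```

Comparing rows of `tab` shows how much each filter changes the mean, spread, and skew of the variable.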

Value

Table for evaluating whether outliers may exist in the selected data column.

Examples

## Not run: 
outlier_table(pollockMainDataTable, 'pollock', x = 'HAUL')

## End(Not run)

Parse metadata from a data file

Description

General purpose meta parsing function. parse_meta attempts to parse a file based on its file extension.

Usage

parse_meta(file, ..., simplify_meta = FALSE)

Arguments

`file`	String, file path.
`...`	Additional arguments passed to a parsing function based on file extension. See below.
`simplify_meta`	Logical, attempt to simplify the metadata output. This uses `simplify_list`. This can be useful if metadata is not tabular.

Details

Function supports xls, xlsx, csv, tsv, excel, json, and xml extensions. Extension-specific notes:

txt:
  sep: Field separator character. defaults to comment = "#".
  comment: The comment character used to separate (or "comment-out") the metadata from the data. Only text that has been commented-out will be read.
  d_list: Logical, is metadata stored as a description list (i.e. Field: value, value format). If a colon (":") is used after the field name set this to TRUE.

xls, xlsx:
  range: The cell range to read from (e.g. "A1:C5"). See read_excel for more details.
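
As an illustration of the commented-out, description-list style of metadata mentioned for txt files, the base R sketch below writes a small temporary file and separates the "Field: value" lines from the data. The file contents and field names are hypothetical, and this is not how parse_meta() is implemented.

```r
# Write a small hypothetical text file: commented metadata, then data
path <- tempfile(fileext = ".txt")
writeLines(c("# Author: Jane Doe",
             "# Units: metric tons",
             "id,catch",
             "1,2.5"), path)

lines <- readLines(path)

# Keep only the commented-out lines and strip the comment character
meta_lines <- sub("^#\\s*", "", lines[startsWith(lines, "#")])

# Split "Field: value" pairs into a named character vector
parts <- strsplit(meta_lines, ":\\s*")
meta <- setNames(vapply(parts, `[`, character(1), 2),
                 vapply(parts, `[`, character(1), 1))

# The data portion can be read by skipping the commented lines
dat <- read.csv(path, comment.char = "#")
```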

Selectively display a note section

Description

Selectively display a note section

Usage

parse_notes(project, date = NULL, section, output = "print")

Arguments

`project`	The project name.
`date`	Date to pull notes from. If NULL then the most recent version of notes from the project are retrieved.
`section`	The note section to display. Options include "upload" for Upload data, "quality" for Data quality evaluation, "explore" for Data exploration, "fleet" for Fleet functions, "analysis" for Simple analysis, "new_variable" for Create new variable, "alt_choice" for Alternative choice, "models", and "bookmark".
`output`	Output type. "print" returns formatted notes. "string" returns a character vector of the notes. "print" is recommended for displaying notes in a report.

Examples

## Not run: 
parse_notes("pollock", section = "explore")

## End(Not run)

Plot spatial dataset

Description

Simple plotting function for viewing spatial data.

Usage

plot_spat(spat)

Arguments

`spat`	Spatial dataset to view. Must be an object of class `sf` or `sfc`.

Examples

## Not run: 
plot_spat(pollockNMFSSpatTable)

## End(Not run)

Policy change metrics

Description

Policy change metrics

Usage

policy_metrics(
  dat,
  project,
  tripID = "row",
  vesselID,
  catchID,
  datevar = NULL,
  price = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Name of project
`tripID`	Trip identifier. Can be 'row' or the name or names of variables that define trips. If `tripID='row'` then each row of the primary dataset is considered to be a unique trip
`vesselID`	Vessel identifier. Variable name in primary dataset that contains unique vessel identifier.
`catchID`	Name of variable in primary dataset that contains catch data.
`datevar`	Name of variable containing date data. Used to split data into years.
`price`	Name of variable containing revenue or price data.

Details

The policy change metrics reflect the impact of proposed policies in the absence of changes in fisher behavior. Policy scenarios are defined using the zone_closure() function. Percent of vessels is calculated from the unique vessel identifiers grouped by year and zone. Trips are identified using the tripID argument; if tripID = 'row', each row is assumed to be a trip. If price is not defined, percent of revenue loss will be reported as NA.

Value

Tables containing basic metrics on effects of proposed zone closures.

Summarize predicted probabilities

Description

Create summary table and figures for the predicted probabilities of fishing per zone for each model and policy scenario. The table and figures include the base case scenario, which is the proportion of observations in each zone. The table also includes the squared error between the predicted probabilities and base case probabilities. The first figure option displays predicted probabilities for each model, and the second figure option shows predicted probabilities for each model and policy.

Usage

pred_prob_outputs(
  project,
  mod.name = NULL,
  zone.dat = NULL,
  policy.name = NULL,
  output_option = "table"
)

Arguments

`project`	Name of project
`mod.name`	Name of model
`zone.dat`	Variable in primary data table that contains unique zone ID.
`policy.name`	List of policy scenario names created in zone_closure function
`output_option`	`"table"` to return a summary table (default); `"model_fig"` to return predicted probabilities for each model; `"policy_fig"` to return predicted probabilities for each model and policy scenario; or `"diff_table"` to return the difference in predicted probabilities between model and policy scenario for each zone.

Details

This function requires that model and prediction output tables exist in the FishSET database. If these tables are not present in the database, the function will terminate and return an error message.
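
The base case and squared error reported in the summary table can be illustrated with a small base R sketch. The observed zones and predicted probabilities below are made up, and the column names are hypothetical; the actual table is built from model output stored in the FishSET database.

```r
# Hypothetical observed zone choices and model-predicted probabilities
obs_zone <- c("A", "A", "B", "C", "A", "B", "C", "C", "C", "A")
pred <- c(A = 0.35, B = 0.25, C = 0.40)

# Base case: proportion of observations in each zone
base <- prop.table(table(obs_zone))

summary_tab <- data.frame(
  zone      = names(base),
  base_case = as.vector(base),
  predicted = unname(pred[names(base)]),
  stringsAsFactors = FALSE
)

# Squared error between predicted and base case probabilities
summary_tab$sq_error <- (summary_tab$predicted - summary_tab$base_case)^2
```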

Value

A model prediction summary table (default), model prediction figure, or policy prediction figure. See output_option argument.

Examples

## Not run: 

pred_prob_outputs(project = "scallop")


## End(Not run)

Map of predicted probabilities

Description

Create a map showing predicted probabilities by zone

Usage

predict_map(
  project,
  mod.name = NULL,
  policy.name = NULL,
  spat,
  zone.spat,
  outsample = FALSE,
  outsample_pred = NULL
)

Arguments

`project`	Name of project
`mod.name`	Name of model
`policy.name`	Name of policy scenario
`spat`	A spatial data file containing information on fishery management or regulatory zones boundaries. `sf` objects are recommended, but `sp` objects can be used as well. See `dat_to_sf()` to convert a spatial table read from a csv file to an `sf` object. To upload your spatial data to the FishSETFolder see `load_spatial()`.
`zone.spat`	Name of zone ID column in 'spat'.
`outsample`	Logical, indicating whether `predict_map()` is being used to map out-of-sample predicted fishing probabilities (`outsample = TRUE`) or a policy scenario (`outsample = FALSE`).
`outsample_pred`	A dataframe with fishing location and predicted probabilities for out-of-sample data. `outsample_pred = NULL` by default and when plotting policy scenarios.

Details

This function requires that model and prediction output tables exist in the FishSET database when plotting policy scenario maps.

Value

A map showing predicted probabilities

Examples

## Not run: 

predict_map(project = "scallop", policy.name = "logit_c_mod1 closure_1", 
            spat = spat, zone.spat = "TEN_ID")


## End(Not run)

Predict out-of-sample data

Description

Calculate predicted probabilities for out-of-sample dataset

Usage

predict_outsample(
  project,
  mod.name,
  outsample.mod.name,
  use.scalers = FALSE,
  scaler.func = NULL
)

Arguments

`project`	Name of project
`mod.name`	Name of saved model to use. Argument can be the name of the model or can pull the name of the saved "best" model. Leave `mod.name` empty to use the saved "best" model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`outsample.mod.name`	Name of the saved out-of-sample model design.
`use.scalers`	Input for `create_model_input()`. Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Input for `create_model_input()`. Function to calculate rescaling factors.
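
The idea behind `use.scalers` and `scaler.func` can be sketched in base R as dividing a numeric vector by a rescaling factor. Treating normalization as division by the factor is an assumption, and the helper `rescale()` and the `distance` vector are hypothetical; see `create_model_input()` for the actual behaviour.

```r
# Hypothetical helper: divide x by a factor computed by `scaler_func`
# (the mean by default). Division is an assumed form of normalization.
rescale <- function(x, scaler_func = mean) {
  x / scaler_func(x, na.rm = TRUE)
}

distance <- c(10, 20, 30, 60)   # hypothetical explanatory variable

by_mean <- rescale(distance)                     # factor is the mean (30)
by_max  <- rescale(distance, scaler_func = max)  # user-supplied function
```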

Details

This function predicts out-of-sample fishing probabilities and calculates model prediction performance (percent absolute prediction error).

Examples

## Not run: 

predict_outsample("scallop1", "logit_c_mod1", "logit_c_mod1_outsample")


## End(Not run)

Format numbers in table

Description

Format numeric columns.

Usage

pretty_lab(tab, cols = "all", type = "pretty", ignore = NULL)

Arguments

`tab`	Table to format.
`cols`	Character string of columns to format. Defaults to `"all"`, which includes all numeric variables in `tab`. If `ignore = TRUE`, the columns listed in `cols` will not be formatted and all other columns in `tab` will be formatted.
`type`	The type of formatting to apply. `"pretty"` uses `prettyNum` which uses commas (",") to mark big intervals. `"scientific"` uses scientific notation. `"decimal"` simply rounds to two decimal places.
`ignore`	Logical, whether to exclude the columns listed in `cols` and apply formatting to all other columns in `tab`.
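
The three formatting types correspond roughly to the following base R calls, shown here on hypothetical single values rather than on a table. How pretty_lab() applies them column by column is not shown.

```r
big   <- 1234567
small <- 0.000123

pretty_val <- prettyNum(big, big.mark = ",")     # commas mark thousands
sci_val    <- format(small, scientific = TRUE)   # scientific notation
dec_val    <- round(3.14159, 2)                  # two decimal places
```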

Format table for R Markdown

Description

Format table for R Markdown

Usage

pretty_tab(tab, full_width = FALSE)

Arguments

`tab`	Table to format.
`full_width`	Logical, whether table should fill out entire width of the page.

Scroll box for R Markdown table

Description

Allows tables to become scrollable. Useful for large tables.

Usage

pretty_tab_sb(tab, width = "100%", height = "500px", full_width = FALSE)

Arguments

`tab`	Table to format.
`width`	A character string indicating the width of the box. Can be in pixels (e.g. "50px") or as a percentage (e.g. "50%").
`height`	A character string indicating the height of the box. Can be in pixels (e.g. "50px") or as a percentage (e.g. "50%").
`full_width`	Logical, whether table should fill out entire width of the page.

Create Previous Location/Area Variable

Description

Creates a variable of the previous port/zone (previous area) or the previous longitude/latitude for a vessel.

Usage

previous_loc(
  dat,
  spat,
  project,
  starting_port,
  v_id,
  tripID,
  haulID,
  zoneID = NULL,
  spatID = NULL,
  date = NULL,
  lon = NULL,
  lat = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`spat`	A spatial data file containing information on fishery management or regulatory zones boundaries. `sf` objects are recommended, but `sp` objects can be used as well. See `dat_to_sf()` to convert a spatial table read from a csv file to an `sf` object. To upload your spatial data to the FishSETFolder see `load_spatial()`.
`project`	String, name of project.
`starting_port`	The name of the starting (or disembarking) port in `dat`.
`v_id`	The name of the variable in `dat` that uniquely identifies vessels.
`tripID`	Variable name in `dat` that uniquely identifies trips.
`haulID`	Variable name in `dat` that uniquely identifies hauls.
`zoneID`	Name of zone ID column in `dat`. Used to identify the previous area. Required for previous area variable.
`spatID`	Name of zone ID column in `spat`. `spat` is used to assign ports to spatial areas. Required for previous area variable.
`date`	Optional, a date variable to order hauls by.
`lon`	Longitude variable from `dat`. Required for previous location variable.
`lat`	Latitude variable from `dat`. Required for previous location variable.

Details

previous_loc() can create a previous area or location variable. "Previous area" is defined as the port or zone the vessel last visited. The first area for each trip is the disembarking port (starting_port). If a port is within a zone, the zone is returned. If a port is not within a zone, the name of the port is returned. "Previous location" is defined as the previous longitude and latitude of the vessel. The first set of coordinates is the location of the port. Users must have a port table saved to the FishSET database to use this function (see load_port()). This variable can be used to define the distance matrix (see create_alternative_choice()).
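
The "previous area" logic can be sketched in base R as a within-trip lag, with the starting port filling the first haul of each trip. The data frame and its column names are hypothetical, and the step that assigns ports to zones using `spat` is omitted, so this is a simplification of what previous_loc() does.

```r
# Hypothetical haul-level data
hauls <- data.frame(
  trip = c(1, 1, 1, 2, 2),
  haul = c(1, 2, 3, 1, 2),
  port = c("PortA", "PortA", "PortA", "PortB", "PortB"),
  zone = c("Z1", "Z2", "Z2", "Z3", "Z1"),
  stringsAsFactors = FALSE
)

# Order hauls within trips, then lag the zone by one haul per trip;
# the first haul of each trip takes the starting port instead
hauls <- hauls[order(hauls$trip, hauls$haul), ]
prev <- lapply(split(hauls, hauls$trip), function(tr) {
  c(tr$port[1], head(tr$zone, -1))
})
hauls$previous_area <- unlist(prev, use.names = FALSE)
```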

Check if option file exists for a project

Description

Check if option file exists for a project

Usage

proj_settings_exists(project)

Arguments

`project`	Project name.

Value

TRUE if project options file exists, FALSE if not.

List output files by project name

Description

List output files by project name

Usage

project_files(project)

Arguments

`project`	Project name.

Examples

## Not run: 
project_files("pollock")

## End(Not run)

List logs by project

Description

List logs by project

Usage

project_logs(project, modified = FALSE)

Arguments

`project`	Name of project.
`modified`	Logical, whether to show modification date. Returns a data frame.

Display database table names by project

Description

Display database table names by project

Usage

project_tables(project, ...)

Arguments

`project`	Name of project.
`...`	String, additional characters to match by.

Examples

## Not run: 
project_tables("pollock")
project_tables("pollock", "main")

## End(Not run)

Display project names

Description

Display project names

Usage

projects()

Details

Lists the unique project names currently in the FishSET Database.

Examples

## Not run: 
projects()

## End(Not run) 

Retrieve/display meta data by project

Description

Retrieve/display meta data by project

Usage

pull_meta(project, tab.name = NULL, tab.type = NULL, format = FALSE)

Arguments

`project`	Project name.
`tab.name`	String, table name. Optional, used to filter output to a specific table.
`tab.type`	String, table type. Optional, used to filter output. Options include "main", "spat" (spatial), "port", "grid" (gridded), and "aux" (auxiliary).
`format`	Logical, whether to format output using `pander`. Useful for displaying in reports.

Pull notes from output folder

Description

Pull notes from output folder

Usage

pull_notes(project, date = NULL, output = "print")

Arguments

`project`	String, the project name.
`date`	String, date to pull notes from. If NULL, most recent note file is retrieved.
`output`	Output type. "print" returns formatted notes. "string" returns a character vector of the notes. "print" is recommended for displaying notes in a report.

Details

Notes are saved to the output folder by project name and date. If date is not specified, the most recent notes file with the project name is pulled. Notes are also saved by FishSET app session; if more than one session occurred in the same day, each session's notes are pulled and listed in chronological order.

Retrieve output file name by project, function, and type

Description

Retrieve output file name by project, function, and type

Usage

pull_output(project, fun = NULL, date = NULL, type = "plot", conf = TRUE)

Arguments

`project`	Name of project
`fun`	Name of function.
`date`	Output file date. If `NULL`, the most recent output file is pulled.
`type`	Whether to return the `"plot"` (.png), `"table"` (.csv), `"notes"` (.txt) or `"all"` files matching the project name, function, and date.
`conf`	Logical, whether to return suppressed confidential data. Unsuppressed output will be pulled if suppressed output is not available.

Examples

## Not run: 
pull_output("pollock", "species_catch", type = "plot")

## End(Not run)

Import and format plots to notebook file

Description

Import and format plots to notebook file

Usage

pull_plot(project, fun, date = NULL, conf = TRUE)

Arguments

`project`	Project name.
`fun`	String, the name of the function that created the plot.
`date`	The date the plot was created. If NULL, the most recent version is retrieved.
`conf`	Logical, whether to return suppressed confidential data. Unsuppressed output will be pulled if suppressed output is not available.

Examples

## Not run: 
pull_plot("pollock", "density_plot")

## End(Not run)

Import and format table to notebook file

Description

Import and format table to notebook file

Usage

pull_table(project, fun, date = NULL, conf = TRUE)

Arguments

`project`	Project name.
`fun`	String, the name of the function that created the table.
`date`	The date the table was created. If NULL, the most recent version is retrieved.
`conf`	Logical, whether to return suppressed confidential data. Unsuppressed output will be pulled if suppressed output is not available.

Examples

## Not run: 
pull_table("pollock", "vessel_count")

## End(Not run)

Randomize latitude and longitude points by zone

Description

Randomize latitude and longitude points by zone

Usage

randomize_lonlat_zone(dat, project, spat, lon, lat, zone)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`spat`	Spatial data table containing regulatory zones. This can be a "spatial feature" or sf object.
`lon`	String, variable name containing longitude.
`lat`	String, variable name containing latitude.
`zone`	String, column name containing the assigned zone. Must be the same for both the spatial data table and MainDataTable.

Details

This is one of the FishSET confidentiality functions. It replaces longitude and latitude values with randomly sampled coordinates from the regulatory zone the observation occurred in.
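
The idea can be sketched in base R. This is a simplified illustration, not the FishSET implementation: it assumes each zone is a rectangle described by hypothetical `lon_min`, `lon_max`, `lat_min`, and `lat_max` columns, whereas real regulatory zones are polygons held in an sf object. The helper name `randomize_in_zone` and the example data are also made up for illustration.

```r
# Hypothetical helper: replace coordinates with random points drawn from
# the (rectangular, for simplicity) zone each observation is assigned to.
randomize_in_zone <- function(dat, bounds, lon, lat, zone) {
  i <- match(dat[[zone]], bounds[[zone]])  # row of `bounds` for each observation
  n <- nrow(dat)
  dat[[lon]] <- runif(n, bounds$lon_min[i], bounds$lon_max[i])
  dat[[lat]] <- runif(n, bounds$lat_min[i], bounds$lat_max[i])
  dat
}

# Made-up zones and observations.
zone_bounds <- data.frame(
  zone    = c("Z1", "Z2"),
  lon_min = c(-170, -165), lon_max = c(-165, -160),
  lat_min = c(54, 56),     lat_max = c(56, 58)
)
hauls <- data.frame(
  lon  = c(-168.2, -166.9, -162.4),
  lat  = c(55.1, 54.7, 57.3),
  zone = c("Z1", "Z1", "Z2")
)

set.seed(123)
masked <- randomize_in_zone(hauls, zone_bounds, "lon", "lat", "zone")
```

Each masked point stays inside its original zone, so zone-level summaries are unchanged while exact fishing locations are obscured.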

Examples

## Not run: 
randomize_lonlat_zone(pollockMainDataTable, "pollock", spatdat, 
                   lon = "LonLat_START_LON", lat = "LonLat_START_LAT",
                   zone = "NMFS_AREA")

## End(Not run)

Randomize variable value by percentage range

Description

Randomize variable value by percentage range

Usage

randomize_value_range(dat, project, value, perc = NULL)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`value`	String, name of variable to jitter.
`perc`	Numeric, a vector of percentages to randomly adjust a column of values by. Defaults to a range of 0.05 - 0.15 (i.e. 5-15 percent of original value).

Details

This is one of the FishSET confidentiality functions. It adjusts a value by adding or subtracting (chosen at random for each value) a percentage of the value. The percentage is randomly sampled from a range of percentages provided in the "perc" argument.
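
A minimal base R sketch of this kind of adjustment is shown below. It is not the FishSET code: the helper name `jitter_by_percent` is hypothetical, and drawing the percentage uniformly between the smallest and largest value of `perc` is an assumption about how the range is sampled.

```r
# Hypothetical helper: add or subtract a random percentage of each value.
jitter_by_percent <- function(x, perc = c(0.05, 0.15)) {
  n <- length(x)
  # Draw one percentage per value from the supplied range (assumed uniform).
  p <- runif(n, min = min(perc), max = max(perc))
  # Choose at random, per value, whether to add or subtract.
  direction <- sample(c(-1, 1), n, replace = TRUE)
  x + direction * p * x
}

set.seed(1)
catch <- c(1000, 2500, 400)
jittered <- jitter_by_percent(catch)

# Relative change of each value; with the default range it lies
# between 5 and 15 percent.
abs(jittered - catch) / catch
```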

Examples

## Not run: 
randomize_value_range(pollockMainDataTable, "pollock", "LBS_270_POLLOCK_LBS")

## End(Not run) 

Randomize value between rows

Description

Randomize value between rows

Usage

randomize_value_row(dat, project, value)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Project name.
`value`	String, variable name to be randomly distributed between rows.

Details

This is one of the FishSET confidentiality functions. It is useful for randomly assigning ID values between observations.
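
Conceptually, this amounts to permuting one column while leaving the rest of the table in place. The base R sketch below illustrates that idea with a hypothetical helper, `shuffle_between_rows`, and made-up data; the actual function may differ in detail.

```r
# Hypothetical helper: randomly reassign the values of one column between rows.
shuffle_between_rows <- function(dat, value) {
  dat[[value]] <- sample(dat[[value]])  # random permutation of the column
  dat
}

set.seed(42)
trips <- data.frame(
  PERMIT = c("A1", "B2", "C3", "D4"),
  catch  = c(10, 20, 30, 40)
)
shuffled <- shuffle_between_rows(trips, "PERMIT")
```

The set of ID values is preserved, but the link between a given ID and its observation is broken.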

Examples

## Not run: 
randomize_value_row(pollockMainDataTable, "pollock", "PERMIT")

## End(Not run)

Import data from local file directory or webpage into the R environment

Description

Import data from local file directory or webpage into the R environment

Usage

read_dat(
  x,
  data.type = NULL,
  is.map = FALSE,
  drv = NULL,
  dbname = NULL,
  user = NULL,
  password = NULL,
  ...
)

Arguments

`x`	Name and path of dataset to be read in. To load data directly from a webpage, `x` should be the web address.
`data.type`	Optional. Data type can be defined by the user or derived from the file extension. If undefined, `data.type` is the string after the last period or equal sign in `x`. `data.type` must be defined if `x` is the path to a shape folder (`data.type = 'shape'`), if the file is a Google spreadsheet (`data.type = 'google'`), or if the correct extension cannot be derived from `x`. R, comma-delimited, tab-delimited, Excel, Matlab, json, geojson, sas, spss, stata, html, and XML data extensions do not have to be specified.
`is.map`	Logical. For the .json file extension, set `is.map = TRUE` if data is a spatial file. Spatial files ending in .json will not be read in properly unless `is.map = TRUE`.
`drv`	Use with sql files. Database driver.
`dbname`	Use with sql files. If required, database name.
`user`	Use with sql files. If required, user name for SQL database.
`password`	Use with sql files. If required, SQL database password.
`...`	Optional arguments

Details

Uses the appropriate function to read in data based on data type. Use write_dat to save data to the data folder in the project directory. Supported data types include shape, csv, json, matlab, R, spss, and stata files. Use data.type = 'shape' if x is the path to a shape folder. Use data.type = 'google' if the file is a Google spreadsheet.

For sql files, use data.type = 'sql'. The function will connect to the specified DBI and pull the table. Users must specify the DBI driver (drv), for example: RSQLite::SQLite(), RPostgreSQL::PostgreSQL(), odbc::odbc(). Further arguments may be required, including database name (dbname), user id (user), and password (password).

Additional arguments can be added, such as skip = 2 to skip lines and header = FALSE to indicate that there is no header row. To specify the separator argument for a delimited file, including tab-delimited files, use data.type = 'delim'.
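
The effect of these reader arguments can be seen with base R alone. The sketch below does not call read_dat; it writes a small made-up tab-delimited file to a temporary path and reads it with read.delim, using the same kind of `skip`, `header`, and `sep` arguments described above.

```r
# Write a small tab-delimited file with two junk lines and no header row.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "made-up export banner",
  "created 2020-04-10",
  "101\t2011-04-22\t350",
  "102\t2011-04-23\t410"
), path)

# Skip the two leading lines, treat the first data line as data rather than
# a header, and split fields on tabs.
dat <- read.delim(path, sep = "\t", skip = 2, header = FALSE)
str(dat)

unlink(path)  # remove the temporary file
```

Because no header is read, the columns receive the default names V1, V2, and V3.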

For more details, see load for loading R objects, read_csv for comma separated value files, read_tsv for tab separated value files, read_delim for delimited files, read_excel for Excel files (xls, xlsx), st_read for geojson, GeoPackage, and shape files, readMat for Matlab data files, read_dta for Stata data files, read_spss for SPSS data files, read_sas for SAS data files, fromJSON for json files, read_xml for XML files (further processing may be required), read_html for html tables, and read_ods for open document spreadsheets. See read_sheet in range_read for reading in Google spreadsheets; these require data.type = 'google'.

Examples

## Not run: 
# Read in shape file
dat <- read_dat('C:/data/nmfs_manage_simple', data.type = 'shape')

# Read in spatial data file in json format
dat <- read_dat('C:/data/nmfs_manage_simple.json', is.map = TRUE)

# read in data directly from web page
dat <- read_dat("https://s3.amazonaws.com/assets.datacamp.com/blog_assets/test.txt", 
                data.type = 'delim', sep = '', header = FALSE)

## End(Not run)

Remove a model design from list in ModelInputData table

Description

Remove a model design from list in ModelInputData table

Usage

remove_model_design(project, names)

Arguments

`project`	Name of project.
`names`	Names of model designs to be deleted from the table

Replace suppression code

Description

This function replaces the default suppression code in a table.

Usage

replace_sup_code(output, code = NA)

Arguments

`output`	Table containing suppressed values.
`code`	The replacement suppression code. `code = NA` by default; this is ideal for plotting as ggplot automatically removes NAs.

Details

Suppressed values are represented as '-999' by default. This isn't ideal for plotting. NAs, the default in 'replace_sup_code()', are a better alternative for plots as they can easily be removed.
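
The base R sketch below shows why NA is the more convenient code. The helper `replace_code` is hypothetical and only handles numeric columns; it is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical helper: swap a numeric suppression code for another value.
replace_code <- function(tab, old = -999, new = NA) {
  tab[] <- lapply(tab, function(col) {
    if (is.numeric(col)) col[!is.na(col) & col == old] <- new
    col
  })
  tab
}

# Made-up summary table with one suppressed cell.
summary_tab <- data.frame(
  zone  = c("A", "B", "C"),
  catch = c(120, -999, 80)
)
clean_tab <- replace_code(summary_tab)

mean(summary_tab$catch)              # distorted by the -999 code
mean(clean_tab$catch, na.rm = TRUE)  # suppressed cell is simply dropped
```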

Examples

## Not run: 
summary_tab <- replace_sup_code(summary_tab, code = NA)

## End(Not run)

Reset confidentiality cache tables

Description

This function deletes all confidentiality check tables stored in the "confid_cache.json" file located in the project output folder. Resetting this cache is recommended after a long period of use as check tables can accumulate over time.

Usage

reset_confid_cache(project)

Arguments

`project`	Project name.

Apply moving average function to catch

Description

Apply moving average function to catch

Usage

roll_catch(
  dat,
  project,
  catch,
  date,
  group = NULL,
  combine = FALSE,
  k = 10,
  fun = "mean",
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  scale = "fixed",
  align = "center",
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  output = "tab_plot",
  ...
)

Arguments

`dat`	Primary data frame over which to apply function. Table in FishSET database should contain the string 'MainDataTable'.
`project`	Name of project.
`catch`	Variable name or names containing catch data. Multiple variables can be entered as a vector.
`date`	Date variable to aggregate by.
`group`	Variable name or names to group by. Plot will display up to two grouping variables.
`combine`	Whether to combine variables listed in `group`. This is passed to the "fill" or "color" aesthetic for plots.
`k`	The width of the window.
`fun`	The function to be applied to window. Defaults to `mean`.
`filter_date`	The type of filter to apply to table. The "date_range" option will subset the data by two date values entered in `date_value`. Other options include "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	String containing a start and end date if using filter_date = "date_range", e.g. c("2011-01-01", "2011-03-15"). If filter_date = "period" or "year-period", use integers (4 digits if year, 1-2 if day, month, or week). Use a list if using a two-part filter, e.g. "year-week", with the format `list(year, period)` or a vector if using a single period, `c(period)`. For example, `list(2011:2013, 5:7)` will filter the data table from weeks 5 through 7 for years 2011-2013 if filter_date = "year-week". `c(2:5)` will filter the data February through May when filter_date = "month".
`filter_by`	String, variable name to filter by.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. This can be a variable that exists in the dataset, or a variable created by `roll_catch()` such as `"year"`, `"month"`, or `"species"` if more than one variable is entered in `catch`.
`scale`	Scale argument passed to `facet_grid`. Options include `"free"`, `"free_x"`, `"free_y"`. Defaults to `"fixed"`.
`align`	Indicates whether results of window should be left-aligned (`"left"`), right-aligned (`"right"`), or centered (`"center"`). Defaults to `"center"`.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function. Defaults to `"none"`.
`tran`	A function to transform the y-axis. Options include log, log2, log10, sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`output`	Whether to display `"plot"`, `"table"`, or both. Defaults to both (`"tab_plot"`).
`...`	Additional arguments passed to `rollapply`
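
The roles of `k` and `align` can be illustrated with a small base R sketch. The helper `rolling_mean` below is hypothetical and is not how roll_catch computes its result (the package passes arguments to `rollapply`); it only shows how the window width and alignment determine which observations enter each average. Window positions that run past the ends of the series are returned as NA, and only odd `k` is considered for the centered case.

```r
# Hypothetical helper: moving average with a window of width k.
rolling_mean <- function(x, k = 3, align = c("center", "left", "right")) {
  align <- match.arg(align)
  n <- length(x)
  out <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    # Indices covered by the window for observation i.
    idx <- switch(align,
      left   = i:(i + k - 1),
      right  = (i - k + 1):i,
      center = (i - (k - 1) %/% 2):(i + k %/% 2)
    )
    # Only average complete windows.
    if (min(idx) >= 1 && max(idx) <= n) out[i] <- mean(x[idx])
  }
  out
}

daily_catch <- c(10, 20, 30, 40, 50)
rolling_mean(daily_catch, k = 3)                   # NA 20 30 40 NA
rolling_mean(daily_catch, k = 3, align = "right")  # NA NA 20 30 40
```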

Examples

## Not run: 
roll_catch(pollockMainDataTable, project = "pollock", catch = "LBS_270_POLLOCK_LBS",
  date = "FISHING_START_DATE", group = "GEAR_TYPE", k = 15
)

roll_catch(pollockMainDataTable, project = "pollock", catch = c("LBS_270_POLLOCK_LBS", 
 "LBS_110_PACIFIC_COD_LBS"), date = "FISHING_START_DATE", group = "GEAR_TYPE", k = 5, 
 filter_date = "month", date_value = 4:6, facet_by = "month", conv = "tons"
)

## End(Not run)

Guided user interface for FishSET functions

Description

Runs functions associated with loading data, exploring data, checking for data quality issues, generating new variables, and basic data analysis.

Usage

run_fishset_gui()

Details

Opens an interactive page that allows users to select which functions to run by clicking check boxes. Data can be modified and saved. Plot and table output are saved to the output folder. Function calls are logged in the log file.

Examples

## Not run: 
run_fishset_gui()

## End(Not run)

Runs policy scenarios

Description

Estimate redistributed fishing effort and welfare loss/gain from changes in policy or change in other factors that influence fisher location choice.

Usage

run_policy(
  project,
  mod.name = NULL,
  policy.name = NULL,
  betadraws = 1000,
  marg_util_income = NULL,
  income_cost = NULL,
  zone.dat = NULL,
  group_var = NULL,
  enteredPrice = NULL,
  expected.catch = NULL,
  use.scalers = FALSE,
  scaler.func = NULL
)

Arguments

`project`	Name of project
`mod.name`	Model name. Argument can be the name of the model, or the name can be pulled from the 'modelChosen' table. Leave `mod.name` empty to use the name of the saved 'best' model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`policy.name`	List of policy scenario names created in zone_closure function
`betadraws`	Integer indicating the number of times to run the welfare simulation. Default value is `betadraws = 1000`
`marg_util_income`	For conditional and zonal logit models. Name of the coefficient to use as marginal utility of income.
`income_cost`	For conditional and zonal logit models. Logical indicating whether the coefficient for the marginal utility of income relates to cost (`TRUE`) or revenue (`FALSE`).
`zone.dat`	Variable in primary data table that contains unique zone ID.
`group_var`	Categorical variable from primary data table to group welfare outputs.
`enteredPrice`	Price data. Leave as NULL if using price data from primary dataset.
`expected.catch`	Required for conditional logit (`logit_c`) model. Name of expected catch table to use. Can be the expected catch from the short-term scenario (`short`), the medium-term scenario (`med`), the long-term scenario (`long`), or the user-defined temporal parameters (`user`).
`use.scalers`	Input for `create_model_input()`. Logical, should data be normalized? Defaults to `FALSE`. Rescaling factors are the mean of the numeric vector unless specified with `scaler.func`.
`scaler.func`	Input for `create_model_input()`. Function to calculate rescaling factors.

Details

run_policy is a wrapper function for model_prediction and welfare_predict. model_prediction estimates redistributed fishing effort after policy changes, and welfare_predict simulates welfare loss/gain.

Save modified primary data table to FishSET database

Description

Save modified primary data table to FishSET database

Usage

save_dat(dat, project)

Arguments

`dat`	Name of data frame in working environment to save to FishSET database.
`project`	String, name of project.

Details

Use function to save modified data to the FishSET database. The primary data is only saved automatically in data upload and data check functions. It is therefore advisable to save the modified data to the database before moving on to modeling functions. Users should use primary data in the working environment for assessing data quality issues, modifying the data, and generating new variables. Pulling the primary data from the FishSET database on each function without manually saving will result in a loss of changes.

Examples

## Not run: 
save_dat(pollockMainDataTable, 'pollock')

## End(Not run)

Save a meta data file to project folder

Description

Raw (i.e. original or pre-existing) meta data can be saved to the project folder. To add additional meta data (e.g. column descriptions), see ...

Usage

save_raw_meta(
  file,
  project,
  dataset = NULL,
  tab.name = NULL,
  tab.type,
  parse = FALSE,
  overwrite = FALSE,
  ...
)

Arguments

`file`	String, file path.
`project`	Project name.
`dataset`	Optional, the data.frame associated with the meta data. Used to add column names to meta file.
`tab.name`	The table name as it appears in the FishSET Database (e.g. "projectMainDataTable" if the main table).
`tab.type`	The table type. Options include "main", "spat" (spatial), "port", "grid" (gridded), and "aux" (auxiliary).
`parse`	Logical, whether to parse meta data from a data file. See `parse_meta`.
`overwrite`	Logical, whether to overwrite existing meta table entry.
`...`	Additional arguments passed to `parse_meta`.

Northeast Scallop Data

Description

A subset of anonymized scallop data

Usage

scallop

Format

'scallop' A data.frame with 10,000 rows and 31 columns:

TRIPID: Randomly assigned trip ID number.
DATE_TRIP: Date of landing.
PERMIT.y: Randomly assigned six-digit vessel fishing permit number.
TRIP_LENGTH: Days calculated from the elapsed time between the date-time sailed and date-time landed; this is a measure of days absent.
GEARCODE: Fishing gear used on the trip.
port_lat: Latitude of the geoid.
port_lon: Longitude of the geoid.
previous_port_lat: Previous latitude of geoid.
previous_port_lon: Previous longitude of geoid.
Plan Code: Portion of the VMS declaration code that identifies the fishery being declared into for the trip.
Program Code: Portion of the VMS declaration code that identifies the program within the declared fishery. For scallops, the program code delineates LA and LAGC trips, as well as access area trips from other trips.
TRIP_COST_WINSOR_2020_DOL: The estimated or real composite trip cost for the VTR trip record generated using the methods described in the Commercial Trip Cost Estimation 2007-2019 PDF file. However, these values have been Winsorized by gear type as a method of avoiding unreasonably high or low trip costs, replacing any value within each gear-group that is less than the 1st percentile or greater than the 99th percentile with the 1st and 99th percentile value, respectively.
DDLAT: The latitude reported on a VTR (Vessel Trip Reports).
DDLON: The longitude reported on a VTR (Vessel Trip Reports).
NAME: Name of wind lease which is found within a given ten minute square.
ZoneID: FishSET's version of a ten minute square.
POUNDS: Live pounds.
LANDED: Landed pounds from the dealer report.
LANDED_OBSCURED: Landed pounds from the dealer report (jittered/obscured).
DOLLAR_OBSCURED: The value of catch paid by the dealer, from the dealer report (jittered/obscured).
DOLLAR_2020_OBSCURED: The value of catch paid by the dealer, from the dealer report (in 2020 dollars, jittered/obscured).
DOLLAR_ALL_SP_2020_OBSCURED: The value of catch for all species caught (in 2020 dollars, jittered/obscured).

Source

Add source here

Ports from the NE scallop fishery

Description

A dataset containing the names and lat/lon coordinates of ports used in the US northeast scallop fishery.

Usage

scallop_ports

Format

A data frame (tibble) with 40 observations and 3 variables.

[,1] Port names
[,2] Longitude
[,3] Latitude

Source

NEED TO ADD SOURCE DESCRIPTION

Create single binary fishery season identifier variable

Description

Create single binary fishery season identifier variable

Usage

seasonalID(
  dat,
  project,
  seasonal.dat = NULL,
  start,
  end,
  overlap = FALSE,
  name = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`seasonal.dat`	Data table containing date of fishery season(s). Data table can be pulled from the FishSET database. Leave `seasonal.dat` as NULL if supplying start and end dates with `start` and `end` arguments.
`start`	Date, supplied as a string (example: start='2011/04/22', start='04222011'), or variable in `seasonal.dat` which identifies the start date of the fishery season.
`end`	Date, supplied as a string (example: end='2011/04/22', end='04222011'), or variable in `seasonal.dat` which identifies the end date of the fishery season.
`overlap`	Logical. Should trips or hauls that start before or end after the fishery season, but start or end within it, be included? FALSE includes only hauls/trips that fall completely within the bounds of a fishery season. Defaults to FALSE.
`name`	String, seasonal identifier name.

Details

Uses supplied dates or a table of fishery season dates to create fishery season identifier variables. Output is a binary variable called name or 'SeasonID' if name is not supplied.

For each row in dat, the function matches fishery season dates provided in seasonal.dat to the earliest date variable in dat.

Value

Returns a binary variable of within (1) or outside (0) the fishery season.
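
The returned indicator can be pictured with a short base R sketch. The helper `in_season` is hypothetical and much simpler than seasonalID: it assumes a single date per observation and a single season, and it ignores the `overlap` option and the season table lookup.

```r
# Hypothetical helper: 1 if a date falls within the season, 0 otherwise.
in_season <- function(dates, start, end) {
  dates <- as.Date(dates)
  as.integer(dates >= as.Date(start) & dates <= as.Date(end))
}

# Made-up haul dates around a season running 2011-04-15 to 2011-06-30.
haul_dates <- c("2011-04-10", "2011-04-22", "2011-06-30", "2011-07-01")
SeasonID <- in_season(haul_dates, start = "2011-04-15", end = "2011-06-30")
SeasonID  # 0 1 1 0
```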

Examples

## Not run: 
#Example using a table stored in the FishSET database
pcodMainDataTable <- seasonalID("pcodMainDataTable", 'pcod', seasonal.dat='seasonTable', 
     start='SeasonStart', end='SeasonEnd', name='2001A')
#Example using manually entered dates
pcodMainDataTable <- seasonalID("pcodMainDataTable", 'pcod', seasonal.dat=NULL, 
    start='04152011', end='06302011', name='2001A')

## End(Not run)

View model metrics and record best model interactively

Description

Model metrics are displayed as a table in an R Shiny application. Check boxes next to models allow users to record preferred or best model.

Usage

select_model(project, overwrite_table = FALSE)

Arguments

`project`	String, name of project.
`overwrite_table`	Logical, should best model table be written over? If table exists and value is FALSE, appends new results to existing table. Defaults to FALSE.

Details

Opens an interactive data table that displays model measures of fit for each model run saved in the model measures of fit table in the FishSET database. The name of this table should contain the string 'out.mod'. Users can delete models from the table and select the preferred model by checking the "selected" box. The table is then saved to the FishSET database with two new columns added, a TRUE/FALSE selected column and the date it was selected. The table is saved with the phrase 'modelChosen' in the FishSET database. The function can also be called indirectly in the discretefish_subroutine by specifying the select.model argument as TRUE. The 'modelChosen' table is not used in any functions. The purpose of this function and the 'modelChosen' table is to save a reference of the preferred model.

Examples

## Not run: 
select_model("pollock", overwrite_table = FALSE)

## End(Not run)

Interactive application to select variables to include/exclude in primary dataset

Description

Opens an R Shiny web application. Use the application to select the variables in the primary dataset that should be retained.

Usage

select_vars(dat, project)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.

Details

Opens an interactive table that allows users to select which variables to include by clicking check boxes. Data should be loaded into the FishSET database before running this function. Select variables that will be used to generate further variables, such as rates or cpue, and variables to be included in models. Removed variables can be added back into the dataset at a later date using the add_vars function.

Examples

## Not run: 
select_vars(pcodMainDataTable, "pcod")

## End(Not run)

Set confidentiality parameters

Description

This function specifies whether to check for confidentiality and which rule should be applied.

Usage

set_confid_check(project, check = TRUE, v_id = NULL, rule = "n", value = NULL)

Arguments

`project`	Name of project.
`check`	Logical, whether to check for confidentiality.
`v_id`	String, the column name containing the vessel identifier.
`rule`	String, the confidentiality rule to apply. See "Details" below. `rule = "n"` suppresses values containing fewer than n vessels. `rule = "k"` suppresses values where a single vessel contains k percent or more of the total catch.
`value`	The threshold for confidentiality. For `rule = "n"`, must be an integer of at least 2. For `rule = "k"`, any numeric value from 0 to 100.

Details

rule = "n" counts the number of vessels in each stratum and suppresses values where fewer than n vessels are present. For rule = "k", the "majority allocation rule", each vessel's share of catch is calculated by stratum. If any vessel's total catch share is greater than or equal to k percent, the value is suppressed.
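
Both rules can be sketched in base R on a small made-up table. The helpers `rule_n` and `rule_k` and the column names below are hypothetical; they illustrate the logic described above rather than reproduce the package's checks. Each returns a logical value per stratum, where TRUE means the stratum would be suppressed.

```r
# Made-up catch records: one row per vessel landing in a zone (the stratum).
vessel_catch <- data.frame(
  zone  = c("A", "A", "A", "B", "B", "C", "C", "C"),
  v_id  = c("v1", "v2", "v3", "v1", "v4", "v2", "v3", "v5"),
  catch = c(10, 20, 30, 95, 5, 40, 30, 30)
)

# Rule "n": flag strata with fewer than n distinct vessels.
rule_n <- function(dat, n = 3) {
  counts <- tapply(dat$v_id, dat$zone, function(v) length(unique(v)))
  counts < n
}

# Rule "k": flag strata where any single vessel holds k percent or more
# of the stratum's total catch.
rule_k <- function(dat, k = 90) {
  by_vessel <- aggregate(catch ~ zone + v_id, data = dat, FUN = sum)
  max_share <- tapply(by_vessel$catch, by_vessel$zone,
                      function(x) 100 * max(x) / sum(x))
  max_share >= k
}

rule_n(vessel_catch, n = 3)   # A FALSE, B TRUE, C FALSE
rule_k(vessel_catch, k = 90)  # A FALSE, B TRUE, C FALSE
```

Zone B is flagged by both rules here: it has only two vessels, and one of them accounts for 95 percent of the catch.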

Examples

## Not run: 
set_confid_check("pollock", check = TRUE, v_id = "PERMIT", rule = "n", value = 3L)

## End(Not run)

Create factor variable from quantiles

Description

Create a factor variable from numeric data. Numeric variable is split into categories based on quantile categories.

Usage

set_quants(
  dat,
  project,
  x,
  quant.cat = c(0.1, 0.2, 0.25, 0.33, 0.4),
  custom.quant = NULL,
  name = "set_quants"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Variable to transform into quantiles.
`quant.cat`	Quantile options: `0.1`, `0.2`, `0.25`, `0.33`, and `0.4`. 0.1: (0%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%); 0.2: (0%, 20%, 40%, 60%, 80%, 100%); 0.25: (0%, 25%, 50%, 75%, 100%); 0.33: (0%, 33%, 66%, 100%); 0.4: (0%, 10%, 50%, 90%, 100%).
`custom.quant`	Vector, user defined quantiles.
`name`	String, name of created vector. Defaults to name of the function if not defined.

Value

Primary dataset with quantile variable added.
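
The underlying idea, cutting a numeric variable at its quantiles, can be sketched with base R's `quantile` and `cut`. The helper `quantile_factor` and the simulated data are hypothetical, and the labels produced here are whatever `cut` generates, which may differ from the labels set_quants uses.

```r
# Hypothetical helper: turn a numeric vector into a factor of quantile bins.
quantile_factor <- function(x, probs = seq(0, 1, by = 0.25)) {
  breaks <- quantile(x, probs = probs, na.rm = TRUE)
  # unique() guards against repeated break points in heavily tied data.
  cut(x, breaks = unique(breaks), include.lowest = TRUE)
}

set.seed(7)
haul <- runif(100, min = 0, max = 500)  # made-up numeric variable
haul.quant <- quantile_factor(haul)     # quartile bins, as with quant.cat = 0.25

table(haul.quant)
```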

Examples

## Not run: 
pollockMainDataTable <- set_quants(pollockMainDataTable, 'pollock', 'HAUL', 
   quant.cat = .2, name = 'haul.quant')

## End(Not run)
## End(Not run)

Set user folder directory

Description

Set user folder directory

Usage

set_user_locoutput(loc_dir, project)

Arguments

`loc_dir`	Local user directory
`project`	Name of project.

Details

This function saves the local user directory to the project settings file with a valid folder directory. This directory path is used for inserting plots and tables from a folder outside the FishSET package into the FishSET RMarkdown Template.

shift_sort_x

Description

Shifts choices so that the chosen zone will be automatically the first one for choice possibilities and distances.

Usage

shift_sort_x(x, ch, y, distance, alts, ab)

Arguments

`x`	Matrix of choice possibilities from `create_logit_input`.
`ch`	Data corresponding to actual zonal choice.
`y`	Data corresponding to actual catch.
`distance`	Data corresponding to distance.
`alts`	Number of alternative choices in model.
`ab`	Number of cost parameters + number of alts.

Value

d: matrix of choice possibilities and distance. IMPORTANT NOTE: both choice probabilities AND distances are sorted even though the column names for distances remain unchanged.

Evaluate sparsity in data over time in table format

Description

Create table of data sparsity by predefined time periods.

Usage

sparsetable(dat, project, timevar, zonevar, var)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`timevar`	Variable in `dat` containing temporal data
`zonevar`	Variable in `dat` containing the zone each observation is assigned to.
`var`	Variable in `dat` containing catch data

Evaluate sparsity in data over time in plot format

Description

Evaluate sparsity in data over time in plot format

Usage

sparsplot(project, x = NULL)

Arguments

`project`	String, name of project.
`x`	Output from `sparsetable`. If `x` is null, the sparsity table will be pulled from the output folder if it exists.

Details

Returns a plot of sparsity values over time. Requires sparsity table generated by sparsetable.

GUI for spatial data checks

Description

Runs the spatial checks performed by spatial_qaqc in a shiny application.

Usage

spat_qaqc_gui(dataset, project, spatdat, checks = NULL)

Arguments

`dataset`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project.
`spatdat`	Spatial data containing information on fishery management or regulatory zones. See `read_dat` for details on importing spatial data.
`checks`	(Optional) A list of spatial data quality checks outputted by `spatial_qaqc`.

Histogram of latitude and longitude by grouping variable

Description

Histogram of latitude and longitude by grouping variable

Usage

spatial_hist(dat, project, group = NULL)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`group`	Column in `dat` containing grouping categories.

Details

Returns a histogram of observed lat/lon split by grouping variable. Output printed to console and saved to Output folder. Function is used to assess spatial variance/clumping of selected grouping variable.

Value

Returns histogram of latitude and longitude by grouping variable. Output returned to console and saved to Output folder.

Examples

## Not run: 
spatial_hist(pollockMainDataTable, 'pollock', 'GEAR_TYPE')

## End(Not run)

Spatial data quality checks

Description

This function performs spatial quality checks and outputs summary tables and plots. Checks include percent of observations on land, outside regulatory zone (spat), and on a zone boundary. If any observation occurs outside the regulatory zones then summary information on distance from nearest zone is provided. spatial_qaqc can filter out observations that are not within the distance specified in filter_dist.

Usage

spatial_qaqc(
  dat,
  project,
  spat,
  lon.dat,
  lat.dat,
  lon.spat = NULL,
  lat.spat = NULL,
  id.spat = NULL,
  epsg = NULL,
  date = NULL,
  group = NULL,
  filter_dist = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project.
`spat`	Spatial data containing information on fishery management or regulatory zones. `sf` objects are recommended, but `sp` objects can be used as well. If using a spatial table read from a csv file, then arguments `lon.spat` and `lat.spat` are required. To upload your spatial data to the FishSETFolder see `load_spatial`.
`lon.dat`	Longitude variable in `dat`.
`lat.dat`	Latitude variable in `dat`.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for spatial tables read from csv files. Leave as `NULL` if `spat` is an `sf` or `sp` object.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for spatial tables read from csv files. Leave as `NULL` if `spat` is an `sf` or `sp` object.
`id.spat`	Polygon ID column. Required for spatial tables read from csv files. Leave as `NULL` if `spat` is an `sf` or `sp` object.
`epsg`	EPSG number. Manually set the epsg code, which will be applied to `spat` and `dat`. If epsg is not specified but is defined for `spat`, then the `spat` epsg will be applied to `dat`. In addition, if epsg is not specified and epsg is not defined for `spat`, then a default epsg value will be applied to `spat` and `dat` (`epsg = 4326`). See http://spatialreference.org/ to help identify optimal epsg number.
`date`	String, name of date variable. Used to summarize over year. If `NULL` the first date column will be used. Returns an error if no date columns can be found.
`group`	String, optional. Name of variable to group spatial summary by.
`filter_dist`	(Optional) Numeric, distance value to filter primary data by (in meters). Rows containing distance values greater than or equal to `filter_dist` will be removed from the data. This action will be saved to the filter table.

Value

A list of plots and/or dataframes depending on whether spatial data quality issues are detected. The list includes:

dataset: Primary data. Up to five columns will be added if spatial issues are found: the logical columns "ON_LAND" (if obs fall on land), "OUTSIDE_ZONE" (if obs occur at sea but outside zone), "ON_ZONE_BOUNDARY" (if obs occur on zone boundary), and "EXPECTED_LOC" (whether obs occur at sea, within a zone, and not on zone boundary), plus "NEAREST_ZONE_DIST_M" (distance in meters from nearest zone; applies only to obs outside zone or on land).
spatial_summary: Dataframe containing the percentage of observations that occur at sea and within zones, on land, outside zones but at sea, or on zone boundary by year and/or group. The total number of observations by year/group are in the "N" column.
outside_plot: Plot of observations outside regulatory zones.
land_plot: Plot of observations that fall on land.
land_out_plot: Plot of observations that occur on land and are outside the regulatory zones (combines outside_plot and land_plot if both occur).
boundary_plot: Plot of observations that fall on zone boundary.
expected_plot: Plot of observations that occur at sea and within zones.
distance_plot: Histogram of distance from nearest zone (meters) by year for observations that are outside regulatory grid.
distance_freq: Binned frequency table of distance values.
distance_summary: Dataframe containing the minimum, 1st quartile, median, mean, 3rd quartile, and maximum distance values by year and/or group.
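
As a rough illustration of how the added columns and the filter_dist rule fit together, the sketch below builds a mock stand-in for the returned dataset and applies the "greater than or equal to" rule by hand. All values are invented, and keeping rows with a missing distance is an assumption made for this sketch, not documented behavior.

```r
# Mock stand-in for spat_out$dataset; every value is invented.
dataset <- data.frame(
  HAUL = 1:5,
  ON_LAND = c(FALSE, TRUE, FALSE, FALSE, FALSE),
  OUTSIDE_ZONE = c(FALSE, FALSE, TRUE, TRUE, FALSE),
  ON_ZONE_BOUNDARY = c(FALSE, FALSE, FALSE, FALSE, TRUE),
  NEAREST_ZONE_DIST_M = c(NA, 250, 40, 1200, NA)
)

# EXPECTED_LOC: at sea, within a zone, and not on a zone boundary.
dataset$EXPECTED_LOC <- !(dataset$ON_LAND |
                            dataset$OUTSIDE_ZONE |
                            dataset$ON_ZONE_BOUNDARY)

filter_dist <- 100  # meters

# Drop rows whose distance is >= filter_dist. Rows with no distance
# (NA) are kept here, which is an assumption of this sketch.
too_far <- !is.na(dataset$NEAREST_ZONE_DIST_M) &
  dataset$NEAREST_ZONE_DIST_M >= filter_dist
kept <- dataset[!too_far, ]
kept$HAUL
```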

Examples

## Not run: 
# run spatial checks
spatial_qaqc("pollockMainDataTable", "pollock", spat = NMFS_AREAS, 
             lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT")
             
# filter obs by distance
spat_out <- 
     spatial_qaqc(pollockMainDataTable, "pollock", spat = NMFS_AREAS,
                  lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT",
                  filter_dist = 100)
mod.dat <- spat_out$dataset

## End(Not run)
  

Summarize variable over data and time

Description

View summary and exploratory statistics of selected variable by date and zone.

Usage

spatial_summary(
  dat,
  project,
  stat.var = c("length", "no_unique_obs", "perc_total", "mean", "median", "min", "max",
    "sum"),
  variable,
  spat,
  lon.spat = NULL,
  lat.spat = NULL,
  lon.dat = NULL,
  lat.dat = NULL,
  cat
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`stat.var`	Options are `"length"`, `"no_unique_obs"`, `"perc_total"`, `"mean"`, `"median"`, `"min"`, `"max"`, and `"sum"`.
`variable`	Variable in `dat` to summarize over date and zone.
`spat`	Spatial data containing information on fishery management or regulatory zones. Shape, json, geojson, and csv formats are supported. Leave as NULL if the variable ‘ZoneID’ assigning observations to zones exists in `dat`.
`lon.spat`	Variable or list from `spat` containing longitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file or if the variable ‘ZoneID’ exists in `dat`.
`lat.spat`	Variable or list from `spat` containing latitude data. Required for csv files. Leave as NULL if `spat` is a shape or json file, or if the variable ‘ZoneID’ exists in `dat`.
`lon.dat`	Longitude variable in `dat`. Leave as NULL if the variable ‘ZoneID’ (zonal assignment) exists in `dat`.
`lat.dat`	Latitude variable in `dat`. Leave as NULL if the variable ‘ZoneID’ (zonal assignments) exists in `dat`.
`cat`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `cat` should be name of list containing information on zones. Leave as NULL if the variable ‘ZoneID’ exists in `dat`.

Details

stat.var details:

length:	Number of observations
no_unique_obs:	Number of unique observations
perc_total:	Percent of total observations
mean:	Mean
median:	Median
min:	Minimum
max:	Maximum
sum:	Sum
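
The stat.var options listed above can be approximated in base R as follows. This is only a sketch of what each option name suggests, applied per zone to an invented data frame; `haul`, `stat_funs`, and `by_zone` are hypothetical names, and the exact calculations used by spatial_summary may differ.

```r
# Invented data: five observations already assigned to two zones.
haul <- data.frame(
  ZoneID = c("A", "A", "A", "B", "B"),
  HAUL = c(10, 10, 30, 5, 15)
)

# One function per stat.var option (assumed definitions).
stat_funs <- list(
  length = function(x) as.numeric(length(x)),
  no_unique_obs = function(x) as.numeric(length(unique(x))),
  perc_total = function(x) 100 * length(x) / nrow(haul),
  mean = mean,
  median = median,
  min = min,
  max = max,
  sum = sum
)

# Rows are zones, columns are statistics.
by_zone <- sapply(stat_funs, function(f) {
  vapply(split(haul$HAUL, haul$ZoneID), f, numeric(1))
})
by_zone
```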

Value

Returns two plots, the variable aggregated by stat.var plotted against date and against zone.

Examples

## Not run: 
# Example where ZoneID exists in dataset
   spatial_summary(pcodMainDataTable, project = 'pcod', 
      stat.var = "no_unique_obs", variable = 'HAUL')

# Example where obs. have not been assigned to zones
# ('MidLon' is an assumed name for the longitude column in dat)
    spatial_summary(pcodMainDataTable, project = 'pcod', stat.var = "no_unique_obs",
       variable = 'HAUL', spat = spatdat, lon.dat = 'MidLon', lat.dat = 'MidLat',
       cat = 'NMFS_AREA')

## End(Not run)

Summarize species catch

Description

species_catch summarizes catch (or other numeric variables) in the main table. It can summarize by period (if date is provided) and by grouping variables, and it can filter by period or value. There are several options for customizing the table and plot output.

Usage

species_catch(
  dat,
  project,
  species,
  date = NULL,
  period = NULL,
  fun = "sum",
  group = NULL,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  type = "bar",
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  value = "count",
  position = "stack",
  combine = FALSE,
  scale = "fixed",
  output = "tab_plot",
  format_tab = "wide"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`species`	Variable in `dat` containing the species catch or a vector of species variables (in pounds).
`date`	Variable in `dat` containing dates to aggregate by.
`period`	Time period to count by. Options include 'year', 'month', 'week' (week of the year), 'weekday', 'day' (day of the month), and 'day_of_year'. `date` is required.
`fun`	String, name of function to aggregate by. Defaults to `sum`.
`group`	Grouping variable name(s). Up to two grouping variables are available for line plots and one for bar plots. For bar plots, if only one species is entered the first group variable is passed to 'fill'. If multiple species are entered, species is passed to "fill" and the grouping variable is dropped. An exception occurs when facetting by species, then the grouping variable is passed to "fill". For line plots, the first grouping variable is passed to "fill" and the second to "linetype" if a single species column is entered or if facetting by species. Otherwise species is passed to "fill", the first group variable to "linetype", and second is dropped.
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start- and end-date into `date_value` as a string: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. Accepts up to two variables. These can be variables that exist in the `dat`, or a variable created by `species_catch()` such as `"year"`, `"month"`, or `"week"` if a date variable is added to `sub_date`. Facetting by `"species"` is available if multiple catch columns are included in `"species"`. The first variable is facetted by row and the second by column.
`type`	Plot type, options include `"bar"` (the default) and `"line"`.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function entered as a string. Defaults to `"none"` for no conversion.
`tran`	A function to transform the y-axis. Options include log, log2, log10, sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`value`	Whether to calculate raw `"count"` or `"percent"` of total catch.
`position`	Positioning of bar plot. Options include 'stack', 'dodge', and 'fill'.
`combine`	Whether to combine variables listed in `group`. This is passed to the "fill" or "color" aesthetic for plots.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`.
`output`	Output a `"plot"` or `"table"`. Defaults to both (`"tab_plot"`).
`format_tab`	How table output should be formatted. Options include 'wide' (the default) and 'long'.

Value

species_catch() aggregates catch using one or more columns of catch data. Users can aggregate by time period, by group, or by both. When multiple catch variables are entered, a new column "species" is created and used to group values in plots. The "species" column can also be used to split (or facet) the plot. For table output, the "species" column will be kept if format_tab = "long", i.e. a column of species names ("species") and a column containing catch ("catch"). When format_tab = "wide", each species is given its own column of catch.
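
The difference between the "wide" and "long" table layouts described above can be sketched in base R. The data frame and its values are invented; the column names "species" and "catch" follow the description, and `POLLOCK_LBS` and `COD_LBS` are hypothetical catch columns.

```r
# Wide layout: one catch column per species (invented values).
wide <- data.frame(
  GEAR_TYPE = c(1, 2),
  POLLOCK_LBS = c(100, 250),
  COD_LBS = c(40, 10)
)

species_cols <- c("POLLOCK_LBS", "COD_LBS")

# Long layout: a "species" column of names and a "catch" column of values.
long <- data.frame(
  GEAR_TYPE = rep(wide$GEAR_TYPE, times = length(species_cols)),
  species = rep(species_cols, each = nrow(wide)),
  catch = unlist(wide[species_cols], use.names = FALSE)
)
long
```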

The data can be filtered by date and/or by a variable. filter_date specifies the type of date filter to apply: by date range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value.

Plots can handle up to two grouping variables; there is no limit for tables. Grouping variables can be merged into one variable using combine; in this case any number of variables can be joined, but no more than three is recommended.

For faceting, any variable (including ones listed in group) can be used, but "year", "month", "week" are also available provided a date variable is added to sub_date. Currently, combined variables cannot be faceted. A list containing a table and plot is printed to the console and viewer by default.
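
To make the date filter types concrete, the following base R sketch mimics what a "year-month" filter with date_value = list(2011:2013, 5:7) and a "date_range" filter would be expected to keep. It is an illustration on invented dates, not the package's internal code, and treating the range endpoints as inclusive is an assumption.

```r
# Invented haul dates.
haul_date <- as.Date(c("2010-06-01", "2011-02-01", "2011-05-15",
                       "2012-08-01", "2013-07-31"))

# filter_date = "year-month", date_value = list(2011:2013, 5:7)
date_value <- list(2011:2013, 5:7)
yr <- as.integer(format(haul_date, "%Y"))
mo <- as.integer(format(haul_date, "%m"))
keep_year_month <- yr %in% date_value[[1]] & mo %in% date_value[[2]]

# filter_date = "date_range", date_value = c("2011-01-01", "2011-03-15")
# (inclusive endpoints are assumed here)
rng <- as.Date(c("2011-01-01", "2011-03-15"))
keep_range <- haul_date >= rng[1] & haul_date <= rng[2]

haul_date[keep_year_month]
haul_date[keep_range]
```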

Examples

## Not run: 
# summarizing one catch column by one group variable
species_catch(pollockMainDataTable, project = "pollock",
              species = "OFFICIAL_TOTAL_CATCH_MT",
              group = "GEAR_TYPE", output = "tab_plot")

# summarizing three catch columns by month
species_catch('pollockMainDataTable', project = 'pollock',
              species = c('HAUL_LBS_270_POLLOCK_LBS', 
                          'HAUL_LBS_110_PACIFIC_COD_LBS', 
                          'HAUL_LBS_OTHER_LBS'), 
              date = 'HAUL_DATE', period = 'month', output = 'plot', 
              conv = 'tons')

# filtering by variable
species_catch(pollockMainDataTable, project = "pollock",
              species = "OFFICIAL_TOTAL_CATCH_MT",
              group = "GEAR_TYPE", filter_by = "PORT_CODE", 
              filter_value = "Dutch Harbor")

# filtering by date
species_catch(pollockMainDataTable, project = "pollock",
              species = "OFFICIAL_TOTAL_CATCH_MT",
              sub_date = "HAUL_DATE", filter_date = "month", date_value = 7:10)

## End(Not run)

Separate secondary data table from MainDataTable

Description

Separate secondary data table from MainDataTable

Usage

split_dat(dat, aux = NULL, project, split_by = NULL, key, output = "main")

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`aux`	Auxiliary data table in fishset_db or environment. Use string if referencing a table saved in the FishSET database. The column names from "aux" will be used to find and separate the auxiliary table from the MainDataTable.
`project`	String, name of project.
`split_by`	String, columns in MainDataTable to split by. These columns will be separated from MainDataTable. Must contain values from "key".
`key`	String, the column(s) that link the main and auxiliary data tables. If using the "aux" method, "key" must match a column in both MainDataTable and the "aux" data table. If using "split_by", "key" must match a column in MainDataTable and also be included in "split_by".
`output`	String, return either the "main" data table, "aux" data table, or "both" main and aux data tables in a list.

Details

This function separates auxiliary data (or gridded and port data) from the MainDataTable. Users can either input the secondary data table (from the environment or fishset_db) to determine which columns to remove, or pass a string of column names to "split_by". Use either the "aux" or the "split_by" method. Defaults to the "aux" method if both arguments are used.
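
The "split_by" idea can be sketched in base R on an invented table. This is an assumption-laden illustration rather than split_dat itself: the port codes, coordinates, and the choice to de-duplicate the auxiliary rows are all hypothetical.

```r
# Invented main table that still carries port columns.
main <- data.frame(
  HAUL = 1:4,
  PORT_CODE = c("DH", "DH", "AK", "AK"),
  PORT_LAT = c(53.9, 53.9, 54.1, 54.1),
  PORT_LON = c(-166.5, -166.5, -165.8, -165.8),
  CATCH = c(10, 12, 7, 9)
)

key <- "PORT_CODE"
split_by <- c("PORT_CODE", "PORT_LAT", "PORT_LON")  # must include key

# Auxiliary table: the split_by columns, one row per key value
# (de-duplication is an assumption of this sketch).
aux_tab <- unique(main[, split_by])
rownames(aux_tab) <- NULL

# Main table: drop the split_by columns except the key, so the two
# tables can still be linked.
main_tab <- main[, setdiff(names(main), setdiff(split_by, key))]

# The key lets the two tables be joined back together.
rejoined <- merge(main_tab, aux_tab, by = key)
```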

Examples

## Not run: 
split_dat("pollockMainDataTable", "pollock", aux = "pollockPortTable", key = "PORT_CODE")

## End(Not run)

View summary statistics

Description

View summary statistics in table format for entire dataset or for a specific variable.

Usage

summary_stats(dat, project, x = NULL, log_fun = TRUE)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Name of project
`x`	Optional. Variable in `dat` to view summary statistics for. If not defined, summary stats are displayed for all columns in the dataset.
`log_fun`	Logical, whether to log function call (for internal use).

Details

Prints summary statistics for each variable in the data set. If x is specified, summary stats will be returned only for that variable. Numeric variables are summarized by minimum, median, mean, maximum, and the number of NA's, unique values, and zeros. Non-numeric variables are summarized by first value and the number of NA's, unique values, and empty values. Function is called in the data_check function.
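
The summaries described above can be reproduced by hand in base R, which may help when checking the output. The vectors below are invented, and excluding NA values from the unique count is an assumption of this sketch.

```r
# Numeric variable (invented values).
x <- c(0, 3, 3, NA, 8, 0)
num_summary <- c(
  min = min(x, na.rm = TRUE),
  median = median(x, na.rm = TRUE),
  mean = mean(x, na.rm = TRUE),
  max = max(x, na.rm = TRUE),
  n_na = sum(is.na(x)),
  n_unique = length(unique(x[!is.na(x)])),  # NA excluded (assumption)
  n_zero = sum(x == 0, na.rm = TRUE)
)

# Non-numeric variable (invented values).
s <- c("trawl", "", NA, "pot", "trawl")
chr_summary <- list(
  first = s[1],
  n_na = sum(is.na(s)),
  n_unique = length(unique(s[!is.na(s)])),  # NA excluded (assumption)
  n_empty = sum(s == "", na.rm = TRUE)
)

num_summary
chr_summary
```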

Examples

## Not run: 
summary_stats(pcodMainDataTable, project = "pcod")

summary_stats(pcodMainDataTable, project = "pcod", x = "HAUL")

## End(Not run)


Check if table exists in the FishSET database for the defined project

Description

Wrapper for dbExistsTable. Check if a table exists in the FishSET database.

Usage

table_exists(table, project)

Arguments

`table`	Name of table in FishSET database. Table name must be in quotes.
`project`	Name of project.

Value

Returns a logical value indicating whether the table exists.

Examples

## Not run: 
table_exists('pollockMainDataTable', 'pollock')

## End(Not run)

Lists fields for FishSET database table

Description

Wrapper for dbListFields. View fields of selected table.

Usage

table_fields(table, project)

Arguments

`table`	String, name of table in FishSET database. Table name must be in quotes.
`project`	Project name

Examples

## Not run: 
table_fields('pollockMainDataTable', 'pollock')

## End(Not run)

Remove table from FishSET database

Description

Wrapper for dbRemoveTable. Remove a table from the FishSET database.

Usage

table_remove(table, project)

Arguments

`table`	String, name of table in FishSET database. Table name must be in quotes.
`project`	Name of project

Details

Function utilizes sql functions to remove tables from the FishSET database.

Examples

## Not run: 
table_remove('pollockMainDataTable', 'pollock')

## End(Not run)

Save an existing FishSET DB table

Description

table_save() updates existing FishSET DB tables. If the table doesn't exist, the user is reminded to use the appropriate load_ function.

Usage

table_save(table, project, type, name = NULL)

Arguments

`table`	A dataframe to save to the FishSET Database.
`project`	Name of project.
`type`	The table type. Options include `"main"` for primary data tables, `"port"` for port tables, `"grid"` for gridded tables, and `"aux"` for auxiliary tables.
`name`	String, table name. Applicable only for gridded, auxiliary, and spatial tables.

View FishSET Database table

Description

Wrapper for dbGetQuery. View or call the selected table from the FishSET database.

Usage

table_view(table, project)

Arguments

`table`	String, name of table in FishSET database. Table name must be in quotes.
`project`	Name of project.

Details

table_view() returns a table from a project's FishSET Database.

Examples

## Not run: 
head(table_view('pollockMainDataTable', project = 'pollock'))

## End(Not run)

View names of project tables

Description

Wrapper for dbListTables. View names of tables in a project's FishSET database.

Usage

tables_database(project)

Arguments

`project`	Project name.

Examples

## Not run: 
tables_database('pollock')

## End(Not run)

Number of observations by temporal unit

Description

View the number of observations by year, month, and zone in table format

Usage

temp_obs_table(
  dat,
  project,
  x,
  zoneid = NULL,
  spat = NULL,
  lon.dat = NULL,
  lat.dat = NULL,
  cat = NULL,
  lon.spat = NULL,
  lat.spat = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`x`	Variable in `dat` containing date variable.
`zoneid`	Variable in `dat` that identifies the individual zones or areas. Defaults to NULL. Define if the name of the zone identifier variable is not 'ZoneID'.
`spat`	Spatial data containing information on fishery management or regulatory zones. Shape, json, geojson, and csv formats are supported. Required if `zoneid` does not exist in `dat`.
`lon.dat`	Longitude variable in `dat`. Required if `zoneid` does not exist in `dat`.
`lat.dat`	Latitude variable in `dat`. Required if `zoneid` does not exist in `dat`.
`cat`	Variable or list in `spat` that identifies the individual areas or zones. If `spat` is class sf, `cat` should be name of list containing information on zones. Required if `zoneid` does not exist in `dat`.
`lon.spat`	Variable or list from `spat` containing longitude data. Required if `zoneid` does not exist in `dat` and `spat` is a csv file. Leave as NULL if `spat` is a shape or json file.
`lat.spat`	Variable or list from `spat` containing latitude data. Required if `zoneid` does not exist in `dat` and `spat` is a csv file. Leave as NULL if `spat` is a shape or json file.

Details

Prints tables displaying the number of observations by year, month, and zone. assignment_column is called to assign observations to zones if zoneid does not exist in dat. Output is not saved.
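
The kind of count tables described above can be sketched with base R's table(). The observations and zone IDs below are invented, and the exact layout printed by temp_obs_table may differ.

```r
# Invented observations already assigned to zones.
obs <- data.frame(
  DATE_FISHING_BEGAN = as.Date(c("2011-01-05", "2011-01-20",
                                 "2011-02-03", "2012-01-11")),
  ZoneID = c("509", "509", "517", "509")
)

# Extract the temporal units from the date variable.
obs$year <- format(obs$DATE_FISHING_BEGAN, "%Y")
obs$month <- format(obs$DATE_FISHING_BEGAN, "%m")

# Number of observations by year and zone, and by month and zone.
by_year_zone <- table(year = obs$year, zone = obs$ZoneID)
by_month_zone <- table(month = obs$month, zone = obs$ZoneID)

by_year_zone
by_month_zone
```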

Examples

## Not run: 
temp_obs_table(pollockMainDataTable, project = "pollock", spat = map2,
  x = "DATE_FISHING_BEGAN",
  lon.dat = "LonLat_START_LON", lat.dat = "LonLat_START_LAT", cat = "NMFS_AREA",
  lon.spat = "", lat.spat = ""
  )

## End(Not run)


Plot variable by month/year

Description

Returns three plots showing the variable of interest against time (as month or month/year). Plots are raw points by date, number of observations by date, and measures of a representative observation by date.

Usage

temp_plot(
  dat,
  project,
  var.select,
  len.fun = "length",
  agg.fun = "mean",
  date.var = NULL,
  alpha = 0.5,
  pages = "single",
  text.size = 8
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`var.select`	Variable in `dat` to plot against a date variable.
`len.fun`	Method, `"length"` returns the number of observations, `"unique"` returns the number of unique observations, `"percent"` returns the percentage of total observations.
`agg.fun`	Method to aggregate `var.select` by date. Choices are `"mean"`, `"median"`, `"min"`, `"max"`, or `"sum"`.
`date.var`	Date variable in `dat`. Defaults to the first date variable in `dat` if not defined.
`alpha`	The opacity of each data point in the scatterplot. 0 is fully transparent and 1 is fully opaque. Defaults to .5.
`pages`	Whether to output plots on a single page (`"single"`, the default) or multiple pages (`"multi"`).
`text.size`	Text size of x-axes.

Details

Returns three plots showing the variable of interest against time (as month or month/year). Plots are raw points by time, number of observations by time, and aggregated variable of interest by time.
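
The quantities behind the second and third plots can be sketched without any plotting, using base R on invented data. The mapping of each line to a len.fun or agg.fun option is an interpretation of the argument descriptions, not the package's code.

```r
# Invented hauls.
hauls <- data.frame(
  HAUL_DATE = as.Date(c("2011-01-05", "2011-01-20", "2011-02-03",
                        "2011-02-17", "2011-02-28")),
  CATCH = c(10, 30, 5, 5, 20)
)

# Month/year grouping key.
ym <- format(hauls$HAUL_DATE, "%Y-%m")

# len.fun = "length": number of observations per month/year
n_obs <- tapply(hauls$CATCH, ym, length)
# len.fun = "unique": number of unique values per month/year (assumed meaning)
n_unique <- tapply(hauls$CATCH, ym, function(x) length(unique(x)))
# len.fun = "percent": percentage of total observations
pct_obs <- 100 * n_obs / nrow(hauls)

# agg.fun = "mean": aggregated variable of interest per month/year
agg_mean <- tapply(hauls$CATCH, ym, mean)
```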

Value

Returns plot to R console and saves output to the Output folder.

Examples

## Not run: 
temp_plot(pollockMainDataTable, project='pollock', 
          var.select = 'OFFICIAL_TOTAL_CATCH_MT', len.fun = 'percent', 
          agg.fun = 'mean', date.var = 'HAUL_DATE')
          
temp_plot(pollockMainDataTable, project='pollock', 
          var.select = 'OFFICIAL_TOTAL_CATCH_MT', len.fun = 'length',
          agg.fun = 'max')

## End(Not run)


Transform units of date variables

Description

Creates a new temporal variable by extracting temporal unit, such as year, month, or day from a date variable.

Usage

temporal_mod(
  dat,
  project,
  x,
  define.format = NULL,
  timezone = NULL,
  name = NULL,
  log_fun = TRUE,
  ...
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	Project name.
`x`	Time variable to modify from `dat`.
`define.format`	Format of temporal data. `define.format` should be NULL if converting timezone for `x` but not changing format. Format can be user-defined or from pre-defined choices. Format follows `as.Date` format. See Details for more information.
`timezone`	String, defaults to NULL. Returns the date-time in the specified time zone. Must be a recognizable timezone, such as "UTC", "America/New_York", "Europe/Amsterdam".
`name`	String, name of created variables. Defaults to 'TempMod'.
`log_fun`	Logical, whether to log function call (for internal use).
`...`	Additional arguments. Use `tz=''` to specify time zone.

Details

Converts a date variable to the desired timezone or units using as.Date. date_parser is also called to ensure the date variable is in an acceptable format for as.Date. define.format defines the format that the variable should take on. Examples include "%Y%m%d", "%Y-%m-%d %H:%M:%S". Users can define their own format or use one of the predefined ones. Hours range from 0 to 23. To list the time zone names in the Olson/IANA database, run OlsonNames() in the console.

Predefined formats:

year: Takes on the format "%Y" and returns the year.
month: Takes on the format "%Y/%m" and returns the year and month.
day: Takes on the format "%Y/%m/%d" and returns the year, month, and day.
hour: Takes on the format "%Y/%m/%d %H" and returns the year, month, day and hour.
minute: Takes on the format "%Y/%m/%d %H:%M" and returns the year, month, day, hour, and minute.

For more information on formats, see https://www.stat.berkeley.edu/~s133/dates.html.
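
The predefined formats above can be previewed with base R alone. The following is a minimal sketch, not the internals of temporal_mod: the sample date-time `x` and the object names `predefined`, `units_out`, and `ny_time` are assumptions made for illustration, and only `format()` from base R is used.

```r
# Hypothetical date-time value; FishSET is not called in this sketch.
x <- as.POSIXct("2011-03-15 14:30:00", tz = "UTC")

# Format strings copied from the predefined choices listed above.
predefined <- c(
  year   = "%Y",
  month  = "%Y/%m",
  day    = "%Y/%m/%d",
  hour   = "%Y/%m/%d %H",
  minute = "%Y/%m/%d %H:%M"
)

# Apply each format string to the same value.
units_out <- vapply(predefined, function(f) format(x, f, tz = "UTC"),
                    character(1))
print(units_out)

# Display the same instant in another Olson/IANA time zone.
ny_time <- format(x, "%Y-%m-%d %H:%M", tz = "America/New_York")
print(ny_time)
```

Each predefined choice keeps the coarser units to its left, so "month" still carries the year and "hour" still carries the full date.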

Value

Primary data set with new variable added.

Examples

## Not run: 
pcodMainDataTable <- temporal_mod(pcodMainDataTable, "pcod", 
   "DATE_LANDED", define.format = "%Y%m%d")
pcodMainDataTable <- temporal_mod(pcodMainDataTable, "pcod", 
   "DATE_LANDED", define.format = "year")

## End(Not run)


Northeast Ten Minute Squares

Description

Northeast Ten Minute Squares

Usage

tenMNSQR

Format

'tenMNSQR' A simple feature COLLECTION with 5267 features and 9 fields:

AREA
PERIMETER
TEN_
TEN_ID
LL
LAT
LON
TEMP
LOC

Trip duration table and plot

Description

Display trip duration and value per unit effort

Usage

trip_dur_out(
  dat,
  project,
  start,
  end,
  units = "days",
  vpue = NULL,
  group = NULL,
  combine = TRUE,
  haul_count = TRUE,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  type = "hist",
  bins = 30,
  density = TRUE,
  scale = "fixed",
  tran = "identity",
  format_lab = "decimal",
  pages = "single",
  remove_neg = FALSE,
  output = "tab_plot",
  tripID = NULL,
  fun.time = NULL,
  fun.numeric = NULL
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database should contain the string 'MainDataTable'.
`project`	String, name of project.
`start`	Date variable containing the start of vessel trip.
`end`	Date variable containing the end of vessel trip.
`units`	Time unit, defaults to `"days"`. Options include `"secs"`, `"mins"`, `"hours"`, `"days"`, or `"weeks"`.
`vpue`	Optional, numeric variable in `dat` for calculating value per unit effort (VPUE).
`group`	Optional, string names of variables to group by. By default, grouping variables are combined unless `combine = FALSE` and `type = "freq_poly"` (frequency polygon). `combine = TRUE` will not work when `type = "hist"` (histogram). Frequency polygon plots can use up to two grouping variables if `combine = FALSE`: the first variable is assigned to the "color" aesthetic and second to the "linetype" aesthetic.
`combine`	Logical, whether to combine the variables listed in `group` for plot.
`haul_count`	Logical, whether to include hauls per trip in table and/or plot (this can only be used if collapsing data to trip level using `tripID`. If data is already at trip level, add your haul frequency variable to `vpue`).
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start and end date into `date_value` as strings: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by.
`facet_by`	Variable name to facet by. Faceting by `"year"`, `"month"`, or `"week"` is available provided a date variable is added to `sub_date`.
`type`	The type of plot. Options include histogram (`"hist"`, the default) and frequency polygon (`"freq_poly"`).
`bins`	The number of bins used in histogram/frequency polygon.
`density`	Logical, whether densities or frequencies are used for histogram. Defaults to `TRUE`.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`. Other options include `"free_y"`, `"free_x"`, and `"free_xy"`.
`tran`	Transformation to be applied to the x-axis. A few options include `"log"`, `"log10"`, and `"sqrt"`. See `scale_continuous` for a complete list.
`format_lab`	Formatting option for x-axis labels. Options include `"decimal"` or `"scientific"`.
`pages`	Whether to output plots on a single page (`"single"`, the default) or multiple pages (`"multi"`).
`remove_neg`	Logical, whether to remove negative trip durations from the plot and table.
`output`	Options include 'table', 'plot', or 'tab_plot' (both table and plot, the default).
`tripID`	Column(s) that identify the individual trip.
`fun.time`	How to collapse temporal data. For example, `min`, `mean`, `max`. Cannot be `sum` for temporal variables.
`fun.numeric`	How to collapse numeric or temporal data. For example, `min`, `mean`, `max`, `sum`. Defaults to `mean`.

Value

trip_dur_out() calculates vessel trip duration given a start and end date, converts trip duration to the desired unit of time (e.g. weeks, days, or hours), and returns a table and/or plot. There is an option for calculating vpue (value per unit of effort) as well. The data can be filtered by date and/or by a variable. filter_date specifies the type of date filter to apply–by date-range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value. If multiple grouping variables are given then they are combined into one variable unless combine = FALSE and type = "freq_poly". No more than three grouping variables is recommended if pages = "single". Any variable in the dataset can be used for faceting, but "year", "month", and "week" are also available. Distribution plots can be combined on a single page or printed individually with pages.
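
The calculation described above can be sketched in base R. This is an illustration under stated assumptions, not the code inside trip_dur_out(): the data frame `hauls`, its column names, and the choice of earliest start and latest end per trip are all hypothetical.

```r
# Hypothetical haul-level data; column names are illustrative only.
hauls <- data.frame(
  TRIP_ID = c(1, 1, 2, 2, 2),
  START   = as.POSIXct(c("2011-01-01 06:00", "2011-01-01 06:00",
                         "2011-01-10 08:00", "2011-01-10 08:00",
                         "2011-01-10 08:00"), tz = "UTC"),
  END     = as.POSIXct(c("2011-01-02 06:00", "2011-01-03 06:00",
                         "2011-01-11 08:00", "2011-01-12 08:00",
                         "2011-01-14 08:00"), tz = "UTC"),
  CATCH   = c(10, 30, 5, 15, 20)
)

# Collapse to trip level: earliest start, latest end, summed catch,
# and the number of hauls in each trip (assumed collapsing rules).
trips <- do.call(rbind, lapply(split(hauls, hauls$TRIP_ID), function(d) {
  data.frame(
    TRIP_ID = d$TRIP_ID[1],
    START   = min(d$START),
    END     = max(d$END),
    CATCH   = sum(d$CATCH),
    HAULS   = nrow(d)
  )
}))

# Trip duration in the requested unit, then value per unit effort.
trips$DURATION <- as.numeric(difftime(trips$END, trips$START, units = "days"))
trips$VPUE     <- trips$CATCH / trips$DURATION
print(trips)
```

Changing `units` in `difftime()` to "hours" or "weeks" rescales the duration, and therefore the VPUE, in the same way the `units` argument is described above.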

Examples

## Not run: 
trip_dur_out(pollockMainDataTable, project = "pollock",
  start = "FISHING_START_DATE", end = "HAUL_DATE",
  units = "days", vpue = "OFFICIAL_TOTAL_CATCH", output = "plot",
  tripID = c("PERMIT", "TRIP_SEQ"), fun.numeric = sum, fun.time = min
)


## End(Not run)

Check rows are unique

Description

Check for and remove non-unique rows from primary dataset.

Usage

unique_filter(dat, project, remove = FALSE)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in the FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`remove`	Logical, if `TRUE` removes non-unique rows. Defaults to `FALSE`.

Details

Output is determined by remove. If remove = TRUE then non-unique rows are removed. If remove = FALSE then only a statement is returned regarding the number of rows that are not unique.
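
The two behaviours can be sketched with base R's `duplicated()`. This is only an illustration of the idea: the data frame `dat` is hypothetical, and how unique_filter counts and reports non-unique rows internally is an assumption.

```r
# Hypothetical data with one exact duplicate row (rows 1 and 2).
dat <- data.frame(
  VESSEL_ID = c("A", "A", "B", "C", "C"),
  HAUL      = c(1, 1, 1, 2, 2),
  CATCH     = c(10, 10, 7, 3, 4)
)

# Flag rows that repeat an earlier row in every column.
dup   <- duplicated(dat)
n_dup <- sum(dup)

# Analogue of remove = FALSE: report only.
message(n_dup, " row(s) are not unique.")

# Analogue of remove = TRUE: keep the first occurrence of each row.
dat_unique <- dat[!dup, , drop = FALSE]
print(dat_unique)
```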

Value

Returns the modified primary dataset with non-unique rows removed if remove = TRUE.

Examples

## Not run: 
# check for unique rows
unique_filter(pollockMainDataTable, project = "pollock")

# remove non-unique rows from dataset
mod.dat <- unique_filter(pollockMainDataTable, project = "pollock", remove = TRUE)

## End(Not run)

Summarize active vessels

Description

vessel_count counts the number of active vessels in the main table. It can summarize by period if date is provided, group by any number of grouping variables, and filter by period or value. There are several options for customizing table/plot output.

Usage

vessel_count(
  dat,
  project,
  v_id,
  date = NULL,
  period = NULL,
  group = NULL,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  combine = FALSE,
  position = "stack",
  tran = "identity",
  format_lab = "decimal",
  value = "count",
  type = "bar",
  scale = "fixed",
  output = "tab_plot"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`v_id`	Variable in `dat` containing vessel identifier to count.
`date`	Date variable to aggregate by.
`period`	Time period to aggregate by. Options include `"year"`, `"month"`, `"week"` (weeks in the year), `"weekday"`, `"weekday_abv"`, `"day_of_month"`, `"day_of_year"`, and `"cal_date"` (calendar date).
`group`	Names of grouping variables. For line plots (`type = "line"`) two grouping variables can be entered, the first is passed to "color" and second to "linetype". Only one grouping variable can be used for barplots (`type = "bar"`), which is passed to "fill". When `combine = TRUE` all variables in `group` will be joined. Grouping by `"year"`, `"month"`, and `"week"` is available if a date variable is added to `sub_date`.
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start and end date into `date_value` as strings: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. Accepts up to two variables. These can be variables that exist in `dat`, or a variable created by `vessel_count()` such as `"year"`, `"month"`, or `"week"` if a date variable is added to `sub_date`. The first variable is facetted by row and the second by column.
`combine`	Whether to combine variables listed in `group`. This is passed to the "fill" or "color" aesthetic for plots.
`position`	Positioning of bar plot. Options include 'stack', 'dodge', and 'fill'.
`tran`	A function to transform the y-axis. Options include log, log2, log10, and sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`value`	Whether to return `"count"` or `"percent"` of active vessels. Defaults to `"count"`.
`type`	Plot type, options include `"bar"` (the default) and `"line"`.
`scale`	Scale argument passed to `facet_grid`. Options include `"free"`, `"free_x"`, `"free_y"`. Defaults to `"fixed"`.
`output`	Whether to display `"plot"`, `"table"`, or both. Defaults to both (`"tab_plot"`).

Details

vessel_count gives the number (or percent) of active vessels using a column of unique vessel IDs. The data can be filtered by date and/or by a variable. Console users may want to use a separate filtering function, such as dplyr::filter, before running vessel_count; this is acceptable but will lead to different output if using log_rerun. filter_date specifies the type of date filter to apply: by date range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value.

Up to two grouping variables can be entered. Grouping variables can be merged into one variable using combine = TRUE; in this case any number of variables can be joined, but no more than three is recommended.
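
The counting and period filtering described above can be sketched in base R. This is an illustration only, not the implementation of vessel_count: the data frame `dat`, its column names, and the use of `aggregate()` are assumptions. The filter step mimics `filter_date = "year-month"` with `date_value = list(2011, 1:5)`.

```r
# Hypothetical haul-level data; column names are illustrative only.
dat <- data.frame(
  VESSEL_ID = c("A", "A", "B", "C", "B", "C", "C", "D"),
  HAUL_DATE = as.Date(c("2011-01-03", "2011-01-20", "2011-01-15",
                        "2011-02-02", "2011-05-10", "2011-05-11",
                        "2012-06-01", "2011-01-25")),
  GEAR_TYPE = c(1, 1, 2, 1, 2, 1, 1, 1)
)

# Year-period filter: keep January through May of 2011.
yr  <- as.integer(format(dat$HAUL_DATE, "%Y"))
mon <- as.integer(format(dat$HAUL_DATE, "%m"))
sub <- dat[yr %in% 2011 & mon %in% 1:5, ]
sub$month <- as.integer(format(sub$HAUL_DATE, "%m"))

# Count distinct vessel IDs within each month and gear type.
counts <- aggregate(VESSEL_ID ~ month + GEAR_TYPE, data = sub,
                    FUN = function(v) length(unique(v)))
names(counts)[names(counts) == "VESSEL_ID"] <- "active_vessels"
print(counts)
```

A vessel with several hauls in the same month and group is counted once, which is why vessel "A" contributes a single active vessel in January.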

Value

When output = "tab_plot" a list containing a table and plot is returned. If output = "table" only the summary table is returned; if output = "plot" only the plot is returned.

Examples

## Not run: 
# grouping by two variables
vessel_count(pollockMainDataTable, project = "pollock", v_id = "VESSEL_ID",
             group = c("GEAR_TYPE", "IFQ"))

# filter by variable
vessel_count(pollockMainDataTable, project = "pollock", v_id = "VESSEL_ID",
             group = "GEAR_TYPE", filter_by = "IFQ", filter_value = "Y")

# filter by month
vessel_count(pollockMainDataTable, project = "pollock", v_id = "VESSEL_ID",
             group = "GEAR_TYPE", sub_date = "HAUL_DATE",
             filter_date = "month", date_value = 1:5)

# filter by date
vessel_count(pollockMainDataTable, project = "pollock", v_id = "VESSEL_ID",
             group = "GEAR_TYPE", sub_date = "HAUL_DATE",
             filter_date = "date_range",
             date_value = c("2011-01-01", "2011-02-05"))

# summarize by month
vessel_count(pollockMainDataTable, project = "pollock", v_id = "VESSEL_ID",
             date = "DATE_FISHING_BEGAN", period = "month",
             group = "DISEMBARKED_PORT", position = "dodge",
             output = "plot")

## End(Not run)

View the most recent fleet table by project

Description

View the most recent fleet table by project

Usage

view_fleet_table(project)

Arguments

`project`	The name of project.

Examples

## Not run: 
view_fleet_table("pollock")

## End(Not run)

Visualize gridded data on a map

Description

Visualize gridded data on a map

Usage

view_grid_dat(
  grid,
  project,
  lon,
  lat,
  value,
  split_by = NULL,
  group = NULL,
  agg_fun = "mean"
)

Arguments

`grid`	Gridded data table to visualize. Use string if visualizing a gridded data table in the FishSET Database.
`project`	String, project name.
`lon`	String, variable name containing longitude.
`lat`	String, variable name containing latitude.
`value`	String, variable name containing gridded values, e.g. sea surface temperature, wind speed, etc.
`split_by`	String, variable in gridded data table to split by.
`group`	String, variable in gridded data table to group `value` by. In addition to the variable(s) in `group`, `value` is also aggregated by each longitude-latitude pair. The string `"lonlat"` is a shortcut for `group = c("lon", "lat")` which aggregates the `value` for each longitude-latitude pair across the entire dataset.
`agg_fun`	Aggregating function applied to `group`. Defaults to mean.
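
The aggregation that `group = "lonlat"` describes can be sketched in base R. This is an illustration under stated assumptions rather than the code used by view_grid_dat: the data frame `grid`, the column name `sst`, and the use of `aggregate()` are hypothetical.

```r
# Hypothetical gridded data: repeated observations per grid cell.
grid <- data.frame(
  lon  = c(-170, -170, -169, -169, -170),
  lat  = c(55, 55, 55, 56, 55),
  date = as.Date(c("2011-01-01", "2011-01-02", "2011-01-01",
                   "2011-01-01", "2011-01-03")),
  sst  = c(2, 4, 3, 5, 6)
)

# Aggregate the value for each longitude-latitude pair across the
# whole dataset, using the default aggregating function (mean).
by_cell <- aggregate(sst ~ lon + lat, data = grid, FUN = mean)
print(by_cell)
```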

View

Description

View

Usage

view_lon_lat(dat, lon, lat, id = NULL, crs = 4326)

Arguments

`dat`	Data containing `lon` and `lat` columns.
`lon`	Name of longitude column.
`lat`	Name of latitude column.
`id`	Optional, name of an ID variable that is paired with `lon` and `lat` columns.
`crs`	Optional, coordinate reference system to use. Defaults to EPSG code 4326 (WGS 84).

View model design file in database

Description

View model design file in database

Usage

view_model_design(project, date = NULL)

Arguments

`project`	Project name.
`date`	String, date model design file was created.

View interactive map of spatial data

Description

View interactive map of spatial data

Usage

view_spat(spat, id = NULL, type = "polygon")

Arguments

`spat`	Spatial dataset to view. Must be an object of class `sf` or `sfc`.
`id`	Optional, name of spatial ID column to view with spatial data.
`type`	Can be `"polygon"`, `"line"`, or `"point"`.

Examples

## Not run: 
view_spat(pollockNMFSSpatTable, id = "NMFS_AREA")

## End(Not run)

Summarize weekly catch

Description

weekly_catch summarizes catch (or other numeric variables) in the main table by week. It can summarize by grouping variables and filter by period or value. There are several options for customizing the table and plot output.

Usage

weekly_catch(
  dat,
  project,
  species,
  date,
  fun = "sum",
  group = NULL,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  type = "bar",
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  value = "count",
  position = "stack",
  combine = FALSE,
  scale = "fixed",
  output = "tab_plot",
  format_tab = "wide"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`species`	A variable in `dat` containing the species catch or a vector of species variables.
`date`	Variable in `dat` containing dates to aggregate by.
`fun`	Name of function to aggregate by. Defaults to `sum`.
`group`	Grouping variable name(s). Up to two grouping variables are available for line plots and one for bar plots. For bar plots, if only one species is entered the first group variable is passed to "fill". If multiple species are entered, species is passed to "fill" and the grouping variable is dropped. An exception occurs when faceting by species, then the grouping variable is passed to "fill". For line plots, the first grouping variable is passed to "fill" and the second to "linetype" if a single species column is entered or if faceting by species. Otherwise, species is passed to "fill", the first group variable to "linetype", and second is dropped.
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start and end date into `date_value` as strings: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. Accepts up to two variables. These can be variables that exist in the dataset, or a variable created by `weekly_catch()` such as `"year"`, `"month"`, or `"week"` if a date variable is added to `sub_date`. Faceting by `"species"` is available if multiple catch columns are included in `species`. The first variable is faceted by row and the second by column.
`type`	Plot type, options include `"bar"` (the default) and `"line"`.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function entered as a string. Defaults to `"none"` for no conversion.
`tran`	A function to transform the y-axis. Options include log, log2, log10, sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`value`	Whether to calculate raw `"count"` or `"percent"` of total catch.
`position`	Positioning of bar plot. Options include 'stack', 'dodge', and 'fill'.
`combine`	Whether to combine variables listed in `group`. This is passed to the "fill" or "color" aesthetic for plots.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`.
`output`	Return output as `"plot"`, `"table"`, or both `"tab_plot"`. Defaults to both (`"tab_plot"`).
`format_tab`	How table output should be formatted. Options include 'wide' (the default) and 'long'.

Value

weekly_catch() aggregates catch by week using one or more columns of catch data. When multiple catch variables are entered, a new column "species" is created and used to group values in plots. The "species" column can also be used to split (or facet) the plot. For table output, the "species" column will be kept if format_tab = "long", i.e. a column of species names ("species") and a column containing catch ("catch"). When format_tab = "wide", each species is given its own column of catch. The data can be filtered by date and/or by a variable. filter_date specifies the type of date filter to apply: by date range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value. Up to two grouping variables can be entered. Grouping variables can be merged into one variable using combine; in this case any number of variables can be joined, but no more than three is recommended. For faceting, any variable (including ones listed in group) can be used, but "year", "month", "week" are also available provided a date variable is added to sub_date. Currently, combined variables cannot be faceted. A list containing a table and plot is printed to the console and viewer by default.
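
The difference between the "wide" and "long" table formats can be sketched in base R. This is an illustration, not the implementation of weekly_catch(): the data frame `dat`, its column names, and the week definition (`"%U"`, weeks starting on Sunday) are assumptions.

```r
# Hypothetical catch data with two species columns.
dat <- data.frame(
  DATE    = as.Date(c("2011-01-03", "2011-01-05", "2011-01-11", "2011-01-12")),
  POLLOCK = c(100, 50, 80, 20),
  COD     = c(10, 5, 0, 30)
)

# Week of the year; the "%U" convention is an assumption of this sketch.
dat$week <- as.integer(format(dat$DATE, "%U"))

# "wide" layout: one catch column per species, summed by week.
wide <- aggregate(cbind(POLLOCK, COD) ~ week, data = dat, FUN = sum)
print(wide)

# "long" layout: a "species" column and a single "catch" column.
long <- data.frame(
  week    = rep(wide$week, times = 2),
  species = rep(c("POLLOCK", "COD"), each = nrow(wide)),
  catch   = c(wide$POLLOCK, wide$COD)
)
print(long)
```

The long layout is the one that carries the "species" column used for grouping and faceting in plots.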

Examples

## Not run: 
weekly_catch(pollockMainDataTable, project = "pollock",
  species = c(
    "HAUL_LBS_270_POLLOCK_LBS",
    "HAUL_LBS_110_PACIFIC_COD_LBS",  "HAUL_LBS_OTHER_LBS"
  ), date = "DATE_FISHING_BEGAN",
  conv = "tons", filter_date = "year", date_value = 2011, output = "plot"
)

## End(Not run)

Summarize average CPUE by week

Description

weekly_effort summarizes CPUE (or other numeric variables) in the main table by week. It can summarize by grouping variables and filter by period or value. There are several options for customizing the table and plot output.

Usage

weekly_effort(
  dat,
  project,
  cpue,
  date,
  group = NULL,
  sub_date = NULL,
  filter_date = NULL,
  date_value = NULL,
  filter_by = NULL,
  filter_value = NULL,
  filter_expr = NULL,
  facet_by = NULL,
  conv = "none",
  tran = "identity",
  format_lab = "decimal",
  combine = FALSE,
  scale = "fixed",
  output = "tab_plot",
  format_tab = "wide"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`cpue`	Variable(s) in `dat` containing catch per unit effort.
`date`	A variable in `dat` containing dates to aggregate by.
`group`	Grouping variable name(s). Up to two grouping variables are available. For plotting, if a single CPUE column is entered the first grouping variable is passed to the "color" aesthetic and the second to "linetype". If multiple CPUE columns are entered, a new variable named "species" is created and passed to "fill", the first group variable to "linetype", and second is dropped.
`sub_date`	Date variable used for subsetting, grouping, or splitting by date.
`filter_date`	The type of filter to apply to 'MainDataTable'. To filter by a range of dates, use `filter_date = "date_range"`. To filter by a given period, use "year-day", "year-week", "year-month", "year", "month", "week", or "day". The argument `date_value` must be provided.
`date_value`	This argument is paired with `filter_date`. To filter by date range, set `filter_date = "date_range"` and enter a start and end date into `date_value` as strings: `date_value = c("2011-01-01", "2011-03-15")`. To filter by period (e.g. "year", "year-month"), use integers (4 digits if year, 1-2 digits if referencing a day, month, or week). Use a vector if filtering by a single period: `filter_date = "month"` and `date_value = c(1, 3, 5)`. This would filter the data to January, March, and May. Use a list if using a year-period type filter, e.g. "year-week", with the format: `list(year, period)`. For example, `filter_date = "year-month"` and `date_value = list(2011:2013, 5:7)` will filter the data table from May through July for years 2011-2013.
`filter_by`	String, variable name to filter 'MainDataTable' by. The argument `filter_value` must be provided.
`filter_value`	A vector of values to filter 'MainDataTable' by using the variable in `filter_by`. For example, if `filter_by = "GEAR_TYPE"`, `filter_value = 1` will include only observations with a gear type of 1.
`filter_expr`	String, a valid R expression to filter 'MainDataTable' by using the variable in `filter_by`.
`facet_by`	Variable name to facet by. Accepts up to two variables. Facetting by `"year"` is available if a date variable is added to `sub_date`. Facetting by `"species"` is available if multiple cpue columns are included in `"cpue"`. The first variable is facetted by row and the second by column.
`conv`	Convert catch variable to `"tons"`, `"metric_tons"`, or by using a function entered as a string. Defaults to `"none"` for no conversion.
`tran`	A function to transform the y-axis. Options include log, log2, log10, sqrt.
`format_lab`	Formatting option for y-axis labels. Options include `"decimal"` or `"scientific"`.
`combine`	Whether to combine variables listed in `group`. This is passed to the "color" aesthetic for plots.
`scale`	Scale argument passed to `facet_grid`. Defaults to `"fixed"`.
`output`	Whether to display `"plot"`, `"table"`, or both. Defaults to both (`"tab_plot"`).
`format_tab`	How table output should be formatted. Options include 'wide' (the default) and 'long'.

Value

weekly_effort() calculates mean CPUE by week. This function doesn't calculate CPUE; the CPUE variable must be created in advance (see cpue). When multiple CPUE variables are entered, a new column named "species" is created and used to group values in plots. The "species" column can also be used to split (or facet) the plot. For table output, the "species" column will be kept if format_tab = "long", i.e. a column of species names ("species") and a column containing the mean CPUE ("mean_cpue"). When format_tab = "wide", each CPUE variable is given its own value column. The data can be filtered by date and/or by a variable. filter_date specifies the type of date filter to apply: by date range or by period. date_value should contain the values to filter the data by. To filter by a variable, enter its name as a string in filter_by and include the values to filter by in filter_value. Up to two grouping variables can be entered. Grouping variables can be merged into one variable using combine; in this case any number of variables can be joined, but no more than three is recommended. For faceting, any variable (including ones listed in group) can be used, but "year" and "species" are also available. Faceting by "year" requires a date variable be added to sub_date. Currently, combined variables cannot be faceted. A list containing a table and plot is printed to the console and viewer by default.
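
Because the CPUE variable must exist before weekly_effort() is called, the two steps can be sketched in base R as follows. This is an illustration only: the data frame `dat`, the effort measure `HOURS`, and the week definition (`"%U"`) are assumptions, and the code does not reproduce the internals of weekly_effort() or cpue.

```r
# Hypothetical haul data with catch and an effort measure (hours fished).
dat <- data.frame(
  DATE  = as.Date(c("2011-01-03", "2011-01-05", "2011-01-11", "2011-01-12")),
  CATCH = c(100, 60, 80, 20),
  HOURS = c(10, 20, 8, 10)
)

# Step 1: create the CPUE variable in advance.
dat$CPUE <- dat$CATCH / dat$HOURS

# Step 2: average CPUE by week of the year ("%U" is an assumption).
dat$week <- as.integer(format(dat$DATE, "%U"))
weekly   <- aggregate(CPUE ~ week, data = dat, FUN = mean)
names(weekly)[names(weekly) == "CPUE"] <- "mean_cpue"
print(weekly)
```

Note that this is a mean of per-haul ratios; it generally differs from total weekly catch divided by total weekly effort.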

Examples

## Not run: 
weekly_effort(pollockMainDataTable, project = "pollock", cpue = "CPUE",
              date = "DATE_FISHING_BEGAN", filter_date = "year",
              date_value = 2011, output = "table")

## End(Not run)

Welfare plots and tables

Description

Generate plots and tables for welfare simulations

Usage

welfare_outputs(
  project,
  mod.name,
  closures,
  betadraws = 1000,
  zone.dat = NULL,
  group_var = NULL
)

Arguments

`project`	Name of project
`mod.name`	Model name. Argument can be the name of the model, or the name can be pulled from the 'modelChosen' table. Leave `mod.name` empty to use the name of the saved 'best' model. If more than one model is saved, `mod.name` should be the numeric indicator of which model to use. Use `table_view("modelChosen", project)` to view a table of saved models.
`closures`	Closure scenarios
`betadraws`	Integer indicating the number of times to run the welfare simulation. Default value is `betadraws = 1000`.
`zone.dat`	Variable in primary data table that contains unique zone ID.
`group_var`	Categorical variable from primary data table to group welfare outputs.

Details

Returns a list with (1) plot showing welfare loss/gain for all scenarios in dollars, (2) plot showing welfare loss/gain as percentage, (3) dataframe with welfare summary stats in dollars, (4) dataframe with welfare summary stats as percentages, and (5) dataframe with welfare details such as number of trips, mean loss per trip, and mean of the total welfare loss across all trips.

Welfare analysis

Description

Simulate the welfare loss/gain from changes in policy or changes in other factors that influence fisher location choice.

Usage

welfare_predict(
  project,
  mod.name,
  closures,
  betadraws = 1000,
  marg_util_income = NULL,
  income_cost = NULL,
  expected.catch = NULL,
  enteredPrice = NULL
)

Arguments

`project`	Name of project
`mod.name`	Name of selected model (mchoice)
`closures`	Closure scenarios
`betadraws`	Integer indicating the number of times to run the welfare simulation. Default value is `betadraws = 1000`.
`marg_util_income`	For conditional and zonal logit models. Name of the coefficient to use as marginal utility of income
`income_cost`	For conditional and zonal logit models. Logical indicating whether the coefficient for the marginal utility of income relates to cost (`TRUE`) or revenue (`FALSE`)
`expected.catch`	Name of the expected catch table to use.
`enteredPrice`	Price for welfare

Details

To simulate welfare loss/gain, the model coefficients are sampled 1000 times using a multivariate random number generator (mvgrnd). The welfare loss/gain for each observation is then calculated for each set of sampled coefficients (see section 9.3 in the user manual), and all of the estimated welfare values are saved to a file in the project outputs folder.
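The sampling idea can be sketched in base R. This is an illustrative sketch only, not FishSET's calculation: it assumes a two-coefficient conditional logit, uses the standard log-sum welfare measure, treats the revenue coefficient as the marginal utility of income, and draws from a multivariate normal with a Cholesky factor instead of calling mvgrnd. All names and values (`beta_hat`, `vcov_hat`, the zone attributes) are hypothetical.

```r
set.seed(1)

# Hypothetical estimates: utility = b_rev * revenue + b_dist * distance.
beta_hat <- c(b_rev = 0.8, b_dist = -0.5)
vcov_hat <- matrix(c(0.010, 0.001,
                     0.001, 0.020), nrow = 2)

# Zone attributes for one observation (three zones); zone 3 is closed.
revenue  <- c(10, 12, 15)
distance <- c(2, 4, 3)
open     <- c(TRUE, TRUE, FALSE)

# Draw coefficients from a multivariate normal via the Cholesky factor.
n_draws <- 1000
z       <- matrix(rnorm(n_draws * 2), nrow = n_draws)
draws   <- sweep(z %*% chol(vcov_hat), 2, beta_hat, "+")
colnames(draws) <- names(beta_hat)

# Log-sum welfare change for each draw, converted to money units by
# dividing by the assumed marginal utility of income (b_rev).
welfare <- apply(draws, 1, function(b) {
  v <- b["b_rev"] * revenue + b["b_dist"] * distance
  (log(sum(exp(v[open]))) - log(sum(exp(v)))) / b["b_rev"]
})

summary(welfare)
```

Because closing a zone removes an alternative from the choice set, every simulated welfare change in this sketch is a loss (negative); the spread across draws reflects uncertainty in the estimated coefficients.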

Note that this function is called by run_policy.

Northeast wind closure areas

Description

Northeast wind closure areas

Usage

windLease

Format

'windLease' Simple features collection with 32 features and 1 field:

NAME: Name of wind lease.

Write a data table to local file directory

Description

Write a data table to local file directory

Usage

write_dat(dat, project, path = NULL, file_type = "csv", ...)

Arguments

`dat`	Name of data frame in working environment to save to file.
`project`	String, project name.
`path`	String, path or connection to write to. If left empty, the file will be written to the dat folder in the project directory.
`file_type`	String, the type of file to write to. Options include `"csv"`, `"txt"` (tab-separated text file), `"xlsx"` (excel), `"rdata"`, `"json"`, `"stata"`, `"spss"`, `"sas"`, and `"matlab"`.
`...`	Additional arguments passed to writing function. See "details" for the list of functions.

Details

Leave path = NULL to save dat to the data folder in the project directory. Additional arguments in ... are passed to the underlying writing function: write.table for csv and tab-separated files, save for R data files, write.xlsx for Excel files, write_json for json files, write_dta for Stata files, write_sav for SPSS files, write_sas for SAS files, writeMat for Matlab files, and st_write for geojson and shape files.
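The pattern of dispatching on a file type and forwarding extra arguments can be sketched as follows. The helper `write_table_sketch()` is hypothetical and covers only three formats with base R writers; it is meant to show the idea, not how write_dat is implemented.

```r
# Hypothetical helper: choose a writer from file_type and forward `...`.
write_table_sketch <- function(dat, path, file_type = "csv", ...) {
  switch(file_type,
    csv   = utils::write.csv(dat, path, row.names = FALSE, ...),
    txt   = utils::write.table(dat, path, sep = "\t", row.names = FALSE, ...),
    rdata = save(dat, file = path),
    stop("Unsupported file_type: ", file_type)
  )
  invisible(path)
}

# Write a toy table to a temporary file and read it back.
toy <- data.frame(zone = c("A", "B"), catch = c(1.5, 2))
out <- file.path(tempdir(), "toy.csv")
write_table_sketch(toy, out, file_type = "csv")
read.csv(out)
```

An unrecognised file_type falls through to the final `stop()` call, so typos fail loudly rather than silently writing nothing.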

Examples

## Not run: 
# Save to the default data folder in project directory
write_dat(pollockMainDataTable, project = "pollock", file_type = "csv")

# Save to defined directory location
write_dat(pollockMainDataTable, project = "pollock",
          path = "C:/data/pollock_dataset.csv", file_type = "csv")

# Save shape file
write_dat(ST6, project = "Pollock", path = "C:/data/ST6.shp", file_type = "shp")

## End(Not run)

Plot relationship of two variables

Description

Evaluate the relationship between two variables by plotting the first variable (var1) against the second (var2).

Usage

xy_plot(dat, project, var1, var2, regress = FALSE, alpha = 0.5)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`project`	String, name of project.
`var1`	First variable in `dat`.
`var2`	Second variable in `dat`.
`regress`	Logical, if TRUE, returns plot with fitted linear regression line. Defaults to `FALSE`.
`alpha`	The opacity of each data point in the scatterplot. 0 is fully transparent and 1 is fully opaque. Defaults to 0.5.

Value

Returns plot output to R console and saves plot to Output folder.
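What a scatterplot with a fitted linear regression line and semi-transparent points involves can be sketched in base R graphics. The toy data and column names are assumptions, and FishSET's own plot may be built differently; the null PDF device is used here only so the sketch runs without opening a window.

```r
# Toy data; the column names are hypothetical.
toy <- data.frame(haul  = 1:6,
                  catch = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.0))

# Fit the linear regression of catch on haul.
fit <- lm(catch ~ haul, data = toy)

pdf(NULL)  # null device; remove this line and dev.off() to see the plot
plot(toy$haul, toy$catch, pch = 16,
     col = grDevices::adjustcolor("black", alpha.f = 0.5),  # 50% opacity
     xlab = "haul", ylab = "catch")
abline(fit)  # add the fitted regression line
invisible(dev.off())

coef(fit)
```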

Examples

## Not run: 
xy_plot(pollockMainDataTable, project = 'pollock',
        var1 = 'OFFICIAL_TOTAL_CATCH_MT', var2 = 'HAUL', regress = TRUE)

## End(Not run)

Define zone closure scenarios

Description

Define zone closure scenarios

Usage

zone_closure(
  project,
  spatdat,
  cat,
  lon.spat = NULL,
  lat.spat = NULL,
  epsg = NULL
)

Arguments

`project`	Required, name of project.
`spatdat`	Required, data file or character. `spatdat` is a spatial data file containing information on fishery management or regulatory zone boundaries. Shape, json, geojson, and csv formats are supported. geojson is the preferred format. json files must be converted into geojson. This is done automatically when the file is loaded with `read_dat` with `is.map` set to true. `spatdat` cannot, at this time, be loaded from the FishSET database.
`cat`	Variable in `spatdat` that identifies the individual areas or zones.
`lon.spat`	Required for csv files. Variable or list from `spatdat` containing longitude data. Leave as NULL if `spatdat` is a shape or json file.
`lat.spat`	Required for csv files. Variable or list from `spatdat` containing latitude data. Leave as NULL if `spatdat` is a shape or json file.
`epsg`	EPSG number. Set the epsg to ensure that `spatdat` has the correct projection. If epsg is not specified but is defined for `spatdat`, the epsg defined for `spatdat` is used. See http://spatialreference.org/ to help identify the optimal epsg number.

Details

Define zone closure scenarios. Function opens an interactive map. Define zone closures by clicking on one or more zones and clicking the 'Close zones' button. To define another closure scenario, unclick zones and then click the desired zones. Press the 'Save closures' button to save choices. The saved choices are called in the policy scenario function.

Value

Returns a yaml file to the project output folder.

Summarize zones, closure areas

Description

'zone_summary' counts observations and aggregates values in 'dat' by regulatory zone or closure area.

Usage

zone_summary(
  dat,
  spat,
  project,
  zone.dat,
  zone.spat,
  count = TRUE,
  var = NULL,
  group = NULL,
  fun = NULL,
  breaks = NULL,
  n.breaks = 10,
  bin_colors = NULL,
  na.rm = TRUE,
  dat.center = TRUE,
  output = "plot"
)

Arguments

`dat`	Primary data containing information on hauls or trips. Table in FishSET database contains the string 'MainDataTable'.
`spat`	A spatial data file containing information on fishery management or regulatory zone boundaries. 'sf' objects are recommended, but 'sp' objects can be used as well. See dat_to_sf() to convert a spatial table read from a csv file to an 'sf' object. To upload your spatial data to the FishSETFolder see load_spatial().
`project`	Name of project.
`zone.dat`	Name of zone ID column in 'dat'.
`zone.spat`	Name of zone ID column in 'spat'.
`count`	Logical. If 'TRUE', the number of observations per zone is returned. Can be paired with 'fun = "percent"' and 'group'. 'zone_summary' will return an error if 'var' is included and 'count = TRUE'.
`var`	Optional, name of numeric variable to aggregate by zone/closure area.
`group`	Name of grouping variable to aggregate by zone/closure area. Only one variable is allowed.
`fun`	Function name (string) to aggregate by. '"percent"' returns the percentage of observations in a given zone. Other options include "sum", "mean", "median", "min", and "max".
`breaks`	A numeric vector of breaks to bin zone frequencies by. Overrides 'n.breaks' if entered.
`n.breaks`	The number of break points to create if breaks are not given directly. Defaults to 10.
`bin_colors`	Optional, a vector of colors to use in plot. Must be same length as breaks. Defaults to 'fishset_viridis(10)'.
`na.rm`	Logical, whether to remove zones with zero counts.
`dat.center`	Logical, whether the plot should center on 'dat' ('TRUE') or 'spat' ('FALSE'). Recommend 'dat.center = TRUE' when aggregating by regulatory zone and 'dat.center = FALSE' when aggregating by closure area.
`output`	Output a '"plot"', '"table"', or both ('"tab_plot"'). Defaults to '"plot"'.

Details

Observations in 'dat' must be assigned to regulatory zones to use this function. See assignment_column() for details. 'zone_summary' can return:

the number of observations per zone ('count = TRUE', 'fun = NULL', 'group = NULL');
the percentage of observations by zone ('count = TRUE', 'fun = "percent"', 'group = NULL');
the percentage of observations by zone and group ('count = TRUE', 'fun = "percent"', 'group = "group"');
a summary of a numeric variable by zone ('count = FALSE', 'var = "var"', 'fun = "sum"', 'group = NULL');
a summary of a numeric variable by zone and group ('count = FALSE', 'var = "var"', 'fun = "sum"', 'group = "group"');
the share (percentage) of a numeric variable by zone ('count = FALSE', 'var = "var"', 'fun = "percent"', 'group = NULL');
the share (percentage) of a numeric variable by zone and group ('count = FALSE', 'var = "var"', 'fun = "percent"', 'group = "group"').
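The kinds of summaries listed above can be reproduced on a toy table with base R, which may help when checking output by hand. This sketch does not call zone_summary; the data frame and its columns (`zone`, `gear`, `catch`) are hypothetical.

```r
# Toy observations already assigned to zones; names are hypothetical.
obs <- data.frame(
  zone  = c("A", "A", "B", "B", "B", "C"),
  gear  = c("trawl", "pot", "trawl", "trawl", "pot", "trawl"),
  catch = c(10, 20, 5, 15, 10, 40)
)

# Number of observations per zone.
n_by_zone <- table(obs$zone)

# Percentage of observations by zone.
pct_by_zone <- 100 * prop.table(n_by_zone)

# Number of observations by zone and group.
n_by_zone_gear <- table(obs$zone, obs$gear)

# Sum of a numeric variable by zone.
sum_by_zone <- tapply(obs$catch, obs$zone, sum)

# Share (percentage) of a numeric variable by zone.
share_by_zone <- 100 * sum_by_zone / sum(sum_by_zone)

print(n_by_zone)
print(round(pct_by_zone, 1))
print(sum_by_zone)
print(share_by_zone)
```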

Examples

## Not run: 

# count # of obs
zone_summary(pollockMainDataTable, spat = nmfs_area, project = "pollock",
             zone.dat = "ZoneID", zone.spat = "NMFS_AREA")

# percent of obs
zone_summary(pollockMainDataTable, spat = nmfs_area, project = "pollock",
             zone.dat = "ZoneID", zone.spat = "NMFS_AREA",
             count = TRUE, fun = "percent")

# count by group
zone_summary(pollockMainDataTable, spat = nmfs_area, project = "pollock",
             zone.dat = "ZoneID", zone.spat = "NMFS_AREA", group = "GEAR_TYPE")

# total catch by zone
zone_summary(pollockMainDataTable, spat = nmfs_area, project = "pollock",
             zone.dat = "ZoneID", zone.spat = "NMFS_AREA",
             var = "OFFICIAL_TOTAL_CATCH_MT", count = FALSE, fun = "sum")

# percent of catch by zone
zone_summary(pollockMainDataTable, spat = nmfs_area, project = "pollock",
             zone.dat = "ZoneID", zone.spat = "NMFS_AREA",
             var = "OFFICIAL_TOTAL_CATCH_MT", count = FALSE, fun = "percent")

## End(Not run)
