This function is a wrapper to fit models in caret using caretSDM data.
Usage
train_sdm(occ,
          pred = NULL,
          algo,
          ctrl = NULL,
          variables_selected = NULL,
          parallel = FALSE,
          ...)
get_tune_length(i)
algorithms_used(i)
get_models(i)
get_validation_metrics(i)
mean_validation_metrics(i)
Arguments
- occ
  An occurrences or an input_sdm object.
- pred
  A predictors object. If occ is an input_sdm object, then pred is obtained from it.
- algo
  A character vector. Algorithms to be used. For a complete list, see https://topepo.github.io/caret/available-models.html or caretSDM::algorithms.
- ctrl
  A trainControl object to be used to build models. See ?caret::trainControl.
- variables_selected
  A vector of variables to be used as predictors. If NULL, predictor names from pred will be used. Can also be a selection method (e.g. 'vif').
- parallel
  Should a parallelization method be used (not yet implemented)?
- ...
  Additional arguments to be passed to the caret::train function.
- i
  A models or an input_sdm object.
Details
The algorithms object contains a table comparing the available algorithms. If the function
detects that the necessary packages are not installed, it will ask to install them. This
happens only the first time you use an algorithm.
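As a rough illustration of the kind of check described above (not caretSDM's actual implementation), base R's requireNamespace() can report which required packages are not installed. The helper name missing_packages and the second package name below are hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) requireNamespace(p, quietly = TRUE),
    logical(1)
  )
  unname(pkgs[!installed])
}

# "stats" ships with R; the second name is a made-up placeholder.
needed <- c("stats", "notARealPackage12345")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
}
```

A real workflow would then prompt the user before calling install.packages() on the missing names.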
get_tune_length
returns the length used in the grid search for tuning.
algorithms_used
returns the names of the algorithms used in the modeling process.
get_models
returns a list
with the trained models (class train
) for each species.
get_validation_metrics
returns a list
with a data.frame
for each species
with complete values for ROC, Sensitivity and Specificity, with their respective standard
deviations (SD), and TSS for each of the algorithms and pseudoabsence datasets used.
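TSS (True Skill Statistic) is conventionally defined as Sensitivity + Specificity - 1. The base R sketch below computes it from made-up values. The data frame and its column names are illustrative assumptions, not the exact structure returned by get_validation_metrics.

```r
# Made-up validation results for two algorithms (illustrative values only).
metrics <- data.frame(
  algo = c("naive_bayes", "kknn"),
  Sensitivity = c(0.90, 0.80),
  Specificity = c(0.70, 0.85)
)

# TSS ranges from -1 to 1; values near 0 indicate performance
# no better than random.
metrics$TSS <- metrics$Sensitivity + metrics$Specificity - 1
print(metrics)
```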
mean_validation_metrics
returns a list
with a tibble
for each species
summarizing the values of ROC, Sensitivity, Specificity and TSS for each of the algorithms used.
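To make the idea of a per-algorithm summary concrete, the sketch below averages made-up metrics across pseudoabsence datasets with base R's aggregate(). It assumes the summary is a plain mean per algorithm; the data frame, its column names and its values are hypothetical and do not reproduce the tibble that mean_validation_metrics returns.

```r
# Made-up per-dataset results: two algorithms, two pseudoabsence datasets each.
results <- data.frame(
  algo = c("naive_bayes", "naive_bayes", "kknn", "kknn"),
  ROC  = c(0.80, 0.90, 0.70, 0.80),
  TSS  = c(0.50, 0.70, 0.40, 0.60)
)

# Average each metric within each algorithm.
mean_metrics <- aggregate(cbind(ROC, TSS) ~ algo, data = results, FUN = mean)
print(mean_metrics)
```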
Examples
# Create sdm_area object:
sa <- sdm_area(parana, cell_size = 100000, crs = 6933)
#> ! Making grid over study area is an expensive task. Please, be patient!
#> ℹ Using GDAL to make the grid and resample the variables.
# Include predictors:
sa <- add_predictors(sa, bioc) |> select_predictors(c("bio1", "bio12"))
#> ! Making grid over the study area is an expensive task. Please, be patient!
#> ℹ Using GDAL to make the grid and resample the variables.
# Include scenarios:
sa <- add_scenarios(sa)
# Create occurrences:
oc <- occurrences_sdm(occ, crs = 6933) |> join_area(sa)
#> Warning: Some records from `occ` do not fall in `pred`.
#> ℹ 2 elements from `occ` were excluded.
#> ℹ If this seems too much, check how `occ` and `pred` intersect.
# Create input_sdm:
i <- input_sdm(oc, sa)
# Pseudoabsence generation:
i <- pseudoabsences(i, method="bioclim")
# Custom trainControl:
ctrl_sdm <- caret::trainControl(method = "repeatedcv",
                                number = 2,
                                repeats = 1,
                                classProbs = TRUE,
                                returnResamp = "all",
                                summaryFunction = summary_sdm,
                                savePredictions = "all")
# Train models:
i <- train_sdm(i, algo = c("naive_bayes"), ctrl = ctrl_sdm) |>
  suppressWarnings()