This function thins species occurrence records in an n-dimensional environmental space by randomly sampling exactly one point from each occupied n-dimensional grid cell (hypercube).
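The idea described above can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: `thin_sketch()` and the `toy` data are hypothetical, and the sketch assumes grid cells are anchored at zero on every axis, which the package may handle differently.

```r
# Hypothetical helper illustrating one-point-per-hypercube thinning.
# Assumption: cells are anchored at 0, so the cell index along an axis
# is floor(value / resolution).
thin_sketch <- function(data, env_vars, grid_resolution) {
  stopifnot(length(env_vars) == length(grid_resolution))

  # Integer cell index along each environmental axis
  idx <- Map(function(v, r) floor(data[[v]] / r), env_vars, grid_resolution)

  # Combine the per-axis indices into a single hypercube label per row
  cell_id <- do.call(paste, c(unname(idx), sep = "_"))

  # Draw exactly one row at random from each occupied hypercube
  rows_by_cell <- split(seq_len(nrow(data)), cell_id)
  keep <- vapply(
    rows_by_cell,
    function(i) i[sample.int(length(i), 1)],
    integer(1)
  )

  data[sort(unname(keep)), , drop = FALSE]
}

# Toy data: rows 1-2 share a cell, rows 3-4 share a cell, row 5 is alone
toy <- data.frame(
  bio_1  = c(0.10, 0.20, 0.70, 0.80, 1.30),
  bio_12 = c(0.10, 0.30, 0.10, 0.20, 1.40)
)

set.seed(1)
out <- thin_sketch(toy, c("bio_1", "bio_12"), c(0.5, 0.5))
nrow(out)  # one row per occupied cell
```

With a resolution of 0.5 on both axes the five toy points occupy three cells, so three rows are retained; which row is kept from each shared cell depends on the random draw.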
Arguments
- data
A data.frame containing species occurrences and pre-scaled environmental variables, typically the output of `prepare_bean()`.
- env_vars
A character vector of two or more column names representing the environmental variables (dimensions) to use for thinning.
- grid_resolution
A numeric vector giving the grid resolution (cell width) for each environmental axis. Its length must match the length of `env_vars`.
- seed
An optional numeric random seed for reproducibility. If provided, the random number generator state is isolated to this function call, so the caller's global RNG state is left unchanged. Default is `NULL`.
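One common way to isolate a seed to a single call is to save the global RNG state, set the seed, and restore the saved state on exit. The sketch below shows that pattern in base R; `with_local_seed()` is a hypothetical helper, and the source does not say how the package actually achieves the isolation.

```r
# Hypothetical helper: evaluate `expr` under `seed` without disturbing
# the caller's global RNG state. Not taken from the package.
with_local_seed <- function(seed, expr) {
  if (is.null(seed)) {
    return(expr)  # no seed requested: use the global RNG as usual
  }

  # Record the current global RNG state, if one exists yet
  had_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = globalenv()) else NULL

  # Restore (or remove) the global RNG state when the function exits
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = globalenv())
    } else {
      rm(".Random.seed", envir = globalenv())
    }
  }, add = TRUE)

  set.seed(seed)
  force(expr)  # lazy argument is evaluated here, after set.seed()
}

set.seed(42)
before <- .Random.seed

a <- with_local_seed(123, sample(10, 1))
b <- with_local_seed(123, sample(10, 1))

same_draw        <- identical(a, b)                   # same seed, same draw
global_untouched <- identical(before, .Random.seed)   # caller's state preserved
```

The `withr` package offers `with_seed()` for the same purpose; the base R version is shown here only to make the mechanism explicit.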
Value
An object of class `bean_thinned`, which is a list containing:
- thinned_data
A data.frame containing the occurrence records that were retained after the thinning process.
- n_original
An integer representing the number of complete occurrence records in the input data before thinning.
- n_thinned
An integer representing the number of occurrence records remaining after thinning.
- parameters
A list of the key parameters used during the thinning process.
Examples
data(origin_dat_prepared, package = "bean")
thinned <- thin_env_nd(
  data = origin_dat_prepared,
  env_vars = c("bio_1", "bio_12"),
  grid_resolution = c(0.5, 0.5),
  seed = 123
)
print(thinned)
#> --- Bean Stochastic Thinning Results ---
#>
#> Thinned 1024 original points to 26 points.
#> This represents a retention of 2.5% of the data.
#>
#> --------------------------------------
