
Simulate One Disease Assessment per Weather Series
Source: R/simulate_assessment_data.R
Creates one synthetic disease assessment for each weather time series. In the usual use case, each time series represents one location, plot, or experimental unit, and the assessment is measured once at the end of that series. The simulated response can be binary, ordinal, or a 0–100 percent severity value.
Usage
simulate_assessment_data(
  weather,
  id_col = NULL,
  response_type = c("percent", "binary", "ordinal"),
  n_levels = 10,
  seed = NULL,
  id_prefix = "A",
  time_col = "time",
  response_col = "disease_intensity"
)
Arguments
- weather
A weather data frame with at least the standard windcut columns.
- id_col
Optional column identifying independent weather series, such as locations. If NULL, the whole weather table is treated as one series.
- response_type
Type of disease response to simulate. One of "binary", "ordinal", or "percent".
- n_levels
Number of ordinal classes when response_type = "ordinal". For example, n_levels = 10 returns scores from 0 to 9.
- seed
Optional random seed for reproducibility.
- id_prefix
Prefix used to label assessments when id_col = NULL.
- time_col
Name of the weather time column.
- response_col
Name of the simulated response column.
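The relationship between the three response types and n_levels can be illustrated with a small sketch. The helper below, risk_to_response(), is hypothetical and not part of the package. It assumes a latent risk between 0 and 1 and converts it deterministically, whereas the real function draws random values (hence its seed argument) and may use a different mapping.

```r
# Hypothetical helper, not the package's implementation: convert a latent
# risk in [0, 1] into one of the three response types.
risk_to_response <- function(risk,
                             response_type = c("percent", "binary", "ordinal"),
                             n_levels = 10) {
  response_type <- match.arg(response_type)
  stopifnot(is.numeric(risk), all(risk >= 0 & risk <= 1), n_levels >= 2)
  switch(
    response_type,
    # Severity on a 0-100 scale
    percent = round(100 * risk, 1),
    # Assumed threshold of 0.5 for presence/absence
    binary = as.integer(risk >= 0.5),
    # floor() yields 0..n_levels; pmin() folds risk == 1 into the top class,
    # so scores run from 0 to n_levels - 1
    ordinal = pmin(floor(risk * n_levels), n_levels - 1)
  )
}

risk <- c(0, 0.26, 0.5, 0.99, 1)
risk_to_response(risk, "percent")                # 0 26 50 99 100
risk_to_response(risk, "binary")                 # 0 0 1 1 1
risk_to_response(risk, "ordinal", n_levels = 10) # 0 2 5 9 9
```

With n_levels = 10 the ordinal scores never exceed 9, which matches the 0 to 9 range described for n_levels.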
Value
A data frame with one row per weather series. It includes the series identifier, assessment time, response type, and simulated disease response.
Details
Disease risk is generated from interpretable weather summaries over each series: cumulative rain, mean relative humidity, mean temperature, and the proportion of wet observations. The result is intended for examples, tutorials, and tests rather than biological inference.
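The four summaries named above can be computed per series with base R, as in the following minimal sketch. It does not reproduce the package's internals. The column names rain, rh, and temp, the rule for a "wet" observation, and the coefficients of the risk score are all assumptions made for illustration.

```r
# Toy hourly weather for two series; every column name here is an assumption.
set.seed(1)
n <- 48
weather <- data.frame(
  site_id = rep(c("S01", "S02"), each = n),
  time = rep(
    seq(as.POSIXct("2024-01-01 00:00:00", tz = "UTC"), by = "hour", length.out = n),
    times = 2
  ),
  rain = rexp(2 * n, rate = 2) * rbinom(2 * n, 1, 0.3), # mm, mostly zero
  rh = runif(2 * n, 50, 100),                           # percent
  temp = rnorm(2 * n, mean = 18, sd = 4)                # degrees Celsius
)

# One row of summaries per series, stamped with the last time in the series.
summarise_series <- function(w) {
  data.frame(
    assessment_time = max(w$time),
    rain_total = sum(w$rain),
    rh_mean = mean(w$rh),
    temp_mean = mean(w$temp),
    # Assumed definition of "wet": any rain, or relative humidity of 90% or more
    wet_prop = mean(w$rain > 0 | w$rh >= 90)
  )
}

by_site <- split(weather, weather$site_id)
summaries <- do.call(rbind, lapply(by_site, summarise_series))
summaries <- data.frame(site_id = names(by_site), summaries, row.names = NULL)

# Illustrative risk score: logistic transform of a linear combination
# with made-up coefficients.
linear <- -2 +
  0.05 * summaries$rain_total +
  0.03 * (summaries$rh_mean - 70) -
  0.02 * abs(summaries$temp_mean - 20) +
  2 * summaries$wet_prop
summaries$risk <- plogis(linear)

summaries
```

The result has one row per series, mirroring the shape of the value returned by simulate_assessment_data(). The risk column lies strictly between 0 and 1 and could then be converted into a binary, ordinal, or percent response.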
Examples
weather <- simulate_weather_series(days = 30, seed = 1)
simulate_assessment_data(weather, response_type = "binary", seed = 1)
#>   assessment_id     assessment_time response_type disease_intensity
#> 1           A01 2024-01-30 23:00:00        binary                 0

site_weather <- simulate_weather_series(
  days = 30,
  n_series = 4,
  id_col = "site_id",
  seed = 1
)
simulate_assessment_data(
  site_weather,
  id_col = "site_id",
  response_type = "ordinal",
  n_levels = 10,
  seed = 1
)
#>   site_id assessment_id     assessment_time response_type disease_intensity
#> 1     S01           S01 2024-01-30 23:00:00       ordinal                 5
#> 2     S02           S02 2024-01-30 23:00:00       ordinal                 5
#> 3     S03           S03 2024-01-30 23:00:00       ordinal                 5
#> 4     S04           S04 2024-01-30 23:00:00       ordinal                 5