Creates one synthetic disease assessment for each weather time series. In the usual use case, each time series represents one location, plot, or experimental unit, and the assessment is measured once at the end of that series. The simulated response can be binary, ordinal, or a 0–100 percent severity value.

Usage

simulate_assessment_data(
  weather,
  id_col = NULL,
  response_type = c("percent", "binary", "ordinal"),
  n_levels = 10,
  seed = NULL,
  id_prefix = "A",
  time_col = "time",
  response_col = "disease_intensity"
)

Arguments

weather

A weather data frame with at least the standard windcut columns.

id_col

Optional column identifying independent weather series, such as locations. If NULL, the whole weather table is treated as one series.

response_type

Type of disease response to simulate. One of "percent" (the default, listed first in the usage), "binary", or "ordinal".

n_levels

Number of ordinal classes when response_type = "ordinal". For example, n_levels = 10 returns scores from 0 to 9.

seed

Optional random seed for reproducibility.

id_prefix

Prefix used to label assessments when id_col = NULL. With the default "A", the single assessment is labelled A01, as in the first example.

time_col

Name of the weather time column.

response_col

Name of the simulated response column.

Value

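A data frame with one row per weather series. It includes the series identifier (assessment_id, plus the id_col column when one is supplied), the assessment time (assessment_time), the response type (response_type), and the simulated disease response in the column named by response_col.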

Details

Disease risk is generated from interpretable weather summaries over each series: cumulative rain, mean relative humidity, mean temperature, and the proportion of wet observations. The result is intended for examples, tutorials, and tests rather than biological inference.
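
The per-series summaries named above can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the column names (site_id, rain, rh, temp), the toy values, and the rule for what counts as a "wet" observation are all assumptions made for this example.

```r
# Hypothetical weather table. Column names and values are assumptions for
# illustration and are not the documented windcut schema.
weather <- data.frame(
  site_id = rep(c("S01", "S02"), each = 4),
  rain    = c(0, 2, 0, 1, 0, 0, 0, 4),          # rain per observation
  rh      = c(80, 95, 70, 92, 60, 65, 70, 91),  # relative humidity, percent
  temp    = c(18, 17, 20, 19, 22, 23, 21, 20)   # temperature
)

# Assumed wetness rule: an observation is wet if it rained or RH >= 90.
weather$wet <- weather$rain > 0 | weather$rh >= 90

# Reduce one series to the four summaries described in Details.
summarise_series <- function(d) {
  data.frame(
    site_id   = d$site_id[1],
    rain_sum  = sum(d$rain),   # cumulative rain
    rh_mean   = mean(d$rh),    # mean relative humidity
    temp_mean = mean(d$temp),  # mean temperature
    wet_prop  = mean(d$wet)    # proportion of wet observations
  )
}

# One row of summaries per series.
summaries <- do.call(
  rbind,
  lapply(split(weather, weather$site_id), summarise_series)
)
rownames(summaries) <- NULL
summaries
```

How these summaries are combined into a risk score and converted to a binary, ordinal, or percent response is not specified here, so the sketch stops at the summaries themselves.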

Examples

weather <- simulate_weather_series(days = 30, seed = 1)
simulate_assessment_data(weather, response_type = "binary", seed = 1)
#>   assessment_id     assessment_time response_type disease_intensity
#> 1           A01 2024-01-30 23:00:00        binary                 0

site_weather <- simulate_weather_series(
  days = 30,
  n_series = 4,
  id_col = "site_id",
  seed = 1
)

simulate_assessment_data(
  site_weather,
  id_col = "site_id",
  response_type = "ordinal",
  n_levels = 10,
  seed = 1
)
#>   site_id assessment_id     assessment_time response_type disease_intensity
#> 1     S01           S01 2024-01-30 23:00:00       ordinal                 5
#> 2     S02           S02 2024-01-30 23:00:00       ordinal                 5
#> 3     S03           S03 2024-01-30 23:00:00       ordinal                 5
#> 4     S04           S04 2024-01-30 23:00:00       ordinal                 5