---
title: "notes"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{notes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  prompt=TRUE,
  comment = " "
)
```

```{r setup}
library(stipple)
```

```{r makestuff,echo=FALSE}
library(sf)
set.seed(310366)
space1 = sf::st_as_sf(data.frame(x=runif(10),y=runif(10),Place=LETTERS[1:10]),coords=c("x","y"))
space2 = space1
space2$Pop = 10000+round(runif(10)*10000)
space2 = space2[,c("Place","Pop")]

nrow = 5
data1 = data.frame(T0=sort(1+round(100*runif(nrow))), Place=sample(LETTERS[1:nrow]))

data2 = data1
data2$T1 = Inf
data2 = data2[,c("T0","T1","Place")]

data3 = data1
data3$T1 = data3$T0 + 2 + round(9*runif(nrow(data3)))
data3 = data3[,c("T0","T1","Place")]

data4 = data3
data4$Place[3] = data4$Place[2]

data5a = data.frame(T0=sort(1+round(100*runif(nrow))), Place=sample(LETTERS[1:nrow]), Grp=1)
data5b = data.frame(T0=sort(1+round(100*runif(nrow))), Place=sample(LETTERS[1:nrow]), Grp=2)
data5 = rbind(data5a, data5b)

data6 = data5
data6$Age = round(runif(nrow(data5), 19, 65))

data7 = data6
data7$T1 = data7$T0 + 2 + round(9*runif(nrow(data7)))
data7 = data7[,c("T0","T1","Place","Age","Grp")]
```

# Model Fitting

Stipple functions are of the form:

```
fit = stipple(formula, data, space, time, ...)
```

The `formula` specifies the start (and, optionally, end) times of infections, the spatial location label, the linear and offset terms, and the distance-decay function.

The `data` needs a column for location, columns for the start and (where relevant) end times of infections, and any covariates used in the linear and offset terms.

The `space` needs a column that uniquely identifies each location and matches the spatial location named in the formula (and present in the data). This will usually be a name or other unique identifier.

The `time` specifies the time points.

Any further arguments in `...` are passed through.

# The `space` parameter

For a discrete set of spatial units, use an `sf` spatial object with point or polygon geometry.

```{r showspace}
space1
```

Spatial units may carry extra covariate information in additional columns:

```{r spacecovs}
space2
```

# Data for `stipple` models

At a minimum, the data needs a location identifier whose values match the identifier column in the `space` data, plus an infection start time.

```{r data1}
data1
```

If cases never recover, so that each case stays infectious for the rest of the observation period, no infection end time is needed. This is equivalent to specifying an end time of infinity.

```{r data2}
data2
```

If infections do end in recovery, a second column giving the recovery times should be present.

```{r data3}
data3
```

Multiple infections in the same spatial unit should appear as separate rows in the data.

```{r data4}
data4
```

Multiple independent experiments or observation sets should have extra columns to indicate the grouping.

```{r data5}
data5
```

Infection cases may have explanatory variables as extra columns in the data.

```{r data6}
data6
```

The most general form of the data looks like this:

```{r data7}
data7
```

# Formula

## Simplest

This says that cases occur at time `T0` in location `Place` and, once infected, remain infected. The right-hand side contains only a constant term, so the model has no covariates:

```
T0@Place ~ 1
```

## Most Complex

This says that cases are infected at time `T0`, recover at time `T1`, and are in location `Place`. There is one case-based covariate, `Age`, and one place-based covariate, `Pop`. The data consists of multiple replications defined by `Grp`:

```
(T0 - T1)@Place + Age|Grp ~ Pop
```

More complex formulae can be constructed by adding further terms to the case covariates, the grouping part, or the place covariates:

```
(T0 - T1)@Place + Age + Salary | Grp + Type ~ Pop + Area + Gov
```

This specifies a model that depends on the age and salary of the case (`Age`, `Salary`) and on the population, area, and government of the location (`Pop`, `Area`, `Gov`).
Replications are defined by unique combinations of `Grp` and `Type`.

# Formula testing

```{r fortest}
FList = list(
    T0@Place ~ 1,
    T0@Place ~ Age,
    T0@Place + Age ~ Pop,
    T0@Place|Exp ~ 1,
    T0@Place |Exp ~ Pop,
    (T0-T1)@Place ~ 1,
    (T0-T1)@Place + Age ~ 1,
    (T0-T1)@Place + Age ~ Pop,
    (T0-T1)@Place|Exp ~ 1,
    (T0-T1)@Place + Age|Exp ~ Pop
)
parsed = lapply(FList, function(f){
    stipple:::parse_stipple_formula(f)
})
do.call(rbind, parsed)
```
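
The requirement that location labels in the data match the identifier column in `space` can be checked before fitting. The following is a minimal sketch in base R only. The toy data frames and the helper `unmatched_places` are hypothetical illustrations and not part of `stipple`, and a plain data frame stands in for the `sf` object.

```r
# Hypothetical toy inputs: a plain data frame stands in for the `sf` space object.
space_toy = data.frame(Place = LETTERS[1:5],
                       Pop = c(12000, 15000, 11000, 18000, 16000))
data_toy = data.frame(T0 = c(3, 10, 27, 41),
                      Place = c("B", "A", "E", "Z"))

# Hypothetical helper (not a stipple function): return the location labels
# used in the case data that have no matching row in the space data.
unmatched_places = function(data, space, id = "Place") {
    setdiff(unique(data[[id]]), space[[id]])
}

# The space identifier must be unique for the match to be unambiguous.
stopifnot(anyDuplicated(space_toy$Place) == 0)

# Labels present in the case data but missing from the space data.
unmatched_places(data_toy, space_toy)
```

Here the label `Z` is reported because it has no row in `space_toy`. An empty result means every case can be assigned to a spatial unit.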