---
title: "notes"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{notes}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  prompt = TRUE,
  comment = " "
)
```
```{r setup}
library(stipple)
```
```{r makestuff,echo=FALSE}
library(sf)
set.seed(310366)
space1 = sf::st_as_sf(data.frame(x=runif(10),y=runif(10),Place=LETTERS[1:10]),coords=c("x","y"))
space2 = space1
space2$Pop = 10000+round(runif(10)*10000)
space2 = space2[,c("Place","Pop")]
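# Number of simulated cases per data set (named n_cases to avoid masking base::nrow)
n_cases = 5
# Start times only
data1 = data.frame(T0=sort(1+round(100*runif(n_cases))), Place=sample(LETTERS[1:n_cases]))
# Start times with an infinite end time (no recovery)
data2 = data1
data2$T1 = Inf
data2 = data2[,c("T0","T1","Place")]
# Start and recovery times
data3 = data1
data3$T1 = data3$T0 + 2 + round(9*runif(nrow(data3)))
data3 = data3[,c("T0","T1","Place")]
# Two infections in the same spatial unit
data4 = data3
data4$Place[3] = data4$Place[2]
# Two independent groups of cases
data5a = data.frame(T0=sort(1+round(100*runif(n_cases))), Place=sample(LETTERS[1:n_cases]), Grp=1)
data5b = data.frame(T0=sort(1+round(100*runif(n_cases))), Place=sample(LETTERS[1:n_cases]), Grp=2)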
data5 = rbind(data5a, data5b)
data6 = data5
data6$Age = round(runif(nrow(data5), 19, 65))
data7 = data6
data7$T1 = data7$T0 + 2 + round(9*runif(nrow(data7)))
data7 = data7[,c("T0","T1","Place","Age","Grp")]
```
# Model Fitting
Model-fitting calls in `stipple` take the form:
```
fit = stipple(formula, data, space, time, ...)
```
The `formula` specifies the start (and, optionally, end) times of infections, the
spatial location label, the linear and offset terms, and the distance-decay
function.

The `data` needs a column for the location, columns for the start and (if used) end
times of infections, and any covariates for the linear and offset terms.

The `space` needs a column that uniquely identifies each location and matches the spatial
location label used in the formula and in the data. This will usually be a name or other
unique identifier.

The `time` specifies the time points.

Additional arguments in `...` are passed through.
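
Because the location label ties `data` to `space`, it can help to confirm that every label in the case data exists in the spatial table before fitting. The following is a minimal base R sketch under stated assumptions: `check_places` is a hypothetical helper, not a `stipple` function, and plain data frames stand in for the real inputs.

```r
# Hypothetical helper (not part of stipple): check that every location label
# in the case data matches exactly one row of the spatial table.
check_places = function(data, space, id = "Place") {
  ids = space[[id]]
  if (anyDuplicated(ids) > 0) {
    stop("location identifiers in `space` are not unique")
  }
  missing_ids = setdiff(unique(data[[id]]), ids)
  if (length(missing_ids) > 0) {
    stop("locations in `data` not found in `space`: ",
         paste(missing_ids, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy inputs standing in for real case and spatial data.
space_tbl = data.frame(Place = c("A", "B", "C"), Pop = c(12000, 15000, 11000))
case_tbl = data.frame(T0 = c(4, 9, 30), Place = c("B", "A", "B"))
check_places(case_tbl, space_tbl)
```

The same check should work on an `sf` object, since `[[` extracts columns from it as it does from a data frame.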
# The `space` parameter
For a discrete set of spatial units, use an `sf` spatial object with point or polygon geometry.
```{r showspace}
space1
```
Spatial units may have extra covariate information in columns:
```{r spacecovs}
space2
```
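
An object with this layout could be built roughly as follows. This is an illustrative sketch with made-up coordinates and populations, assuming the `sf` package is installed; the object name `space_example` is an assumption, and no coordinate reference system is set.

```r
library(sf)

# Toy coordinates and covariates; the column names mirror those shown above.
places = data.frame(
  x = c(0.1, 0.5, 0.9),
  y = c(0.2, 0.8, 0.4),
  Place = c("A", "B", "C"),
  Pop = c(12000, 15000, 11000)
)

# Convert to an sf object with point geometry built from the x and y columns.
space_example = sf::st_as_sf(places, coords = c("x", "y"))

# The identifier column should be unique across spatial units.
stopifnot(anyDuplicated(space_example$Place) == 0)
```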
# Data for `stipple` models
At a minimum, the data needs an identifier that matches a column in the `space` data and an
infection start time.
```{r data1}
data1
```
If cases do not recover, so that each case remains infectious for the whole of the
period, then no infection end time is needed. This is equivalent to specifying an
end time of infinity.
```{r data2}
data2
```
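
The open-ended form can be made explicit in base R. The sketch below uses a toy data frame (the name `cases` and its values are assumptions for illustration) and adds an infinite end time to start-time-only data.

```r
# Toy start-time-only data.
cases = data.frame(T0 = c(3, 10, 27), Place = c("A", "B", "C"))

# No recovery: make the open-ended infectious period explicit.
cases$T1 = Inf
cases = cases[, c("T0", "T1", "Place")]

# Every end time should fall after its start time; Inf always does.
stopifnot(all(cases$T1 > cases$T0))
```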
If the infections do result in recoveries then a second column specifying recovery times
should be present.
```{r data3}
data3
```
Multiple infections in the same spatial unit should appear as separate rows
in the data, one row per infection.
```{r data4}
data4
```
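
A quick way to see which spatial units hold more than one infection is to tabulate the location column. This is a base R sketch on toy data; the names `cases`, `cases_per_place`, and `repeated` are assumptions for illustration.

```r
# Toy data with two infections recorded in place "B".
cases = data.frame(
  T0 = c(5, 12, 20, 31),
  T1 = c(9, 18, 27, 40),
  Place = c("A", "B", "B", "C")
)

# Count rows (infections) per spatial unit.
cases_per_place = table(cases$Place)

# Names of the units that appear more than once.
repeated = names(cases_per_place)[as.vector(cases_per_place) > 1]
```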
Multiple independent experiments or observation sets should have extra columns
to indicate the grouping.
```{r data5}
data5
```
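
Grouped data of this shape can be assembled by labelling each experiment and stacking the rows. The sketch below uses toy values; `exp1`, `exp2`, `combined`, and `by_group` are hypothetical names.

```r
# Two toy experiments recorded separately.
exp1 = data.frame(T0 = c(2, 15, 40), Place = c("A", "C", "B"))
exp2 = data.frame(T0 = c(7, 11, 52), Place = c("B", "A", "C"))

# Label each experiment, then stack them into one data frame.
exp1$Grp = 1
exp2$Grp = 2
combined = rbind(exp1, exp2)

# Each group can be recovered by splitting on the grouping column.
by_group = split(combined, combined$Grp)
```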
Infection cases may have explanatory variables as extra columns in the data.
```{r data6}
data6
```
The most general form of the data looks like this:
```{r data7}
data7
```
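
To see how case-level columns such as `Age` relate to place-level columns such as `Pop`, the two tables can be joined on the location label. This is only an inspection aid written in base R with toy values; it is not a description of how `stipple` combines the inputs internally.

```r
# Toy case data in the general layout, plus a toy table of place covariates.
cases = data.frame(
  T0 = c(5, 12, 20),
  T1 = c(9, 18, 27),
  Place = c("A", "B", "B"),
  Age = c(34, 51, 27),
  Grp = c(1, 1, 2)
)
places = data.frame(Place = c("A", "B", "C"), Pop = c(12000, 15000, 11000))

# Attach the place-level covariate to each case by its location label.
joined = merge(cases, places, by = "Place", all.x = TRUE)

# A missing Pop would indicate a case whose Place has no match in `places`.
stopifnot(!anyNA(joined$Pop))
# Recovery times should not precede infection times.
stopifnot(all(joined$T1 >= joined$T0))
```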
# Formula
## Simplest
This says that cases become infected at time `T0` in location `Place` and, once infected,
remain infected. The right-hand side contains only a constant term, so the model
has no covariates:
```
T0@Place ~ 1
```
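
The formula above is ordinary R syntax, so its pieces can be inspected with base R. The sketch below shows how R itself parses the expression; it does not claim to reproduce what `stipple` does internally, and the variable names are chosen for illustration.

```r
# Formulas are not evaluated, so T0 and Place need not exist as objects.
f = T0@Place ~ 1

lhs = f[[2]]   # left-hand side: the call T0@Place
rhs = f[[3]]   # right-hand side: the constant 1

operator = as.character(lhs[[1]])    # the `@` operator
time_var = as.character(lhs[[2]])    # name of the start-time column
place_var = as.character(lhs[[3]])   # name of the location column
```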
## Most Complex
This says that the cases are infected at time `T0`, recover at time `T1`, and are
in location `Place`. We have one case-based covariate, `Age`, and one place-based
covariate, `Pop`. The data consists of multiple replications defined by `Grp`:
```
(T0 - T1)@Place + Age|Grp ~ Pop
```
More complex formulae can be constructed by adding further terms to the case
covariate, grouping, or place covariate parts:
```
(T0 - T1)@Place + Age + Salary | Grp + Type ~ Pop + Area + Gov
```
This specifies a model dependent on the age and salary of the case, and on the
population, area, and government in the location. Replications are defined by
unique combinations of `Grp` and `Type`.
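
The way these parts separate follows from R's operator precedence: `@` binds tightest, then `+` and `-`, then `|`, then `~`. The base R sketch below walks the parse tree of the formula above to pull out each part. It illustrates R's parsing only and is an assumption about structure, not the package's own parser; all variable names are hypothetical.

```r
f = (T0 - T1)@Place + Age + Salary | Grp + Type ~ Pop + Area + Gov

lhs = f[[2]]
rhs = f[[3]]

# `|` binds most loosely on the left-hand side, so it is the top-level call.
stopifnot(identical(lhs[[1]], as.name("|")))
group_vars = all.vars(lhs[[3]])   # grouping part
place_covs = all.vars(rhs)        # place-based covariates

# Peel case covariates off the chain of `+` calls, right to left.
case_part = lhs[[2]]
case_covs = character(0)
while (is.call(case_part) && identical(case_part[[1]], as.name("+"))) {
  case_covs = c(all.vars(case_part[[3]]), case_covs)
  case_part = case_part[[2]]
}

# What remains is the (T0 - T1)@Place term.
stopifnot(identical(case_part[[1]], as.name("@")))
time_vars = all.vars(case_part[[2]])
place_var = as.character(case_part[[3]])
```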
# Formula testing
```{r fortest}
FList = list(
  T0@Place ~ 1,
  T0@Place ~ Age,
  T0@Place + Age ~ Pop,
  T0@Place|Exp ~ 1,
  T0@Place |Exp ~ Pop,
  (T0-T1)@Place ~ 1,
  (T0-T1)@Place + Age ~ 1,
  (T0-T1)@Place + Age ~ Pop,
  (T0-T1)@Place|Exp ~ 1,
  (T0-T1)@Place + Age|Exp ~ Pop
)
parsed = lapply(FList, function(f){
  stipple:::parse_stipple_formula(f)
})
do.call(rbind, parsed)
```