Attach the package. If it is not installed, or you are developing it, use load_all from the devtools package. Otherwise:
library(openww)
The open wastewater (WW) data is distributed in ODS (OpenDocument Spreadsheet) format, which can be read in Excel, LibreOffice, and other spreadsheet programs. Functions in the package read and reformat sheets from the file.
A sample spreadsheet is included in the package.
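One way to locate the sample spreadsheet is to list the ODS files installed with the package. This is only a sketch: it assumes the file sits under the package's extdata directory (where the site GeoPackage used later is stored) and has an .ods extension; the name ods_candidates is illustrative.

```r
# Assumption: the sample spreadsheet is installed under extdata/ in openww.
# system.file() returns "" if the package or directory is not found,
# in which case list.files() returns an empty character vector.
ods_candidates <- list.files(
  system.file("extdata", package = "openww"),
  pattern = "\\.ods$",
  full.names = TRUE
)
ods_candidates
```

If a single file is listed, its path is a plausible value for the ods_file argument used in the calls that follow.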
The daily data can be read with read_daily_ww_ods():
ww_daily = read_daily_ww_ods(ods_file)
head(ww_daily)
#>             Site_code       date  conc
#> 2  UKENAN_AW_TP000004 2021-06-02  4340
#> 4  UKENAN_AW_TP000004 2021-06-04    NA
#> 6  UKENAN_AW_TP000004 2021-06-06    NA
#> 7  UKENAN_AW_TP000004 2021-06-07    NA
#> 9  UKENAN_AW_TP000004 2021-06-09    NA
#> 11 UKENAN_AW_TP000004 2021-06-11 45426
summary(ww_daily)
#>   Site_code              date                 conc
#>  Length:29232       Min.   :2021-06-01   Min.   :    162
#>  Class :character   1st Qu.:2021-07-25   1st Qu.:   3248
#>  Mode  :character   Median :2021-09-19   Median :   9554
#>                     Mean   :2021-09-17   Mean   :  26661
#>                     3rd Qu.:2021-11-08   3rd Qu.:  26826
#>                     Max.   :2022-01-10   Max.   :1879113
#>                                          NA's   :5053
The spreadsheet is a full table with sites in rows and dates in
columns. A blank cell means that no observation was made. When an
observation was taken but is below the threshold level of detection,
the cell contains the text "tLOD". In the converted data read here,
dates with no observation are excluded, and below-threshold
measurements are recorded as NA.
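The conversion described above can be sketched in base R on a toy table. This is an illustration only, not the package's implementation: the site names, dates, and values are made up, and the real sheets may carry extra columns.

```r
# Toy wide table: one row per site, one column per date (hypothetical values).
# "" stands for a blank cell, "tLOD" for a below-threshold measurement.
wide <- data.frame(
  Site_code    = c("SITE_A", "SITE_B"),
  `2021-06-01` = c("4340", ""),
  `2021-06-02` = c("tLOD", "1205"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

date_cols <- setdiff(names(wide), "Site_code")

# Stack the date columns into long format: one row per site and date.
long <- data.frame(
  Site_code = rep(wide$Site_code, times = length(date_cols)),
  date      = as.Date(rep(date_cols, each = nrow(wide))),
  raw       = unlist(wide[date_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Drop cells where no observation was made.
long <- long[long$raw != "", ]

# Below-threshold cells become NA; everything else is parsed as a number.
long$conc <- ifelse(long$raw == "tLOD", NA_real_,
                    suppressWarnings(as.numeric(long$raw)))
long$raw <- NULL
long
```

The result has three rows: the blank cell for SITE_B is gone, and the "tLOD" cell for SITE_A is kept with conc set to NA.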
The weekly data can be read with read_weekly_ww_ods():
ww_weekly = read_weekly_ww_ods(ods_file)
head(ww_weekly)
#>            Site_code       date  conc
#> 1 UKENAN_AW_TP000004 2021-06-01  1205
#> 2 UKENAN_AW_TP000004 2021-06-08 16369
#> 3 UKENAN_AW_TP000004 2021-06-15  1315
#> 4 UKENAN_AW_TP000004 2021-06-22   442
#> 5 UKENAN_AW_TP000004 2021-06-29  7420
#> 6 UKENAN_AW_TP000004 2021-07-06   220
summary(ww_weekly)
#>   Site_code              date                 conc
#>  Length:8218        Min.   :2021-06-01   Min.   :   160
#>  Class :character   1st Qu.:2021-07-20   1st Qu.:  3734
#>  Mode  :character   Median :2021-09-14   Median : 10536
#>                     Mean   :2021-09-15   Mean   : 22596
#>                     3rd Qu.:2021-11-09   3rd Qu.: 26125
#>                     Max.   :2022-01-04   Max.   :596142
The package files include a GeoPackage of spatial data. One layer in this file holds the point locations of the treatment works.
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.4.0; sf_use_s2() is TRUE
sites_gpkg = system.file("extdata","sites.gpkg",package="openww")
stw = st_read(sites_gpkg, "sites", quiet=TRUE)
head(stw)
#> Simple feature collection with 6 features and 6 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -0.4339846 ymin: 52.13165 xmax: 1.603073 ymax: 53.69675
#> Geodetic CRS:  WGS 84
#>               code                   name              Region_name
#> 1 ukenanawtp000004             ANWICK STW            East Midlands
#> 2 ukenanawtp000012 BARTON-UPON-HUMBER STW Yorkshire and The Humber
#> 3 ukenanawtp000015            BECCLES STW          East of England
#> 4 ukenanawtp000016            BEDFORD STW          East of England
#> 5 ukenanawtp000023             BOSTON STW            East Midlands
#> 6 ukenanawtp000026             BOURNE STW            East Midlands
#>            Site_code          Site_name Population                        geom
#> 1 UKENAN_AW_TP000004             Anwick       5866  POINT (-0.3387851 53.0357)
#> 2 UKENAN_AW_TP000012 Barton-upon-Humber      11518 POINT (-0.4339846 53.69675)
#> 3 UKENAN_AW_TP000015            Beccles      10882   POINT (1.603073 52.45682)
#> 4 UKENAN_AW_TP000016            Bedford     151259  POINT (-0.416479 52.13165)
#> 5 UKENAN_AW_TP000023             Boston      38150 POINT (0.01826465 52.94919)
#> 6 UKENAN_AW_TP000026             Bourne      21752 POINT (-0.3575288 52.76608)
summary(stw)
#>      code               name           Region_name         Site_code
#>  Length:274         Length:274         Length:274         Length:274
#>  Class :character   Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character   Mode  :character
#>
#>
#>
#>   Site_name           Population                 geom
#>  Length:274         Min.   :   2204   POINT        :274
#>  Class :character   1st Qu.:  27412   epsg:4326    :  0
#>  Mode  :character   Median :  59651   +proj=long...:  0
#>                     Mean   : 134611
#>                     3rd Qu.: 122028
#>                     Max.   :3031194
## take a subset of one day
day_7_4 = ww_daily[ww_daily$date == as.Date("2021-07-04"), ]
## merge with spatial data by the common column "Site_code"
day_7_4 = st_as_sf(merge(day_7_4, stw, by = "Site_code"))
## transform concentration to log (below-threshold NA values stay NA)
day_7_4$log_conc = log(day_7_4$conc)
## plot
plot(day_7_4[, "log_conc"], pch = 19, cex = 0.5)
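The merge step above is an inner join on Site_code, so only sites that appear in both tables are kept, and below-threshold readings remain NA after the log transform. A minimal base R sketch with made-up values shows both effects; plain data frames stand in for the sf object, and the site names and numbers are hypothetical.

```r
# Hypothetical measurements for one day; NA marks a below-threshold reading.
obs <- data.frame(
  Site_code = c("SITE_A", "SITE_B", "SITE_C"),
  conc      = c(4340, NA, 1205)
)

# Hypothetical site table; SITE_C has no location record.
sites <- data.frame(
  Site_code  = c("SITE_A", "SITE_B"),
  Population = c(5866, 11518)
)

# merge() defaults to an inner join on the shared column.
joined <- merge(obs, sites, by = "Site_code")
joined$log_conc <- log(joined$conc)
joined
```

SITE_C is dropped because it has no match in the site table, and SITE_B keeps an NA log concentration. Passing all.x = TRUE to merge() would retain unmatched measurements instead.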
Better maps with geographic context can be made with the tmap package.