In areas where good OpenStreetMap
building surveys are available, this data makes a good proxy for a
population density when no census is available. The osmdata
package lets a user query the OSM database for particular features. This
vignette illustrates how to get OSM building data and use it for a
sampling design.
First we need to define a polygon for our study region. Here we
construct it from a WKT string, but typically this might be read in from
a data source such as a geopackage or a shapefile, or even drawn on a
map using R’s interactove locator
function.
library(sidclosepairs)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.4.0; sf_use_s2() is TRUE
study_area = "POLYGON (
(-2.794796 54.02322, -2.801250 54.02686, -2.802311 54.02867, -2.802351 54.02891, -2.802334 54.03084,
-2.802218 54.03190, -2.801649 54.03199, -2.801142 54.03202, -2.784621 54.03237, -2.780087 54.03196,
-2.778327 54.03087, -2.777984 54.03048, -2.775405 54.02391, -2.775982 54.02327, -2.794796 54.02322))"
## remove newline markers
study_area = gsub("\n","", study_area)
# convert WKT to spatial polygon data frame
study_area = st_as_sf(data.frame(wkt=study_area), wkt="wkt", crs=4326)
Now with our study area defined, we construct a query using
osmdata
package functions and get back the points in the
bounding box of our study area. Taking the centroid of the buildings
makes sure we reduce polygons down to points. Finally we take only the
points that are within the polygon itself, not just the bounding box.
We’ll wrap these calls into a function for re-use and call it to get
buildings in our study area.
library(osmdata)
#> Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
## or use another overpass server
set_overpass_url("https://overpass-api.de/api/interpreter")
get_buildings <- function(area){
osmd <- opq(bbox = st_bbox(area))
osmd <- add_osm_feature(osmd, key = "building")
osmd <- osmdata_sf(osmd)
centroids_in_area <- st_intersection(st_centroid(osmd$osm_polygons), area)
return(centroids_in_area)
}
buildings = get_buildings(study_area)
#> Warning: st_centroid assumes attributes are constant over geometries
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
If you want to do this for several study areas you should consider
wrapping the osmdata
calls into a single function for
convenience.
Now we’ll sample fifty points in total, with ten sets of close pairs, keeping our inhibitory points 100 metres apart, and our close pairs in a 20 metre radius.
First we have to convert our data to a local Cartesian coordinate system, so we use EPSG code 27700, for the Ordnance Survey GB system which is appropriate for this region in Lancaster.
Now we can sample our 50 points with 10 close pairs using the “census” proposal method for the inhibitory points, and a 20 metre radius for the close pairs.
pts_buildings = icpSample(50, 10, 100, 20, censusxy=buildings, poly=study_gb, proposal="census")
pts_buildings
#> Simple feature collection with 50 features and 0 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: 347574.9 ymin: 458892.7 xmax: 349078.5 ymax: 459832.3
#> Projected CRS: OSGB36 / British National Grid
#> First 10 features:
#> geometry
#> 1 POINT (348653.1 458927.4)
#> 2 POINT (348129 459428.4)
#> 3 POINT (348082.6 459700.9)
#> 4 POINT (348169.2 459752.9)
#> 5 POINT (347897.2 459020.8)
#> 6 POINT (348374.2 459217.5)
#> 7 POINT (348655.3 459616.5)
#> 8 POINT (348542.3 459182.1)
#> 9 POINT (347819.7 459264.6)
#> 10 POINT (347574.9 459634.9)
To show these points we can first plot the building locations in grey, then overlay the first 40 points in red to show the inhibitory points, and then add the close pairs in blue.
plot(study_gb)
plot(st_geometry(buildings), add=TRUE, col="grey", cex=.5)
plot(st_geometry(pts_buildings[1:40,]), add=TRUE, col="red")
plot(st_geometry(pts_buildings[41:50,]), add=TRUE, col="blue")
Note that the blue close pair points are not guaranteed to be on a building, but some other method will need to be used to select the second building for a survey related to that point. Often the nearest building to the point will be sampled, or the nearest in the direction indicated by the two points.