r - Quickly filter down a grid of sf points according to a polygon

I want to make grids (in the sense of data frames of x- and y-coordinates) over the US, or regions of the US, throwing out any points in the original grid that are beyond the borders of the US. I have some code that seems to work, but it's quite slow when I shrink the grid increment down to 1 km (1e3) or so. How can this be done faster? Perhaps there's a way to build the simple feature collection that I need without a lapply or loop, or perhaps this can be done with a MULTIPOINT instead of a simple feature collection of POINTs.

library(sf)

crs.us.atlas = 2163 # https://epsg.io/2163

us = read_sf(paste0("/vsizip/",
    "/tmp/us.zip")) # from: https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_state_500k.zip
# Filter down to the lower 48 + DC.
us = us[us$STUSPS %in% setdiff(c(state.abb, "DC"), c("AK", "HI")),]
us = st_transform(us, crs = crs.us.atlas)
l = as.list(st_bbox(us))

increment = 1e5

g = expand.grid(
    x = seq(l$xmin, l$xmax, by = increment),
    y = seq(l$ymin, l$ymax, by = increment))

message("Running the slow part")
# Keep only the grid points that intersect the US polygons. The sfc is built
# one point at a time with lapply, which is expensive for large grids.
print(system.time(g <- g[0 < sapply(FUN = length, st_intersects(
    st_as_sfc(crs = crs.us.atlas, lapply(1 : nrow(g), function(i)
        st_point(as.numeric(g[i, c("x", "y")])))),
    us)),]))

print(nrow(g))
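
For reference, here's roughly what I have in mind for the MULTIPOINT variant (an untested sketch; I don't know whether it would actually be faster):

# Build a single MULTIPOINT from the coordinate matrix, then split it into POINTs.
g_sfc = st_cast(
    st_sfc(st_multipoint(as.matrix(g)), crs = crs.us.atlas),
    "POINT")
# Keep the rows of g whose point intersects any US polygon.
g = g[lengths(st_intersects(g_sfc, us)) > 0,]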

1 Answer

I would solve the problem as follows. First, load the packages (tmap is used just for the map, so you can easily ignore it).

library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(sfheaders)
library(tmap)

Download and read in the data

temp_zip <- tempfile(fileext = ".zip")
download.file(
  "https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_state_500k.zip", 
  destfile = temp_zip
)
unzip(temp_zip, exdir = tempdir())
us <- st_read(file.path(tempdir(), "cb_2019_us_state_500k.shp"))
#> Reading layer `cb_2019_us_state_500k' from data source `C:\Users\Utente\AppData\Local\Temp\RtmpCakscO\cb_2019_us_state_500k.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 56 features and 9 fields
#> geometry type:  MULTIPOLYGON
#> dimension:      XY
#> bbox:           xmin: -179.1489 ymin: -14.5487 xmax: 179.7785 ymax: 71.36516
#> geographic CRS: NAD83

Filter down to the lower 48 + DC.

us = us[us$STUSPS %in% setdiff(c(state.abb, "DC"), c("AK", "HI")),]
us = st_transform(us, crs = 2163)

Define the increment, the grid, and the sfc object. The key part is to create an sfc_point object for the subsequent operations

increment = 1e5
g = expand.grid(
  x = seq(st_bbox(us)$xmin, st_bbox(us)$xmax, by = increment),
  y = seq(st_bbox(us)$ymin, st_bbox(us)$ymax, by = increment)
)
g_sfc <- sfc_point(as.matrix(g)) %>% 
  st_set_crs(2163)
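
(If you prefer to stay within sf, st_as_sf() gives a vectorised alternative with no lapply; sfheaders is generally faster on very large grids, though. The variable name below is just for comparison.)

# Plain-sf equivalent of the sfc_point() call above, shown only for comparison
g_sfc_alt <- st_geometry(st_as_sf(g, coords = c("x", "y"), crs = 2163))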

Find the ID(s) of points included in the US

my_ids <- unlist(st_contains(us, g_sfc))
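
If you then want the filtered grid back as a plain data frame of coordinates (as in the question), something along these lines should do it:

# Subset the original grid; sort() keeps the rows in their original order,
# since st_contains() returns the indices grouped by state.
g_in <- g[sort(my_ids), ]
# Equivalently, from the sfc object:
# g_in <- as.data.frame(st_coordinates(g_sfc[sort(my_ids)]))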

Visualise result

tm_shape(g_sfc) + 
  tm_dots(col = "grey", size = 0.05) + 
tm_shape(g_sfc[my_ids]) + 
  tm_dots(col = "darkred", size = 0.05) + 
tm_shape(us) + 
  tm_borders(lwd = 1.3)

Repeat for 1e3 (but I won't add any plot since that's almost 13 million points)

increment = 1e3
g = expand.grid(
  x = seq(st_bbox(us)$xmin, st_bbox(us)$xmax, by = increment),
  y = seq(st_bbox(us)$ymin, st_bbox(us)$ymax, by = increment)
)

It takes approximately 20 seconds to generate the data

system.time({
  g_sfc <- sfc_point(as.matrix(g)) %>% 
    st_set_crs(2163)
})
#>    user  system elapsed 
#>   16.29    0.92   17.27

and 80 seconds to find the IDs of the points within the US.

system.time({
  my_ids <- unlist(st_contains(us, g_sfc))
})
#>    user  system elapsed 
#>   67.75    8.32   80.86

Created on 2021-01-13 by the reprex package (v0.3.0)

If you need something even more efficient, I suggest you take a look at the polyclip package.

