Skip to content

hypertidy/controlledburn

Repository files navigation

controlledburn

R-CMD-check

The goal of controlledburn is to rasterize without materializing any pixel values.

A very fast rasterization algorithm for polygons in fasterize does the following:

  • restructures polygons to edges
  • considers only non-horizontal polygon edges and indexes them in y,x order (slope always down)
  • determines all affected raster rows and scans these for edge start and end column
  • burns polygon value or identity into every pixel between those starts and ends

controlledburn avoids that last step. With these rasterization functions the return value is not a set of pixels or a side-effect of a materialized raster file but simply those row and column start and end indexes.

This is an expression of the cell abstraction wish item fasterize/issues/11.

TODO

  • move to cpp11
  • rasterize lines fasterize/issues/30
  • formats for import (wk, geos, grd, rct, triangles etc.)
  • streaming with wkb/xy unpack with wk
  • provide output options (see next section)
  • port back into fasterize, with options for efficiently writing out to a tiled and sparse GeoTIFF
  • points is too easy, see vaster::cell_from_xy
  • name the package
  • copy logic from fasterize, and remove Armadillo array handling
  • remove use of raster objects, in favour of input extent and dimension
  • remove all trace of the raster package
  • implement return of the ‘yline, xpix’ and polygon ID info to user (see below)
  • make return of ylin,xpix structure efficient (CollectorList.h ftw)

Lines is working but still has some problems. For polygons it’s pretty good, see fasterize #6 for a remaining issue.

Outputs

Internal function burn_polygon() has arguments sf, extent, dimension. The first is a sf polygons data frame, extent is c(xmin, xmax, ymin, ymax) and dimension is c(ncol, nrow). In fasterize you need an actual non-materialized RasterLayer object, but all that was really used for was the six numbers extent, dimension and as the shell for the in-memory output.

The output of burn_polygon() is a list of four-element indexes start,end,row,poly_id - these are zero-based atm because they reflect the underlying C++ code. Examples shown here flatten this to a 4-column matrix (and add 1).

These options are still in play for what the interface/s could do:

  • record presence of polygon (this is all we have currently) OR ID of polygon
  • tools to format this meaningfully, and plot lazily (see example for quick plot)
  • tools to materialize as actual raster data

I also found this real world example for which this is relevant, discussed in PROJ for very fast lookup for large non-materialized (highly compressed) grids by Thomas Knudsen:

OSGeo/PROJ#1461 (comment)

Installation

You can install the development version of controlledburn like so:

remotes::install_github("hypertidy/controlledburn")

Example

This is a basic example, this is fast, and shows that it works.

pols <- silicate::inlandwaters
library(vaster)
#> 
#> Attaching package: 'vaster'
#> The following object is masked from 'package:stats':
#> 
#>     ts
## define a raster (xmin, xmax, ymin, ymax), (ncol, nrow)
ext <- unlist(lapply(silicate::sc_vertex(pols), range))
dm <- c(50, 40)
r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm)

## our index is triplets of start,end,line where the polygon edge was detected - 
## this essentially an rle by scanline of start,end polygon coverage
index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally

## plot just the start and ends of each scanline detected
xy0 <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, index[,c(3, 3)], index[,1:2]))
plot(xy0)
plot(silicate::PATH0(pols), add = TRUE)

## expand out to every cell
  
cr <- do.call(rbind, apply(index, 1,\(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = 19, cex = .3, col = palr::d_pal(cr[,3]))
plot(silicate::PATH0(pols), add = TRUE)

It scales to very large tasks, with small output.

dm <- c(500000, 400000)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm))
#>    user  system elapsed 
#>   0.913   0.068   0.982
length(r)
#> [1] 989153

## consider a prod(dm) raster of type double (or even bool) obviously would compress again but why churn
pryr::object_size(r)  
#> 71.22 MB

The following is inefficient, but shows that we get the right result.

dm <- c(500, 400)
system.time(r <- controlledburn:::burn_polygon(pols, extent = ext,
                               dimension = dm))
#>    user  system elapsed 
#>   0.002   0.001   0.003

index <- matrix(unlist(r, use.names = F), ncol = 4L, byrow = TRUE) + 1 ## plus one because 0-index internally

## now go inefficient, this is every column,row index, then converted to cell, converted to xy
cr <- do.call(rbind, apply(index, 1, \(.x) cbind(seq(.x[1], .x[2]), .x[3], .x[4])))
xy <- vaster::xy_from_cell(dm, ext, vaster::cell_from_row_col(dm, cr[,2], cr[,1]))
plot(xy, pch = ".", col = palr::d_pal(cr[,3]))

rr <- terra::rast(cbind(xy, 0), type = "xyz")

rr[terra::cellFromXY(rr, xy)] <- 1
terra::plot(rr ,col = "firebrick", asp = NA)
plot(silicate::SC0(pols), add = TRUE)

Code of Conduct

Please note that the controlledburn project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

About

don't materialize, just rasterize!

Topics

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE
MIT
LICENSE.md

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 3

  •  
  •  
  •