daiR is an R package for Google Document AI, a powerful server-based OCR processor. The package provides a wrapper for the Document AI API and comes with additional tools for output file parsing and text reconstruction. See the daiR website for more details.
Google Document AI is a paid service that requires a Google Cloud account and a Google Cloud Storage bucket. I recommend using Mark Edmondson's googleCloudStorageR package in combination with daiR.
Install the latest development version from GitHub:
devtools::install_github("hegghammer/daiR")
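
The install command above assumes that devtools is already present. As a minimal sketch for a fresh R session, the snippet below first installs devtools from CRAN if it is missing, then installs daiR and reports the installed version. The guard around devtools is my addition rather than part of the official instructions, and the snippet needs network access to CRAN and GitHub.

```r
# Install devtools from CRAN only if it is not already available
# (assumption: a CRAN mirror is configured for this session)
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Install the development version of daiR from GitHub
devtools::install_github("hegghammer/daiR")

# Load the package and print the installed version
library(daiR)
print(packageVersion("daiR"))
```

Calling the Document AI API afterwards still requires the Google Cloud account and storage bucket mentioned earlier.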