pycerpt is a command-line utility for extracting highlighted text from PDFs.
Get the latest version with pip: pip install pycerpt
pycerpt outputs Markdown by default. To print the highlights to the terminal: excerpt test.pdf
To save them to a file: excerpt test.pdf > out.md
or: excerpt test.pdf out.md
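The two-argument form can be scripted when several PDFs need processing. The following is a minimal sketch, not part of pycerpt itself: it assumes the excerpt command is on your PATH, and the helper names excerpt_command and convert_all are hypothetical. It relies only on the excerpt INPUT OUTPUT invocation shown here.

```python
import subprocess
from pathlib import Path


def excerpt_command(pdf_path):
    """Build the argument list for `excerpt INPUT OUTPUT`.

    The Markdown file is written next to the input PDF, with the
    .pdf suffix swapped for .md (a naming choice made for this sketch).
    """
    src = Path(pdf_path)
    out = src.with_suffix(".md")
    return ["excerpt", str(src), str(out)]


def convert_all(directory):
    """Run excerpt on every PDF directly inside `directory` (non-recursive).

    Assumes the `excerpt` executable installed by pycerpt is on PATH;
    check=True raises CalledProcessError if any run exits with an error.
    """
    for pdf in sorted(Path(directory).glob("*.pdf")):
        subprocess.run(excerpt_command(pdf), check=True)
```

Calling convert_all("papers") would then write one .md file per PDF in the papers directory. Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.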
For PDF generation, additional dependencies are needed: pip install pycerpt[pdf]
Usage: excerpt test.pdf out.pdf