Releases · michelcrypt4d4mus/pdfalyzer · GitHub

25 Jan 08:53

michelcrypt4d4mus

v1.19.0 Latest

Allow almost all command line options to be set permanently via environment variables or a custom .pdfalyzer file; add --env-vars option to display exactly which command line options can be set by which variables
Add --export-png option to render .png images of output
Add --echo-command option to save the exact command used along with the output
Add --no-timestamps option for exported filenames
Add --suppress-output option
Highlight some of the interesting object reference keys in the rich tree view
In the rich tree view, sort DictionaryObject key/value pairs alphabetically by key, except that /Type and /Subtype stay at the top
Display the number of revisions (max generation value) in metadata table
Display the object types/labels determined while walking the tree instead of those from newly constructed PdfObjectProperties, which may lack nuance in their labeling
Better labeling of /StructElem objects in a StructTreeRoot hierarchy
Coerce /Nums number trees into dict-like objects for the purposes of assigning addresses
/Annots and other indeterminate nodes now have /Subtype integrated into their labeling (e.g. /Annots:Link instead of just /Annots)
Fancier table for PDF metadata that also contains the number of pages, images, and revisions (if possible)
Test suite now checks results against pre-recorded fixture output
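
The key ordering described above for the rich tree view (alphabetical, with /Type and /Subtype pinned to the top) can be sketched roughly as follows. This is only an illustration and not pdfalyzer's actual code: the helper name `sorted_pdf_items`, the `PRIORITY_KEYS` constant, and the choice to put /Type before /Subtype are all assumptions.

```python
# Hypothetical sketch, not taken from the pdfalyzer source.
# Assumption: /Type is listed before /Subtype when both are present.
PRIORITY_KEYS = ("/Type", "/Subtype")


def sorted_pdf_items(obj: dict) -> list:
    """Return (key, value) pairs with priority keys first, the rest alphabetical."""
    def sort_key(item):
        key = item[0]
        if key in PRIORITY_KEYS:
            # Group 0 sorts ahead of group 1; keep the order given in PRIORITY_KEYS.
            return (0, PRIORITY_KEYS.index(key), "")
        return (1, 0, key)

    return sorted(obj.items(), key=sort_key)


# A plain dict standing in for a pypdf DictionaryObject.
example = {
    "/Rect": [0, 0, 10, 10],
    "/Subtype": "/Link",
    "/A": {"/S": "/URI"},
    "/Type": "/Annot",
}

for key, value in sorted_pdf_items(example):
    print(key, value)
```

Using a tuple as the sort key keeps the two pinned keys in a fixed order while everything else falls back to a plain string comparison.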

20 Jan 21:02

michelcrypt4d4mus

v1.18.1

Ensure cryptography package is installable as an extra

20 Jan 20:12

michelcrypt4d4mus

v1.18.0

Handle encrypted PDFs via --password option and/or prompting user for the password
pdfalyze script now returns error code 1 to the shell if there are unplaced nodes, unless the new --allow-missed-nodes option is used
Upgrade pypdf to 6.6.0 and make use of new Font object
Send logs to stderr instead of stdout, redirect and reformat pypdf logs, other logging improvements
Sort Pdfalyzer.font_infos array by node ID
Placement of formerly orphaned nodes:
- Force stranded /Pages nodes to be children of /Catalog
- Better placement of orphaned nodes that are members of an ArrayObject
- Place special /Linearization nodes under root
- Force /Xobject nodes with /Subtype of /Form to be children of /AcroForm nodes
- Remove non_tree_relationships if there's an actual parent/child relationship
- Insert grandparents where nodes are in an array but also claim a node other than the array as their parent
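
The exit status rule for unplaced nodes in this release can be pictured with a small sketch. It is an assumption-laden illustration rather than the script's real logic: the function name `exit_code` and its parameters are hypothetical, and returning 0 in every other case is assumed (the notes only state that 1 is returned when nodes are unplaced and --allow-missed-nodes is not given).

```python
# Hypothetical sketch of the exit status rule; not pdfalyzer's actual code.
def exit_code(unplaced_node_count: int, allow_missed_nodes: bool = False) -> int:
    """Return 1 if any nodes were left unplaced and that was not explicitly allowed."""
    if unplaced_node_count > 0 and not allow_missed_nodes:
        return 1
    # Assumption: every other case counts as success.
    return 0


print(exit_code(2))                           # unplaced nodes, option not given
print(exit_code(2, allow_missed_nodes=True))  # unplaced nodes, but allowed
print(exit_code(0))                           # nothing unplaced
```

A nonzero status lets shell scripts and CI jobs detect an incomplete tree without parsing the output.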

18 Jan 00:22

michelcrypt4d4mus

v1.17.13

Remove YARA rule invalid_trailer_structure because it's causing YARA to crash with internal error 46: TOO_MANY_RE_FIBERS on some files (opened issue in YARA repo), fixes #15
Indent the font character maps under the font info panel
Bump yaralyzer to 1.0.11

15 Jan 19:46

michelcrypt4d4mus

v1.17.12

Set max version of pypdf to below 6.6.0 because of breaking change with _cmap.build_char_map() from

18 Dec 04:49

michelcrypt4d4mus

v1.17.11

Properly escape image OCR text with rich.markup.escape() when printing to page_buffer to avoid exceptions on weird OCR text
Upgrade pypdf to 6.4.2, bump pymupdf

05 Dec 22:27

michelcrypt4d4mus

v1.17.10

Fix logging bug in create_dir_if_it_does_not_exist()
Upgrade pypdf to 6.4.0

05 Nov 03:23

michelcrypt4d4mus

v1.17.9

Handle errors in FontInfo extraction more gracefully

02 Nov 20:47

michelcrypt4d4mus

v1.17.7

Bump pypdf to 6.1.3 (fixes #31)
Bump PyMuPDF to 1.26.5

27 Sep 19:17

michelcrypt4d4mus

v1.17.6

Better handling for errors resulting from bugs in PyPDF
Properly close file handle when pdfalyzing is complete
