Releases: michelcrypt4d4mus/pdfalyzer

v1.19.0

25 Jan 08:53

  • Allow almost all command line options to be set permanently via environment variables or a custom .pdfalyzer file; add --env-vars option to display exactly which command line options can be set by which variables
  • Add --export-png option to render .png images of output
  • Add --echo-command option to save the exact command used along with the output
  • Add --no-timestamps option for exported filenames
  • Add --suppress-output option
  • Highlight some of the interesting object reference keys in the rich tree view
  • In the rich tree view, sort DictionaryObject key/value pairs alphabetically by key, except that /Type and /Subtype stay at the top
  • Display the number of revisions (max generation value) in metadata table
  • Display the object types/labels determined while walking the tree instead of newly constructed PdfObjectProperties, which may lack nuance in their labeling
  • Better labeling of /StructElem objects in a StructTreeRoot hierarchy
  • Coerce /Nums number trees into dict-like objects for the purposes of assigning addresses
  • /Annots and other indeterminate nodes now have /Subtype integrated into their labeling (e.g. /Annots:Link instead of just /Annots)
  • Fancier table for PDF metadata that also contains the number of pages, images, and revisions (if possible)
  • Test suite now checks results against pre-recorded fixture output
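
Three of the labeling and ordering changes above can be illustrated with a minimal sketch. It is not pdfalyzer's actual code: the function names `sort_dict_keys`, `nums_to_dict`, and `label_with_subtype` are hypothetical, and the only assumptions are that /Type and /Subtype are pinned in that order and that a /Nums array is a flat list of alternating keys and values, as in the PDF specification.

```python
def sort_dict_keys(keys):
    """Sort PDF dictionary keys alphabetically, keeping /Type and /Subtype first."""
    pinned = ["/Type", "/Subtype"]  # assumed pin order
    # Pinned keys sort by their position in `pinned`; everything else follows by name.
    return sorted(keys, key=lambda k: (pinned.index(k) if k in pinned else len(pinned), k))


def nums_to_dict(nums):
    """Turn a flat /Nums array [key1, value1, key2, value2, ...] into a dict."""
    if len(nums) % 2 != 0:
        raise ValueError("/Nums array must hold an even number of entries")
    return {nums[i]: nums[i + 1] for i in range(0, len(nums), 2)}


def label_with_subtype(label, subtype=None):
    """Append a /Subtype to a node label, e.g. /Annots + /Link becomes /Annots:Link."""
    if subtype is None:
        return label
    return f"{label}:{subtype.lstrip('/')}"


print(sort_dict_keys(["/Subtype", "/Rect", "/Type", "/A"]))
print(nums_to_dict([0, "first", 5, "second"]))
print(label_with_subtype("/Annots", "/Link"))
```

Once a number tree is in dict form, each value can be addressed by its numeric key in the same way as an entry in an ordinary dictionary.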

v1.18.1

20 Jan 21:02

  • Ensure cryptography package is installable as an extra

v1.18.0

20 Jan 20:12

  • Handle encrypted PDFs via --password option and/or prompting user for the password
  • pdfalyze script now returns error code 1 to the shell if there are unplaced nodes, unless the new --allow-missed-nodes option is used
  • Upgrade pypdf to 6.6.0 and make use of new Font object
  • Send logs to stderr instead of stdout, redirect and reformat pypdf logs, other logging improvements
  • Sort Pdfalyzer.font_infos array by node ID
  • Placement of formerly orphaned nodes:
    • Force stranded /Pages nodes to be children of /Catalog
    • Better placement of orphaned nodes that are members of an ArrayObject
    • Place special /Linearization nodes under root
    • Force /XObject nodes with /Subtype of /Form to be children of /AcroForm nodes
    • Remove non_tree_relationships if there's an actual parent/child relationship
    • Insert grandparents in situations where there are nodes that are in an array but also claim a node other than the array as their parent
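
The exit-code and logging behavior described in this release can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script's real implementation: `exit_code` and its parameters are hypothetical names, and only the standard library `logging` module is used.

```python
import logging
import sys

# Send log messages to stderr so stdout carries only the analysis output.
logging.basicConfig(stream=sys.stderr, level=logging.WARNING)
log = logging.getLogger("exit_code_sketch")  # hypothetical logger name


def exit_code(unplaced_node_ids, allow_missed_nodes=False):
    """Return 1 if any nodes were left unplaced, unless the caller allows misses."""
    if unplaced_node_ids and not allow_missed_nodes:
        log.warning("%d node(s) could not be placed in the tree", len(unplaced_node_ids))
        return 1
    return 0


if __name__ == "__main__":
    # A real script would compute the unplaced nodes from the parsed PDF.
    sys.exit(exit_code([], allow_missed_nodes=False))
```

A non-zero status lets shell pipelines and CI jobs detect an incomplete tree without parsing the output.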

v1.17.13

18 Jan 00:22

  • Remove YARA rule invalid_trailer_structure because it's causing YARA to crash with internal error 46: TOO_MANY_RE_FIBERS on some files (opened issue in YARA repo), fixes #15
  • Indent the font character maps under the font info panel
  • Bump yaralyzer to 1.0.11

v1.17.12

15 Jan 19:46

  • Set max version of pypdf to below 6.6.0 because of breaking change with _cmap.build_char_map() from

v1.17.11

18 Dec 04:49

  • Properly escape image OCR text with rich.markup.escape() when printing to page_buffer to avoid exceptions on weird OCR text
  • Upgrade pypdf to 6.4.2, bump pymupdf

v1.17.10

05 Dec 22:27

  • Fix logging bug in create_dir_if_it_does_not_exist()
  • Upgrade pypdf to 6.4.0

v1.17.9

05 Nov 03:23

  • Handle errors in FontInfo extraction more gracefully

v1.17.7

02 Nov 20:47

  • Bump pypdf to 6.1.3 (fixes #31)
  • Bump PyMuPDF to 1.26.5

v1.17.6

27 Sep 19:17

  • Better handling for errors resulting from bugs in PyPDF
  • Properly close file handle when pdfalyzing is complete