Version 4.0.0, 2024-01-19
What's new
pypdf==4.0.0 is a big milestone:
- We finally have layout-mode text extraction. Users who want to detect or extract tables with heuristics can now give it a try.
- We deprecated a lot of the old PyPDF2 API that either did not follow PEP 8 naming conventions or did not use properties. Users coming from PyPDF2 might want to switch to pypdf<4.0.0 first to get helpful error messages that show the new API for their specific cases.
A big 'Thank you!' to the whole pypdf community for your work. Thanks to you, pypdf is better than ever.
Kudos to @shartzog, who added the layout mode with his first contribution!
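For readers who want to try the layout mode on tabular data, here is a minimal sketch. It assumes the mode is selected by passing extraction_mode="layout" to the page's extract_text method. The helper names extract_layout_text and split_into_cells, and the rule of treating runs of two or more spaces as column gaps, are illustrative assumptions and not part of pypdf.

```python
import re
from typing import List


def extract_layout_text(pdf_path: str, page_index: int = 0) -> str:
    """Return the text of one page, keeping its approximate visual layout."""
    # Imported lazily so the table helper below also works without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    page = reader.pages[page_index]
    return page.extract_text(extraction_mode="layout")


def split_into_cells(layout_text: str) -> List[List[str]]:
    """Hypothetical heuristic: treat runs of 2+ whitespace chars as column gaps."""
    rows = []
    for line in layout_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between rows
        rows.append(re.split(r"\s{2,}", line.strip()))
    return rows
```

Real PDFs rarely align this cleanly, so the gap threshold will likely need tuning per document.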
Deprecations (DEP)
- Drop Python 3.6 support (#2369) by @MartinThoma
- Remove deprecated code (#2367) by @MartinThoma
- Remove deprecated XMP properties (#2386) by @stefan6419846
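For users migrating from PyPDF2, the sketch below collects a few of the old camelCase spellings next to their current equivalents. The table is a small, non-exhaustive sample, and new_name_for is a hypothetical helper written for this illustration, not a pypdf function.

```python
from typing import Optional

# Old PyPDF2-style spelling -> current pypdf spelling (non-exhaustive sample).
RENAMES = {
    "PdfFileReader": "PdfReader",
    "PdfFileWriter": "PdfWriter",
    "reader.getNumPages()": "len(reader.pages)",
    "reader.getPage(0)": "reader.pages[0]",
    "reader.isEncrypted": "reader.is_encrypted",
    "page.extractText()": "page.extract_text()",
    "writer.addPage(page)": "writer.add_page(page)",
}


def new_name_for(old: str) -> Optional[str]:
    """Look up the current spelling; None if the sample table has no entry."""
    return RENAMES.get(old)
```

Running existing code against pypdf<4.0.0 first, as suggested above, remains the more reliable way to find every affected call.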
New Features (ENH)
- Add "layout" mode for text extraction (#2388) by @shartzog
- Add Jupyter Notebook integration for PdfReader (#2375) by @MartinThoma
- Improve/rewrite PDF permission retrieval (#2400) by @stefan6419846
Bug Fixes (BUG)
- PdfWriter.add_uri was setting the wrong type (#2406) by @pmiller66
- Add support for GBK2K cmaps (#2385) by @stefan6419846
Documentation (DOC)
- Add pmiller66 for #2406 as a contributor by @MartinThoma
- Add missing expand parameter (#2393) by @Atomnp
- Resolve build warnings (#2380) by @stefan6419846
- Fix testing prerequisites (#2381) by @stefan6419846
- Improve formatting of contributors page (#2383) by @stefan6419846
- Add Tobeabellwether as a contributor for #2341 by @MartinThoma
Developer Experience (DEV)
- Make dependabot aware of our PR prefixes (#2415) by @stefan6419846
- Fail on Sphinx issues (#2405) by @stefan6419846
- Move title check to own workflow (#2384) by @MasterOdin
- Write to temporary files instead of the working directory (#2379) by @stefan6419846
- Ensure that the PR titles have the correct format (#2378) by @stefan6419846
Maintenance (MAINT)
- Return None instead of -1 when page is not attached (#2376) by @MartinThoma
- Complete FileSpecificationDictionaryEntries constants (#2416) by @MartinThoma
- Replace warning with logging.error (#2377) by @MartinThoma
Testing (TST)
- Add missing pytest.mark.samples annotations (#2412) by @kitterma
- Correctly close temporary files (#2396) by @stefan6419846
- Fix side effect #2379 (#2395) by @pubpub-zz
- Add test for layout extraction mode (#2390) by @MartinThoma
Code Style (STY)
- Use the UserAccessPermissions enum (#2398) by @MartinThoma
- Run black (#2370) by @MartinThoma