Improve resilience against malformed or corrupt documents #152
@packdat, hi! Thanks for your work on this PR. I was looking at an incremental PDF which had the following structure, and which happens to have an incorrect
Acrobat shows the following information: But the code is currently overwriting the reference used in the trailer with the second occurrence of the
@Greybird Can you provide the mentioned document, or one that shows the same behavior?
Could I kindly ask for an update? We would love to have this PR.
This PR is intended to allow PDFsharp to read documents that would otherwise throw exceptions when trying to open them.
Notable changes:
For example, in one of my test-documents there is a string-value that looks like this:
(text))
Note the double closing parenthesis, which can (and should) simply be ignored.
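To illustrate the idea of ignoring a surplus closing parenthesis, here is a minimal Python sketch. It is not the PDFsharp lexer; the function name read_literal_string and its signature are hypothetical, and escape sequences are copied verbatim rather than decoded.

```python
from typing import Tuple


def read_literal_string(data: bytes, pos: int) -> Tuple[bytes, int]:
    """Parse a PDF literal string that starts with '(' at data[pos].

    Returns the raw contents and the position after the string.
    Stray ')' bytes directly after the balanced end are skipped
    (assumption: a lone ')' is never a valid token on its own).
    """
    if data[pos:pos + 1] != b"(":
        raise ValueError("expected '(' at position %d" % pos)
    depth = 0
    i = pos
    out = bytearray()
    while i < len(data):
        c = data[i:i + 1]
        if c == b"\\":
            # Keep the backslash and the escaped byte as-is.
            out += data[i:i + 2]
            i += 2
            continue
        if c == b"(":
            depth += 1
            if depth > 1:          # nested parenthesis belongs to the content
                out += c
        elif c == b")":
            depth -= 1
            if depth == 0:         # balanced end of the string
                i += 1
                break
            out += c
        else:
            out += c
        i += 1
    else:
        raise ValueError("unterminated literal string")
    # Tolerance step: ignore any surplus closing parentheses.
    while data[i:i + 1] == b")":
        i += 1
    return bytes(out), i
```

With the input (text)) from the example above, the sketch returns the contents text and resumes parsing after the second parenthesis.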
The tolerance of 20 bytes is insufficient in most (if not all) cases.
The method was enhanced to search for the end of the stream until EOF.
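A rough Python sketch of this kind of fallback follows. It is an assumption about the general technique, not the code in this PR: the declared /Length is tried first, and if no endstream keyword is found there, the data is scanned forward until EOF. The name find_stream_end and its parameters are hypothetical.

```python
def find_stream_end(data: bytes, start: int, declared_length: int) -> int:
    """Return the offset at which the stream data ends.

    start is the offset of the first stream byte,
    declared_length is the value of the /Length entry.
    """
    keyword = b"endstream"
    expected = start + declared_length
    # Fast path: the keyword sits where /Length says (after an optional EOL).
    tail = data[expected:expected + 2 + len(keyword)]
    if tail.lstrip(b"\r\n").startswith(keyword):
        return expected
    # Fallback: /Length is wrong, so search for the keyword up to EOF.
    idx = data.find(keyword, start)
    if idx == -1:
        raise ValueError("no 'endstream' keyword found before EOF")
    # Drop the single EOL marker that precedes the keyword, if present.
    end = idx
    if end > start and data[end - 1:end] == b"\n":
        end -= 1
    if end > start and data[end - 1:end] == b"\r":
        end -= 1
    return end
```

Note that this is a heuristic: binary stream data may itself contain the bytes endstream, so a production implementation would likely need additional checks (for example, that endobj follows).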
NotImplementedException. The code that throws was inside an #if true block, but the #else block seems to work just fine, so I just switched the condition.
startxref-keyword pointing to something that is not an xref-table, etc.), an attempt is made to rebuild the trailer and the CrossReferenceTable by manually scanning the whole document.
I included a zip-file containing the documents (out of my >1000 test-files) that could not be opened with the original version 6.2.0-preview-1, but opened just fine in Chrome, Edge, Firefox and Acrobat Reader.
With the changes, PDFsharp was able to open them.
DefectFiles.zip
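The idea of rebuilding the cross-reference information by scanning the whole document, as mentioned in the notable changes, can be sketched in Python as follows. This is only an illustration of the general technique, not the PR's implementation: the name rebuild_xref is hypothetical, and letting later object definitions overwrite earlier ones is an assumption that fits incremental updates appended at the end of a file.

```python
import re
from typing import Dict, Tuple

# Matches an object header such as "12 0 obj" (object number, generation).
OBJ_HEADER = re.compile(rb"(?<![0-9])(\d+)[ \t\r\n]+(\d+)[ \t\r\n]+obj\b")


def rebuild_xref(data: bytes) -> Dict[int, Tuple[int, int]]:
    """Map object number -> (generation, byte offset of the header).

    The whole file is scanned front to back, so a later definition
    of the same object number replaces an earlier one.
    """
    table: Dict[int, Tuple[int, int]] = {}
    for match in OBJ_HEADER.finditer(data):
        obj_number = int(match.group(1))
        generation = int(match.group(2))
        table[obj_number] = (generation, match.start())
    return table
```

A scan like this can produce false positives when binary stream data happens to contain a byte sequence that looks like an object header, so the result should be treated as a best-effort recovery.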
I used the following test-case (in PdfSharp.Tests.IO.ReaderTests):