Work around crazy emails with non-base64 encoded attachments #283

Open
wants to merge 2 commits into
base: master
from

Conversation

eliask

@eliask eliask commented Jan 1, 2017

GMail IMAP push will get in a weird state if it encounters an email
which is encoded in this manner. The connection will be reset after a
long timeout after encountering this binary crap.

I suspect an earlier version of the GMail server software, or some
non-IMAP import path, allows or allowed such emails in at some point,
and yet it is not possible to upload them back verbatim.

For my purposes, it suffices to skip these rare species but in general
the email parts should be re-encoded as base64, or the problematic
invalidly encoded attachments should be removed.

The illegally encoded binary emails are detected here by looking for the
PNG magic string with bytes that SHOULD NOT appear in any legitimate
email data.

A screenshot of one of the affected emails is at
https://imgur.com/a/i09Xa.
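
The detection described above could look roughly like the following. This is a minimal sketch, not the code from this pull request: the names `PNG_MAGIC` and `contains_raw_png` are hypothetical, and it assumes the raw message is available as bytes before it is pushed.

```python
# Hypothetical helper, not taken from the patch itself.
# The 8-byte PNG signature starts with 0x89 and contains 0x1A, which
# should not occur in a correctly base64-encoded attachment.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def contains_raw_png(raw_message: bytes) -> bool:
    """Return True if the raw message contains an unencoded PNG header."""
    return PNG_MAGIC in raw_message
```

A caller would skip any message for which `contains_raw_png` returns True. The check only catches raw PNG data, so other binary attachment types go unnoticed.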

@gaubert
Owner

gaubert commented Jan 2, 2017

Thanks eliask, but this seems to be just a particular case? I see only a Unicode "character tabulation with justification" character and the start of a PNG file. This seems to solve your particular use case but not others?
What do you think?

@eliask
Author

eliask commented Jan 2, 2017 via email

@BangisBanginis

I have encountered the same issue. The email source tells me that it contains a base64-encoded PDF attachment, but when I look at it, I see raw binary. Somehow the Gmail browser client manages to open such a file correctly for viewing, or to download it to a PC, yet if I use the API I fail to read the attachment, because the claim that it is base64 encoded is false.

This commit tests all base64-encoded parts of emails to push.

This generalizes the previous ad hoc method which only worked for
detecting the PNG magic header anywhere in the message.

If an email's parts/payloads claim to be base64 encoded but contain
characters outside the base64 alphabet, the email is never pushed to
the server. The erroneous emails are NOT sent to quarantine since they
are assumed/observed to be rare and they can be detected client
side (and notably CANNOT be detected server-side as such because they
break the protocol contract).
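
The check described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `find_bad_base64_parts` and `_BASE64_BODY` are hypothetical names, the message is assumed to be available as raw bytes, and the sketch uses the Python 3 standard-library `email` package.

```python
import re
from email import message_from_bytes

# Base64 alphabet, padding, and the whitespace MIME allows between lines.
_BASE64_BODY = re.compile(r"[A-Za-z0-9+/=\r\n\t ]*")


def find_bad_base64_parts(raw_message: bytes) -> list:
    """Return content types of parts that claim base64 but are not.

    Hypothetical helper: an empty list means every part declared as
    base64 only contains characters from the base64 alphabet.
    """
    bad = []
    msg = message_from_bytes(raw_message)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no payload of their own
        cte = str(part.get("Content-Transfer-Encoding", "")).strip().lower()
        if cte != "base64":
            continue  # only parts that claim base64 are tested
        payload = part.get_payload(decode=False)
        if not isinstance(payload, str) or not _BASE64_BODY.fullmatch(payload):
            bad.append(part.get_content_type())
    return bad
```

A caller would skip pushing any message for which the returned list is non-empty. The sketch only tests the alphabet, so it does not verify that the padding or length of the payload is valid.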
@eliask
Author

eliask commented May 4, 2017

@BangisBanginis thanks for the heads up. It motivated me to generalize this patch to test all base64-encoded data in emails.

@gaubert now that the patch is "complete", could you merge it to master?

NOTE: it is theoretically possible that invalidly encoded data in parts that use other encodings (not base64) may cause similar issues, but on balance, I believe base64 is a special case and other data would probably not cause protocol issues.
