Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSRF assumes the token to be valid latin-1 #3763

Open
hynek opened this issue Jul 9, 2024 · 2 comments
Open

CSRF assumes the token to be valid latin-1 #3763

hynek opened this issue Jul 9, 2024 · 2 comments
Labels

Comments

@hynek
Copy link
Contributor

hynek commented Jul 9, 2024

Bug Report

Describe the bug
Today Sentry reported a crash that I don't think I can do anything about (except adding custom checks) where someone (presumable an attacker) submitted a CSRF token that cannot be encoded at latin-1.

To Reproduce
I'm currently traveling so can't come up with anything simple, but given that the check looks like this:

def check_csrf_token(self, request, supplied_token):
"""Returns ``True`` if the ``supplied_token`` is valid."""
expected_token = self.get_csrf_token(request)
return not strings_differ(
bytes_(expected_token), bytes_(supplied_token)
)

and bytes_ looks like this:

def bytes_(s, encoding='latin-1', errors='strict'):
"""If ``s`` is an instance of ``str``, return
``s.encode(encoding, errors)``, otherwise return ``s``"""
if isinstance(s, str):
return s.encode(encoding, errors)
return s

It makes sense that if someone manages to sneak in a token that's not latin-1-encodable, it will crash with an UnicodeEncodeError.

I guess wrapping strings_differ into a try except UnicodeError this would fix it?

For completeness, the token in question were:

  • "1����%2527%2522\\\'\\""
  • "10fc8c867a0c4552831a44a16f193a77����%2527%2522\\\'\\"".

Unfortunately it's a bit difficult to trace what exactly happens, because Sentry removes everything with token in the name.

Expected behavior
No crash.

Additional context

I'm pretty sure I'm not doing anything wrong; my app doesn't appear in the traceback except for tweens that don't touch the headers at all.

It's Pyramid 2.0.2 running in Unicorn 22.0.0 and

config.set_default_csrf_options(require_csrf=True)
config.set_csrf_storage_policy(CookieCSRFStoragePolicy())
@mmerickel
Copy link
Member

I feel like this is a valid concern and it's natural to say that it should just be counted as not-equal versus raising an exception for an unusable value in the supplied csrf value. We should fix this.

@mmerickel mmerickel added the bugs label Jul 10, 2024
@ztane
Copy link
Contributor

ztane commented Oct 9, 2024

Another approach would be to use errors='backslashreplace' for the supplied_token, then all non-representable characters would turn into Unicode escapes:

>>> '😀'.encode('latin-1', errors='backslashreplace')
b'\\U0001f600'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants