add a comment explaining that nginx OCSP stapling is broken without configuring the async resolver #283

Closed
thestinger opened this issue May 3, 2022 · 16 comments

Comments

@thestinger

thestinger commented May 3, 2022

nginx caches the IP address of the OCSP responder forever after loading the configuration if it's using the default synchronous DNS resolver provided by libc, because the synchronous resolver is only ever used at configuration load time. People need to set resolver to the IP address of their DNS resolver, such as resolver [::1] for localhost, or stapling ends up breaking when the responder migrates to a new IP address. This is included in the generated configuration but has no comment explaining it, so people may remove it to use the default and not realize it's broken.

If you configure the async resolver, nginx respects the DNS TTL. Without it, nginx caches the address forever, since that avoids blocking the event loop on synchronous DNS lookups. The configuration generator needs a comment telling people that this is required, instead of it just being there with no explanation of why it's required. Many people are going to think that removing it and using the default DNS resolution is fine, since it appears to work.
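
For illustration, a minimal sketch of the directives involved, assuming a local caching resolver listening on [::1]. The server name and certificate paths are placeholders, not taken from the generator output:

```nginx
server {
    listen 443 ssl;
    server_name example.com;                                  # placeholder

    ssl_certificate     /etc/ssl/example.com/fullchain.pem;   # placeholder path
    ssl_certificate_key /etc/ssl/example.com/privkey.pem;     # placeholder path

    # Built-in OCSP stapling: nginx fetches responses itself from the
    # responder named in the certificate.
    ssl_stapling on;

    # Async resolver: the responder hostname is looked up at runtime and the
    # DNS TTL is respected. Without this line, the address resolved through
    # libc at configuration load is kept until the next reload.
    resolver [::1];
}
```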

See https://trac.nginx.org/nginx/ticket/1305 or one of the other issues there with an explanation from the developers.

Related:

It would also make a lot of sense to add a comment explaining that people should not use Must-Staple unless they use an approach like https://github.com/tomwassenberg/certbot-ocsp-fetcher. nginx doesn't persistently cache the OCSP response and is also willing to replace a valid response with an invalid one, after which it no longer serves one at all. Must-Staple is a great way to do a denial of service on yourself unless you use certbot-ocsp-fetcher. At the moment, nothing discourages people from trying to use Must-Staple with that configuration, since it appears to support OCSP stapling. But the built-in nginx implementation is ONLY intended as a performance optimization that's treated as optional / non-critical: it doesn't staple a response at start-up until it has fetched one in the background, it has no persistent cache, and it doesn't try to avoid losing the valid response it already has.
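
A rough sketch of the external-fetcher approach, assuming a script that writes a DER-encoded OCSP response to disk and reloads nginx after each refresh. The file path below is hypothetical; check the certbot-ocsp-fetcher README for where it actually writes its output:

```nginx
server {
    listen 443 ssl;
    server_name example.com;                                  # placeholder

    ssl_certificate     /etc/ssl/example.com/fullchain.pem;   # placeholder path
    ssl_certificate_key /etc/ssl/example.com/privkey.pem;     # placeholder path

    ssl_stapling on;

    # Staple the response from a file maintained by the external script
    # instead of letting nginx query the responder itself.
    ssl_stapling_file /var/lib/ocsp-cache/example.com.der;    # hypothetical path
}
```

nginx reads this file when the configuration is loaded, so the script has to trigger a reload after writing a fresh response. The upside is that a valid response is available immediately at start-up and is never replaced by a failed fetch.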

@thestinger thestinger changed the title enabling nginx OCSP stapling without configuring the async resolver is quite broken due to only resolving the IP at configuration load time add a comment explaining that nginx OCSP stapling without configuring the async resolver May 3, 2022
@thestinger thestinger reopened this May 3, 2022
@thestinger thestinger changed the title add a comment explaining that nginx OCSP stapling without configuring the async resolver add a comment explaining that nginx OCSP stapling is broken without configuring the async resolver May 3, 2022
@thestinger
Author

I reworded this a fair bit to clarify that I think there should be a clear explanation that resolver should be considered mandatory for using the built-in OCSP stapling. If people use an external implementation, they don't need resolver configured unless they need that for some other reason like using dynamic proxy_pass with a variable (or the resolve feature for upstream blocks that's not in open source nginx) which is far more obvious and isn't just silently broken like this.
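
To show the contrast, a small sketch of the dynamic proxy_pass case, where a missing resolver is hard to overlook. The backend hostname and port are hypothetical:

```nginx
server {
    listen 80;

    # Required here: because proxy_pass contains a variable, the hostname
    # is resolved at request time through this resolver.
    resolver [::1];

    location / {
        set $backend "http://backend.example.internal:8080";  # hypothetical
        proxy_pass $backend;
    }
}
```

If resolver is removed from this configuration, requests fail and the error log reports that no resolver is defined, whereas built-in OCSP stapling without resolver keeps appearing to work until the responder's address changes.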

@HLFH

HLFH commented Dec 6, 2022

And I also recommend https://github.com/tomwassenberg/certbot-ocsp-fetcher which makes OCSP Must-Staple work with nginx.

@gstrauss
Collaborator

The current ssl-config-generator outputs

    # replace with the IP address of your resolver
    resolver 127.0.0.1;

Please suggest a 1-4 line replacement comment and I will update it.

Regarding OCSP Must Staple, I think we might add a comment suggesting reading about OCSP Must Staple and a link to https://github.com/tomwassenberg/certbot-ocsp-fetcher for details.

@gstrauss
Collaborator

FYI: Let's Encrypt has announced that it will shut down its OCSP responders on August 6, 2025
https://letsencrypt.org/2024/12/05/ending-ocsp/

gstrauss added a commit to gstrauss/ssl-config-generator that referenced this issue Dec 12, 2024
x-ref:
  "add a comment explaining that nginx OCSP stapling is broken without
    configuring the async resolver"
  mozilla/server-side-tls#283
@gstrauss
Collaborator

These are the comments I added in mozilla/ssl-config-generator@61f66b7

-      '    # replace with the IP address of your resolver\n'+
-      '    resolver 127.0.0.1;\n';
+      '    # replace with the IP address of your resolver;\n'+
+      '    # async \'resolver\' is important for proper operation of OCSP stapling\n'+
+      '    resolver 127.0.0.1;\n'+
+      '\n'+
+      '    # If certificates are marked OCSP Must-Staple, consider managing the\n'+
+      '    # OCSP stapling cache with an external script, e.g. certbot-ocsp-fetcher\n';

Please let me know if you have any suggestions or corrections, or if you think this issue can be closed. Thanks!

@ghen2

ghen2 commented Dec 12, 2024

> FYI: Let's Encrypt has announced that it will shut down its OCSP responders on August 6, 2025 https://letsencrypt.org/2024/12/05/ending-ocsp/

Let's Encrypt will be first, but quite surely other CAs will follow as well. I'd hesitate to encourage OCSP Must-Staple at this point, as it will make the transition away from OCSP more difficult.

@thestinger
Author

See https://community.letsencrypt.org/t/what-will-happen-to-must-staple/222397/33.

> mcpherrinm (Let's Encrypt staff), Jul 26:
>
> > Wouldn't one of the solutions here be not responding to OCSP requests for certificates that are not requiring Must-Staple
>
> This is the “continue to run OCSP for certificates which have must-staple” plan, an option we are considering.

@ghen2

ghen2 commented Dec 12, 2024

See https://letsencrypt.org/2024/12/05/ending-ocsp/ for the actual deployment plan.
Certificate requests with OCSP Must-Staple will be rejected as of January 30, 2025, except for accounts that already used it before.
For those, they will be rejected as of May 7, 2025, when LE will stop including OCSP URLs at all.

@gstrauss
Collaborator

gstrauss commented Dec 12, 2024

When Let's Encrypt no longer includes the OCSP extension in certificates, e.g. OCSP - URI:http://r10.o.lencr.org, servers configured for OCSP stapling will simply skip stapling for those certificates, as they currently do, for example, for self-signed certificates.

> I'd hesitate to encourage OCSP Must-Staple at this point, as it will make the transition away from OCSP more difficult.

The comment is "If certificates are marked OCSP Must-Staple, consider managing ...", not a recommendation or instructions how to add the extension to certificates.

When Let's Encrypt shuts down its OCSP responders, we might consider changing the default in ssl-config-generator to uncheck the OCSP Stapling box in the form so that OCSP Stapling configuration is not shown by default, but is still available if the box is checked. @ghen2 do you think we should stop showing OCSP Stapling configuration by default in ssl-config-generator any sooner than that?

(Yes, Let's Encrypt certs with OCSP Must-Staple will break without the Let's Encrypt OCSP responders, but Let's Encrypt certificates requesting OCSP Must-Staple will fail to renew from May 7, 2025, more than 90 days before the Let's Encrypt OCSP responders are shut down.)

@thestinger
Author

If they added short-lived certificates before Must-Staple is blocked, people using Must-Staple could migrate to those while keeping the same security benefits since the OCSP responses last for at least a few days anyway. Without short-lived certificates, Must-Staple is currently the only way to get private, secure and efficient revocation support not depending on the clients shipping all of the CRLs for your CA which is not happening in practice. Browsers ship a subset of CRLs, not all of them, and the main implementation used in most browsers doesn't have most Let's Encrypt revocations.

@tomato42
Member

yes, I also think Let's Encrypt action is a step backwards

OCSP Must Staple and stapling in general doesn't need five 9's of availability of the OCSP responder

and I'm not convinced that other CAs will follow suit, there are industry requirements for revocation checking, and CRLs simply don't scale

@thestinger
Author

CRLs can be sharded, as Let's Encrypt is doing, so providing them scales. Consuming them is what doesn't scale well, because gathering all the data from each CA or even finding all of the CRL URLs is a hard problem. Shipping them to users with each application is entirely impractical. CRLite is a good approach but doesn't solve the whole problem. Firefox has partial integration of CRLite, but it's not used on all platforms (missing on Android and of course iOS) and it doesn't have everything included. Must-Staple is very easy for clients to implement if they already know how to check stapled OCSP, which is supported by the common TLS libraries. It's just a flag in the certificate telling them they should enforce stapling.

@gstrauss
Collaborator

@thestinger what are your thoughts about OCSP stapling and the (long-overdue) TLSv1.3 Encrypted Client Hello (ECH)? Do you know of any discussions anywhere about that, and what Let's Encrypt is saying about OCSP stapling and TLS ECH? I'll open a new issue in this repo and/or somewhere with Let's Encrypt if I can get some pointers.

https://wiki.mozilla.org/Security/Encrypted_Client_Hello#Interaction_with_Revocation_Checking specifically recommends using OCSP with TLS ECH.

On the server-side, BoringSSL supports TLS ECH, and both cURL and lighttpd can use BoringSSL to support TLS ECH. On the client-side, Firefox, Chrome, Edge, and more support TLS ECH.

@thestinger
Author

OCSP with Must-Staple is the only working revocation mechanism. We used it for most of our services but unfortunately will have to stop. OCSP response lifetimes tend to be fairly long, i.e. days before they expire, so short-lived certificates can have a short enough lifetime to resemble the security provided by OCSP stapling with Must-Staple. Short-lived certificates work around clients needing to support OCSP stapling and Must-Staple, so that's a major advantage, but at the cost of massively increasing the number of certificates needing to be logged if they were very widely adopted. It's hard to see how that's easier to handle as infrastructure than OCSP.

> https://wiki.mozilla.org/Security/Encrypted_Client_Hello#Interaction_with_Revocation_Checking specifically recommends using OCSP with TLS ECH.

It's recommending that anyone deploying ECH also uses OCSP stapling to avoid the privacy issues of non-stapled OCSP. Not having OCSP at all accomplishes the same thing. The way to follow the advice is to continue using OCSP stapling until Let's Encrypt stops including OCSP metadata in the certificates. For us, we're probably going to keep using it until the very last day they're going to allow it with Must-Staple certificates... and hopefully they've launched short-lived certificates by then for us to replace it. If they launch them prior to that date, we'll switch to those and won't have OCSP anymore. It's a good enough replacement for how OCSP was used in practice... but they haven't deployed it and have no timeline. We're not optimistic it will be soon.

> On the server-side, BoringSSL supports TLS ECH, and both cURL and lighttpd can use BoringSSL to support TLS ECH. On the client-side, Firefox, Chrome, Edge, and more support TLS ECH.

Using ECH only actually helps if the IP is used to host multiple things with a relevant difference between them. It doesn't really apply to us since our IPs are only for GrapheneOS services. It doesn't provide anything valuable to know a connection went to apps.grapheneos.org rather than releases.grapheneos.org hosted on the same servers, or samsung.psds.grapheneos.org rather than time.grapheneos.org. A lot of this could be figured out based on the request / response sizes and timing anyway...

@gstrauss
Collaborator

@thestinger Thank you for the detailed response. Like you, I am also intimately aware of these limitations. (I helped Stephen Farrell https://defo.ie/ with the TLS ECH support in lighttpd)

> so short-lived certificates can have a short enough lifetime to resemble the security provided by OCSP stapling with Must-Staple.

Yes, but as you noted, short-lived certs < 1 week (matching Let's Encrypt current OCSP stapling lifetimes) are not currently available from Let's Encrypt.

> Using ECH only actually helps if the IP is used to host multiple things with a relevant difference between them.

Yes, to help preserve privacy of the SNI, the size of the anonymity set matters, as does the number of disparate hosts being served through the same IP. TLS ECH protects everything in the TLS ClientHello, though SNI is one of the more important pieces of information desirable to protect.

I think I was trying to ask if you thought that TLS ECH might be a decent argument to ask Let's Encrypt to continue supporting OCSP at least for certificates marked OCSP Must Staple.

If the OCSP responders only provided stapling responses to registered Let's Encrypt accounts for the domains associated with those accounts and only for OCSP Must Staple, then the site serving the stapled responses would remain in control of serving stapled responses, rather than sites which do not use OCSP Must Staple and which depend on the public OCSP responders to field billions of OCSP requests.

@gene1wood
Collaborator

Closing this issue at the request of @gstrauss.
Please comment if you think the issue should be reopened.
