MathML support in the HTML Sanitizer API #227
Comments
I see the current sanitization algorithms have a configurable on/off toggle for
Similarly to
Right, I believe we should have both 1) a strict default subset implemented natively in browsers and 2) a way to relax it for web developers using the API. I haven't read the spec for a while, but that's how I had understood the situation for HTML. It would be good if MathML folks could spend some time to ensure this is the case for MathML too.
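To make the "strict default plus opt-in relaxation" idea concrete, here is a minimal sketch. All names (`STRICT_DEFAULT_MATHML`, `effectiveAllowList`, `isAllowed`) are hypothetical and do not come from the Sanitizer API spec, and the element list is an illustrative subset, not a proposed or agreed safelist.

```typescript
// Hypothetical strict default: an illustrative subset of MathML element
// names, NOT a complete or agreed list.
const STRICT_DEFAULT_MATHML: ReadonlyArray<string> = [
  "math", "mrow", "mi", "mn", "mo", "mfrac", "msup", "msub", "msqrt",
];

// Build the effective allow-list: the strict default plus whatever the
// web developer explicitly opts into. The default itself is never mutated.
function effectiveAllowList(
  defaults: ReadonlyArray<string>,
  extraAllowed: ReadonlyArray<string> = []
): string[] {
  const result: string[] = defaults.slice();
  extraAllowed.forEach((name: string) => {
    if (result.indexOf(name) === -1) {
      result.push(name);
    }
  });
  return result;
}

// Exact-match lookup of an element name in an allow-list.
function isAllowed(name: string, allowList: ReadonlyArray<string>): boolean {
  return allowList.indexOf(name) !== -1;
}

// Strict mode: only the built-in default applies.
const strict = effectiveAllowList(STRICT_DEFAULT_MATHML);
// Relaxed mode: the developer additionally accepts annotation-xml.
const relaxed = effectiveAllowList(STRICT_DEFAULT_MATHML, ["annotation-xml"]);
```

With this shape, `isAllowed("annotation-xml", strict)` is false while `isAllowed("annotation-xml", relaxed)` is true, so the relaxation is always an explicit choice by the page author.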
I think MathML is generally safe. Any element, except perhaps those that contain script-oriented ones, should be on the accept-list.
As requested in the Math WG meeting on March 28, here is a link to the MathML-related CVEs on record, currently 8: NIST National Vulnerability Database, keyword MathML. I remember noticing back in the day that cases such as CVE-2021-38193 and CVE-2020-26870 appeared to be examples where switching between parsing contexts hosted exploits. This was the reason I flagged
Hello @fred-wang and all, we discussed the subject in the MathML-core meeting yesterday and I think that the following seems to have met everyone's agreement: We converged on the fact that the skepticism about the security of
Other than that, we see that sanitization needs to wipe out:
We also considered it important that this issue carry a few examples of what including the MathML elements in the sanitizer could enable. Finally, we highlighted the potential of Trusted Types as an application that may be relevant to sanitizers, but so far I see this only as a potential. I suggest we ask that the Math WG or Math CG be "called back" when Trusted Types intersects the sanitizer APIs beyond its current scope (which I understand to be a baseline converter that transforms web content into something that can be exchanged, in a way considered safer, beyond the browser's current page). Do you agree with the approach proposed in points 1 to 4? If so, I suggest we go to the Sanitizer API issues and propose it as a safe list. Thanks in advance. Paul
Hi, sorry for the late reply; I overlooked that this was directed to me. In general I don't have a strong opinion on this: the Sanitizer API is implemented in a part of the browser code that is relatively independent from MathML rendering. It should be fine to go ahead and talk to the people working on the Sanitizer API spec and find a consensus there. I didn't check the latest status regarding non-HTML namespaces. Probably the main thing to pay attention to is that MathML Core is targeted at browsers while MathML Full is used in other applications. So we would need to decide whether we only accept MathML Core markup or also allow MathML Full markup (perhaps with more sanitization for security/privacy-sensitive markup, which would need to be figured out). Points 1-4 seem to be about things that are not in MathML Core. Note that Firefox's sanitization currently accepts content markup, but at the cost of adding many atomic strings, one for each Content MathML tag: https://bugzilla.mozilla.org/show_bug.cgi?id=1787594#c8 Regarding security/safety in browsers, the issues I'm aware of are described in https://w3c.github.io/mathml-core/#security-considerations and https://w3c.github.io/mathml-core/#privacy-considerations ; in particular, href is the one that can cause problems (unfortunately the discussion regarding its inclusion in MathML Core is on hold). Note also the case of maction statusline (whose support was removed from browsers).
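As a rough illustration of the "more sanitization for sensitive markup" option, the sketch below strips link-bearing attributes from a MathML tree while keeping the elements themselves. The `MathNode` type, the `RISKY_ATTRIBUTES` list and `stripRiskyAttributes` are hypothetical names invented for this example; the tree is a plain data structure, not a DOM, and the choice of `href` and `xlink:href` simply mirrors the attributes named in this thread rather than any agreed rule.

```typescript
// Hypothetical plain-data model of a parsed MathML element (not a DOM node).
interface MathNode {
  name: string;
  attributes: { [name: string]: string };
  children: MathNode[];
}

// Assumption: attributes this thread singles out as problematic.
const RISKY_ATTRIBUTES: ReadonlyArray<string> = ["href", "xlink:href"];

// Return a copy of the tree with risky attributes removed.
// Elements and all other attributes are preserved; the input is not mutated.
function stripRiskyAttributes(node: MathNode): MathNode {
  const kept: { [name: string]: string } = {};
  Object.keys(node.attributes).forEach((attr: string) => {
    const value = node.attributes[attr];
    if (value !== undefined && RISKY_ATTRIBUTES.indexOf(attr) === -1) {
      kept[attr] = value;
    }
  });
  return {
    name: node.name,
    attributes: kept,
    children: node.children.map((child: MathNode) => stripRiskyAttributes(child)),
  };
}

// Example input: an <mrow> with a link and a child <mi> with an XLink href.
const input: MathNode = {
  name: "mrow",
  attributes: { href: "https://example.com/", mathcolor: "red" },
  children: [
    { name: "mi", attributes: { "xlink:href": "https://example.com/x" }, children: [] },
  ],
};
const cleaned = stripRiskyAttributes(input);
```

Whether such markup should be stripped attribute by attribute, or the whole element rejected, is exactly the kind of decision that would need consensus with the Sanitizer API editors.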
Your comment made me wonder why/how the elements annotation, annotation-xml and their container semantics made it into MathML Core. If they are to be useful, their contents should be able to survive sanitization, at least in some cases. If not, maybe they are better thought of as MathML Full elements? As a cross-spec thought: SVG has a construct similar to Content MathML is indeed the classic use of
See https://wicg.github.io/sanitizer-api/
Some work has been done to handle the MathML/SVG namespaces, but the spec should likely specify a default safelist; see WICG/sanitizer-api#103 (comment). (IIRC, the API allows web developers to accept more elements/attributes that are not in the safelist, though.)
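Namespace handling matters because a safelist keyed on local names alone cannot tell a MathML `math` element from an unknown HTML element that happens to share the name. The sketch below keys entries on a (name, namespace) pair. The `ElementEntry` shape and the helper names are assumptions made for illustration, not quoted from the spec; the two namespace URIs are the standard ones.

```typescript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const MATHML_NS = "http://www.w3.org/1998/Math/MathML";

// Hypothetical safelist entry: local name plus namespace URI.
interface ElementEntry {
  name: string;
  namespace: string;
}

// Combine namespace and local name into one lookup key.
// A space is a safe separator because it cannot appear in either part.
function entryKey(name: string, namespace: string): string {
  return namespace + " " + name;
}

// Index the entries so that lookups do not depend on list order.
function buildSafelist(entries: ReadonlyArray<ElementEntry>): { [key: string]: boolean } {
  const table: { [key: string]: boolean } = {};
  entries.forEach((entry: ElementEntry) => {
    table[entryKey(entry.name, entry.namespace)] = true;
  });
  return table;
}

// True only when both the local name and the namespace match an entry.
function inSafelist(
  table: { [key: string]: boolean },
  name: string,
  namespace: string
): boolean {
  return table[entryKey(name, namespace)] === true;
}

// Illustrative entries only, not a proposed default.
const safelist = buildSafelist([
  { name: "math", namespace: MATHML_NS },
  { name: "mi", namespace: MATHML_NS },
  { name: "span", namespace: HTML_NS },
]);
```

Here `inSafelist(safelist, "math", MATHML_NS)` is true, while `inSafelist(safelist, "math", HTML_NS)` is false, so a same-named element in the wrong namespace does not slip through.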
So this issue is about discussing what we want to suggest as a default safelist for MathML.
In another issue, I suggested following MathML Core as much as possible, since that is what browsers are expected to implement: WICG/sanitizer-api#167 (comment)
Some more comments:
Firefox already has a safelist, but I guess it is not very strict; for example, it still allows XLink href or Content MathML markup. The bug is https://bugzilla.mozilla.org/show_bug.cgi?id=1787594
For Chromium, I don't remember without checking. It probably does not include more than what is in MathML Core, since we never implemented more.
I'm not sure whether the Sanitizer API is actually being implemented in WebKit.