3 failing test examples that previously worked with AntiSamy 1.7.3. #388

kwwall · 2023-11-11T00:09:13Z

Run 'mvn test' and observe the 3 failing tests. The question is not how to fix this, but is there an explanation of why this is the case and can you perhaps conjecture as to what sort of tainted input would lead to this.

See GitHub issue #389 for details regarding expectations.

…rent for some tainted input strings than for previous version of AntiSamy 1.7.3.

davewichers · 2024-01-15T16:04:30Z

@spassarop - Are we going to do anything about this? Or did you already, like add notes to the README or something?

spassarop · 2024-01-15T20:30:29Z

I did: #389 (comment)

The parser keeps changing and we are dependent on their updates :(

rbri · 2024-01-16T18:50:42Z

@kwwall had a look at your test cases and made another fix for neko. The tag name parsing is now much closer to the spec/real browsers and your test cases are now test cases for neko also.

see HtmlUnit/htmlunit-neko@55053e4

additionally a new neko 3.11.0-SNAPSHOT is available if you like to test.

davewichers · 2024-01-17T15:19:34Z

@spassarop - I tested the new 3.11.0-SNAPSHOT and AntiSamy fails 3 tests like so:

[ERROR] Errors:
[ERROR] AntiSamyTest.issue37:1002 » Scan org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
[ERROR] AntiSamyTest.scriptAttacks:154 » Scan org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
[ERROR] AntiSamyTest.testSmuggledTagsInStyleContentCase1:2526 » Scan org.w3c.dom.DOMException: INVALID_CHARACTER_ERR: An invalid or illegal XML character is specified.
[ERROR] Tests run: 108, Failures: 0, Errors: 3, Skipped: 0

Can you determine if our test cases needed to be changed to pass somehow? Or is a fix needed in the updates to NekoHtml?

@rbri - Maybe you want to look into this too.

rbri · 2024-01-18T17:48:28Z

ok, can reproduce this but i have to think about.....

spassarop · 2024-01-20T20:03:13Z

It looks like from our side it's ok. I mean, our test outputs never got that kind of exception before. Anyway, Ronald seems to have it at least identified. El jue, 18 ene 2024 a las 14:48, RBRi ***@***.***>) escribió:

…

ok, can reproduce this but i have to think about..... — Reply to this email directly, view it on GitHub <#388 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AHL3BMKVTFTEHW2NZ4NXGN3YPFN7PAVCNFSM6AAAAAA7G3ARW6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQOJYHE2DKMBWGE> . You are receiving this because you were mentioned.Message ID: ***@***.***>

rbri · 2024-01-21T12:30:56Z

SNAPSHOT is updated again

Current status

have added unit tests for Dom/Fragment/Sax parser - so far we had no unit test for the pasers; hope this will make side effects of changes more visible
major change - the scanner now supports many more chars as valid for tag names - this is in sync with the spec and the current browsers
- this change was also the root of the exception in the dom parser - this is fixed (and unit tested)
some more fixes for bugs found on the way (https://github.com/HtmlUnit/htmlunit/blob/master/src/changes/changes.xml)

rbri · 2024-01-21T12:39:15Z

@kwwall - your tests are still failing but i used your test case to improve the tag name scanning/parsing
It will be great if you can have a look at this - i have no real idea if the cleanup done on top of the dom tree are now correct.

testAntiSamyRegressionCDATAWithJavascriptURL

<style/>b<![cdata[</style><a href=javascript:alert(1)>test

Now you got

b&lt;![cdata[test

the parsed dom tree is now like in real browsers

I guess this is related to the major improvements done for the last version to handle script/comment/cdata sections in sync with the browsers/spec.

testOnfocusAfterStyleClosing

<select<style/>k<input<</>input/onfocus=alert(1)>

Now you got

kinput/onfocus=alert(1)&gt;

the parsed dom tree is now like in real browsers

The changed output here is a result of the changed tag name parsing - like in real browsers 'select<style' and 'input<<' are now parsed as valid tag names.

testScriptTagAfterStyleClosing

<select<style/>W<xmp<script>alert(1)</script>

Now you got

Walert(1)

the parsed dom tree is now like in real browsers

The changed output here is a result of the changed tag name parsing - like in real browsers 'select<style' and 'xmp<script' are now parsed as valid tag names.

rbri · 2024-01-21T12:54:07Z

The parser keeps changing and we are dependent on their updates :(

At least from my site this is also the plan for the future - there is still a lot to do to be more and more in sync with the browser parsers.

kwwall · 2024-01-21T18:20:55Z

@rbri - I looked at this and I'm fine with it as it is. I've already noted in our ESAPI release notes that sanitized results may be different than they previously work. As long as it sanitizes safely, it's all good and it does that. I probably will rewrite these 3 JUnit tests to just confirm that the dangerous portions have been properly cleansed. I explained my motivation for noting noting in issue #389 in #389 (comment).

davewichers · 2024-06-18T14:04:23Z

@kwwall @spassarop - what is the status of this PR? Close it, work still to be done, ???

kwwall · 2024-06-18T14:22:57Z

@davewichers - Let me retest with the latest ESAPI release and get back to you. But I think it had been fixed and just didn't get closed. Will let you know by noon tomorrow.

kwwall · 2024-06-19T16:29:12Z

@davewichers and @spassarop - Okay, I looked at this, and from ESAPI's perspective, we just adjusted our JUnit tests in ESAPI's at:

https://github.com/ESAPI/esapi-java-legacy/blob/develop/src/test/java/org/owasp/esapi/reference/validation/HTMLValidationRuleCleanTest.java

to agree with whatever 1.7.5 was producing. So in that specific sense, I think you can close this PR with a suitable comment. (Although, I guess I thought you may want to incorporate these test cases into your own AntiSamy JUnit regression tests. Maybe @spassarop has already done that; I haven't bothered to check.)

However, I also think this points to a larger problem and that is how do we test sanitized output? The way that I generally try to do so is that I run the tainted output through AntiSamy's DOM parser and look at the output. Something like

   System.out.println( (new AntiSamy).scan(tainted, esapi_policy, AntiSamy.DOM).getCleanHTML() );

and look at the output and then use it to write a JUnit test. When possible, I try to write the JUnit test to see if the tainted payload is still present, but one can't always do that (e.g., looking for the absence of onfocus=alert(1) in the one example would be the obvious thing to do, but that string was never removed since we've added that specific test), so in those cases, I just eyeball the cleansed output (and maybe pop it into a browser to test it as well), and if it's safe, I just use the entire cleansed string as the expected value to compare it against. (That is in fact how we discovered those 3 tests that changed that we're discussing here.)

I really don't know what to do for such cases though. And I expect that this affects a lot more than just ESAPI and its Validator.getValidSafeHTML() methods. Certainly (well, hopefully) there are others who have similar, but different regression tests. You really need that if you've done any customization at all of the AntiSamy policy file, which I expect to be somewhat common.

But given that the the cleansed AntiSamy output for any specific input is not constant (and I'm not debating that there are sensible reasons for that), it seems that this problem will be popping up now and again. In fact, over the history of AntiSamy, it's probably popped up a lot more than we've observed because even between AntiSamy and ESAPI teams, we likely are only testing some of the obvious edge cases.

I'm not sure what to do about this larger problem though. This GitHub PR and the associated issue doesn't really do that larger problem justice. I guess you can add a blurb to your README that this just should be expected now and again on an ongoing basis, and ask people report specific edge cases that they encounter when it changes between AntiSamy releases so you can note it in your release notes, but more importantly, use it to improve your regression test cases. But beyond that, I have no suggestions. Maybe you want to consider that (noting it in the README) before closing this GitHub PR. I don't know. It's your call. I only reported this because I figured that you'd want to know, but maybe you don't care and only want things reported when they possibly could result in a vulnerability,

davewichers · 2024-06-26T14:06:23Z

@spassarop - do you want to propose an update to the README and then we can merge that and close this issue?

kwwall added 2 commits November 10, 2023 18:59

ESAPI 2.5.2.0 AntiSamy policy file.

b833a27

Test that illustrates sanitized output in AntiSamy 1.7.4 is now diffe…

f967903

…rent for some tainted input strings than for previous version of AntiSamy 1.7.3.

kwwall mentioned this pull request Nov 11, 2023

Sanitized output for same tainted input differs from AntiSamy 1.7.3 to 1.7.4 #389

Closed

mpreziuso mentioned this pull request Feb 28, 2024

fix: upgrade Antisamy to 1.7.5 to resolve CVE-2024-23635 ESAPI/esapi-java-legacy#833

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

3 failing test examples that previously worked with AntiSamy 1.7.3. #388

3 failing test examples that previously worked with AntiSamy 1.7.3. #388

kwwall commented Nov 11, 2023 •

edited

Loading

davewichers commented Jan 15, 2024

spassarop commented Jan 15, 2024

rbri commented Jan 16, 2024

davewichers commented Jan 17, 2024

rbri commented Jan 18, 2024

spassarop commented Jan 20, 2024 via email

rbri commented Jan 21, 2024

rbri commented Jan 21, 2024 •

edited

Loading

rbri commented Jan 21, 2024

kwwall commented Jan 21, 2024

davewichers commented Jun 18, 2024

kwwall commented Jun 18, 2024

kwwall commented Jun 19, 2024

davewichers commented Jun 26, 2024

3 failing test examples that previously worked with AntiSamy 1.7.3. #388

Are you sure you want to change the base?

3 failing test examples that previously worked with AntiSamy 1.7.3. #388

Conversation

kwwall commented Nov 11, 2023 • edited Loading

davewichers commented Jan 15, 2024

spassarop commented Jan 15, 2024

rbri commented Jan 16, 2024

davewichers commented Jan 17, 2024

rbri commented Jan 18, 2024

spassarop commented Jan 20, 2024 via email

rbri commented Jan 21, 2024

rbri commented Jan 21, 2024 • edited Loading

rbri commented Jan 21, 2024

kwwall commented Jan 21, 2024

davewichers commented Jun 18, 2024

kwwall commented Jun 18, 2024

kwwall commented Jun 19, 2024

davewichers commented Jun 26, 2024

kwwall commented Nov 11, 2023 •

edited

Loading

rbri commented Jan 21, 2024 •

edited

Loading