ICU-23251 Fix static analyzer bugs#3758
ICU-23251 Fix static analyzer bugs#3758poliudian-iv wants to merge 1 commit intounicode-org:mainfrom
Conversation
883a651 to
7955d20
Compare
|
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
|
|
||
| // Set postContext to some of msg starting at index. | ||
| length=msg.length()-index; | ||
| length=msg.length()>index ? msg.length()-index : 0; |
There was a problem hiding this comment.
Is this something that could reasonably happen? If not, I suggest instead adding the length condition to the first if-statement in this function and just do nothing at all if index is out of range. You might even add an U_ASSERT() statement to assert that this function never gets called with an invalid index value.
icu4c/source/i18n/rbt_pars.cpp
Outdated
| int32_t pos, | ||
| UErrorCode& status) | ||
| { | ||
| if (pos < 0) { |
There was a problem hiding this comment.
Same here, this is not something that should ever happen, assert (to get a nice human readable error message in debug builds if some bug causes it to happen even if it shouldn't) and then return immediately without doing anything.
| const UnicodeSet &s = RegexStaticSets::gStaticSets->fPropSets[opValue]; | ||
| if (s.contains(c)) { | ||
| success = !success; | ||
| if (c >= 0) { |
There was a problem hiding this comment.
This doesn't look right, UChar32 should be an unsigned data type, it shouldn't be possible for this value to be negative.
There was a problem hiding this comment.
UChar32 is int32_t
And U16_NEXT can return U_SENTINEL == -1
There was a problem hiding this comment.
Oh, I see, I mixed it up with UChar, sorry about that. But then there's a clearly defined sentinel value that you can test for here, U_SENTINEL, which will tell any future reader that this is a value that this variable really can be assigned. (And in case you have reason to fear that any other negative value could ever be assigned here, cover that case with a U_ASSERT.)
There was a problem hiding this comment.
Testing for negative / non-negative is much spiffier, and used throughout the code base.
U_SENTINEL is negative for a reason -- for easy testing via if (c < 0).
| const UnicodeSet &s = RegexStaticSets::gStaticSets->fPropSets[opValue]; | ||
| if (s.contains(c)) { | ||
| success = !success; | ||
| if (c >= 0) { |
There was a problem hiding this comment.
Oh, I see, I mixed it up with UChar, sorry about that. But then there's a clearly defined sentinel value that you can test for here, U_SENTINEL, which will tell any future reader that this is a value that this variable really can be assigned. (And in case you have reason to fear that any other negative value could ever be assigned here, cover that case with a U_ASSERT.)
7955d20 to
e1a8cc6
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
icu4c/source/common/ubidiln.cpp
Outdated
| /* search for the run limits and initialize visualLimit values with the run lengths */ | ||
| i=0; | ||
| do { | ||
| while(i<limit) { |
There was a problem hiding this comment.
line 578 says that limit>0. let's assert that if necessary, rather than change the logic.
after 26 years with this code as is, i doubt there is a real problem to be fixed.
There was a problem hiding this comment.
Ok, if you wanna, i can add checking limit to start of function for checking argument pBiDi->trailingWSStart
icu4c/source/i18n/rematch.cpp
Outdated
| const UnicodeSet &s = RegexStaticSets::gStaticSets->fPropSets[opValue]; | ||
| if (s.contains(c)) { | ||
| success = !success; | ||
| if (c != U_SENTINEL) { |
There was a problem hiding this comment.
| if (c != U_SENTINEL) { | |
| if (c >= 0) { |
etc.
Actually, I am surprised that @aheninger used the validating macro but didn't check for well-formed input.
Also, is it ok to do nothing when the input is ill-formed? We shouldn't change this code without understanding what it does.
FYI: If it was ok to use a simple fallback, then we could use U16_NEXT_OR_FFFD().
e1a8cc6 to
2d6871d
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
|
@roubert, @markusicu |
Bugs identified by the static analyzers when analyzing the chromium source code (part 3)
icu4c/source/i18n/rbt_pars.cpp - "Out of bound access to memory preceding the field 'preContext'"
icu4c/source/common/cstring.cpp - "Out of bound access to memory" if n == INT_MAX
icu4c/source/i18n/rematch.cpp - "Out of bound access to memory preceding the field 'd'" at Regex8BitSet::contains() when c == -1;
icu4c/source/i18n/number_longnames.cpp - "Out of bound access to memory" at getMixedUnitModifier
icu4c/source/common/messagepattern.cpp - "Out of bound access to memory preceding the field 'postContext'" when index < msg.length()
icu4c/source/common/ubidiln.cpp - "Out of bound access to memory" when limit==0
icu4c/source/i18n/simpletz.cpp - "The left operand of '<' is a garbage value"
TODO: Fill out the checklist below.
Checklist