Fix: Add null terminators to character arrays to include final characters #202
@ashvardanian On my local computer (WSL Ubuntu 22.04, x86_64), I only encounter test issues with C++20. It seems unrelated to my changes.
@alexbarev, is there a different way to solve this by constraining the type-conversion rules? Extending every |
I agree it looks unappealing. Apparently, the issue is that I see nothing that can be done with the current constructor from a C array that is being called now, because string literals are also cast to a C-style string where a null terminator

StringZilla/include/stringzilla/stringzilla.hpp, lines 283 to 300 in 152ed04
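To make the behavior under discussion concrete, here is a minimal sketch. It assumes, as the comments in this thread suggest, that the array constructor treats the last element as a null terminator and skips it. `mini_charset` and the two helper functions are hypothetical stand-ins written for illustration; they are not the StringZilla implementation.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a character set; NOT the StringZilla class.
class mini_charset {
    std::bitset<256> bits_;

  public:
    mini_charset() = default;

    // Assumed behavior: the last array element is taken to be a null
    // terminator (as in a string literal) and is therefore skipped.
    template <std::size_t count_characters>
    explicit mini_charset(char const (&chars)[count_characters]) noexcept {
        for (std::size_t i = 0; i + 1 < count_characters; ++i) add(chars[i]);
    }

    mini_charset &add(char c) noexcept {
        bits_.set(static_cast<unsigned char>(c));
        return *this;
    }

    bool contains(char c) const noexcept {
        return bits_.test(static_cast<unsigned char>(c));
    }
};

// A string literal has type `char const[11]` here: ten digits plus '\0'.
// Skipping the last element drops only the terminator.
inline bool literal_keeps_nine() {
    return mini_charset("0123456789").contains('9');
}

// A brace-initialized array has no terminator, so skipping the last
// element silently drops '9'.
inline bool raw_array_keeps_nine() {
    static char const digits[10] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
    return mini_charset(digits).contains('9');
}

// Small self-check that states the two outcomes side by side.
inline void demonstrate_missing_last_character() {
    assert(literal_keeps_nine());
    assert(!raw_array_keeps_nine());
}
```

Both inputs bind to the same `char const (&)[N]` parameter, so the constructor cannot tell a literal from a plain array; that is why the options discussed here either change the arrays or wrap the conversion.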
Hi there,

Without modifying basic_charset or appending '\0' to each utility array of ASCII characters, I propose the following two solutions:

Option 1: Converter Function

Introduce a helper function that correctly handles the conversion from an array to char_set:

    template <std::size_t count_characters>
    char_set char_set_converter(char const (&chars)[count_characters]) {
        return char_set(chars).add(chars[count_characters - 1]);
    }

This function must be applied in every function such as:

    inline char_set digits_set() {
        static char_set charset = char_set_converter(digits());
        return charset;
    }

There was no static before; was there a specific reason not to include it?

Option 2: Wrapper Class

With this approach, I aim to provide a struct that enhances safety against incorrect usage. However, it introduces significantly more code and stores a reference as a class data member. I'm unsure if this is entirely appropriate here; it's just an idea.

    template <std::size_t count_characters>
    class carray_wrapper {
        using carray = char[count_characters];
        carray const & all;
        char_set const charset;
    public:
        carray_wrapper (char const (&chars)[count_characters]) noexcept
            : all(chars), charset{char_set{all}.add(all[count_characters - 1])} {}
        operator char_set() const noexcept {
            return charset;
        }
        carray const & operator ()() const noexcept {
            return all;
        }
    };
    namespace detail {
        char const digits[10] = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
        //.....
    }
    carray_wrapper<sizeof(detail::digits)> digits{detail::digits};
    inline char_set digits_set() {
        // User-defined conversion ensures `char_set` includes the ad-hoc last character:
        static char_set charset {digits};
        return charset;
    }

This preserves the previous behavior of returning raw arrays for string operations through the
This PR fixes the issue where the last character of certain character arrays was not included in basic_charset. By explicitly adding a '\0' terminator to these arrays (e.g., ascii_letters()), all intended characters are now properly included.

In addition, new tests have been added for the ASCII utility functions provided by sz::string and sz::string_view. Previously, these tests would fail due to the missing final character. With this fix, they pass successfully, confirming that the issue is resolved.

#200
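The fix described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's actual diff or test code: `sketch_charset`, `fixed_digits`, and `all_digits_present` are hypothetical names, and the constructor is assumed to skip the last array element as a terminator.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in, not the library's basic_charset.
struct sketch_charset {
    std::bitset<256> bits;

    // Assumed behavior: the last element is treated as a terminator and skipped.
    template <std::size_t count_characters>
    explicit sketch_charset(char const (&chars)[count_characters]) noexcept {
        for (std::size_t i = 0; i + 1 < count_characters; ++i)
            bits.set(static_cast<unsigned char>(chars[i]));
    }

    bool contains(char c) const noexcept {
        return bits.test(static_cast<unsigned char>(c));
    }
};

// After the fix: the array is one element longer and ends with an explicit '\0',
// so the skipped element is the terminator rather than '9'.
using digits_array = char[11];
inline digits_array const &fixed_digits() noexcept {
    static char const chars[11] = {'0', '1', '2', '3', '4', '5',
                                   '6', '7', '8', '9', '\0'};
    return chars;
}

// A check in the spirit of the new tests: every digit must be a member.
inline bool all_digits_present() {
    sketch_charset set(fixed_digits());
    for (char c = '0'; c <= '9'; ++c)
        if (!set.contains(c)) return false;
    return true;
}

inline void run_sketch_test() {
    assert(all_digits_present());
}
```

In this sketch the terminator itself never becomes a member of the set, because it is the element the constructor skips.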