This is an implementation of the UTF-8 encoding standard for C++. The implementation is based on the std::basic_string<char> class, which gives us the member functions of std::basic_string<char> as well as compatibility with some functions from the standard library.
The C++ AStyle code style was used for this project.
C++ 20
CMake 3.14
- Linux (verified on clang)
- Windows (verified on gcc and clang)
- Support for creating an object from std::string, std::string_view, and const char*
- Comparing a utf8::ustring with utf8::ustring, const char*, std::string, and std::string_view
- Copy and move constructors implemented, along with copy and move assignment
- O(1) Random Access
- Simple replacement of any character, e.g. a smaller character (1 byte) with a larger character (2, 3, or 4 bytes)
- Write/read to a file
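The "smaller" and "larger" characters in the list above refer to how many bytes UTF-8 uses to encode a character, which can be read off the first byte of the sequence. The sketch below is only an illustration of that rule: `utf8_sequence_length` and `first_char_length` are hypothetical helper names, not part of utf8::ustring, and the check is deliberately minimal (it does not reject overlong or out-of-range lead bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of utf8::ustring): number of bytes in the
// UTF-8 sequence that starts with `lead`. Returns 0 for continuation bytes
// and for bytes that match none of the four lead-byte patterns.
constexpr std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII, 1 byte
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx: 2-byte sequence
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx: 3-byte sequence
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx: 4-byte sequence
    return 0;                             // 10xxxxxx or 11111xxx
}

// Hypothetical helper: encoded length in bytes of the first character
// of a non-empty UTF-8 string.
inline std::size_t first_char_length(std::string_view s) {
    assert(!s.empty());
    return utf8_sequence_length(static_cast<unsigned char>(s.front()));
}
```

For example, `first_char_length("h")` is 1, while the Cyrillic "п" (bytes 0xD0 0xBF) has a length of 2.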
String-Version | Test-Name | iteration count | time (for all iterations) |
---|---|---|---|
std::string | replace_char | 500'000 | ~0.57s |
utf8::ustring | replace_char | 500'000 | ~0.60s |
Replacing a character is O(1) only in some cases. When the new character has the same size in bytes as the one it replaces, the replacement is O(1); otherwise it is O(N), where N is the length of the string, because the bytes after the replaced character have to be moved. In either case, random access is O(1).
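The difference between the two cases can be sketched on a plain std::string. This is an illustration of the cost model only, not the code utf8::ustring uses internally; `replace_bytes` is a hypothetical helper, and the caller is assumed to already know the byte offset and byte length of the character being replaced.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of utf8::ustring): replace the `old_len`
// bytes starting at `byte_pos` with the bytes of `replacement`.
inline void replace_bytes(std::string& s, std::size_t byte_pos,
                          std::size_t old_len, std::string_view replacement) {
    assert(byte_pos + old_len <= s.size());
    if (replacement.size() == old_len) {
        // Same encoded size: overwrite in place. The work depends only on
        // the character size (at most 4 bytes), not on the string length.
        replacement.copy(s.data() + byte_pos, old_len);
    } else {
        // Different encoded size: the tail of the string has to be shifted
        // (and the buffer possibly reallocated), which is linear in the
        // length of the string.
        s.replace(byte_pos, old_len, replacement);
    }
}
```

Replacing the 1-byte "h" at the start of "hello world!" with the 2-byte "п" takes the second branch and grows the string from 12 to 13 bytes.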
std::string str{ "hello world!" };
utf8::ustring ustr{ str };
ustr.replace_char("п", 0); // "пello world!"
ustr == "hi !"; // false
ustr == str; // false
- Docs.