This repository has been archived by the owner on Apr 9, 2020. It is now read-only.

Strings not null-terminated after decode #32

Open
phpHavok opened this issue Jun 16, 2015 · 2 comments

@phpHavok

Hello, everyone. This may be intended behavior (although I can't see why), but it appears that if you stick a string inside a map and encode it, that string will not be null terminated after decoding. Here is an example of what I mean:

```c
cn_cbor_mapput_string(inner_map, "username", cbor_username, &ctx, NULL);
cn_cbor_mapput_string(inner_map, "message", cbor_message, &ctx, NULL);
```

Assume cbor_username and cbor_message are of type const char *. Also assume inner_map is a map. I then stick inner_map inside another map and encode it. On decoding, I recover the strings.

```c
const cn_cbor * cbor_username = cn_cbor_mapget_string(inner_map, "username");
const cn_cbor * cbor_message = cn_cbor_mapget_string(inner_map, "message");
```

The field cbor_username->v.str is not null-terminated. You can still recover the length of the string through cbor_username->length. This is also true of cbor_message.
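
Since the length is available, a length-aware format such as `%.*s` can consume the decoded string without a terminator. A minimal sketch, assuming the pointer and length have already been read from `v.str` and `length`; `format_counted` is a hypothetical helper for illustration, not part of cn-cbor:

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of cn-cbor): writes "label: text" into out,
 * reading at most len bytes from data, which need not be NUL-terminated.
 * Returns snprintf's result, or -1 if len does not fit in an int. */
int format_counted(char *out, size_t out_size, const char *label,
                   const char *data, size_t len)
{
    if (len > (size_t)INT_MAX) {
        return -1; /* the precision argument of %.*s is an int */
    }
    return snprintf(out, out_size, "%s: %.*s", label, (int)len, data);
}
```

With a decoded item this would be called roughly as `format_counted(buf, sizeof buf, "username", cbor_username->v.str, (size_t)cbor_username->length)`. Note that `%.*s` still stops early at a zero byte inside the data.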

Any thoughts? Is this a known issue? Is this intended?

@cabo
Owner

cabo commented Jun 17, 2015

That is indeed the intended behavior.

cn-cbor tries to minimize resource usage, and one of its little tricks is that it only ever allocates memory for the cn_cbor structures themselves, not for the data inside. The cn_cbor structure for a string points back into the CBOR data for the actual string data (being able to do this was one of the design goals of CBOR). Null-terminating that would require writing to that CBOR data -- that is not in the contract for the decoder, which receives a const unsigned char* buf.
If you don't need the CBOR data to stay unmodified and know there is at least one more byte of unused space in that buffer, you can simply set x->v.str[x->length] = 0 (the syntax of CBOR guarantees you won't overwrite another string).

(Note that zero bytes are valid UTF-8 data, so null-terminating strings is only a valid strategy if you otherwise know that won't be the case for your data. 0xFF would be a much better string terminator for UTF-8, but of course does not help if the point is interfacing with libraries that expect C strings.)
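
Whether null termination is safe for a given string can be checked before treating it as a C string. A minimal sketch, assuming the data pointer and length come from the decoded item; `has_embedded_nul` is a hypothetical name, not a cn-cbor function:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of cn-cbor): returns 1 if the len bytes at
 * data contain a zero byte, 0 otherwise. Such a string would appear
 * truncated to any code that expects a C string. */
int has_embedded_nul(const char *data, size_t len)
{
    if (data == NULL || len == 0) {
        return 0; /* nothing to scan */
    }
    return memchr(data, '\0', len) != NULL;
}
```

If this returns 1, the string is better kept as a pointer plus length rather than converted to a C string.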

[Leaving this open because it is a documentation issue.]

@cabo cabo added the docfix label Jun 17, 2015
@phpHavok
Author

Ah, that makes sense. Thank you for the quick reply and explanation.

It shouldn't be a problem for me to just copy the string or otherwise manipulate it. It is just good to know that it is intended behavior and I'm not just doing something wrong.
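
The copy mentioned above could look roughly like this. It is a sketch under the assumption that the caller passes the `v.str` pointer and `length` of a decoded text item; `dup_counted_string` is a hypothetical helper, not part of cn-cbor:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of cn-cbor): copies len bytes from data into
 * a freshly allocated buffer and appends a terminating zero byte.
 * Returns NULL on allocation failure; the caller frees the result. */
char *dup_counted_string(const char *data, size_t len)
{
    if (len == SIZE_MAX) {
        return NULL; /* len + 1 would overflow */
    }
    char *copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    if (len > 0) {
        memcpy(copy, data, len);
    }
    copy[len] = '\0'; /* the original CBOR buffer is left untouched */
    return copy;
}
```

This leaves the encoded buffer unmodified, at the cost of one allocation per string, which is the trade-off the decoder itself avoids.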
