bug-fix: snprintf prints NULL in place of the last character #10419

Merged
merged 2 commits into from
Dec 11, 2024

kallewoof
Contributor

@kallewoof kallewoof commented Nov 20, 2024

snprintf needs room for both the last character and the terminating null character, so we allocate one extra byte and then ignore it when converting to std::string.

char buf[5];
snprintf(buf, 5, "hello");
printf("%s", buf); // prints "hell": the fifth byte holds the terminating '\0', so the 'o' is dropped

Because of this, when the C string is copied into a std::string using the full length, the string ends with a \u0000 in place of its last character, which can cause issues.
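As a rough sketch of the problem and the fix described above: `get_value`, `read_value_buggy` and `read_value_fixed` are hypothetical names invented for this illustration, with `get_value` standing in for any snprintf-based getter. It is not the actual llama.cpp code.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for an snprintf-based getter. Returns the length of
// the value, excluding the terminating NUL, as snprintf does.
static int get_value(char * buf, size_t buf_size) {
    return snprintf(buf, buf_size, "%s", "hello");
}

// Buggy variant: the buffer has exactly `len` bytes, so snprintf stores
// "hell" plus the terminating '\0' and the final 'o' is lost.
static std::string read_value_buggy() {
    int len = get_value(nullptr, 0);                 // query the required length
    if (len < 0) return {};
    std::vector<char> buf(static_cast<size_t>(len));
    get_value(buf.data(), buf.size());
    return std::string(buf.data(), static_cast<size_t>(len));
}

// Fixed variant: one extra byte for the terminator, which is then left out
// when the std::string is constructed.
static std::string read_value_fixed() {
    int len = get_value(nullptr, 0);
    if (len < 0) return {};
    std::vector<char> buf(static_cast<size_t>(len) + 1);
    get_value(buf.data(), buf.size());
    return std::string(buf.data(), static_cast<size_t>(len));
}
```

Under these assumptions, the buggy variant yields a five-byte string whose last byte is '\0', while the fixed variant yields "hello".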

@ngxson
Collaborator

ngxson commented Nov 20, 2024

@slaren The return value of snprintf excludes the terminating null byte. Should we add one byte before returning from llama_model_meta_val_str, or is it up to the user?

Either way, I think we should leave a comment in llama.h to clarify this behavior.

@kallewoof
Contributor Author

kallewoof commented Nov 20, 2024

Edited: never mind, this is C code..

Added a comment. I think a cleaner approach might be to have this allocate and return a char*, which the caller then strlen()s instead. Chat templates are not large enough for the extra length check to have a noticeable impact.
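A minimal sketch of what such an allocate-and-return interface could look like. `value_dup` and `value_to_string` are hypothetical names for this illustration; this is not the code from any of the PRs.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical alternative interface: returns a malloc'd, NUL-terminated copy
// of the value, or nullptr on failure. The caller owns the result and must
// release it with free().
static char * value_dup(const char * value) {
    int len = snprintf(nullptr, 0, "%s", value);
    if (len < 0) return nullptr;
    size_t size = static_cast<size_t>(len) + 1;      // room for the terminator
    char * out = static_cast<char *>(malloc(size));
    if (out == nullptr) return nullptr;
    snprintf(out, size, "%s", value);
    return out;
}

// Caller side: std::string's constructor scans for the terminator, which is
// the strlen-style length check mentioned above.
static std::string value_to_string(const char * value) {
    char * s = value_dup(value);
    if (s == nullptr) return {};
    std::string result(s);
    free(s);
    return result;
}
```

The trade-off in this sketch is that the caller no longer sizes a buffer, but has to remember to free the returned pointer.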

@kallewoof
Contributor Author

kallewoof commented Nov 20, 2024

I pushed an alternative PR in #10424 which simplifies the interface but requires a free() call.
Rewritten to not require free() as #10430.

@slaren
Member

slaren commented Nov 20, 2024

Should we add one byte before returning from llama_model_meta_val_str, or is it up to the user?

It's up to the user. They do not need to use NUL-terminated strings.

@kallewoof
Contributor Author

kallewoof commented Nov 20, 2024

Unless there are architecture subtleties that I'm not aware of, I think #10430 is a cleaner solution, but I'm keeping both open until a decision is made.

@ngxson
Collaborator

ngxson commented Nov 20, 2024

Should we also apply this patch to every place where llama_chat_apply_template is called? (We don't have many of those.)

@kallewoof
Contributor Author

Should we also apply this patch to every place where llama_chat_apply_template is called? (We don't have many of those.)

I looked, and it doesn't seem to be affected, since the allocated buffer is basically guaranteed to be bigger than the chat template (including the NUL terminator).

@ngxson
Collaborator

ngxson commented Nov 21, 2024

Sorry, I meant changing all the other places that use llama_model_meta_val_str (there is one inside llama_chat_apply_template, IIRC).

@kallewoof
Contributor Author

kallewoof commented Nov 22, 2024

Sorry, I meant changing all the other places that use llama_model_meta_val_str (there is one inside llama_chat_apply_template, IIRC).

Right, the only other place is in server.cpp, specifically:

https://github.com/ggerganov/llama.cpp/blob/a5e47592b6171ae21f3eaa1aba6fb2b707875063/examples/server/server.cpp#L663-L674

which also preallocates a big enough buffer, like llama_chat_apply_template.
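For illustration, the preallocated-buffer pattern could look roughly like the sketch below. `get_value` and `read_with_fixed_buffer` are hypothetical names and the 2048-byte capacity is an assumption chosen for the example; this is not a copy of the linked server.cpp lines.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for an snprintf-based getter.
static int get_value(const char * value, char * buf, size_t buf_size) {
    return snprintf(buf, buf_size, "%s", value);
}

// Reads the value into a buffer that is assumed to be larger than any
// expected value, so the terminator always fits and nothing is cut off.
static std::string read_with_fixed_buffer(const char * value) {
    std::vector<char> buf(2048, 0);                  // assumed capacity
    int res = get_value(value, buf.data(), buf.size());
    if (res < 0 || static_cast<size_t>(res) >= buf.size()) {
        return {};                                   // error, or value was truncated
    }
    return std::string(buf.data(), static_cast<size_t>(res));
}
```

Because the buffer is larger than the value plus its terminator, this call pattern does not hit the off-by-one issue; the explicit truncation check covers the case where the size assumption fails.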

@kallewoof
Contributor Author

kallewoof commented Dec 11, 2024

@slaren Is there something I can do to accelerate this very tiny fix? Being merged, that is.

@slaren
Member

slaren commented Dec 11, 2024

Sorry, forgot to merge this.

@slaren slaren merged commit 484d2f3 into ggml-org:master Dec 11, 2024
54 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
…g#10419)

* bug-fix: snprintf prints NULL in place of the last character

We need to give snprintf enough space to print the last character and the null character, thus we allocate one extra byte and then ignore it when converting to std::string.

* add comment about extra null-term byte requirement
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025