From 841de8d86085502051718b5e6e101e33f5b9b62d Mon Sep 17 00:00:00 2001
From: Thomas Robinson
Date: Fri, 23 Feb 2024 13:21:56 +0000
Subject: [PATCH 1/2] Update codellama.md

Alter text to reflect release of Code Llama 70B variants
---
 codellama.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/codellama.md b/codellama.md
index b049315f80..b6b75ae577 100644
--- a/codellama.md
+++ b/codellama.md
@@ -77,7 +77,7 @@ You can easily try the Code Llama Model (13 billion parameters!) in **[this Spa
 
 Under the hood, this playground uses Hugging Face's [Text Generation Inference](https://github.com/huggingface/text-generation-inference), the same technology that powers [HuggingChat](https://huggingface.co/chat/), and we'll share more in the following sections.
 
-If you want to try out the bigger instruct-tuned 34B model, it is now available on **HuggingChat**! You can try it out here: [hf.co/chat](https://hf.co/chat). Make sure to specify the Code Llama model. You can also check [this chat-based demo](https://huggingface.co/spaces/codellama/codellama-13b-chat) and duplicate it for your use – it's self-contained, so you can examine the source code and adapt it as you wish!
+If you want to try out the bigger instruct-tuned 34B or 70B models, they are now available on **HuggingChat**! You can try it out here: [hf.co/chat](https://hf.co/chat). Make sure to specify the Code Llama model. You can also check [this chat-based demo](https://huggingface.co/spaces/codellama/codellama-13b-chat) and duplicate it for your use – it's self-contained, so you can examine the source code and adapt it as you wish!
 
 ### Transformers
 
@@ -156,7 +156,7 @@ Code Llama is specialized in code understanding, but it's a language model in it
 
 This is a specialized task particular to code models. The model is trained to generate the code (including comments) that best matches an existing prefix and suffix. This is the strategy typically used by code assistants: they are asked to fill the current cursor position, considering the contents that appear before and after it.
 
-This task is available in the **base** and **instruction** variants of the 7B and 13B models. It is _not_ available for any of the 34B models or the Python versions.
+This task is available in the **base** and **instruction** variants of the 7B and 13B models. It is _not_ available for any of the models or the Python versions.
 
 To use this feature successfully, you need to pay close attention to the format used to train the model for this task, as it uses special separators to identify the different parts of the prompt. Fortunately, transformers' `CodeLlamaTokenizer` makes this very easy, as demonstrated below:

From aeb38cf11f6bdb3a24cc836dfde3118ed5a49007 Mon Sep 17 00:00:00 2001
From: Thomas Robinson
Date: Fri, 23 Feb 2024 13:28:08 +0000
Subject: [PATCH 2/2] Update model variants
---
 codellama.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/codellama.md b/codellama.md
index b6b75ae577..d5a63f837f 100644
--- a/codellama.md
+++ b/codellama.md
@@ -156,7 +156,7 @@ Code Llama is specialized in code understanding, but it's a language model in it
 
 This is a specialized task particular to code models. The model is trained to generate the code (including comments) that best matches an existing prefix and suffix. This is the strategy typically used by code assistants: they are asked to fill the current cursor position, considering the contents that appear before and after it.
 
-This task is available in the **base** and **instruction** variants of the 7B and 13B models. It is _not_ available for any of the models or the Python versions.
+This task is available in the **base** and **instruction** variants of the 7B and 13B models. It is _not_ available for any of the 34B or 70B models or the Python versions.
 
 To use this feature successfully, you need to pay close attention to the format used to train the model for this task, as it uses special separators to identify the different parts of the prompt. Fortunately, transformers' `CodeLlamaTokenizer` makes this very easy, as demonstrated below: