This repository provides a collection of Generative AI engineering notebooks that demonstrate how to use the Amazon SageMaker JumpStart SDK to customize Large Language Models (LLMs). Using Falcon model variants, the notebooks show how to apply basic inference customization techniques: decoding strategies, prompting techniques, and Retrieval-Augmented Generation (RAG). The notebooks are designed to be easy to deploy and follow, making them a good starting point for learning about LLM inference customization.
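As a rough illustration of what customizing a decoding strategy can look like, the sketch below builds request payloads for greedy decoding, top-k sampling, and top-p sampling. It is not taken from the notebooks: the helper names `build_payload` and `deploy_and_generate` are hypothetical, the parameter names (`do_sample`, `top_k`, `top_p`, `temperature`, `max_new_tokens`) assume the endpoint follows the Hugging Face text-generation payload convention, and the model ID and sampling values are illustrative assumptions.

```python
def build_payload(prompt, strategy="greedy", max_new_tokens=64):
    """Build a request body for a text-generation endpoint (hypothetical helper).

    Assumes the {"inputs": ..., "parameters": {...}} payload convention.
    """
    params = {"max_new_tokens": max_new_tokens}
    if strategy == "greedy":
        # Always pick the most likely next token.
        params["do_sample"] = False
    elif strategy == "top_k":
        # Sample from the k most likely tokens (values are illustrative).
        params.update(do_sample=True, top_k=50, temperature=0.7)
    elif strategy == "top_p":
        # Sample from the smallest token set whose probability mass reaches p.
        params.update(do_sample=True, top_p=0.9, temperature=0.7)
    else:
        raise ValueError(f"unknown decoding strategy: {strategy}")
    return {"inputs": prompt, "parameters": params}


def deploy_and_generate(payload, model_id="huggingface-llm-falcon-40b-instruct-bf16"):
    """Deploy a JumpStart model, send one request, then clean up.

    The model_id is an assumption; check the notebooks for the ID they use.
    Calling this creates billable AWS resources.
    """
    from sagemaker.jumpstart.model import JumpStartModel  # requires the sagemaker SDK

    model = JumpStartModel(model_id=model_id)
    predictor = model.deploy()
    try:
        return predictor.predict(payload)
    finally:
        predictor.delete_endpoint()


if __name__ == "__main__":
    # Only builds and prints payloads; no endpoint is created here.
    for name in ("greedy", "top_k", "top_p"):
        print(name, build_payload("Write a haiku about clouds.", strategy=name))
```

Keeping payload construction separate from deployment makes it possible to compare strategies against one running endpoint instead of redeploying for each experiment.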
The following Amazon SageMaker Studio notebooks are available in this repository:
- LLM-Custom-Decoding-Falcon40B-G5.ipynb: demonstrates how to generate text using different decoding strategies with the Amazon SageMaker JumpStart SDK and the Falcon-40B-Instruct model.
- LLM-Custom-Prompting-Falcon40B.ipynb: demonstrates how to generate text using prompt engineering techniques with the Amazon SageMaker JumpStart SDK and the Falcon-40B model.
- LLM-Custom-RAG-Kendra-Falcon40B.ipynb: demonstrates how to use the SageMaker and boto3 SDKs to generate text using the Retrieval-Augmented Generation (RAG) pattern. The notebook implements semantic search using the Amazon Kendra enterprise search service. The language model used for text generation is Falcon-40B-Instruct.
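The RAG pattern described for the last notebook can be sketched roughly as follows: retrieve passages from a search index, place them in the prompt as context, and send the prompt to the language model. This is a minimal sketch rather than the notebook's code. The function names, the prompt wording, and the choice of the Kendra `retrieve` operation are assumptions; `kendra_client` is expected to be a boto3 Kendra client and `predictor` an already deployed SageMaker predictor.

```python
def retrieve_passages(kendra_client, index_id, question, top_n=3):
    """Return the text of the top passages for a question (hypothetical helper).

    Assumes the boto3 Kendra `retrieve` operation, whose response lists
    passages under "ResultItems", each with a "Content" field.
    """
    response = kendra_client.retrieve(IndexId=index_id, QueryText=question)
    items = response.get("ResultItems", [])[:top_n]
    return [item.get("Content", "") for item in items]


def build_rag_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt.

    The instruction wording is illustrative, not the notebook's template.
    """
    context = "\n\n".join(p.strip() for p in passages if p.strip())
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer_with_rag(question, kendra_client, predictor, index_id):
    """Retrieve context, build the prompt, and query the model endpoint."""
    passages = retrieve_passages(kendra_client, index_id, question)
    payload = {
        "inputs": build_rag_prompt(question, passages),
        # Parameter name assumes the Hugging Face text-generation convention.
        "parameters": {"max_new_tokens": 200},
    }
    return predictor.predict(payload)
```

In a Studio notebook, `kendra_client` would typically come from `boto3.client("kendra")`, with `index_id` set to the ID of an existing Kendra index.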
To open a Jupyter notebook using Amazon SageMaker, follow the two steps below:
- Create or open an Amazon SageMaker Studio notebook.
- Clone this Git repository in Amazon SageMaker Studio.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.