Commit 57a2260
Added examples tab with write up
milo157 committed Nov 9, 2023
1 parent 0ea3b48 commit 57a2260
Showing 12 changed files with 893 additions and 186 deletions.
2 changes: 1 addition & 1 deletion cerebrium/deployments/ci-cd.mdx
---
title: "CI/CD Pipelines"
description: "Integrate Cerebrium into your CI/CD workflow for automated deployments"
---
148 changes: 0 additions & 148 deletions cerebrium/examples/stable_diffusion.mdx

This file was deleted.

62 changes: 37 additions & 25 deletions cerebrium/examples/langchain.mdx → examples/langchain.mdx
---
title: "Langchain Q&A on a YouTube Video"
description: "Deploy a Q&A application that answers questions about a YouTube video"
---

In this tutorial, we will recreate a question-answering bot that can answer questions based on a YouTube video. We recreated the application built [here](https://colab.research.google.com/drive/1sKSTjt9cPstl_WMZ86JsgEqFG-aSAwkn?usp=sharing) by @m_morzywolek.

To see the final implementation, you can view it [here](https://github.com/CerebriumAI/examples/tree/master/8-lanchain-QA).

## Create Cerebrium Account

Before building, you need to set up a Cerebrium account. This is as simple as
starting a new Project in Cerebrium and copying the API key. This will be used
to authenticate all calls for this project.

### Create a project

1. Go to [dashboard.cerebrium.ai](https://dashboard.cerebrium.ai)
2. Sign up or Login
3. Navigate to the API Keys page
4. You will need your private API key for deployments. Click the copy button to copy it to your clipboard

![API Key](/images/cortex/api_keys_private_key.png)

## Basic Setup

Developing models with Cerebrium should feel identical to developing on a virtual machine or in Google Colab - so converting this notebook is very easy!
Please make sure you have the Cerebrium package installed and have logged in. If not, please take a look at our docs [here](https://docs.cerebrium.ai/cerebrium/getting-started/installation).

First we create our project:
```
cerebrium init langchain-QA
```

We need certain Python packages in order to implement this project. Let's add those to our **_requirements.txt_** file:

```
pytube # For audio downloading
# ...
sentence_transformers
cerebrium
```

To use Whisper, we also have to install ffmpeg and a few other Linux packages, so we define these in **pkglist.txt** - this file is used to install all Linux-based packages.

```
ffmpeg
libopenblas-base
libomp-dev
```

Our **main.py** file will contain our main Python code. This is a relatively simple implementation, so we can do everything in one file. We would like a user to send in a link to a YouTube video with a question, and we return the answer as well as the time segment from which we got that response.
So let us define our request object.

```python
from pydantic import BaseModel

class Item(BaseModel):
    url: str
    question: str
```

Above, we use Pydantic as our data validation library. Due to the way we have defined the model, "url" and "question" are required parameters, so if either is missing from the request, the user will automatically receive an error message.
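Because the request object is a standard Pydantic model, a request missing either field fails validation before any of our code runs. A minimal sketch of this behaviour (using plain `pydantic.BaseModel` rather than Cerebrium's subclass):

```python
from pydantic import BaseModel, ValidationError

class Item(BaseModel):
    url: str
    question: str

# Both required fields present: validation succeeds.
item = Item(url="https://www.youtube.com/watch?v=UF8uR6Z6KLc", question="What is taught?")

# A missing required field raises a ValidationError automatically.
try:
    Item(url="https://www.youtube.com/watch?v=UF8uR6Z6KLc")
    missing_field_ok = True
except ValidationError:
    missing_field_ok = False
```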

## Convert Video to text

Below, we will use the Whisper model from OpenAI to convert the video audio to text.
```python
import pytube
from datetime import datetime
import whisper

model = whisper.load_model("small")

# ...

def predict(item, run_id, logger):
    ...
```
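Since we want to return the time segment alongside the answer, a small helper can turn a segment's start time in seconds into a readable timestamp. This is a sketch; the helper name and output format are illustrative, not part of the original code:

```python
def format_timestamp(seconds: float) -> str:
    """Format a start time in seconds as HH:MM:SS."""
    seconds = int(seconds)
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

# For example, format_timestamp(75.3) returns "00:01:15".
```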

### Langchain Implementation

Below, we will implement [Langchain](https://python.langchain.com/en/latest/index.html) with a vector store, where we store all the video segments above, together with an LLM hosted on Cerebrium, in order to generate answers.

```python
from langchain.embeddings.openai import OpenAIEmbeddings
import openai
import faiss

sentenceTransformer = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
os.environ["CEREBRIUMAI_API_KEY"] = "<JWT_TOKEN>"

def create_embeddings(texts, start_times):
    text_splitter = CharacterTextSplitter(chunk_size=1500, separator="\n")
    # ...
```

Above, we chunk our text segments and store them in a FAISS vector store. To create the embeddings, we use the SentenceTransformer model loaded above.

We then integrate Langchain with a Cerebrium-deployed endpoint to answer questions. Lastly, we return the results.
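The chunking idea behind `create_embeddings` can be sketched without Langchain or FAISS: split each transcript segment into fixed-size character chunks while carrying its start time along as metadata. The function and field names here are illustrative assumptions, not the original implementation:

```python
def chunk_with_start_times(texts, start_times, chunk_size=1500):
    """Split each text into chunks of at most chunk_size characters,
    pairing every chunk with the start time of its source segment."""
    chunks, metadatas = [], []
    for text, start in zip(texts, start_times):
        for i in range(0, len(text), chunk_size):
            chunks.append(text[i:i + chunk_size])
            metadatas.append({"start_time": start})
    return chunks, metadatas
```

Keeping the start time as per-chunk metadata is what lets us point the user back to the exact moment in the video that produced an answer.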

## Deploy

Your config.yaml file is where you set your compute/environment. Please make sure that the hardware you specify is an AMPERE_A5000 and that you have enough memory (RAM) on your instance to run the models. Your config.yaml file should look like:

```
%YAML 1.2
---
hardware: AMPERE_A5000
memory: 14
cpu: 2
min_replicas: 0
log_level: INFO
include: '[./*, main.py, requirements.txt, pkglist.txt, conda_pkglist.txt]'
exclude: '[./.*, ./__*]'
cooldown: 60
disable_animation: false
```

To deploy the model, use the following command:

```bash
cerebrium deploy langchain-QA
```

Once deployed, we can make the following request:

```curl
curl --location --request POST 'https://run.cerebrium.ai/v3/p-xxxxxx/langchain/predict' \
--header 'Authorization: <JWT-TOKEN>' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://www.youtube.com/watch?v=UF8uR6Z6KLc&ab_channel=Stanford",
    "question": "..."
}'
```
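The same request can also be built from Python with the `requests` library. This is a sketch; the question text is a placeholder, and `<JWT-TOKEN>` and the project id must be replaced with your own values:

```python
import requests

url = "https://run.cerebrium.ai/v3/p-xxxxxx/langchain/predict"
payload = {
    "url": "https://www.youtube.com/watch?v=UF8uR6Z6KLc&ab_channel=Stanford",
    "question": "What is the lecture about?",  # placeholder question
}

# Build the request without sending it; send with requests.Session().send(req).
req = requests.Request(
    "POST",
    url,
    headers={"Authorization": "<JWT-TOKEN>"},
    json=payload,  # also sets the Content-Type: application/json header
).prepare()
```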