A GitHub bot that automatically responds to issues using real-time data from your library's documentation, source code, and past issues.
View the demo over Pydantic's documentation: #5
Original Project | @fleet_ai
You'll need several things before we get set up.
- A Pinecone account. Make one here.
- An OpenAI account. Make one here.
- Embeddings. We already have support for embeddings covering the top 1221 libraries.
You can check if your library is supported here. If it's supported, you'll be able to get this set up pretty quickly. If it's not, you'll have to embed your documentation yourself. Luckily, we have a full guide for you to follow.
We will be using ngrok to expose your FastAPI endpoints, but feel free to use any alternative. If you want to deploy this to production, we recommend a more production-ready host like AWS EC2 or similar.
First, clone this repository and install all requirements:
git clone https://github.com/fleet-ai/code-pilot.git
cd code-pilot
pip install -r requirements.txt
python main.py
Then, in a separate terminal window, follow the ngrok quickstart guide and run:
ngrok http 8000
This will set up web forwarding: any request that goes through your ngrok URL is forwarded to your localhost server.
First, you'll need to create a new GitHub bot. Click on your profile in the top right corner of GitHub, then click Settings -> Developer Settings (it should be the bottom option on the left sidebar). You should be greeted with this screen:
Fill out the form with the right details, or leave the default values. Paste the ngrok URL you got in the previous section into the Webhook URL field:
Under "Permissions", make sure you allow "Read and write" for issues. This is the only one you'll need.
You'll also want to subscribe to the Issues and Issue comment events so that your API is properly notified.
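As a rough illustration of what the endpoint can do with these two subscriptions, the sketch below filters incoming deliveries by the `X-GitHub-Event` header value and the payload's `action` field. The function name `should_handle` and the choice to react only to newly opened issues and newly created comments are assumptions for illustration, not necessarily what this repository does.

```python
import json

# Event names GitHub sends in the X-GitHub-Event header for the two subscriptions.
HANDLED_EVENTS = {"issues", "issue_comment"}


def should_handle(event_name: str, payload: dict) -> bool:
    """Return True for newly opened issues and newly created issue comments.

    Hypothetical helper: the real bot may react to other actions as well.
    """
    if event_name not in HANDLED_EVENTS:
        return False
    action = payload.get("action")
    if event_name == "issues":
        return action == "opened"
    return action == "created"


if __name__ == "__main__":
    body = json.loads('{"action": "opened", "issue": {"number": 1}}')
    print(should_handle("issues", body))
```

Filtering early like this keeps edits, label changes, and other deliveries from triggering a response.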
Once you fill these out, click Create!
Once you're done creating your bot, you should get a notification prompting you to create a GitHub private key. Go ahead and click that, then click "Generate Private Key". It should download a .pem file for you automatically.
Drag and drop your .pem file into the root directory of the cloned code-pilot repository. We will use a JWT signed with this .pem file to get a GitHub access token.
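For context, a GitHub App JWT carries three claims (`iat`, `exp`, `iss`) and is signed with RS256 using the private key. The sketch below shows one way to build it; the helper names are hypothetical, and it assumes the PyJWT package (with its `cryptography` dependency) is installed for the signing step. The repository's own implementation may differ.

```python
import time
from typing import Optional


def build_app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Build the claims GitHub expects in an app JWT (hypothetical helper)."""
    now = int(time.time()) if now is None else now
    return {
        "iat": now - 60,   # backdated slightly to tolerate clock drift
        "exp": now + 540,  # expiry must be at most 10 minutes in the future
        "iss": app_id,     # the GitHub App's ID
    }


def sign_app_jwt(app_id: str, private_key_path: str) -> str:
    """Sign the claims with the app's .pem key using RS256.

    Assumes PyJWT is installed; it is imported lazily so the claims
    helper above works without it.
    """
    import jwt

    with open(private_key_path, "r") as f:
        private_key = f.read()
    return jwt.encode(build_app_jwt_claims(app_id), private_key, algorithm="RS256")
```

The signed JWT is then sent as a Bearer token to GitHub's `POST /app/installations/{installation_id}/access_tokens` endpoint, which returns the short-lived installation access token.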
Create a new file in the root directory called .env and add the line:
OPENAI_API_KEY=<your openai api key>
Now, you need to create a new Pinecone index. Follow Pinecone's instructions on how to do that. The pod type you use doesn't matter too much and depends on whether you want to optimize for storage or performance. We recommend dotproduct as your metric, as we will be implementing a hybrid retrieval system.
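One reason a dot-product metric suits hybrid retrieval: the combined score can be written as a weighted sum of a dense dot product and a sparse dot product. The sketch below illustrates that idea in plain Python. The `alpha` weighting and the function names are assumptions for illustration, not the code this repository runs.

```python
def dot(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a: dict, b: dict) -> float:
    """Dot product of two sparse vectors stored as {index: weight}."""
    return sum(w * b[i] for i, w in a.items() if i in b)


def hybrid_scale(dense, sparse: dict, alpha: float):
    """Weight the dense part by alpha and the sparse part by (1 - alpha)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [alpha * v for v in dense], {i: (1 - alpha) * w for i, w in sparse.items()}


def hybrid_score(q_dense, q_sparse, d_dense, d_sparse, alpha: float) -> float:
    """Score a document against a query using both representations."""
    qd, qs = hybrid_scale(q_dense, q_sparse, alpha)
    return dot(qd, d_dense) + sparse_dot(qs, d_sparse)
```

With `alpha=1.0` the score is purely semantic (dense). With `alpha=0.0` it is purely keyword-based (sparse).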
Go to constants.py and update the following constants:
- INDEX_NAME: the name of your index.
- INDEX_ENVIRONMENT: the environment for your index (e.g. "us-east-1-aws").
- NAMESPACE: the namespace you will be using within the index. Feel free to keep it blank.
In your .env file, add the line:
PINECONE_API_KEY=<your pinecone api key>
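Before starting the server, it can help to confirm that the .env file contains both keys. The following is a minimal standard-library sketch; the parser is deliberately simple, the helper names are hypothetical, and the project itself may load the file differently (for example with a dotenv package).

```python
import os

# Keys this README asks you to add to .env.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [k for k in REQUIRED_KEYS if not values.get(k)]


if __name__ == "__main__":
    if os.path.exists(".env"):
        with open(".env") as f:
            print("Missing keys:", missing_keys(parse_env(f.read())))
```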
Go to constants.py and update the following constants:
- APP_ID: your GitHub app's ID. You can find it under the "General" tab within your app settings.
- BOT_NAME: whatever you want to name your bot.
- PRIVATE_KEY_PATH: the name of the .pem file you just added to your root directory.
- PATH_TO_SRC_CODE: the path to the root directory of the source code you want scraped. Must start with src_code/ as that's where source code will be cloned into. If you don't change this, it will by default scrape your entire repository (which you may not always want).
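A quick sanity check of these values can catch common mistakes, such as a PATH_TO_SRC_CODE that does not start with src_code/. The function below is a hypothetical helper, not part of the repository, and the example values in the usage line are placeholders.

```python
def validate_constants(app_id, bot_name: str, private_key_path: str,
                       path_to_src_code: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not str(app_id).strip():
        problems.append("APP_ID is empty")
    if not bot_name.strip():
        problems.append("BOT_NAME is empty")
    if not private_key_path.endswith(".pem"):
        problems.append("PRIVATE_KEY_PATH should point to a .pem file")
    if not path_to_src_code.startswith("src_code/"):
        problems.append("PATH_TO_SRC_CODE must start with src_code/")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(validate_constants("123456", "my-issues-bot", "my-bot.pem", "src_code/pydantic"))
```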
Using Fleet's context module, we've written a script to automatically download and upsert your library's embeddings to Pinecone. Simply run:
make docs library_name=<your library name>
Check your Pinecone index to make sure everything was properly upserted.
You can view all supported libraries and their associated library names here. If your library is not supported by Fleet Context out of the box, you can embed your documentation yourself using our guide and continue through this tutorial once you've completed that.
Regardless of whether or not your library is supported by Fleet Context out of the box, you'll be able to embed your source code so that your bot can reference it. Simply run:
make code url=<git clone link, e.g. https://github.com/pydantic/pydantic.git>
The script will clone the repository, scrape/chunk/embed the source code, then upsert it into your Pinecone index.
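To give a feel for the chunking step, here is a minimal sketch that splits a source file into overlapping windows of lines. The window size, overlap, and output shape are assumptions for illustration; the actual script may chunk by function, class, or token count instead.

```python
def chunk_lines(source: str, max_lines: int = 40, overlap: int = 5) -> list:
    """Split source text into overlapping line-based chunks.

    Each chunk records its 1-based start and end line so a response can
    point back to the original file.
    """
    if max_lines <= 0 or not 0 <= overlap < max_lines:
        raise ValueError("need max_lines > 0 and 0 <= overlap < max_lines")
    lines = source.splitlines()
    chunks = []
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        window = lines[start:start + max_lines]
        chunks.append({
            "start_line": start + 1,
            "end_line": start + len(window),
            "text": "\n".join(window),
        })
        if start + max_lines >= len(lines):
            break  # the last window already reached the end of the file
    return chunks
```

Overlapping windows reduce the chance that a definition is cut in half at a chunk boundary, at the cost of embedding some lines twice.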
Install the app to your repository. You should see that it starts an asynchronous job to embed all past issues. Wait for this to finish. Tada, you have your past issues embedded!
Real-time issue embedding is supported automatically. Every time a new issue is created, the bot chunks and embeds it so that it always has up-to-date information about your library.
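As an illustration of the first step in that flow, the sketch below turns a GitHub `issues` webhook payload into a text document ready for chunking and embedding. The `issue_to_document` name and the output shape are hypothetical; only the payload fields (`issue.number`, `issue.title`, `issue.body`, `issue.html_url`) come from GitHub's webhook format.

```python
def issue_to_document(payload: dict) -> dict:
    """Flatten an issue webhook payload into an embeddable document."""
    issue = payload["issue"]
    title = issue.get("title") or ""
    body = issue.get("body") or ""  # GitHub sends null for an empty body
    text = f"{title}\n\n{body}".strip()
    return {
        "id": f"issue-{issue['number']}",
        "text": text,
        "metadata": {
            "number": issue["number"],
            "url": issue.get("html_url", ""),
        },
    }
```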
Open an issue and ask a question. It should give you a response with the right context.
Congratulations, you've set up your own issues responder bot!