
Feat/pull precheck #11

Closed
61 commits
61799d0
chore: use default wrangler dev port, init handler
Keyrxng Oct 23, 2024
86e7852
chore: remove .d.ts from manually written type file
Keyrxng Oct 23, 2024
9001723
chore: add new supported events, type handler
Keyrxng Oct 23, 2024
53d9a96
chore: gitignore temp payloads
Keyrxng Oct 23, 2024
af762e6
Merge branch 'development' into feat/pull-precheck
Keyrxng Oct 23, 2024
65a3a36
chore: submitCodeReview handler
Keyrxng Oct 24, 2024
f02dafa
chore: gql for task relations
Keyrxng Oct 24, 2024
f81eca0
chore: create sys msg fn, use array joins, update type name
Keyrxng Oct 24, 2024
89cbd3f
chore: improve sys msg readability, move to own file
Keyrxng Oct 24, 2024
eae71c5
chore: default sys msg and type import
Keyrxng Oct 24, 2024
b28670b
chore: llm-query-output handler
Keyrxng Oct 24, 2024
2d26534
chore: formatting
Keyrxng Oct 24, 2024
2c0325c
feat: basis for pull precheck
Keyrxng Oct 24, 2024
e6586a4
feat: dynamic ground truths
Keyrxng Oct 24, 2024
42934ff
chore: get issue no from payload util
Keyrxng Oct 24, 2024
63ab3e4
chore: precheck handler complete
Keyrxng Oct 24, 2024
638acbf
chore: gql updates, format, target: ESNEXT for regex groups
Keyrxng Oct 24, 2024
00534fa
chore: eslint, remove console.log, type fix
Keyrxng Oct 24, 2024
18f467f
chore: fix tests, type context for comment.created fns
Keyrxng Oct 24, 2024
e24add8
chore: pass single object param
Keyrxng Oct 24, 2024
788403d
chore: cleanup pull-precheck handler
Keyrxng Oct 24, 2024
48956a8
chore: correct Logs
Keyrxng Oct 24, 2024
50eae39
chore: move helper
Keyrxng Oct 24, 2024
b5b418d
chore: owner - repo - issueNo url util
Keyrxng Oct 24, 2024
8922866
chore: one review per day check
Keyrxng Oct 24, 2024
7501862
chore: convert to draft
Keyrxng Oct 24, 2024
b1d5d70
chore: context fallback for missing task spec
Keyrxng Oct 24, 2024
bbefbad
chore: get task spec
Keyrxng Oct 24, 2024
baa8ade
chore: has collaborator converted
Keyrxng Oct 24, 2024
d5359e8
chore: move hardcoded MAX_TOKENS into constants.ts
Keyrxng Oct 24, 2024
258c077
chore: tool handling
Keyrxng Oct 24, 2024
0f8ef70
chore: relocate helper fns
Keyrxng Oct 24, 2024
82cbb1e
chore: use my original agent logic - untested
Keyrxng Oct 24, 2024
e0f2d51
chore: adapters/openai/types and llm tools
Keyrxng Oct 24, 2024
5852dc4
chore: convertPrToDraft refactored for llm tooling
Keyrxng Oct 24, 2024
35e0429
chore: allow null body, format
Keyrxng Oct 31, 2024
228f486
chore: remove github-diff-tool
Keyrxng Oct 31, 2024
841b9a5
chore: return body hash matching, simplify diff fetch
Keyrxng Oct 31, 2024
dafdd09
chore: update logs to capture full final ctx
Keyrxng Oct 31, 2024
98c6839
chore: list normal comment on PR, remove readme from comments
Keyrxng Oct 31, 2024
d727f79
chore: readme block section, fetch only for current issue repo
Keyrxng Oct 31, 2024
0a63c34
chore: askGpt > askLlm
Keyrxng Oct 31, 2024
e2f6fad
chore: return empty array not a throw
Keyrxng Oct 31, 2024
9784ec5
chore: format, ctx window formatting fixes
Keyrxng Oct 31, 2024
e20a2c5
chore: fix tests
Keyrxng Oct 31, 2024
f1e5ba7
chore: sysMsg formatting fix
Keyrxng Oct 31, 2024
e817454
chore: hardcode model token limits
Keyrxng Oct 31, 2024
26f1709
chore: hardcode no language/deps responses
Keyrxng Oct 31, 2024
c5d4beb
chore: readability
Keyrxng Oct 31, 2024
d4b5a90
chore: ignore pr template html hashMatch
Keyrxng Oct 31, 2024
34b7ab8
chore: token handling
Keyrxng Oct 31, 2024
b877526
chore: diff fetch err handling
Keyrxng Oct 31, 2024
97868df
chore: pr parsing
Keyrxng Oct 31, 2024
398e993
chore: type, tests, cspell
Keyrxng Oct 31, 2024
0324dc9
chore: remove jsdoc comments, add helpful comments
Keyrxng Oct 31, 2024
f4332f9
chore: remove unused gql fetch
Keyrxng Oct 31, 2024
a6ffb03
chore: remove exclusion by file ext
Keyrxng Oct 31, 2024
a5221c7
chore: type null, filter null
Keyrxng Oct 31, 2024
2718477
chore: remove unused type
Keyrxng Nov 1, 2024
bf67520
Merge branch 'fix/missing-context' into feat/pull-precheck
Keyrxng Nov 3, 2024
9104945
chore: push old and merge missing ctx fixes
Keyrxng Nov 3, 2024
3 changes: 2 additions & 1 deletion .cspell.json
Original file line number Diff line number Diff line change
@@ -30,7 +30,8 @@
"mixtral",
"nemo",
"Reranking",
"mistralai"
"mistralai",
"Precheck"
],
"dictionaries": ["typescript", "node", "software-terms"],
"import": ["@cspell/dict-typescript/cspell-ext.json", "@cspell/dict-node/cspell-ext.json", "@cspell/dict-software-terms"],
1 change: 1 addition & 0 deletions .gitignore
@@ -16,3 +16,4 @@ cypress/screenshots
script.ts
.wrangler
test-dashboard.md
payloads.json
2 changes: 1 addition & 1 deletion manifest.json
@@ -1,5 +1,5 @@
{
"name": "command-ask",
"description": "A highly context aware organization integrated chatbot",
"ubiquity:listeners": ["issue_comment.created"]
"ubiquity:listeners": ["issue_comment.created", "pull_request.opened", "pull_request.ready_for_review"]
}
1 change: 1 addition & 0 deletions package.json
@@ -28,6 +28,7 @@
],
"dependencies": {
"@mswjs/data": "^0.16.2",
"@octokit/graphql-schema": "^15.25.0",
"@octokit/rest": "20.1.1",
"@octokit/webhooks": "13.2.7",
"@sinclair/typebox": "0.32.33",
37 changes: 23 additions & 14 deletions src/adapters/openai/helpers/completions.ts
@@ -3,7 +3,7 @@ import { Context } from "../../../types";
import { SuperOpenAi } from "./openai";
const MAX_TOKENS = 7000;

export interface CompletionsType {
export interface ResponseFromLlm {
answer: string;
tokenUsage: {
input: number;
@@ -20,14 +20,34 @@ export class Completions extends SuperOpenAi {
this.context = context;
}

private _createSystemMessage(systemMessage: string, additionalContext: string[], localContext: string[], groundTruths: string[], botName: string) {
// safer to use array join than string concatenation
const parts = [
"You Must obey the following ground truths: [",
groundTruths.join(":"),
"]\n",
systemMessage,
"Your name is : ",
botName,
"\n",
"Primary Context: ",
additionalContext.join("\n"),
"\nLocal Context: ",
localContext.join("\n"),
];

return parts.join("\n");
}

async createCompletion(
systemMessage: string,
prompt: string,
model: string = "o1-mini",
additionalContext: string[],
localContext: string[],
groundTruths: string[],
botName: string
): Promise<CompletionsType> {
): Promise<ResponseFromLlm> {
const res: OpenAI.Chat.Completions.ChatCompletion = await this.client.chat.completions.create({
model: model,
messages: [
@@ -36,18 +36,7 @@
content: [
{
type: "text",
text:
"You Must obey the following ground truths: [" +
groundTruths.join(":") +
"]\n" +
"You are tasked with assisting as a GitHub bot by generating responses based on provided chat history and similar responses, focusing on using available knowledge within the provided corpus, which may contain code, documentation, or incomplete information. Your role is to interpret and use this knowledge effectively to answer user questions.\n\n# Steps\n\n1. **Understand Context**: Review the chat history and any similar provided responses to understand the context.\n2. **Extract Relevant Information**: Identify key pieces of information, even if they are incomplete, from the available corpus.\n3. **Apply Knowledge**: Use the extracted information and relevant documentation to construct an informed response.\n4. **Draft Response**: Compile the gathered insights into a coherent and concise response, ensuring it's clear and directly addresses the user's query.\n5. **Review and Refine**: Check for accuracy and completeness, filling any gaps with logical assumptions where necessary.\n\n# Output Format\n\n- Concise and coherent responses in paragraphs that directly address the user's question.\n- Incorporate inline code snippets or references from the documentation if relevant.\n\n# Examples\n\n**Example 1**\n\n*Input:*\n- Chat History: \"What was the original reason for moving the LP tokens?\"\n- Corpus Excerpts: \"It isn't clear to me if we redid the staking yet and if we should migrate. If so, perhaps we should make a new issue instead. We should investigate whether the missing LP tokens issue from the MasterChefV2.1 contract is critical to the decision of migrating or not.\"\n\n*Output:*\n\"It was due to missing LP tokens issue from the MasterChefV2.1 Contract.\n\n# Notes\n\n- Ensure the response is crafted from the corpus provided, without introducing information outside of what's available or relevant to the query.\n- Consider edge cases where the corpus might lack explicit answers, and justify responses with logical reasoning based on the existing information." +
"Your name is : " +
botName +
"\n" +
"Primary Context: " +
additionalContext.join("\n") +
"\nLocal Context: " +
localContext.join("\n"),
text: this._createSystemMessage(systemMessage, additionalContext, localContext, groundTruths, botName),
},
],
},
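The refactor above replaces a long chain of string concatenation with an array join. A minimal standalone sketch of the same pattern (names mirror `_createSystemMessage` from the diff, but this is illustrative, not the plugin's actual module):

```typescript
// Illustrative sketch of the array-join system-message builder from the diff.
// Not the plugin's module; the signature mirrors _createSystemMessage.
function createSystemMessage(
  systemMessage: string,
  additionalContext: string[],
  localContext: string[],
  groundTruths: string[],
  botName: string
): string {
  // An array join reads better and avoids long "+" concatenation chains.
  const parts = [
    "You Must obey the following ground truths: [",
    groundTruths.join(":"),
    "]\n",
    systemMessage,
    "Your name is : ",
    botName,
    "\n",
    "Primary Context: ",
    additionalContext.join("\n"),
    "\nLocal Context: ",
    localContext.join("\n"),
  ];
  return parts.join("\n");
}
```

Note that `parts.join("\n")` inserts a newline between every element, so each labelled section lands on its own line without manual `"\n"` bookkeeping at each boundary.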
32 changes: 32 additions & 0 deletions src/adapters/openai/helpers/prompts.ts
@@ -0,0 +1,32 @@
export const DEFAULT_SYSTEM_MESSAGE = `You are tasked with assisting as a GitHub bot by generating responses based on provided chat history and similar responses, focusing on using available knowledge within the provided corpus, which may contain code, documentation, or incomplete information. Your role is to interpret and use this knowledge effectively to answer user questions.

# Steps

1. **Understand Context**: Review the chat history and any similar provided responses to understand the context.
2. **Extract Relevant Information**: Identify key pieces of information, even if they are incomplete, from the available corpus.
3. **Apply Knowledge**: Use the extracted information and relevant documentation to construct an informed response.
4. **Draft Response**: Compile the gathered insights into a coherent and concise response, ensuring it's clear and directly addresses the user's query.
5. **Review and Refine**: Check for accuracy and completeness, filling any gaps with logical assumptions where necessary.

# Output Format

- Concise and coherent responses in paragraphs that directly address the user's question.
- Incorporate inline code snippets or references from the documentation if relevant.

# Examples

**Example 1**

*Input:*
- Chat History: "What was the original reason for moving the LP tokens?"
- Corpus Excerpts: "It isn't clear to me if we redid the staking yet and if we should migrate. If so, perhaps we should make a new issue instead. We should investigate whether the missing LP tokens issue from the MasterChefV2.1 contract is critical to the decision of migrating or not."

*Output:*
"It was due to missing LP tokens issue from the MasterChefV2.1 Contract.

# Notes

- Ensure the response is crafted from the corpus provided, without introducing information outside of what's available or relevant to the query.
- Consider edge cases where the corpus might lack explicit answers, and justify responses with logical reasoning based on the existing information.`;

export const PULL_PRECHECK_SYSTEM_MESSAGE = `Perform code review using the diff and spec.`;
4 changes: 3 additions & 1 deletion src/handlers/add-comment.ts
@@ -1,3 +1,4 @@
import { getIssueNumberFromPayload } from "../helpers/get-issue-no-from-payload";
import { Context } from "../types/context";

/**
@@ -7,7 +7,8 @@ import { Context } from "../types/context";
*/
export async function addCommentToIssue(context: Context, message: string) {
const { payload } = context;
const issueNumber = payload.issue.number;
const issueNumber = getIssueNumberFromPayload(payload);

try {
await context.octokit.issues.createComment({
owner: payload.repository.owner.login,
9 changes: 6 additions & 3 deletions src/handlers/ask-llm.ts
@@ -1,10 +1,11 @@
import { Context } from "../types";
import { CompletionsType } from "../adapters/openai/helpers/completions";
import { ResponseFromLlm } from "../adapters/openai/helpers/completions";
import { CommentSimilaritySearchResult } from "../adapters/supabase/helpers/comment";
import { IssueSimilaritySearchResult } from "../adapters/supabase/helpers/issues";
import { recursivelyFetchLinkedIssues } from "../helpers/issue-fetching";
import { formatChatHistory } from "../helpers/format-chat-history";
import { optimizeContext } from "../helpers/issue";
import { DEFAULT_SYSTEM_MESSAGE } from "../adapters/openai/helpers/prompts";

/**
* Asks a question to GPT and returns the response
@@ -13,14 +13,15 @@ import { optimizeContext } from "../helpers/issue";
* @returns The response from GPT
* @throws If no question is provided
*/
export async function askQuestion(context: Context, question: string) {
export async function askQuestion(context: Context<"issue_comment.created">, question: string) {
if (!question) {
throw context.logger.error("No question provided");
}
const { specAndBodies, streamlinedComments } = await recursivelyFetchLinkedIssues({
context,
owner: context.payload.repository.owner.login,
repo: context.payload.repository.name,
issueNum: context.payload.issue.number,
});
const formattedChat = await formatChatHistory(context, streamlinedComments, specAndBodies);
context.logger.info(`${formattedChat.join("")}`);
@@ -34,7 +36,7 @@
* @param formattedChat - The formatted chat history to provide context to GPT
* @returns completions - The completions generated by GPT
**/
export async function askGpt(context: Context, question: string, formattedChat: string[]): Promise<CompletionsType> {
export async function askGpt(context: Context, question: string, formattedChat: string[]): Promise<ResponseFromLlm> {
const {
env: { UBIQUITY_OS_APP_NAME },
config: { model, similarityThreshold },
@@ -63,6 +65,7 @@
similarText = similarText.filter((text) => text !== "");
const rerankedText = similarText.length > 0 ? await context.adapters.voyage.reranker.reRankResults(similarText, question) : [];
return context.adapters.openai.completions.createCompletion(
DEFAULT_SYSTEM_MESSAGE,
question,
model,
rerankedText,
21 changes: 3 additions & 18 deletions src/handlers/comment-created-callback.ts
@@ -1,8 +1,7 @@
import { Context, SupportedEvents } from "../types";
import { addCommentToIssue } from "./add-comment";
import { askQuestion } from "./ask-llm";
import { CallbackResult } from "../types/proxy";
import { bubbleUpErrorComment } from "../helpers/errors";
import { askQuestion } from "./ask-llm";
import { handleLlmQueryOutput } from "./llm-query-output";

export async function issueCommentCreatedCallback(
context: Context<"issue_comment.created", SupportedEvents["issue_comment.created"]>
@@ -23,19 +22,5 @@
return { status: 204, reason: logger.info("Comment is empty. Skipping.").logMessage.raw };
}
logger.info(`Asking question: ${question}`);

try {
const response = await askQuestion(context, question);
const { answer, tokenUsage } = response;
if (!answer) {
throw logger.error(`No answer from OpenAI`);
}
logger.info(`Answer: ${answer}`, { tokenUsage });
const tokens = `\n\n<!--\n${JSON.stringify(tokenUsage, null, 2)}\n--!>`;
const commentToPost = answer + tokens;
await addCommentToIssue(context, commentToPost);
return { status: 200, reason: logger.info("Comment posted successfully").logMessage.raw };
} catch (error) {
throw await bubbleUpErrorComment(context, error, false);
}
return await handleLlmQueryOutput(context, await askQuestion(context, question));
}
86 changes: 86 additions & 0 deletions src/handlers/find-ground-truths.ts
@@ -0,0 +1,86 @@
import OpenAI from "openai";
import { Context } from "../types";
import { logger } from "../helpers/errors";

const FIND_GROUND_TRUTHS_SYSTEM_MESSAGE = `Using the input provided, your goal is to produce an array of strings that represent "Ground Truths."
These ground truths are high-level abstractions that encapsulate the key aspects of the task.
They serve to guide and inform our code review model's interpretation of the task by providing clear, concise, and explicit insights.

Each ground truth should:
- Be succinct and easy to understand.
- Directly pertain to the task at hand.
- Focus on essential requirements, behaviors, or assumptions involved in the task.

Example:
Task: Implement a function that adds two numbers.
Ground Truths:
- The function should accept two numerical inputs.
- The function should return the sum of the two inputs.
- Inputs must be validated to ensure they are numbers.

Based on the given task, generate similar ground truths adhering to a maximum of 10.

Return a JSON parsable array of strings representing the ground truths, without comment or directive.`;

function validateGroundTruths(truthsString: string): string[] {
let truths;
try {
truths = JSON.parse(truthsString);
} catch (err) {
throw logger.error("Failed to parse ground truths");
}
if (!Array.isArray(truths)) {
throw logger.error("Ground truths must be an array");
}

if (truths.length > 10) {
throw logger.error("Ground truths must not exceed 10");
}

truths.forEach((truth: string) => {
if (typeof truth !== "string") {
throw logger.error("Each ground truth must be a string");
}
});

return truths;
}

export async function findGroundTruths(context: Context, groundTruthSource: string) {
const {
env: { OPENAI_API_KEY },
config: { openAiBaseUrl, model },
} = context;

const openAi = new OpenAI({
apiKey: OPENAI_API_KEY,
...(openAiBaseUrl && { baseURL: openAiBaseUrl }),
});

const res = await openAi.chat.completions.create({
messages: [
{
role: "system",
content: FIND_GROUND_TRUTHS_SYSTEM_MESSAGE,
},
{
role: "user",
content: groundTruthSource,
},
],
/**
* I've used the config model here but in my opinion,
* we should optimize this for a quicker response which
* means no advanced reasoning models. rfc
*/
model: model,
});

const output = res.choices[0].message.content;

if (!output) {
throw logger.error("Failed to produce a ground truths response");
}

return validateGroundTruths(output);
}
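The `validateGroundTruths` guard above parses the model's raw output and bounds-checks it before use. A self-contained sketch of the same checks, assuming a plain `Error` in place of the plugin's `logger.error` throw:

```typescript
// Standalone re-implementation of the validateGroundTruths guard from the
// diff, assuming plain Errors where the plugin throws via its logger.
function validateGroundTruths(truthsString: string): string[] {
  let truths: unknown;
  try {
    truths = JSON.parse(truthsString);
  } catch {
    throw new Error("Failed to parse ground truths");
  }
  if (!Array.isArray(truths)) {
    throw new Error("Ground truths must be an array");
  }
  if (truths.length > 10) {
    throw new Error("Ground truths must not exceed 10");
  }
  for (const truth of truths) {
    if (typeof truth !== "string") {
      throw new Error("Each ground truth must be a string");
    }
  }
  return truths as string[];
}
```

Validating shape, length, and element type separately gives a distinct error message for each failure mode, which makes it easier to see whether the model ignored the "JSON parsable array of strings" instruction or the ten-item cap.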
22 changes: 22 additions & 0 deletions src/handlers/llm-query-output.ts
@@ -0,0 +1,22 @@
import { ResponseFromLlm } from "../adapters/openai/helpers/completions";
import { bubbleUpErrorComment } from "../helpers/errors";
import { Context } from "../types";
import { CallbackResult } from "../types/proxy";
import { addCommentToIssue } from "./add-comment";

export async function handleLlmQueryOutput(context: Context, llmResponse: ResponseFromLlm): Promise<CallbackResult> {
const { logger } = context;
try {
const { answer, tokenUsage } = llmResponse;
if (!answer) {
throw logger.error(`No answer from OpenAI`);
}
logger.info(`Answer: ${answer}`, { tokenUsage });
const tokens = `\n\n<!--\n${JSON.stringify(tokenUsage, null, 2)}\n--!>`;
Reviewer comment (Member): We have some means to add structured metadata via a method from the SDK
const commentToPost = answer + tokens;
await addCommentToIssue(context, commentToPost);
return { status: 200, reason: logger.info("Comment posted successfully").logMessage.raw };
} catch (error) {
throw await bubbleUpErrorComment(context, error, false);
}
}
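The handler above appends token usage to the posted answer as a hidden comment. A standalone sketch of that step, with `TokenUsage` as a stand-in shape; note that a standard HTML comment terminates with `-->`, whereas the diff writes `--!>`:

```typescript
// Sketch of the metadata-appending step in handleLlmQueryOutput.
// TokenUsage is a stand-in for the shape used in ResponseFromLlm; the
// diff's "--!>" terminator is replaced with the valid "-->" here.
interface TokenUsage {
  input: number;
  output: number;
}

function appendTokenMetadata(answer: string, tokenUsage: TokenUsage): string {
  // Hide token accounting from readers while keeping it greppable in the
  // rendered comment's HTML source.
  const tokens = `\n\n<!--\n${JSON.stringify(tokenUsage, null, 2)}\n-->`;
  return answer + tokens;
}
```

As the reviewer comment on this file notes, a structured-metadata method from the SDK could replace this hand-rolled comment entirely.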