Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion apps/dbagent/src/evals/chat/tool-choice.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -153,7 +153,7 @@ describe.concurrent('tool_choice', () => {
{
id: 'explain_query',
prompt: 'Explain SELECT * FROM dogs',
expectedToolCalls: ['unsafeExplainQuery'],
expectedToolCalls: ['safeExplainQuery'],
allowOtherTools: false
},
{
Expand Down
2 changes: 1 addition & 1 deletion apps/dbagent/src/lib/ai/prompts.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ If the user asks for something that is not related to PostgreSQL or database adm
export const chatSystemPrompt = `
Provide clear, concise, and accurate responses to questions.
Use the provided tools to get context from the PostgreSQL database to answer questions.
When asked why a query is slow, call the unsafeExplainQuery tool and also take into account the table sizes.
When asked why a query is slow, call the safeExplainQuery tool and also take into account the table sizes.
During the initial assessment use the getTablesInfo, getPerfromanceAndVacuumSettings, getConnectionsStats, and getPostgresExtensions, and others if you want.
When asked to run a playbook, use the getPlaybook tool to get the playbook contents. Then use the contents of the playbook
as an action plan. Execute the plan step by step.
Expand Down
9 changes: 5 additions & 4 deletions apps/dbagent/src/lib/ai/tools/db.ts
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ export class DBSQLTools implements ToolsetGroup {
return tool({
description: `Get a list of slow queries formatted as a JSON array. Contains how many times the query was called,
the max execution time in seconds, the mean execution time in seconds, the total execution time
(all calls together) in seconds, and the query itself.`,
(all calls together) in seconds, the query itself, and the queryid for use with safeExplainQuery.`,
parameters: z.object({}),
execute: async () => {
try {
Expand Down Expand Up @@ -95,10 +95,11 @@ If you know the schema, pass it in as well.`,
const pool = this.#pool;
return tool({
description: `Safely run EXPLAIN on a query by fetching it from pg_stat_statements using queryId.
This prevents SQL injection by not accepting raw SQL queries. Returns the explain plan as received from PostgreSQL.`,
This prevents SQL injection by not accepting raw SQL queries. Returns the explain plan as received from PostgreSQL.
Use the queryid field from the getSlowQueries tool output as the queryId parameter.`,
parameters: z.object({
schema: z.string().optional(),
queryId: z.string().describe('The query ID from pg_stat_statements')
schema: z.string(),
queryId: z.string().describe('The query ID from pg_stat_statements (use the queryid field from getSlowQueries)')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The schema parameter is defined as required in the Zod schema but has a default value in the destructuring, creating inconsistent behavior.

View Details
📝 Patch Details
diff --git a/apps/dbagent/src/lib/ai/tools/db.ts b/apps/dbagent/src/lib/ai/tools/db.ts
index 35fd3e9..9e155c4 100644
--- a/apps/dbagent/src/lib/ai/tools/db.ts
+++ b/apps/dbagent/src/lib/ai/tools/db.ts
@@ -72,7 +72,7 @@ The query needs to be complete, it cannot contain $1, $2, etc. If you need to, r
 It's very important that $1, $2, etc. are not passed to this tool. Use the tool describeTable to get the types of the columns.
 If you know the schema, pass it in as well.`,
       parameters: z.object({
-        schema: z.string(),
+        schema: z.string().optional(),
         query: z.string()
       }),
       execute: async ({ schema = 'public', query }) => {
@@ -98,7 +98,7 @@ If you know the schema, pass it in as well.`,
 This prevents SQL injection by not accepting raw SQL queries. Returns the explain plan as received from PostgreSQL.
 Use the queryid field from the getSlowQueries tool output as the queryId parameter.`,
       parameters: z.object({
-        schema: z.string(),
+        schema: z.string().optional(),
         queryId: z.string().describe('The query ID from pg_stat_statements (use the queryid field from getSlowQueries)')
       }),
       execute: async ({ schema = 'public', queryId }) => {
@@ -116,7 +116,7 @@ Use the queryid field from the getSlowQueries tool output as the queryId paramet
     return tool({
       description: `Describe a table. If you know the schema, pass it as a parameter. If you don't, use public.`,
       parameters: z.object({
-        schema: z.string(),
+        schema: z.string().optional(),
         table: z.string()
       }),
       execute: async ({ schema = 'public', table }) => {

Analysis

Zod schema validation failure when required parameters have default values

What fails: safeExplainQuery(), unsafeExplainQuery(), and describeTable() tools define schema parameter as required (z.string()) but use default values in destructuring ({ schema = 'public', ... }), causing Zod validation to fail when AI models don't provide the parameter

How to reproduce:

// Test with the original schema definition
const schema = z.object({ schema: z.string(), queryId: z.string() });
schema.parse({ queryId: '123' }); // Fails with "Required" error

Result: Zod validation throws error: [{"code": "invalid_type", "expected": "string", "received": "undefined", "path": ["schema"], "message": "Required"}]

Expected: Should accept missing schema parameter and use default value 'public' as intended by the destructuring pattern

Fix: Changed schema: z.string() to schema: z.string().optional() in all three affected tool parameter schemas to match the intended behavior with default values

}),
execute: async ({ schema = 'public', queryId }) => {
try {
Expand Down
4 changes: 3 additions & 1 deletion apps/dbagent/src/lib/targetdb/db.ts
Original file line number Diff line number Diff line change
Expand Up @@ -361,6 +361,7 @@ interface SlowQuery {
mean_exec_secs: number;
total_exec_secs: number;
query: string;
queryid: string;
}

export async function getSlowQueries(client: ClientBase, thresholdMs: number): Promise<SlowQuery[]> {
Expand All @@ -370,7 +371,8 @@ export async function getSlowQueries(client: ClientBase, thresholdMs: number): P
round(max_exec_time/1000) max_exec_secs,
round(mean_exec_time/1000) mean_exec_secs,
round(total_exec_time/1000) total_exec_secs,
query
query,
queryid::text as queryid
FROM pg_stat_statements
WHERE max_exec_time > $1
ORDER BY total_exec_time DESC
Expand Down
7 changes: 2 additions & 5 deletions apps/dbagent/src/lib/targetdb/safe-explain.ts
Original file line number Diff line number Diff line change
Expand Up @@ -19,18 +19,15 @@ export async function safeExplainQuery(client: ClientBase, schema: string, query
return 'The query is not a single safe statement. Only SELECT, INSERT, UPDATE, DELETE, and WITH statements are allowed.';
}

if (query.includes('$1') || query.includes('$2') || query.includes('$3') || query.includes('$4')) {
// TODO: we could use `GENERIC_PLAN` to still get the plan in this case.
return 'The query seems to contain placeholders ($1, $2, etc). Replace them with actual values and try again.';
}
const hasPlaceholders = /\$\d+/.test(query);

let toReturn = '';
try {
await client.query('BEGIN');
await client.query("SET LOCAL statement_timeout = '2000ms'");
await client.query("SET LOCAL lock_timeout = '200ms'");
await client.query(`SET search_path TO ${schema}`);
const explainQuery = `EXPLAIN ${query}`;
const explainQuery = hasPlaceholders ? `EXPLAIN (GENERIC_PLAN true) ${query}` : `EXPLAIN ${query}`;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

console.log(schema);
console.log(explainQuery);
const result = await client.query(explainQuery);
Expand Down
2 changes: 1 addition & 1 deletion apps/dbagent/src/lib/tools/playbooks.ts
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ Use the tool findTableSchema to find the schema of the table involved in the slo
Use the tool describeTable to describe the table you found.

Step 4:
Use the tool unsafeExplainQuery to explain the slow queries. Make sure to pass the schema you found to the tool.
Use the tool safeExplainQuery to explain the slow queries. Make sure to pass the schema you found to the tool.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The playbook instructions tell users to "replace the query parameters ($1, $2, etc) with actual values" when using safeExplainQuery, but this tool fetches queries by queryId from pg_stat_statements, so users cannot modify the query.

View Details
📝 Patch Details
diff --git a/apps/dbagent/src/lib/tools/playbooks.ts b/apps/dbagent/src/lib/tools/playbooks.ts
index cf4fc47..57b6818 100644
--- a/apps/dbagent/src/lib/tools/playbooks.ts
+++ b/apps/dbagent/src/lib/tools/playbooks.ts
@@ -24,9 +24,10 @@ Use the tool findTableSchema to find the schema of the table involved in the slo
 Use the tool describeTable to describe the table you found.
 
 Step 4:
-Use the tool safeExplainQuery to explain the slow queries. Make sure to pass the schema you found to the tool. 
-Also, it's very important to replace the query parameters ($1, $2, etc) with the actual values. Generate your own values, but
-  take into account the data types of the columns.
+Use the tool safeExplainQuery to explain the slow queries. Make sure to pass the schema you found to the tool
+and the queryid from the getSlowQueries output. Note that safeExplainQuery automatically fetches the query
+from pg_stat_statements using the queryid, so you cannot modify the query text. If the query contains
+placeholders ($1, $2, etc), you may need to use unsafeExplainQuery instead with manually replaced values.
 
 Step 5:
 If the previous step indicates that an index is missing, tell the user the exact DDL to create the index.

Analysis

Incorrect safeExplainQuery instructions in slow queries playbook

What fails: SLOW_QUERIES_PLAYBOOK Step 4 instructs users to "replace the query parameters ($1, $2, etc) with actual values" when using safeExplainQuery, but this tool fetches queries by queryId from pg_stat_statements automatically, making parameter replacement impossible.

How to reproduce:

  1. Call getSlowQueries() to get queries with placeholders like "$1, $2"
  2. Follow playbook Step 4 instructions to use safeExplainQuery with the queryId
  3. safeExplainQuery fetches the query from pg_stat_statements WHERE queryid = $1 and detects placeholders
  4. Function returns error: "The query seems to contain placeholders ($1, $2, etc). Replace them with actual values and try again."

Result: Users receive contradictory error message telling them to replace parameters they cannot modify, since safeExplainQuery automatically retrieves the query text from the database.

Expected: Instructions should clarify that safeExplainQuery uses queryId to automatically fetch queries from pg_stat_statements, and suggest using unsafeExplainQuery when parameter replacement is needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct, we should remove that instruction to the model.

Also, it's very important to replace the query parameters ($1, $2, etc) with the actual values. Generate your own values, but
take into account the data types of the columns.

Expand Down
Loading