This directory contains test queries used for evaluating the AI system's performance. The evaluation framework uses these queries to measure the factuality and accuracy of the system's responses.
Test queries are stored in JSON files, where each query is represented as an object with two required fields:

- `input`: the query text that will be fed to the AI system
- `expected`: the expected response that will be used to evaluate the system's answer
Example format:

```json
[
  {
    "input": "what is my email",
    "expected": "user@example.com"
  }
]
```
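For example, a query file can be loaded and checked for the two required fields before it is used. The sketch below is a minimal Python example; the file name `queries.json` is an assumption for illustration, not a convention required by the framework.

```python
import json

# Load a query file (the name "queries.json" is assumed for illustration).
with open("queries.json") as f:
    queries = json.load(f)

# Verify that every query object carries the two required fields.
for i, query in enumerate(queries):
    for field in ("input", "expected"):
        if field not in query:
            raise ValueError(f"query {i} is missing required field {field!r}")
```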
- Clarity & Specificity: Make
input
queries clear and specific - ambiguous queries are hard to evaluate - Factual Correctness: The
expected
answer should be factually correct and concise - Diversity: Include a diverse range of query types (factual, temporal, personal, etc.)
- Edge Cases: Consider adding edge cases to thoroughly test the system
- Personal Data: For queries about personal data, ensure the expected answer matches the test user account
- Objectivity: Avoid queries that have subjective or multiple correct answers
- Complexity Range: Include both simple and complex queries to test different capabilities
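Some of these guidelines (non-empty fields, no duplicate inputs) can be checked mechanically before a file is committed. The sketch below is a hypothetical lint pass, not part of the evaluation framework, and the specific checks are assumptions about what is worth catching.

```python
def lint_queries(queries: list[dict]) -> list[str]:
    """Return warnings for machine-checkable guideline violations.

    A hypothetical helper, not part of the evaluation framework.
    """
    warnings = []
    seen_inputs = set()
    for i, query in enumerate(queries):
        text = query.get("input", "").strip()
        expected = query.get("expected", "").strip()
        if not text or not expected:
            warnings.append(f"query {i}: empty 'input' or 'expected' field")
        if text.lower() in seen_inputs:
            warnings.append(f"query {i}: duplicate input {text!r}")
        seen_inputs.add(text.lower())
    return warnings
```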
- Factual: `{"input": "what is my email", "expected": "user@example.com"}`
- Temporal: `{"input": "when was my last meeting", "expected": "Yesterday at 3pm with Marketing team"}`
- Personal: `{"input": "what's my job title", "expected": "Senior Developer"}`
- Data Search: `{"input": "find emails about project alpha", "expected": "Found 3 emails from last week about project alpha"}`
These test queries are automatically used by the evaluation system to measure performance. To add new queries, simply add new objects to the array while following the guidelines above.
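For a rough picture of how such queries might be consumed, here is a minimal evaluation loop in Python. The `ask_system` call and the exact-match scoring are placeholders; the framework's actual factuality scoring is not specified here and is presumably more nuanced.

```python
import json

def ask_system(prompt: str) -> str:
    """Placeholder for the call into the AI system under test."""
    raise NotImplementedError

def evaluate(path: str) -> float:
    """Score a query file with naive exact matching (a stand-in for the
    framework's real factuality metric)."""
    with open(path) as f:
        queries = json.load(f)
    correct = sum(
        ask_system(q["input"]).strip() == q["expected"].strip()
        for q in queries
    )
    return correct / len(queries)
```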