Create Benchmarks Directory Structure #193

@gltanaka

Problem

Benchmark experiments (such as Jihye's 135k lines of benchmark work) need a shared home in the main repo so contributors can collaborate on them.

From Dec 13 Benchmarking Meeting:

"We probably want to... put it in PDD... we probably want that in the main repo."

Proposed Structure

pdd/
└── benchmarks/
    ├── humaneval/           # Jihye's benchmark work
    ├── auto-regen/          # PDD self-regeneration benchmark
    ├── single-key-matrix/   # Simple quality bar tests
    ├── experiments/
    │   └── [contributor-name]/  # Individual experiments
    ├── data/
    │   └── benchmark_runs/      # Historical results (Git LFS)
    └── README.md
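
A minimal sketch of how the layout above could be scaffolded, in case a script is easier to review than a hand-made commit. The directory names come from the proposed structure. The function name `scaffold_benchmarks`, the `.gitkeep` placeholders, and the README stub text are assumptions made for illustration, not part of the proposal.

```python
from pathlib import Path

# Directory names taken from the proposed structure.
# experiments/[contributor-name]/ is left for contributors to create themselves.
BENCHMARK_DIRS = [
    "humaneval",
    "auto-regen",
    "single-key-matrix",
    "experiments",
    "data/benchmark_runs",
]


def scaffold_benchmarks(repo_root: Path) -> Path:
    """Create benchmarks/ and its subdirectories under repo_root (hypothetical helper)."""
    base = repo_root / "benchmarks"
    for rel in BENCHMARK_DIRS:
        directory = base / rel
        directory.mkdir(parents=True, exist_ok=True)
        # Git does not track empty directories, so add a placeholder file.
        (directory / ".gitkeep").touch()
    readme = base / "README.md"
    if not readme.exists():
        # Stub only; the real README should explain how to run the benchmarks.
        readme.write_text("# Benchmarks\n\nTODO: explain how to run the benchmarks.\n")
    return base
```

Calling `scaffold_benchmarks(Path("."))` from the repo root would create the tree. The function is safe to re-run because existing directories and an existing README are left alone.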

Requirements

  • Git LFS enabled for large data files in benchmarks/data/
  • Clear README explaining how to run benchmarks
  • Separate from core pdd/ source code
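
For the Git LFS requirement, one possible approach is to make sure `.gitattributes` carries a tracking rule for `benchmarks/data/`. The sketch below appends the same kind of line that `git lfs track "benchmarks/data/**"` writes. The helper name `ensure_lfs_rule` and the exact glob are assumptions; the project may prefer narrower patterns (for example, specific file extensions).

```python
from pathlib import Path

# Assumed pattern: track everything under benchmarks/data/ with Git LFS.
LFS_RULE = "benchmarks/data/** filter=lfs diff=lfs merge=lfs -text"


def ensure_lfs_rule(repo_root: Path, rule: str = LFS_RULE) -> bool:
    """Append rule to .gitattributes if it is missing (hypothetical helper).

    Returns True if the file was changed, False if the rule was already present.
    """
    attrs = repo_root / ".gitattributes"
    lines = attrs.read_text().splitlines() if attrs.exists() else []
    if rule in lines:
        return False
    lines.append(rule)
    attrs.write_text("\n".join(lines) + "\n")
    return True
```

Git LFS itself still has to be installed and initialized (`git lfs install`) on each contributor's machine; this helper only manages the attribute rule.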

Benefits

  1. Centralized location for benchmark experiments
  2. Historical data preserved for time-series analysis
  3. Contributors can share work in their own subdirectories
  4. Enables CI/CD integration for quality bar testing

Related

  • Metrics for regeneration #152: the auto-regen benchmark would live here
  • The quality bar test matrix issue: single-key-matrix would live here
