feat: Add parallel execution support for benchmark phases #14
base: main
Implements configurable parallelism for benchmark phases to improve performance and throughput when running evaluations.
## Changes
### Core Features
- Add ParallelExecutor utility for concurrent phase execution with graceful stop support
- Implement per-phase parallelism configuration via CLI flags
- Add atomic checkpoint saves with temp file + rename pattern to prevent corruption
- Add flush() method to ensure all checkpoint writes complete before run completion
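The concurrent execution with graceful stop described above could look roughly like the following. This is a minimal sketch, not the PR's actual `ParallelExecutor`: the function name `runWithConcurrency`, the `ParallelOptions` and `TaskResult` types, and the result shape are all assumptions made for illustration.

```typescript
// Hypothetical sketch; names and signatures are assumptions, not the PR's code.
export interface ParallelOptions {
  concurrency: number
  shouldStop?: () => boolean
}

export interface TaskResult<R> {
  index: number
  value?: R
  error?: unknown
}

export async function runWithConcurrency<T, R>(
  items: readonly T[],
  worker: (item: T, index: number) => Promise<R>,
  { concurrency, shouldStop }: ParallelOptions,
): Promise<TaskResult<R>[]> {
  const results: TaskResult<R>[] = []
  let next = 0

  // Each runner claims the next unprocessed index until the input is
  // exhausted or a stop is requested. JavaScript runs this on one thread,
  // so reading and incrementing `next` cannot interleave between runners.
  const runner = async (): Promise<void> => {
    while (next < items.length && !shouldStop?.()) {
      const index = next++
      try {
        results.push({ index, value: await worker(items[index], index) })
      } catch (error) {
        // Per-item error handling: one failure does not abort the batch.
        results.push({ index, error })
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(concurrency, items.length))
  await Promise.all(Array.from({ length: runnerCount }, runner))
  // Items finish out of order, so restore input order before returning.
  return results.sort((a, b) => a.index - b.index)
}
```

In this sketch a stop request lets in-flight items finish and simply prevents new ones from starting, so the returned array may be shorter than the input.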
### Phase Updates
- Refactor answer, evaluate, indexing, ingest, and search phases to use ParallelExecutor
- Replace sequential loops with parallel execution using configurable concurrency
- Maintain per-question error handling and progress tracking
### CLI Enhancements
- Add --parallelism flag for default concurrency across all phases
- Add phase-specific flags: --parallelism-{ingest,indexing,search,answer,evaluate}
- Persist parallelism settings in the checkpoint and respect them on resume
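As an illustration of how the flags listed above might be turned into a config object, here is a minimal sketch. The function `parseParallelismFlags` is hypothetical, and both the accepted syntax (`--flag value` and `--flag=value`) and the validation rules are assumptions; the PR itself may rely on a CLI library instead.

```typescript
// Hypothetical sketch; the PR's real flag parsing may differ.
const PHASES = ["ingest", "indexing", "search", "answer", "evaluate"] as const
type Phase = (typeof PHASES)[number]
type ParallelismConfig = { default?: number } & Partial<Record<Phase, number>>

const isPhase = (name: string): name is Phase =>
  (PHASES as readonly string[]).includes(name)

export function parseParallelismFlags(argv: readonly string[]): ParallelismConfig {
  const config: ParallelismConfig = {}
  for (let i = 0; i < argv.length; i++) {
    const flag = argv[i]
    // Matches --parallelism, --parallelism-<phase>, and the =value forms.
    const match = /^--parallelism(?:-([a-z]+))?(?:=(.*))?$/.exec(flag)
    if (!match) continue

    const phase: string | undefined = match[1]
    // Use the inline value if present, otherwise consume the next argument.
    const raw: string | undefined = match[2] ?? argv[++i]
    const value = Number(raw)
    if (!Number.isInteger(value) || value < 1) {
      throw new Error(`${flag} expects a positive integer, got "${raw}"`)
    }

    if (phase === undefined) {
      config.default = value
    } else if (isPhase(phase)) {
      config[phase] = value
    } else {
      throw new Error(`Unknown phase in flag ${flag}`)
    }
  }
  return config
}
```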
### Type System
- Add ParallelismConfig type for phase-specific concurrency settings
- Add resolveParallelism() helper to determine effective parallelism with fallbacks
- Extend RunCheckpoint to store parallelism configuration
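A fallback chain for the effective value might be sketched as follows. The config shape (a `default` key plus one optional key per phase) mirrors how the UI code in this PR reads `form.parallelism`, but the signature of `resolveParallelism` and the precedence order used here (run setting for the phase, run default, provider setting for the phase, provider default, then 1) are assumptions.

```typescript
// Hypothetical sketch; the precedence order is an assumption.
export type Phase = "ingest" | "indexing" | "search" | "answer" | "evaluate"
export type ParallelismConfig = { default?: number } & Partial<Record<Phase, number>>

export function resolveParallelism(
  phase: Phase,
  runConfig?: ParallelismConfig,
  providerDefaults?: ParallelismConfig,
): number {
  const candidates = [
    runConfig?.[phase],
    runConfig?.default,
    providerDefaults?.[phase],
    providerDefaults?.default,
  ]
  // Skip unset, NaN, and non-positive values so bad input falls through.
  const chosen = candidates.find(
    (n): n is number => typeof n === "number" && Number.isFinite(n) && n >= 1,
  )
  // With nothing configured, fall back to 1 (sequential execution).
  return Math.floor(chosen ?? 1)
}
```

For example, `resolveParallelism("search", { default: 4, search: 20 })` picks the phase-specific value, while `resolveParallelism("ingest", { default: 4 })` falls back to the run default.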
### Improvements
- Thread-safe checkpoint saving prevents race conditions during parallel writes
- Graceful shutdown support with shouldStop() checks in parallel execution
- Progress logging maintains visibility during concurrent operations
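The temp file + rename pattern and the `flush()` behaviour mentioned in this description can be sketched like this. The class `CheckpointWriter` and the helper `loadCheckpoint` are hypothetical names, and serializing saves through a promise chain is one possible way to avoid overlapping writes, not necessarily the approach taken in the PR.

```typescript
import { readFile, rename, writeFile } from "node:fs/promises"

// Hypothetical helper; class and method names are illustrative only.
export class CheckpointWriter {
  private readonly path: string
  // Tail of the write chain; each save waits for the previous one.
  private tail: Promise<void> = Promise.resolve()

  constructor(path: string) {
    this.path = path
  }

  save(state: unknown): Promise<void> {
    // Serialize right away so later mutations of `state` are not captured.
    const json = JSON.stringify(state, null, 2)
    const write = this.tail.then(async () => {
      const tmp = `${this.path}.tmp`
      await writeFile(tmp, json, "utf8")
      // On POSIX filesystems rename() swaps the file in a single step, so a
      // reader sees the old checkpoint or the new one, not a partial write.
      await rename(tmp, this.path)
    })
    // Keep the chain usable after a failed save; the caller still sees
    // the rejection through the returned promise.
    this.tail = write.catch(() => undefined)
    return write
  }

  // Resolves once every save queued so far has finished.
  flush(): Promise<void> {
    return this.tail
  }
}

export async function loadCheckpoint<T>(path: string): Promise<T> {
  return JSON.parse(await readFile(path, "utf8")) as T
}
```

Because saves are queued in call order, the file on disk after `flush()` reflects the last `save()` call even when many workers trigger saves at once.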
## Testing
- Tested with various parallelism configurations
- Verified checkpoint integrity under concurrent writes
- Confirmed graceful stop functionality works with parallel execution
```typescript
export interface Provider {
  name: string
  prompts?: ProviderPrompts
  defaultParallelism?: ParallelismConfig
```
@Dhravya code looks great!
"parallelism" feels vague and is also technically imprecise:
what we are doing is concurrency (single-threaded with async operations).
Popular libraries such as fastq and p-limit also use the term "concurrency".
The ingest and indexing concurrency settings could be merged: if a provider can handle x bandwidth for ingestion, then it can handle x indexing requests too [this will minimise user friction]
✅ Search should stay separate, as providers can usually handle higher search bandwidth
✅ Answer and evaluate should stay separate [as we could customise the judge and answering models]
```tsx
<div className="mt-6 pt-4 border-t border-[#333333]">
  <button
    type="button"
    onClick={() => setShowPerformanceSettings(!showPerformanceSettings)}
    className="flex items-center gap-2 text-sm font-medium text-text-primary mb-3 hover:text-accent transition-colors"
  >
    <svg className={`w-4 h-4 transition-transform ${showPerformanceSettings ? "rotate-90" : ""}`} fill="none" viewBox="0 0 24 24" stroke="currentColor">
      <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M9 5l7 7-7 7" />
    </svg>
    Performance Settings
  </button>

  {showPerformanceSettings && (
    <div className="ml-6 space-y-3 p-4 bg-[#1a1a1a] border border-[#333333] rounded">
      <p className="text-xs text-text-muted">
        Configure parallelism for this run. Leave empty to use source run settings or provider defaults.
      </p>

      <div className="grid grid-cols-2 gap-4">
        <div>
          <label className="block text-sm font-medium text-text-primary mb-2">
            Default Parallelism
          </label>
          <input
            type="number"
            className="w-full px-3 py-2 text-sm bg-[#222222] border border-[#444444] rounded text-text-primary focus:outline-none focus:border-accent"
            value={form.parallelism.default ?? ""}
            onChange={(e) => setForm({
              ...form,
              parallelism: { ...form.parallelism, default: e.target.value ? parseInt(e.target.value) : undefined }
            })}
            placeholder="1 (sequential)"
            min="1"
          />
          <p className="text-xs text-text-muted mt-1">Applies to all phases unless overridden</p>
        </div>

        <div className="flex items-end">
          <button
            type="button"
            onClick={() => setShowPerPhaseSettings(!showPerPhaseSettings)}
            className="text-sm text-accent hover:text-accent/80 transition-colors mb-2"
          >
            {showPerPhaseSettings ? "Hide" : "Show"} per-phase settings
          </button>
        </div>
      </div>

      {showPerPhaseSettings && (
        <div className="grid grid-cols-3 gap-3 pt-2 border-t border-[#333333]">
          {(["ingest", "indexing", "search", "answer", "evaluate"] as const).map(phase => (
            <div key={phase}>
              <label className="block text-xs font-medium text-text-secondary mb-1 capitalize">
                {phase}
              </label>
              <input
                type="number"
                className="w-full px-2 py-1.5 text-sm bg-[#222222] border border-[#444444] rounded text-text-primary focus:outline-none focus:border-accent"
                value={form.parallelism[phase] ?? ""}
                onChange={(e) => setForm({
                  ...form,
                  parallelism: { ...form.parallelism, [phase]: e.target.value ? parseInt(e.target.value) : undefined }
                })}
                placeholder="—"
                min="1"
              />
            </div>
          ))}
        </div>
      )}

      <div className="flex items-start gap-2 p-3 bg-blue-500/5 border border-blue-500/20 rounded">
        <svg className="w-4 h-4 text-blue-400 mt-0.5 flex-shrink-0" fill="none" viewBox="0 0 24 24" stroke="currentColor">
          <path strokeLinecap="round" strokeLinejoin="round" strokeWidth={2} d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z" />
        </svg>
        <div className="text-xs text-blue-200">
          <strong>Recommendations:</strong> Ingest/Indexing: 50-200, Search: 20-50, Answer/Evaluate: 10-20
        </div>
      </div>
    </div>
  )}
</div>
```
I don't know if this is useful in the UI: once the provider sets it in code for the first time, they're unlikely to ever change it again.
If it is kept, the UI can be simpler: a toggle button to enable parallelism, an icon to edit the value inline, and an "Advanced settings" link that opens a dropdown on click (and hides the default number and icon).
[toggle button] Concurrent requests: 0 <pencil icon> [TAB space] [TAB space] [Advanced settings]
[small description text]
On dropdown:
Ingest: 0 <pencil icon>
Search: 0 <pencil icon>
Answer: 0 <pencil icon>
Evaluate: 0 <pencil icon>
This follows the two-click rule for adding a feature.
```tsx
{run.parallelism && (run.parallelism.default !== undefined ||
  run.parallelism.ingest !== undefined ||
  run.parallelism.indexing !== undefined ||
  run.parallelism.search !== undefined ||
  run.parallelism.answer !== undefined ||
  run.parallelism.evaluate !== undefined) && (
  <div className="p-4 bg-[#1a1a1a] border border-[#333333] rounded">
    <h3 className="text-sm font-semibold text-text-primary mb-3">Performance Configuration</h3>
    <div className="grid grid-cols-6 gap-3 text-xs">
      {run.parallelism.default !== undefined && (
        <div>
          <span className="text-text-muted">Default:</span>
          <span className="ml-2 text-text-primary font-medium">{run.parallelism.default}</span>
        </div>
      )}
      {(["ingest", "indexing", "search", "answer", "evaluate"] as const).map(phase => (
        run.parallelism?.[phase] !== undefined && (
          <div key={phase}>
            <span className="text-text-muted capitalize">{phase}:</span>
            <span className="ml-2 text-text-primary font-medium">{run.parallelism[phase]}</span>
          </div>
        )
      ))}
    </div>
  </div>
)}
```
We probably don't need to show it on the runId page.
Yes, I also think that the UI itself doesn't look good xD
