v0.2.0 — Add Leaderboard
[0.2.0] add leaderboard
Added
- Live benchmarking: Direct API calls for model benchmarking (removed in a later version)
- Leaderboard system (leaderboard.py): Track and compare model scores across benchmarks
- JSON-based persistent storage (leaderboard.json)
- Markdown table generation (LEADERBOARD.md)
- Batch benchmarking: --run-all flag to test all models from models.txt
- Parallel execution by default for faster testing
- --sequential flag for one-at-a-time testing
- Timing metrics: Track and display elapsed time (seconds) for each model response
- New CLI options: --model, --leaderboard, --add-to-leaderboard, --run-all, --sequential
- .gitignore for excluding generated files
- models.txt for configuring which models to benchmark
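
The batch options above could be wired together roughly as follows. This is a minimal sketch, not the code in run_benchmark.py: only the flag names come from this changelog, while the argument types, the models.txt format (one model per line, '#' comments skipped), the use of a thread pool, and the helper names build_parser, parse_model_list, timed_call and run_all are assumptions made for illustration.

```python
import argparse
import time
from concurrent.futures import ThreadPoolExecutor


def build_parser() -> argparse.ArgumentParser:
    # Flag names are taken from the changelog; types and help text are assumed.
    parser = argparse.ArgumentParser(description="Benchmark runner (illustrative)")
    parser.add_argument("--model", help="benchmark a single model")
    parser.add_argument("--leaderboard", action="store_true", help="show the leaderboard")
    parser.add_argument("--add-to-leaderboard", action="store_true", help="record results")
    parser.add_argument("--run-all", action="store_true", help="benchmark every model in models.txt")
    parser.add_argument("--sequential", action="store_true", help="run models one at a time")
    return parser


def parse_model_list(text: str) -> list[str]:
    # Assumed file format: one model name per line; blank lines and '#' comments are skipped.
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def timed_call(query, model):
    # Measure the elapsed wall-clock time, in seconds, of one model response.
    start = time.perf_counter()
    response = query(model)
    return {"model": model, "response": response, "elapsed_s": time.perf_counter() - start}


def run_all(models, query, sequential=False):
    # Parallel by default; sequential=True mirrors the --sequential flag.
    if sequential:
        return [timed_call(query, m) for m in models]
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        # pool.map returns results in the same order as the input models.
        return list(pool.map(lambda m: timed_call(query, m), models))
```

Here query stands in for whatever function sends the prompt to a model. Threads are a reasonable fit for this kind of workload because each call spends most of its time waiting on the network.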
Changed
- run_benchmark.py now supports file-based benchmarking with enhanced parsing
- Token usage tracking (prompt, completion, total) in leaderboard results
- Updated requirements.txt with requests dependency
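
To make the leaderboard entries concrete, here is one way the JSON storage, token usage fields and Markdown table could fit together. It is a sketch under stated assumptions rather than the contents of leaderboard.py: the entry field names, the one-row-per-model rule, the sort order and the functions load_leaderboard, add_entry, save_leaderboard and to_markdown are all hypothetical.

```python
import json
from pathlib import Path


def load_leaderboard(path):
    # A missing file is treated as an empty leaderboard.
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def add_entry(entries, model, score, elapsed_s, prompt_tokens, completion_tokens):
    # Field names are assumed; the changelog only says that prompt, completion
    # and total token counts are tracked alongside the results.
    entry = {
        "model": model,
        "score": score,
        "elapsed_s": elapsed_s,
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }
    # Assumption: one row per model, so a new result replaces the old one.
    kept = [e for e in entries if e["model"] != model]
    kept.append(entry)
    return sorted(kept, key=lambda e: e["score"], reverse=True)


def save_leaderboard(entries, path):
    Path(path).write_text(json.dumps(entries, indent=2), encoding="utf-8")


def to_markdown(entries):
    # Render the entries, in their current order, as a Markdown table.
    lines = [
        "| Rank | Model | Score | Time (s) | Prompt | Completion | Total |",
        "|---|---|---|---|---|---|---|",
    ]
    for rank, e in enumerate(entries, start=1):
        lines.append(
            f"| {rank} | {e['model']} | {e['score']:.2f} | {e['elapsed_s']:.2f} "
            f"| {e['prompt_tokens']} | {e['completion_tokens']} | {e['total_tokens']} |"
        )
    return "\n".join(lines) + "\n"
```

A typical flow under these assumptions would be to load leaderboard.json, call add_entry once per benchmarked model, save the result, and write the output of to_markdown to LEADERBOARD.md.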