v0.9.0 — Add Model Skipping
[0.9.0] add model skipping
Added
- Model Skip System: New models_skip.txt file for tracking failed or problematic models
- Automatic movement of failed models from models.txt to models_skip.txt
- Prevents re-testing of consistently failing models
- Improved batch execution efficiency by avoiding known problematic models
- Automatic Model Management: Enhanced benchmark runner with smart model handling
- move_model_to_skip() function for automated model organization
- Both parallel and sequential execution modes now automatically skip failed models
- Clear feedback when models are moved to skip list during execution
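As a rough illustration of the skip mechanism described above, the following is a minimal sketch of what a function like move_model_to_skip() might do. The signature, default file paths, return value, and message wording are assumptions made for this example; the actual implementation in run_benchmark.py may differ.

```python
from pathlib import Path


def move_model_to_skip(model, models_file="models.txt", skip_file="models_skip.txt"):
    """Move `model` from the active list to the skip list.

    Hypothetical sketch: returns True if the model was found in the
    active list and moved, False otherwise.
    """
    models_path = Path(models_file)
    skip_path = Path(skip_file)

    # Read the active list; a missing file is treated as empty.
    active = models_path.read_text().splitlines() if models_path.exists() else []
    if model not in [line.strip() for line in active]:
        return False

    # Rewrite the active list without the failed model.
    remaining = [line for line in active if line.strip() != model]
    models_path.write_text("".join(line + "\n" for line in remaining))

    # Record the model in the skip list unless it is already there.
    skipped = skip_path.read_text().splitlines() if skip_path.exists() else []
    if model not in [line.strip() for line in skipped]:
        skipped.append(model)
        skip_path.write_text("".join(line + "\n" for line in skipped))

    # Feedback so the user can see why the model disappeared from the run.
    print(f"Moved {model} to {skip_path.name}")
    return True
```

A runner could call this in its failure branch, in both parallel and sequential modes, so that later batches no longer pick up the failing model.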
Changed
- Model List Optimization: Streamlined models.txt from 29 to 10 active models
- Removed 19 underperforming or consistently failing models
- Focused on higher-quality models for more reliable benchmarking
- Moved removed models to new models_skip.txt for reference
- Leaderboard Expansion: Added 7 new model results to leaderboard
- New entries include google/gemini-2.0-flash-exp, nvidia/nemotron-nano-9b-v2, and others
- Updated rankings reflect latest benchmark performance
- String Formatting: Fixed escape character issues throughout run_benchmark.py
- Corrected \\n to \n for proper newline handling
- Improved regex patterns for better model detection and time extraction
- Enhanced string formatting consistency across the codebase
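The escape fix above is easy to misread, so here is a small sketch of the difference in a regular (non-raw) Python string literal. The sample text is invented for illustration and is not taken from run_benchmark.py.

```python
# In a normal string literal, "\\n" is two characters (a backslash and "n"),
# while "\n" is a single newline character.
double_escaped = "Model: a\\nTime: 1.5s"
fixed = "Model: a\nTime: 1.5s"

# The double-escaped version stays on one line, so line-based parsing
# sees a single record containing a literal backslash.
assert len("\\n") == 2 and len("\n") == 1
assert len(double_escaped.splitlines()) == 1

# The corrected version splits into the two intended lines.
assert fixed.splitlines() == ["Model: a", "Time: 1.5s"]
```

Any output written or parsed with the double-escaped form would show a literal \n instead of a line break, which is consistent with the newline handling issue described above.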
Improved
- Execution Efficiency: Faster batch processing by avoiding known failing models
- Error Handling: Better feedback when models are automatically moved to skip list
- Code Quality: Cleaner string handling and more robust regex patterns
- Leaderboard Accuracy: More reliable rankings with focus on consistently performing models