v0.5.2 — Increase Maze Size
Added
- Enhanced Maze Size Support: Increased maximum maze size from 32×32 to 64×64
  - Updated validation constants in strategic_evaluator.py
  - Modified prompt.md to reflect new size constraints
  - Updated README.md documentation
- Multiple Run Support: Leaderboard now supports multiple runs per model
  - Migrated from single-result dict format to list-based storage
  - Enhanced get_rankings() to include all runs for each model
  - Added migration logic for backward compatibility
  - Updated ingest_manual_output to generate timestamped output files
- Improved Rescoring: Enhanced --rescore functionality
  - Searches for multiple benchmark output files per model
  - Preserves existing metadata (timing, token usage) during rescore
  - Better error handling and file discovery
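
The move from single-result dicts to list-based storage can be pictured with a minimal sketch like the one below. It is not the project's code: the function names `migrate_leaderboard` and `rank_all_runs`, and the `"score"` key, are assumptions chosen for illustration, and the real get_rankings() may sort and shape its output differently.

```python
from typing import Any


def migrate_leaderboard(data: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Return a copy where every model maps to a list of runs.

    Legacy entries (a single result dict per model) are wrapped in a
    one-element list; entries that are already lists are copied as-is.
    """
    migrated: dict[str, list[dict[str, Any]]] = {}
    for model, entry in data.items():
        if isinstance(entry, list):
            migrated[model] = list(entry)
        else:
            migrated[model] = [entry]
    return migrated


def rank_all_runs(data: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """Flatten every run of every model and sort by score, best first.

    Assumes each run dict carries a numeric "score" key (hypothetical).
    """
    rows = [
        (model, run)
        for model, runs in migrate_leaderboard(data).items()
        for run in runs
    ]
    return sorted(rows, key=lambda row: row[1]["score"], reverse=True)
```

Running the migration on load, rather than rewriting files up front, is one way to keep older leaderboard data readable without a separate upgrade step.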
Changed
- Leaderboard Storage: Migrated from single-result to multiple-run format
  - Supports historical comparison and progress tracking
  - Maintains backward compatibility with existing data
- Input Processing: Enhanced pattern matching and file handling
  - Updated regex patterns for better model detection
  - Added timestamp support for multiple manual runs
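
As an illustration of timestamp support for manual runs, the sketch below builds a unique output path per run. The helper name `timestamped_output_path`, the `.json` extension, and the `model_YYYYMMDD_HHMMSS` naming scheme are all assumptions; the naming that ingest_manual_output actually uses is not specified in these notes.

```python
import re
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def timestamped_output_path(
    output_dir: str, model_name: str, now: Optional[datetime] = None
) -> Path:
    """Build an output path that differs for each run of the same model.

    `now` can be injected for testing; it defaults to the current UTC time.
    """
    moment = now if now is not None else datetime.now(timezone.utc)
    stamp = moment.strftime("%Y%m%d_%H%M%S")
    # Replace characters that are awkward in file names (hypothetical rule).
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", model_name)
    return Path(output_dir) / f"{safe_name}_{stamp}.json"
```

Because the timestamp has one-second resolution, two runs started within the same second would still collide; a real implementation might add a counter or finer-grained suffix.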
Fixed
- Regex Pattern Issues: Improved model header detection in manual output ingestion
- File Overwriting: Prevented multiple manual runs from overwriting previous results
- Metadata Preservation: Fixed token usage and timing data loss during rescoring
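
The metadata-preservation fix can be sketched as a merge step that recomputes scores but carries timing and token usage over from the stored run. The function name `rescore_run` and the keys `"timing"` and `"token_usage"` are hypothetical stand-ins; the actual field names in the benchmark output files may differ.

```python
from typing import Any, Iterable


def rescore_run(
    existing: dict[str, Any],
    new_scores: dict[str, Any],
    preserved_keys: Iterable[str] = ("timing", "token_usage"),
) -> dict[str, Any]:
    """Combine freshly computed scores with metadata from the stored run.

    Values under `preserved_keys` always come from `existing` when present,
    so a rescore cannot blank out data that was only captured at run time.
    """
    updated = dict(new_scores)
    for key in preserved_keys:
        if key in existing:
            updated[key] = existing[key]
    return updated
```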
Improved
- Scalability: Support for larger, more complex mazes (up to 64×64)
- Performance: Better file handling for multiple benchmark outputs
- User Experience: Enhanced error messages and file management
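
For reference, a size check against the new 64×64 limit might look like the sketch below. The constant and function names are illustrative rather than taken from strategic_evaluator.py, and the lower bound of 1 is an assumption, since these notes only state the maximum.

```python
MAX_MAZE_SIZE = 64  # maximum stated in this release (previously 32)
MIN_MAZE_SIZE = 1   # assumption: the notes do not state a minimum


def validate_maze_size(width: int, height: int) -> None:
    """Raise ValueError if either dimension is outside the allowed range."""
    for name, value in (("width", width), ("height", height)):
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if not MIN_MAZE_SIZE <= value <= MAX_MAZE_SIZE:
            raise ValueError(
                f"{name} must be between {MIN_MAZE_SIZE} and "
                f"{MAX_MAZE_SIZE}, got {value}"
            )
```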