v0.6.0 — Fix Prompt
Added
- Strategic Maze Evaluator v2.0: Complete rewrite of the maze evaluator system
  - Advanced stateful pathfinding algorithm with multi-dimensional state tracking
  - Strategic element integration: teleporters (O/Q), switches (s), movable blocks (B)
  - Conditional door logic: X (2+ keys), Y (switch), Z (switch + key)
  - Enhanced innovation scoring for creative strategic element usage
  - Timeout protection for complex maze solving (5-second limit)
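The stateful pathfinding and conditional door rules above can be illustrated with a minimal sketch. This is not the code in strategic_evaluator.py; it is a plain breadth-first search whose state includes collected keys and the switch flag. The symbols `S` (start), `E` (exit), `#` (wall), and `k` (key), the function names, and the choice that keys are not consumed by doors are all assumptions made for illustration. Teleporters and movable blocks are left out to keep the example short.

```python
from collections import deque

def can_pass(cell, keys, switch_on):
    """Door rules from the changelog: X needs 2+ keys, Y the switch, Z switch plus a key."""
    if cell == "#":  # assumed wall symbol
        return False
    if cell == "X":
        return len(keys) >= 2
    if cell == "Y":
        return switch_on
    if cell == "Z":
        return switch_on and len(keys) >= 1
    return True

def solve(grid):
    """Return the fewest steps from S to E, or None if the maze is unsolvable."""
    rows = [list(r) for r in grid]
    start = next((r, c) for r, row in enumerate(rows)
                 for c, ch in enumerate(row) if ch == "S")
    # State = (row, col, positions of collected keys, switch flag).
    init = (start[0], start[1], frozenset(), False)
    queue = deque([(init, 0)])
    seen = {init}
    while queue:
        (r, c, keys, switch_on), dist = queue.popleft()
        if rows[r][c] == "E":
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < len(rows) and 0 <= nc < len(rows[nr])):
                continue
            cell = rows[nr][nc]
            if not can_pass(cell, keys, switch_on):
                continue
            # Assumption: stepping on 'k' collects it, stepping on 's' latches the switch on.
            nkeys = keys | {(nr, nc)} if cell == "k" else keys
            nswitch = switch_on or cell == "s"
            state = (nr, nc, nkeys, nswitch)
            if state not in seen:
                seen.add(state)
                queue.append((state, dist + 1))
    return None
```

Because the visited set is keyed on the full state rather than on position alone, the search can revisit a cell after picking up a key or flipping the switch, which is what lets it find routes that backtrack through a door.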
Changed
- Scoring System Overhaul: Replaced gradient scoring with a strategic innovation approach
  - Strategic Innovation: +15 teleporter, +20 switch, +25 conditional door points
  - Route Complexity: Bonuses for key collection chains and switch activation
  - Bonus Objectives: +75 points per optional exit (F/G/H)
  - Strategic Danger: Capped at 30 points to favor quality over quantity
- Maze Size Support: Increased maximum maze dimensions from 32×32 to 64×64
  - Updated validation constants in strategic_evaluator.py
  - Enhanced scalability for larger, more complex strategic mazes
- Prompt Refinement: Streamlined and clarified maze generation instructions
  - Simplified strategic elements description and logic
  - Enhanced guidance for conditional door requirements
  - Clearer constraint specifications for maze solvability
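As a rough illustration of how the point values listed under Scoring System Overhaul might combine, here is a small sketch. The constants come from this changelog; the `MazeUsage` container, the `strategic_score` function, and the assumption that each bonus is awarded once per element used are hypothetical. Route complexity bonuses are omitted because the changelog does not give their values.

```python
from dataclasses import dataclass

# Point values taken from the changelog entries above.
TELEPORTER_POINTS = 15
SWITCH_POINTS = 20
CONDITIONAL_DOOR_POINTS = 25
OPTIONAL_EXIT_POINTS = 75
DANGER_CAP = 30

@dataclass
class MazeUsage:
    """Hypothetical summary of what a solved maze actually used."""
    teleporters: int = 0
    switches: int = 0
    conditional_doors: int = 0
    optional_exits: int = 0   # exits F/G/H reached
    raw_danger: float = 0.0   # uncapped danger score

def strategic_score(usage):
    """Sum the per-element bonuses and apply the danger cap."""
    innovation = (usage.teleporters * TELEPORTER_POINTS
                  + usage.switches * SWITCH_POINTS
                  + usage.conditional_doors * CONDITIONAL_DOOR_POINTS)
    bonus = usage.optional_exits * OPTIONAL_EXIT_POINTS
    danger = min(usage.raw_danger, DANGER_CAP)  # capped at 30 points
    return innovation + bonus + danger
```

Under these assumptions, a maze using one teleporter, one switch, one conditional door, and two optional exits with a raw danger of 50 would score 15 + 20 + 25 + 150 + 30 = 240.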
Fixed
- Pathfinding State Management: Resolved teleporter and switch state tracking issues
- Maze Solvability: Enhanced validation to ensure all strategic elements work correctly
- Character Validation: Improved handling of strategic maze characters and edge cases
- Performance Issues: Optimized complex maze solving with proper timeout handling
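One common way to bound solving time, shown here only as a sketch, is to check a monotonic-clock deadline inside the search loop. The helper name `bfs_with_deadline`, its callback parameters, and the status strings are hypothetical; only the 5-second default reflects the limit stated in this changelog.

```python
import time
from collections import deque

def bfs_with_deadline(start, neighbors, is_goal, limit_seconds=5.0):
    """Breadth-first search that gives up once the time budget is spent.

    Returns ("solved", state), ("unsolvable", None), or ("timeout", None).
    """
    deadline = time.monotonic() + limit_seconds
    queue = deque([start])
    seen = {start}
    while queue:
        # Check the clock on every expansion so a huge state space cannot hang the caller.
        if time.monotonic() >= deadline:
            return ("timeout", None)
        state = queue.popleft()
        if is_goal(state):
            return ("solved", state)
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return ("unsolvable", None)
```

Reporting a timeout separately from an unsolvable result lets a caller tell a maze that has no solution apart from one that is simply too expensive to verify within the limit.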
Improved
- AI Model Challenge: Higher intelligence ceiling with strategic puzzle complexity
- Benchmark Quality: Assesses true strategic reasoning rather than basic maze completion
- Evaluation Accuracy: More precise scoring of creative strategic element usage
- System Reliability: Robust handling of complex strategic element interactions
Updated
- Leaderboard Results: Refreshed with new evaluator scoring
  - gemini 3 pro preview: 1446.46 points (51.4s)
  - deepseek v3.2: 1338.83 points (152.0s)
- Strategic Elements: Enhanced teleporter, switch, and conditional door mechanics