RL-based TSP with Hardness-Adaptive Curriculum
Overview
An undergraduate thesis project that applies reinforcement learning (RL) to the Travelling Salesman Problem (TSP), using a hardness-adaptive curriculum to stabilize training.
Approach
- Designed a curriculum that adjusts graph hardness based on agent performance.
- Implemented RL baselines and an evaluation pipeline to compare sample efficiency and solution quality.
- Integrated 1.5-approximation heuristics (the ratio guaranteed by Christofides-style algorithms on metric instances) into the reward signal to guide exploration.
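
The curriculum and reward ideas above can be illustrated with a small sketch. It is not the thesis implementation, and several parts are assumptions made only for illustration: hardness is mapped to the number of cities, the thresholds and the names `HardnessCurriculum`, `sample_instance`, and `shaped_reward` are hypothetical, and a nearest-neighbour tour stands in for the reference heuristic. Nearest neighbour carries no 1.5-approximation guarantee; an algorithm with that guarantee would take its place in practice.

```python
import math
import random


def tour_length(points, tour):
    """Total Euclidean length of the closed tour `tour` over `points`."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def nearest_neighbour_tour(points):
    """Stand-in reference heuristic (assumption: NOT a 1.5-approximation)."""
    unvisited = set(range(1, len(points)))
    tour = [0]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def shaped_reward(points, agent_tour, reference_tour):
    """Relative gap to the reference tour; positive if the agent is shorter."""
    ref = tour_length(points, reference_tour)
    return (ref - tour_length(points, agent_tour)) / ref


def sample_instance(hardness, rng, min_cities=10, max_extra=40):
    """Assumed hardness model: more cities as hardness grows from 0 to 1."""
    n = min_cities + round(hardness * max_extra)
    return [(rng.random(), rng.random()) for _ in range(n)]


class HardnessCurriculum:
    """Hypothetical scheduler that moves hardness with recent mean reward."""

    def __init__(self, step=0.05, raise_above=-0.05, lower_below=-0.25, window=20):
        self.hardness = 0.0
        self.step = step
        self.raise_above = raise_above
        self.lower_below = lower_below
        self.window = window
        self.history = []

    def update(self, reward):
        """Record one episode reward; adjust hardness once per full window."""
        self.history.append(reward)
        if len(self.history) < self.window:
            return self.hardness
        mean = sum(self.history) / len(self.history)
        if mean > self.raise_above:
            # Agent is close to the reference: make instances harder.
            self.hardness = min(1.0, self.hardness + self.step)
        elif mean < self.lower_below:
            # Agent is struggling: back off to easier instances.
            self.hardness = max(0.0, self.hardness - self.step)
        self.history.clear()
        return self.hardness


if __name__ == "__main__":
    rng = random.Random(0)
    curriculum = HardnessCurriculum(window=5)
    for episode in range(20):
        points = sample_instance(curriculum.hardness, rng)
        reference = nearest_neighbour_tour(points)
        # A random permutation stands in for the RL policy's output here.
        agent_tour = list(range(len(points)))
        rng.shuffle(agent_tour)
        curriculum.update(shaped_reward(points, agent_tour, reference))
    print("final hardness:", curriculum.hardness)
```

In a training loop, the tour produced by the policy would replace the random permutation, and the value returned by `shaped_reward` would feed both the policy update and `HardnessCurriculum.update`. Adjusting hardness only once per window is one way to keep the curriculum from reacting to single noisy episodes.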
Results
- Improved training stability over greedy roll-out baselines.
- Demonstrated higher-quality tours on medium-sized TSP instances.
Poster & Certificate
- Poster — View poster
- Poster presentation certificate — View certificate