Claude Mythos METR Evaluation: Autonomous Task Time Doubles Past 16 Hours, Marking the Watershed from Assistant to Independent Worker
METR's evaluation shows Claude Mythos Preview exceeding 16 hours of autonomous task time, which reaches the benchmark's current ceiling. The leap from AI assistant to autonomous worker is under way.