ACC: Compiling Agent Trajectories into Long-Context QA for Direct Reasoning
ACC compiles multi-turn agent tool-calling trajectories into long-context QA pairs, teaching models to integrate evidence scattered across a long context. After ACC training, Qwen3-30B-A3B improves by 18.1 points on MRCR, approaching the 235B version.
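To make the compilation step described above concrete, here is a minimal sketch of one way a trajectory could be folded into a single QA pair. It is not the authors' implementation. The `Turn` and `QAPair` types, the `compile_trajectory` function, and the mapping itself (first user turn as the question, last assistant turn as the answer, tool outputs concatenated as the context) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    """One step of a hypothetical agent trajectory."""
    role: str                       # "user", "assistant", or "tool"
    content: str
    tool_name: Optional[str] = None  # set only for tool turns


@dataclass
class QAPair:
    context: str
    question: str
    answer: str


def compile_trajectory(turns: List[Turn]) -> QAPair:
    """Fold a multi-turn trajectory into one long-context QA pair.

    Assumed scheme (not taken from the source): the question is the first
    user turn, the answer is the last assistant turn, and the context is
    every tool output, in order, labeled as a separate document.
    """
    question = next(t.content for t in turns if t.role == "user")
    answer = next(t.content for t in reversed(turns) if t.role == "assistant")
    tool_turns = [t for t in turns if t.role == "tool"]
    documents = [
        f"[Document {i} | source: {t.tool_name}]\n{t.content}"
        for i, t in enumerate(tool_turns, start=1)
    ]
    return QAPair(context="\n\n".join(documents), question=question, answer=answer)


# Toy trajectory with invented content, for illustration only.
trajectory = [
    Turn("user", "Which city hosted the conference where the paper was presented?"),
    Turn("assistant", "Searching for the paper."),
    Turn("tool", "The paper appeared at ExampleConf 2020.", tool_name="search"),
    Turn("assistant", "Looking up the venue."),
    Turn("tool", "ExampleConf 2020 was held in Lisbon.", tool_name="search"),
    Turn("assistant", "Lisbon"),
]
pair = compile_trajectory(trajectory)
print(pair.context)
print("Q:", pair.question)
print("A:", pair.answer)
```

In this sketch the intermediate assistant turns are dropped, so a model trained on the resulting pair has to connect the two documents itself rather than follow the agent's step-by-step tool calls.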