ChaoBro

SkillsVote: Adding a "Voting System" to AI Agent Skills for Self-Evolution Without Model Updates

After Claude Code's skills directory gained popularity, various skill management solutions emerged in the community. However, a fundamental issue is rarely discussed: How should an Agent's skill library be governed?

As skills accumulate, they become redundant, vary in quality, and introduce complex environment dependencies. Indiscriminately updating the skill library can actually "pollute" the context for subsequent executions. The SkillsVote paper published today by IAAR-Shanghai and Memtensor Research Group tackles exactly this problem.

What SkillsVote Does

At its core, SkillsVote converts an Agent's execution trajectories into reusable skills (Agent Skills) and manages this transformation through a "vote-attribute-admit" mechanism.

Pre-Execution: Structured Skill Library Search

Before executing a task, SkillsVote searches a structured skill library and surfaces the relevant skill instructions to the Agent. This isn't simple keyword matching; retrieval also takes into account each skill's environmental requirements, quality score, and verifiability.
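
To make the retrieval idea concrete, here is a minimal sketch in Python. It is not the paper's implementation: the Skill fields, the retrieve function, and the scoring weights are all hypothetical, chosen only to show how environment requirements can act as a hard filter while quality and verifiability adjust the ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    keywords: frozenset[str]   # terms describing what the skill does
    requires: frozenset[str]   # environment requirements, e.g. {"git"}
    quality: float             # assumed quality score in [0, 1]
    verifiable: bool           # whether the skill's outcome can be checked


def retrieve(task_terms: set[str], env: set[str], library: list[Skill],
             top_k: int = 3) -> list[Skill]:
    """Rank skills that are relevant to the task and runnable in this environment."""
    scored = []
    for skill in library:
        if not skill.requires <= env:
            continue  # hard filter: the environment lacks something the skill needs
        overlap = len(task_terms & skill.keywords)
        if overlap == 0:
            continue  # no relevance to the task at all
        # Hypothetical weighting: relevance scaled by quality, small bonus if verifiable.
        score = overlap * skill.quality * (1.2 if skill.verifiable else 1.0)
        scored.append((score, skill))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [skill for _, skill in scored[:top_k]]


library = [
    Skill("git-bisect-regression", frozenset({"git", "regression"}), frozenset({"git"}), 0.9, True),
    Skill("docker-rebuild", frozenset({"docker", "build"}), frozenset({"docker"}), 0.8, True),
    Skill("git-blame-hunt", frozenset({"git"}), frozenset({"git"}), 0.5, False),
]
results = retrieve({"git", "regression"}, {"git", "python"}, library)
print([skill.name for skill in results])
```

In this toy library the Docker skill is dropped because the environment has no Docker, and the verifiable, higher-quality git skill ranks first.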

Post-Execution: Trajectory Decomposition & Attribution

After task completion, SkillsVote breaks down the Agent's complete trajectory into skill-associated subtasks, then conducts attribution analysis on the results:

  • How much credit goes to the skills used?
  • How much stems from the Agent's own exploration?
  • How much is due to environmental factors?
  • How much comes from execution result signals?
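
The article does not describe how the paper computes attribution, so the following Python sketch swaps in a plain counting heuristic purely for illustration. The Subtask type, the four source labels, and the attribute function are assumptions: each successful subtask is credited to the single source that mainly drove it, and the shares are normalized.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical labels mirroring the four questions above.
SOURCES = ("skill", "exploration", "environment", "result_signal")


@dataclass(frozen=True)
class Subtask:
    description: str
    source: str       # one of SOURCES: what mainly drove this step
    succeeded: bool


def attribute(trajectory: list[Subtask]) -> dict[str, float]:
    """Return the share of successful subtasks credited to each source."""
    wins = Counter(step.source for step in trajectory if step.succeeded)
    total = sum(wins.values())
    if total == 0:
        return {source: 0.0 for source in SOURCES}
    return {source: wins[source] / total for source in SOURCES}


trajectory = [
    Subtask("locate failing test", "skill", True),
    Subtask("bisect commits", "skill", True),
    Subtask("try alternative patch", "exploration", True),
    Subtask("cached build happened to be warm", "environment", True),
    Subtask("apply stale migration recipe", "skill", False),
]
shares = attribute(trajectory)
print(shares)
```

A real system would likely use finer-grained credit than one label per step; the point here is only that failed steps earn no credit and that the shares sum to one.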

Admission: Evidence-Gated Updates

Only successful, reusable discoveries can pass through the "evidence gate" to enter the skill library. This prevents low-quality or accidentally successful skills from being included.
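
One way to picture an evidence gate is as a set of thresholds a candidate skill must clear before admission. The Python sketch below is an assumption-laden illustration, not the paper's criteria: the Candidate fields and the three thresholds (minimum successes, minimum success rate, minimum number of distinct tasks) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    successes: int = 0       # runs where the candidate was used and the task succeeded
    failures: int = 0        # runs where it was used and the task failed
    distinct_tasks: int = 0  # how many different tasks it helped with


def admit(c: Candidate, min_successes: int = 3, min_rate: float = 0.8,
          min_tasks: int = 2) -> bool:
    """Admit a candidate only when the evidence is repeated, reliable, and reusable."""
    trials = c.successes + c.failures
    if trials == 0:
        return False  # no evidence at all
    return (c.successes >= min_successes
            and c.successes / trials >= min_rate
            and c.distinct_tasks >= min_tasks)


lucky = Candidate("one-off-fix", successes=1, failures=0, distinct_tasks=1)
solid = Candidate("retry-flaky-test", successes=5, failures=1, distinct_tasks=3)
print(admit(lucky), admit(solid))
```

The one-off success is rejected even though its success rate is perfect, which is the "accidentally successful" case the gate is meant to screen out.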

Experimental Results

Scenario            Baseline                          Improvement
Offline Evolution   GPT-5.2 + Terminal-Bench 2.0      +7.9 pp
Online Evolution    Frozen Model + SWE-Bench Pro      +2.6 pp

A key takeaway: Model weights do not need to be updated. Through a well-governed external skill library, even a frozen Agent can achieve capability improvements.

Million-Scale Skill Corpus

A less prominent highlight of the paper is the team's systematic analysis of a million-scale open-source skill corpus, which it profiles along three dimensions: environmental requirements, quality, and verifiability. The dataset is itself a significant asset for Agent research.

One-Sentence Summary

SkillsVote essentially answers the question: How should Agent skills be "cultivated"? More skills are not automatically better, and neither are faster updates. What is needed is a governance system that is selective, attribution-driven, and has clear admission thresholds. This approach is highly valuable for anyone building an Agent platform.

Main Sources:

  • arXiv:2605.18401 - SkillsVote Paper
  • IAAR-Shanghai / Memtensor Research Group