Evaluating Kimi 2.5 vs Kimi 2.6: What happens to agent skills when the model gets smarter?
Story Overview
When a stronger model ships, there are two questions every skill author should want answered, and evals are the only honest way to answer either:
- Which skills just got absorbed? A model that now knows how to do X natively does not need a skill telling it to do X. Fewer skills to maintain, leaner context, lower cost.
- Which skills still matter? Behaviour-level guidance (conventions, preferences, project-specific workflows) is not somethi