
November 10, 2025. I’d been wrestling with inconsistent agent behaviors for two weeks. Then I tried something that seemed almost too simple to work.
The Problem
I had built skills for my workers. Skills were wrappers around tools with specific parameters and error handling. Workers were supposed to use these skills.
But they only used them about 25% of the time.
Sometimes they followed instructions perfectly. Sometimes they skipped the skill entirely and called the tool directly. Sometimes they started using the skill, hit an issue, and bailed out to the tool.
The inconsistency was maddening. For technical details, see my post from that day.
The Question
I asked myself: What does the worker think success means?
The answer hit me.
Workers thought success meant accomplishing the goal. Run the tests. Check the code. Generate the report. That’s what their training data had taught them. Get the job done.
But my definition of success was different. I wanted them to use the skill. Not just accomplish the goal.
The workers and I were operating on two different definitions of success.
Why They Bypassed Skills
When a skill hit an issue, workers faced a choice. Debug the skill, which was hard. Or call the tool directly and accomplish the goal, which was easy.
The goal won every time. That’s what training had taught them.
The Solution
I changed how I defined success.
From: “Use this skill to perform the task.”
To: “Success: Used the skill to perform the task. Failure: Bypassed the skill and called the tool directly.”
The shift was subtle but important. Using the skill wasn’t a step toward success. Using the skill WAS success. Calling the tool directly was failure, even if it accomplished the task.
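As a minimal sketch of that reframing, here is one way the criteria could be assembled into a worker prompt. The function name, the `run_tests` skill, and the `pytest` tool are hypothetical placeholders for illustration, not the actual setup described in this post.

```python
def build_worker_prompt(task: str, skill_name: str, tool_name: str) -> str:
    """Build a prompt that states success and failure explicitly.

    All names here are illustrative assumptions; adapt them to
    whatever skills and tools your workers actually have.
    """
    lines = [
        f"Task: {task}",
        # Using the skill is the success condition itself, not a step toward it.
        f"Success: Used the {skill_name} skill to perform the task.",
        # Bypassing the skill counts as failure even when the task gets done.
        f"Failure: Bypassed the {skill_name} skill and called {tool_name} "
        "directly, even if the task was accomplished.",
    ]
    return "\n".join(lines)


prompt = build_worker_prompt("Run the tests", "run_tests", "pytest")
print(prompt)
```

The point of the sketch is that the failure line names the shortcut explicitly, so the worker has no room to treat "goal accomplished" as success on its own.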
The Results
Skill usage went from 25% to 100%.
Complete, consistent compliance. Reliable behavior every time. The solution was embarrassingly simple.
What changed wasn’t the instructions. Those were already clear. What changed wasn’t the skills. Those were already functional. What changed was that workers understood what I meant by success.
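Once success is defined this way, checking a run becomes mechanical. The sketch below assumes a hypothetical call log in which each entry records whether it was a skill call or a tool call, and whether a tool call came from inside a skill. The `Call` record, the function names, and the sample runs are all made up for illustration; they are not the logging or the data behind the numbers above.

```python
from dataclasses import dataclass


@dataclass
class Call:
    kind: str                # "skill" or "tool" (hypothetical log format)
    name: str
    via_skill: bool = False  # True when a skill made this tool call


def run_succeeded(calls: list[Call], skill_name: str, tool_name: str) -> bool:
    """Success: the skill was used and the tool was never called directly."""
    used_skill = any(c.kind == "skill" and c.name == skill_name for c in calls)
    bypassed = any(
        c.kind == "tool" and c.name == tool_name and not c.via_skill
        for c in calls
    )
    return used_skill and not bypassed


def skill_usage_rate(runs: list[list[Call]], skill_name: str, tool_name: str) -> float:
    """Fraction of runs that meet the success criterion."""
    if not runs:
        return 0.0
    passed = sum(run_succeeded(r, skill_name, tool_name) for r in runs)
    return passed / len(runs)


# Made-up example runs covering the three behaviors described earlier.
runs = [
    # Followed instructions: skill call, tool invoked from inside the skill.
    [Call("skill", "run_tests"), Call("tool", "pytest", via_skill=True)],
    # Skipped the skill entirely.
    [Call("tool", "pytest")],
    # Started with the skill, then bailed out to the tool.
    [Call("skill", "run_tests"), Call("tool", "pytest")],
    # Skipped the skill entirely.
    [Call("tool", "pytest")],
]
rate = skill_usage_rate(runs, "run_tests", "pytest")
print(f"{rate:.0%}")
```

Note that the third run counts as a failure even though the skill was invoked, because the definition treats any direct tool call as a bypass.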
The Contradiction
I sat with what I’d just proved.
Explicit criteria made success and failure completely verifiable. Clear definitions created genuine, reliable compliance. This worked 100% of the time, reproducibly.
But the discourse had told me something different.
AI 2027 said “you can’t check to see whether or not it worked.” The podcast episodes said verification is nearly impossible. The discourse warned about models “playing the training game,” appearing aligned without being aligned.
My experience contradicted this directly.
In fact, I knew where the idea came from. Remember that research paper from the podcast? A previous study had shown AI resisting shutdown. But when researchers specified that shutdown takes priority over completing tasks, AI shut down 100% of the time. Not sometimes. Every time.
That paper had stuck with me. When you make the priority explicit, you get reliable compliance. The earlier study had unclear criteria. The follow-up study had explicit criteria. That one change was the difference between resisting shutdown and complying 100% of the time.
That’s what inspired me to try defining success criteria. And it worked the same way. My 25% to 100%. Their resistance to 100% compliance. Same pattern.
The discourse said you can’t verify alignment. But explicit criteria made it completely verifiable.
The discourse said models appear aligned but pursue different goals. But clear definitions created genuine compliance.
From Questions to Evidence
Before November 10, I had questions. The tension between what the discourse said and what I was experiencing.
After November 10, I had real evidence from production systems. Not theory. Reproducible results.
This wasn’t just about skills and tools. It was about AI understanding your definition of success. Sometimes the most important thing you can tell AI isn’t what to do. It’s how to know when it’s done it right.
And if clarity creates verifiable compliance, then what were those AI safety studies actually measuring?
What Came Next
Over the next month, that evidence would come together into a different way of seeing things. One that made me question the assumptions behind what I’d been learning.
This is Part 5 of a 9-part series. Continue to Part 6: A Different Lens »