A Philosophy of Skill Design

130 skills, and the one feedback loop I never built.

May 31, 2026

I went to move my skills to Codex last week, and they disappeared.

Not deleted. Just invisible. Claude Code reads my skills the way I expect, and Codex didn’t register them the same way. So I sat there about to wire a few back up by hand, and had Codex crawl the system first to show me what I’d actually be porting.

Over 130 skills.

I scrolled the list. Three of them wrap Cloudflare. One does WAF and page rules, one pulls metrics, one is scoped to my personal domains because the auth is different from work. I have two separate ways to ask for a PR review. I have a MongoDB skill, a Prisma-schema skill, and a data-architect skill that all reach for the same corner of my brain and half the same work.

That is the maintenance backlog, hiding in plain sight. Not bugs. Not broken commands. Just choices I pushed onto future-me because making a new skill was cheaper than deciding where the old one belonged. Every duplicate is a little tax: which review skill do I want, which Cloudflare context am I in, which data-layer instinct am I actually trying to invoke?

I built every one of these on purpose. Each solved a real, repeated workflow the day I made it. No single one ever felt like too much, which is exactly how you end up with 130. The collection turned into something I can’t hold in my head. I’m the index now, and the index is out of date.

The loop with no brake

Every other part of my harness has a forcing function. Code gets reviewed. Tests fail in CI. Infrastructure breaks in production and pages me at a bad hour. Even a sloppy prompt punishes you in the same breath, because the output comes back wrong and you feel it.

A skill I never open again does none of that. It costs nothing I can see. It sits in the folder, quietly, one more thing I have to know exists.

Donella Meadows had a name for this: reinforcing loops amplify whatever the system is already doing, and balancing loops pull back toward a limit. Healthy systems run both. A reinforcing loop with no balancing loop is just runaway growth waiting for a wall.

My skill creation is a strong reinforcing loop. I run /reflect obsessively. Any time I catch myself steering the model, correcting the same thing twice, nudging it back on track, I capture that steer. Most of the time /reflect deepens a skill I already have. Sometimes it spins up a new one. Either way the library gets richer and the count goes up. (The Simplest Feedback Loop was about exactly this engine.)

What I never built is the balancing loop. Nothing in my setup steps back across the whole library and says “you have three Cloudflare skills now, make them one,” or “this skill hasn’t fired in four months, retire it.” /reflect works one skill at a time. It makes each skill better. It has no opinion about how many skills there should be.
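A minimal sketch of what such a balancing pass could look like, in Python. Everything here is an assumption for illustration: the `Skill` record, the hand-assigned `category` tag, the `last_used` date (which would need a usage log that the setup described above does not have), and the skill names themselves. The 120-day threshold stands in for "four months".

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Skill:
    name: str        # hypothetical skill name, e.g. "cloudflare-waf"
    category: str    # hand-assigned tag (assumption), e.g. "cloudflare"
    last_used: date  # assumed to come from some usage log


def audit(skills, today, stale_after=timedelta(days=120), max_per_category=1):
    """Return (merge candidates by category, names of stale skills)."""
    by_category = defaultdict(list)
    for skill in skills:
        by_category[skill.category].append(skill.name)

    # Any category holding more skills than allowed is a merge candidate.
    merge = {
        category: sorted(names)
        for category, names in by_category.items()
        if len(names) > max_per_category
    }
    # Any skill unused for longer than the threshold is a retire candidate.
    retire = sorted(s.name for s in skills if today - s.last_used > stale_after)
    return merge, retire


skills = [
    Skill("cloudflare-waf", "cloudflare", date(2026, 5, 20)),
    Skill("cloudflare-metrics", "cloudflare", date(2026, 1, 10)),
    Skill("cloudflare-personal", "cloudflare", date(2026, 5, 2)),
    Skill("pr-review", "review", date(2026, 5, 30)),
]
merge, retire = audit(skills, today=date(2026, 5, 31))
print(merge)   # categories with more than one skill
print(retire)  # skills unused for more than 120 days
```

The point of the sketch is the shape, not the thresholds: it looks across the whole library at once, which is the one thing a per-skill tool cannot do.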

Narrow, not shallow

It would be easy to call these skills shallow, and it would be wrong. Most of them aren’t thin. They do real work.

John Ousterhout, in A Philosophy of Software Design, splits modules into deep and shallow. His mantra is that “the best modules are deep,” by which he means a lot of functionality sitting behind a simple interface. A shallow module is the opposite, a complicated interface wrapped around not much capability. Depth is how much you get for how little you have to know.

My skills aren’t shallow. They’re scoped to a tool instead of a category. A MongoDB skill, a Prisma skill, a data-architect skill, when what I want is one skill for the data layer that knows which tool to reach for. A WAF skill and a metrics skill, when I want one Cloudflare skill that takes the auth and the capability as parameters. Each one is fine alone. Together they push the choosing onto me. Which of the three do I invoke? That question is interface complexity, and I pay it every single time.
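As a rough illustration of "one Cloudflare skill that takes the auth and the capability as parameters", here is a Python sketch of the interface shape. It does not call any real Cloudflare API. The profile names, the environment-variable names, and the `waf`, `metrics`, and `cloudflare` functions are all hypothetical placeholders.

```python
# Hypothetical auth profiles: each maps to the name of an environment
# variable that would hold the token. These names are assumptions.
AUTH_PROFILES = {
    "work": "CF_WORK_TOKEN",
    "personal": "CF_PERSONAL_TOKEN",
}


def waf(token_env: str, target: str) -> str:
    # Placeholder for the WAF and page-rules workflow.
    return f"[{token_env}] review WAF and page rules for {target}"


def metrics(token_env: str, target: str) -> str:
    # Placeholder for the metrics workflow.
    return f"[{token_env}] pull metrics for {target}"


# One registry of capabilities replaces one skill per capability.
CAPABILITIES = {"waf": waf, "metrics": metrics}


def cloudflare(capability: str, auth: str, target: str) -> str:
    """Single entry point: the caller names what and as whom, not which skill."""
    if capability not in CAPABILITIES:
        raise ValueError(f"unknown capability: {capability!r}")
    if auth not in AUTH_PROFILES:
        raise ValueError(f"unknown auth profile: {auth!r}")
    return CAPABILITIES[capability](AUTH_PROFILES[auth], target)


print(cloudflare("waf", "personal", "example.com"))
```

The choice between work and personal, or WAF and metrics, still exists, but it moves from "which skill do I invoke" into two arguments on one door.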

So the fix for 130 tool-scoped skills is not 130 better-named tool-scoped skills. Renaming, tagging, a tidy index, a find-skills lookup, those make the sprawl searchable. They don’t make it smaller, and searchable sprawl is still sprawl. The fix is fewer, deeper skills. Category over tool. A dozen deep doors instead of a hundred narrow ones.

What collapsing looks like

When I do sit down and consolidate, the work is straightforward. I pull project-level skills up to a global home so the same capability isn’t redefined in four repos. I put the collection in git so there’s one source of truth instead of a folder that drifts on every machine. I delete the lazy ones. I generalize the survivors so they take auth or environment as a parameter instead of spawning a new skill every time the context shifts.
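The first step, finding capabilities redefined across repos, is mechanical enough to sketch. This Python example assumes you already have a mapping from repo name to the set of skill names defined in it; how that mapping is collected depends on where each tool keeps its skills, so it is left out. The repo and skill names are made up.

```python
from collections import defaultdict


def redefined_skills(repos: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return skill names defined in more than one repo, with the repos that define them."""
    owners = defaultdict(list)
    for repo, names in repos.items():
        for name in names:
            owners[name].append(repo)
    # Keep only the names with two or more owners: candidates to move to a global home.
    return {name: sorted(found_in) for name, found_in in owners.items() if len(found_in) > 1}


# Hypothetical inventory of project-level skills.
repos = {
    "api": {"pr-review", "mongodb"},
    "web": {"pr-review"},
    "infra": {"cloudflare-waf"},
}
print(redefined_skills(repos))
```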

But that’s me, by hand, on the rare afternoon I notice. It is not a loop. It does not run on its own.

Sharing is the brake

The surprise: the balancing loop I won’t run for myself, sharing runs for me.

My friend Scott publishes his skills openly. Part of that is building in public, which he does on principle. But part of it, he’ll tell you, is that he loses track of where things live on his own machines. The public repo is external memory. It’s the index he can’t hold in his head, sitting somewhere he can actually go and look.

Pushing a skill where another person might read it changes what you’re willing to ship. The private dialect, the hardcoded path, the name that only makes sense to you on a Tuesday, all of it surfaces the moment a second reader is implied. You generalize because you have to. You delete the lazy ones because you’d be embarrassed to publish them.

It changes how I consume skills too. A skill is a frozen opinion about how to do something. Import a lazy one and you’ve adopted someone else’s problem as your own. The bar has to be visible reasoning, some sign the author hit a wall and adjusted, not just a confident description wrapped around a thin prompt. refactor-first-principles is one I built from Dan Shipper’s thinking, and I kept it because the reasoning was visible, not because the name sounded good.

Writing for one other person is the balancing loop, smuggled in through the side door of not wanting to look sloppy.

The brake is a separate system

I still haven’t finished moving my skills to Codex. But the failed port did one useful thing. It forced me to count, and counting is the first balancing move there is.

The creation engine is the easy half, and it’s the half everyone builds first. /reflect, skillify, the make-a-skill habit, all reinforcing, all pulling the same direction. None of them has a brake. The brake is a separate system you have to build on purpose, and if you don’t, the library grows until something external builds it for you. A new runtime that won’t read your folder. A teammate who has to use them. A morning you go looking for the skill you know you wrote and build it a second time because finding it was harder than remaking it.

Codex not reading the folder was annoying. It was also the first honest review the folder had received.

Engineered Intelligence
