Skills in CC have been a bit frustrating for me. They don't trigger reliably and the emphasis on "it's just markdown" makes it harder to have them reliably call certain tools with the correct arguments.
The idea that agent harnesses should primarily have their functionality dictated by plaintext commands feels like a copout around programming in some actually useful, semi-opinionated functionality (not to mention that it makes capability-discoverability basically impossible). For example, Claude Code has three modes: plan, ask about edits, and auto-accept edits. I always start with a plan and then I end up with multiple tasks. I'd like to auto-accept edits for one step at a time, and the only way to do that is to ask CC to stop after each step, but that's not reliable: sometimes it just continues into the next step. If this were programmed explicitly into CC rather than relying on agent obedience, we could ditch the nondeterminism and just have a hook on task completion that toggles auto-accept back to "off."
giancarlostoro 9 minutes ago [-]
Are you using either CLAUDE.md or .claude/INSTRUCTIONS.md to direct Claude about the different agents?
Also, be aware that when you add new instructions, Claude will NOT have them in its context window until you tell it to reread these files OR you start a new CC session. This was a bit frustrating for me because it was not immediately obvious.
chickensong 12 minutes ago [-]
> sometimes it just continues to go into the next step
Use a structured workflow that loops on every task and includes a pause for user confirmation at the end. Enforce it with a hook. I'm not sure if you can toggle auto-accept this way, but I think the end result is what you're asking for.
I use this with great success, sometimes toggling auto-accept on when confidence is high that Claude can complete a step without guidance, and toggling it off when confidence is low and I want to slow down and steer, with Claude stopping between the steps. Now that prompt suggestions are a thing, you can just hit enter on the suggested prompt to continue.
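A rough sketch of that loop-and-pause workflow, written as harness-side Python rather than against Claude Code's actual hook interface (which this assumes nothing about). `StepGate`, `run_plan`, and the `"auto"`/`"manual"`/`"stop"` answers are all hypothetical names for illustration. The point is only that the reset to "off" happens in code after every step, not at the model's discretion:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepGate:
    """Holds a plan and guarantees auto-accept is off after each step."""
    steps: List[str]
    auto_accept: bool = False
    completed: List[str] = field(default_factory=list)

    def run_next(self, execute: Callable[[str], None]) -> bool:
        """Run exactly one step, then reset auto-accept. False when the plan is empty."""
        if not self.steps:
            return False
        step = self.steps.pop(0)
        execute(step)
        self.completed.append(step)
        # The "hook on task completion": deterministic, not left to the agent.
        self.auto_accept = False
        return True


def run_plan(gate: StepGate,
             execute: Callable[[str], None],
             confirm: Callable[[str], str]) -> None:
    """Pause before every step and ask the user how to proceed."""
    while gate.steps:
        # confirm() is assumed to return "auto", "manual", or "stop".
        choice = confirm(gate.steps[0])
        if choice == "stop":
            break
        gate.auto_accept = (choice == "auto")
        gate.run_next(execute)
```

Here `execute` stands in for whatever actually runs a step, and `confirm` for the pause where the user picks a mode. Neither is a real CC API.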
btbuildem 12 minutes ago [-]
> idea that agent harnesses should primarily have their functionality dictated by plaintext commands feels like a copout
I think it's more along the lines of acknowledging the fast-paced changes in the field, and refusing to cast into code something that's likely to rapidly evolve in the near future.
Once things settle down into tested practices, we'll see more "permanent" instrumentation arise.
daturkel 9 minutes ago [-]
Surely this logic doesn't apply if we're to believe that "code is cheap" now :p
siquick 10 minutes ago [-]
> Skills in CC have been a bit frustrating for me. They don't trigger reliably
Referencing them in AGENTS/CLAUDE.md has increased their usage for me.
Frannky 35 minutes ago [-]
I think unless you're doing simple tasks, skills are unreliable. For better reliability, I have the agent trigger APIs that handle the complex logic (and their own LLM calls) internally. Has anyone found a solid strategy for making complex 'skills' more dependable?
chickensong 5 minutes ago [-]
Is it that the skills aren't being triggered reliably, or that they get triggered but the skill itself is complex and doesn't work as expected?
plufz 18 minutes ago [-]
My only strategy is what used to be called slash-commands but are also skills now, i.e. I call them explicitly. I think that actually works quite well, and you can allow specific tools and tell it to use specific hooks for security or validation in the frontmatter properties.
PantaloonFlames 54 minutes ago [-]
You can publish scripts with skills you author, right? With carefully constructed markdown that should allow the agent to call tools the right way.
DarmokJalad1701 28 minutes ago [-]
You can write skills that have an associated js/python/whatever script.
RyanShook 47 minutes ago [-]
So far my experience with skills is that they slow down or confuse agents unless you as the user understand what the skill actually contains and how it works. In general I would rather install a CLI tool and explain to the agent how I want it used vs. trying to get the agent to use a folder of instructions whose contents I don't really understand.
giancarlostoro 6 minutes ago [-]
> So far my experience with skills is that they slow down or confuse agents unless you as the user understand what the skill actually contains and how it works. In general I would rather install a CLI tool and explain to the agent how I want it used vs. trying to get the agent to use a folder of instructions that I don't really understand what's inside.
For Claude Code I add the tooling into either CLAUDE.md or .claude/INSTRUCTIONS.md which Claude reads when you start a new instance. If you update it, you MUST ask Claude to reread the file so it knows the full instructions.
airstrike 41 minutes ago [-]
Most LLM "harnessing" seems very lazy and bolted on. You can build much more robustly by leveraging a more complex application layer where you can manage state, but I guess people struggle building that
mccoyb 29 minutes ago [-]
Skills feel analogous to behavioral programs. If you give an agent access to a programmable substrate (e.g. bash + CLI tools), you write these Markdown programs which are triggered and read when the agent thinks certain behaviors will be beneficial.
It's a great idea: really neat take on programmability, and can be reloaded while the agent is running without tweaking the harness, etc -- lots of benefits.
`pi` has a great skills implementation too.
I think skills might really shine if you take a minimal approach to the system prompt (like `pi`) -- a lot of the time, if I want to orchestrate the agent in some complex behavior, I want to start fresh and have it walk through a bunch of skills ... possibly the smaller the system prompt, the more likely the agent is to follow the skills without issue.
evalstate 7 minutes ago [-]
Yes -- skills live in a special gap between "should have been a deterministic program" and "model already had the ability to figure this out". My personal experience leaves me in agreement that minimal system prompts are definitely the way to go.
umairnadeem123 33 minutes ago [-]
the standardization angle is interesting but the real value is in skills that bundle executable scripts alongside the markdown. pure instruction-based skills are fragile because the agent has to interpret intent each time. when the skill includes a concrete shell script or python tool the agent just needs to know when to call it and what args to pass - way more deterministic.
we settled on a pattern where the SKILL.md mostly just describes the interface to a script in the same folder. the agent rarely drifts when it has a concrete tool to invoke vs trying to follow multi-step prose instructions.
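A minimal sketch of what the script side of that pattern might look like, assuming a made-up `word_stats` tool. The tool name, its flags, and the output shape are all hypothetical. The SKILL.md would then only need to say when to call it and which arguments it takes, and `argparse` rejects anything outside that interface:

```python
import argparse
import json
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Define the whole interface in one place so SKILL.md can mirror it."""
    parser = argparse.ArgumentParser(
        prog="word_stats",
        description="Count words and lines in a piece of text.",
    )
    parser.add_argument("--text", required=True, help="Text to analyze")
    parser.add_argument("--format", choices=["json", "plain"], default="json",
                        help="Output format")
    return parser


def run(argv: Optional[List[str]] = None) -> str:
    """Parse arguments and return the report; bad arguments exit with an error."""
    args = build_parser().parse_args(argv)
    stats = {
        "words": len(args.text.split()),
        "lines": len(args.text.splitlines()),
    }
    if args.format == "json":
        return json.dumps(stats, sort_keys=True)
    return f"words={stats['words']} lines={stats['lines']}"
```

Saved as a script next to SKILL.md, the file would end with `print(run())`, so the agent's only job is to produce a command line like `word_stats --text "..." --format json`.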
chasd00 43 minutes ago [-]
> is there a mechanism to pin a version, or is it always HEAD? Skills that evolve can silently break downstream workflows.
don't forget these skills are just text that goes into the llm for it to read, interpret, and then produce text that then gets executed in bash. The more intricate and specific the skill definition the more likely the model is to miss something or not follow it exactly.
I actually think SKILL.md is such a janky way of doing this sort of thing, let alone the fact that it's reliant on the oh-so-brittle Python ecosystem. Also, way too much context/tokens gets eaten up by something that could be piece-wise programmatically injected into the token stream.
Imo a bad idea, but alas.
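To make "piece-wise programmatically injected" concrete, here is one way a harness could do it. This is a sketch under assumptions and not how any existing agent loads skills. `Skill`, `select_skills`, and `build_context` are hypothetical, and the trigger check is a naive exact-word match:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Skill:
    name: str
    triggers: Tuple[str, ...]  # lowercase words that should pull this skill in
    body: str                  # the instructions that would otherwise always be loaded


def select_skills(prompt: str, skills: List[Skill]) -> List[Skill]:
    """Pick only the skills whose trigger words appear in the prompt."""
    words = set(prompt.lower().split())
    return [s for s in skills if words & set(s.triggers)]


def build_context(prompt: str, skills: List[Skill]) -> str:
    """Prepend the matching skill bodies to the prompt; skip everything else."""
    chosen = select_skills(prompt, skills)
    sections = [f"## {s.name}\n{s.body}" for s in chosen]
    return "\n\n".join(sections + [prompt])
```

Only the matching skill bodies cost tokens, and the selection is deterministic. The tradeoff is that keyword matching misses paraphrases a model would catch.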
esafak 34 minutes ago [-]
It is not dependent on any language.
PantaloonFlames 53 minutes ago [-]
Wait - how are skills dependent on Python?
Isn’t Python just an option?
dvt 44 minutes ago [-]
It is, but to do the "useful" stuff, it's more or less mandatory. Either Python or bash scripts (which are equally janky tbh).
armcat 2 hours ago [-]
I think you are spot on there, and I am not sure such things exist (yet), but I may be wrong. Some random thoughts:
1. Using the skills frontmatter to implement a more complex YAML structure, so e.g.
Except they're still not accepting any feedback around AGENTS.md as a standard. You need to explicitly symlink CLAUDE.md to AGENTS.md in a workspace in order for Claude to work like every other agent when it comes to loading context.
For example, on Proposal: AgentFile — Declarative Agent Composition from Skills + Filesystem-Native Skill Delivery
https://github.com/agentskills/agentskills/discussions/179
1. Using the skills frontmatter to implement a more complex YAML structure, so e.g.
2. Using a skills lock file ;-)
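A lock file along the lines of item 2 could be as simple as one content hash per skill folder. The sketch below assumes a layout of one directory per skill under a common root, with the lock stored as JSON outside that root. The function names and file layout are made up for illustration and are not part of any skills spec:

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def hash_skill(skill_dir: Path) -> str:
    """SHA-256 over every file in the skill folder, visited in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        # Include the relative path so renames change the hash too.
        digest.update(path.relative_to(skill_dir).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def write_lock(skills_root: Path, lock_path: Path) -> Dict[str, str]:
    """Record the current hash of every skill directory under skills_root."""
    lock = {d.name: hash_skill(d)
            for d in sorted(skills_root.iterdir()) if d.is_dir()}
    lock_path.write_text(json.dumps(lock, indent=2, sort_keys=True))
    return lock


def verify_lock(skills_root: Path, lock_path: Path) -> List[str]:
    """Return the names of locked skills that are missing or have changed."""
    lock = json.loads(lock_path.read_text())
    return [name for name, expected in lock.items()
            if not (skills_root / name).is_dir()
            or hash_skill(skills_root / name) != expected]
```

Running `verify_lock` before a session starts would turn a silently changed skill into an explicit failure, which is the downstream-breakage concern raised earlier in the thread.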