Skills

The Agent Framework introduces skills as on-demand instructions the agent loads when needed. The design question is: what makes a good skill, and how do you decide what belongs in a skill versus in the system prompt?

Why skills exist

Skills are the primary mechanism for keeping the system prompt lean. Anything that is a step-by-step procedure, a playbook, or reference material that is only needed sometimes — that is a skill, not an instruction.

Without skills, every procedure the agent might ever need would have to live in the system prompt. The agent would carry instructions for every workflow, every system, every edge case — all the time, whether it needs them or not. The system prompt would grow until the important things get lost in the noise.

Skills solve this. Each skill captures the instructions for one specific aspect of the agent’s work. The agent loads the right skill at the right time, and the system prompt stays focused on what matters in every interaction.

The boundary with instructions

The test is simple:

“Does the agent need this for every interaction?”

Yes → It belongs in instructions.

No, only for specific tasks → It is a skill.

A blog writing procedure? Skill. A refund process? Skill. How to deploy to production? Skill.

Standing duties like “keep responses concise” or “always confirm before deleting shared documents”? Those apply every time — they are instructions.

When in doubt, start by putting the content in instructions, and extract it into a skill once the system prompt starts to feel long. Extracting later is a normal part of the process; it is how most well-designed agents take shape.

Three design surfaces

A skill has three parts, and each one is a design surface — a place where your choices affect how well the skill works.
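The three parts can be pictured as fields on a single record. This is a minimal sketch, not the framework's actual schema: the `Skill` class and its field names are hypothetical, and the content string is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical container for the three design surfaces of a skill."""
    title: str                           # part 1: short label the agent scans
    description: str                     # part 1: when to use it, what it provides
    content: str                         # part 3: the procedure read after loading
    argument_hint: Optional[str] = None  # part 2: optional, not every skill needs it


refund_skill = Skill(
    title="Refund process",
    description="Step-by-step procedure for handling a customer refund request.",
    content="(procedure steps go here)",  # placeholder, not a real procedure
    argument_hint="customer name, order ID, and reason for refund",
)
```

Making `argument_hint` the only field with a default mirrors the fact that it is the one optional part.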

1. Title and description — the spine label

The title and description are what the agent sees when deciding whether to load a skill. Think of them as the label on the spine of a book. The agent scans its available skills, reads the labels, and picks the one that fits the task.

A vague label means the skill rarely gets loaded. A sharp one means the agent loads it at exactly the right moment.

Bad:

Marketing stuff

The agent has no idea when to reach for this.

Good:

Blog post workflow — Step-by-step process for writing a blog post following our content guidelines and brand voice.

The agent knows exactly when this skill is relevant and what it will get from loading it.
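To make the spine-label idea concrete, here is a rough sketch of the list an agent might scan before loading anything. `SkillLabel` and `render_catalog` are hypothetical names, and the one-line-per-skill format is an assumption; the point is only that the skill's content is absent at this stage.

```python
from typing import NamedTuple


class SkillLabel(NamedTuple):
    """Hypothetical label: the only part of a skill visible before loading."""
    title: str
    description: str


def render_catalog(labels):
    """Format the labels as one line each, the way an agent might scan them."""
    return "\n".join(f"- {label.title}: {label.description}" for label in labels)


catalog = render_catalog([
    SkillLabel("Marketing stuff", ""),  # vague: nothing says when to use it
    SkillLabel(
        "Blog post workflow",
        "Step-by-step process for writing a blog post following our "
        "content guidelines and brand voice.",
    ),
])
print(catalog)
```

Read side by side, the first entry gives the agent nothing to match a task against, while the second states both the trigger and the payoff.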

2. Argument hint — the intake form

When the agent loads a skill, it can pass context along — like filling out a cover sheet before handing off a task. The argument hint tells the agent what information to include.

Before you start, you will need: customer name, order ID, and reason for refund.

The argument hint is optional; not every skill needs one. But for skills that operate on specific inputs, a clear argument hint makes the agent more reliable. Without it, the agent might load the skill and only then realize it is missing key information.
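The intake-form idea can be sketched as a check that runs before the skill is loaded. Everything here is an assumption for illustration: the source describes the hint as free text, so the structured `required` tuple, the `ArgumentHint` class, and the `missing_inputs` helper are hypothetical, and the customer values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgumentHint:
    """Hypothetical structured form of an argument hint."""
    text: str             # what the agent reads
    required: tuple = ()  # assumed machine-checkable list of input names


def missing_inputs(hint, provided):
    """Return required names that are absent or empty, in declared order."""
    return [name for name in hint.required if not provided.get(name)]


refund_hint = ArgumentHint(
    text="Before you start, you will need: customer name, order ID, "
         "and reason for refund.",
    required=("customer_name", "order_id", "reason"),
)

# Illustrative values only.
print(missing_inputs(refund_hint, {"customer_name": "Ada", "order_id": "A-1001"}))
# prints ['reason']
```

Catching the gap up front lets the agent ask for the missing detail before it starts the procedure rather than halfway through.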

3. Content — the procedure

The actual instructions. This is what the agent reads after loading the skill. It should be clear, structured, and actionable — everything the agent needs to execute the task well.

Good skill content reads like a well-written handoff document. Any competent agent should be able to pick it up and follow it without needing additional context from the system prompt.

Design principles

Self-contained. A skill should work on its own. It should not assume context that is not there. If a skill depends on information from the system prompt to make sense, that dependency should be explicit — or the information should be included in the skill itself.

One domain per skill. Do not create grab-bag skills that cover multiple unrelated topics. Each skill covers one procedure or one knowledge area. If you find a skill growing to cover “everything about marketing,” split it into specific skills: one for blog posts, one for social media campaigns, one for email sequences.

Written for the agent, not for humans. Skills are read by the agent, not by your team. Write them in the way the agent will best understand and follow — clear steps, explicit decision points, concrete examples of expected output. Skip the organizational context a human reader would want.

When skills get loaded

Skills load through the same mechanism as any tool: the agent decides to load a skill based on the current task and that skill’s title and description. Once loaded, the skill’s content becomes part of the conversation context for the rest of that session.

This means the agent can load multiple skills in a single conversation if the task requires it. A user might ask the agent to “write a blog post about our new feature and then schedule it for publication.” The agent could load a blog writing skill and a content scheduling skill, using both in sequence.

It also means skills add to the context length as they are loaded. This is another reason to keep skills focused — a skill that includes pages of rarely needed reference material wastes context on information the agent may not use.
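The loading behavior and its context cost can be modeled in a few lines. This is a sketch under stated assumptions, not the framework's implementation: the `Session` class and its methods are hypothetical, size is measured in characters rather than tokens, and skipping a repeated load is an assumed design choice.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical conversation session that tracks loaded skill content."""
    skills: dict                                 # title -> content
    context: list = field(default_factory=list)  # text currently in context
    loaded: set = field(default_factory=set)     # titles already loaded

    def load_skill(self, title):
        """Tool-style call: add a skill's content to the context once."""
        if title not in self.skills:
            raise KeyError(f"unknown skill: {title}")
        if title not in self.loaded:  # assumption: reloading adds nothing
            self.context.append(self.skills[title])
            self.loaded.add(title)
        return self.skills[title]

    def context_size(self):
        """Rough size in characters; a real system would count tokens."""
        return sum(len(chunk) for chunk in self.context)


session = Session(skills={
    "Blog post workflow": "(blog writing procedure)",   # placeholder content
    "Content scheduling": "(scheduling procedure)",     # placeholder content
})
session.load_skill("Blog post workflow")
session.load_skill("Content scheduling")
print(session.context_size())
```

Because loaded content stays for the rest of the session, the size only ever grows, which is why long reference material inside a skill is costly.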
