create-skill
Creates new agent skills following the Agent Skills specification. Investigates the repo for conventions, designs the skill around progressive disclosure, writes SKILL.md with effective trigger descriptions, and validates with representative prompts. Use when the user wants to create a skill, build a SKILL.md, turn a workflow into a reusable skill, teach the agent a new task, scaffold a new agent capability, has a repeated workflow they want to codify, is frustrated by inconsistent agent behavior, or wants to package expertise for a team.
Create agent skills that are portable, easy to trigger, and cheap to load. A skill is a folder containing a SKILL.md file with YAML frontmatter and markdown instructions.
Workflow
1. Intake (mandatory gate): Understand what the skill should do. Ask at least 3 targeted questions before drafting anything. Collect:
- A short name (lowercase, hyphenated)
- What the skill enables the agent to do
- When it should activate (trigger conditions)
- What success looks like
Summarize your understanding and get explicit confirmation before proceeding. Do not write SKILL.md until the user confirms.
2. Investigate the repo: Before asking questions, search the repo for:
- Existing skills, conventions, and workflow docs
- Scripts, templates, schemas relevant to the target workflow
- Tool or dependency requirements
- Whether the conversation already contains a workflow to capture
3. Clarify: Ask only questions that materially affect the skill. Push until these are clear:
- Required workflow steps and their order
- Required inputs and expected outputs
- Dependencies on tools, scripts, or services
- Whether the skill needs references/, scripts/, or assets/
4. Design the package, using this structure:

skill-name/
├── SKILL.md        # Metadata + core workflow
├── references/     # Detailed docs, loaded on demand
├── scripts/        # Executable code
└── assets/         # Templates, resources

- Keep SKILL.md under 500 lines
- Move bulky detail into references/
- Put deterministic execution in scripts/
- Don't duplicate guidance across files
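The package layout above can be scaffolded with a few lines of Python. This is a minimal sketch, not part of the skill itself: `scaffold_skill` and the stub text are hypothetical names chosen for illustration, and the sketch assumes optional folders are created only when the Clarify step said they are needed.

```python
import tempfile
from pathlib import Path

# Hypothetical stub; the real description is written in the next step.
SKILL_STUB = """---
name: {name}
description: >
  TODO: what the skill does and when to use it.
---
"""


def scaffold_skill(root: Path, name: str,
                   optional_dirs=("references", "scripts", "assets")) -> Path:
    """Create <root>/<name>/ with a stub SKILL.md and the requested subfolders."""
    skill_dir = root / name
    # exist_ok=False: refuse to touch a skill that already exists.
    skill_dir.mkdir(parents=True, exist_ok=False)
    (skill_dir / "SKILL.md").write_text(SKILL_STUB.format(name=name), encoding="utf-8")
    for sub in optional_dirs:
        (skill_dir / sub).mkdir()
    return skill_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_skill(Path(tmp), "migrate-database",
                                 optional_dirs=("references",))
        print(sorted(p.name for p in created.iterdir()))
```

Failing on an existing directory keeps the sketch consistent with the boundary that this skill does not modify existing skills.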
5. Write SKILL.md, using this structure:

---
name: skill-name
description: >
  What the skill does and produces. Use when the user wants to
  <scenario>, mentions <keyword>, or asks about <topic>.
---

Frontmatter rules:
- Required: name and description. Other fields depend on your project's conventions.
Writing rules:
- Description optimizes activation, not teaching. State the job and when to use it in words a user would actually say. Include both actions and situations. Keep workflow details out of the description.
  Bad: Follows a 7-step process to generate SKILL.md files with YAML frontmatter.
  Good: Creates agent skills. Use when the user wants to build a SKILL.md, turn a workflow into a reusable skill, or is frustrated by inconsistent agent behavior.
- Body is procedural and imperative. Tell the agent exactly how to proceed. Don't restate trigger criteria from the description; a "When to use" section in the body duplicates the description.
- Use imperative form. "Do not", "Use", "Run", not "prefer" or "consider".
- Be concise. Terse reminders, not tutorials.
- Include a complete example. One full, copy-paste-ready artifact beats scattered snippets.
- Include a Boundaries section (mandatory). List what the skill DOES and Does NOT do.
- Include a Common Failures section. List 2–3 domain-specific mistakes an agent would make without this guidance.

See references/example-skill.md for a complete finished skill demonstrating these principles.
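The frontmatter rules can be checked mechanically before a skill ships. The sketch below is an illustration under stated assumptions: it reads only simple `key: value` pairs and folded `key: >` blocks rather than full YAML, the function names are hypothetical, and `NAME_PATTERN` is one assumed reading of "lowercase, hyphenated".

```python
import re

# Assumed interpretation of "lowercase, hyphenated": a-z, 0-9, single hyphens.
NAME_PATTERN = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def read_frontmatter(text: str) -> dict:
    """Return top-level frontmatter keys mapped to their joined values.

    Deliberately simplified; use a real YAML parser for anything
    beyond a quick sanity check.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields, current = {}, None
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1] not in (" ", "\t") and ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            value = value.strip()
            # A bare ">" or "|" opens a block whose text follows on indented lines.
            fields[current] = "" if value in (">", "|") else value
        elif current is not None and line.strip():
            fields[current] = (fields[current] + " " + line.strip()).strip()
    return {}  # no closing delimiter found


def check_frontmatter(text: str) -> list:
    """Return a list of problems; an empty list means the check passed."""
    fields = read_frontmatter(text)
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing required field: {required}")
    name = fields.get("name", "")
    if name and not NAME_PATTERN.match(name):
        problems.append(f"name is not lowercase-hyphenated: {name}")
    return problems
```

Other frontmatter fields are ignored on purpose, since the rules above leave them to project conventions.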
6. Validate: Test the skill with representative prompts:
- 2–3 realistic positive prompts (things users would say)
- At least 1 negative prompt (adjacent but shouldn't trigger)
Write a brief validation report noting:
- Which prompts triggered correctly
- Which failed and why (trigger wording, workflow ambiguity, or missing resources)
- What was fixed based on the failures
Skip validation only for trivial skills where the trigger surface is obvious.
Portability check — For distributable skills, verify:
- No hardcoded project-specific paths (use discovery)
- No project-specific terminology (internal jargon)
- No references to specific rules/tools only in your repo
- Instructions work in any repo with any directory layout
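The first portability item lends itself to a quick scan. The following sketch flags likely machine-specific paths in a skill file; the two patterns are assumed heuristics chosen for illustration, not an exhaustive definition of "hardcoded", and `find_hardcoded_paths` is a hypothetical helper.

```python
import re

# Assumed heuristics; extend or tune them for your own environment.
ABSOLUTE_PATH_PATTERNS = [
    re.compile(r"(?<![\w.])/(?:home|Users)/[\w.-]+"),  # Unix-style home directories
    re.compile(r"\b[A-Za-z]:\\[\w.-]+"),                # Windows drive paths
]


def find_hardcoded_paths(text: str) -> list:
    """Return (line_number, matched_text) pairs for likely machine-specific paths."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern in ABSOLUTE_PATH_PATTERNS:
            for match in pattern.finditer(line):
                hits.append((number, match.group(0)))
    return hits
```

Relative paths such as scripts/check.sh pass the scan; only paths rooted in a user's home directory or a drive letter are reported. Jargon and repo-only tool references still need a manual read.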
7. Acknowledge sources: If the skill draws on external practices, create references/ACKNOWLEDGMENTS.md listing each source with a link, license, what was adapted, and the version it was adopted in.
8. Confirm: Show the user the created skill and ask if adjustments are needed.
Boundaries
- DOES create skill directories, SKILL.md, references/, scripts/
- DOES validate with representative prompts
- Does NOT modify existing skills
- Does NOT create rules or profiles (separate workflows)
Example Scenario
User: "Turn my database migration steps into a skill."
→ Investigate repo (Flyway config) → ask about rollback scope
→ create migrate-database/SKILL.md → validate with prompts.
Common Failures
- Description leaks workflow: the agent reads the summary and skips the body, following a shortcut instead of the full procedure.
- Body too abstract to act on: "investigate the problem" isn't actionable. "Run git log --oneline -20 to check recent patterns" is.
- Weak enforcement in instructions: If evals show the agent ignoring a step, add it to a Common Failures section with NEVER/MUST language. Explicit failure modes with strong directives are more effective than polite workflow steps.
Quality Checklist
Before finalizing, use the skill design checklist, skill validation, and token optimization.
references/example-skill.md
Example Skill
A finished skill for running the test suite:
---
name: run-tests
description: >
Runs the project's test suite and reports results. Use when
the user wants to run tests, check if tests pass, verify
changes don't break anything, or asks about test failures.
---
# Run Tests
Run the full test suite, surface failures clearly, and suggest
fixes when the cause is obvious.
## Workflow
1. **Detect the test runner** — Check package.json scripts,
Makefile targets, or pyproject.toml for the test command.
Use the first available of `npm test`, `make test`, or `pytest`, in that order.
2. **Run the suite** — Execute the detected command. Capture
stdout and stderr.
3. **Report results** — If all tests pass, confirm with a
one-line summary. If tests fail, list each failing test
with its error message and the file:line reference.
4. **Suggest fixes** — For failures with an obvious cause
(import error, missing env var, typo), propose a concrete
fix. For ambiguous failures, ask the user before changing
anything.
## Conventions
- Never modify test files to make tests pass.
- Run the full suite unless the user explicitly scopes to a
subset.
This example shows:
- A description with natural trigger phrases and situations
- A tight four-step workflow
- Conventions that constrain behavior without rigid rules
references/skill-design-checklist.md
Skill Design Checklist
Use this before finalizing a generated or revised skill.
Problem and Scope
- Is the target problem concrete and repeatable?
- Is the skill solving one coherent job rather than several?
- Is the audience clear?
- Are out-of-scope cases obvious from the wording?
Trigger Quality
- Does the description say what the skill does?
- Does it say when to use it?
- Does it include phrases a user would actually say?
- Does it avoid vague language like "helps with" or "handles"?
- If over-triggering is a risk, does the description narrow scope clearly?
- Does the body avoid restating trigger criteria already covered by the description?
- Does the description answer both "what" and "when" in a single read? (completeness)
- Could this skill accidentally trigger instead of another skill in the same workspace? Are the trigger terms unique to this skill's domain? (distinctiveness)
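Some of these trigger checks can be approximated in code. The sketch below is a rough lint, not a substitute for prompt testing: it assumes the "Use when ..." wording from the SKILL.md template marks the "when" clause, and it only knows the two vague phrases named in this checklist. `lint_description` is a hypothetical helper.

```python
# The two vague phrases named in the checklist; extend as needed.
VAGUE_PHRASES = ("helps with", "handles")


def lint_description(description: str) -> list:
    """Return warnings for a skill description; an empty list means no findings."""
    text = description.lower()
    warnings = []
    # Assumption: a "Use when" clause is how the description states its triggers.
    if "use when" not in text:
        warnings.append('no "Use when" clause: the description may not say when to activate')
    for phrase in VAGUE_PHRASES:
        if phrase in text:
            warnings.append(f'vague phrase: "{phrase}"')
    return warnings


if __name__ == "__main__":
    for warning in lint_description("Helps with skills and handles SKILL.md files."):
        print(warning)
```

Distinctiveness against other skills in the same workspace is not covered here; that still requires comparing descriptions side by side.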
Workflow Quality
- Are the steps in the correct order?
- Does the skill investigate local context before asking avoidable questions?
- Are decision points and defaults explicit?
- Are required inputs and expected outputs stated?
- Are external dependencies named only when necessary?
Packaging
- Is SKILL.md sufficient on its own?
- If not, are extra details in references/ instead of bloating the main file?
- Are scripts/ included only for deterministic or fragile tasks?
- Are assets/ included only when they materially improve execution?
Validation
- Are there concrete examples or scenarios?
- Does the skill define what success looks like?
- Does it describe how to catch obvious failure modes?
- If assumptions were needed, are they stated explicitly?
Portability
- Is the wording generic unless the user explicitly asked for a repo-bound skill?
- Does the skill avoid environment assumptions it cannot justify?
- If the skill is repo-bound, does it say so plainly?
Consistency
- Does the name follow existing patterns (lowercase-hyphenated)?
- Does the description avoid leaking workflow steps that would let the agent skip the body?
- Is there a companion rule or skill that should be referenced?
- Does the CHANGELOG format match other artifacts?
Validation
- Have you tested with at least one near-miss negative prompt?
- Could someone verify the output is correct without re-reading the whole skill?
Token Budget
- Is the front-loaded cost justified for activation frequency?
- Could any section move to references/ without losing workflow clarity?
Boundaries
- Does the skill state what it DOES?
- Does it state what it Does NOT do?
- Are the boundaries specific enough to prevent scope creep?
- Would an agent know which files and actions are off-limits?
references/skill-validation.md
Skill Validation
Use this when a skill needs prompt-level validation before shipping.
Goal
Prove that the skill:
- Triggers for obvious requests
- Triggers for paraphrased requests
- Does not trigger for nearby but unrelated work
- Gives another agent enough detail to act without guessing
Minimum Prompt Set
Write at least 3 prompts:
- positive-obvious — direct request using the most likely trigger words
- positive-paraphrased — same job, different wording
- negative-adjacent — close enough to confuse a weak description, but should not load the skill. The best negatives are near-misses: queries that share keywords or domain with the skill but need something different. "Write a fibonacci function" is too easy as a negative for a deploy skill — "set up a staging environment" is a real near-miss that tests discrimination.
Add more prompts only when the surface area is large or the user explicitly wants deeper validation.
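A prompt set can be checked for coverage before any simulation is run. This is a small sketch under the assumption that each prompt is labeled with one of the three kinds above; `missing_prompt_kinds` is a hypothetical helper, and the deploy prompts are illustrative.

```python
REQUIRED_KINDS = ("positive-obvious", "positive-paraphrased", "negative-adjacent")


def missing_prompt_kinds(prompts: list) -> list:
    """Given (kind, text) pairs, return the required kinds that have no prompt."""
    present = {kind for kind, _ in prompts}
    return [kind for kind in REQUIRED_KINDS if kind not in present]


if __name__ == "__main__":
    # Illustrative prompts for a hypothetical deploy skill.
    draft_set = [
        ("positive-obvious", "Deploy the app to production"),
        ("negative-adjacent", "Set up a staging environment"),
    ]
    print(missing_prompt_kinds(draft_set))
```

The check only confirms that each kind is represented. Whether a negative is a true near-miss remains a judgment call.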
Comparison Modes
Choose the lightest comparison that answers the risk:
- manual simulation — read the prompt against the skill and judge whether the trigger and workflow would work
- before vs after — compare the current skill against the revised skill
- trigger wording A vs B — use when the main risk is activation quality rather than body content
When Validation Can Be Skipped
Skip only when all of these are true:
- The edit does not change the trigger surface
- The edit does not change the workflow meaning
- The edit does not add or remove important resources
If any of those changed, run at least a lightweight prompt simulation.
Review Checklist
For each prompt, record:
- Should the skill trigger?
- Which words or phrases should cause activation?
- Which part of the body should guide the next step?
- Where could an agent take a shortcut or misread?
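One way to keep these records consistent is a small data structure with one field per question. The sketch below is illustrative only: `PromptReview` and `report` are hypothetical names, and `did_trigger` is an added assumption that records what a manual simulation or eval run actually observed.

```python
from dataclasses import dataclass, field


@dataclass
class PromptReview:
    prompt: str
    should_trigger: bool
    did_trigger: bool  # assumption: observed outcome of a simulation or eval run
    trigger_phrases: list = field(default_factory=list)
    guiding_section: str = ""
    shortcut_risk: str = ""


def report(reviews: list) -> str:
    """Render one status line per prompt plus a pass count."""
    lines = []
    for review in reviews:
        status = "ok" if review.should_trigger == review.did_trigger else "FAIL"
        lines.append(f"[{status}] {review.prompt}")
    passed = sum(1 for r in reviews if r.should_trigger == r.did_trigger)
    lines.append(f"{passed}/{len(reviews)} prompts behaved as expected")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative reviews for the run-tests example skill.
    print(report([
        PromptReview("Run the tests", True, True, ["run", "tests"], "Workflow step 1"),
        PromptReview("Write a new unit test", False, True,
                     shortcut_risk="shares the word 'test' with the description"),
    ]))
```

A FAIL line points back at the description wording when the mismatch is an unwanted trigger, and at missing trigger phrases when an expected trigger did not fire.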
Common failures:
- Description too vague to trigger
- Description so broad that it over-triggers
- Description summarizes workflow, tempting the agent to skip the body
- Body assumes repo facts it never tells the reader to discover
- Examples are longer than the rules they clarify
references/token-optimization.md
Token Optimization
Use this after the skill works. Optimize for lower context cost without reducing execution quality.
Keep in SKILL.md
- The trigger-bearing frontmatter
- The core workflow
- Critical decision rules
- Short examples that anchor the workflow
- Direct links to bundled references
Move Out of SKILL.md
- Long domain primers
- Exhaustive edge-case catalogs
- Variant-specific instructions
- Large examples
- Detailed command references
- Documentation discoverable from the repo at runtime
Compression Rules
- Delete repeated ideas before rewriting sentences
- Prefer short checklists over explanatory paragraphs
- Replace generic advice with workflow-specific rules
- Keep examples only if they teach something not already obvious from the instructions
- Avoid motivational or narrative text
- Prefer one sharp sentence over two soft ones
Smell Tests
The main file is probably too large if:
- Multiple sections repeat the same workflow in different words
- The body restates trigger criteria already in the description (e.g., a "When to use" section that duplicates the description)
- Examples are longer than the instructions they illustrate
- Reference material dominates the core procedure
- The skill explains common concepts instead of workflow-specific guidance
- Multiple sections serve the same purpose (e.g., a quality checklist and a common failures section that overlap)
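Two of these smells are easy to detect automatically. The sketch below assumes the 500-line limit from the design step and treats any markdown heading that starts with "When to use" as a restated trigger section; the remaining smells need a human or agent read. `smell_test` is a hypothetical helper.

```python
import re

MAX_LINES = 500  # limit from the design step: keep SKILL.md under 500 lines

# Matches markdown headings such as "## When to use" (case-insensitive).
WHEN_TO_USE_HEADING = re.compile(r"^#{1,6}\s*when to use\b", re.IGNORECASE | re.MULTILINE)


def smell_test(skill_md: str) -> list:
    """Return the detectable smells found in the text of a SKILL.md file."""
    smells = []
    line_count = len(skill_md.splitlines())
    if line_count >= MAX_LINES:
        smells.append(f"SKILL.md has {line_count} lines; keep it under {MAX_LINES}")
    if WHEN_TO_USE_HEADING.search(skill_md):
        smells.append('body has a "When to use" section that may duplicate the description')
    return smells
```

An empty result does not mean the file is lean. It only rules out the two mechanical cases.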
Final Pass
Ask:
- What text can be deleted with no loss of behavior?
- What text belongs in references/?
- What assumptions should be stated once instead of repeated?
- Is the description still strong enough to trigger correctly after trimming?
create-skill is a meta-skill that creates other agent skills following the Agent Skills specification.
Why a mandatory intake gate before writing
Jumping to writing without understanding requirements produces generic skills that don't match the user's actual workflow. Three questions minimum forces the agent to understand before acting.
Why descriptions optimize for activation, not teaching
LLMs use the description field for routing — deciding which skill to activate. A description that teaches the workflow instead of describing triggers causes mis-activation. The body teaches; the description matches.
Why skills are validated with negative prompts
A skill that activates on everything is useless. Negative prompts ("this should NOT trigger the skill") test that the trigger surface has boundaries. Without them, over-eager activation degrades the full system.
[1.3.0] - 2026-06-11
Added
- Portability check in validation step for distributable skills
- Common Failures guidance: use NEVER/MUST language for weak enforcement
Changed
- Frontmatter rules now generic (project decides metadata structure)
- Acknowledgments step documents compact one-liner format
Removed
- Version-banned-from-frontmatter rationale (moved to project conventions)
[1.2.1] - 2026-05-26
Changed
- Streamlined SKILL.md for lower token cost and 95%+ review score
- Condensed writing rules, example scenario, and common failures sections
- Simplified investigation and design steps
[1.2.0] - 2026-05-24
Added
- Imperative form writing rule replacing "explain the why"
- Structured validation report format with trigger analysis
- Skill design checklist: consistency, validation, and token budget sections
- Metadata tags in frontmatter
Changed
- Boundaries section now mandatory (was recommended)
- Conciseness rule tightened to single-sentence form
[1.1.0] - 2026-05-05
Added
- Boundaries section as a recommended skill convention
- Boundaries checklist in skill design checklist
- Frontmatter rules: only name and description allowed
- Writing rule: complete examples over scattered snippets
- Writing rule: common failures must be non-obvious
Changed
- Conciseness rule now explicitly says not to explain concepts the model already knows
- Removed version field from frontmatter (ADR 0005)
[1.0.0] - 2026-04-28
Added
- Repo investigation step before asking questions
- Progressive disclosure design guidance (SKILL.md, references/, scripts/, assets/)
- Trigger description writing rules
- Validation with representative positive and negative prompts
- Quality checklist for finalizing skills
Acknowledgments
- n-n-code/n-n-code-skills (MIT) — skill design checklist, validation workflow, token optimization patterns, progressive disclosure guidance (adopted in v1.0.0)
- Anthropic/skills (MIT) — pushy description guidance, trigger quality principles (adopted in v1.0.0)
- agentskills.io — Agent Skills specification and directory structure (adopted in v1.0.0)
- antongulin/opencode-skill-creator (MIT) — mandatory intake gate pattern, near-miss negative prompt emphasis, staged workspace guidance (adopted in v1.1.0)