Hermes Agent Skills Advanced Guide 2026: SKILL.md, Bundles, GEPA Self-Evolution

00 Why Hermes Skills Deserve a Dedicated Deep Dive

In early 2026, NousResearch launched Hermes Agent. Within two months it hit 160k GitHub stars — one of the fastest-growing open-source projects in the AI agent space. The headline is not a bigger model; it is a philosophy: "the agent that grows with you."

At the foundation of that growth is the Skills system. Unlike one-shot prompts, Hermes skills are standardized, evolvable, cross-session procedural memory. This post skips install basics. We go deep on Progressive Disclosure for token control, Conditional Activation for context-aware behavior, Skill Bundles for one-command workflows, DSPy + GEPA for auto-improving skills, and the best open-source skill repositories you can tap into today.

Pain: What Breaks When You Treat Skills Like Long Prompts

Token bleed from eager loading: Dumping full SOPs into every session burns context before the agent picks a task. Skills exist to load on demand; treating them like standing prompts defeats the design.
Vague descriptions, wrong activations: A description that says "helps with code" loads during unrelated tasks. Precision in the description field is your activation filter at Level 0.
No workflow packaging: Teams manually invoke five slash commands at session start — code review, TDD, PR workflow, debugging, deploy checks — instead of one Bundle.
Stale skills after edits: Changing SKILL.md mid-session has no effect until you run /reset or reinstall with --now. The --now path invalidates the prompt cache and spikes cost.
Skills and MCP confused: MCP gives tools; Skills teach how to use them. Without Skills, agents call database MCP tools with no migration playbook.

01 Skills vs Memory vs Prompts: The Concept Map

Three layers of agent context look similar but behave differently. Use this matrix before writing your first SKILL.md.

Dimension	Prompt	Memory	Skill
Persistence	Current conversation	Cross-session, permanent	Cross-session, permanent
Load timing	Always in context	Injected each session	On demand (key difference)
Token cost	Every turn	Small, stable	Zero until activated
Content type	Any intent	User prefs / facts	Procedural steps (how to do X)
Maintained by	User manually	Agent auto	User + Agent
Shareable	Awkward	Private	Publishable as community Tap

Mnemonic: Prompt = sticky note (this session only). Memory = notebook (always nearby). Skill = SOP manual (open when needed).

02 SKILL.md Format and Progressive Disclosure

All Hermes Skills follow the agentskills.io open standard — portable across Hermes, Claude Code, Cursor, and OpenCode. Validate with skills-ref validate ./my-skill before publishing.

SKILL.md frontmatter + body skeleton

---
name: my-skill
description: |
  Use when the user needs to [...].
  Handles [...] and [...].
version: 1.0.0
license: MIT
compatibility: Requires git, docker
allowed-tools: Bash(git:*) Read
metadata:
  hermes:
    tags: [devops, automation]
    category: software-development
    related_skills: [github-pr-workflow, test-driven-development]
    requires_toolsets: [terminal]
    fallback_for_toolsets: [web]
---

# My Skill Title

## Overview
What it does and why it exists.

## When to Use
- Use for: [specific scenarios]
- Don't use for: [excluded scenarios]

## Procedure
1. Step one (exact commands)
2. Step two
3. Step three

## Common Pitfalls
1. Failure mode + fix
2. Edge case handling

## Verification Checklist
- [ ] Checkpoint 1
- [ ] Checkpoint 2

Directory layout for modular skills:

~/.hermes/skills/my-category/my-skill/

├── SKILL.md              # Core steps (target ≤500 lines)
├── references/
│   ├── api-docs.md       # Loaded on demand
│   └── examples.md
├── templates/
│   └── config.yaml
└── scripts/
    └── setup.sh          # Agent-executable

Progressive Disclosure is the token-control core. Three load levels:

Level	Content	Trigger	Token cost
Level 0	`name` + `description`	Every session start (all skills)	~3K total across catalog
Level 1	Full `SKILL.md` body	`/skill-name` or LLM judgment	Depends on file length
Level 2	`references/`, `scripts/`	LLM decides during execution	Per file, on demand

The description field carries nearly all of the Level 0 signal. Spend its words on when to use the skill rather than on what the skill is, because the LLM reads it to decide whether to load the full body.
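The three load levels can be made concrete with a small sketch. The code below is an illustration only, not Hermes' loader: `SkillCard`, `split_frontmatter`, `level0`, and `rough_tokens` are hypothetical names, the frontmatter reader handles just plain `key: value` lines and `|` blocks, and the 4-characters-per-token figure is a crude assumption rather than a real tokenizer.

```python
from dataclasses import dataclass

SKILL_MD = """\
---
name: my-skill
description: |
  Use when the user needs to review a pull request.
  Handles GitHub PR URLs and local git diffs.
version: 1.0.0
---

# My Skill Title

## Procedure
1. Step one
2. Step two
"""


@dataclass
class SkillCard:
    """Level 0 view of a skill: the only fields read at session start."""
    name: str
    description: str


def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) for a file delimited by '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    end = lines.index("---", 1)  # raises ValueError if the block is never closed
    return "\n".join(lines[1:end]), "\n".join(lines[end + 1:]).strip()


def level0(frontmatter: str) -> SkillCard:
    """Extract name and description (simplified reader, not a YAML parser)."""
    fields: dict[str, str] = {}
    key = None
    for line in frontmatter.splitlines():
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = "" if value.strip() in ("|", ">") else value.strip()
        elif key is not None and line.strip():
            # Indented continuation line of a block scalar.
            fields[key] = (fields[key] + " " + line.strip()).strip()
    return SkillCard(fields["name"], fields["description"])


def rough_tokens(text: str) -> int:
    """Crude estimate assuming about 4 characters per token."""
    return max(1, len(text) // 4)


frontmatter, body = split_frontmatter(SKILL_MD)
card = level0(frontmatter)
print("Level 0:", card.name, "~", rough_tokens(card.name + card.description), "tokens")
print("Level 1 adds ~", rough_tokens(body), "tokens, only on activation")
```

The point of the split is that only `card` needs to exist for every installed skill; `body` is read when the skill is actually activated.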

03 Skill Bundles: One Command for the Full Workflow

Skill Bundles are a 2026 Hermes feature that is still underused. A Bundle is a lightweight YAML file packing multiple skills into one slash command. Running /bundle-name loads every listed skill simultaneously.

File location: ~/.hermes/skill-bundles/<slug>.yaml

backend-dev.yaml

name: backend-dev
description: |
  Full backend feature workflow — code review, TDD, and PR management.
  Use at the start of any new backend feature session.
skills:
  - github-code-review
  - test-driven-development
  - github-pr-workflow
instruction: |
  Always write failing tests first before implementation.
  Open PRs with co-author tags for pair-programming sessions.
  Never push directly to main.

Research session bundle:

research-session.yaml

name: research-session
description: Load all research tools at once for deep-dive sessions.
skills:
  - arxiv
  - deep-research
  - plan
  - excalidraw
instruction: |
  Start every session by checking recent papers on the topic.
  Create an Excalidraw diagram for any architecture discussed.

Priority rules: Bundle beats a same-named Skill. Missing skills are skipped with a warning, not an error. Bundles do not modify the system prompt — prompt cache stays valid.
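The priority rules can be sketched in a few lines of Python. This is a hypothetical model of the behavior described above, not Hermes' implementation: `resolve_command` and its dict-based inputs are invented for illustration.

```python
import warnings


def resolve_command(name, bundles, skills):
    """Return (skill names to load, extra instruction) for a slash command.

    Assumed rules, taken from the text: a bundle wins over a skill with the
    same name, and missing member skills are skipped with a warning.
    """
    if name in bundles:
        bundle = bundles[name]
        loaded = []
        for member in bundle["skills"]:
            if member in skills:
                loaded.append(member)
            else:
                warnings.warn(f"bundle '{name}': skill '{member}' not installed, skipping")
        return loaded, bundle.get("instruction", "")
    if name in skills:
        return [name], ""
    raise KeyError(f"no bundle or skill named '{name}'")


# A skill named "backend-dev" is installed too, so the name collides with the bundle.
installed = {"github-code-review", "test-driven-development", "backend-dev"}
bundles = {
    "backend-dev": {
        "skills": ["github-code-review", "test-driven-development", "github-pr-workflow"],
        "instruction": "Always write failing tests first before implementation.",
    }
}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    loaded, instruction = resolve_command("backend-dev", bundles, installed)

print(loaded)       # the bundle's members, minus the missing one
print(len(caught))  # one warning for github-pr-workflow
```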

CLI quick create

hermes bundles create backend-dev \
  --skills github-code-review,test-driven-development,github-pr-workflow \
  --instruction "Always write failing tests first"

04 Conditional Activation: Context-Aware Skills

Skills can auto-show or hide based on which tools are available in the current session. Configure under metadata.hermes:

Activation rules in SKILL.md

metadata:
  hermes:
    requires_toolsets: [web]
    requires_tools: [web_search]
    fallback_for_toolsets: [browser]
    fallback_for_tools: [browser_navigate]

Field	Behavior
`requires_toolsets`	Hide skill when listed toolsets are absent
`requires_tools`	Hide skill when listed tools are absent
`fallback_for_toolsets`	Hide skill when listed toolsets are present (fallback role)
`fallback_for_tools`	Hide skill when listed tools are present

Free vs paid search swap: DuckDuckGo skill sets fallback_for_tools: [web_search]. When FIRECRAWL_KEY or BRAVE_SEARCH_KEY activates paid web_search, the DuckDuckGo skill disappears from the prompt — saving tokens. When the API is unavailable, the fallback surfaces automatically.
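The four fields reduce to a simple visibility predicate. The sketch below is one plausible reading, not the Hermes source: `is_visible` is a hypothetical function, and it assumes that a `requires_*` list hides the skill if any listed item is missing, while a `fallback_for_*` list hides it if any listed item is present.

```python
def is_visible(hermes_meta, active_toolsets, active_tools):
    """Decide whether a skill appears in the catalog for this session.

    Assumption: 'requires_*' hides the skill when ANY listed item is absent;
    'fallback_for_*' hides it when ANY listed item is present.
    """
    toolsets, tools = set(active_toolsets), set(active_tools)
    if not set(hermes_meta.get("requires_toolsets", [])) <= toolsets:
        return False
    if not set(hermes_meta.get("requires_tools", [])) <= tools:
        return False
    if set(hermes_meta.get("fallback_for_toolsets", [])) & toolsets:
        return False
    if set(hermes_meta.get("fallback_for_tools", [])) & tools:
        return False
    return True


# The DuckDuckGo example: visible only while the paid web_search tool is absent.
duckduckgo = {"fallback_for_tools": ["web_search"]}
free_session = is_visible(duckduckgo, active_toolsets=["terminal"], active_tools=[])
paid_session = is_visible(duckduckgo, active_toolsets=["terminal", "web"], active_tools=["web_search"])
print(free_session, paid_session)  # True False
```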

Platform-aware skills: Use requires_toolsets: [messaging] with platforms: [telegram, discord]. The hermes skills TUI lets you toggle skills per platform (CLI, Telegram, Discord) independently.

05 Skills Hub and Open-Source Community

Official install channels:

hermes skills install

hermes skills install official/research/arxiv
hermes skills install https://example.com/SKILL.md --name my-skill
hermes skills install github:openai/skills/k8s
hermes skills tap add github:my-org/my-skills

Repository	Focus	Stars	Highlight
awesome-hermes-skills	Production-grade curated set	67	Deep Research, MLOps, Apple integration; gh-copilot plugin
hermeshub	Community registry + security scan	166	API and marketplace; prompt-injection detection per skill
ai-agent-skills	191 skills, 28 categories	10	Cross-agent install for Hermes / Claude Code / Cursor
hermes-agent	Official source	—	Built-in skills and authoring spec

Because skills follow the agentskills.io standard, they are not locked to one host. Community assets travel with you.

06 Publishing Your Own Skill Tap

A GitHub repo used as a Tap lets your whole team, or the wider community, subscribe to your skill set.

my-skills-tap/ layout

my-skills-tap/
├── skills.sh.json
├── mlops/
│   ├── vllm-deploy/SKILL.md
│   └── model-benchmark/SKILL.md
├── research/
│   ├── paper-summarizer/SKILL.md
│   └── citation-finder/SKILL.md
└── README.md

skills.sh.json (Hub groupings)

{
  "groupings": [
    {
      "title": "MLOps & Model Deployment",
      "skills": ["vllm-deploy", "model-benchmark"]
    },
    {
      "title": "AI Research Workflows",
      "skills": ["paper-summarizer", "citation-finder"]
    }
  ]
}
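Before pushing a Tap, it can help to confirm that every skill named in skills.sh.json actually exists in the repo. The check below is a sketch under assumptions: `missing_grouped_skills` is a hypothetical helper, and it assumes the `<category>/<skill>/SKILL.md` layout shown above. It builds a throwaway Tap in a temp directory so it runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def missing_grouped_skills(tap_root: Path) -> list[str]:
    """List skills named in skills.sh.json with no <category>/<skill>/SKILL.md."""
    manifest = json.loads((tap_root / "skills.sh.json").read_text(encoding="utf-8"))
    present = {p.parent.name for p in tap_root.glob("*/*/SKILL.md")}
    return [
        skill
        for group in manifest["groupings"]
        for skill in group["skills"]
        if skill not in present
    ]


# Build a throwaway Tap with two of the four skills to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("mlops/vllm-deploy", "research/paper-summarizer"):
        (root / rel).mkdir(parents=True)
        (root / rel / "SKILL.md").write_text("---\nname: demo\n---\n", encoding="utf-8")
    (root / "skills.sh.json").write_text(json.dumps({
        "groupings": [
            {"title": "MLOps & Model Deployment", "skills": ["vllm-deploy", "model-benchmark"]},
            {"title": "AI Research Workflows", "skills": ["paper-summarizer", "citation-finder"]},
        ]
    }), encoding="utf-8")
    missing = missing_grouped_skills(root)

print(missing)  # ['model-benchmark', 'citation-finder']
```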

Team deployment

hermes skills tap add github:your-org/your-skills-tap
hermes skills tap add github:your-org/private-skills --token $GH_TOKEN
hermes skills tap update
hermes skills tap list

Version control tip: Track ~/.hermes/skills/ in Git for cross-device sync. After git pull, run hermes skills reset to rebuild built-ins.

07 Self-Evolving Skills with GEPA + DSPy

GEPA (Genetic-Pareto Prompt Evolution) is ICLR 2026 Oral work integrated in hermes-agent-self-evolution. It improves skill text, not model weights, by analyzing execution traces, generating variants, and selecting among them with multi-objective Pareto optimization.

Cost: roughly $2–10 per optimization run (API calls only, no GPU).

Five-stage evolution pipeline:

Stage 1 — Trace collection: Read full reasoning traces from SQLite (tool calls, branches, errors).
Stage 2 — Reflective failure analysis: LLM produces actionable side information — not "it failed" but why.
Stage 3 — Targeted mutation: Generate 10–20 SKILL.md variants addressing root causes.
Stage 4 — Multi-objective Pareto evaluation: Optimize success rate × token efficiency × speed simultaneously.
Stage 5 — Human review PR: Best variant becomes a PR; ship only after human approval.
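Stage 4 relies on Pareto selection, which is easy to show in isolation. The sketch below is a generic non-dominated filter, not GEPA's actual code: the `Variant` type, the three objectives as plain numbers, and the sample scores are all made up for illustration.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str
    success_rate: float  # higher is better
    tokens: int          # lower is better
    seconds: float       # lower is better


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = (a.success_rate >= b.success_rate
                and a.tokens <= b.tokens
                and a.seconds <= b.seconds)
    better = (a.success_rate > b.success_rate
              or a.tokens < b.tokens
              or a.seconds < b.seconds)
    return no_worse and better


def pareto_front(variants):
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


# Invented scores for four hypothetical SKILL.md variants.
candidates = [
    Variant("baseline", 0.70, 4200, 38.0),
    Variant("v1-terse", 0.72, 3100, 30.0),
    Variant("v2-thorough", 0.85, 5200, 45.0),
    Variant("v3-bloated", 0.71, 5600, 50.0),
]
front = pareto_front(candidates)
print([v.name for v in front])  # ['v1-terse', 'v2-thorough']
```

Two variants survive because neither beats the other on all three objectives at once: one is cheaper and faster, the other succeeds more often. Choosing between them is the trade-off that Stage 5 leaves to a human reviewer.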

evolve_skill quick start

git clone https://github.com/NousResearch/hermes-agent-self-evolution
cd hermes-agent-self-evolution && pip install -r requirements.txt
export HERMES_AGENT_PATH=~/.hermes

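# Run 1: evaluate variants against synthetic cases
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source synthetic

# Run 2: evaluate variants against recorded sessions (sessiondb)
python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source sessiondb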

Four guardrails before any PR ships:

Full test suite: pytest tests/ -q must pass 100%
Size limit: Skills ≤ 15KB, tool descriptions ≤ 500 characters
Prompt cache compatibility: no mid-session edits that invalidate cache
Semantic preservation: evolved text must not drift from the skill's core purpose
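The size guardrail is the easiest of the four to check locally before opening a PR. The following is a minimal sketch, not the project's own checker: `size_violations` is a hypothetical helper, and reading "15KB" as 15 * 1024 bytes of UTF-8 is an assumption.

```python
MAX_SKILL_BYTES = 15 * 1024   # assumption: "15KB" read as 15 * 1024 bytes
MAX_TOOL_DESC_CHARS = 500


def size_violations(skill_text: str, tool_descriptions: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means both limits hold."""
    problems = []
    size = len(skill_text.encode("utf-8"))
    if size > MAX_SKILL_BYTES:
        problems.append(f"skill is {size} bytes, limit is {MAX_SKILL_BYTES}")
    for tool, desc in tool_descriptions.items():
        if len(desc) > MAX_TOOL_DESC_CHARS:
            problems.append(
                f"tool '{tool}' description is {len(desc)} chars, limit is {MAX_TOOL_DESC_CHARS}"
            )
    return problems


ok = size_violations("# Small skill\n", {"web_search": "Search the web."})
bad = size_violations("x" * 20_000, {"web_search": "y" * 501})
print(ok)        # []
print(len(bad))  # 2
```

Measuring bytes rather than characters matters for skills with Chinese body text, where one character takes several bytes in UTF-8.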

Phase	Target	Engine	Status
Phase 1	SKILL.md files	DSPy + GEPA	Shipped
Phase 2	Tool descriptions	DSPy + GEPA	Planned
Phase 3	System prompt fragments	DSPy + GEPA	Planned
Phase 4	Tool implementation code	Darwinian Evolver	Planned
Phase 5	Continuous improvement loop	Automated pipeline	Planned

Cross-host traces: Because Skills follow agentskills.io, you can also feed Claude Code or Gemini CLI traces into GEPA:

Mixed trace sources (experimental)

python -m evolution.skills.evolve_skill \
    --skill github-code-review \
    --iterations 10 \
    --eval-source mixed \
    --trace-dirs ~/.claude/traces,~/.hermes/sessions

08 Plugin-Bundled Skills

Plugins namespace skills as plugin:skill:

Skills stay out of default skills_list (less system-prompt noise)
Activate only on explicit user call (opt-in)
Sibling skills within a plugin are cross-aware

Load plugin skill

skill_view("superpowers:writing-plans")

Agent response includes sibling skills: "This plugin also includes: superpowers:editing, superpowers:research".

plugin.yaml skill declaration

name: my-hermes-plugin
skills:
  - name: writing-plans
    path: skills/writing-plans/SKILL.md
  - name: editing
    path: skills/editing/SKILL.md
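The plugin:skill naming can be modeled with a small parser. This sketch is an illustration, not the Hermes API: `split_qualified` and `sibling_hint` are hypothetical helpers, and the plugin contents are taken from the sibling message quoted above.

```python
def split_qualified(name: str) -> tuple[str, str]:
    """Split 'plugin:skill' into its two parts, rejecting malformed names."""
    plugin, sep, skill = name.partition(":")
    if not sep or not plugin or not skill:
        raise ValueError(f"expected 'plugin:skill', got {name!r}")
    return plugin, skill


def sibling_hint(qualified: str, plugins: dict[str, list[str]]) -> str:
    """Build the 'also includes' note listing the other skills in the same plugin."""
    plugin, skill = split_qualified(qualified)
    siblings = [f"{plugin}:{s}" for s in plugins[plugin] if s != skill]
    if not siblings:
        return ""
    return "This plugin also includes: " + ", ".join(siblings)


plugins = {"superpowers": ["writing-plans", "editing", "research"]}
hint = sibling_hint("superpowers:writing-plans", plugins)
print(hint)  # This plugin also includes: superpowers:editing, superpowers:research
```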

09 Advanced Skill Authoring Tips

Description precision drives activation:

Bad vs good description

# Too vague — loads in wrong contexts
description: Helps with code.

# Precise — LLM matches intent correctly
description: |
  Use when reviewing a pull request, checking for code quality issues,
  security vulnerabilities, or style violations. Handles GitHub PR URLs
  and local git diff output. Do NOT use for writing new code.
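A rough lint can catch the vague case before a skill ships. The checks below are illustrative heuristics only, not rules enforced by Hermes or agentskills.io: `description_warnings`, the 12-word minimum, and the trigger phrases it looks for are all assumptions.

```python
def description_warnings(description: str, min_words: int = 12) -> list[str]:
    """Flag descriptions that are short, lack a trigger, or lack an exclusion."""
    text = " ".join(description.split())  # collapse newlines from '|' blocks
    lowered = text.lower()
    found = []
    if len(text.split()) < min_words:
        found.append(f"too short: fewer than {min_words} words")
    if "use when" not in lowered and "use for" not in lowered:
        found.append("no trigger phrase such as 'Use when ...'")
    if "do not use" not in lowered and "don't use" not in lowered:
        found.append("no exclusion such as 'Do NOT use for ...'")
    return found


vague = "Helps with code."
precise = """Use when reviewing a pull request, checking for code quality issues,
security vulnerabilities, or style violations. Handles GitHub PR URLs
and local git diff output. Do NOT use for writing new code."""

print(description_warnings(vague))    # three warnings
print(description_warnings(precise))  # []
```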

Pitfalls section separates good skills from great ones: specific failure modes, root-cause analysis, actionable fixes — not generic warnings.

Scripts give skills real execution power. Reference scripts/extract_schema.py in Procedure; on failure, load references/manual-extract.md as fallback.

Skill size	Recommendation
< 500 lines	Keep everything in SKILL.md
500–1000 lines	Move reference material to `references/`
> 1000 lines	Split aggressively; consider two skills
> 15KB	Exceeds GEPA limit — must split

skill_manage lets the agent patch or create skills programmatically. To require sign-off on those writes, enable the approval gate in config.yaml: skills.agent_writes_require_approval: true.

10 Case Study: Tech Blog Workflow Skills

Goal: a complete blog-writing assistant skill stack for Hermes.

~/.hermes/skill-bundles/blog-workflow.yaml

name: blog-workflow
description: Full tech blog writing workflow.
skills:
  - seo-keyword-research
  - outline-generator
  - code-example-validator
  - bilingual-checker
  - publish-to-platform
instruction: |
  Always research SEO keywords before writing.
  Ensure all code examples are tested and runnable.
  Generate both Chinese and English title options.

seo-keyword-research/SKILL.md (excerpt)

---
name: seo-keyword-research
description: |
  Use when planning a technical blog post. Researches search volume,
  competition, and related queries for Chinese and English audiences.
metadata:
  hermes:
    requires_toolsets: [web]
    tags: [seo, blogging, content]
---

## Procedure
1. Identify primary topic from user or context
2. Chinese long-tail: "X 怎么用", "X 教程", "X 最佳实践"
3. English long-tail: "X tutorial", "how to X", "X vs Y"
4. Cross-reference platform trends (掘金, Dev.to, HN)
5. Output keyword matrix: 3-5 primary + 10-15 long-tail per language

Run /blog-workflow at session start. The agent researches keywords, drafts outlines, validates code samples, and prepares bilingual titles before you write a single paragraph.

11 Six-Step Runbook: Skills Lab on Cloud Mac

01 Install Hermes and baseline skills: Follow our install walkthrough. Add official Taps and at least one community repo (hermes skills tap add github:ChuckSRQ/awesome-hermes-skills). Confirm Level 0 catalog stays under ~3K tokens with hermes skills list.
02 Provision a dedicated cloud Mac: Open the NUKCLOUD console and pick a 16 GB+ tier (32 GB if you run GEPA evolution alongside active sessions). Hourly billing on the pricing page suits skill authoring pilots.
03 Author and validate SKILL.md files: Create skills under ~/.hermes/skills/ per agentskills.io. Run skills-ref validate ./my-skill. Write precise description fields and split files over 500 lines into references/.
04 Package Bundles and Conditional Activation: Add YAML bundles for recurring workflows. Set fallback_for_tools for free/paid tool swaps. Smoke-test with /bundle-name and verify missing skills warn without crashing.
05 Publish team Tap and run GEPA evolution: Push skills to a GitHub Tap; teammates run hermes skills tap add. Clone hermes-agent-self-evolution, set HERMES_AGENT_PATH, evolve one skill with --eval-source sessiondb (budget $2–10 per run). Review PR diffs before merge.
06 Daemonize and lock capacity: Track ~/.hermes/skills/ in Git; use launchd for 24/7 Hermes gateway or Telegram bot. Lock your tier on the order page. Cross-read Cursor Skills patterns if your team splits IDE and terminal agents.

Running Hermes skills on a laptop or a shared VPS has predictable failure modes: lid-close sleep kills Telegram sessions, shared VPS bandwidth jitter breaks long agent loops, and every mid-flight skill edit invalidates the prompt cache. GEPA evolution and overnight agent runs need a machine that stays awake with a stable network. For production skill labs and team Taps, NUKCLOUD multi-region bare-metal Mac / cloud Mac nodes give tenant isolation and spec elasticity. Start hourly on the pricing page, then move to fixed monthly capacity when your skill catalog stabilizes.

12 Frequently Asked Questions

How do Skills differ from MCP?

Skills are procedural knowledge documents that teach an agent how to do something. MCP is a tool interface that gives agents additional callable capabilities. They complement each other: an MCP server exposes database access; a Skill teaches the agent how to run a safe migration with that tool.

Why does my edited Skill still behave like the old version?

Skill edits do not apply mid-session. Run /reset to start a fresh session, or reinstall with --now to force refresh. The --now path invalidates prompt cache and costs more tokens — prefer /reset when possible.

Are GEPA-evolved skills safe to deploy?

Four guardrails apply: full pytest pass, 15KB size cap, prompt-cache compatibility, and semantic preservation checks. Every variant still ships as a PR requiring human review. Treat evolved diffs like any other code change.

Can I reuse Hermes Skills in Claude Code?

Copy SKILL.md files to ~/.claude/skills/, or use kevinnft/ai-agent-skills for one-install multi-agent setup. The agentskills.io format is intentionally portable.

Does Chinese skill content hurt token efficiency?

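Chinese characters cost roughly 1–1.5 tokens per character in most tokenizers. Since a single character carries more meaning than a single Latin letter, the cost for the same content is broadly comparable to English. Keep description in English (or bilingual) for sharper LLM matching at Level 0; body content can be any language.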

13 Further Reading and Resources

Official documentation:

Hermes Agent docs — authoritative reference
Hermes Agent Chinese docs
Skills system reference
Creating Skills developer guide
agentskills.io open standard

Open-source repositories:

NousResearch/hermes-agent — official main repo
hermes-agent-self-evolution — GEPA tooling
awesome-hermes-skills — curated production skills
hermeshub — community registry
ai-agent-skills — 191 cross-platform skills
gepa-ai/gepa — GEPA algorithm (MIT)
stanfordnlp/dspy — DSPy framework

Community content: