
Module 49 · AI Engineering

📝 Commands & Skills

/compact is instant but 'compact this' takes 3 seconds — one never hits the API


Type /compact and it runs instantly — no API call. Type "please compact the conversation" and it goes to the LLM, costs tokens, and takes seconds. The difference is the command system: a registry of slash commands and skills that decides what runs locally and what goes to the model.

  • Three command types: local (instant function), local-jsx (React component), prompt (injected into conversation)
  • Skills are markdown files with YAML frontmatter — anyone can create them
  • ! prefix = shell escape, runs command directly in terminal
🎮

Command Routing

What you are seeing

How user input is routed through the command system. Slash commands are intercepted before reaching the LLM, while plain text goes directly to the API.

What to try

Compare the three input types: /command (local), !shell (direct), and plain text (API). Notice which ones cost tokens and which are free.

// Input routing

/compact → local command → instant (0 tokens, 0ms)

/help → local command → instant (0 tokens, 0ms)

/review → prompt skill → inject markdown into conversation

!ls → shell escape → runs 'ls' directly in terminal

"help me" → API call → LLM processes (costs tokens, ~2s)

// Skill file anatomy

~/.claude/skills/my-skill/SKILL.md

├── YAML frontmatter (name, description, whenToUse)

├── allowed-tools: [Bash, Read, Write]

└── Markdown body → injected as user message

// Command sources (priority order)

1. Hardcoded (/help, /compact, /clear)

2. Plugin commands

3. Skill directory files

4. MCP server commands

5. Bundled commands

💡

The Intuition

What you’re seeing: the priority hierarchy for command resolution — hardcoded built-ins outrank plugins outrank user skills outrank bundled commands. What to try: trace what happens when a plugin tries to register /clear.

Command Registry — Priority Hierarchy (checked top → bottom; first match wins)

  1. Hardcoded: /help /clear /exit /cost
  2. Plugin Commands: installed via claude plugin install
  3. Skills: ~/.claude/skills/* — downloaded bundles
  4. Bundled Defaults: shipped with Claude Code binary

Example: resolving /review
  • Hardcoded? No — not a built-in
  • Plugin command? No — not installed
  • Skill? Yes — ~/.claude/skills/review/ → execute SKILL.md prompt

Before/After: Slash Command vs Natural Language

The difference between a slash command and typing the same request as plain text is not cosmetic — it's whether the request ever leaves your machine:

| Input | Latency | How it works |
|---|---|---|
| /compact | 0 ms | Local function call — never touches the API |
| "please compact" | ~3 s | Goes to the API; Claude reads the request and decides to compact |

The slash command is ~1000x faster because it never leaves your machine.

Three Command Types

The key insight is that not everything needs the LLM. Commands split into three categories:

  • local — runs a JavaScript function directly. /compact triggers compaction logic, /help renders the help screen, /clear resets the conversation. Zero API calls, instant response.
  • local-jsx — renders a React component in the terminal UI. Used for rich interactive displays that don't need LLM reasoning.
  • prompt — injects markdown text as a system message into the conversation. The LLM then sees and responds to this injected context. This is how skills work.
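A minimal TypeScript sketch of these three shapes (the type and field names here are illustrative, not the actual Claude Code internals):

typescript
// Three command shapes, distinguished by how they execute.
type LocalCommand    = { type: "local"; name: string; execute: () => string };     // instant function, no API
type LocalJsxCommand = { type: "local-jsx"; name: string; render: () => unknown }; // terminal UI component
type PromptCommand   = { type: "prompt"; name: string; content: string };          // markdown injected as a user message

type Command = LocalCommand | LocalJsxCommand | PromptCommand;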
💡 Tip · The / prefix is the user's explicit signal: "I want the built-in behavior." Without it, the same words go to the LLM for interpretation. This is why /compact is instant but "please compact" takes seconds and costs tokens.

Skills as Markdown

Skills are the extension mechanism. A skill is just a markdown file with YAML frontmatter:

  • name — the command name (becomes /name)
  • description — shown in /help and tab completion
  • whenToUse (written as when_to_use in the YAML) — lets the agent auto-trigger the skill based on conversation context
  • allowed-tools — restricts what tools the LLM can use while this skill is active (sandboxing)

The markdown body is the prompt — it gets injected as a user message (not system prompt), preserving the static system prompt cache across skill invocations. No code execution, just text. This makes creating skills trivially simple: write a .md file, drop it in ~/.claude/skills/, done.
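As a sketch of what loading a skill file could look like, here is a parser built on the off-the-shelf gray-matter frontmatter library (an assumption; the module does not show the real implementation). It maps the snake_case and kebab-case YAML keys onto camelCase properties:

typescript
import { readFileSync } from "node:fs";
import matter from "gray-matter"; // assumed frontmatter parser, not confirmed as the real dependency

interface Skill {
  name: string;
  description?: string;
  whenToUse?: string;
  allowedTools?: string[];
  body: string; // markdown prompt, injected as a user message at invocation time
}

// Illustrative loader: split the YAML frontmatter from the markdown body and
// map snake_case / kebab-case keys onto camelCase properties.
function parseSkill(path: string): Skill {
  const { data, content } = matter(readFileSync(path, "utf8"));
  const allowed = data["allowed-tools"];
  return {
    name: data.name,
    description: data.description,
    whenToUse: data.when_to_use,
    allowedTools: typeof allowed === "string"
      ? allowed.split(",").map((t: string) => t.trim())
      : allowed,
    body: content.trim(),
  };
}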

Shell Escape

The ! prefix bypasses both the command system and the LLM entirely. !git status runs the command directly in the shell, same as typing it in a regular terminal. This is the escape hatch for when you want raw shell access without LLM interpretation.
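A minimal sketch of that escape hatch, assuming a Node.js CLI; spawnSync with stdio: "inherit" streams the command's output straight to the user's terminal:

typescript
import { spawnSync } from "node:child_process";

// Hypothetical handler for "!<cmd>": run it in the user's shell and stream
// output directly to the terminal. No LLM call, no token cost.
function runShellEscape(input: string): number {
  const cmd = input.slice(1); // strip the leading "!"
  const result = spawnSync(cmd, { shell: true, stdio: "inherit" });
  return result.status ?? 1; // exit code, or 1 if the process failed to spawn
}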

✨ Insight · The command registry aggregates from 4 sources: hardcoded built-ins, skill directories, plugins, and bundled commands. It rebuilds on each invocation (not cached), so newly added skill files are picked up immediately without restarting the agent.

Auto-Trigger: How whenToUse Works

The whenToUse frontmatter field is not just documentation — it is injected into the system prompt as a hint to the LLM. At session start, every registered skill surfaces its name and whenToUse description in the dynamic section of the system prompt. When the user types plain text ("review this PR"), the LLM can match that intent against the whenToUse descriptions and invoke the matching skill — even without the user typing /review. This is distinct from the command routing path: auto-trigger goes through the LLM, while explicit /name invocation bypasses it. The tradeoff: auto-trigger costs tokens and adds latency, but spares the user from memorizing command names.
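A rough sketch of how those hints could be assembled; the helper name and the exact wording are assumptions, not the real prompt text:

typescript
interface SkillHint {
  name: string;
  whenToUse?: string;
}

// Hypothetical helper: build the "available skills" block appended to the
// dynamic (uncached) portion of the system prompt at session start.
function buildSkillHints(skills: SkillHint[]): string {
  const lines = skills
    .filter((s) => s.whenToUse)
    .map((s) => `- /${s.name}: ${s.whenToUse}`);
  return lines.length === 0
    ? ""
    : ["Available skills (invoke one when the user's intent matches):", ...lines].join("\n");
}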

End-to-End: A prompt Skill Invocation

What happens between typing /my-skill some args and the LLM responding:

  1. Input router detects / prefix, looks up my-skill in the registry
  2. Registry returns type: "prompt" with the markdown body as content
  3. The markdown body is injected as a user message (not system prompt), with some args appended after the skill body
  4. If allowedTools is set, the tool registry is filtered before the API call — the LLM only sees the declared tools
  5. The API call is made with the filtered tools and the injected skill body; the LLM responds in the context of the skill's instructions
  6. After the skill completes, allowedTools is restored to the full set for the next turn

The injection-as-user-message (not system prompt) is intentional: it preserves the static system prompt cache. Adding skill content to the system prompt would invalidate the cached prefix on every skill invocation, costing latency and money on every API call in the session.
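A sketch of steps 3 through 6 under assumed names (runPromptSkill, callApi, and the stub types are illustrative, not the real internals):

typescript
interface Tool { name: string }
interface PromptSkill { name: string; content: string; allowedTools?: string[] }

// Stand-in for the actual model call.
declare function callApi(req: { messages: { role: "user"; content: string }[]; tools: Tool[] }): Promise<string>;

async function runPromptSkill(skill: PromptSkill, args: string, allTools: Tool[]): Promise<string> {
  // 3. Skill body plus the user's args become a regular user message; the system prompt is untouched.
  const userMessage = `${skill.content}\n\n${args}`.trim();

  // 4. If allowed-tools is declared, filter the tool registry for this call only.
  const tools = skill.allowedTools
    ? allTools.filter((t) => skill.allowedTools!.includes(t.name))
    : allTools;

  // 5. The LLM responds in the context of the skill's instructions and the filtered tools.
  const response = await callApi({ messages: [{ role: "user", content: userMessage }], tools });

  // 6. Nothing is persisted here, so the next turn sees the full tool set again.
  return response;
}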

Quick Check

What happens when you type /compact vs 'please compact'?

📐

Key Code Patterns

Input Routing (TypeScript pseudocode)

typescript
function processInput(text: string): Result {
  if (text.startsWith("/")) {
    const cmd = findCommand(text);
    if (!cmd) {
      return showUnknownCommand(text);  // no registry match (helper name illustrative)
    }
    if (cmd.type === "local") {
      return cmd.execute();             // instant, no API
    } else if (cmd.type === "local-jsx") {
      return cmd.render();              // React component
    } else {                            // "prompt"
      injectAsUserMessage(cmd.content); // user message, not system prompt (preserves cache)
      return sendToApi();               // LLM sees injected text
    }
  } else if (text.startsWith("!")) {
    return runShell(text.slice(1));     // direct shell
  } else {
    return sendToApi(text);             // goes to LLM
  }
}

Skill File Structure

yaml
# Skills live in a named directory with a SKILL.md entry point.
# Example: ~/.claude/skills/my-skill/SKILL.md
---
name: my-skill
description: Does something useful
when_to_use: When user asks for X
allowed-tools: Bash,Read,Write
---
# Markdown body is injected as a user message (not system prompt).
# This preserves the static system prompt cache across invocations.

You are now in my-skill mode. Follow these rules:
1. Only read files in the ./src directory
2. Suggest changes but don't edit without confirmation
3. Format output as a markdown table

# Note: frontmatter keys are snake_case ("when_to_use", "allowed-tools").
# After parsing they are exposed as JS properties (e.g. whenToUse) — but
# the YAML keys themselves must be snake_case / kebab-case.

Command Registry

typescript
function getCommands(): Map<string, Command> {
  // Aggregate commands from 4 sources (priority order)
  const commands = new Map<string, Command>();

  // 1. Hardcoded built-ins (highest priority)
  for (const cmd of getBuiltinCommands()) {
    commands.set(cmd.name, cmd); // /help, /compact, /clear, /resume…
  }

  // 2. Plugin commands
  for (const plugin of getInstalledPlugins()) {
    for (const cmd of plugin.getCommands()) {
      if (!commands.has(cmd.name)) commands.set(cmd.name, cmd);
    }
  }

  // 3. Skill directory files (~/.claude/skills/, .claude/skills/)
  for (const skillDir of SKILL_DIRS) {
    for (const mdFile of glob(`${skillDir}/*/SKILL.md`)) {
      const skill = parseSkill(mdFile);
      if (!commands.has(skill.name)) {
        commands.set(skill.name, {
          name: skill.name,
          type: "prompt",
          content: skill.body,
        });
      }
    }
  }

  // 4. Bundled commands (lowest priority)
  for (const cmd of getBundledCommands()) {
    if (!commands.has(cmd.name)) commands.set(cmd.name, cmd);
  }

  return commands;
}
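One consequence of the has() guard, and the answer to the /clear question above: a later source can never shadow an earlier one. Illustrative usage:

typescript
// Because hardcoded commands are registered first and every later source is
// guarded by `if (!commands.has(name))`, a plugin that ships its own /clear
// is silently skipped; the built-in always wins.
const registry = getCommands();
registry.get("clear"); // the hardcoded /clear, regardless of installed plugins
                       // (assuming names are stored without the leading slash)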
🔧

Break It — See What Happens

  • Send everything to the API (no local commands)
  • No skill system (only hardcoded commands)
📊

Real-World Numbers

| Metric | Value |
|---|---|
| Built-in commands | ~10 (/help, /compact, /clear, /resume, etc.) |
| Command types | 3 (local, local-jsx, prompt) |
| Skill directories | ~/.claude/skills/, .claude/skills/, plugin dirs |
| Command sources | 4 (hardcoded, plugins, skills, bundled) |
| /compact latency | ~5 ms (local) vs ~3 s (if sent to API) |
| Skill format | Markdown + YAML frontmatter |
✨ Insight · The skill system turns prompt engineering into a shareable artifact. Instead of copying and pasting prompts between conversations, you write a .md file once and invoke it with /name. Teams can distribute skills via plugins, creating organizational knowledge that persists across sessions and users.
🧠

Key Takeaways

What to remember for interviews

  1. Slash commands bypass the LLM entirely — /compact runs a local function in ~5 ms with zero API cost, while typing 'please compact' takes ~3 seconds and burns tokens.
  2. Three command types exist: 'local' (instant function), 'local-jsx' (React component render), and 'prompt' (markdown injected as a user message for the LLM to act on).
  3. Skills are just markdown files with YAML frontmatter — drop one in ~/.claude/skills/ and it immediately becomes a /name command without restarting the agent.
  4. The command registry rebuilds on every invocation from 4 sources (hardcoded → plugins → skills → bundled), so new skill files are picked up instantly.
  5. Skill content is injected as a user message (not system prompt) so it never invalidates the cached system prompt prefix on each invocation.
📚

Further Reading

Recall

Why does `/compact` execute instantly while typing "please compact the conversation" costs tokens and takes seconds?

Recall

When a user invokes a custom skill (e.g. `/my-skill`), where is the skill's markdown body injected?

Recall

A user has a built-in `/review` command and installs a plugin that also registers `/review`. Which one runs?

Recall

What does the `!` prefix do in the Claude Code CLI, and how does it differ from `/` commands?
🎯

Interview Questions


Design a plugin system where users can extend an AI agent with markdown files.

★★★
Anthropic

How do you decide what runs locally vs what goes to the LLM?

★☆☆
Google · Anthropic

How would you implement a command registry that aggregates commands from 4 different sources?

★★☆
Anthropic · Google

Design a skill that chains multiple other skills — what are the error handling challenges when a mid-chain skill fails?

★★★
Anthropic