Structured Skill Definitions

When an agent has access to dozens of tools, it needs more than names and descriptions to use them well. It needs to know what inputs each tool expects, what outputs it produces, what other tools or infrastructure must be present, and how expensive or risky a call is. A structured skill definition captures all of this in a machine-readable format.

Why Not Just Use Function Signatures?

Function signatures tell you the types of parameters. They do not tell you that a skill requires kubectl to be installed, takes 10-30 seconds to run, needs cluster-admin permissions, and might delete resources if called with the wrong flags. Agents making autonomous decisions need this information up front, not buried in documentation they may not read.

Core Schema

A skill definition has six sections: identity, inputs, outputs, dependencies, metadata, and examples.

skill:
  name: validate-helm-chart
  version: "1.2.0"
  description: >
    Validate a Helm chart by running helm lint, template rendering,
    and optional dry-run against a live cluster. Returns structured
    validation results with error locations.

  inputs:
    chart_path:
      type: string
      required: true
      description: "Path to the Helm chart directory"
      validation:
        pattern: "^[a-zA-Z0-9_./-]+$"
    values_file:
      type: string
      required: false
      description: "Path to a values.yaml override file"
    dry_run:
      type: boolean
      default: false
      description: "If true, runs helm install --dry-run against the cluster"
    target_namespace:
      type: string
      default: "default"
      description: "Kubernetes namespace for dry-run validation"

  outputs:
    valid:
      type: boolean
      description: "Whether the chart passed all validation checks"
    errors:
      type: array
      items:
        type: object
        properties:
          file: { type: string }
          line: { type: number }
          message: { type: string }
          severity: { type: string, enum: [error, warning, info] }
      description: "List of validation issues found"
    rendered_templates:
      type: array
      items: { type: string }
      description: "Names of templates that rendered successfully"

  dependencies:
    tools:
      - name: helm
        version: ">=3.12.0"
        required: true
      - name: kubectl
        version: ">=1.28.0"
        required: false
        reason: "Only needed when dry_run is true"
    skills:
      - name: cluster-auth
        reason: "Provides kubeconfig for dry-run validation"
        required: false

  metadata:
    estimated_duration: "5-30s"
    cost: none
    risk_level: read-only
    permissions:
      - "filesystem:read"
      - "kubernetes:get (when dry_run=true)"
    idempotent: true
    cacheable: true
    cache_ttl: "5m"

  examples:
    - description: "Basic lint check"
      inputs: { chart_path: "./charts/my-app" }
      expected_output: { valid: true, errors: [] }
    - description: "Lint with values override"
      inputs: { chart_path: "./charts/my-app", values_file: "./values-prod.yaml" }

Input Types and Validation

Go beyond basic types. Add constraints that prevent the agent from calling the skill with bad data:

  • string: Add pattern (regex), min_length, max_length, enum (allowed values)
  • number: Add minimum, maximum, multiple_of
  • array: Add min_items, max_items, items (element schema)
  • object: Add properties with nested schemas, required fields

The more specific your input schema, the fewer invalid calls the agent will make.
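
As a sketch, the inputs of a hypothetical scale-deployment skill might combine these constraints, nested under the same validation key the core schema uses (the skill and its field values are illustrative, not part of the schema above):

  inputs:
    # Hypothetical skill inputs; names and values are illustrative
    environment:
      type: string
      required: true
      description: "Target environment"
      validation:
        enum: [dev, staging, prod]
    replicas:
      type: number
      required: true
      description: "Desired replica count"
      validation:
        minimum: 1
        maximum: 50
        multiple_of: 1
    extra_labels:
      type: array
      required: false
      description: "Additional labels to apply to created resources"
      items: { type: string, pattern: "^[a-z][a-z0-9-]*$" }
      validation:
        max_items: 10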

Output Structure

Return structured data, not prose. An agent that receives "Chart is valid with 2 warnings" has to parse natural language. An agent that receives { "valid": true, "errors": [{ "severity": "warning", ... }] } can act programmatically.

Define your output schema as precisely as your input schema. If downstream skills consume this output, they need to know the exact shape.

Dependency Declaration

Dependencies answer the question: “Can this skill run right now?” There are three categories:

  • Tool dependencies: Binaries or CLIs that must be installed (helm, kubectl, docker). Include minimum version requirements.
  • Skill dependencies: Other skills that must be available. A deploy-to-cluster skill might depend on validate-helm-chart and cluster-auth.
  • Infrastructure dependencies: Services that must be running – a database, an API endpoint, a Kubernetes cluster.

Mark each dependency as required or optional, and explain why it is needed. This lets the agent decide whether to proceed, find alternatives, or report what is missing.
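
The core schema above declares only tools and skills; infrastructure dependencies could be declared the same way. A sketch, where the infrastructure key and its endpoint field are assumptions rather than part of the schema shown earlier:

  dependencies:
    infrastructure:
      # Hypothetical section; field names are illustrative
      - name: kubernetes-cluster
        required: true
        reason: "Target cluster for dry-run validation and deployment"
      - name: chart-registry
        endpoint: "https://charts.example.com"
        required: false
        reason: "Only needed when the chart pulls remote dependencies"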

Composability: Skills Calling Skills

Complex operations are built from simple skills. A deploy-application skill might chain validate-helm-chart, cluster-auth, helm-install, and verify-deployment in sequence.

Define this composition explicitly:

skill:
  name: deploy-application
  composed_of:
    - skill: validate-helm-chart
      inputs_from: { chart_path: "$inputs.chart_path" }
      on_failure: abort
    - skill: cluster-auth
      inputs_from: { cluster: "$inputs.target_cluster" }
    - skill: helm-install
      inputs_from:
        chart_path: "$inputs.chart_path"
        namespace: "$inputs.namespace"
        kubeconfig: "$steps.cluster-auth.outputs.kubeconfig"
    - skill: verify-deployment
      inputs_from:
        namespace: "$inputs.namespace"
        expected_pods: "$steps.helm-install.outputs.created_resources"

Each step can reference the top-level skill's inputs via $inputs.<field> and the outputs of earlier steps via $steps.<name>.outputs.<field>.
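
The on_failure: abort flag on the first step hints at per-step failure policy. Richer policies could be declared the same way; in this sketch the retry and fallback fields are assumptions, not part of the schema above:

    - skill: helm-install
      inputs_from: { chart_path: "$inputs.chart_path" }
      on_failure: retry       # hypothetical value; the schema above shows only abort
      retry:
        attempts: 3
        backoff: "10s"
      fallback:
        skill: rollback-release
        inputs_from: { namespace: "$inputs.namespace" }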

Versioning

Skill schemas change. Use semantic versioning:

  • Patch (1.2.0 to 1.2.1): Bug fixes, no schema changes.
  • Minor (1.2.0 to 1.3.0): New optional inputs or outputs. Existing callers are unaffected.
  • Major (1.2.0 to 2.0.0): Breaking changes – removed inputs, changed output structure, new required inputs.

When an agent discovers a skill, it should check the version against its expectations. If a skill catalog returns validate-helm-chart@2.0.0 but the agent was built against 1.x, it should either adapt or report incompatibility.
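
One way to make that expectation checkable up front is to pin a version range on skill dependencies, mirroring the syntax the core schema already uses for tools. Note that the version field on skill entries is an assumption; the schema above declares it only for tools:

  dependencies:
    skills:
      - name: validate-helm-chart
        version: ">=1.2.0 <2.0.0"   # hypothetical field, mirroring the tools syntax
        required: true
        reason: "Chart must pass validation before install"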

Comparison with OpenAPI and Function Calling

Skill definitions overlap with OpenAPI specs and LLM function-calling schemas, but fill a different niche:

Aspect             | OpenAPI           | Function Calling | Skill Definition
-------------------|-------------------|------------------|------------------
Primary consumer   | HTTP clients      | LLM inference    | Autonomous agents
Dependencies       | Not modeled       | Not modeled      | Explicit
Risk/cost metadata | Not standard      | Not standard     | Built-in
Composability      | Not modeled       | Not modeled      | First-class
Discovery          | API docs, Swagger | Model context    | Skill catalogs

If you already have an OpenAPI spec, you can generate a skill definition from it and add the missing metadata. They are complementary, not competing.
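
As a sketch of that workflow, a skill generated from a hypothetical ticketing API's OpenAPI spec might look like this, with comments marking what the spec provides and what has to be added by hand (the operation, fields, and values are illustrative):

skill:
  # From the OpenAPI spec: operationId, parameters, response schema
  name: create-ticket
  version: "1.0.0"
  description: "Create a support ticket via POST /tickets"
  inputs:
    title: { type: string, required: true }
    priority:
      type: string
      default: "normal"
      validation:
        enum: [low, normal, high]
  outputs:
    ticket_id: { type: string }
  # Added by hand: OpenAPI does not model these
  dependencies:
    infrastructure:
      - name: ticketing-api
        required: true
        reason: "The HTTP endpoint the skill calls"
  metadata:
    estimated_duration: "1-5s"
    risk_level: write           # hypothetical value; the schema above shows read-only
    idempotent: false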