Skills

Codify Institutional Knowledge. Execute It Deterministically.

Skills are reusable operational workflows that encode your organisation's expertise into executable units. Novel problems are handled with reasoning. Solved problems become deterministic skills, executed consistently, at any hour, without escalation.

What is a Skill?

A skill is a defined operational workflow that Hyground executes on your behalf. It knows which data sources to query, what evidence to collect, what actions to take, and how to present findings.

Every team member executes the same workflow with the same rigour, whether it is 3pm or 3am.

01

Manual Trigger

Run any skill on demand from the Hyground interface. Ask an ad-hoc question and get a structured investigation with evidence-backed findings.

02

Scheduled Trigger

Set skills to run on a cron schedule, such as every morning at 7:55am or every Monday, or ahead of every deploy. Findings and executed actions land in your inbox or Slack.
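
As a rough illustration, the 7:55am and Monday examples map onto standard five-field cron expressions. The sketch below is a deliberately minimal matcher that understands only plain numbers and `*`. It is not Hyground's scheduler, and the name `cron_matches` and the 09:00 Monday time are assumptions made for the example.

```python
from datetime import datetime


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a five-field cron expression.

    Simplified: each field is either `*` or a single integer (no ranges,
    lists, or steps), and all five fields must match.
    Day-of-week follows the cron convention: 0 = Sunday, 1 = Monday.
    """
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    actual = (
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,  # isoweekday() is Monday=1..Sunday=7, so Sunday becomes 0
    )
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))


EVERY_MORNING = "55 7 * * *"  # 07:55 every day
EVERY_MONDAY = "0 9 * * 1"    # 09:00 each Monday (the hour is an arbitrary choice)
```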

03

Event Trigger

Skills fire automatically when an alert fires or a deployment is queued. Investigation and initial remediation begin before a human is paged.
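
One way to picture event triggers is a subscription table that maps an event type to the skills it should start. The sketch below assumes hypothetical event names (`alert.fired`, `deployment.queued`) and skill names; it illustrates the idea only and does not describe how Hyground wires events internally.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical event and skill names, for illustration only.
_subscriptions: dict[str, list[str]] = defaultdict(list)


def on_event(event_type: str, skill_name: str) -> None:
    """Subscribe a skill to an event type."""
    _subscriptions[event_type].append(skill_name)


def dispatch(event_type: str, run_skill: Callable[[str], None]) -> list[str]:
    """Run every skill subscribed to the event and return their names in order."""
    fired = list(_subscriptions.get(event_type, []))
    for name in fired:
        run_skill(name)
    return fired


on_event("alert.fired", "incident-investigation")
on_event("deployment.queued", "pre-release-validation")
```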

Agents autonomously load skills on demand

Every skill's description sits in the agent's system prompt, but the full body is fetched only when the agent decides to use it. The agent calls `skill(name, reason)`. The description tells it what the skill does and when it applies. The actual procedure is pulled in just-in-time. Context stays lean even as the skill library grows into the hundreds, and each load is recorded with the agent's stated reason.
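
The loading pattern described above can be sketched as a small library object: descriptions are always available for the prompt, while bodies are returned only when `skill(name, reason)` is called. Only the `skill(name, reason)` call shape comes from the text; the class name, storage layout, and log format below are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SkillLibrary:
    descriptions: dict[str, str]  # always visible to the agent
    bodies: dict[str, str]        # stands in for wherever full procedures are stored
    load_log: list[tuple[str, str]] = field(default_factory=list)

    def system_prompt_section(self) -> str:
        """Only names and descriptions go into the prompt, never the bodies."""
        return "\n".join(
            f"- {name}: {desc}" for name, desc in sorted(self.descriptions.items())
        )

    def skill(self, name: str, reason: str) -> str:
        """Fetch a skill body on demand and record the agent's stated reason."""
        if name not in self.bodies:
            raise KeyError(f"unknown skill: {name}")
        self.load_log.append((name, reason))
        return self.bodies[name]


library = SkillLibrary(
    descriptions={"morning-health-check": "Overnight anomaly sweep; use before standup."},
    bodies={"morning-health-check": "1. Review overnight alerts\n2. Flag anomalies"},
)
```

Because the prompt section is built from descriptions alone, its size grows with the number of skills rather than with the length of their procedures.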

Example Skills

These are examples of skills you can run at any time against your existing data sources and connected systems.

Incident Investigation

Full structured RCA workflow: trigger detection, evidence collection, multi-system investigation, diagnosis, and documented findings.

Morning Health Check

Overnight anomaly sweep across services, metrics, and alerts. Collects evidence, flags issues, and delivers a structured report before standup.

Pre-Release Validation

Checks cluster health, config drift, dependency status, and resource headroom before a deployment is approved. Blocks or reports based on your thresholds.

Security Check

Scans for exposed secrets, misconfigured RBAC, outdated images, and unusual access patterns. Produces an evidence-backed findings report.

Capacity Assessment

Reviews resource utilisation trends, identifies services approaching saturation, and produces growth projections with supporting data.

Post-Incident Report

Generates a structured post-mortem from the investigation record: timeline, root cause, evidence chain, and remediation steps taken.

On-Call Handover Brief

Compiles system state, open alerts, recent changes, and active investigations for incoming on-call engineers.

FMEA Analysis

Failure mode and effects analysis across your service graph. Identifies single points of failure and produces prioritised risk findings.

Weekly Ops Summary

A weekly digest of incidents handled, actions executed, trends, and capacity signals for engineering leadership.

Custom skills: your operational specifics

Built-in skills cover the common cases. Custom skills encode your organisation's specifics: your architecture, your runbooks, your definitions of healthy, and your remediation steps.

Custom skills can be as simple as a plain text checklist or as detailed as a multi-step investigation procedure. If your senior engineer has a mental checklist they run during incidents, that becomes a deterministic, repeatable skill.

  • Plain text or structured format, whichever works for your team
  • Automatically shared across your team when saved
  • Bundle a Python or Bash script alongside skill.md; the agent runs it in a sandboxed workspace with a 30-second timeout
  • Skills can reference any API or MCP server you have configured
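
To make the 30-second limit on bundled scripts concrete, here is a minimal sketch of running a Python script with a hard timeout using the standard library. The function name and return shape are assumptions, and a subprocess with a timeout is not a sandbox by itself; real isolation is outside the scope of this example.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT_TIMEOUT_SECONDS = 30  # the limit stated in the list above


def run_bundled_script(script: Path, workdir: Path) -> tuple[int, str]:
    """Run a bundled Python script and return (exit_code, stdout).

    Returns exit code -1 with an explanatory message if the timeout is hit.
    """
    try:
        result = subprocess.run(
            [sys.executable, str(script)],
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=SCRIPT_TIMEOUT_SECONDS,
        )
    except subprocess.TimeoutExpired:
        return -1, f"timed out after {SCRIPT_TIMEOUT_SECONDS}s"
    return result.returncode, result.stdout
```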

Example: Pre-deploy payment service checklist

  1. Check payment-service pod health in prod
  2. Verify Stripe webhook endpoint is reachable
  3. Confirm database connection pool utilisation < 70%
  4. Check for any open P1 incidents on dependent services
  5. Confirm last deployment succeeded without rollback
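
The checklist above could also be expressed in a structured form, where each step is a named check evaluated against a snapshot of observed state. This is a sketch under assumptions: the snapshot keys are hypothetical, and a real skill would collect these values from your configured data sources rather than from a hand-built dictionary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    description: str
    passed: Callable[[dict], bool]  # receives a snapshot of observed state


# Snapshot keys are hypothetical, chosen to mirror the five checklist steps.
PRE_DEPLOY_CHECKS = [
    Check("payment-service pods healthy in prod", lambda s: s["unhealthy_pods"] == 0),
    Check("Stripe webhook endpoint reachable", lambda s: s["stripe_webhook_reachable"]),
    Check("DB connection pool utilisation < 70%", lambda s: s["db_pool_utilisation"] < 0.70),
    Check("no open P1 incidents on dependencies", lambda s: s["open_p1_incidents"] == 0),
    Check("last deployment succeeded without rollback", lambda s: not s["last_deploy_rolled_back"]),
]


def evaluate(checks: list[Check], snapshot: dict) -> list[str]:
    """Return the descriptions of failed checks; an empty list means all passed."""
    return [c.description for c in checks if not c.passed(snapshot)]
```

Given the same snapshot, the same checks always produce the same result, which is the sense in which a checklist like this is deterministic.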

Operational Knowledge That Survives Attrition

Every new skill added to Hyground is executable by every engineer on your team. Over time, the platform accumulates the operational expertise of your entire organisation, independent of individual headcount.

When a senior engineer leaves, their operational workflows stay. When a new engineer joins, they can run the same workflows from day one. Skills preserve the institutional execution capability that staff retention alone cannot.

Start with Built-ins, Build Your Own

See how skills execute against your infrastructure. We will show you the built-in skills running live and help you build your first custom skill.

See Hyground in action

Check out our sandbox or schedule a demo with our team and experience sovereign AI for DevOps firsthand.