Architecture

How Hyground turns a signal into action

A signal arrives, agents investigate in a sandboxed workspace, and the result lands in the tools your team already runs. Every step happens inside your own infrastructure, on the model you choose.

The whole picture

Three stages, one loop

Inbound adapters carry alerts, chats, and messages into Hyground. Agents reason over them, act in a sandboxed shell, and draw on your runbooks and past investigations. What comes back is a filed ticket, written documentation, or an answered thread. Nothing leaves your cluster.

Bring your own model

Agents reach their model through LiteLLM, so the choice stays yours: Anthropic, OpenAI, Google, or a model you host yourself. Swap providers without touching the rest of the system. No vendor lock-in, no data sent to a model you didn't pick.

Inside Hyground

Four parts do the work: agents that reason, adapters that read your systems, a sandbox where they act, and a memory of how you operate.

Agents

A multi-agent system that plans, investigates, and decides. Specialised agents work logs, metrics, and knowledge in parallel, then aggregate what they find into one answer.

Adapters

Hardened, read-only wrappers around the tools you already trust: Kubernetes, cloud, observability, databases. They refuse to start if the principal can write.

Workspace

A sandboxed shell where the agents do the actual work. Each task runs in an isolated container with a fixed set of pre-authenticated CLIs: kubectl, logcli, promtool, psql. Nothing else.

Knowledgebase

Your runbooks, documentation, and past investigations, embedded for retrieval. Agents recall how your systems work and how you fixed things last time.

A real run

Same incident. Different experience.

What it takes to get from an alert to a structured root cause, with and without Hyground in the loop.

Without Hyground

00:00
Alert fires. The on-call engineer is paged and starts switching between dashboards.
+5m
Logs and metrics opened in separate tabs. Manual filtering for the affected service starts.
+15m
Cross-team Slack thread spun up to find someone who knows the service and the recent changes.
+45m
Timestamps correlated by hand across logs, traces and deployment history.
+2h
A likely root cause emerges after trial and error and a hunt through old incident notes.
+3h
Findings written up manually. The Jira ticket is filed and the post-mortem is scheduled.

Total time: around 3 hours

Heavily dependent on who is on-call and what they remember from the last similar incident.

With Hyground

00:00
An alert arrives at the ingest endpoint. The payload passes an injection and jailbreak check before any work starts.
+30s
The agent plans the investigation and executes it.
+2m
Logs, metrics, traces and deployment history are queried in parallel from a read-only execution environment.
+4m
Relevant runbooks and past investigations are pulled from the knowledge base, so the reasoning is grounded in how your systems actually work.
+6m
A structured root cause is delivered: affected services, supporting evidence, recommended next actions, every step recorded.
+1h
A post-mortem draft is written automatically from the investigation: timeline, root cause, contributing factors, action items.

Time to root cause: around 6 minutes

Every step runs inside your cluster. Every query, every piece of evidence, every reasoning step is recorded and replayable.

Connected to everything you run

The same loop plugs into the tools your team already lives in, inbound and outbound.

Signals in

Alerts, chats, and AI tools reach Hyground through inbound adapters.

Alerts: Prometheus, PagerDuty, Grafana via Herald
Chat: Slack, Teams, Email
AI tools: Claude Code, Cursor, any MCP client

See all integraions

Results out

Findings come back as work, not notifications, in the tools your team already uses.

Tickets: Jira, ServiceNow, GitHub, GitLab
Conversations: Slack, Teams, Email
Code & docs: commits and PRs via GitHub, GitLab

Self-hosted & sovereign

Security is enforced at the architecture level

Hyground ships as a Kubernetes Helm chart and runs entirely inside your perimeter. There is no Hyground SaaS in the path: no control plane, no phone-home, no operator access into your cluster.

See the full security model

Runs in your cluster

A self-hosted Helm chart with no vendor access path. No SaaS control plane, no central key, no shared infrastructure. The vendor has no operational route in.

We never see your data

Zero data egress, zero telemetry, no phone-home. The only outbound traffic is the prompt to the LLM you chose, and secrets are stripped before the model sees them. Host that model yourself and nothing leaves at all.

Hardened and read-only

Chainguard distroless images, non-root, read-only root filesystem. Adapters are read-only by default and refuse to start if the principal can write.

See Hyground in action

Check out our sandbox or schedule a demo with our team and experience sovereign AI firsthand.

Try our sandbox Book a demo