Building stamusctl · Part 1

Two commands to a working NDR stack

A Go CLI with a self-describing template system: OCI-distributed, composable config fragments that the tool discovers at runtime. One question, dozens of configs, zero hardcoded product knowledge.

Clear NDR Community Edition is an open-source network detection and response platform built on Suricata. It’s the successor to SELKS, which Stamus Networks maintained for years. Under the hood it’s a lot of software: Suricata for IDS/IPS, Fluentd for log processing, OpenSearch for indexing, Scirius for rule management and threat hunting, Arkime for PCAP analysis, Evebox for alert management, PostgreSQL, RabbitMQ, Celery workers, NGINX. That’s 10+ containers that all need to talk to each other, with configs that depend on your hardware and network.

I wrote stamusctl to make deploying all of that feel like nothing:

stamusctl compose init
stamusctl compose up

That’s it. First command asks you one question: which network interface to monitor. It pulls templates, generates everything. Second command brings it all up. A few minutes later you have Suricata watching your network, events flowing into OpenSearch, and a web UI where you can manage rules, hunt threats, and investigate alerts.

But the part that really sells it is PCAP replay:

stamusctl compose readpcap capture.pcap

You hand it a packet capture file and it replays it through Suricata. The events get indexed, and suddenly your Scout hunting interface is full of data. DNS queries, HTTP transactions, TLS handshakes, flow records, alerts. You can investigate a network incident from a PCAP without setting up any infrastructure manually. For demos, training, or forensics, this is the fastest path from “I have a PCAP” to “I can see what happened.”

The tool is open-source (GPL-3.0), written in Go, and the code is at github.com/StamusNetworks/stamusctl. The rest of this series is about how it works under the hood.

The architecture: templates all the way down

The key insight behind stamusctl is that the CLI doesn’t know anything about the product it deploys. It doesn’t know what Suricata is. It doesn’t know how many containers Clear NDR needs. It doesn’t know which config files to generate. All of that lives in templates. The CLI is a generic engine that pulls templates, discovers their parameters, prompts the user, renders the output, and wraps Docker Compose. The same binary could deploy a completely different product if you pointed it at different templates.

The pipeline looks like this:

  1. Pull a template bundle from an OCI registry (or use an embedded fallback)
  2. Walk the template tree, extract every parameter and its metadata
  3. Prompt the user only for parameters that have no default and no value from another source
  4. Merge parameter values from multiple sources (defaults → config file → env vars → flags)
  5. Render every template file through Go’s text/template + Sprig
  6. Write the output as a Docker Compose config
  7. Wrap docker compose to manage the resulting stack

Steps 2 and 3 are what make it interesting. The CLI is self-configuring: when a new template version adds a parameter, the CLI discovers it automatically. No code changes, no release, no coordination.

OCI distribution

Templates are distributed as OCI artifacts. Same wire protocol as Docker images, pushed and pulled with the same registries. This was a deliberate choice over alternatives like Git submodules or a custom HTTP API.

OCI gives us several things for free:

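  • Versioning via tags. stamusctl compose init pulls :latest by default, but you can pin a version with stamusctl compose init --template-version 2.3.1. Tags themselves are mutable pointers that can be re-pushed; what makes a pull reproducible is the content digest a tag resolves to, which cannot change without the content changing.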
  • Any registry works. Docker Hub, Harbor, GitHub Container Registry, AWS ECR. Whatever your organization already runs. No new infrastructure to host templates.
  • Content-addressable layers. If a template update only changes the Suricata config, only that layer is pulled. Large bundles don’t mean large downloads on every update.
  • Air-gapped support. Pull the artifact once, push it to an internal registry, and stamusctl works behind the firewall.
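To make the content-addressing point concrete, here is a minimal sketch of the idea, not stamusctl's actual pull code. It assumes the bundle is split into one blob per component; digest and missingLayers are hypothetical helper names. Identical content always hashes to the same digest, so a client only needs to fetch digests it has not seen before.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digest returns an OCI-style content digest ("sha256:<hex>") for a blob.
func digest(blob []byte) string {
	sum := sha256.Sum256(blob)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// missingLayers returns the digests in wanted that are not in the local cache.
func missingLayers(wanted []string, cache map[string]bool) []string {
	var out []string
	for _, d := range wanted {
		if !cache[d] {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	// Illustrative blobs; the real layer layout is an assumption.
	suricataV1 := digest([]byte("suricata config v1"))
	suricataV2 := digest([]byte("suricata config v2"))
	nginx := digest([]byte("nginx config"))

	// The previous pull cached both layers.
	cache := map[string]bool{suricataV1: true, nginx: true}

	// The new bundle reuses the NGINX layer and changes only the Suricata one,
	// so exactly one digest needs to be fetched.
	fmt.Println(missingLayers([]string{suricataV2, nginx}, cache))
}
```

The same property is why pinning by digest is stronger than pinning by tag: the digest is derived from the bytes themselves.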

The templates live in a separate repo (stamusctl-templates) with their own CI and versioning. The split is deliberate: the tool logic and the product configuration have different release cycles. A new Suricata version might need template changes but zero CLI changes. A new CLI feature doesn’t touch any templates. In practice, templates ship 3-4x more often than the CLI.

For environments with no registry access at all, templates are also compiled into the binary via go:embed:

//go:embed clearndr/*
var AllConf embed.FS

The embedded copy acts as a fallback. stamusctl compose init tries the registry first, then falls back to the embedded bundle. This means the tool works on a freshly imaged machine with no network.
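The registry-first, embedded-second behaviour can be sketched roughly like this. It is an illustration under assumptions, not the real implementation: pullFunc, loadBundle, and the two fake pull functions are hypothetical, and fstest.MapFS stands in for the go:embed filesystem so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// pullFunc is a hypothetical stand-in for the code that fetches a bundle
// from an OCI registry.
type pullFunc func(ref string) (fs.FS, error)

// embeddedBundle mimics the go:embed filesystem. MapFS keeps the sketch
// self-contained; the real binary would use its embed.FS variable instead.
func embeddedBundle() fs.FS {
	return fstest.MapFS{
		"clearndr/compose.yaml.tmpl": {Data: []byte("services: {}\n")},
	}
}

// offlinePull simulates a machine with no network access.
func offlinePull(ref string) (fs.FS, error) {
	return nil, errors.New("registry unreachable: " + ref)
}

// fakeRegistryPull simulates a successful pull.
func fakeRegistryPull(ref string) (fs.FS, error) {
	return fstest.MapFS{
		"clearndr/compose.yaml.tmpl": {Data: []byte("# pulled from " + ref + "\n")},
	}, nil
}

// loadBundle tries the registry first and falls back to the embedded copy.
// The second return value names the source that was used.
func loadBundle(pull pullFunc, ref string, embedded fs.FS) (fs.FS, string) {
	if bundle, err := pull(ref); err == nil {
		return bundle, "registry"
	}
	return embedded, "embedded"
}

func main() {
	ref := "registry.example/stamus/templates:latest" // placeholder reference
	for _, pull := range []pullFunc{fakeRegistryPull, offlinePull} {
		bundle, source := loadBundle(pull, ref, embeddedBundle())
		data, err := fs.ReadFile(bundle, "clearndr/compose.yaml.tmpl")
		fmt.Printf("source=%s err=%v content=%q\n", source, err, data)
	}
}
```

Because both sources satisfy fs.FS, everything downstream of loadBundle can stay unaware of where the templates came from.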

Self-describing templates

This is the part I’m most proud of. A template looks like this:

services:
  suricata:
    image: {{ .SuricataImage }}
    network_mode: host
    cap_add:
      - net_admin
      - sys_nice
    volumes:
      - {{ .ConfigPath }}/suricata/etc:/etc/suricata
{{ if .EnablePCAP }}
      - {{ .PCAPPath }}:/data/pcap
{{ end }}
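For readers who have not used Go's text/template, rendering a fragment like that looks roughly as follows. This is a sketch with a trimmed copy of the template and placeholder values; the render helper is hypothetical, Sprig is left out to keep it stdlib-only, and the trim markers ({{- ... }}) are my addition to avoid blank lines in the output.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// suricataTmpl is a shortened version of the fragment shown above.
const suricataTmpl = `services:
  suricata:
    image: {{ .SuricataImage }}
    volumes:
      - {{ .ConfigPath }}/suricata/etc:/etc/suricata
{{- if .EnablePCAP }}
      - {{ .PCAPPath }}:/data/pcap
{{- end }}
`

// render parses src and executes it against values.
func render(src string, values map[string]any) (string, error) {
	// missingkey=error turns a forgotten parameter into a hard failure
	// instead of silently printing "<no value>".
	t, err := template.New("suricata").Option("missingkey=error").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, values); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render(suricataTmpl, map[string]any{
		"SuricataImage": "example/suricata:1.0", // placeholder values
		"ConfigPath":    "/opt/stamus",
		"EnablePCAP":    true,
		"PCAPPath":      "/data/captures",
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```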

But the parameters aren’t hardcoded in the CLI. They’re declared in a metadata section within the template bundle itself, and the CLI loads each declaration into a struct like this:

type Parameter struct {
    Name         string
    Type         string  // "string", "bool", "int"
    Default      Variable
    Choices      []Variable
    ValidateFunc func(Variable) bool
}

When compose init runs, it walks every template file, follows include directives, and builds a complete parameter set. The walking handles cycles: an include graph is tracked with a visited map and capped at depth 20, so templates can compose each other without infinite loops.
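A minimal sketch of that walk, assuming a simplified in-memory view of the bundle. The fragment type and collectParams function are hypothetical; only the visited map and the depth cap of 20 come from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

const maxDepth = 20 // depth cap mentioned in the text

// fragment is a hypothetical in-memory view of one template file.
type fragment struct {
	Params   []string // parameter names the fragment declares
	Includes []string // other fragments it includes
}

// collectParams walks the include graph from root and returns every declared
// parameter name once, sorted.
func collectParams(bundle map[string]fragment, root string) ([]string, error) {
	visited := map[string]bool{}
	seen := map[string]bool{}

	var walk func(name string, depth int) error
	walk = func(name string, depth int) error {
		if visited[name] {
			return nil // already processed: this is what breaks cycles
		}
		if depth > maxDepth {
			return fmt.Errorf("include depth exceeds %d at %q", maxDepth, name)
		}
		visited[name] = true

		frag, ok := bundle[name]
		if !ok {
			return fmt.Errorf("unknown fragment %q", name)
		}
		for _, p := range frag.Params {
			seen[p] = true
		}
		for _, inc := range frag.Includes {
			if err := walk(inc, depth+1); err != nil {
				return err
			}
		}
		return nil
	}

	if err := walk(root, 0); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(seen))
	for p := range seen {
		names = append(names, p)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Two fragments that include each other: the walk still terminates.
	bundle := map[string]fragment{
		"compose.yaml.tmpl":           {Params: []string{"suricata.interface"}, Includes: []string{"services/suricata.yaml.tmpl"}},
		"services/suricata.yaml.tmpl": {Params: []string{"suricata.image"}, Includes: []string{"compose.yaml.tmpl"}},
	}
	fmt.Println(collectParams(bundle, "compose.yaml.tmpl"))
}
```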

The Variable type uses pointers to distinguish “not set” from “set to zero value.” A nil string pointer means “use the default.” A pointer to an empty string means “the user explicitly set this to empty.” This distinction matters when merging values from multiple sources, because you need to know whether the user made an active choice.
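Here is a pared-down sketch of the nil-versus-empty distinction. It models only the string case, and the Str field, Set, IsSet, and resolve are hypothetical names rather than the real Variable API.

```go
package main

import "fmt"

// Variable is a simplified stand-in that only carries a string.
// A nil Str means "not set"; a non-nil Str is an explicit choice,
// even when it points at the empty string.
type Variable struct {
	Str *string
}

// Set builds a Variable holding an explicit value.
func Set(s string) Variable { return Variable{Str: &s} }

// IsSet reports whether a value was explicitly provided.
func (v Variable) IsSet() bool { return v.Str != nil }

// resolve returns the highest-priority override that is set, or def if none
// is. Overrides are passed lowest priority first.
func resolve(def Variable, overrides ...Variable) Variable {
	for i := len(overrides) - 1; i >= 0; i-- {
		if overrides[i].IsSet() {
			return overrides[i]
		}
	}
	return def
}

func main() {
	def := Set("eth0")

	unset := resolve(def, Variable{})
	fmt.Printf("unset override    -> %q\n", *unset.Str)

	empty := resolve(def, Set(""))
	fmt.Printf("explicit empty    -> %q\n", *empty.Str)
}
```

With a plain string field, both cases would look identical and the explicit empty value would be overwritten by the default.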

Template composition

A Clear NDR deployment isn’t one big template; it’s a tree. The root template includes per-service fragments:

compose.yaml.tmpl
├── services/suricata.yaml.tmpl
├── services/opensearch.yaml.tmpl
├── services/fluentd.yaml.tmpl
├── services/scirius.yaml.tmpl
├── services/arkime.yaml.tmpl
├── services/nginx.yaml.tmpl
└── configs/
    ├── suricata/suricata.yaml.tmpl
    ├── fluentd/fluent.conf.tmpl
    └── nginx/nginx.conf.tmpl

Each fragment declares its own parameters. The Suricata template needs the network interface and capture method. The OpenSearch template needs heap size and cluster name. NGINX needs the TLS certificate paths. The CLI merges them all into a single parameter set, deduplicates, and presents one unified prompt.

This composability is what lets the template repo evolve without the CLI knowing. When we added Arkime for PCAP analysis, it was a new service fragment and two new parameters. Zero CLI changes. When we added optional CrowdSec integration, it was a conditional include guarded by a boolean parameter. Again, zero CLI changes.
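A conditional include guarded by a boolean can be sketched with nothing more than text/template's define and template actions. This is an illustration, not the actual stamusctl include mechanism: the fragment names, the EnableCrowdSec and CrowdSecImage parameters, and renderRoot are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// bundleSrc holds a root template and one optional service fragment.
const bundleSrc = `{{ define "crowdsec" }}
  crowdsec:
    image: {{ .CrowdSecImage }}
{{- end }}
{{- define "root" -}}
services:
  suricata:
    image: {{ .SuricataImage }}
{{- if .EnableCrowdSec }}{{ template "crowdsec" . }}{{ end }}
{{ end }}`

// renderRoot executes the "root" template, which pulls in the CrowdSec
// fragment only when EnableCrowdSec is true.
func renderRoot(values map[string]any) (string, error) {
	t, err := template.New("bundle").Parse(bundleSrc)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.ExecuteTemplate(&sb, "root", values); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, enabled := range []bool{false, true} {
		out, err := renderRoot(map[string]any{
			"SuricataImage":  "example/suricata:1.0", // placeholder images
			"CrowdSecImage":  "example/crowdsec:1.0",
			"EnableCrowdSec": enabled,
		})
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Printf("--- EnableCrowdSec=%v\n%s", enabled, out)
	}
}
```

Nothing in renderRoot mentions CrowdSec; the optional service exists only in the template text and its parameters.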

Sprig functions handle the logic within templates: string manipulation, math for computing heap sizes from available memory, list operations for building volume mount arrays. The templates are more capable than they look. There’s real logic in them, but it’s all declarative and contained within the template bundle.
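As an example of that kind of in-template logic, the sketch below derives a JVM heap size from available memory. The formula (half of RAM, capped at 31 GB) is a common OpenSearch rule of thumb that I am assuming here, not necessarily what the real templates do, and the div and min helpers are stdlib stand-ins for Sprig's functions of the same names so the example has no third-party dependency.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// funcs are small stand-ins for the math helpers a library like Sprig provides.
var funcs = template.FuncMap{
	"div": func(a, b int) int { return a / b },
	"min": func(a, b int) int {
		if a < b {
			return a
		}
		return b
	},
}

// opensearchTmpl computes the heap once and reuses it for -Xms and -Xmx.
const opensearchTmpl = `{{- $heap := min (div .MemoryGB 2) 31 -}}
environment:
  OPENSEARCH_JAVA_OPTS: "-Xms{{ $heap }}g -Xmx{{ $heap }}g"
`

// renderHeap renders the fragment for a host with memGB gigabytes of RAM.
func renderHeap(memGB int) (string, error) {
	t, err := template.New("opensearch").Funcs(funcs).Parse(opensearchTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]any{"MemoryGB": memGB}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, mem := range []int{16, 128} {
		out, err := renderHeap(mem)
		if err != nil {
			fmt.Println("render failed:", err)
			return
		}
		fmt.Printf("--- %d GB host\n%s", mem, out)
	}
}
```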

Parameter cascading

Parameter values resolve through four layers, and each later layer overrides the ones before it:

  1. Template defaults: baked into the parameter declaration
  2. Config file: ~/.config/stamusctl/config.yaml or --config flag
  3. Environment variables: STAMUS_ prefix, dots become underscores (STAMUS_SURICATA_INTERFACE=eth0)
  4. CLI flags: --set suricata.interface=eth0

Viper handles the merging. The config list command shows the fully resolved parameter set with the source of each value, which is essential for debugging “why did it use that image tag” questions.
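The layering and source tracking can be illustrated without Viper. The sketch below is a stdlib-only approximation under assumptions: layer, resolved, envKey, envLayer, and resolveAll are hypothetical names, and the real tool delegates this work to Viper rather than hand-rolling it.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// layer is one source of parameter values.
type layer struct {
	Source string
	Values map[string]string
}

// resolved pairs a final value with the layer it came from.
type resolved struct {
	Value  string
	Source string
}

// envKey maps "suricata.interface" to "STAMUS_SURICATA_INTERFACE".
func envKey(param string) string {
	return "STAMUS_" + strings.ToUpper(strings.ReplaceAll(param, ".", "_"))
}

// envLayer builds the environment layer for the given parameter names.
// lookup would be os.LookupEnv in real use; it is injected to keep the
// sketch deterministic.
func envLayer(params []string, lookup func(string) (string, bool)) layer {
	vals := map[string]string{}
	for _, p := range params {
		if v, ok := lookup(envKey(p)); ok {
			vals[p] = v
		}
	}
	return layer{Source: "env", Values: vals}
}

// resolveAll applies layers in order, so later layers override earlier ones.
func resolveAll(layers []layer) map[string]resolved {
	out := map[string]resolved{}
	for _, l := range layers {
		for k, v := range l.Values {
			out[k] = resolved{Value: v, Source: l.Source}
		}
	}
	return out
}

func main() {
	fakeEnv := map[string]string{"STAMUS_SURICATA_INTERFACE": "eth1"}
	lookup := func(k string) (string, bool) { v, ok := fakeEnv[k]; return v, ok }
	params := []string{"suricata.interface", "opensearch.heap"}

	final := resolveAll([]layer{
		{Source: "default", Values: map[string]string{"opensearch.heap": "4g"}},
		{Source: "config", Values: map[string]string{"suricata.interface": "eth0"}},
		envLayer(params, lookup),
		{Source: "flag", Values: map[string]string{}},
	})

	sort.Strings(params)
	for _, p := range params {
		fmt.Printf("%-20s %-6s (from %s)\n", p, final[p].Value, final[p].Source)
	}
}
```

Recording the source next to each value is what makes a listing like config list useful for debugging.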

For the interactive compose init flow, the CLI only prompts for parameters that don’t have a value from any source and don’t have a default. In practice, this means one question: which network interface. Everything else has sensible defaults. But an operator in a CI pipeline can set everything via environment variables and get a fully non-interactive init.
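The prompting rule boils down to a small filter. The following is a sketch with hypothetical names (param, toPrompt, strPtr) and made-up parameter names, showing how a single question falls out of "no default and no supplied value".

```go
package main

import "fmt"

// param is a pared-down parameter declaration. A nil Default means the
// template declares no default.
type param struct {
	Name    string
	Default *string
}

// strPtr is a small helper for building defaults in literals.
func strPtr(s string) *string { return &s }

// toPrompt returns the parameters that must be asked interactively: those
// with no default and no value supplied by a config file, env var, or flag.
func toPrompt(params []param, provided map[string]string) []string {
	var ask []string
	for _, p := range params {
		if _, ok := provided[p.Name]; ok {
			continue // a higher layer already answered this one
		}
		if p.Default != nil {
			continue // the template default is good enough
		}
		ask = append(ask, p.Name)
	}
	return ask
}

func main() {
	params := []param{
		{Name: "suricata.interface"}, // no default: depends on the host
		{Name: "opensearch.heap", Default: strPtr("4g")},
		{Name: "nginx.tls.cert", Default: strPtr("/etc/ssl/cert.pem")},
	}

	// Interactive run: nothing supplied, so exactly one question remains.
	fmt.Println(toPrompt(params, nil))

	// CI run: the interface comes from the environment, so nothing is asked.
	fmt.Println(toPrompt(params, map[string]string{"suricata.interface": "eth0"}))
}
```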

The end result: one question for the user, dozens of configs generated correctly. Suricata gets the right interface, Fluentd gets the right output targets, OpenSearch gets reasonable heap sizes based on available memory, NGINX gets TLS configured. All from one init.