Helm Quickstart

This guide walks an operator from a fresh Kubernetes cluster to a working openZro deployment with the dashboard, management API, signal, relay (with TLS), and an embedded Dex IdP backed by external PostgreSQL.

The chart's defaults mirror the openZro production deployment: a consolidated Ingress for HTTP routes, cert-manager for all certificates (including the relay LoadBalancer cert), and external PostgreSQL via Cloud SQL or any managed/self-hosted Postgres. SQLite still works for labs but is not the default production target.

Prerequisites

  • Kubernetes 1.27+ with kubectl access
  • Helm 3.12+
  • ingress-nginx installed (IngressClass named nginx)
  • cert-manager installed with a working ClusterIssuer (Let's Encrypt or zero-ssl-prod work equally well)
  • A DNS record pointing at your Ingress controller's external IP — this guide uses openzro.example.com throughout
  • PostgreSQL reachable from the cluster — Cloud SQL, AWS RDS, or a self-hosted instance. The user openzro needs CREATEDB privilege if you want the chart's idempotent provisioning Job to create the four databases for you (otherwise pre-create them with the runtime user as OWNER and disable the Job)
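The version requirements above can be checked from a shell before installing. This is a minimal sketch, not part of the chart: normalize_version and version_ge are hypothetical helper names, and the comparison assumes a sort that supports -V (GNU coreutils or a recent BSD).

```shell
# Hypothetical helpers for checking the prerequisites listed above.

# Strip a leading "v" and any build suffix: v3.12.3+g3a31588 -> 3.12.3
normalize_version() {
  printf '%s\n' "$1" | sed -E 's/^v//; s/[+-].*$//'
}

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use against a live cluster (not executed here):
#   version_ge "$(normalize_version "$(helm version --short)")" 3.12 \
#     || echo "Helm is older than 3.12"
#   kubectl get ingressclass nginx     # IngressClass must exist
#   kubectl get clusterissuer          # at least one issuer should be Ready
```

The same version_ge call works for the Kubernetes server version once you have extracted it, for example from kubectl version -o json.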

1. Add the Helm repo

helm repo add openzro https://openzro.github.io/helms
helm repo update

OCI also works:

helm install --version 2.1.0-alpha.27 \
  openzro oci://ghcr.io/openzro/charts/openzro

2. Author your values override

Save this as my-openzro.yaml. It is the minimum viable production configuration; adjust the placeholders before applying.

global:
  namespace: openzro

# Consolidated Ingress for HTTP routes — dashboard, management /api,
# Dex /dex all on the same hostname/cert via path matching. Relay
# does NOT use Ingress (TCP/WebSocket persistent — see step below).
ingress:
  enabled: true
  className: nginx
  host: openzro.example.com
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/proxy-body-size: 200m
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  tls:
    - secretName: openzro-tls
      hosts:
        - openzro.example.com
  dashboard:
    path: /
    pathType: Prefix
  dex:
    path: /dex
    pathType: Prefix
  management:
    enabled: true
    path: /api
    pathType: Prefix

postgres:
  enabled: true
  host: "10.x.x.x"            # private IP of your Postgres / Cloud SQL
  port: 5432
  sslMode: require            # ENCRYPTED_ONLY in Cloud SQL terms
  username: openzro
  password: "REPLACE_WITH_PG_PASSWORD"
  databases:
    management: openzro
    flow: openzro_flow
    activity: openzro_activity
    dex: dex
  provisioning:
    enabled: true             # set false if DBA pre-creates the 4 DBs
    username: openzro
    password: "REPLACE_WITH_PG_PASSWORD"

# HA broker — embedded NATS+JetStream cluster across StatefulSet pods.
cluster:
  mode: embedded
  embedded:
    clientPort: 4222
    clusterPort: 6222
    jetstream:
      storage: memory         # locks bucket only; PVCs not required

management:
  replicaCount: 3
  dnsDomain: mesh.example.com
  config:
    dataStoreEncryptionKey: "REPLACE_WITH_BASE64_32_BYTES"   # openssl rand -base64 32
    relay:
      addresses: []           # auto-derived from relay.publicHostname + tls
      credentialsTTL: "24h"
      secret: "REPLACE_WITH_RELAY_HMAC"                      # openssl rand -hex 32
    signal:
      proto: https
      uri: "signal.example.com:443"
    httpConfig:
      address: "0.0.0.0:33071"
      authIssuer: "https://openzro.example.com/dex"
      authAudience: "openzro-dashboard"
      authUserIDClaim: "sub"
      oidcConfigEndpoint: "https://openzro.example.com/dex/.well-known/openid-configuration"
      idpSignKeyRefreshEnabled: true
    pkceAuthorizationFlow:
      providerConfig:
        audience: "openzro-dashboard"
        clientId: "openzro-dashboard"
        clientSecret: ""        # public client + PKCE; no secret
        domain: "openzro.example.com"
        authorizationEndpoint: "https://openzro.example.com/dex/auth"
        tokenEndpoint: "https://openzro.example.com/dex/token"
        # `groups` here AND in dashboard.env.AUTH_SUPPORTED_SCOPES are
        # what gates JWT Group Sync (Settings → Groups). Cheap to leave
        # on even when no upstream connector emits groups — Dex just
        # omits the claim. See the "JWT Group Sync" section below.
        scope: "openid profile email offline_access groups"
        redirectURLs:
          - "http://localhost:53000/"
          - "http://localhost:54000/"
          - "http://localhost:55000/"
        # Dex puts groups in id_token only (access_token is minimal by
        # design). The `openzro` CLI must use id_token as Bearer or the
        # management never sees the claim. Same reason as the dashboard.
        useIDToken: true
    deviceAuthorizationFlow:
      provider: hosted
      providerConfig:
        audience: "openzro-dashboard"
        clientId: "openzro-dashboard"
        domain: "openzro.example.com"
        tokenEndpoint: "https://openzro.example.com/dex/token"
        deviceAuthEndpoint: "https://openzro.example.com/dex/device/code"
        scope: "openid profile email offline_access groups"
        useIDToken: true
  ingress:
    enabled: false            # consolidated Ingress carries /api
  ingressGrpc:
    enabled: true             # gRPC needs its own host/cert
    className: nginx
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt-prod
      nginx.ingress.kubernetes.io/backend-protocol: GRPC
    hosts:
      - host: grpc.openzro.example.com
        paths:
          - path: /management.ManagementService
            pathType: ImplementationSpecific
    tls:
      - secretName: grpc-openzro-tls
        hosts:
          - grpc.openzro.example.com

signal:
  replicaCount: 3
  # Same pattern as relay: LoadBalancer + TLS on the signal binary.
  # Bypasses ingress-nginx (community), which has a 2020-vintage bug
  # that breaks gRPC server-streaming initial metadata — peers hang
  # on the registration header. See HA docs for details.
  publicHostname: signal.example.com
  containerPort: 443
  service:
    type: LoadBalancer
    port: 443
  tls:
    enabled: true
    certManager:
      enabled: true
      issuerRef:
        kind: ClusterIssuer
        name: letsencrypt-prod
      duration: 2160h
      renewBefore: 360h
  ingress:
    enabled: false   # bypass community ingress-nginx

relay:
  enabled: true
  replicaCount: 2             # multi-pod fabric auto-on (ADR-0014)
  publicHostname: relay.example.com
  service:
    type: LoadBalancer        # Ingress is NOT supported — TCP/WS persistent
    port: 33080
  tls:
    enabled: true
    certManager:
      enabled: true
      issuerRef:
        kind: ClusterIssuer
        name: letsencrypt-prod
      duration: 2160h
      renewBefore: 360h

dex:
  enabled: true
  replicaCount: 2
  config:
    issuer: https://openzro.example.com/dex
    storage:
      type: postgres
      config:
        host: 10.x.x.x        # same DB server, dex DB
        port: 5432
        database: dex
        user: openzro
        password: "REPLACE_WITH_PG_PASSWORD"
        ssl:
          mode: require
    web:
      http: 0.0.0.0:5556
      allowedOrigins:
        - https://openzro.example.com
    grpc:
      addr: 0.0.0.0:5557
      tlsCert: ""             # plaintext gRPC in-cluster (default)
      tlsKey: ""              # operator opt-in mTLS later via cert-manager
      tlsClientCA: ""
    oauth2:
      responseTypes: [code]
      skipApprovalScreen: true
    enablePasswordDB: true
    staticPasswords:
      # Generate the bcrypt with:
      # htpasswd -bnBC 10 "" yourpassword | tr -d ':\n' | sed 's/$2y/$2a/'
      - email: admin@example.com
        hash: "REPLACE_WITH_BCRYPT_HASH"
        username: admin
        userID: openzro-admin
    staticClients:
      - id: openzro-dashboard
        name: openZro
        public: true          # PKCE; no client secret
        redirectURIs:
          - https://openzro.example.com/auth
          - https://openzro.example.com/silent-auth
          - https://openzro.example.com/
          - http://localhost:53000/
          - http://localhost:54000/
          - http://localhost:55000/
          - /device/callback
    connectors: []            # add Google/Microsoft/etc at runtime via dashboard
    logger:
      level: info
      format: text

dashboard:
  replicaCount: 2
  env:
    USE_AUTH0: "false"
    AUTH_AUTHORITY: "https://openzro.example.com/dex"
    AUTH_CLIENT_ID: "openzro-dashboard"
    AUTH_CLIENT_SECRET: ""
    AUTH_SUPPORTED_SCOPES: "openid profile email offline_access groups"
    AUTH_AUDIENCE: "openzro-dashboard"
    AUTH_REDIRECT_URI: "/auth"
    AUTH_SILENT_REDIRECT_URI: "/silent-auth"
    OPENZRO_MGMT_API_ENDPOINT: "https://openzro.example.com"
    OPENZRO_MGMT_GRPC_API_ENDPOINT: "https://grpc.openzro.example.com"
    OPENZRO_TOKEN_SOURCE: "idToken"
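Before installing, it can help to confirm that no placeholder from the example survived editing. The sketch below is an illustration only: check_placeholders is a hypothetical helper, and it looks for just the two placeholder patterns used in this guide (REPLACE_WITH_ and 10.x.x.x).

```shell
# Hypothetical pre-flight check for the values file from this step.
# Prints every line that still contains a placeholder (with its line
# number) and returns 1 if any were found, 0 otherwise.
check_placeholders() {
  if grep -nE 'REPLACE_WITH_|10\.x\.x\.x' "$1"; then
    return 1
  fi
  return 0
}

# Example use (not executed here):
#   check_placeholders my-openzro.yaml || echo "fix the lines above first"
```

Hostnames such as openzro.example.com are not checked, so review those by hand.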

3. Install the chart

helm install openzro openzro/openzro \
  --create-namespace \
  --namespace openzro \
  -f my-openzro.yaml

A pre-install Job creates the four PostgreSQL databases if they don't exist (openzro, openzro_flow, openzro_activity, dex, matching postgres.databases in the values). After that, management and Dex run their own schema migrations on boot.
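To confirm the databases exist after the install, the list reported by PostgreSQL can be compared with the names under postgres.databases in the values from step 2. A minimal sketch, assuming psql is available somewhere that can reach the database; missing_databases is a hypothetical helper that reads one database name per line on stdin.

```shell
# Hypothetical check: print each expected database that is absent from
# the list supplied on stdin (one name per line). No output means all
# four databases from the example values are present.
missing_databases() {
  existing=$(cat)
  for db in openzro openzro_flow openzro_activity dex; do
    printf '%s\n' "$existing" | grep -qxF "$db" || printf '%s\n' "$db"
  done
}

# Example use (not executed here; host and credentials are placeholders):
#   psql "host=10.x.x.x port=5432 user=openzro dbname=postgres sslmode=require" \
#     -Atc 'SELECT datname FROM pg_database' | missing_databases
```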

4. DNS for the LoadBalancer Services

Both relay and signal come up as LoadBalancer. Get their external IPs and update DNS:

kubectl get svc -n openzro openzro-relay openzro-signal
# NAME             TYPE           EXTERNAL-IP    PORT(S)
# openzro-relay    LoadBalancer   34.x.x.x       33080:30869/TCP
# openzro-signal   LoadBalancer   34.y.y.y       443:31000/TCP

Create these DNS records:

Hostname              Value
relay.example.com     A <openzro-relay EXTERNAL-IP>
signal.example.com    A <openzro-signal EXTERNAL-IP>
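The mapping from Service to DNS record can also be scripted. The sketch below is an assumption-laden illustration: dns_records is a hypothetical helper, the hostnames are the example ones from this guide, and it expects "name ip" pairs on stdin. Some cloud load balancers publish a hostname instead of an IP, in which case a CNAME is needed and this helper does not apply.

```shell
# Hypothetical helper: turn "service-name external-ip" lines into the
# A records this guide asks for. Services whose IP is still pending
# (kubectl prints <none>) are skipped.
dns_records() {
  awk '
    { host = "" }
    $1 == "openzro-relay"  { host = "relay.example.com" }
    $1 == "openzro-signal" { host = "signal.example.com" }
    host != "" && $2 != "" && $2 != "<none>" { printf "%s A %s\n", host, $2 }
  '
}

# Example use (not executed here):
#   kubectl get svc -n openzro openzro-relay openzro-signal --no-headers \
#     -o 'custom-columns=NAME:.metadata.name,IP:.status.loadBalancer.ingress[0].ip' \
#     | dns_records
```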

cert-manager handles certificate issuance via DNS-01 (or whatever challenge your ClusterIssuer uses). With DNS-01 there is no chicken-and-egg problem with DNS: the cert is provisioned through the DNS provider API, not by reaching the host.

5. Verify

kubectl -n openzro get pods
# Expect (replicaCount=3 management/signal, 2 relay/dashboard/dex):
#   openzro-management-0      1/1   Running
#   openzro-management-1      1/1   Running
#   openzro-management-2      1/1   Running
#   openzro-signal-0          1/1   Running
#   ...
#   openzro-relay-...         1/1   Running   (x2)
#   openzro-dashboard-...     1/1   Running   (x2)
#   openzro-dex-...           1/1   Running   (x2)

curl -s https://openzro.example.com/dex/.well-known/openid-configuration | jq .issuer
# "https://openzro.example.com/dex"

curl -sI https://openzro.example.com/api/users
# HTTP/2 401   ← expected (no auth header), management is reachable
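With this many replicas, scanning the pod list by eye gets tedious. As a rough sketch, the hypothetical not_ready_pods helper below filters kubectl's default pod table down to the pods that are not fully ready. It assumes the standard column order (NAME, READY, STATUS, ...) and ignores pods in Completed state, such as a finished Job.

```shell
# Hypothetical filter over `kubectl get pods --no-headers` output.
# Prints the name of every pod whose READY count is incomplete or whose
# STATUS is not Running. Completed pods (finished Jobs) are skipped.
not_ready_pods() {
  awk '
    $3 == "Completed" { next }
    {
      split($2, ready, "/")
      if (ready[1] != ready[2] || $3 != "Running") print $1
    }
  '
}

# Example use (not executed here):
#   kubectl -n openzro get pods --no-headers | not_ready_pods
```

No output means every listed pod is Running with all containers ready.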

Open https://openzro.example.com/ and sign in with the bootstrap admin email and the password you used to generate the bcrypt hash. From Settings → Identity Providers, wire your corporate IdP (Microsoft Entra, Google Workspace, Okta, Keycloak, generic OIDC) at runtime.

JWT Group Sync (when using an upstream IdP)

When you wire a corporate IdP (Keycloak, Microsoft Entra, Google Workspace, Okta, generic OIDC) under Settings → Identity Providers, groups from the upstream can flow into openZro Groups and be used in Policies — but only if four moving parts line up:

  1. Upstream IdP emits groups in the OIDC token. For Keycloak this is the Group Membership mapper on the groups client scope, with Add to ID token: On and Token Claim Name: groups. See the Keycloak guide.

  2. Dex requests groups from upstream and re-emits them. When you add the IdP via the dashboard, openZro creates a Dex OIDC connector with insecureEnableGroups: true, scopes including groups, and getUserInfo: false. The getUserInfo: false matters: with true, Dex calls the upstream /userinfo endpoint after token exchange and overwrites the claims — and Keycloak's userinfo doesn't carry groups by default, so the claim disappears.

  3. The dashboard requests groups and uses the id_token. Both are in the example values above:

    dashboard:
      env:
        AUTH_SUPPORTED_SCOPES: "openid profile email offline_access groups"
        OPENZRO_TOKEN_SOURCE: "idToken"
    

    Dex puts groups in the id_token only — the access_token is minimal by design, so OPENZRO_TOKEN_SOURCE: idToken is required for the management to ever see the claim.

  4. JWT Group Sync is enabled at the account level. In the dashboard go to Settings → Groups and turn on Enable JWT group sync, with JWT claim set to groups. Without this the management ignores the claim even if it's present in the JWT.

After all four are aligned, log out and log back in. The groups are auto-created and appear under Team → Groups, and the user receives them as auto_groups. An empty groups list after a fresh login is a strong signal that step 2 or 3 is misaligned; inspect the id_token in the browser DevTools (Application → Storage, look for oidc.user:...) to confirm whether the claim made it into the token.
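Once you have copied the id_token out of DevTools, its payload can be decoded locally to see whether the groups claim is present. This is a minimal sketch: jwt_payload is a hypothetical helper, it does not verify the signature, and it assumes a base64 that accepts -d (GNU coreutils and recent macOS).

```shell
# Hypothetical helper: print the JSON payload (second segment) of a JWT.
# It only decodes; it does NOT validate the signature.
jwt_payload() {
  # Take the middle segment and convert base64url to standard base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Example use (not executed here; paste the real token):
#   jwt_payload "$ID_TOKEN" | jq .groups
```

If the output has no groups key, the claim was dropped before the token reached the browser, which points at the upstream IdP or the Dex connector rather than the dashboard.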

Optional: GitOps with argocd-vault-plugin

For ArgoCD-driven deploys, all the secrets (postgres password, data store encryption key, relay HMAC, bcrypt admin hash) live in Vault and are referenced via <path:vault/path#field> markers in the values. The chart was hardened in 2.1.0-alpha.20+ to use mustToRawJson so the markers survive intact through management.json rendering; without that fix the leading < of each marker was HTML-escaped to \u003c and the plugin couldn't resolve them.

postgres:
  password: "<path:secret/data/openzro#dbPassword>"

management:
  config:
    dataStoreEncryptionKey: "<path:secret/data/openzro#dataStoreEncryptionKey>"
    relay:
      secret: "<path:secret/data/openzro#relaySecret>"

relay:
  cluster:
    authSecret:
      value: "<path:secret/data/openzro#relayAuthSecret>"

dex:
  config:
    staticPasswords:
      - email: admin@example.com
        hash: "<path:secret/data/openzro#adminPasswordHash>"
        username: admin
        userID: openzro-admin

argocd-vault-plugin substitutes the markers when ArgoCD renders the chart — values committed to git stay sanitized.
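Before the first sync it is useful to know exactly which Vault paths and fields the values file expects. The following is a sketch only: avp_markers is a hypothetical helper that extracts markers in the <path:...#field> form shown above and prints one "path field" pair per line.

```shell
# Hypothetical helper: list the Vault path and field behind every
# <path:PATH#FIELD> marker in a values file, de-duplicated.
avp_markers() {
  grep -o '<path:[^>]*>' "$1" \
    | sed -E 's/^<path:([^#>]+)#([^>]+)>$/\1 \2/' \
    | sort -u
}

# Example use (not executed here):
#   avp_markers my-openzro.yaml
```

Each line of output names a field that must exist in Vault for the plugin to render the chart.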

What the chart auto-derives

To minimize duplication in the values:

Field on the wire                                      Auto-derived from
OZ_LISTEN_ADDRESS (relay)                              relay.containerPort (default :33080)
OZ_EXPOSED_ADDRESS (relay)                             relay.publicHostname + relay.service.port
OZ_AUTH_SECRET (relay)                                 management.config.relay.secret (same HMAC both ends)
rels:// scheme in management.config.relay.addresses    relay.tls.enabled (rel:// when TLS off)
Cert SANs                                              relay.publicHostname
Postgres DSN                                           postgres.host + postgres.port + postgres.databases.* + postgres.username + postgres.password
NATS routes (cluster.mode=embedded)                    StatefulSet pod names + headless Service

You don't need to repeat these in relay.env / relay.envRaw or management.envRaw. If you do override them, your override wins.
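As an illustration of the relay rows above, the derivation can be written out in a few lines of shell. This is not the chart's template code: relay_address is a hypothetical function, and the scheme://host:port layout of the resulting address is an assumption based on the fields the table lists.

```shell
# Illustration only: how a relay address could be assembled from
# relay.tls.enabled, relay.publicHostname and relay.service.port.
# The scheme://host:port layout is an assumption, not taken from the chart.
relay_address() {
  tls_enabled=$1
  host=$2
  port=$3
  if [ "$tls_enabled" = "true" ]; then
    scheme="rels"   # TLS on
  else
    scheme="rel"    # TLS off
  fi
  printf '%s://%s:%s\n' "$scheme" "$host" "$port"
}

# With the example values from step 2:
#   relay_address true relay.example.com 33080
```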

Where to file issues