CMP Operator Checklist (Dev → Prod)
Last updated: 2026-01-09
This is a pragmatic checklist to deploy and harden the Registry, Portal, and Jobs across environments.
0) Prereqs
- IDP: configure issuer and audience for JWTs, allow the Portal origin via CORS, and ensure the admin role/scope cmp.admin exists. Optionally provide a shared secret for HS256 tokens.
- Database: PostgreSQL URL for the Registry.
- Gateway & Routing: Cilium Gateway API installed with HTTPRoute support; external-gateway configured for HTTPS. See docs/cmp/cilium-gateway-setup.md for routing details.
- DNS & TLS: public hostnames for the Registry and Portal; TLS certificates managed by cert-manager or an external DNS provider.
1) Secrets
Create a secret cmp-registry-secret (per environment) with at least:
- CMP_REGISTRY_DATABASE_URL
- OIDC_ISSUER, OIDC_AUDIENCE, (OIDC_JWKS_URI optional), (OIDC_HS_SECRET optional)
  - Accepted JWT algorithms: RS256/RS384/RS512, PS256/PS384/PS512, ES256/ES384/ES512
  - HS256 rotation plan: update IDP + registry together, deploy new OIDC_HS_SECRET, keep the previous secret in the IDP until all producers rotate, then remove the old secret from both sides
- APPEND_AUTH_REQUIRED=1 in prod to enforce Bearer auth for POST /consent/v1/append
- Dataset URLs (WTM, AdGuard, IAB GVL) if you use the provided jobs
- (Optional) METRICS_PASS when exposing /metrics with Basic Auth
- (Optional) CONSENT_IP_SALT if enabling hashed IP derivation at the edge
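The required keys above can be checked before the secret is created. The following is a minimal sketch, not the project's own tooling: require_vars and create_registry_secret are hypothetical helpers, the values are assumed to be set in the operator's shell, and the cmp namespace is an assumption taken from the examples later in this checklist.

```shell
# Hypothetical helper: fail early if any named variable is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect variable lookup that works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: render the secret client-side, then apply it, so the
# same command works for both first creation and later updates.
# Assumes namespace "cmp"; adjust per environment.
create_registry_secret() {
  require_vars CMP_REGISTRY_DATABASE_URL OIDC_ISSUER OIDC_AUDIENCE || return 1
  kubectl create secret generic cmp-registry-secret \
    --namespace cmp \
    --from-literal=CMP_REGISTRY_DATABASE_URL="$CMP_REGISTRY_DATABASE_URL" \
    --from-literal=OIDC_ISSUER="$OIDC_ISSUER" \
    --from-literal=OIDC_AUDIENCE="$OIDC_AUDIENCE" \
    --from-literal=APPEND_AUTH_REQUIRED=1 \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

Optional keys such as OIDC_JWKS_URI, METRICS_PASS, or CONSENT_IP_SALT could be added as further --from-literal arguments when they are in use.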
2) Helm values (Registry)
Set sane defaults for resources, autoscaling, CORS, rate limits, and NetworkPolicy. Example:
image:
  repository: registry.digiwedge.com/digiwedge/cmp-registry
  tag: <pinned>
resources:
  requests: { cpu: 100m, memory: 256Mi }
  limits: { cpu: 1000m, memory: 1Gi }
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 6
  targetCPUUtilizationPercentage: 70
rateLimit:
  enabled: true
  windowMs: 60000
  max: 300
  skipSuccessful: false
  standardHeaders: true
  legacyHeaders: false
  trustProxy: true
cors:
  adminAllowedOrigins:
    - https://cmp-portal.example.com
  strictConfigOrigin: true
metrics:
  enabled: true
  path: /metrics
  basicAuth: { enabled: false }
networkPolicy:
  enabled: true
  ingress:
    allowFromNamespaces: [kube-system, monitoring, cmp]
  egress:
    restricted: true
    # DNS via kube-system (CoreDNS) + DB via CIDR or namespace
    allowNamespaces: [kube-system]
    allowCIDRs: [10.42.0.0/16] # example DB subnet
3) Helm values (Portal)
image:
  repository: registry.digiwedge.com/digiwedge/cmp-portal
  tag: <pinned>
resources:
  requests: { cpu: 50m, memory: 128Mi }
  limits: { cpu: 500m, memory: 512Mi }
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 5
  targetCPUUtilizationPercentage: 60
networkPolicy:
  enabled: true
  ingress:
    allowFromNamespaces: [kube-system, cmp]
  egress:
    restricted: false # typically open for frontends; tighten if needed
4) Deploy with Argo CD
- Point Applications to the Helm chart paths: charts/cmp-registry, charts/cmp-portal, charts/cmp-jobs.
- Provide the environment-specific values and secrets.
- Enable image updater annotations if you use argocd-image-updater.
- Apply HTTPRoute and ReferenceGrant manifests: kubectl apply -f kubernetes/cmp/portal/httproute.yaml -f kubernetes/idp/cmp-to-idp-referencegrant.yaml. See docs/cmp/cilium-gateway-setup.md for routing architecture.
5) Observability
- Apply ServiceMonitor for the Registry scraping /metrics.
- Apply PrometheusRule alerts (dataset freshness, 429 spikes, CORS denials, CronJobs).
- Import Grafana dashboard grafana/dashboards/cmp-overview.json.
6) CI gates
- Configure repo secrets for scanner CI: CMP_REGISTRY_URL, CMP_SITE_KEY, CMP_CANARY_URL.
- Run manual CMP scanner checks (baseline + GPC) and review the diff report.
- Run the CMP React SDK a11y harness locally and review any findings.
7) Post‑deploy checks
- GET /v1/config?site_key=...&v=live returns 200 with ETag and Cache-Control.
- /v1/consent accepts consent and records events; with Sec-GPC: 1, events have gpc=true.
- /metrics exposes counters (429s, config/consent decisions, dataset freshness, GPC).
- Portal login works; Sites page shows analytics; export consents (CSV/JSON) downloads correctly.
- Multi-domain CORS: Sites page "Allowed origins (CORS)" shows and updates host list; adding a second origin allows /v1/consent calls from that origin.
- Scanner CI gate is green for canary (baseline + GPC).
- HTTPRoute validation: Portal routes work correctly:
  - curl https://cmp-portal.uat.digiwedge.com/api/auth/csrf → 204 (IDP backend)
  - curl https://cmp-portal.uat.digiwedge.com/api/health → 200 (Registry backend)
  - curl https://cmp-portal.uat.digiwedge.com/ → 200 (Portal frontend)
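The status and metrics checks above can be scripted so they are repeatable after each deploy. This is a rough sketch under stated assumptions: http_status, check_status, and has_metric are hypothetical helper names, and the metric name in the usage comment is one of those mentioned in the SLO section of this checklist.

```shell
# Print only the HTTP status code for a request (curl's %{http_code} write-out).
http_status() {
  curl -sS -o /dev/null -w '%{http_code}' "$@"
}

# Compare an observed status with the expected one and report the result.
# Arguments: <expected> <actual> <label>
check_status() {
  expected=$1 actual=$2 label=$3
  if [ "$actual" = "$expected" ]; then
    echo "ok   $label ($actual)"
  else
    echo "FAIL $label (expected $expected, got $actual)"
    return 1
  fi
}

# Succeed if Prometheus text exposition read from stdin contains a sample
# line for the named metric (name followed by "{" or a space).
has_metric() {
  grep -q "^$1[{ ]"
}

# Usage against the UAT portal host named in the checklist:
#   base=https://cmp-portal.uat.digiwedge.com
#   check_status 204 "$(http_status "$base/api/auth/csrf")" "IDP backend"
#   check_status 200 "$(http_status "$base/api/health")"    "Registry backend"
#   check_status 200 "$(http_status "$base/")"              "Portal frontend"
#   curl -sS "$CMP_REGISTRY_URL/metrics" | has_metric cmp_http_rate_limited_total
```

If /metrics is protected with Basic Auth, the last command would also need curl's -u option with the configured credentials.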
8) Hardening tips
- Pin images by digest and tag.
- Tighten CSP for the Registry UI (if ever presented) and Portal domains.
- Consider Basic Auth on /metrics (set METRICS_BASIC_AUTH=true and METRICS_PASS).
- Enable PDBs and Pod anti-affinity (already in the charts) to improve availability during maintenance.
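For the image pinning tip, a reference that carries both a tag and a digest has the shape repository:tag@sha256:<64 hex characters>. The check below is a small sketch of how that shape could be verified in a pre-deploy script; is_tag_and_digest_pinned is a hypothetical helper, and how the charts accept a digest in their values is not covered here.

```shell
# Succeed only if the image reference has a tag AND a sha256 digest,
# e.g. registry.example.com/team/app:1.2.3@sha256:<64 hex chars>.
is_tag_and_digest_pinned() {
  printf '%s\n' "$1" | grep -Eq '^[^@]+:[^:/@]+@sha256:[0-9a-f]{64}$'
}

# Usage:
#   is_tag_and_digest_pinned "$IMAGE_REF" || echo "not pinned: $IMAGE_REF" >&2
```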
9) Privacy & retention runbook
- IP minimization: Registry does not persist client IPs in ConsentEvent by design. If a downstream team ever requires coarse network correlation, use a one-way hash with salt at the edge and only propagate ipHash: ipHash = sha256(ip + CONSENT_IP_SALT). Do not store raw IPs. Keep CONSENT_IP_SALT in the cmp-registry-secret.
- Retention: Default consent retention is 13 months (configurable via CONSENT_RETENTION_DAYS). The CronJob cmp-consent-retention deletes rows by timestamp ts older than the cutoff. To dry-run locally:
  - Exec into a throwaway job pod and inspect counts grouped by month.
  - Set CONSENT_RETENTION_DAYS=395 and run the script once to purge >13-month data.
  The job uses the correct timestamp field ts; no PII columns are touched.
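The two calculations in this runbook can be illustrated in a few lines of shell. This is only a sketch of the formula shapes: ip_hash and retention_cutoff_epoch are hypothetical helpers, the hash is assumed to be hex-encoded over the plain concatenation of IP and salt, and the way the real edge component and retention script implement these steps is not shown in this checklist.

```shell
# ipHash = sha256(ip + CONSENT_IP_SALT), printed as 64 hex characters.
# Assumes sha256sum (GNU coreutils) is available where this runs.
# Arguments: <ip> <salt>
ip_hash() {
  printf '%s%s' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# Epoch-seconds cutoff for retention: rows with ts older than this value
# fall outside the retention window.
# Arguments: <now_epoch> <retention_days>
retention_cutoff_epoch() {
  echo $(( $1 - $2 * 86400 ))
}

# Usage (the IP is a documentation-range placeholder):
#   ip_hash "203.0.113.7" "$CONSENT_IP_SALT"
#   retention_cutoff_epoch "$(date +%s)" "${CONSENT_RETENTION_DAYS:-395}"
```

The same IP and salt always give the same hash, which is what allows coarse correlation, while rotating the salt breaks linkage with older hashes.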
10) SLO dashboards
- Latency SLOs: The Registry exports cmp_http_request_duration_ms{route,method,status}. Use Grafana panels based on histogram_quantile(0.95, sum by (le) (rate(cmp_http_request_duration_ms_bucket{route="config"}[5m]))) and 0.99 for p99. Target p95 for /v1/config under steady state.
- 429 budget: Use the included panel or derive as sum(rate(cmp_http_rate_limited_total[5m])) / clamp_min(sum(rate(cmp_config_requests_total[5m]) + rate(cmp_consent_requests_total[5m])), 1e-6) and keep below your agreed budget. Alerts are labeled team=cmp for routing.
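To make the 429 budget expression concrete, the same arithmetic can be reproduced outside Prometheus. The sketch below mirrors the shape of the query with plain numbers; rate_limited_ratio is a hypothetical helper and the input rates are made-up illustration values, not measurements.

```shell
# Ratio of rate-limited requests to total requests, mirroring the PromQL shape:
#   limited / clamp_min(config + consent, 1e-6)
# Arguments: <limited_rate> <config_rate> <consent_rate> (requests per second)
rate_limited_ratio() {
  awk -v l="$1" -v c="$2" -v s="$3" 'BEGIN {
    total = c + s
    if (total < 1e-6) total = 1e-6   # same guard as clamp_min(..., 1e-6)
    printf "%.4f\n", l / total
  }'
}

# Worked example: 3 req/s rate limited out of 200 + 100 req/s total.
rate_limited_ratio 3 200 100   # prints 0.0100
```

The clamp only matters when there is no traffic: it keeps the denominator non-zero so the panel shows 0 instead of a division error or a gap.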
11) NetworkPolicy verification
- Ingress: With defaults, only kube-system, monitoring, and cmp namespaces can reach the Registry Service.
  - kubectl run -n default test --rm -it --image=curlimages/curl -- curl -sS http://cmp-registry.cmp.svc/ → DENY
  - kubectl run -n cmp test --rm -it --image=curlimages/curl -- curl -sS http://cmp-registry.cmp.svc/ → OK
- Egress (Registry): Set networkPolicy.egress.restricted=true and populate allowNamespaces: [kube-system] (DNS) and allowCIDRs: [<DB CIDRs>]. Verify DNS resolves and DB connectivity succeeds, while public internet is blocked from Registry pods.
- CronJobs: Dataset and retention jobs run in the cmp-jobs chart and are not subject to the Registry egress policy. They retain open egress to fetch datasets. If you apply a NetPol for jobs, ensure the dataset endpoints remain allowed.
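The ingress probes above can be wrapped so that each one is compared against the verdict the policy should produce. This is a sketch only: expect_verdict and probe_from are hypothetical helpers, the probe pod name is arbitrary, and it assumes kubectl returns the probe container's exit code when run attached with --restart=Never.

```shell
# Run a probe command and compare its outcome with the expected verdict.
# Usage: expect_verdict <allow|deny> <label> <command...>
expect_verdict() {
  want=$1 label=$2
  shift 2
  if "$@" >/dev/null 2>&1; then got=allow; else got=deny; fi
  if [ "$got" = "$want" ]; then
    echo "ok   $label ($got)"
  else
    echo "FAIL $label (wanted $want, got $got)"
    return 1
  fi
}

# Probe the Registry Service from a namespace using a short-lived curl pod.
# --max-time keeps a blocked connection from hanging the check.
probe_from() {
  kubectl run "netpol-probe-$1" -n "$1" --rm -i --restart=Never \
    --image=curlimages/curl -- curl -sS --max-time 5 http://cmp-registry.cmp.svc/
}

# Usage:
#   expect_verdict deny  "ingress from default" probe_from default
#   expect_verdict allow "ingress from cmp"     probe_from cmp
```

Because curl is run without -f, any HTTP response counts as reachable, which fits a connectivity check; a timeout or refused connection counts as denied.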