Production Checklist

Use this checklist to verify that your MPL proxy deployment is production-ready. Each category covers one operational area: security, reliability, observability, configuration, operations, and compliance. Complete all items before routing production traffic through the proxy.

How to Use This Checklist

Work through each section sequentially. Items marked as critical are hard requirements. Items marked as recommended are best practices that significantly reduce operational risk.

Security

Ensure the proxy and its environment are hardened against unauthorized access and data exposure.

Security Context Example

securityContext:
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000
  allowPrivilegeEscalation: false
  seccompProfile:
    type: RuntimeDefault
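
The validation commands in this checklist look for a NetworkPolicy labeled app.kubernetes.io/name=mpl-proxy. A minimal sketch of such a policy follows. It assumes the proxy serves traffic on port 9443 and metrics on port 9100 (the ports used by the validation commands), that Prometheus runs in the monitoring namespace, and that clients live in a namespace called agents, which is a hypothetical name to replace with your own.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: mpl-proxy
  labels:
    app.kubernetes.io/name: mpl-proxy
spec:
  # Applies to the proxy pods only
  podSelector:
    matchLabels:
      app.kubernetes.io/name: mpl-proxy
  policyTypes:
    - Ingress
  ingress:
    # Proxy traffic: allow only the client namespace (hypothetical name)
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: agents
      ports:
        - protocol: TCP
          port: 9443
    # Metrics: allow scraping from the monitoring namespace only
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
      ports:
        - protocol: TCP
          port: 9100
```

Because the policy lists only Ingress under policyTypes, egress from the proxy is left unrestricted. Restricting egress as well would need the addresses of your upstream services, which this sketch does not assume.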

Reliability

Ensure the proxy can withstand failures, traffic spikes, and infrastructure changes.

Reliability Configuration

# HPA
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70

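# PDB (the name matches the "Verify PDB exists" validation command)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mpl-proxy-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: mpl-proxy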

# Graceful shutdown
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]
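
The validation commands in this checklist check resource limits and the /health endpoint on port 9443. A container spec that covers both might look like the sketch below. The container name, the CPU and memory values, and the probe timings are placeholders rather than recommended settings, and the probes assume /health is served over plain HTTP, as the curl command in the validation section suggests.

```yaml
spec:
  containers:
    - name: mpl-proxy            # hypothetical container name
      resources:
        # CPU requests are needed for the HPA's CPU utilization target
        requests:
          cpu: 100m              # placeholder value
          memory: 128Mi          # placeholder value
        limits:
          cpu: 500m              # placeholder value
          memory: 256Mi          # placeholder value
      # Remove the pod from Service endpoints while it is not healthy
      readinessProbe:
        httpGet:
          path: /health
          port: 9443
        periodSeconds: 10
      # Restart the container if it stops responding
      livenessProbe:
        httpGet:
          path: /health
          port: 9443
        initialDelaySeconds: 10
        periodSeconds: 20
```

The HPA computes CPU utilization as a percentage of the container's CPU request, so the 70% target above has no effect unless a request is set.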

Observability

Ensure you can monitor, debug, and alert on proxy behavior in production.

Alerting Rules

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mpl-proxy-alerts
spec:
  groups:
    - name: mpl-proxy
      rules:
        - alert: MPLHighErrorRate
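          # Aggregate both sides with sum(): without it, the division only pairs
          # series with identical labels (status="error" on both sides).
          expr: |
            sum(rate(mpl_requests_total{status="error"}[5m]))
            / sum(rate(mpl_requests_total[5m])) > 0.05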
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "MPL proxy error rate above 5%"

        - alert: MPLHighLatency
          # Aggregate buckets across pods by le before computing the quantile
          expr: |
            histogram_quantile(0.99, sum by (le) (rate(mpl_request_duration_seconds_bucket[5m]))) > 1.0
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "MPL proxy p99 latency above 1s"

        - alert: MPLQoMDegradation
          expr: |
            avg(mpl_qom_score) < 0.7
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Average QoM score below threshold"
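
These alerts only fire if Prometheus is actually scraping the proxy. The validation commands in this checklist expect a ServiceMonitor named mpl-proxy in the monitoring namespace. A minimal sketch follows, assuming the proxy's Service carries the app.kubernetes.io/name=mpl-proxy label and exposes port 9100 under the port name metrics. The port name and the default namespace are hypothetical and should be adjusted to your release.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mpl-proxy
  namespace: monitoring
spec:
  # Namespace where the proxy Service lives (hypothetical)
  namespaceSelector:
    matchNames:
      - default
  # Select the proxy Service by its label
  selector:
    matchLabels:
      app.kubernetes.io/name: mpl-proxy
  endpoints:
    - port: metrics      # assumed name of the Service port mapped to 9100
      path: /metrics
      interval: 30s
```

Depending on how your Prometheus resource is configured, the ServiceMonitor may also need extra labels to match its serviceMonitorSelector.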

Configuration

Ensure the MPL proxy is configured for production-grade semantic governance.

Production MPL Configuration

mpl:
  mode: strict
  registry: "https://github.com/Skelf-Research/mpl/raw/v0.1.0/registry"
  required_profile: qom-strict-argcheck
  enforce_schema: true
  enforce_assertions: true
  policy_engine: true

Operations

Ensure your team is prepared to operate, maintain, and recover the deployment.

Rollback Procedure

# View release history
helm history mpl-proxy

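# Roll back to the previous revision (omit the revision number).
# To target a specific revision from the history, append its number.
helm rollback mpl-proxy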

# Verify rollback
kubectl get pods -l app.kubernetes.io/name=mpl-proxy
kubectl logs -l app.kubernetes.io/name=mpl-proxy --tail=50

Compliance

Ensure the deployment meets audit, regulatory, and governance requirements.

Audit Log Configuration

observability:
  log_format: json
  log_level: info
  # Structured fields for audit trail
  # Each request logs: timestamp, stype, qom_score,
  # semantic_hash, source, decision (allow/deny)

Summary

Category	Items	Priority
Security	12	Critical -- must complete before production traffic
Reliability	13	Critical -- ensures availability SLA
Observability	9	High -- required for operational awareness
Configuration	10	High -- ensures semantic governance is active
Operations	11	Medium -- reduces MTTR and operational risk
Compliance	10	Varies -- depends on regulatory environment

Quick Validation Commands

Run these commands to verify key aspects of your deployment:

# Verify pods are running and ready
kubectl get pods -l app.kubernetes.io/name=mpl-proxy -o wide

# Check resource limits are set
kubectl describe pod -l app.kubernetes.io/name=mpl-proxy | grep -A 4 "Limits:"

# Verify health endpoints
kubectl exec deploy/mpl-proxy -- curl -s http://localhost:9443/health

# Check metrics are being scraped
kubectl exec deploy/mpl-proxy -- curl -s http://localhost:9100/metrics | head -20

# Verify network policy exists
kubectl get networkpolicy -l app.kubernetes.io/name=mpl-proxy

# Check HPA status
kubectl get hpa mpl-proxy

# Verify PDB exists
kubectl get pdb mpl-proxy-pdb

# Check ServiceMonitor
kubectl get servicemonitor mpl-proxy -n monitoring

Next Steps

Kubernetes & Helm -- Detailed Helm chart configuration
Monitoring & Metrics -- Set up dashboards and alerts
Troubleshooting -- Debug production issues
Security: Threat Model -- Understand the threat landscape