Operator Guide¶
This guide is for operators deploying and managing ZViz in production environments.
Overview¶
Operating ZViz involves:
- Installation & Configuration — Setting up ZViz on nodes
- Integration — Connecting with containerd/Kubernetes
- Monitoring — Observing performance and security events
- Maintenance — Upgrades, debugging, and troubleshooting
Deployment Architecture¶
┌─────────────────────────────────────────────────────────────┐
│                     Kubernetes Cluster                      │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                    Control Plane                    │   │
│   │   ┌─────────────────────────────────────────────┐   │   │
│   │   │  RuntimeClass: zviz                         │   │   │
│   │   │  handler: zviz                              │   │   │
│   │   └─────────────────────────────────────────────┘   │   │
│   └─────────────────────────────────────────────────────┘   │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                     Worker Node                     │   │
│   │  ┌───────────────────────────────────────────────┐  │   │
│   │  │                  containerd                   │  │   │
│   │  │  ┌─────────────────────────────────────────┐  │  │   │
│   │  │  │              ZViz Runtime               │  │  │   │
│   │  │  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  │  │  │   │
│   │  │  │  │Container│  │Container│  │Container│  │  │  │   │
│   │  │  │  │   Pod   │  │   Pod   │  │   Pod   │  │  │  │   │
│   │  │  │  └─────────┘  └─────────┘  └─────────┘  │  │  │   │
│   │  │  └─────────────────────────────────────────┘  │  │   │
│   │  └───────────────────────────────────────────────┘  │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
Operator Guide Contents¶
- Kubernetes Integration: Deploy ZViz as a Kubernetes RuntimeClass
- containerd Setup: Configure containerd to use ZViz
- Monitoring: Prometheus metrics and alerting
- Performance Tuning: Optimize for your workload
- Debugging: Troubleshoot production issues
- Upgrades: Safely upgrade ZViz
Quick Deployment¶
1. Install ZViz on All Nodes¶
2. Configure containerd¶
# Add to /etc/containerd/config.toml
cat >> /etc/containerd/config.toml << 'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.zviz]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.zviz.options]
BinaryName = "/usr/local/bin/zviz"
EOF
systemctl restart containerd
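Because `cat >>` appends unconditionally, running this step twice would define the same TOML table twice, which the TOML specification disallows. The following is a minimal sketch of a guarded append. The helper name `add_zviz_runtime` is hypothetical and not part of ZViz, and the table name is copied from the snippet above, so it may need adjusting for other containerd config versions.

```shell
# Hypothetical helper: append the zviz runtime stanza only if it is absent.
add_zviz_runtime() {
  zviz_cfg="$1"
  # The runtime header ends in "zviz]", so the ".options]" sub-table does not match.
  if grep -qF 'containerd.runtimes.zviz]' "$zviz_cfg" 2>/dev/null; then
    echo "zviz runtime already present in $zviz_cfg"
    return 0
  fi
  cat >> "$zviz_cfg" << 'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.zviz]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.zviz.options]
    BinaryName = "/usr/local/bin/zviz"
EOF
}
```

A possible invocation is `add_zviz_runtime /etc/containerd/config.toml && systemctl restart containerd`.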
3. Create RuntimeClass¶
# zviz-runtimeclass.yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: zviz
handler: zviz
scheduling:
  nodeSelector:
    zviz.io/enabled: "true"
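The `scheduling.nodeSelector` in this RuntimeClass is merged into each pod that uses it, so pods only schedule onto nodes carrying the `zviz.io/enabled=true` label. A minimal sketch of labeling a node and applying the manifest follows. The helper names and the `DRY_RUN` switch are illustrative assumptions; only the `kubectl label` and `kubectl apply` commands themselves are standard.

```shell
# Print the command instead of running it when DRY_RUN=1 (illustrative switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Label a node so that pods using the zviz RuntimeClass can schedule onto it.
enable_zviz_node() {
  run kubectl label node "$1" zviz.io/enabled=true --overwrite
}

# Create or update the RuntimeClass from the manifest (default file name from this guide).
apply_zviz_runtimeclass() {
  run kubectl apply -f "${1:-zviz-runtimeclass.yaml}"
}
```

For example, `DRY_RUN=1 enable_zviz_node worker-1` prints the command for review, where `worker-1` is a placeholder node name.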
4. Deploy Pods with ZViz¶
apiVersion: v1
kind: Pod
metadata:
  name: secure-workload
  annotations:
    zviz.io/profile: "ci-runner"
spec:
  runtimeClassName: zviz
  containers:
    - name: app
      image: my-app:latest
Host Requirements¶
Kernel Configuration¶
Verify required kernel options:
Required:
- CONFIG_SECCOMP_FILTER=y
- CONFIG_SECCOMP_USER_NOTIFICATION=y
- CONFIG_USER_NS=y
- CONFIG_CGROUPS=y
- CONFIG_CGROUP_BPF=y
Recommended:
- CONFIG_SECURITY_APPARMOR=y or CONFIG_SECURITY_SELINUX=y
- CONFIG_SECURITY_LANDLOCK=y
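One way to check these options is to search the kernel build config, which many distributions install as `/boot/config-$(uname -r)` (some expose `/proc/config.gz` instead). The sketch below assumes a plain-text config file; the helper name `check_kernel_options` is hypothetical and not a ZViz command.

```shell
# Hypothetical helper: report which of the given options are set to "y"
# in a kernel config file. Returns non-zero if any option is missing.
check_kernel_options() {
  kcfg_file="$1"
  shift
  kcfg_missing=0
  for kcfg_opt in "$@"; do
    # -x matches the whole line, so "CONFIG_FOO=y" does not match "CONFIG_FOO_BAR=y".
    if grep -qx "${kcfg_opt}=y" "$kcfg_file"; then
      echo "ok       $kcfg_opt"
    else
      echo "MISSING  $kcfg_opt"
      kcfg_missing=1
    fi
  done
  return "$kcfg_missing"
}
```

A possible invocation for the required list is `check_kernel_options "/boot/config-$(uname -r)" CONFIG_SECCOMP_FILTER CONFIG_SECCOMP_USER_NOTIFICATION CONFIG_USER_NS CONFIG_CGROUPS CONFIG_CGROUP_BPF`.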
System Limits¶
# /etc/sysctl.conf
kernel.unprivileged_userns_clone = 1
kernel.pid_max = 65536
fs.file-max = 1048576
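After editing `/etc/sysctl.conf`, `sysctl -p` loads the file into the running kernel. The sketch below then compares live values under `/proc/sys` with the values listed above. The helper names are hypothetical, the check is a strict equality test, and `kernel.unprivileged_userns_clone` is not present on every distribution's kernel, so a "not present" result for that key is not necessarily an error.

```shell
# Map a sysctl key such as kernel.pid_max to its /proc/sys path.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Compare the live value of one key with the expected value.
# Returns 0 on match, 1 on mismatch, 2 if the key does not exist.
check_sysctl() {
  sc_path=$(sysctl_path "$1")
  if [ ! -r "$sc_path" ]; then
    echo "$1: not present on this kernel"
    return 2
  fi
  sc_have=$(cat "$sc_path")
  if [ "$sc_have" = "$2" ]; then
    echo "$1: ok ($sc_have)"
  else
    echo "$1: have $sc_have, want $2"
    return 1
  fi
}
```

For example, `check_sysctl kernel.pid_max 65536` and `check_sysctl fs.file-max 1048576` check two of the settings above.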
cgroups v2¶
Ensure cgroups v2 is enabled:
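A common check is the filesystem type mounted at `/sys/fs/cgroup`: on a cgroups v2 (unified) host, `stat -fc %T /sys/fs/cgroup` prints `cgroup2fs`, while v1 or hybrid hosts typically report `tmpfs`. The sketch below wraps that check; the function names are hypothetical and `stat -f` assumes GNU coreutils.

```shell
# Classify a filesystem type string as reported by `stat -fc %T`.
cgroup_mode() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Detect the cgroup mode of this host (mount point can be overridden).
detect_cgroup_mode() {
  cgroup_mode "$(stat -fc %T "${1:-/sys/fs/cgroup}")"
}
```

Running `detect_cgroup_mode` on a node should print `v2` before ZViz is deployed there.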
Security Considerations¶
Principle of Least Privilege¶
- Run ZViz with minimum required capabilities
- Use read-only root filesystems
- Apply network policies
Audit Logging¶
Enable audit logging for security events:
# /etc/zviz/config.yaml
logging:
level: info
format: json
audit:
enabled: true
path: /var/log/zviz/audit.json
Security Updates¶
Subscribe to security advisories:
Capacity Planning¶
Memory¶
| Component | Memory |
|---|---|
| ZViz broker | ~5MB per container |
| Base overhead | ~2MB |
| Profile cache | ~1MB |
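The figures above can be combined into a rough per-node estimate. This sketch assumes the base overhead and profile cache are counted once per node and the broker cost once per container; the table does not state this explicitly, so treat the result as an approximation. The function name is illustrative.

```shell
# Estimated ZViz memory in MB on one node running N containers.
# Assumption: 2 MB base + 1 MB profile cache per node, 5 MB broker per container.
zviz_memory_mb() {
  echo $(( 2 + 1 + 5 * $1 ))
}
```

Under these assumptions, `zviz_memory_mb 100` prints `503`, that is, about 0.5GB for a node running 100 containers.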
CPU¶
- Broker adds ~5% overhead for syscall-heavy workloads
- Network-heavy workloads see <2% overhead
Storage¶
| Path | Purpose | Recommended Size |
|---|---|---|
| /var/lib/zviz | State directory | 1GB |
| /var/log/zviz | Logs | 10GB |
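A preflight check can confirm that the filesystems behind these paths have enough room. The sketch below reads the available space from `df -Pk`; it interprets the recommended sizes as free space, which is an assumption, and the function name is hypothetical. If both paths share one filesystem, the two requirements add up.

```shell
# Succeed if the filesystem holding the given path has at least
# the given number of kilobytes available.
has_free_kb() {
  free_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
  [ "$free_kb" -ge "$2" ]
}
```

For example, `has_free_kb /var/lib/zviz 1048576` checks for 1GB and `has_free_kb /var/log/zviz 10485760` checks for 10GB.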
Troubleshooting¶
Container Won't Start¶
Permission Denied¶
# Check capabilities
zviz validate
# Check AppArmor/SELinux
aa-status
setenforce 0 # Switch SELinux to permissive mode for diagnosis only; restore with: setenforce 1
Performance Issues¶
See Debugging Guide for detailed troubleshooting.
Support Matrix¶
| Kubernetes Version | Status |
|---|---|
| 1.30+ | Supported |
| 1.28-1.29 | Supported |
| 1.26-1.27 | Best effort |
| < 1.26 | Not supported |
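The matrix above can be encoded as a small preflight check. This is a sketch only: the function name is hypothetical, and it accepts version strings such as `1.29` or `v1.30.2`, stripping any non-digit suffix from the minor version.

```shell
# Map a Kubernetes version to the support status from the table above.
zviz_k8s_support() {
  k8s_ver="${1#v}"                 # drop a leading "v"
  k8s_major="${k8s_ver%%.*}"       # text before the first dot
  k8s_rest="${k8s_ver#*.}"         # text after the first dot
  k8s_minor=$(printf '%s' "${k8s_rest%%.*}" | tr -cd '0-9')
  if [ "$k8s_major" != "1" ] || [ -z "$k8s_minor" ]; then
    echo "Unknown"
  elif [ "$k8s_minor" -ge 28 ]; then
    echo "Supported"
  elif [ "$k8s_minor" -ge 26 ]; then
    echo "Best effort"
  else
    echo "Not supported"
  fi
}
```

For example, `zviz_k8s_support 1.27` prints `Best effort`.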
| Linux Distribution | Status |
|---|---|
| Ubuntu 22.04+ | Supported |
| Debian 12+ | Supported |
| RHEL 9+ | Supported |
| Fedora 38+ | Supported |
| Amazon Linux 2023 | Supported |