Pod Security

Security contexts, seccomp profiles, and container hardening

Every pod managed by Lucity runs with a hardened security context. No root processes, no extra Linux capabilities, no privilege escalation. These aren't optional flags you can forget to set. The platform enforces them in the pod specs it generates.

Seccomp Profiles

Seccomp (Secure Computing Mode) is a Linux kernel feature that restricts which system calls a process can make. Lucity uses the RuntimeDefault seccomp profile on all workload pods. The profile ships with containerd and blocks approximately 44 dangerous syscalls, including:

  • mount, umount (filesystem manipulation)
  • ptrace (process debugging/injection)
  • reboot, init_module (system-level operations)
  • keyctl (kernel keyring access)
  • And dozens more that containers should never need

securityContext:
  seccompProfile:
    type: RuntimeDefault

This is compatible with virtually all workloads. Unlike gVisor (which intercepts all syscalls and breaks some applications), RuntimeDefault blocks only syscalls that ordinary application containers rarely, if ever, need.

Security Context

User workload pods

Workload pods (Deployments and CronJobs created by the lucity-app chart) run with:

# Pod-level
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
  seccompProfile:
    type: RuntimeDefault

# Container-level
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]

runAsUser: 1000: Forces the container process to run as UID 1000. The build pipeline appends post-build steps to create a non-root user and set USER 1000:1000 in the image config, but runAsUser in the pod spec acts as a second layer: even if the image metadata were missing or overridden, the container still runs non-root.

runAsGroup: 1000 / fsGroup: 1000: Sets the primary group and filesystem group. For volume types that support ownership management, Kubernetes makes the mounted volume group-owned by GID 1000, so the non-root process can write to it.

runAsNonRoot: true: A safety net. Even if runAsUser were removed, the kubelet would refuse to start a container whose image is configured to run as UID 0.

allowPrivilegeEscalation: false: Sets the no_new_privs flag (PR_SET_NO_NEW_PRIVS) on the container process, so it can never gain more privileges than its parent. Even if a binary inside the container has the setuid or setgid bit, executing it can't escalate.

capabilities: drop: ["ALL"]: Linux capabilities are fine-grained root privileges (e.g., NET_RAW for raw sockets, SYS_ADMIN for mount). Dropping all of them means the container process has exactly zero special kernel privileges.
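To show where the two blocks above sit relative to each other, here is a minimal sketch of a Deployment that nests both. The name, labels, and image are placeholders chosen for illustration; they are not taken from the lucity-app chart, which may lay out its templates differently.

```yaml
# Illustrative only: metadata and image are placeholders, not chart output.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      # Pod-level settings apply to every container in the pod.
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: registry.example.com/example-app:latest
          # Container-level settings must be set on each container.
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
```

allowPrivilegeEscalation and capabilities exist only at the container level in the Kubernetes API, which is why they cannot be folded into the pod-level block.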

Non-root image compatibility

Railpack doesn't create a non-root user in built images. To prevent runtime write failures (e.g., Next.js writing to .next/cache, Nuxt writing to .output/), the build pipeline appends post-build steps that:

  1. Create a lucity user with UID 1000 in the image
  2. chown the image's WORKDIR to UID 1000
  3. Set USER 1000:1000 in the image config

The WORKDIR is read from Railpack's image config (not hardcoded), so this works for any framework or language.

Build Job pods

Build Job pods run trusted platform code (clone, Railpack detection, BuildKit client). They currently inherit the default security context from the Kubernetes namespace but are not explicitly hardened with runAsNonRoot because they need to execute in the build runner environment. The primary isolation for builds comes from namespace separation and network policies.

BuildKit Exception

BuildKit requires seccompProfile: Unconfined to function. The OCI worker needs syscalls like unshare and mount to set up build environments, even with --oci-worker-no-process-sandbox. This is a BuildKit requirement, not a Lucity design choice.
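As a rough sketch of what this requirement looks like in a pod spec, the fragment below follows the shape of the rootless Kubernetes example in BuildKit's own documentation. The pod name, the image tag, and the placement in the lucity-builds namespace are assumptions made for illustration; Lucity's actual buildkitd manifest is not shown in this page and may differ.

```yaml
# Sketch only: pod name, image tag, and namespace placement are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: buildkitd
  namespace: lucity-builds
spec:
  containers:
    - name: buildkitd
      image: moby/buildkit:rootless
      args:
        # RUN steps share buildkitd's PID namespace instead of being sandboxed.
        - --oci-worker-no-process-sandbox
      securityContext:
        # Non-root, but without a seccomp filter.
        runAsUser: 1000
        runAsGroup: 1000
        seccompProfile:
          type: Unconfined
```

Setting the profile at the container level keeps the exception scoped to buildkitd rather than to every container that might share the pod.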

BuildKit itself runs as UID 1000 (non-root), but with Unconfined seccomp. Each RUN step gets its own mount namespace (the filesystem is the image layers, not buildkitd's), but shares the PID and network namespaces with buildkitd. RUN steps inherit the Unconfined seccomp profile, meaning they can make syscalls that RuntimeDefault would block.

This is why the other isolation layers are critical:

  • Network policies restrict what buildkitd (and therefore RUN steps) can reach over the network
  • Namespace isolation keeps builds in lucity-builds, away from platform services
  • Non-root execution (UID 1000) limits what damage a compromised build can do at the OS level
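To make the network-policy layer concrete, here is a hypothetical baseline policy for the lucity-builds namespace. The policy name and the specific rules are assumptions for illustration and are not Lucity's actual policy; a real setup would add egress rules for whatever registries and Git hosts builds need to reach.

```yaml
# Hypothetical sketch: the name and rules are illustrative, not Lucity's policy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: builds-default-deny
  namespace: lucity-builds
spec:
  # An empty selector matches every pod in the namespace.
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  # No ingress rules are listed, so all inbound traffic is denied.
  egress:
    # Allow DNS on any destination; all other egress stays denied
    # until more specific rules are added.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Because RUN steps share buildkitd's network namespace, a policy that selects the buildkitd pod constrains them as well.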

See Build Isolation for the full model.

What About gVisor?

gVisor provides stronger isolation by intercepting all syscalls through a user-space kernel. It's a meaningful security upgrade for hostile multi-tenant environments where tenants actively attempt container escapes.

The trade-offs:

Aspect           Seccomp RuntimeDefault         gVisor
Compatibility    ~99% of workloads              ~85% of workloads
Performance      Negligible overhead            10-30% I/O overhead
Node setup       None (built into containerd)   Binary install + containerd config per node
Maintenance      Zero                           Monthly binary updates, node automation

For most Lucity deployments (teams running their own workloads), seccomp RuntimeDefault with dropped capabilities provides strong isolation without the compatibility and operational costs of gVisor. If your threat model includes actively malicious tenants, consider adding gVisor on dedicated node pools for user workloads.
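If you do go the gVisor route, the standard Kubernetes mechanism is a RuntimeClass. The sketch below assumes gVisor's runsc binary is already installed on the dedicated nodes, that containerd registers it under the handler name runsc, and that those nodes carry a sandbox: gvisor label. The label, the pod name, and the image are hypothetical, and whether Lucity exposes a setting for runtimeClassName is not covered here.

```yaml
# Assumes runsc is installed and configured in containerd on the target nodes.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
# Must match the runtime handler name in the nodes' containerd config.
handler: runsc
scheduling:
  # Hypothetical label marking the dedicated gVisor node pool.
  nodeSelector:
    sandbox: gvisor
---
# Placeholder pod that opts in to the sandboxed runtime.
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-example
spec:
  runtimeClassName: gvisor
  containers:
    - name: app
      image: registry.example.com/example-app:latest
```

The scheduling.nodeSelector on the RuntimeClass is merged into each pod that references it, so sandboxed pods land only on nodes that have gVisor available.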