Scaling
Lucity supports two scaling modes: fixed replica counts and automatic horizontal pod autoscaling (HPA). Scaling is configured per service per environment, so you can run a single replica in development and autoscale in production.
Manual scaling
Set a fixed number of replicas for a service:
```graphql
mutation {
  setServiceScaling(input: {
    projectId: "myapp"
    environment: "production"
    service: "api"
    replicas: 3
  }) {
    replicas
  }
}
```
The replica count must be between 1 and 20. Lucity updates the Helm values in the GitOps repo and ArgoCD applies the change to Kubernetes.
Autoscaling
Enable horizontal pod autoscaling to let Kubernetes adjust replicas based on CPU utilization:
```graphql
mutation {
  setServiceScaling(input: {
    projectId: "myapp"
    environment: "production"
    service: "api"
    replicas: 2
    autoscaling: {
      enabled: true
      minReplicas: 2
      maxReplicas: 10
      targetCPU: 70
    }
  }) {
    replicas
    autoscaling {
      enabled
      minReplicas
      maxReplicas
      targetCPU
    }
  }
}
```
When autoscaling is enabled, Kubernetes monitors average CPU utilization across the service's pods. If average CPU exceeds the target percentage, replicas are added (never exceeding maxReplicas); when load drops, replicas are removed (never going below minReplicas).
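The scaling decision can be sketched as a small calculation. This is a simplified model of the Kubernetes HPA rule (desired = ceil(current × observed CPU / target CPU), clamped to the configured bounds); the real controller also applies tolerances and stabilization windows.

```python
import math

def desired_replicas(current: int, avg_cpu: float, target_cpu: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified HPA rule: scale proportionally to observed vs. target CPU."""
    desired = math.ceil(current * avg_cpu / target_cpu)
    # Clamp to the configured autoscaling bounds.
    return max(min_replicas, min(max_replicas, desired))

# With the example config above (2-10 replicas, 70% target CPU):
desired_replicas(2, 140, 70, 2, 10)  # load at double the target -> 4 replicas
desired_replicas(4, 20, 70, 2, 10)   # load well below target -> back down to 2
```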
Constraints
| Parameter | Min | Max |
|---|---|---|
| replicas | 1 | 20 |
| minReplicas | 1 | 20 |
| maxReplicas | 1 | 20 |
| targetCPU | 10% | 95% |
Per-environment scaling
Scaling is configured independently per environment. A typical setup:
- Development: 1 replica, no autoscaling. Keep it simple.
- Staging: 1-2 replicas. Enough to test, not enough to burn budget.
- Production: autoscaling with 2-10 replicas at 70% target CPU. Let Kubernetes handle the load.
How it maps to Kubernetes
Manual scaling sets the replicas field on the Deployment. Autoscaling creates a HorizontalPodAutoscaler resource targeting the Deployment.
Both configurations live in the GitOps repo as Helm values. After ejection, scaling continues working as standard Kubernetes Deployment replicas and HPA resources.
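As an illustration of that mapping, here is a sketch that renders an autoscaling config into the HorizontalPodAutoscaler manifest ArgoCD would apply. Field names follow the Kubernetes autoscaling/v2 API; the Helm chart's actual template may structure its values differently.

```python
def hpa_manifest(service: str, cfg: dict) -> dict:
    """Build an autoscaling/v2 HPA targeting the service's Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": service},
        "spec": {
            # Points the HPA at the Deployment that manual scaling would edit.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": service,
            },
            "minReplicas": cfg["minReplicas"],
            "maxReplicas": cfg["maxReplicas"],
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": cfg["targetCPU"],
                    },
                },
            }],
        },
    }

manifest = hpa_manifest("api", {"minReplicas": 2, "maxReplicas": 10, "targetCPU": 70})
```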