Microservice Deployment Settings
Most of our microservices are Spring Boot Java applications with Actuator monitoring enabled.
This guide describes recommended CPU and memory settings, probe configuration, monitoring practices with Grafana, and troubleshooting steps with kubectl.
CPU Settings
Principles
- **Limit up to 4 CPUs** – helps startup time (faster JIT warm-up and class loading). ⚠️ If all services start in parallel this may throttle the node – but in production we always use rolling updates, so this is not a problem.
- **Constrained CPU (1 vCPU)** – works fine, but you must extend the `startupProbe` timeout/threshold, because startup will be slower.
- **Normal operation** – microservices typically consume very little CPU most of the time. → Set `requests.cpu: 100m`.
Kubernetes Example
resources:
  requests:
    cpu: "100m"   # low steady consumption
  limits:
    cpu: "4"      # allow faster startup
Memory Settings
What Java Memory Consists Of
- Heap (`-Xmx`) – application objects.
- Metaspace – typically ~300 MB.
- Thread stacks – ~1 MB per thread (~100 MB for ~100 threads).
- JVM internals & native buffers – ~200 MB (GC structures, NIO, Netty, glibc arenas).
👉 Rule of thumb: Pod memory limit = `-Xmx` + ~600 MB.
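If you want to bound the non-heap contributors explicitly rather than rely only on the rule of thumb, the standard JVM flags can be set next to the heap size. The values below are illustrative, not a recommendation:

env:
  - name: JAVA_TOOL_OPTIONS
    # -XX:MaxMetaspaceSize caps Metaspace at the typical ~300 MB;
    # -Xss sets the per-thread stack size (1 MB is the common 64-bit default)
    value: >
      -Xms300m -Xmx300m
      -XX:MaxMetaspaceSize=300m
      -Xss1m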
Guidelines
- Set both `-Xms` and `-Xmx` to the same value to avoid dynamic heap resizing.
- Use at least 200 MB for small microservices, 300 MB for microservices with a process engine.
- Use at most 1–5 GB for large workloads; for anything larger, scale horizontally (add more replicas).
- Always leave headroom above `-Xmx` for Metaspace, threads, and native overhead.
- You should never see `OOMKilled`. If it happens → either increase the memory limit or even decrease `-Xmx`!
- Use Grafana monitoring to calibrate real usage.
Kubernetes Example
resources:
  requests:
    memory: "900Mi"    # example: Xmx (300Mi) + ~600Mi overhead
  limits:
    memory: "1200Mi"   # add 300Mi headroom for spikes, to stay safely clear of OOMKilled
JVM Options Example
env:
  - name: JAVA_TOOL_OPTIONS
    value: -Xms300m -Xmx300m
Probes (Startup, Liveness, Readiness)
Spring Boot with Actuator provides endpoints for Kubernetes probes.
Recommendations
- Startup Probe: `/actuator/health` – allows slow startup without failing the pod. Increase `failureThreshold` if CPU is constrained.
- Liveness Probe: `/actuator/health/liveness` – detects JVM deadlocks or crash states.
- Readiness Probe: `/actuator/health/readiness` – ensures the pod only receives traffic when ready.
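Spring Boot 2.3+ exposes the liveness and readiness health groups automatically when it detects it is running on Kubernetes; to expose them in other environments too (e.g. for local testing), a minimal application.yml sketch:

management:
  endpoint:
    health:
      probes:
        enabled: true   # expose /actuator/health/liveness and /actuator/health/readiness everywhere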
Typical Settings
- Startup Probe: `initialDelaySeconds: 20`, `periodSeconds: 10`, `failureThreshold: 30` – this gives the service up to 20 + 10 × 30 = 320 s to start before the pod is restarted.
- Liveness/Readiness Probe: `periodSeconds: 5`, `timeoutSeconds: 5`, `failureThreshold: 5`
Example Deployment Descriptor
A typical Spring Boot microservice (tsm-catalog) Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: tsm-catalog
  name: tsm-catalog
  namespace: tsm-datalite
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: tsm-catalog
  template:
    metadata:
      annotations:
        linkerd.io/inject: disabled
      labels:
        app: tsm-catalog
    spec:
      containers:
        - name: tsm-catalog
          image: registry.datalite.cz/tsm/tsm-catalog:2.3
          imagePullPolicy: Always
          env:
            - name: JAVA_TOOL_OPTIONS
              value: >
                -Xms300m -Xmx300m
            - name: spring.config.import
              value: configserver:http://tsm-config-server
            - name: MALLOC_ARENA_MAX
              value: "2"
          ports:
            - containerPort: 8099
              protocol: TCP
          resources:
            requests:
              cpu: 100m
              memory: 800Mi
            limits:
              cpu: 3000m
              memory: 1100Mi
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8099
            initialDelaySeconds: 0
            periodSeconds: 5
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 5
          startupProbe:
            httpGet:
              path: /actuator/health
              port: 8099
            initialDelaySeconds: 40
            periodSeconds: 10
            timeoutSeconds: 1
            failureThreshold: 30
      imagePullSecrets:
        - name: regcred
      securityContext:
        runAsUser: 1001
        runAsGroup: 1001
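To roll the descriptor out and watch the rolling update complete (assuming the manifest is saved as tsm-catalog.yaml, a hypothetical filename):

kubectl apply -f tsm-catalog.yaml
kubectl rollout status deployment/tsm-catalog -n tsm-datalite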
Monitoring with Grafana
- JVM metrics (via Actuator/Micrometer):
  - `jvm.memory.used{area=heap}` vs. `jvm.memory.max{area=heap}`
  - `jvm.memory.used{area=nonheap}`
  - `jvm.threads.live`
- Pod metrics (via cAdvisor / kubelet):
  - Pod memory usage (RSS / working set)
  - Pod CPU usage and throttling

👉 Always compare JVM vs. Pod metrics: if Pod RSS ≫ JVM heap + non-heap → the overhead comes from threads, native memory, or allocator arenas.
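For the JVM metrics to show up in Grafana, the service has to expose them to the metrics pipeline. A minimal application.yml sketch, assuming Prometheus scraping and the micrometer-registry-prometheus dependency on the classpath:

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus   # exposes /actuator/prometheus for scraping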
Troubleshooting with kubectl
Check resource usage
kubectl top pod <pod> -n <namespace>
Inspect pod state and events
kubectl describe pod <pod> -n <namespace>
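To check specifically whether a container was OOMKilled, the last termination state can be queried directly; a sketch:

kubectl get pod <pod> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
# prints e.g. OOMKilled or Error; empty if the container never restarted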
Access logs
kubectl logs <pod> -n <namespace>
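If a pod has restarted (for example after an OOM kill), the relevant output is usually in the previous container instance:

kubectl logs <pod> -n <namespace> --previous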
Exec into pod
kubectl exec -it <pod> -n <namespace> -- sh
Check memory details
cat /proc/1/smaps_rollup
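To relate resident memory to the ~1 MB-per-thread estimate above, count the live threads of the JVM process (PID 1 in the container):

ls /proc/1/task | wc -l   # one directory per live thread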
In case you need JVM native memory tracking (NMT) for troubleshooting, add:

-XX:NativeMemoryTracking=summary

to the JVM options and, after a restart, query it with:

jcmd 1 VM.native_memory summary

For a per-allocation-site breakdown, start the JVM with `-XX:NativeMemoryTracking=detail` and run `jcmd 1 VM.native_memory detail` instead; the detail report is only available when detail tracking is enabled.
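To see which JVM subsystem grows over time, NMT also supports a baseline/diff workflow:

jcmd 1 VM.native_memory baseline       # record a baseline snapshot
# ... let the service run under load for a while ...
jcmd 1 VM.native_memory summary.diff   # report growth per subsystem since the baseline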
Checklist
- CPU requests = 100m, limits = 4 (or 1 if constrained + probe adjusted).
- Memory limit = `-Xmx` + ~600 MB overhead.
- JVM options include `-Xms`/`-Xmx` and optional NMT.
- Optionally set `MALLOC_ARENA_MAX=2` to reduce native memory overhead.
- Probes configured (startup, readiness, liveness).
- Monitor JVM and Pod metrics in Grafana.
- Never accept OOMKilled events.