Scaling the MCP Gateway
This guide covers scaling the MCP Gateway horizontally by running multiple replicas with shared session state.
Overview
By default, the MCP Gateway runs as a single replica with session mappings stored in memory. To handle increased traffic or improve availability, you can scale the gateway to multiple replicas. However, because the gateway router maintains stateful session mappings between clients and backend MCP servers, scaling requires an external session store so that any replica can serve any client request.
Key concepts:
- Session Mapping: Each gateway session ID maps to one or more backend MCP server session IDs
- Lazy Initialization: Backend sessions are created on the first tools/call, not at connection time
- Shared State: An external store (Redis) makes session mappings accessible to all gateway replicas
Prerequisites
- MCP Gateway installed and configured
- A Redis instance accessible from the gateway (Redis 7+ recommended)
Step 1: Deploy Redis
If you don't already have a Redis instance available, deploy one in your cluster. Any standard Redis deployment will work. For example:
kubectl apply -n your-namespace -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:7-alpine
          ports:
            - containerPort: 6379
          readinessProbe:
            exec:
              command: ["redis-cli", "ping"]
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: redis
spec:
  type: ClusterIP
  ports:
    - port: 6379
      targetPort: 6379
  selector:
    app: redis
EOF
Wait for Redis to be ready:
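The command for this step is not shown here. One way to wait, sketched under the assumption that Redis was deployed with the example manifest (a Deployment named redis), is kubectl rollout status. The helper name wait_for_redis and the 120-second timeout are illustrative choices, not part of the gateway's tooling.

```shell
# Hypothetical helper: block until the redis Deployment has finished rolling out.
# Assumes the Deployment is named "redis", as in the example manifest.
wait_for_redis() {
  namespace="${1:-your-namespace}"
  kubectl rollout status deployment/redis -n "$namespace" --timeout=120s
}

# Example call:
#   wait_for_redis your-namespace
```

The command exits non-zero if the rollout does not complete within the timeout, which makes it usable in scripts.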
Step 2: Configure the Gateway Connection
Configure the MCP Gateway to use Redis by adding the --cache-connection-string flag to the gateway deployment's command:
kubectl patch deployment mcp-gateway -n mcp-system --type=json \
-p '[{"op":"add","path":"/spec/template/spec/containers/0/command/-","value":"--cache-connection-string=redis://redis.your-namespace.svc.cluster.local:6379"}]'
Wait for the rollout to complete:
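The command for this step is not shown here. A minimal sketch, assuming the deployment name (mcp-gateway) and namespace (mcp-system) used in the patch command; the helper name and default timeout are hypothetical:

```shell
# Hypothetical helper: wait until the patched gateway Deployment has rolled out.
# Optional first argument overrides the timeout (default 120s, an arbitrary choice).
wait_for_gateway() {
  timeout="${1:-120s}"
  kubectl rollout status deployment/mcp-gateway -n mcp-system --timeout="$timeout"
}

# Example call:
#   wait_for_gateway 180s
```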
Connection String Format:
For a Redis instance without authentication in the same cluster, the host is typically redis.<namespace>.svc.cluster.local.
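The full set of connection string forms the gateway accepts is not listed here. As a sketch, the value used in the patch command follows the redis://host:port form, and the hypothetical helper below only assembles that unauthenticated form from a Service name, namespace, and port. Whether the gateway accepts credentials or a database index in the URL is not confirmed by this guide.

```shell
# Hypothetical helper: build an unauthenticated in-cluster Redis URL.
# Arguments (all optional): service name, namespace, port.
redis_connection_string() {
  service="${1:-redis}"
  namespace="${2:-your-namespace}"
  port="${3:-6379}"
  printf 'redis://%s.%s.svc.cluster.local:%s\n' "$service" "$namespace" "$port"
}

# Example call:
#   redis_connection_string redis your-namespace
#   prints: redis://redis.your-namespace.svc.cluster.local:6379
```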
Note: The --cache-connection-string flag is in the controller's ignored flags list, so the MCPGatewayExtension controller will not revert this change during reconciliation. Do not use kubectl set env, as the controller will revert environment variable changes.
Step 3: Scale the Gateway
With Redis configured, scale the gateway to multiple replicas:
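The scaling command is not shown here. One plausible way to do it, assuming the mcp-gateway deployment in mcp-system and that the replica count is not managed by another controller or autoscaler, is kubectl scale. The helper name and the default of 3 replicas are illustrative.

```shell
# Hypothetical helper: set the gateway replica count (defaults to 3).
# Assumes nothing else (e.g. an autoscaler) is managing .spec.replicas.
scale_gateway() {
  replicas="${1:-3}"
  kubectl scale deployment/mcp-gateway -n mcp-system --replicas="$replicas"
}

# Example call:
#   scale_gateway 3
```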
Verify all replicas are ready:
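The verification command is not shown here. A sketch that reads the ready and desired replica counts from the Deployment status, assuming the same deployment name and namespace as in the earlier steps (the helper name is hypothetical):

```shell
# Hypothetical helper: print "<ready>/<desired>" replicas for the gateway.
# All replicas are ready when both numbers match.
gateway_ready_replicas() {
  kubectl get deployment mcp-gateway -n mcp-system \
    -o jsonpath='{.status.readyReplicas}/{.spec.replicas}'
}

# Example call:
#   gateway_ready_replicas
```

If no replica is ready yet, the first field may be empty because readyReplicas is omitted from the status until at least one pod is ready.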
Step 4: Verify Session Sharing
Confirm that Redis is active by checking the gateway logs. You should see the message "session cache using external store" on startup:
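The log command is not shown here. A minimal sketch that searches the gateway logs for the startup message, assuming the mcp-gateway deployment in mcp-system (the helper name is hypothetical):

```shell
# Hypothetical helper: succeed only if the startup message appears in the logs.
# Note: "kubectl logs deployment/<name>" reads from a single pod of the Deployment.
check_external_store() {
  kubectl logs deployment/mcp-gateway -n mcp-system \
    | grep -F "session cache using external store"
}

# Example call:
#   check_external_store && echo "external session store is active"
```

Because only one pod is sampled, repeating the check against each pod by name gives a more complete picture when several replicas are running.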
Test that sessions are shared across replicas by making multiple tool calls from the same client. The backend session ID should remain consistent regardless of which replica handles the request.
Reverting to a Single Replica
To revert to in-memory session caching:
- Scale down to a single replica:
- Remove the cache connection flag (replace INDEX with the position of the flag in the command array):
  # Find the index of the --cache-connection-string flag
  kubectl get deployment mcp-gateway -n mcp-system \
    -o jsonpath='{.spec.template.spec.containers[0].command}'
  # Remove it by index (e.g., if it's the last element at index 7)
  kubectl patch deployment mcp-gateway -n mcp-system --type=json \
    -p '[{"op":"remove","path":"/spec/template/spec/containers/0/command/INDEX"}]'
- Wait for the rollout to complete:
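The commands for the scale-down and rollout-wait steps are not shown in the list. They can be sketched as follows, assuming the same deployment name and namespace as the patch command; the helper names and the 120-second timeout are hypothetical.

```shell
# Hypothetical helper for the first step: return to a single gateway replica.
revert_scale_down() {
  kubectl scale deployment/mcp-gateway -n mcp-system --replicas=1
}

# Hypothetical helper for the last step: wait for the flag removal to roll out.
revert_wait_for_rollout() {
  kubectl rollout status deployment/mcp-gateway -n mcp-system --timeout=120s
}

# Example order: revert_scale_down, then remove the flag, then revert_wait_for_rollout
```

Scaling down before removing the flag avoids a window in which several replicas run with separate in-memory session maps.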
Next Steps
With horizontal scaling configured, you can:
- Observability - Monitor gateway performance across replicas
- Troubleshooting - Debug session and routing issues