prod deployment
456
doc/CLUSTER_SETUP.md
Normal file
@@ -0,0 +1,456 @@
|
||||
# Production Cluster Setup Guide
|
||||
|
||||
This guide covers setting up the Dexorder AI platform from scratch on a fresh Kubernetes cluster.
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
The platform runs across two namespaces:
|
||||
|
||||
| Namespace | Contents |
|-----------|----------|
| `ai` | Gateway, web UI, all infrastructure services (postgres, minio, kafka, flink, relay, ingestor, qdrant, dragonfly, iceberg-catalog) |
| `sandbox` | Per-user sandbox containers (created dynamically by the gateway) |
|
||||
|
||||
Secrets are managed via 1Password CLI (`op inject`). All `.tpl.yaml` files in `deploy/k8s/prod/secrets/` contain `op://` references and are safe to commit; actual values are never stored in git.
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
### Tooling
|
||||
|
||||
| Tool | Purpose | Min Version |
|------|---------|-------------|
| `kubectl` | Cluster management | 1.30+ |
| `kustomize` | Manifest rendering | 5.x |
| `op` | 1Password CLI | 2.x |
| `docker` | Image builds | - |
|
||||
|
||||
### Cluster Requirements
|
||||
|
||||
- **Kubernetes**: 1.30+ (required for `ValidatingAdmissionPolicy` GA)
- **nginx-ingress-controller**: For ingress routing and WebSocket support
- **cert-manager**: For TLS certificate provisioning (with `letsencrypt-prod` ClusterIssuer)
- **Persistent volume provisioner**: StorageClass `standard` must exist and be functional
- **DNS**: `dexorder.ai` resolves to the cluster's ingress IP/load balancer
|
||||
|
||||
### Container Registry Access
|
||||
|
||||
Images are hosted at `git.dxod.org/dexorder/dexorder/`. The cluster must be able to pull from this registry. If the registry requires authentication, create an image pull secret before deploying.
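
If the registry does require authentication, a pull secret could be created along these lines. This is a sketch rather than the project's documented procedure: the secret name `registry-credentials`, the `REGISTRY_USER` and `REGISTRY_TOKEN` variables, and the decision to create the secret in both namespaces are assumptions, and the namespaces themselves only exist after Step 4.

```bash
# Sketch: create a docker-registry pull secret in each namespace that pulls images.
# Assumptions: the secret name, the REGISTRY_USER / REGISTRY_TOKEN variables, and
# the namespace list. Set KUBECTL to a stub to test without touching a cluster.
create_pull_secret() {
  local kubectl="${KUBECTL:-kubectl}" ns
  for ns in ai sandbox; do
    $kubectl --context=prod -n "$ns" create secret docker-registry registry-credentials \
      --docker-server=git.dxod.org \
      --docker-username="$REGISTRY_USER" \
      --docker-password="$REGISTRY_TOKEN" || return 1
  done
}
```

Pods only use such a secret if it is listed under `imagePullSecrets` in the pod spec or attached to the relevant service account.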
|
||||
|
||||
---
|
||||
|
||||
## Step 1 — Configure kubectl Context
|
||||
|
||||
Create a dedicated context named `prod` that defaults to the `ai` namespace:
|
||||
|
||||
```bash
# Add cluster credentials (replace with your actual kubeconfig details)
kubectl config set-cluster prod-cluster \
  --server=https://<your-cluster-api-endpoint> \
  --certificate-authority=/path/to/ca.crt

kubectl config set-credentials prod-user \
  --client-certificate=/path/to/client.crt \
  --client-key=/path/to/client.key

kubectl config set-context prod \
  --cluster=prod-cluster \
  --user=prod-user \
  --namespace=ai

# Verify
kubectl --context=prod cluster-info
```
|
||||
|
||||
All `bin/` scripts use `kubectl --context=prod` for production operations.
|
||||
|
||||
---
|
||||
|
||||
## Step 2 — Install Cluster Prerequisites
|
||||
|
||||
### nginx-ingress-controller
|
||||
|
||||
```bash
|
||||
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.10.0/deploy/static/provider/cloud/deploy.yaml
|
||||
kubectl -n ingress-nginx wait --for=condition=ready pod -l app.kubernetes.io/component=controller --timeout=120s
|
||||
```
|
||||
|
||||
### cert-manager
|
||||
|
||||
```bash
|
||||
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml
|
||||
kubectl -n cert-manager wait --for=condition=ready pod -l app=cert-manager --timeout=120s
|
||||
```
|
||||
|
||||
Then create the `letsencrypt-prod` ClusterIssuer, replacing the email address with your own:
|
||||
|
||||
```yaml
# Save as /tmp/clusterissuer.yaml and apply
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@dexorder.ai
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - http01:
          ingress:
            class: nginx
```
|
||||
|
||||
```bash
|
||||
kubectl apply -f /tmp/clusterissuer.yaml
|
||||
```
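
Before moving on, it can help to confirm that cert-manager has accepted the issuer. The helper below is a minimal sketch that blocks until the ClusterIssuer reports a `Ready` condition; the function name and the default timeout are assumptions.

```bash
# Sketch: block until cert-manager reports the ClusterIssuer as Ready.
# The optional first argument is a kubectl-style timeout (default 60s, an assumption).
wait_for_issuer() {
  local kubectl="${KUBECTL:-kubectl}"
  $kubectl --context=prod wait --for=condition=Ready \
    clusterissuer/letsencrypt-prod --timeout="${1:-60s}"
}
```

If the wait times out, `kubectl --context=prod describe clusterissuer letsencrypt-prod` usually shows the reason in its status conditions.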
|
||||
|
||||
---
|
||||
|
||||
## Step 3 — Set Up 1Password Vault
|
||||
|
||||
All production secrets are stored under the **AI Prod** vault in 1Password. The `bin/op-setup` script creates the vault and all required items with placeholder values so you can fill them in before deploying.
|
||||
|
||||
```bash
|
||||
# Sign in to 1Password
|
||||
op signin
|
||||
|
||||
# Preview what will be created (no changes)
|
||||
bin/op-setup --dry-run
|
||||
|
||||
# Create the vault and all items
|
||||
bin/op-setup
|
||||
```
|
||||
|
||||
After running the script, open 1Password and update each item in the **AI Prod** vault with real values:
|
||||
|
||||
| Item | Fields | Where to get the value |
|------|--------|------------------------|
| `PostgreSQL` | `password` | Generate: `openssl rand -base64 32` |
| `MinIO` | `access_key`, `secret_key` | `access_key` can stay `minio-admin`; generate a strong `secret_key` |
| `Gateway` | `anthropic_api_key` | [Anthropic Console](https://console.anthropic.com) → API Keys |
| `Gateway` | `jwt_secret` | Generate: `openssl rand -base64 48` |
| `Gateway` | `openai_api_key` | [OpenAI Platform](https://platform.openai.com) → API Keys (optional) |
| `Gateway` | `google_api_key` | Google AI Studio (optional) |
| `Gateway` | `openrouter_api_key` | [OpenRouter](https://openrouter.ai) (optional) |
| `Telegram` | `bot_token` | BotFather → `/newbot` (optional) |
| `Ingestor` | `binance_api_key/secret` | Binance API Management (optional) |
| `Ingestor` | `coinbase_api_key/secret` | Coinbase CDP Portal (optional) |
| `Ingestor` | `kraken_api_key/secret` | Kraken API Settings (optional) |
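
The `openssl` commands in the table can be wrapped in a small helper so that every generated value comes from the same code path. This is only a sketch; the function name `gen_secret` and the `/dev/urandom` fallback are assumptions, not part of `bin/op-setup`.

```bash
# Sketch: print a random secret built from N bytes of entropy, base64-encoded.
# Falls back to /dev/urandom when openssl is not installed.
gen_secret() {
  local bytes="${1:-32}"
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 "$bytes"
  else
    head -c "$bytes" /dev/urandom | base64
  fi
}
```

For example, `gen_secret 32` matches the PostgreSQL row and `gen_secret 48` matches the `jwt_secret` row; 32 bytes encode to 44 base64 characters and 48 bytes to 64.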
|
||||
|
||||
Verify the references resolve before continuing:
|
||||
|
||||
```bash
|
||||
op inject -i deploy/k8s/prod/secrets/gateway-secrets.tpl.yaml | head -20
|
||||
```
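
`head -20` only shows the first lines of the rendered file. A stricter check, sketched below, fails if any `op://` reference survives rendering. `check_rendered` is a hypothetical helper name; it reads the rendered YAML on stdin and prints only a count, never the secret values.

```bash
# Sketch: read rendered YAML on stdin and fail if any op:// reference remains.
# Only a line count is printed, so secret values never reach the terminal.
check_rendered() {
  local leftover
  leftover="$(grep -c 'op://' || true)"
  if [ "$leftover" -gt 0 ]; then
    echo "unresolved references: $leftover line(s)" >&2
    return 1
  fi
  echo "all references resolved"
}
```

It would be used as `op inject -i deploy/k8s/prod/secrets/gateway-secrets.tpl.yaml | check_rendered`. Run it with `set -o pipefail` enabled, because an `op inject` failure produces empty input, which this check alone would accept.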
|
||||
|
||||
---
|
||||
|
||||
## Step 4 — Apply Base Manifests
|
||||
|
||||
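This creates namespaces, RBAC, network policies, admission policies, and resource quotas. It also applies the infrastructure workloads from `deploy/k8s/prod/infrastructure.yaml` (see Step 7); those pods may not become ready until the secrets and configs from Steps 5 and 6 exist.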
|
||||
|
||||
```bash
|
||||
kubectl --context=prod apply -k deploy/k8s/prod/
|
||||
```
|
||||
|
||||
Verify the namespaces and key resources are created:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod get namespaces ai sandbox
|
||||
kubectl --context=prod -n ai get serviceaccount gateway
|
||||
kubectl --context=prod -n sandbox get serviceaccount sandbox-lifecycle
|
||||
kubectl --context=prod get validatingadmissionpolicy dexorder-sandbox-image-policy
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 5 — Apply Secrets
|
||||
|
||||
```bash
|
||||
# Apply all secrets (uses op inject to resolve op:// references)
|
||||
bin/secret-update prod
|
||||
```
|
||||
|
||||
This will prompt for confirmation, then apply all 7 secrets:
- `ai-secrets` (Anthropic API key)
- `postgres-secret` (PostgreSQL password)
- `minio-secret` (MinIO credentials)
- `ingestor-secrets` (exchange API keys)
- `flink-secrets` (MinIO credentials for Flink)
- `gateway-secrets` (gateway application secrets)
- `sandbox-secrets` (secrets mounted in sandbox pods)
|
||||
|
||||
Verify:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai get secrets
|
||||
kubectl --context=prod -n sandbox get secret sandbox-secrets
|
||||
```
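
The listing above has to be checked by eye. As an alternative, the sketch below looks up each of the seven secrets by name and reports the missing ones. The function name is hypothetical, and placing the first six secrets in the `ai` namespace is an assumption inferred from the verification commands above.

```bash
# Sketch: report every expected secret that is missing and return non-zero if any are.
# Assumption: the six non-sandbox secrets live in the ai namespace.
check_secrets() {
  local kubectl="${KUBECTL:-kubectl}" missing=0 name
  for name in ai-secrets postgres-secret minio-secret ingestor-secrets \
              flink-secrets gateway-secrets; do
    $kubectl --context=prod -n ai get secret "$name" >/dev/null 2>&1 \
      || { echo "missing: ai/$name"; missing=1; }
  done
  $kubectl --context=prod -n sandbox get secret sandbox-secrets >/dev/null 2>&1 \
    || { echo "missing: sandbox/sandbox-secrets"; missing=1; }
  return "$missing"
}
```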
|
||||
|
||||
---
|
||||
|
||||
## Step 6 — Apply Configs
|
||||
|
||||
```bash
|
||||
# Apply all configs (gateway-config uses op inject; others are plain YAML)
|
||||
bin/config-update prod
|
||||
```
|
||||
|
||||
This applies:
- `relay-config` — ZMQ relay configuration
- `ingestor-config` — CCXT ingestor configuration
- `flink-config` — Flink job configuration
- `gateway-config` — Gateway config (DB credentials resolved via op inject)
|
||||
|
||||
Verify:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai get configmaps
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 7 — Deploy Infrastructure
|
||||
|
||||
Infrastructure services (postgres, minio, kafka, iceberg-catalog, dragonfly, qdrant, relay, ingestor, flink) are defined in `deploy/k8s/prod/infrastructure.yaml` and were applied in Step 4.
|
||||
|
||||
Wait for the StatefulSets and Deployments to become ready:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai rollout status statefulset/postgres
|
||||
kubectl --context=prod -n ai rollout status statefulset/minio
|
||||
kubectl --context=prod -n ai rollout status statefulset/kafka
|
||||
kubectl --context=prod -n ai rollout status statefulset/qdrant
|
||||
kubectl --context=prod -n ai rollout status deployment/dragonfly
|
||||
kubectl --context=prod -n ai rollout status deployment/iceberg-catalog
|
||||
kubectl --context=prod -n ai rollout status deployment/relay
|
||||
kubectl --context=prod -n ai rollout status deployment/ingestor
|
||||
kubectl --context=prod -n ai rollout status deployment/flink-jobmanager
|
||||
kubectl --context=prod -n ai rollout status deployment/flink-taskmanager
|
||||
```
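
Run one after another, the commands above stop being useful once the first one hangs. The loop below is a sketch of the same checks with a per-resource timeout and a summary exit code. The function name and the 300-second timeout are assumptions.

```bash
# Sketch: wait for each given workload in the ai namespace and keep going on failure,
# so one slow rollout does not hide the state of the others.
wait_for_rollouts() {
  local kubectl="${KUBECTL:-kubectl}" failed=0 resource
  for resource in "$@"; do
    $kubectl --context=prod -n ai rollout status "$resource" --timeout=300s || {
      echo "not ready: $resource" >&2
      failed=1
    }
  done
  return "$failed"
}
```

For example: `wait_for_rollouts statefulset/postgres statefulset/minio statefulset/kafka statefulset/qdrant deployment/dragonfly deployment/iceberg-catalog deployment/relay deployment/ingestor deployment/flink-jobmanager deployment/flink-taskmanager`.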
|
||||
|
||||
A `minio-init` Job creates the `warehouse` bucket automatically on first start. Confirm it completes:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai get jobs
|
||||
kubectl --context=prod -n ai wait --for=condition=complete job/minio-init --timeout=120s
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 8 — Deploy Application Images
|
||||
|
||||
Build and push the application images:
|
||||
|
||||
```bash
|
||||
# Build and push all services
|
||||
bin/deploy gateway prod
|
||||
bin/deploy web prod
|
||||
bin/deploy sandbox prod
|
||||
bin/deploy lifecycle-sidecar prod
|
||||
bin/deploy flink prod
|
||||
bin/deploy relay prod
|
||||
bin/deploy ingestor prod
|
||||
```
|
||||
|
||||
Each `bin/deploy` command builds the Docker image, tags it with the current git SHA, pushes to `git.dxod.org/dexorder/dexorder/`, and updates the live deployment via `kubectl set image`.
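
The `bin/deploy` script itself is not reproduced here. As a rough mental model of the four steps it performs, the sketch below prints equivalent commands for one service without running them. Everything specific in it is an assumption: the image path layout `<registry>/<service>:<sha>`, the build context directory, and a deployment and container name equal to the service name (the web deployment, for instance, is actually called `ai-web`).

```bash
# Sketch of the build / tag / push / update sequence (prints commands, runs nothing).
# Assumptions: image path layout, build context ./<service>/, and a deployment and
# container that are both named after the service.
deploy_plan() {
  local service="$1" sha="$2"
  local image="git.dxod.org/dexorder/dexorder/${service}:${sha}"
  echo "docker build -t ${image} ${service}/"
  echo "docker push ${image}"
  echo "kubectl --context=prod -n ai set image deployment/${service} ${service}=${image}"
}
```

Calling `deploy_plan gateway "$(git rev-parse --short HEAD)"` shows what a SHA-tagged rollout of the gateway might look like under those assumptions.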
|
||||
|
||||
Wait for the gateway and web to be ready:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai rollout status deployment/gateway
|
||||
kubectl --context=prod -n ai rollout status deployment/ai-web
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Step 9 — Initialize Schema and Admin User
|
||||
|
||||
```bash
|
||||
bin/init prod
|
||||
```
|
||||
|
||||
This will:
1. Wait for postgres to be ready
2. Check if the schema exists; apply `gateway/schema.sql` if not
3. Prompt for admin user credentials (email, password, display name, license tier)
4. Register the user via the API
5. Insert the license record into the database
|
||||
|
||||
---
|
||||
|
||||
## Step 10 — Verify TLS and Ingress
|
||||
|
||||
cert-manager should automatically provision TLS certificates via Let's Encrypt once the ingress resources are applied and DNS is resolving correctly.
|
||||
|
||||
```bash
|
||||
# Check certificate status
|
||||
kubectl --context=prod -n ai get certificates
|
||||
kubectl --context=prod -n ai describe certificate dexorder-ai-tls
|
||||
|
||||
# Certificates are ready when READY=True
|
||||
# This can take 1-2 minutes for HTTP-01 challenge completion
|
||||
```
|
||||
|
||||
Once ready, verify the application is accessible:
|
||||
|
||||
```bash
|
||||
curl -I https://dexorder.ai/api/health
|
||||
# Expected: HTTP/2 200
|
||||
```
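
Because certificate issuance can take a minute or two, a single `curl` may fail simply because it ran too early. The helper below is a sketch that polls the health endpoint until it succeeds or a retry budget is exhausted. The function name, the default of 30 attempts, and the `CURL` and `RETRY_DELAY` overrides are assumptions.

```bash
# Sketch: poll a URL until it returns a successful HTTP status or attempts run out.
# CURL and RETRY_DELAY are hypothetical overrides, mainly useful for testing.
wait_for_health() {
  local url="$1" attempts="${2:-30}" curl_bin="${CURL:-curl}" i=1
  while [ "$i" -le "$attempts" ]; do
    if $curl_bin -fsS -o /dev/null "$url"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "${RETRY_DELAY:-10}"
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}
```

For this guide the call would be `wait_for_health https://dexorder.ai/api/health`.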
|
||||
|
||||
---
|
||||
|
||||
## Day-2 Operations
|
||||
|
||||
### Update a Service After Code Changes
|
||||
|
||||
```bash
|
||||
# Rebuild and redeploy a single service
|
||||
bin/deploy gateway prod
|
||||
bin/deploy web prod
|
||||
```
|
||||
|
||||
### Update Secrets
|
||||
|
||||
```bash
|
||||
# Update all secrets
|
||||
bin/secret-update prod
|
||||
|
||||
# Update a specific secret
|
||||
bin/secret-update prod ai-secrets
|
||||
```
|
||||
|
||||
### Update Config
|
||||
|
||||
```bash
|
||||
# Update all configs (triggers pod restarts)
|
||||
bin/config-update prod
|
||||
|
||||
# Update a specific config
|
||||
bin/config-update prod gateway-config
|
||||
```
|
||||
|
||||
### Add a New User
|
||||
|
||||
```bash
|
||||
# Re-run init to add another user
|
||||
bin/init prod
|
||||
```
|
||||
|
||||
Or insert directly via psql:
|
||||
|
||||
```bash
|
||||
PG_POD=$(kubectl --context=prod -n ai get pods -l app=postgres -o jsonpath='{.items[0].metadata.name}')
|
||||
kubectl --context=prod -n ai exec -it "$PG_POD" -- psql -U postgres -d iceberg
|
||||
```
|
||||
|
||||
### View Logs
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai logs -f deployment/gateway
|
||||
kubectl --context=prod -n ai logs -f deployment/ingestor
|
||||
kubectl --context=prod -n ai logs -f deployment/flink-jobmanager
|
||||
kubectl --context=prod -n sandbox logs -l dexorder.io/component=sandbox
|
||||
```
|
||||
|
||||
### Check Sandbox Status
|
||||
|
||||
```bash
|
||||
# List all running sandboxes
|
||||
kubectl --context=prod -n sandbox get deployments
|
||||
kubectl --context=prod -n sandbox get pods
|
||||
|
||||
# Check resource usage in sandbox namespace (requires metrics-server)
|
||||
kubectl --context=prod -n sandbox top pods
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Namespace & Security Architecture
|
||||
|
||||
```
Internet
    │
    ▼
nginx-ingress (dexorder.ai)
    │
    ├──/──────────────────► ai-web:5173 (Vue.js UI)
    │
    └──/api/───────────────► gateway:3000 (Node.js API)
                                  │
                                  │ Creates/manages via k8s API
                                  ▼
                          sandbox namespace
                          ┌──────────────────────┐
                          │ sandbox-<userId>     │
                          │ ├── sandbox          │
                          │ │    (MCP server)    │
                          │ └── lifecycle-sidecar│
                          └──────────────────────┘
                                  │
                                  │ Egress: only ai namespace
                                  │ services + external HTTPS
                                  ▼
                          ai namespace services:
                            gateway:5571 (ZMQ events)
                            iceberg-catalog:8181
                            minio:9000
                            relay:5559
```
|
||||
|
||||
### Network Isolation
|
||||
|
||||
- Sandbox pods have default-deny network policy
- Sandboxes can reach: gateway (ZMQ + callbacks), iceberg-catalog, minio, relay, external HTTPS (port 443)
- Sandboxes cannot reach: other sandbox pods, the Kubernetes API, private IP ranges
- The admission policy (`dexorder-sandbox-image-policy`) prevents non-approved images from running in the sandbox namespace
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
### Pods stuck in `Pending`
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai describe pod <pod-name>
|
||||
# Look for: resource quota exceeded, PVC not bound, image pull errors
|
||||
```
|
||||
|
||||
### Certificate not issuing
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n ai describe certificaterequest
|
||||
kubectl --context=prod -n cert-manager logs -l app=cert-manager
|
||||
# Common cause: DNS not pointing to cluster ingress IP yet
|
||||
```
|
||||
|
||||
### Gateway can't create sandboxes
|
||||
|
||||
```bash
|
||||
# Verify RBAC is correct
|
||||
kubectl --context=prod auth can-i create deployments \
  --as=system:serviceaccount:ai:gateway -n sandbox
|
||||
|
||||
# Should return: yes
|
||||
```
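
A single `can-i` call only covers one permission. The sketch below compares the answer with an expected value, so the same helper can confirm both what the gateway may do and what it must not do. The function name and output format are assumptions.

```bash
# Sketch: run `kubectl auth can-i` and compare the answer with an expected yes/no.
# `can-i` exits non-zero when the answer is "no", hence the `|| true`.
check_can_i() {
  local expected="$1"; shift
  local kubectl="${KUBECTL:-kubectl}" answer
  answer="$($kubectl --context=prod auth can-i "$@" 2>/dev/null)" || true
  if [ "$answer" = "$expected" ]; then
    echo "ok: can-i $* -> $answer"
  else
    echo "MISMATCH: can-i $* -> ${answer:-<none>} (expected $expected)"
    return 1
  fi
}
```

For example, `check_can_i yes create deployments --as=system:serviceaccount:ai:gateway -n sandbox` should pass, and `check_can_i no get secrets --as=system:serviceaccount:ai:gateway -n sandbox` checks that the gateway cannot read secrets.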
|
||||
|
||||
### Sandbox pod fails to start with "configmap not found"
|
||||
|
||||
This would indicate a leftover reference to `sandbox-config` (removed from the template). Check the sandbox deployment spec:
|
||||
|
||||
```bash
|
||||
kubectl --context=prod -n sandbox describe deployment sandbox-<userId>
|
||||
```
|
||||
|
||||
### 1Password auth expired
|
||||
|
||||
```bash
|
||||
op signin
|
||||
bin/secret-update prod
|
||||
```
|
||||
@@ -1,8 +1,8 @@
|
||||
-# DexOrder AI Platform Architecture
+# Dexorder AI Platform Architecture
|
||||
|
||||
## Overview
|
||||
|
||||
-DexOrder is an AI-powered trading platform that combines real-time market data processing, user-specific AI agents, and a flexible data pipeline. The system is designed for scalability, isolation, and extensibility.
+Dexorder is an AI-powered trading platform that combines real-time market data processing, user-specific AI agents, and a flexible data pipeline. The system is designed for scalability, isolation, and extensibility.
|
||||
|
||||
## High-Level Architecture
|
||||
|
||||
@@ -415,12 +415,12 @@ User authenticates → Gateway checks if deployment exists
|
||||
### RBAC
|
||||
|
||||
**Gateway ServiceAccount:**
|
||||
-- Create deployments/services/PVCs in `dexorder-sandboxes` namespace
+- Create deployments/services/PVCs in `sandbox` namespace
|
||||
- Read pod status and logs
|
||||
- Cannot delete, exec, or access secrets
|
||||
|
||||
**Lifecycle Sidecar ServiceAccount:**
|
||||
-- Delete deployments in `dexorder-sandboxes` namespace
+- Delete deployments in `sandbox` namespace
|
||||
- Delete PVCs (conditional on user type)
|
||||
- Cannot access other resources
|
||||
|
||||
@@ -428,7 +428,7 @@ User authenticates → Gateway checks if deployment exists
|
||||
|
||||
### Admission Control
|
||||
|
||||
-All pods in `dexorder-sandboxes` namespace must:
+All pods in `sandbox` namespace must:
|
||||
- Use approved images only (allowlist)
|
||||
- Run as non-root
|
||||
- Drop all capabilities
|
||||
@@ -550,7 +550,7 @@ docker push ghcr.io/dexorder/lifecycle-sidecar:latest
|
||||
|
||||
**Namespaces:**
|
||||
- `dexorder-system` - Platform services (gateway, infrastructure)
|
||||
-- `dexorder-sandboxes` - User containers (isolated)
+- `sandbox` - User containers (isolated)
|
||||
|
||||
---
|
||||
|
||||
|
||||
@@ -85,7 +85,7 @@ Runs alongside the agent container with shared PID namespace. Monitors the main
|
||||
- `USER_TYPE`: License tier (`anonymous`, `free`, `paid`, `enterprise`)
|
||||
- `MAIN_CONTAINER_PID`: PID of main container (default: 1)
|
||||
|
||||
-**RBAC**: Has permission to delete deployments and PVCs **only in dexorder-sandboxes namespace**. Cannot delete other deployments due to:
+**RBAC**: Has permission to delete deployments and PVCs **only in sandbox namespace**. Cannot delete other deployments due to:
|
||||
1. Only knows its own deployment name (from env)
|
||||
2. RBAC scoped to namespace
|
||||
3. No cross-pod communication
|
||||
@@ -164,12 +164,12 @@ Configured via `USER_TYPE` env var in deployment.
|
||||
**Lifecycle Sidecar**:
|
||||
- Can delete its own deployment only
|
||||
- Cannot delete other deployments
|
||||
-- Scoped to dexorder-sandboxes namespace
+- Scoped to sandbox namespace
|
||||
- No exec, no secrets access
|
||||
|
||||
### Admission Control
|
||||
|
||||
-All deployments in `dexorder-sandboxes` namespace are subject to:
+All deployments in `sandbox` namespace are subject to:
|
||||
- Image allowlist (only approved images)
|
||||
- Security context enforcement (non-root, drop caps, read-only rootfs)
|
||||
- Resource limits required
|
||||
@@ -198,7 +198,7 @@ kubectl apply -k deploy/k8s/dev # or prod
|
||||
```
|
||||
|
||||
This creates:
|
||||
-- Namespaces (`dexorder-system`, `dexorder-sandboxes`)
+- Namespaces (`dexorder-system`, `sandbox`)
|
||||
- RBAC (gateway, lifecycle sidecar)
|
||||
- Admission policies
|
||||
- Network policies
|
||||
@@ -257,7 +257,7 @@ cd lifecycle-sidecar
|
||||
go build -o lifecycle-sidecar main.go
|
||||
|
||||
# Run (requires k8s config)
|
||||
-export NAMESPACE=dexorder-sandboxes
+export NAMESPACE=sandbox
|
||||
export DEPLOYMENT_NAME=agent-test
|
||||
export USER_TYPE=free
|
||||
./lifecycle-sidecar
|
||||
@@ -277,7 +277,7 @@ export USER_TYPE=free
|
||||
|
||||
Check logs:
|
||||
```bash
|
||||
-kubectl logs -n dexorder-sandboxes sandbox-user-abc123 -c agent
+kubectl logs -n sandbox sandbox-user-abc123 -c agent
|
||||
```
|
||||
|
||||
Verify:
|
||||
@@ -289,19 +289,19 @@ Verify:
|
||||
|
||||
Check sidecar logs:
|
||||
```bash
|
||||
-kubectl logs -n dexorder-sandboxes sandbox-user-abc123 -c lifecycle-sidecar
+kubectl logs -n sandbox sandbox-user-abc123 -c lifecycle-sidecar
|
||||
```
|
||||
|
||||
Verify:
|
||||
- Exit code file exists: `/var/run/agent/exit_code` contains `42`
|
||||
-- RBAC permissions: `kubectl auth can-i delete deployments --as=system:serviceaccount:dexorder-sandboxes:sandbox-lifecycle -n dexorder-sandboxes`
+- RBAC permissions: `kubectl auth can-i delete deployments --as=system:serviceaccount:sandbox:sandbox-lifecycle -n sandbox`
|
||||
- Deployment name matches: Check `DEPLOYMENT_NAME` env var
|
||||
|
||||
### Gateway can't create deployments
|
||||
|
||||
Check gateway logs and verify:
|
||||
- ServiceAccount exists: `kubectl get sa gateway -n dexorder-system`
|
||||
-- RoleBinding exists: `kubectl get rolebinding gateway-sandbox-creator -n dexorder-sandboxes`
+- RoleBinding exists: `kubectl get rolebinding gateway-sandbox-creator -n sandbox`
|
||||
- Admission policy allows image: Check image name matches allowlist in `admission-policy.yaml`
|
||||
|
||||
## Future Enhancements
|
||||
|
||||
@@ -63,7 +63,7 @@ userId: "user-abc123"
|
||||
deploymentName: "sandbox-user-abc123"
|
||||
serviceName: "sandbox-user-abc123"
|
||||
pvcName: "sandbox-user-abc123-data"
|
||||
-mcpEndpoint: "http://sandbox-user-abc123.dexorder-sandboxes.svc.cluster.local:3000"
+mcpEndpoint: "http://sandbox-user-abc123.sandbox.svc.cluster.local:3000"
|
||||
```
|
||||
|
||||
User IDs are sanitized to be Kubernetes-compliant (lowercase alphanumeric + hyphens).
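
The gateway's actual sanitization code is not shown in this diff. Purely as an illustration of the stated rule (lowercase alphanumerics plus hyphens), a shell version might look like the sketch below. The function name, the choice to replace invalid characters with hyphens, and the trimming of leading and trailing hyphens are all assumptions.

```bash
# Sketch: reduce an arbitrary user ID to lowercase alphanumerics and hyphens.
# Assumptions: invalid characters become hyphens; edge hyphens are trimmed.
sanitize_user_id() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9-]/-/g' -e 's/^-*//' -e 's/-*$//'
}
```

This sketch does not enforce a maximum length. Kubernetes limits DNS label names to 63 characters, so a real implementation would also need to truncate or hash long IDs.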
|
||||
@@ -145,7 +145,7 @@ Environment variables:
|
||||
|
||||
```bash
|
||||
# Kubernetes
|
||||
-KUBERNETES_NAMESPACE=dexorder-sandboxes
+KUBERNETES_NAMESPACE=sandbox
|
||||
KUBERNETES_IN_CLUSTER=true # false for local dev
|
||||
KUBERNETES_CONTEXT=minikube # for local dev only
|
||||
|
||||
@@ -162,9 +162,9 @@ SANDBOX_STORAGE_CLASS=standard
|
||||
The gateway uses a restricted ServiceAccount with RBAC:
|
||||
|
||||
**Can do:**
|
||||
-- ✅ Create deployments in `dexorder-sandboxes` namespace
-- ✅ Create services in `dexorder-sandboxes` namespace
-- ✅ Create PVCs in `dexorder-sandboxes` namespace
+- ✅ Create deployments in `sandbox` namespace
+- ✅ Create services in `sandbox` namespace
+- ✅ Create PVCs in `sandbox` namespace
|
||||
- ✅ Read pod status and logs (debugging)
|
||||
- ✅ Update deployments (future: resource scaling)
|
||||
|
||||
@@ -226,7 +226,7 @@ kubectl apply -k deploy/k8s/dev
|
||||
# .env
|
||||
KUBERNETES_IN_CLUSTER=false
|
||||
KUBERNETES_CONTEXT=minikube
|
||||
-KUBERNETES_NAMESPACE=dexorder-sandboxes
+KUBERNETES_NAMESPACE=sandbox
|
||||
```
|
||||
|
||||
4. Run gateway:
|
||||
@@ -242,9 +242,9 @@ wscat -c "ws://localhost:3000/ws/chat" -H "Authorization: Bearer your-jwt"
|
||||
|
||||
The gateway will create deployments in minikube. View with:
|
||||
```bash
|
||||
-kubectl get deployments -n dexorder-sandboxes
-kubectl get pods -n dexorder-sandboxes
-kubectl logs -n dexorder-sandboxes sandbox-user-abc123 -c agent
+kubectl get deployments -n sandbox
+kubectl get pods -n sandbox
+kubectl logs -n sandbox sandbox-user-abc123 -c agent
|
||||
```
|
||||
|
||||
## Production Deployment
|
||||
@@ -262,7 +262,7 @@ kubectl apply -k deploy/k8s/prod
|
||||
```
|
||||
|
||||
3. Gateway runs in `dexorder-system` namespace
|
||||
-4. Creates agent containers in `dexorder-sandboxes` namespace
+4. Creates agent containers in `sandbox` namespace
|
||||
5. Admission policies enforce image allowlist and security constraints
|
||||
|
||||
## Monitoring
|
||||
|
||||
@@ -1169,7 +1169,7 @@ apiVersion: networking.k8s.io/v1
|
||||
kind: NetworkPolicy
|
||||
metadata:
|
||||
name: agent-to-gateway-events
|
||||
-namespace: dexorder-sandboxes
+namespace: sandbox
|
||||
spec:
|
||||
podSelector:
|
||||
matchLabels:
|
||||
|
||||