container lifecycle management

2026-03-12 15:13:38 -04:00
parent e99ef5d2dd
commit b9cc397e05
61 changed files with 6880 additions and 31 deletions

doc/agent_harness_flow.md

@@ -0,0 +1,21 @@
┌─────────────────────────────────────────────────┐
│ Agent Harness (your servers) │
│ │
│ on_message(user_id, message): │
│ 1. Look up user's MCP endpoint from Postgres │
│ 2. mcp.call("get_context_summary") │
│ 3. mcp.call("get_conversation_history", 20) │
│ 4. Build prompt: │
│ system = BASE_PROMPT │
│ + context_summary │
│ + user_agent_prompt (from MCP) │
│ messages = history + new message │
│ 5. LLM call (your API key) │
│ 6. While LLM wants tool calls: │
│ - Platform tools → handle locally │
│ - User tools → proxy to MCP │
│ - LLM call again with results │
│ 7. mcp.call("save_message", ...) │
│ 8. Return response to user │
│ │
└─────────────────────────────────────────────────┘


@@ -1,9 +1,11 @@
Generally use skills instead of subagents, except for the analysis subagent.
## User-specific files
## User-specific files and tools
* Indicators
* Strategies
* Watchlists
* Preferences
* Trading style
* Charting / colors
* Executors (really just sub-strategies)
  * tactical-level order generators, e.g. TWAP, iceberg, etc.


@@ -1,18 +0,0 @@
This file describes all the configuration options used by all components. All configuration is divided into regular config and secrets, and k8s will mount either or both as a yaml file accessible to the process.
# Configuration
* `flink_hostname`
* ... various zmq ports for flink ...
* `iceberg_catalog_hostname`
* `iceberg_catalog_port`
* `iceberg_catalog_database`
* etc
# Secrets
* `iceberg_catalog_username`
* `iceberg_catalog_password`
* etc.


@@ -0,0 +1,313 @@
# Container Lifecycle Management
## Overview
User agent containers self-manage their lifecycle to optimize resource usage. Containers automatically shut down when idle (no triggers + no recent activity) and clean themselves up using a lifecycle sidecar.
## Architecture
```
┌──────────────────────────────────────────────────────────┐
│ Agent Pod │
│ ┌───────────────────┐ ┌──────────────────────┐ │
│ │ Agent Container │ │ Lifecycle Sidecar │ │
│ │ ─────────────── │ │ ────────────────── │ │
│ │ │ │ │ │
│ │ Lifecycle Manager │ │ Watches exit code │ │
│ │ - Track activity │ │ - Detects exit 42 │ │
│ │ - Track triggers │ │ - Calls k8s API │ │
│ │ - Exit 42 if idle │ │ - Deletes deployment │ │
│ └───────────────────┘ └──────────────────────┘ │
│ │ │ │
│ │ writes exit_code │ │
│ └────►/var/run/agent/exit_code │
│ │ │
└───────────────────────────────────────┼──────────────────┘
▼ k8s API (RBAC)
┌─────────────────────┐
│ Delete Deployment │
│ Delete PVC (if anon)│
└─────────────────────┘
```
## Components
### 1. Lifecycle Manager (Python)
**Location**: `client-py/dexorder/lifecycle_manager.py`
Runs inside the agent container and tracks:
- **Activity**: MCP tool/resource/prompt calls reset the idle timer
- **Triggers**: Data subscriptions, CEP patterns, etc.
- **Idle state**: No triggers + idle timeout exceeded
**Configuration** (via environment variables):
- `IDLE_TIMEOUT_MINUTES`: Minutes of inactivity before shutdown (default: 15)
- `IDLE_CHECK_INTERVAL_SECONDS`: Check frequency (default: 60)
- `ENABLE_IDLE_SHUTDOWN`: Enable/disable shutdown (default: true)
**Usage in agent code**:
```python
from dexorder.lifecycle_manager import get_lifecycle_manager
# On startup
manager = get_lifecycle_manager()
await manager.start()
# On MCP calls (tool/resource/prompt)
manager.record_activity()
# When triggers change
manager.add_trigger("data_sub_BTC_USDT")
manager.remove_trigger("data_sub_BTC_USDT")
# Or batch update
manager.update_triggers({"trigger_1", "trigger_2"})
```
**Exit behavior**:
- Idle shutdown: Exit with code `42`
- Signal (SIGTERM/SIGINT): Exit with code `0` (allows restart)
- Errors/crashes: Exit with error code (allows restart)
### 2. Lifecycle Sidecar (Go)
**Location**: `lifecycle-sidecar/`
Runs alongside the agent container with shared PID namespace. Monitors the main container process and:
- On exit code `42`: Deletes deployment (and PVC if anonymous user)
- On any other exit code: Exits with same code (k8s restarts pod)
**Configuration** (via environment, injected by downward API):
- `NAMESPACE`: Pod's namespace
- `DEPLOYMENT_NAME`: Deployment name (from pod label)
- `USER_TYPE`: License tier (`anonymous`, `free`, `paid`, `enterprise`)
- `MAIN_CONTAINER_PID`: PID of main container (default: 1)
**RBAC**: Has permission to delete deployments and PVCs **only in dexorder-agents namespace**. Cannot delete other deployments due to:
1. Only knows its own deployment name (from env)
2. RBAC scoped to namespace
3. No cross-pod communication
### 3. Gateway (TypeScript)
**Location**: `gateway/src/harness/agent-harness.ts`
Creates agent deployments when users connect. Has permissions to:
- ✅ Create deployments, services, PVCs
- ✅ Read pod status and logs
- ✅ Update deployments (e.g., resource limits)
- ❌ Delete deployments (handled by sidecar)
- ❌ Exec into pods
- ❌ Access secrets
## Lifecycle States
```
┌─────────────┐
│ CREATED │ ← Gateway creates deployment
└──────┬──────┘
┌─────────────┐
│ RUNNING │ ← User interacts, has triggers
└──────┬──────┘
┌─────────────┐
│ IDLE │ ← No triggers + timeout exceeded
└──────┬──────┘
┌─────────────┐
│ SHUTDOWN │ ← Exit code 42
└──────┬──────┘
┌─────────────┐
│ DELETED │ ← Sidecar deletes deployment
└─────────────┘
```
## Idle Detection Logic
Container is **IDLE** when:
1. `active_triggers` is empty, AND
2. `(now - last_activity) > idle_timeout`
Container is **ACTIVE** when:
1. It has any active triggers (data subscriptions, CEP patterns, etc.), OR
2. There has been recent user activity (MCP calls within the idle timeout).
The sketch below illustrates this check.
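A minimal sketch of the check as it might run in the lifecycle manager's periodic loop, assuming the manager exposes `active_triggers` and a monotonic `last_activity` timestamp; the exit-code file path comes from the architecture diagram above, and this is illustrative rather than the actual implementation:
```python
import sys
import time
from pathlib import Path

EXIT_CODE_FILE = Path("/var/run/agent/exit_code")  # shared with the lifecycle sidecar

def check_idle_and_maybe_exit(manager, idle_timeout_s: float, enable_shutdown: bool = True) -> bool:
    """Return True (and shut down) when there are no triggers and the idle timeout has passed."""
    idle = not manager.active_triggers and (time.monotonic() - manager.last_activity) > idle_timeout_s
    if idle and enable_shutdown:
        EXIT_CODE_FILE.parent.mkdir(parents=True, exist_ok=True)
        EXIT_CODE_FILE.write_text("42")  # record the reason for the sidecar
        sys.exit(42)                     # idle shutdown exit code
    return idle
```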
## Cleanup Policies by License Tier
| User Type | Idle Timeout | PVC Policy | Notes |
|--------------|--------------|------------|-------|
| Anonymous | 15 minutes | Delete | Ephemeral, no data retention |
| Free | 15 minutes | Retain | Can resume session |
| Paid | 60 minutes | Retain | Longer grace period |
| Enterprise | No shutdown | Retain | Always-on containers |
Configured via `USER_TYPE` env var in deployment.
## Security
### Principle of Least Privilege
**Gateway**:
- Can create agent resources
- Cannot delete agent resources
- Cannot access other namespaces
- Cannot exec into pods
**Lifecycle Sidecar**:
- Can delete its own deployment only
- Cannot delete other deployments
- Scoped to dexorder-agents namespace
- No exec, no secrets access
### Admission Control
All deployments in `dexorder-agents` namespace are subject to:
- Image allowlist (only approved images)
- Security context enforcement (non-root, drop caps, read-only rootfs)
- Resource limits required
- PodSecurity standards (restricted profile)
See `deploy/k8s/base/admission-policy.yaml`
### Network Isolation
Agents are network-isolated via NetworkPolicy:
- Can connect to gateway (MCP)
- Can connect to Redpanda (data streams)
- Can make outbound HTTPS (exchanges, LLM APIs)
- Cannot access k8s API
- Cannot access system namespace
- Cannot access other agent pods
See `deploy/k8s/base/network-policies.yaml`
## Deployment
### 1. Apply Security Policies
```bash
kubectl apply -k deploy/k8s/dev # or prod
```
This creates:
- Namespaces (`dexorder-system`, `dexorder-agents`)
- RBAC (gateway, lifecycle sidecar)
- Admission policies
- Network policies
- Resource quotas
### 2. Build and Push Lifecycle Sidecar
```bash
cd lifecycle-sidecar
docker build -t ghcr.io/dexorder/lifecycle-sidecar:latest .
docker push ghcr.io/dexorder/lifecycle-sidecar:latest
```
### 3. Gateway Creates Agent Deployments
When a user connects, the gateway creates:
- Deployment with agent + sidecar
- PVC for persistent data
- Service for MCP endpoint
See `deploy/k8s/base/agent-deployment-example.yaml` for template.
## Testing
### Test Lifecycle Manager Locally
```python
import asyncio

from dexorder.lifecycle_manager import LifecycleManager

# Disable actual shutdown for testing
manager = LifecycleManager(
    idle_timeout_minutes=1,
    check_interval_seconds=10,
    enable_shutdown=False,  # Only log, don't exit
)
await manager.start()

# Simulate activity
manager.record_activity()

# Simulate triggers
manager.add_trigger("test_trigger")
await asyncio.sleep(70)  # Wait past timeout

manager.remove_trigger("test_trigger")
await asyncio.sleep(70)  # Should detect idle

await manager.stop()
```
### Test Sidecar Locally
```bash
# Build
cd lifecycle-sidecar
go build -o lifecycle-sidecar main.go
# Run (requires k8s config)
export NAMESPACE=dexorder-agents
export DEPLOYMENT_NAME=agent-test
export USER_TYPE=free
./lifecycle-sidecar
```
### Integration Test
1. Deploy test agent with sidecar
2. Verify agent starts and is healthy
3. Stop sending MCP calls and remove all triggers
4. Wait for idle timeout + check interval
5. Verify deployment is deleted
## Troubleshooting
### Container not shutting down when idle
Check logs:
```bash
kubectl logs -n dexorder-agents agent-user-abc123 -c agent
```
Verify:
- `ENABLE_IDLE_SHUTDOWN=true`
- No active triggers: `manager.active_triggers` should be empty
- Idle timeout exceeded
### Sidecar not deleting deployment
Check sidecar logs:
```bash
kubectl logs -n dexorder-agents agent-user-abc123 -c lifecycle-sidecar
```
Verify:
- Exit code file exists: `/var/run/agent/exit_code` contains `42`
- RBAC permissions: `kubectl auth can-i delete deployments --as=system:serviceaccount:dexorder-agents:agent-lifecycle -n dexorder-agents`
- Deployment name matches: Check `DEPLOYMENT_NAME` env var
### Gateway can't create deployments
Check gateway logs and verify:
- ServiceAccount exists: `kubectl get sa gateway -n dexorder-system`
- RoleBinding exists: `kubectl get rolebinding gateway-agent-creator -n dexorder-agents`
- Admission policy allows image: Check image name matches allowlist in `admission-policy.yaml`
## Future Enhancements
1. **Graceful shutdown notifications**: Warn users before shutdown via websocket
2. **Predictive scaling**: Keep frequently-used containers warm
3. **Tiered storage**: Move old PVCs to cheaper storage class
4. **Metrics**: Expose lifecycle metrics (idle rate, shutdown count, etc.)
5. **Cost allocation**: Track resource usage per user/license tier


@@ -0,0 +1,286 @@
# Gateway Container Creation
## Overview
The gateway automatically provisions user agent containers when users authenticate. This ensures each user has their own isolated environment running their MCP server with persistent storage.
## Authentication Flow with Container Creation
```
User connects (WebSocket/Telegram)
Send "Authenticating..." status
Verify token/channel link
Lookup user license from DB
Send "Starting workspace..." status
┌────────────────────────────────────┐
│ ContainerManager.ensureRunning() │
│ ┌──────────────────────────────┐ │
│ │ Check if deployment exists │ │
│ └──────────────────────────────┘ │
│ ↓ │
│ Does it exist? │
│ ↙ ↘ │
│ Yes No │
│ │ │ │
│ │ ┌──────────────────┐ │
│ │ │ Create deployment│ │
│ │ │ Create PVC │ │
│ │ │ Create service │ │
│ │ └──────────────────┘ │
│ │ │ │
│ └────────────┘ │
│ ↓ │
│ Wait for deployment ready │
│ (polls every 2s, timeout 2min) │
│ ↓ │
│ Compute MCP endpoint URL │
│ (internal k8s service DNS) │
└────────────────────────────────────┘
Update license.mcpServerUrl
Send "Connected" status
Initialize AgentHarness
Connect to user's MCP server
Ready for messages
```
## Container Naming Convention
All resources follow a consistent naming pattern based on `userId`:
```typescript
userId: "user-abc123"
deploymentName: "agent-user-abc123"
serviceName: "agent-user-abc123"
pvcName: "agent-user-abc123-data"
mcpEndpoint: "http://agent-user-abc123.dexorder-agents.svc.cluster.local:3000"
```
User IDs are sanitized to be Kubernetes-compliant (lowercase alphanumeric + hyphens).
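A rough sketch of this naming rule (shown in Python for brevity; the gateway implements it in TypeScript). The exact sanitization regex and length cap are assumptions, not the gateway's actual code:
```python
import re

def sanitize_user_id(user_id: str) -> str:
    """Make a user ID usable in Kubernetes resource names (lowercase alphanumeric + hyphens)."""
    s = re.sub(r"[^a-z0-9-]+", "-", user_id.lower()).strip("-")
    return s[:53]  # assumed cap: leaves room for the "agent-" prefix and "-data" suffix under k8s name limits

def resource_names(user_id: str, namespace: str = "dexorder-agents") -> dict[str, str]:
    uid = sanitize_user_id(user_id)
    deployment = f"agent-{uid}"
    return {
        "deployment": deployment,
        "service": deployment,
        "pvc": f"{deployment}-data",
        "mcpEndpoint": f"http://{deployment}.{namespace}.svc.cluster.local:3000",
    }

# resource_names("user-abc123")["mcpEndpoint"]
# -> "http://agent-user-abc123.dexorder-agents.svc.cluster.local:3000"
```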
## Templates by License Tier
Templates are located in `gateway/src/k8s/templates/`:
- `free-tier.yaml`
- `pro-tier.yaml`
- `enterprise-tier.yaml`
### Variable Substitution
Templates use simple string replacement:
- `{{userId}}` - User ID
- `{{deploymentName}}` - Computed deployment name
- `{{serviceName}}` - Computed service name
- `{{pvcName}}` - Computed PVC name
- `{{agentImage}}` - Agent container image (from env)
- `{{sidecarImage}}` - Lifecycle sidecar image (from env)
- `{{storageClass}}` - Kubernetes storage class (from env)
### Resource Limits
| Tier | Memory Request | Memory Limit | CPU Request | CPU Limit | Storage | Idle Timeout |
|------|----------------|--------------|-------------|-----------|---------|--------------|
| **Free** | 256Mi | 512Mi | 100m | 500m | 1Gi | 15min |
| **Pro** | 512Mi | 2Gi | 250m | 2000m | 10Gi | 60min |
| **Enterprise** | 1Gi | 4Gi | 500m | 4000m | 50Gi | Never (shutdown disabled) |
## Components
### KubernetesClient (`gateway/src/k8s/client.ts`)
Low-level k8s API wrapper:
- `deploymentExists(name)` - Check if deployment exists
- `createAgentDeployment(spec)` - Create deployment/service/PVC from template
- `waitForDeploymentReady(name, timeout)` - Poll until ready
- `getServiceEndpoint(name)` - Get service URL
- `deleteAgentDeployment(userId)` - Cleanup (for testing)
Static helpers:
- `getDeploymentName(userId)` - Generate deployment name
- `getServiceName(userId)` - Generate service name
- `getPvcName(userId)` - Generate PVC name
- `getMcpEndpoint(userId, namespace)` - Compute internal service URL
### ContainerManager (`gateway/src/k8s/container-manager.ts`)
High-level orchestration:
- `ensureContainerRunning(userId, license)` - Main entry point
- Returns: `{ mcpEndpoint, wasCreated }`
- Creates deployment if missing
- Waits for ready state
- Returns endpoint URL
- `getContainerStatus(userId)` - Check status without creating
- `deleteContainer(userId)` - Manual cleanup
### Authenticator (`gateway/src/auth/authenticator.ts`)
Updated to call container manager:
- `authenticateWebSocket()` - Calls `ensureContainerRunning()` before returning `AuthContext`
- `authenticateTelegram()` - Same for Telegram webhooks
### WebSocketHandler (`gateway/src/channels/websocket-handler.ts`)
Multi-phase connection protocol:
1. Send `{type: 'status', status: 'authenticating'}`
2. Authenticate (may take 30-120s if creating container)
3. Send `{type: 'status', status: 'initializing'}`
4. Initialize agent harness
5. Send `{type: 'connected', ...}`
This gives the client visibility into the startup process.
## Configuration
Environment variables:
```bash
# Kubernetes
KUBERNETES_NAMESPACE=dexorder-agents
KUBERNETES_IN_CLUSTER=true # false for local dev
KUBERNETES_CONTEXT=minikube # for local dev only
# Container images
AGENT_IMAGE=ghcr.io/dexorder/agent:latest
SIDECAR_IMAGE=ghcr.io/dexorder/lifecycle-sidecar:latest
# Storage
AGENT_STORAGE_CLASS=standard
```
## Security
The gateway uses a restricted ServiceAccount with RBAC:
**Can do:**
- ✅ Create deployments in `dexorder-agents` namespace
- ✅ Create services in `dexorder-agents` namespace
- ✅ Create PVCs in `dexorder-agents` namespace
- ✅ Read pod status and logs (debugging)
- ✅ Update deployments (future: resource scaling)
**Cannot do:**
- ❌ Delete deployments (handled by lifecycle sidecar)
- ❌ Delete PVCs (handled by lifecycle sidecar)
- ❌ Exec into pods
- ❌ Access secrets or configmaps
- ❌ Create resources in other namespaces
- ❌ Access Kubernetes API from agent containers (blocked by NetworkPolicy)
See `deploy/k8s/base/gateway-rbac.yaml` for full configuration.
## Lifecycle
### Container Creation (Gateway)
- User authenticates
- Gateway checks if deployment exists
- If missing, creates from template
- Waits for ready (2min timeout)
- Returns MCP endpoint
### Container Deletion (Lifecycle Sidecar)
- Container tracks activity and triggers
- When idle (no triggers + timeout), exits with code 42
- Sidecar detects exit code 42
- Sidecar deletes deployment + optional PVC via k8s API
- Gateway creates fresh container on next authentication
See `doc/container_lifecycle_management.md` for full lifecycle details.
## Error Handling
| Error | Gateway Action | User Experience |
|-------|----------------|-----------------|
| Deployment creation fails | Log error, return auth failure | "Authentication failed" |
| Wait timeout (image pull, etc.) | Log warning, return 503 | "Service unavailable, retry" |
| Service not found | Retry with backoff | Transparent retry |
| MCP connection fails | Return error | "Failed to connect to workspace" |
| Existing deployment not ready | Wait 30s, continue if still not ready | May connect to partially-ready container |
## Local Development
For local development (outside k8s):
1. Start minikube:
```bash
minikube start
minikube addons enable storage-provisioner
```
2. Apply security policies:
```bash
kubectl apply -k deploy/k8s/dev
```
3. Configure gateway for local k8s:
```bash
# .env
KUBERNETES_IN_CLUSTER=false
KUBERNETES_CONTEXT=minikube
KUBERNETES_NAMESPACE=dexorder-agents
```
4. Run gateway:
```bash
cd gateway
npm run dev
```
5. Connect via WebSocket:
```bash
wscat -c "ws://localhost:3000/ws/chat" -H "Authorization: Bearer your-jwt"
```
The gateway will create deployments in minikube. View with:
```bash
kubectl get deployments -n dexorder-agents
kubectl get pods -n dexorder-agents
kubectl logs -n dexorder-agents agent-user-abc123 -c agent
```
## Production Deployment
1. Build and push gateway image:
```bash
cd gateway
docker build -t ghcr.io/dexorder/gateway:latest .
docker push ghcr.io/dexorder/gateway:latest
```
2. Deploy to k8s:
```bash
kubectl apply -k deploy/k8s/prod
```
3. Gateway runs in `dexorder-system` namespace
4. Creates agent containers in `dexorder-agents` namespace
5. Admission policies enforce image allowlist and security constraints
## Monitoring
Useful metrics to track:
- Container creation latency (time from auth to ready)
- Container creation failure rate
- Active containers by license tier
- Resource usage per tier
- Idle shutdown rate
These can be exported via Prometheus or logged to a monitoring service.
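As a sketch, a small `prometheus_client` exporter for a few of these metrics; the metric names and scrape port are illustrative, not an agreed-upon schema:
```python
from prometheus_client import Counter, Gauge, Histogram, start_http_server

CREATION_LATENCY = Histogram(
    "agent_container_creation_seconds",
    "Time from authentication to deployment ready",
)
CREATION_FAILURES = Counter(
    "agent_container_creation_failures_total",
    "Failed container creations",
)
ACTIVE_CONTAINERS = Gauge(
    "agent_containers_active",
    "Currently running agent containers",
    ["tier"],  # anonymous / free / paid / enterprise
)
IDLE_SHUTDOWNS = Counter(
    "agent_idle_shutdowns_total",
    "Containers that exited with code 42 due to idleness",
)

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrape endpoint (port is an assumption)
    ACTIVE_CONTAINERS.labels(tier="free").set(0)
```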
## Future Enhancements
1. **Pre-warming**: Create containers for active users before they connect
2. **Image updates**: Handle agent image version migrations with user consent
3. **Multi-region**: Geo-distributed container placement
4. **Cost tracking**: Per-user resource usage and billing
5. **Auto-scaling**: Scale down to 0 replicas instead of deletion (faster restart)
6. **Container pools**: Shared warm containers for anonymous users


@@ -0,0 +1,80 @@
Mode A: Platform Harness → Hosted Container (internal)
Auth: mTLS + platform-signed user claim
Network: k8s internal, never hits the internet
Mode B: Platform Harness → External User Container (remote)
Auth: OAuth2 token issued by your platform
Network: public internet, TLS required
Mode C: Third-party MCP Client → External User Container (standalone)
Auth: User-managed API key or local-only (no network)
Network: localhost or user's own network
┌──────────────────────────────────────────────────────────┐
│ Platform (Postgres) │
│ │
│ users │
│ ├── id, email, password_hash, plan_tier │
│ │ │
│ containers │
│ ├── user_id │
│ ├── type: "hosted" | "external" │
│ ├── mcp_endpoint: "internal-svc:3100" | "https://..." │
│ ├── auth_method: "mtls" | "platform_token" | "api_key" │
│ └── public_key_fingerprint (for pinning external certs) │
│ │
│ api_tokens │
│ ├── user_id │
│ ├── token_hash │
│ ├── scopes: ["mcp:tools", "mcp:resources", "data:read"] │
│ ├── expires_at │
│ └── issued_for: "platform_harness" | "user_direct" │
│ │
└──────────────────────────────────────────────────────────┘
## Mode A
Harness ──mTLS──▶ k8s Service ──▶ User Container MCP
Validates: source is platform namespace
Extracts: user_id from forwarded header
## Mode B
Registration flow (one-time):
1. User provides their MCP endpoint URL in platform settings
2. Platform generates a scoped token (JWT, short-lived, auto-refreshed)
3. User configures their MCP server to accept tokens signed by your platform
4. Platform stores the endpoint + auth method
Runtime:
┌──────────┐ HTTPS + Bearer token ┌────────────────────┐
│ Harness │ ─────────────────────────▶ │ External MCP Server│
│ │ Authorization: │ │
│ │ Bearer <platform_jwt> │ Validates: │
│ │ │ - JWT signature │
│ │ │ (your public │
│ │ │ key, JWKS) │
│ │ │ - user_id claim │
│ │ │ matches self │
│ │ │ - not expired │
└──────────┘ └────────────────────┘
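A sketch of the validation step inside an external MCP server, using PyJWT's JWKS client. The `user_id` claim name, the RS256 algorithm, and the exact JWKS URL are assumptions (the URL mirrors the Mode C example below):
```python
import jwt  # PyJWT

JWKS_URL = "https://api.openclaw.io/.well-known/jwks.json"  # platform public keys (see Mode C config)
EXPECTED_USER_ID = "user_abc123"                            # the user this container belongs to

_jwks = jwt.PyJWKClient(JWKS_URL)

def validate_platform_token(token: str) -> dict:
    """Validate a platform-issued bearer token per Mode B: signature, expiry, and user_id claim."""
    signing_key = _jwks.get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],          # assumed signing algorithm
        options={"require": ["exp"]},  # reject tokens without an expiry
    )
    if claims.get("user_id") != EXPECTED_USER_ID:
        raise PermissionError("token user_id does not match this container")
    return claims
```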
## Mode C
```yaml
# openclaw/config.yaml
auth:
  # For local-only use (Claude Desktop, Cursor, etc via stdio)
  mode: "local"  # no network auth needed

  # OR for remote access
  mode: "token"
  tokens:
    - name: "my-laptop"
      hash: "sha256:..."  # generated by `openclaw token create`

  # OR for platform integration
  mode: "platform"
  platform_jwks_url: "https://api.openclaw.io/.well-known/jwks.json"
  expected_user_id: "user_abc123"
```
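For `mode: "token"`, verification can be a straight hash comparison against the configured `sha256:` digests; a minimal sketch (config loading omitted, and the helper name is illustrative):
```python
import hashlib
import hmac

def verify_api_token(presented: str, configured_hashes: list[str]) -> bool:
    """Check a presented token against `sha256:<hex>` entries from config.yaml."""
    digest = "sha256:" + hashlib.sha256(presented.encode()).hexdigest()
    # constant-time comparison against each configured hash
    return any(hmac.compare_digest(digest, h) for h in configured_hashes)

# verify_api_token(request_token, ["sha256:..."])  # entries as stored under auth.tokens[].hash
```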


@@ -0,0 +1,29 @@
MCP Tools (User Container)
├── Memory
│ ├── get_conversation_history(limit)
│ ├── save_message(role, content)
│ ├── search_memory(query) ← semantic search over past conversations
│ └── get_context_summary() ← "who is this user, what do they care about"
├── Strategies & Indicators
│ ├── list_strategies()
│ ├── read_strategy(name)
│ ├── write_strategy(name, code)
│ ├── list_indicators()
│ ├── read_indicator(name)
│ ├── write_indicator(name, code)
│ └── run_backtest(strategy, params)
├── Preferences
│ ├── get_preferences()
│ ├── set_preference(key, value)
│ └── get_agent_prompt() ← user's custom system prompt additions
├── Trading
│ ├── get_watchlist()
│ ├── execute_trade(params)
│ ├── get_positions()
│ └── get_trade_history()
└── Sandbox
└── run_python(code) ← datascience toolset, matplotlib, etc.

doc/protocol.md

@@ -0,0 +1,168 @@
# ZeroMQ Protocol Architecture
Our data transfer protocol uses ZeroMQ with Protobufs. Each message is a small two-frame envelope: the first frame carries a single protocol-version byte, and the second frame carries a one-byte message type ID followed by the protobuf payload.
OHLC periods are expressed in seconds.
## Data Flow Overview
**Relay as Gateway**: The Relay is a well-known bind point that all components connect to. It routes messages between clients, ingestors, and Flink.
### Historical Data Query Flow (Async Event-Driven Architecture)
* Client generates request_id and/or client_id (both are client-generated)
* Client computes notification topic: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
* **Client subscribes to notification topic BEFORE sending request (prevents race condition)**
* Client sends SubmitHistoricalRequest to Relay (REQ/REP)
* Relay returns immediate SubmitResponse with request_id and notification_topic (for confirmation)
* Relay publishes DataRequest to ingestor work queue with exchange prefix (PUB/SUB)
* Ingestor receives request, fetches data from exchange
* Ingestor writes OHLC data to Kafka with __metadata in first record
* Flink reads from Kafka, processes data, writes to Iceberg
* Flink publishes HistoryReadyNotification to ZMQ PUB socket (port 5557) with deterministic topic
* Relay proxies notification via XSUB → XPUB to clients
* Client receives notification (already subscribed) and queries Iceberg for data
**Key Architectural Change**: Relay is completely stateless. No request/response correlation needed. All notification routing is topic-based (e.g., "RESPONSE:{client_id}").
**Race Condition Prevention**: Notification topics are deterministic based on client-generated values (request_id or client_id). Clients MUST subscribe to the notification topic BEFORE submitting the request to avoid missing notifications.
**Two Notification Patterns** (a client-side sketch follows this list):
1. **Per-client topic** (`RESPONSE:{client_id}`): Subscribe once during connection, reuse for all requests from this client. Recommended for most clients.
2. **Per-request topic** (`HISTORY_READY:{request_id}`): Subscribe immediately before each request. Use when you need per-request isolation or don't have a persistent client_id.
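A client-side sketch of the per-client pattern using pyzmq, following the envelope format and type IDs defined later in this document; the relay hostname and the serialized protobuf payload are placeholders:
```python
import uuid
import zmq

PROTOCOL_VERSION = 1                    # assumed current version byte
TYPE_SUBMIT_HISTORICAL_REQUEST = 0x10   # from the message type table below

ctx = zmq.Context.instance()
client_id = f"client-{uuid.uuid4().hex[:8]}"

# 1. Subscribe to the notification topic BEFORE submitting the request.
sub = ctx.socket(zmq.SUB)
sub.connect("tcp://relay:5558")                       # Relay XPUB; hostname is illustrative
sub.setsockopt_string(zmq.SUBSCRIBE, f"RESPONSE:{client_id}")

# 2. Submit the historical request over REQ/REP.
req = ctx.socket(zmq.REQ)
req.connect("tcp://relay:5559")                       # Relay ROUTER
payload = b"..."                                      # serialized SubmitHistoricalRequest protobuf
req.send_multipart([
    bytes([PROTOCOL_VERSION]),
    bytes([TYPE_SUBMIT_HISTORICAL_REQUEST]) + payload,
])
ack_frames = req.recv_multipart()                     # SubmitResponse with request_id + notification_topic

# 3. Wait for the HistoryReadyNotification, then query Iceberg.
if sub.poll(timeout=30_000):                          # on timeout, retry with a new request_id
    topic, version, body = sub.recv_multipart()       # SUB sockets receive the topic as an extra first frame
```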
### Realtime Data Flow (Flink → Relay → Clients)
* Ingestors write realtime ticks to Kafka
* Flink reads from Kafka, processes OHLC aggregations, CEP triggers
* Flink publishes market data via ZMQ PUB
* Relay subscribes to Flink (XSUB) and fans out to clients (XPUB)
* Clients subscribe to specific tickers
### Data Processing (Kafka → Flink → Iceberg)
* All market data flows through Kafka (durable event log)
* Flink processes streams for aggregations and CEP
* Flink writes historical data to Apache Iceberg tables
* Clients can query Iceberg for historical data (alternative to ingestor backfill)
**Key Design Principles**:
* Relay is the well-known bind point - all other components connect to it
* Relay is completely stateless - no request tracking, only topic-based routing
* Exchange prefix filtering allows ingestor specialization (e.g., only BINANCE ingestors)
* Historical data flows through Kafka (durable processing) only - no direct response
* Async event-driven notifications via pub/sub (Flink → Relay → Clients)
* Protobufs over ZMQ for all inter-service communication
* Kafka for durability and Flink stream processing
* Iceberg for long-term historical storage and client queries
## ZeroMQ Channels and Patterns
All sockets bind on **Relay** (well-known endpoint). Components connect to relay.
### 1. Client Request Channel (Clients → Relay)
**Pattern**: ROUTER (Relay binds, Clients use REQ)
- **Socket Type**: Relay uses ROUTER (bind), Clients use REQ (connect)
- **Endpoint**: `tcp://*:5559` (Relay binds)
- **Message Types**: `SubmitHistoricalRequest` → `SubmitResponse`
- **Behavior**:
- Client generates request_id and/or client_id
- Client computes notification topic deterministically
- **Client subscribes to notification topic FIRST (prevents race)**
- Client sends REQ for historical OHLC data
- Relay validates request and returns immediate acknowledgment
- Response includes notification_topic for client confirmation
- Relay publishes DataRequest to ingestor work queue
- No request tracking - relay is stateless
### 2. Ingestor Work Queue (Relay → Ingestors)
**Pattern**: PUB/SUB with exchange prefix filtering
- **Socket Type**: Relay uses PUB (bind), Ingestors use SUB (connect)
- **Endpoint**: `tcp://*:5555` (Relay binds)
- **Message Types**: `DataRequest` (historical or realtime)
- **Topic Prefix**: Exchange name (e.g., `BINANCE:`, `COINBASE:`)
- **Behavior**:
- Relay publishes work with exchange prefix from ticker
- Ingestors subscribe only to exchanges they support
- Multiple ingestors can compete for the same exchange
- Ingestors write data to Kafka only (no direct response)
- Flink processes Kafka → Iceberg → notification
### 3. Market Data Fanout (Relay ↔ Flink ↔ Clients)
**Pattern**: XPUB/XSUB proxy
- **Socket Type**:
- Relay XPUB (bind) ← Clients SUB (connect) - Port 5558
- Relay XSUB (connect) → Flink PUB (bind) - Port 5557
- **Message Types**: `Tick`, `OHLC`, `HistoryReadyNotification`
- **Topic Formats**:
- Market data: `{ticker}|{data_type}` (e.g., `BINANCE:BTC/USDT|tick`)
- Notifications: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
- **Behavior**:
- Clients subscribe to ticker topics and notification topics via Relay XPUB
- Relay forwards subscriptions to Flink via XSUB
- Flink publishes processed market data and notifications
- Relay proxies data to subscribed clients (stateless forwarding)
- Dynamic subscription management (no pre-registration)
### 4. Ingestor Control Channel (Optional - Future Use)
**Pattern**: PUB/SUB (Broadcast control)
- **Socket Type**: Relay uses PUB, Ingestors use SUB
- **Endpoint**: `tcp://*:5557` (Relay binds)
- **Message Types**: `IngestorControl` (cancel, config updates)
- **Behavior**:
- Broadcast control messages to all ingestors
- Used for realtime subscription cancellation
- Configuration updates
## Message Envelope Format
The core protocol uses two ZeroMQ frames:
```
Frame 1: [1 byte: protocol version]
Frame 2: [1 byte: message type ID][N bytes: protobuf message]
```
This two-frame approach allows receivers to check the protocol version before parsing the message type and protobuf payload.
**Important**: Some ZeroMQ socket patterns (PUB/SUB, XPUB/XSUB) may prepend additional frames for routing purposes. For example:
- **PUB/SUB with topic filtering**: SUB sockets receive `[topic frame][version frame][message frame]`
- **ROUTER sockets**: Prepend identity frames before the message
Components must handle these additional frames appropriately:
- SUB sockets: Skip the first frame (topic), then parse the remaining frames as the standard 2-frame envelope
- ROUTER sockets: Extract identity frames, then parse the standard 2-frame envelope
The two-frame envelope is the **logical protocol format**, but physical transmission may include additional ZeroMQ transport frames.
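A small helper pair sketching the logical envelope; `PROTOCOL_VERSION = 1` is an assumption, and callers are expected to strip topic or identity frames first as described above:
```python
PROTOCOL_VERSION = 1

def encode_envelope(msg_type: int, payload: bytes) -> list[bytes]:
    """Build the two logical frames: [version][type_id + protobuf payload]."""
    return [bytes([PROTOCOL_VERSION]), bytes([msg_type]) + payload]

def decode_envelope(frames: list[bytes]) -> tuple[int, bytes]:
    """Parse the last two frames of a received message."""
    version_frame, body = frames[-2], frames[-1]
    if version_frame[0] != PROTOCOL_VERSION:
        raise ValueError(f"unsupported protocol version {version_frame[0]}")
    return body[0], body[1:]   # (message type ID, protobuf payload)
```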
## Message Type IDs
| Type ID | Message Type | Description |
|---------|---------------------------|------------------------------------------------|
| 0x01 | DataRequest | Request for historical or realtime data |
| 0x02 | DataResponse (deprecated) | Historical data response (no longer used) |
| 0x03 | IngestorControl | Control messages for ingestors |
| 0x04 | Tick | Individual trade tick data |
| 0x05 | OHLC | Single OHLC candle with volume |
| 0x06 | Market | Market metadata |
| 0x07 | OHLCRequest (deprecated) | Client request (replaced by SubmitHistorical) |
| 0x08 | Response (deprecated) | Generic response (replaced by SubmitResponse) |
| 0x09 | CEPTriggerRequest | Register CEP trigger |
| 0x0A | CEPTriggerAck | CEP trigger acknowledgment |
| 0x0B | CEPTriggerEvent | CEP trigger fired callback |
| 0x0C | OHLCBatch | Batch of OHLC rows with metadata (Kafka) |
| 0x10 | SubmitHistoricalRequest | Client request for historical data (async) |
| 0x11 | SubmitResponse | Immediate ack with notification topic |
| 0x12 | HistoryReadyNotification | Notification that data is ready in Iceberg |
## Error Handling
**Async Architecture Error Handling**:
- Failed historical requests: ingestor writes error marker to Kafka
- Flink reads error marker and publishes HistoryReadyNotification with ERROR status
- Client timeout: if no notification received within timeout, assume failure
- Realtime requests cancelled via control channel if ingestor fails
- REQ/REP timeouts: 30 seconds default for client request submission
- PUB/SUB has no delivery guarantees (Kafka provides durability)
- No response routing needed - all notifications via topic-based pub/sub
**Durability**:
- All data flows through Kafka for durability
- Flink checkpointing ensures exactly-once processing
- Client can retry request with new request_id if notification not received

doc/user_mcp_resources.md

@@ -0,0 +1,472 @@
# User MCP Server - Resource Architecture
The user's MCP server container owns **all** conversation history, RAG, and contextual data. The platform gateway is a thin, stateless orchestrator that only holds the Anthropic API key.
## Architecture Principle
**User Container = Fat Context**
- Conversation history (PostgreSQL/SQLite)
- RAG system (embeddings, vector search)
- User preferences and custom prompts
- Trading context (positions, watchlists, alerts)
- All user-specific data
**Platform Gateway = Thin Orchestrator**
- Anthropic API key (platform pays for LLM)
- Session management (WebSocket/Telegram connections)
- MCP client connection pooling
- Tool routing (platform vs user tools)
- **Zero conversation state stored**
## MCP Resources for Context Injection
Resources are **read-only** data sources that provide context to the LLM. They're fetched before each Claude API call and embedded in the conversation.
### Standard Context Resources
#### 1. `context://user-profile`
**Purpose:** User's trading background and preferences
**MIME Type:** `text/plain`
**Example Content:**
```
User Profile:
- Trading experience: Intermediate
- Preferred timeframes: 1h, 4h, 1d
- Risk tolerance: Medium
- Focus: Swing trading with technical indicators
- Favorite indicators: RSI, MACD, Bollinger Bands
- Active pairs: BTC/USDT, ETH/USDT, SOL/USDT
```
**Implementation Notes:**
- Stored in user's database `user_preferences` table
- Updated via preference management tools
- Includes inferred data from usage patterns
---
#### 2. `context://conversation-summary`
**Purpose:** Semantic summary of recent conversation with RAG-enhanced context
**MIME Type:** `text/plain`
**Example Content:**
```
Recent Conversation Summary:
Last 10 messages (summarized):
- User asked about moving average crossover strategies
- Discussed backtesting parameters for BTC/USDT
- Reviewed risk management with 2% position sizing
- Explored adding RSI filter to reduce false signals
Relevant past discussions (RAG search):
- 2 weeks ago: Similar strategy development on ETH/USDT
- 1 month ago: User prefers simple strategies over complex ones
- Past preference: Avoid strategies with >5 indicators
Current focus: Optimizing MA crossover with momentum filter
```
**Implementation Notes:**
- Last N messages stored in `conversation_history` table
- RAG search against embeddings of past conversations
- Semantic search using user's current message as query
- ChromaDB/pgvector for embedding storage
- Summary generated on-demand (can be cached for 1-5 minutes)
**RAG Integration:**
```python
async def get_conversation_summary() -> str:
    # Get recent messages
    recent = await db.get_recent_messages(limit=50)

    # Semantic search for relevant context
    relevant = await rag.search_conversation_history(
        query=recent[-1].content,  # Last user message
        limit=5,
        min_score=0.7,
    )

    # Build summary
    return build_summary(recent[-10:], relevant)
```
---
#### 3. `context://workspace-state`
**Purpose:** Current trading workspace (chart, positions, watchlist)
**MIME Type:** `application/json`
**Example Content:**
```json
{
  "currentChart": {
    "ticker": "BINANCE:BTC/USDT",
    "timeframe": "1h",
    "indicators": ["SMA(20)", "RSI(14)", "MACD(12,26,9)"]
  },
  "watchlist": ["BTC/USDT", "ETH/USDT", "SOL/USDT"],
  "openPositions": [
    {
      "ticker": "BTC/USDT",
      "side": "long",
      "size": 0.1,
      "entryPrice": 45000,
      "currentPrice": 46500,
      "unrealizedPnL": 150
    }
  ],
  "recentAlerts": [
    {
      "type": "price_alert",
      "message": "BTC/USDT crossed above $46,000",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ]
}
```
**Implementation Notes:**
- Synced from web client chart state
- Updated via WebSocket sync protocol
- Includes active indicators on current chart
- Position data from trading system
---
#### 4. `context://system-prompt`
**Purpose:** User's custom instructions and preferences for AI behavior
**MIME Type:** `text/plain`
**Example Content:**
```
Custom Instructions:
- Be concise and data-driven
- Always show risk/reward ratios
- Prefer simple strategies over complex ones
- When suggesting trades, include stop-loss and take-profit levels
- Explain your reasoning in trading decisions
```
**Implementation Notes:**
- User-editable in preferences UI
- Appended **last** to the system prompt (highest priority); see the assembly sketch below
- Can override platform defaults
- Stored in `user_preferences.custom_prompt` field
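A sketch (in Python for brevity) of the intended assembly order, with the custom prompt appended last so it takes priority; the function name and resource keys mirror the gateway flow described later in this document:
```python
def build_system_prompt(base_prompt: str, context_resources: dict[str, str]) -> str:
    """Concatenate platform defaults and user context; the user's custom prompt goes last so it wins."""
    parts = [
        base_prompt,                                            # platform base prompt
        context_resources.get("context://user-profile", ""),
        context_resources.get("context://workspace-state", ""),
        context_resources.get("context://system-prompt", ""),  # user's custom instructions, appended last
    ]
    return "\n\n".join(p for p in parts if p)
```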
---
## MCP Tools for Actions
Tools are for **actions** that have side effects. These are **not** used for context fetching.
### Conversation Management
- `save_message(role, content, timestamp)` - Save message to history
- `search_conversation(query, limit)` - Explicit semantic search (for user queries like "what did we discuss about BTC?")
### Strategy & Indicators
- `list_strategies()` - List user's strategies
- `read_strategy(name)` - Get strategy code
- `write_strategy(name, code)` - Save strategy
- `run_backtest(strategy, params)` - Execute backtest
### Trading
- `get_watchlist()` - Get watchlist (action that may trigger sync)
- `execute_trade(params)` - Execute trade order
- `get_positions()` - Fetch current positions from exchange
### Sandbox
- `run_python(code)` - Execute Python code with data science libraries
---
## Gateway Harness Flow
```typescript
// gateway/src/harness/agent-harness.ts
async handleMessage(message: InboundMessage): Promise<OutboundMessage> {
  // 1. Fetch context resources from user's MCP
  const contextResources = await fetchContextResources([
    'context://user-profile',
    'context://conversation-summary', // <-- RAG happens here
    'context://workspace-state',
    'context://system-prompt',
  ]);

  // 2. Build system prompt from resources
  const systemPrompt = buildSystemPrompt(contextResources);

  // 3. Build messages with embedded conversation context
  const messages = buildMessages(message, contextResources);

  // 4. Get tools from MCP
  const tools = await mcpClient.listTools();

  // 5. Call Claude with embedded context
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    system: systemPrompt, // <-- User profile + workspace + custom prompt
    messages,             // <-- Conversation summary from RAG
    tools,
  });

  // 6. Save to user's MCP (tool call)
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: response });

  return response;
}
```
---
## User MCP Server Implementation (Python)
### Resource Handler
```python
# user-mcp/src/resources.py
from mcp.server import Server
from mcp.types import Resource, ResourceTemplate
import asyncpg

server = Server("dexorder-user")

@server.list_resources()
async def list_resources() -> list[Resource]:
    return [
        Resource(
            uri="context://user-profile",
            name="User Profile",
            description="Trading style, preferences, and background",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://conversation-summary",
            name="Conversation Summary",
            description="Recent conversation with RAG-enhanced context",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://workspace-state",
            name="Workspace State",
            description="Current chart, watchlist, positions",
            mimeType="application/json",
        ),
        Resource(
            uri="context://system-prompt",
            name="Custom System Prompt",
            description="User's custom AI instructions",
            mimeType="text/plain",
        ),
    ]

@server.read_resource()
async def read_resource(uri: str) -> str:
    if uri == "context://user-profile":
        return await build_user_profile()
    elif uri == "context://conversation-summary":
        return await build_conversation_summary()
    elif uri == "context://workspace-state":
        return await build_workspace_state()
    elif uri == "context://system-prompt":
        return await get_custom_prompt()
    else:
        raise ValueError(f"Unknown resource: {uri}")
```
### RAG Integration
```python
# user-mcp/src/rag.py
import chromadb
from sentence_transformers import SentenceTransformer

class ConversationRAG:
    def __init__(self, db_path: str):
        self.chroma = chromadb.PersistentClient(path=db_path)
        # Use cosine distance so that similarity = 1 - distance below
        self.collection = self.chroma.get_or_create_collection(
            "conversations", metadata={"hnsw:space": "cosine"}
        )
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')

    async def search_conversation_history(
        self,
        query: str,
        limit: int = 5,
        min_score: float = 0.7,
    ) -> list[dict]:
        """Semantic search over conversation history"""
        # Embed query
        query_embedding = self.embedder.encode(query).tolist()

        # Search
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=limit,
        )

        # Filter by score and format (Chroma returns distances, so convert to similarity)
        relevant = []
        for i, distance in enumerate(results['distances'][0]):
            score = 1.0 - distance
            if score >= min_score:
                relevant.append({
                    'content': results['documents'][0][i],
                    'metadata': results['metadatas'][0][i],
                    'score': score,
                })
        return relevant

    async def add_message(self, message_id: str, role: str, content: str, metadata: dict):
        """Add message to RAG index"""
        embedding = self.embedder.encode(content).tolist()
        self.collection.add(
            ids=[message_id],
            embeddings=[embedding],
            documents=[content],
            metadatas=[{
                'role': role,
                'timestamp': metadata.get('timestamp'),
                **metadata,
            }],
        )
```
### Conversation Summary Builder
```python
# user-mcp/src/context.py
async def build_conversation_summary(user_id: str) -> str:
    """Build conversation summary with RAG"""
    # 1. Get recent messages
    recent_messages = await db.get_messages(
        user_id=user_id,
        limit=50,
        order='desc'
    )

    # 2. Get current focus (last user message)
    last_user_msg = next(
        (m for m in recent_messages if m.role == 'user'),
        None
    )
    if not last_user_msg:
        return "No recent conversation history."

    # 3. RAG search for relevant context
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    relevant_context = await rag.search_conversation_history(
        query=last_user_msg.content,
        limit=5,
        min_score=0.7
    )

    # 4. Build summary
    summary = "Recent Conversation Summary:\n\n"

    # Recent messages (last 10)
    summary += "Last 10 messages:\n"
    for msg in recent_messages[-10:]:
        summary += f"- {msg.role}: {msg.content[:100]}...\n"

    # Relevant past context
    if relevant_context:
        summary += "\nRelevant past discussions (RAG):\n"
        for ctx in relevant_context:
            timestamp = ctx['metadata'].get('timestamp', 'unknown')
            summary += f"- [{timestamp}] {ctx['content'][:150]}...\n"

    # Inferred focus
    summary += f"\nCurrent focus: {infer_topic(last_user_msg.content)}\n"
    return summary


def infer_topic(message: str) -> str:
    """Simple topic extraction"""
    keywords = {
        'strategy': ['strategy', 'backtest', 'trading system'],
        'indicator': ['indicator', 'rsi', 'macd', 'moving average'],
        'analysis': ['analyze', 'chart', 'price action'],
        'risk': ['risk', 'position size', 'stop loss'],
    }
    message_lower = message.lower()
    for topic, words in keywords.items():
        if any(word in message_lower for word in words):
            return topic
    return 'general trading discussion'
```
---
## Benefits of This Architecture
1. **Privacy**: Conversation history never leaves user's container
2. **Customization**: Each user controls their RAG, embeddings, prompt engineering
3. **Scalability**: Platform harness is stateless - horizontally scalable
4. **Cost Control**: Platform pays for Claude, users pay for their compute/storage
5. **Portability**: Users can export/migrate their entire context
6. **Development**: Users can test prompts/context locally without platform involvement
---
## Future Enhancements
### Dynamic Resource URIs
Support parameterized resources:
```
context://conversation/{session_id}
context://strategy/{strategy_name}
context://backtest/{backtest_id}/results
```
### Resource Templates
MCP supports resource templates for dynamic discovery:
```python
@server.list_resource_templates()
async def list_templates() -> list[ResourceTemplate]:
    return [
        ResourceTemplate(
            uriTemplate="context://strategy/{name}",
            name="Strategy Context",
            description="Context for specific strategy",
        )
    ]
```
### Streaming Resources
For large context (e.g., full backtest results), support streaming:
```python
@server.read_resource()
async def read_resource(uri: str) -> AsyncIterator[str]:
    if uri.startswith("context://backtest/"):
        async for chunk in stream_backtest_results(uri):
            yield chunk
```
---
## Migration Path
For users with existing conversation history in platform DB:
1. **Export script**: Migrate platform history → user container DB
2. **RAG indexing**: Embed all historical messages into ChromaDB
3. **Preference migration**: Copy user preferences to container
4. **Cutover**: Switch to resource-based context fetching
Platform can keep read-only archive for compliance, but active context lives in user container.
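A sketch of steps 1 and 2 (export plus RAG indexing), reusing the `ConversationRAG` class defined earlier in this document; the platform-side table name and query are assumptions:
```python
import asyncpg

from rag import ConversationRAG  # user-mcp/src/rag.py above

async def migrate_user_history(user_id: str, platform_dsn: str) -> None:
    """Copy a user's platform-side messages into their container's RAG index."""
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    conn = await asyncpg.connect(platform_dsn)
    try:
        rows = await conn.fetch(
            "SELECT id, role, content, created_at FROM messages "
            "WHERE user_id = $1 ORDER BY created_at",  # table/columns are assumed names
            user_id,
        )
        for row in rows:
            await rag.add_message(
                message_id=str(row["id"]),
                role=row["role"],
                content=row["content"],
                metadata={"timestamp": row["created_at"].isoformat()},
            )
    finally:
        await conn.close()
```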