container lifecycle management
21
doc/agent_harness_flow.md
Normal file
@@ -0,0 +1,21 @@
┌─────────────────────────────────────────────────┐
│  Agent Harness (your servers)                   │
│                                                 │
│  on_message(user_id, message):                  │
│    1. Look up user's MCP endpoint from Postgres │
│    2. mcp.call("get_context_summary")           │
│    3. mcp.call("get_conversation_history", 20)  │
│    4. Build prompt:                             │
│         system = BASE_PROMPT                    │
│                  + context_summary              │
│                  + user_agent_prompt (from MCP) │
│         messages = history + new message        │
│    5. LLM call (your API key)                   │
│    6. While LLM wants tool calls:               │
│         - Platform tools → handle locally       │
│         - User tools → proxy to MCP             │
│         - LLM call again with results           │
│    7. mcp.call("save_message", ...)             │
│    8. Return response to user                   │
│                                                 │
└─────────────────────────────────────────────────┘
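The harness loop in the diagram can be sketched in Python. All collaborator names here (`mcp`, `llm`, `platform_tools`, `BASE_PROMPT`) are hypothetical stand-ins for illustration, not the actual harness API.

```python
# Sketch of the harness message loop above. The mcp/llm collaborators are
# hypothetical stand-ins; the MCP tool names follow the docs in this commit.
BASE_PROMPT = "You are the user's trading assistant."

def on_message(user_id, message, mcp, llm, platform_tools):
    # 1-3. Gather user context via the user's MCP server
    summary = mcp.call("get_context_summary")
    history = mcp.call("get_conversation_history", 20)

    # 4. Build the prompt: base + context summary + user's custom additions
    system = BASE_PROMPT + "\n" + summary + "\n" + mcp.call("get_agent_prompt")
    messages = history + [{"role": "user", "content": message}]

    # 5-6. Call the LLM, looping while it requests tool calls
    reply = llm(system, messages)
    while reply.get("tool_calls"):
        results = []
        for call in reply["tool_calls"]:
            if call["name"] in platform_tools:
                results.append(platform_tools[call["name"]](call["args"]))  # local
            else:
                results.append(mcp.call(call["name"], call["args"]))  # proxy to MCP
        messages.append({"role": "tool", "content": results})
        reply = llm(system, messages)

    # 7-8. Persist the reply and return it to the user
    mcp.call("save_message", "assistant", reply["content"])
    return reply["content"]
```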
@@ -1,9 +1,11 @@
Generally use skills instead of subagents, except for the analysis subagent.

## User-specific files and tools

* Indicators
* Strategies
* Watchlists
* Preferences
  * Trading style
  * Charting / colors
* Executors (really just sub-strategies)
  * tactical-level order generators, e.g. TWAP, iceberg, etc.
@@ -1,18 +0,0 @@
This file describes all the configuration options used by all components. All configuration is divided into regular config and secrets, and k8s will mount either or both as a YAML file accessible to the process.

# Configuration

* `flink_hostname`
* ... various zmq ports for flink ...
* `iceberg_catalog_hostname`
* `iceberg_catalog_port`
* `iceberg_catalog_database`
* etc.

# Secrets

* `iceberg_catalog_username`
* `iceberg_catalog_password`
* etc.
313
doc/container_lifecycle_management.md
Normal file
@@ -0,0 +1,313 @@
# Container Lifecycle Management

## Overview

User agent containers self-manage their lifecycle to optimize resource usage. Containers automatically shut down when idle (no triggers and no recent activity) and clean themselves up using a lifecycle sidecar.

## Architecture

```
┌──────────────────────────────────────────────────────────┐
│                        Agent Pod                         │
│  ┌───────────────────┐       ┌──────────────────────┐    │
│  │  Agent Container  │       │  Lifecycle Sidecar   │    │
│  │  ───────────────  │       │  ──────────────────  │    │
│  │                   │       │                      │    │
│  │ Lifecycle Manager │       │ Watches exit code    │    │
│  │ - Track activity  │       │ - Detects exit 42    │    │
│  │ - Track triggers  │       │ - Calls k8s API      │    │
│  │ - Exit 42 if idle │       │ - Deletes deployment │    │
│  └───────────────────┘       └──────────────────────┘    │
│            │                            │                │
│            │ writes exit_code           │                │
│            └────►/var/run/agent/exit_code                │
│                                         │                │
└─────────────────────────────────────────┼────────────────┘
                                          │
                                          ▼ k8s API (RBAC)
                               ┌─────────────────────┐
                               │ Delete Deployment   │
                               │ Delete PVC (if anon)│
                               └─────────────────────┘
```
## Components

### 1. Lifecycle Manager (Python)

**Location**: `client-py/dexorder/lifecycle_manager.py`

Runs inside the agent container and tracks:
- **Activity**: MCP tool/resource/prompt calls reset the idle timer
- **Triggers**: Data subscriptions, CEP patterns, etc.
- **Idle state**: No triggers + idle timeout exceeded

**Configuration** (via environment variables):
- `IDLE_TIMEOUT_MINUTES`: Minutes before shutdown (default: 15)
- `IDLE_CHECK_INTERVAL_SECONDS`: Check frequency (default: 60)
- `ENABLE_IDLE_SHUTDOWN`: Enable/disable shutdown (default: true)

**Usage in agent code**:
```python
from dexorder.lifecycle_manager import get_lifecycle_manager

# On startup
manager = get_lifecycle_manager()
await manager.start()

# On MCP calls (tool/resource/prompt)
manager.record_activity()

# When triggers change
manager.add_trigger("data_sub_BTC_USDT")
manager.remove_trigger("data_sub_BTC_USDT")

# Or batch update
manager.update_triggers({"trigger_1", "trigger_2"})
```
**Exit behavior**:
- Idle shutdown: Exit with code `42`
- Signal (SIGTERM/SIGINT): Exit with code `0` (allows restart)
- Errors/crashes: Exit with an error code (allows restart)

### 2. Lifecycle Sidecar (Go)

**Location**: `lifecycle-sidecar/`

Runs alongside the agent container with a shared PID namespace. Monitors the main container process and:
- On exit code `42`: Deletes the deployment (and the PVC if the user is anonymous)
- On any other exit code: Exits with the same code (k8s restarts the pod)

**Configuration** (via environment, injected by the downward API):
- `NAMESPACE`: Pod's namespace
- `DEPLOYMENT_NAME`: Deployment name (from pod label)
- `USER_TYPE`: License tier (`anonymous`, `free`, `paid`, `enterprise`)
- `MAIN_CONTAINER_PID`: PID of the main container (default: 1)

**RBAC**: Has permission to delete deployments and PVCs **only in the dexorder-agents namespace**. It cannot delete other deployments because:
1. It only knows its own deployment name (from env)
2. RBAC is scoped to the namespace
3. There is no cross-pod communication
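The sidecar's decision logic is simple enough to sketch. The real sidecar is Go; this is a Python sketch of the same branching, with the k8s client calls passed in as hypothetical stubs.

```python
# Decision logic of the lifecycle sidecar (the real implementation is Go).
# delete_deployment/delete_pvc/propagate stand in for k8s API calls.
SHUTDOWN_EXIT_CODE = 42  # written to /var/run/agent/exit_code by the agent

def handle_exit(exit_code, user_type, delete_deployment, delete_pvc, propagate):
    if exit_code == SHUTDOWN_EXIT_CODE:
        delete_deployment()      # idle shutdown: clean up the deployment
        if user_type == "anonymous":
            delete_pvc()         # anonymous users keep no data
        return "deleted"
    propagate(exit_code)         # any other code: let k8s restart the pod
    return "propagated"
```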
### 3. Gateway (TypeScript)

**Location**: `gateway/src/harness/agent-harness.ts`

Creates agent deployments when users connect. Has permissions to:
- ✅ Create deployments, services, PVCs
- ✅ Read pod status and logs
- ✅ Update deployments (e.g., resource limits)
- ❌ Delete deployments (handled by the sidecar)
- ❌ Exec into pods
- ❌ Access secrets
## Lifecycle States

```
┌─────────────┐
│   CREATED   │ ← Gateway creates deployment
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   RUNNING   │ ← User interacts, has triggers
└──────┬──────┘
       │
       ▼
┌─────────────┐
│    IDLE     │ ← No triggers + timeout exceeded
└──────┬──────┘
       │
       ▼
┌─────────────┐
│  SHUTDOWN   │ ← Exit code 42
└──────┬──────┘
       │
       ▼
┌─────────────┐
│   DELETED   │ ← Sidecar deletes deployment
└─────────────┘
```
## Idle Detection Logic

The container is **IDLE** when:
1. `active_triggers.isEmpty()` AND
2. `(now - last_activity) > idle_timeout`

The container is **ACTIVE** when:
1. It has any active triggers (data subscriptions, CEP patterns, etc.) OR
2. There is recent user activity (MCP calls within the timeout)
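The idle condition is a one-line predicate. A minimal sketch, using the `IDLE_TIMEOUT_MINUTES` default from the configuration section (the function name is illustrative, not the actual `lifecycle_manager` API):

```python
import time

IDLE_TIMEOUT_SECONDS = 15 * 60  # IDLE_TIMEOUT_MINUTES default of 15

def is_idle(active_triggers, last_activity, now=None, timeout=IDLE_TIMEOUT_SECONDS):
    # Idle iff no triggers remain AND the idle timeout has elapsed
    now = time.time() if now is None else now
    return not active_triggers and (now - last_activity) > timeout
```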
## Cleanup Policies by License Tier

| User Type  | Idle Timeout | PVC Policy | Notes                        |
|------------|--------------|------------|------------------------------|
| Anonymous  | 15 minutes   | Delete     | Ephemeral, no data retention |
| Free       | 15 minutes   | Retain     | Can resume session           |
| Paid       | 60 minutes   | Retain     | Longer grace period          |
| Enterprise | No shutdown  | Retain     | Always-on containers         |

Configured via the `USER_TYPE` env var in the deployment.
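The table above expressed as data, which is how a component selecting by `USER_TYPE` might consume it (the dict layout and `policy_for` helper are illustrative, not an existing API):

```python
# The tier table above as data; values mirror the doc. idle_timeout_min=None
# encodes "No shutdown" for enterprise.
TIER_POLICIES = {
    "anonymous":  {"idle_timeout_min": 15,   "delete_pvc": True,  "shutdown": True},
    "free":       {"idle_timeout_min": 15,   "delete_pvc": False, "shutdown": True},
    "paid":       {"idle_timeout_min": 60,   "delete_pvc": False, "shutdown": True},
    "enterprise": {"idle_timeout_min": None, "delete_pvc": False, "shutdown": False},
}

def policy_for(user_type):
    # USER_TYPE selects the row; unknown tiers fall back to the strictest policy
    return TIER_POLICIES.get(user_type, TIER_POLICIES["anonymous"])
```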
## Security

### Principle of Least Privilege

**Gateway**:
- Can create agent resources
- Cannot delete agent resources
- Cannot access other namespaces
- Cannot exec into pods

**Lifecycle Sidecar**:
- Can delete its own deployment only
- Cannot delete other deployments
- Scoped to the dexorder-agents namespace
- No exec, no secrets access

### Admission Control

All deployments in the `dexorder-agents` namespace are subject to:
- Image allowlist (only approved images)
- Security context enforcement (non-root, dropped capabilities, read-only rootfs)
- Required resource limits
- PodSecurity standards (restricted profile)

See `deploy/k8s/base/admission-policy.yaml`.

### Network Isolation

Agents are network-isolated via NetworkPolicy:
- Can connect to the gateway (MCP)
- Can connect to Redpanda (data streams)
- Can make outbound HTTPS requests (exchanges, LLM APIs)
- Cannot access the k8s API
- Cannot access the system namespace
- Cannot access other agent pods

See `deploy/k8s/base/network-policies.yaml`.
## Deployment

### 1. Apply Security Policies

```bash
kubectl apply -k deploy/k8s/dev   # or prod
```

This creates:
- Namespaces (`dexorder-system`, `dexorder-agents`)
- RBAC (gateway, lifecycle sidecar)
- Admission policies
- Network policies
- Resource quotas

### 2. Build and Push the Lifecycle Sidecar

```bash
cd lifecycle-sidecar
docker build -t ghcr.io/dexorder/lifecycle-sidecar:latest .
docker push ghcr.io/dexorder/lifecycle-sidecar:latest
```

### 3. Gateway Creates Agent Deployments

When a user connects, the gateway creates:
- A Deployment with the agent + sidecar
- A PVC for persistent data
- A Service for the MCP endpoint

See `deploy/k8s/base/agent-deployment-example.yaml` for the template.
## Testing

### Test the Lifecycle Manager Locally

```python
import asyncio

from dexorder.lifecycle_manager import LifecycleManager

# Disable actual shutdown for testing
manager = LifecycleManager(
    idle_timeout_minutes=1,
    check_interval_seconds=10,
    enable_shutdown=False,  # Only log, don't exit
)

await manager.start()

# Simulate activity
manager.record_activity()

# Simulate triggers
manager.add_trigger("test_trigger")
await asyncio.sleep(70)  # Wait past the timeout
manager.remove_trigger("test_trigger")
await asyncio.sleep(70)  # Should detect idle

await manager.stop()
```

### Test the Sidecar Locally

```bash
# Build
cd lifecycle-sidecar
go build -o lifecycle-sidecar main.go

# Run (requires k8s config)
export NAMESPACE=dexorder-agents
export DEPLOYMENT_NAME=agent-test
export USER_TYPE=free
./lifecycle-sidecar
```

### Integration Test

1. Deploy a test agent with the sidecar
2. Verify the agent starts and is healthy
3. Stop sending MCP calls and remove all triggers
4. Wait for the idle timeout + check interval
5. Verify the deployment is deleted
## Troubleshooting

### Container not shutting down when idle

Check the logs:
```bash
kubectl logs -n dexorder-agents agent-user-abc123 -c agent
```

Verify:
- `ENABLE_IDLE_SHUTDOWN=true`
- No active triggers: `manager.active_triggers` should be empty
- The idle timeout has been exceeded

### Sidecar not deleting the deployment

Check the sidecar logs:
```bash
kubectl logs -n dexorder-agents agent-user-abc123 -c lifecycle-sidecar
```

Verify:
- The exit code file exists: `/var/run/agent/exit_code` contains `42`
- RBAC permissions: `kubectl auth can-i delete deployments --as=system:serviceaccount:dexorder-agents:agent-lifecycle -n dexorder-agents`
- The deployment name matches: check the `DEPLOYMENT_NAME` env var

### Gateway can't create deployments

Check the gateway logs and verify:
- The ServiceAccount exists: `kubectl get sa gateway -n dexorder-system`
- The RoleBinding exists: `kubectl get rolebinding gateway-agent-creator -n dexorder-agents`
- The admission policy allows the image: check that the image name matches the allowlist in `admission-policy.yaml`

## Future Enhancements

1. **Graceful shutdown notifications**: Warn users before shutdown via websocket
2. **Predictive scaling**: Keep frequently-used containers warm
3. **Tiered storage**: Move old PVCs to a cheaper storage class
4. **Metrics**: Expose lifecycle metrics (idle rate, shutdown count, etc.)
5. **Cost allocation**: Track resource usage per user/license tier
286
doc/gateway_container_creation.md
Normal file
@@ -0,0 +1,286 @@
# Gateway Container Creation

## Overview

The gateway automatically provisions user agent containers when users authenticate. This ensures each user has their own isolated environment running their MCP server with persistent storage.

## Authentication Flow with Container Creation

```
User connects (WebSocket/Telegram)
        ↓
Send "Authenticating..." status
        ↓
Verify token/channel link
        ↓
Lookup user license from DB
        ↓
Send "Starting workspace..." status
        ↓
┌────────────────────────────────────┐
│  ContainerManager.ensureRunning()  │
│  ┌──────────────────────────────┐  │
│  │ Check if deployment exists   │  │
│  └──────────────────────────────┘  │
│               ↓                    │
│        Does it exist?              │
│         ↙        ↘                 │
│       Yes         No               │
│        │           │               │
│        │  ┌──────────────────┐     │
│        │  │ Create deployment│     │
│        │  │ Create PVC       │     │
│        │  │ Create service   │     │
│        │  └──────────────────┘     │
│        │           │               │
│        └───────────┘               │
│               ↓                    │
│   Wait for deployment ready        │
│   (polls every 2s, timeout 2min)   │
│               ↓                    │
│   Compute MCP endpoint URL         │
│   (internal k8s service DNS)       │
└────────────────────────────────────┘
        ↓
Update license.mcpServerUrl
        ↓
Send "Connected" status
        ↓
Initialize AgentHarness
        ↓
Connect to user's MCP server
        ↓
Ready for messages
```
## Container Naming Convention

All resources follow a consistent naming pattern based on `userId`:

```typescript
userId: "user-abc123"
    ↓
deploymentName: "agent-user-abc123"
serviceName:    "agent-user-abc123"
pvcName:        "agent-user-abc123-data"
mcpEndpoint:    "http://agent-user-abc123.dexorder-agents.svc.cluster.local:3000"
```

User IDs are sanitized to be Kubernetes-compliant (lowercase alphanumeric + hyphens).
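The convention above is deterministic, so it can be captured in a couple of pure functions. A Python sketch (the real helpers are the TypeScript statics on `KubernetesClient`; the exact sanitization rule beyond "lowercase alphanumeric + hyphens" is an assumption):

```python
import re

def sanitize(user_id):
    # Lowercase alphanumeric + hyphens, per the k8s naming note above
    return re.sub(r"[^a-z0-9-]", "-", user_id.lower()).strip("-")

def resource_names(user_id, namespace="dexorder-agents", port=3000):
    uid = sanitize(user_id)
    dep = f"agent-{uid}"
    return {
        "deployment": dep,
        "service": dep,
        "pvc": f"{dep}-data",
        "mcp_endpoint": f"http://{dep}.{namespace}.svc.cluster.local:{port}",
    }
```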
## Templates by License Tier

Templates are located in `gateway/src/k8s/templates/`:
- `free-tier.yaml`
- `pro-tier.yaml`
- `enterprise-tier.yaml`

### Variable Substitution

Templates use simple string replacement:
- `{{userId}}` - User ID
- `{{deploymentName}}` - Computed deployment name
- `{{serviceName}}` - Computed service name
- `{{pvcName}}` - Computed PVC name
- `{{agentImage}}` - Agent container image (from env)
- `{{sidecarImage}}` - Lifecycle sidecar image (from env)
- `{{storageClass}}` - Kubernetes storage class (from env)
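"Simple string replacement" amounts to substituting each `{{name}}` token. A minimal Python sketch of that step (the gateway does this in TypeScript; the function name is illustrative):

```python
def render_template(template, variables):
    # Replace each {{name}} token with its value, as described above
    for key, value in variables.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template
```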
### Resource Limits

| Tier | Memory Request | Memory Limit | CPU Request | CPU Limit | Storage | Idle Timeout |
|------|----------------|--------------|-------------|-----------|---------|--------------|
| **Free** | 256Mi | 512Mi | 100m | 500m | 1Gi | 15min |
| **Pro** | 512Mi | 2Gi | 250m | 2000m | 10Gi | 60min |
| **Enterprise** | 1Gi | 4Gi | 500m | 4000m | 50Gi | Never (shutdown disabled) |
## Components

### KubernetesClient (`gateway/src/k8s/client.ts`)

Low-level k8s API wrapper:
- `deploymentExists(name)` - Check if a deployment exists
- `createAgentDeployment(spec)` - Create deployment/service/PVC from a template
- `waitForDeploymentReady(name, timeout)` - Poll until ready
- `getServiceEndpoint(name)` - Get the service URL
- `deleteAgentDeployment(userId)` - Cleanup (for testing)

Static helpers:
- `getDeploymentName(userId)` - Generate the deployment name
- `getServiceName(userId)` - Generate the service name
- `getPvcName(userId)` - Generate the PVC name
- `getMcpEndpoint(userId, namespace)` - Compute the internal service URL

### ContainerManager (`gateway/src/k8s/container-manager.ts`)

High-level orchestration:
- `ensureContainerRunning(userId, license)` - Main entry point
  - Returns: `{ mcpEndpoint, wasCreated }`
  - Creates the deployment if missing
  - Waits for the ready state
  - Returns the endpoint URL
- `getContainerStatus(userId)` - Check status without creating
- `deleteContainer(userId)` - Manual cleanup

### Authenticator (`gateway/src/auth/authenticator.ts`)

Updated to call the container manager:
- `authenticateWebSocket()` - Calls `ensureContainerRunning()` before returning the `AuthContext`
- `authenticateTelegram()` - Same for Telegram webhooks

### WebSocketHandler (`gateway/src/channels/websocket-handler.ts`)

Multi-phase connection protocol:
1. Send `{type: 'status', status: 'authenticating'}`
2. Authenticate (may take 30-120s if creating a container)
3. Send `{type: 'status', status: 'initializing'}`
4. Initialize the agent harness
5. Send `{type: 'connected', ...}`

This gives the client visibility into the startup process.
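The multi-phase protocol interleaves status messages with the slow steps. A language-agnostic sketch as a Python generator, with `authenticate` and `initialize` as hypothetical stand-ins for the real handler's steps:

```python
def connection_phases(authenticate, initialize):
    # Yields the messages a client sees, in protocol order
    yield {"type": "status", "status": "authenticating"}
    auth = authenticate()   # may take 30-120s if a container is being created
    yield {"type": "status", "status": "initializing"}
    initialize(auth)        # agent harness setup
    yield {"type": "connected"}
```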
## Configuration

Environment variables:

```bash
# Kubernetes
KUBERNETES_NAMESPACE=dexorder-agents
KUBERNETES_IN_CLUSTER=true       # false for local dev
KUBERNETES_CONTEXT=minikube      # for local dev only

# Container images
AGENT_IMAGE=ghcr.io/dexorder/agent:latest
SIDECAR_IMAGE=ghcr.io/dexorder/lifecycle-sidecar:latest

# Storage
AGENT_STORAGE_CLASS=standard
```
## Security

The gateway uses a restricted ServiceAccount with RBAC:

**Can do:**
- ✅ Create deployments in the `dexorder-agents` namespace
- ✅ Create services in the `dexorder-agents` namespace
- ✅ Create PVCs in the `dexorder-agents` namespace
- ✅ Read pod status and logs (debugging)
- ✅ Update deployments (future: resource scaling)

**Cannot do:**
- ❌ Delete deployments (handled by the lifecycle sidecar)
- ❌ Delete PVCs (handled by the lifecycle sidecar)
- ❌ Exec into pods
- ❌ Access secrets or configmaps
- ❌ Create resources in other namespaces
- ❌ Access the Kubernetes API from agent containers (blocked by NetworkPolicy)

See `deploy/k8s/base/gateway-rbac.yaml` for the full configuration.
## Lifecycle

### Container Creation (Gateway)
- User authenticates
- Gateway checks if the deployment exists
- If missing, creates it from the template
- Waits for ready (2min timeout)
- Returns the MCP endpoint

### Container Deletion (Lifecycle Sidecar)
- The container tracks activity and triggers
- When idle (no triggers + timeout), it exits with code 42
- The sidecar detects exit code 42
- The sidecar deletes the deployment (+ optionally the PVC) via the k8s API
- The gateway creates a fresh container on the next authentication

See `doc/container_lifecycle_management.md` for full lifecycle details.
## Error Handling

| Error | Gateway Action | User Experience |
|-------|----------------|-----------------|
| Deployment creation fails | Log error, return auth failure | "Authentication failed" |
| Wait timeout (image pull, etc.) | Log warning, return 503 | "Service unavailable, retry" |
| Service not found | Retry with backoff | Transparent retry |
| MCP connection fails | Return error | "Failed to connect to workspace" |
| Existing deployment not ready | Wait 30s, continue if still not ready | May connect to a partially-ready container |
## Local Development

For local development (outside k8s):

1. Start minikube:
```bash
minikube start
minikube addons enable storage-provisioner
```

2. Apply security policies:
```bash
kubectl apply -k deploy/k8s/dev
```

3. Configure the gateway for local k8s:
```bash
# .env
KUBERNETES_IN_CLUSTER=false
KUBERNETES_CONTEXT=minikube
KUBERNETES_NAMESPACE=dexorder-agents
```

4. Run the gateway:
```bash
cd gateway
npm run dev
```

5. Connect via WebSocket:
```bash
wscat -c "ws://localhost:3000/ws/chat" -H "Authorization: Bearer your-jwt"
```

The gateway will create deployments in minikube. View them with:
```bash
kubectl get deployments -n dexorder-agents
kubectl get pods -n dexorder-agents
kubectl logs -n dexorder-agents agent-user-abc123 -c agent
```
## Production Deployment

1. Build and push the gateway image:
```bash
cd gateway
docker build -t ghcr.io/dexorder/gateway:latest .
docker push ghcr.io/dexorder/gateway:latest
```

2. Deploy to k8s:
```bash
kubectl apply -k deploy/k8s/prod
```

3. The gateway runs in the `dexorder-system` namespace
4. It creates agent containers in the `dexorder-agents` namespace
5. Admission policies enforce the image allowlist and security constraints
## Monitoring

Useful metrics to track:
- Container creation latency (time from auth to ready)
- Container creation failure rate
- Active containers by license tier
- Resource usage per tier
- Idle shutdown rate

These can be exported via Prometheus or logged to a monitoring service.
## Future Enhancements

1. **Pre-warming**: Create containers for active users before they connect
2. **Image updates**: Handle agent image version migrations with user consent
3. **Multi-region**: Geo-distributed container placement
4. **Cost tracking**: Per-user resource usage and billing
5. **Auto-scaling**: Scale down to 0 replicas instead of deleting (faster restart)
6. **Container pools**: Shared warm containers for anonymous users
80
doc/m_c_p_client_authentication_modes.md
Normal file
@@ -0,0 +1,80 @@
Mode A: Platform Harness → Hosted Container (internal)
  Auth: mTLS + platform-signed user claim
  Network: k8s internal, never hits the internet

Mode B: Platform Harness → External User Container (remote)
  Auth: OAuth2 token issued by your platform
  Network: public internet, TLS required

Mode C: Third-party MCP Client → External User Container (standalone)
  Auth: User-managed API key or local-only (no network)
  Network: localhost or user's own network

```
┌──────────────────────────────────────────────────────────┐
│ Platform (Postgres)                                      │
│                                                          │
│ users                                                    │
│ ├── id, email, password_hash, plan_tier                  │
│ │                                                        │
│ containers                                               │
│ ├── user_id                                              │
│ ├── type: "hosted" | "external"                          │
│ ├── mcp_endpoint: "internal-svc:3100" | "https://..."    │
│ ├── auth_method: "mtls" | "platform_token" | "api_key"   │
│ └── public_key_fingerprint (for pinning external certs)  │
│                                                          │
│ api_tokens                                               │
│ ├── user_id                                              │
│ ├── token_hash                                           │
│ ├── scopes: ["mcp:tools", "mcp:resources", "data:read"]  │
│ ├── expires_at                                           │
│ └── issued_for: "platform_harness" | "user_direct"       │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
## Mode A

```
Harness ──mTLS──▶ k8s Service ──▶ User Container MCP
          Validates: source is platform namespace
          Extracts: user_id from forwarded header
```

## Mode B

Registration flow (one-time):
1. User provides their MCP endpoint URL in platform settings
2. Platform generates a scoped token (JWT, short-lived, auto-refreshed)
3. User configures their MCP server to accept tokens signed by your platform
4. Platform stores the endpoint + auth method

Runtime:
```
┌──────────┐  HTTPS + Bearer token      ┌────────────────────┐
│ Harness  │ ─────────────────────────▶ │ External MCP Server│
│          │  Authorization:            │                    │
│          │  Bearer <platform_jwt>     │ Validates:         │
│          │                            │ - JWT signature    │
│          │                            │   (your public     │
│          │                            │    key, JWKS)      │
│          │                            │ - user_id claim    │
│          │                            │   matches self     │
│          │                            │ - not expired      │
└──────────┘                            └────────────────────┘
```
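The claim checks in the Mode B diagram can be sketched as follows. Signature verification against the platform's JWKS is assumed to have happened already and is out of scope here; the claim names (`user_id`, `exp`) are assumptions based on the diagram.

```python
import time

def validate_platform_token(claims, expected_user_id, now=None):
    # Claim checks from the Mode B diagram; JWKS signature verification
    # is assumed to have been done before this point.
    now = time.time() if now is None else now
    if claims.get("user_id") != expected_user_id:
        return False   # token must be for this container's own user
    if claims.get("exp", 0) <= now:
        return False   # token must not be expired
    return True
```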
## Mode C

```yaml
# openclaw/config.yaml
auth:
  # For local-only use (Claude Desktop, Cursor, etc. via stdio)
  mode: "local"   # no network auth needed

  # OR for remote access
  mode: "token"
  tokens:
    - name: "my-laptop"
      hash: "sha256:..."   # generated by `openclaw token create`

  # OR for platform integration
  mode: "platform"
  platform_jwks_url: "https://api.openclaw.io/.well-known/jwks.json"
  expected_user_id: "user_abc123"
```
29
doc/m_c_p_tools_architecture.md
Normal file
@@ -0,0 +1,29 @@
```
MCP Tools (User Container)
├── Memory
│   ├── get_conversation_history(limit)
│   ├── save_message(role, content)
│   ├── search_memory(query)        ← semantic search over past conversations
│   └── get_context_summary()       ← "who is this user, what do they care about"
│
├── Strategies & Indicators
│   ├── list_strategies()
│   ├── read_strategy(name)
│   ├── write_strategy(name, code)
│   ├── list_indicators()
│   ├── read_indicator(name)
│   ├── write_indicator(name, code)
│   └── run_backtest(strategy, params)
│
├── Preferences
│   ├── get_preferences()
│   ├── set_preference(key, value)
│   └── get_agent_prompt()          ← user's custom system prompt additions
│
├── Trading
│   ├── get_watchlist()
│   ├── execute_trade(params)
│   ├── get_positions()
│   └── get_trade_history()
│
└── Sandbox
    └── run_python(code)            ← datascience toolset, matplotlib, etc.
```
168
doc/protocol.md
Normal file
@@ -0,0 +1,168 @@
# ZeroMQ Protocol Architecture

Our data transfer protocol uses ZeroMQ with Protobufs. Each message is a two-frame envelope: the first frame is a single protocol-version byte; the second frame is a type-ID byte followed by the protobuf payload.

OHLC periods are represented as seconds.
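The envelope framing can be sketched as a pair of helpers. The frame layout follows the paragraph above; `PROTOCOL_VERSION = 1` and the type-ID values in the example are illustrative.

```python
# Two-frame ZMQ envelope: frame 1 is a single protocol-version byte;
# frame 2 is a type-ID byte followed by the protobuf payload.
PROTOCOL_VERSION = 1  # illustrative; the real version byte may differ

def encode_envelope(type_id, payload):
    return [bytes([PROTOCOL_VERSION]), bytes([type_id]) + payload]

def decode_envelope(frames):
    version = frames[0][0]
    type_id = frames[1][0]
    payload = frames[1][1:]
    return version, type_id, payload
```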
## Data Flow Overview

**Relay as Gateway**: The Relay is a well-known bind point that all components connect to. It routes messages between clients, ingestors, and Flink.

### Historical Data Query Flow (Async Event-Driven Architecture)
* Client generates request_id and/or client_id (both are client-generated)
* Client computes its notification topic: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
* **Client subscribes to the notification topic BEFORE sending the request (prevents a race condition)**
* Client sends SubmitHistoricalRequest to the Relay (REQ/REP)
* Relay returns an immediate SubmitResponse with request_id and notification_topic (for confirmation)
* Relay publishes DataRequest to the ingestor work queue with an exchange prefix (PUB/SUB)
* Ingestor receives the request and fetches data from the exchange
* Ingestor writes OHLC data to Kafka with __metadata in the first record
* Flink reads from Kafka, processes the data, and writes to Iceberg
* Flink publishes HistoryReadyNotification to a ZMQ PUB socket (port 5557) with a deterministic topic
* Relay proxies the notification via XSUB → XPUB to clients
* Client receives the notification (already subscribed) and queries Iceberg for the data

**Key Architectural Change**: The Relay is completely stateless. No request/response correlation is needed. All notification routing is topic-based (e.g., `RESPONSE:{client_id}`).

**Race Condition Prevention**: Notification topics are deterministic, based on client-generated values (request_id or client_id). Clients MUST subscribe to the notification topic BEFORE submitting the request to avoid missing notifications.

**Two Notification Patterns**:
1. **Per-client topic** (`RESPONSE:{client_id}`): Subscribe once during connection, reuse for all requests from this client. Recommended for most clients.
2. **Per-request topic** (`HISTORY_READY:{request_id}`): Subscribe immediately before each request. Use when you need per-request isolation or don't have a persistent client_id.
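Because topics are deterministic from client-generated values, a client can compute them with no round trip. A minimal sketch of the two patterns above (the helper names are illustrative):

```python
def response_topic(client_id):
    # Per-client topic: subscribe once at connect time, reuse for all requests
    return f"RESPONSE:{client_id}"

def history_ready_topic(request_id):
    # Per-request topic: subscribe immediately BEFORE submitting the request,
    # otherwise the notification can be published before the subscription exists
    return f"HISTORY_READY:{request_id}"
```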
### Realtime Data Flow (Flink → Relay → Clients)
* Ingestors write realtime ticks to Kafka
* Flink reads from Kafka and processes OHLC aggregations and CEP triggers
* Flink publishes market data via ZMQ PUB
* Relay subscribes to Flink (XSUB) and fans out to clients (XPUB)
* Clients subscribe to specific tickers

### Data Processing (Kafka → Flink → Iceberg)
* All market data flows through Kafka (durable event log)
* Flink processes streams for aggregations and CEP
* Flink writes historical data to Apache Iceberg tables
* Clients can query Iceberg for historical data (alternative to ingestor backfill)

**Key Design Principles**:
* Relay is the well-known bind point - all other components connect to it
* Relay is completely stateless - no request tracking, only topic-based routing
* Exchange prefix filtering allows ingestor specialization (e.g., only BINANCE ingestors)
* Historical data flows through Kafka (durable processing) only - no direct response
* Async event-driven notifications via pub/sub (Flink → Relay → Clients)
* Protobufs over ZMQ for all inter-service communication
* Kafka for durability and Flink stream processing
* Iceberg for long-term historical storage and client queries
## ZeroMQ Channels and Patterns
|
||||
|
||||
All sockets bind on **Relay** (well-known endpoint). Components connect to relay.
|
||||
|
||||
### 1. Client Request Channel (Clients → Relay)

**Pattern**: REQ/ROUTER (Relay binds, Clients connect)

- **Socket Type**: Relay uses ROUTER (bind), Clients use REQ (connect)
- **Endpoint**: `tcp://*:5559` (Relay binds)
- **Message Types**: `SubmitHistoricalRequest` → `SubmitResponse`
- **Behavior**:
  - Client generates a request_id and/or client_id
  - Client computes its notification topic deterministically
  - **Client subscribes to the notification topic FIRST (prevents a race)**
  - Client sends a REQ for historical OHLC data
  - Relay validates the request and returns an immediate acknowledgment
  - The response includes notification_topic for client confirmation
  - Relay publishes a DataRequest to the ingestor work queue
  - No request tracking - the relay is stateless
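The subscribe-first behavior above can be sketched with pyzmq. The endpoint addresses, the `request_bytes` envelope, and the helper names are illustrative assumptions, not a definitive client implementation; the notification topic format comes from the Market Data Fanout section below.

```python
import uuid

def notification_topic(request_id: str) -> str:
    """Deterministic topic the client subscribes to before submitting
    its request (HISTORY_READY:{request_id} format from the fanout channel)."""
    return f"HISTORY_READY:{request_id}"

def submit_historical_request(relay_req_addr: str, relay_sub_addr: str,
                              request_bytes: bytes) -> str:
    """Subscribe-first, then REQ: prevents the race where the ready
    notification is published before the client has subscribed."""
    import zmq  # imported here so the pure helper above works without pyzmq
    ctx = zmq.Context.instance()
    request_id = str(uuid.uuid4())
    topic = notification_topic(request_id)

    sub = ctx.socket(zmq.SUB)
    sub.connect(relay_sub_addr)                   # Relay XPUB, e.g. tcp://relay:5558
    sub.setsockopt_string(zmq.SUBSCRIBE, topic)   # subscribe BEFORE submitting

    req = ctx.socket(zmq.REQ)
    req.connect(relay_req_addr)                   # Relay ROUTER, e.g. tcp://relay:5559
    req.send(request_bytes)                       # SubmitHistoricalRequest envelope
    ack = req.recv()                              # immediate SubmitResponse ack
    return request_id
```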
### 2. Ingestor Work Queue (Relay → Ingestors)

**Pattern**: PUB/SUB with exchange prefix filtering

- **Socket Type**: Relay uses PUB (bind), Ingestors use SUB (connect)
- **Endpoint**: `tcp://*:5555` (Relay binds)
- **Message Types**: `DataRequest` (historical or realtime)
- **Topic Prefix**: Exchange name (e.g., `BINANCE:`, `COINBASE:`)
- **Behavior**:
  - Relay publishes work with the exchange prefix taken from the ticker
  - Ingestors subscribe only to the exchanges they support
  - Multiple ingestors can compete for the same exchange
  - Ingestors write data to Kafka only (no direct response)
  - Flink processes Kafka → Iceberg → notification
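The SUB side of the work queue might look like the sketch below; `handle_data_request` is a hypothetical handler, and the address is a placeholder. The prefix derivation assumes the `EXCHANGE:SYMBOL` ticker format used throughout this document.

```python
def exchange_prefix(ticker: str) -> str:
    """Topic prefix the Relay publishes work under, derived from a ticker
    such as "BINANCE:BTC/USDT" -> "BINANCE:"."""
    return ticker.split(":", 1)[0] + ":"

def run_ingestor(relay_pub_addr: str, exchanges: list[str]) -> None:
    """SUB-side sketch: subscribe only to supported exchange prefixes."""
    import zmq  # deferred so exchange_prefix() is usable without pyzmq
    ctx = zmq.Context.instance()
    sub = ctx.socket(zmq.SUB)
    sub.connect(relay_pub_addr)                   # e.g. tcp://relay:5555
    for exchange in exchanges:
        sub.setsockopt_string(zmq.SUBSCRIBE, exchange + ":")
    while True:
        frames = sub.recv_multipart()             # [topic][version][type+protobuf]
        handle_data_request(frames)

def handle_data_request(frames: list[bytes]) -> None:
    # hypothetical: fetch from the exchange API, then produce to Kafka
    raise NotImplementedError
```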
### 3. Market Data Fanout (Relay ↔ Flink ↔ Clients)

**Pattern**: XPUB/XSUB proxy

- **Socket Type**:
  - Relay XPUB (bind) ← Clients SUB (connect) - Port 5558
  - Relay XSUB (connect) → Flink PUB (bind) - Port 5557
- **Message Types**: `Tick`, `OHLC`, `HistoryReadyNotification`
- **Topic Formats**:
  - Market data: `{ticker}|{data_type}` (e.g., `BINANCE:BTC/USDT|tick`)
  - Notifications: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
- **Behavior**:
  - Clients subscribe to ticker topics and notification topics via the Relay XPUB
  - Relay forwards subscriptions to Flink via XSUB
  - Flink publishes processed market data and notifications
  - Relay proxies data to subscribed clients (stateless forwarding)
  - Dynamic subscription management (no pre-registration)
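The stateless fanout core of the Relay reduces to a standard `zmq.proxy` between the two sockets; this is a minimal sketch (the `flink` hostname is a placeholder), with the topic format from this section as a pure helper.

```python
def market_data_topic(ticker: str, data_type: str) -> str:
    """Fanout topic format from this section, e.g. "BINANCE:BTC/USDT|tick"."""
    return f"{ticker}|{data_type}"

def run_relay_fanout() -> None:
    """XPUB/XSUB proxy: subscriptions flow upstream (XPUB -> XSUB) and
    published data flows downstream (XSUB -> XPUB), with no relay state."""
    import zmq  # deferred so market_data_topic() works without pyzmq
    ctx = zmq.Context.instance()
    xpub = ctx.socket(zmq.XPUB)
    xpub.bind("tcp://*:5558")                 # clients connect their SUB here
    xsub = ctx.socket(zmq.XSUB)
    xsub.connect("tcp://flink:5557")          # Flink PUB binds; hostname is a placeholder
    zmq.proxy(xsub, xpub)                     # blocks, forwarding both directions
```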
### 4. Ingestor Control Channel (Optional - Future Use)

**Pattern**: PUB/SUB (broadcast control)

- **Socket Type**: Relay uses PUB, Ingestors use SUB
- **Endpoint**: `tcp://*:5557` (Relay binds)
- **Message Types**: `IngestorControl` (cancel, config updates)
- **Behavior**:
  - Broadcast control messages to all ingestors
  - Used for realtime subscription cancellation
  - Configuration updates
## Message Envelope Format

The core protocol uses two ZeroMQ frames:

```
Frame 1: [1 byte: protocol version]
Frame 2: [1 byte: message type ID][N bytes: protobuf message]
```

This two-frame approach allows receivers to check the protocol version before parsing the message type and protobuf payload.

**Important**: Some ZeroMQ socket patterns (PUB/SUB, XPUB/XSUB) may prepend additional frames for routing purposes. For example:
- **PUB/SUB with topic filtering**: SUB sockets receive `[topic frame][version frame][message frame]`
- **ROUTER sockets**: Prepend identity frames before the message

Components must handle these additional frames appropriately:
- SUB sockets: Skip the first frame (topic), then parse the remaining frames as the standard 2-frame envelope
- ROUTER sockets: Extract the identity frames, then parse the standard 2-frame envelope

The two-frame envelope is the **logical protocol format**, but physical transmission may include additional ZeroMQ transport frames.
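A minimal sketch of packing and parsing the envelope. `PROTOCOL_VERSION = 0x01` is an assumed value (the document does not pin one); the parser takes the last two frames, so topic-prefixed SUB deliveries work unchanged.

```python
PROTOCOL_VERSION = 0x01  # assumed current version; the doc does not pin a value

def pack_envelope(msg_type_id: int, payload: bytes,
                  version: int = PROTOCOL_VERSION) -> list[bytes]:
    """Build the logical two-frame envelope: [version] [type_id + protobuf]."""
    return [bytes([version]), bytes([msg_type_id]) + payload]

def parse_envelope(frames: list[bytes]) -> tuple[int, bytes]:
    """Parse the envelope from the LAST two frames, so extra transport frames
    (SUB topic, ROUTER identities) ahead of them are tolerated."""
    version_frame, message_frame = frames[-2], frames[-1]
    if version_frame[0] != PROTOCOL_VERSION:
        raise ValueError(f"unsupported protocol version: {version_frame[0]}")
    return message_frame[0], message_frame[1:]
```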
## Message Type IDs

| Type ID | Message Type              | Description                                    |
|---------|---------------------------|------------------------------------------------|
| 0x01    | DataRequest               | Request for historical or realtime data        |
| 0x02    | DataResponse (deprecated) | Historical data response (no longer used)      |
| 0x03    | IngestorControl           | Control messages for ingestors                 |
| 0x04    | Tick                      | Individual trade tick data                     |
| 0x05    | OHLC                      | Single OHLC candle with volume                 |
| 0x06    | Market                    | Market metadata                                |
| 0x07    | OHLCRequest (deprecated)  | Client request (replaced by SubmitHistorical)  |
| 0x08    | Response (deprecated)     | Generic response (replaced by SubmitResponse)  |
| 0x09    | CEPTriggerRequest         | Register CEP trigger                           |
| 0x0A    | CEPTriggerAck             | CEP trigger acknowledgment                     |
| 0x0B    | CEPTriggerEvent           | CEP trigger fired callback                     |
| 0x0C    | OHLCBatch                 | Batch of OHLC rows with metadata (Kafka)       |
| 0x10    | SubmitHistoricalRequest   | Client request for historical data (async)     |
| 0x11    | SubmitResponse            | Immediate ack with notification topic          |
| 0x12    | HistoryReadyNotification  | Notification that data is ready in Iceberg     |
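The table above maps directly onto an `IntEnum`, which keeps the wire IDs and names in one place (the constant names here are illustrative; deprecated IDs are kept so old frames still decode):

```python
from enum import IntEnum

class MessageType(IntEnum):
    """Wire IDs from the Message Type IDs table."""
    DATA_REQUEST = 0x01
    DATA_RESPONSE = 0x02             # deprecated
    INGESTOR_CONTROL = 0x03
    TICK = 0x04
    OHLC = 0x05
    MARKET = 0x06
    OHLC_REQUEST = 0x07              # deprecated
    RESPONSE = 0x08                  # deprecated
    CEP_TRIGGER_REQUEST = 0x09
    CEP_TRIGGER_ACK = 0x0A
    CEP_TRIGGER_EVENT = 0x0B
    OHLC_BATCH = 0x0C
    SUBMIT_HISTORICAL_REQUEST = 0x10
    SUBMIT_RESPONSE = 0x11
    HISTORY_READY_NOTIFICATION = 0x12
```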
## Error Handling

**Async Architecture Error Handling**:
- Failed historical requests: the ingestor writes an error marker to Kafka
- Flink reads the error marker and publishes a HistoryReadyNotification with ERROR status
- Client timeout: if no notification is received within the timeout, assume failure
- Realtime requests are cancelled via the control channel if an ingestor fails
- REQ/REP timeouts: 30 seconds default for client request submission
- PUB/SUB has no delivery guarantees (Kafka provides durability)
- No response routing needed - all notifications go via topic-based pub/sub

**Durability**:
- All data flows through Kafka for durability
- Flink checkpointing ensures exactly-once processing
- A client can retry a request with a new request_id if the notification is not received
# User MCP Server - Resource Architecture

The user's MCP server container owns **all** conversation history, RAG, and contextual data. The platform gateway is a thin, stateless orchestrator that only holds the Anthropic API key.

## Architecture Principle

**User Container = Fat Context**
- Conversation history (PostgreSQL/SQLite)
- RAG system (embeddings, vector search)
- User preferences and custom prompts
- Trading context (positions, watchlists, alerts)
- All user-specific data

**Platform Gateway = Thin Orchestrator**
- Anthropic API key (platform pays for LLM)
- Session management (WebSocket/Telegram connections)
- MCP client connection pooling
- Tool routing (platform vs user tools)
- **Zero conversation state stored**
## MCP Resources for Context Injection

Resources are **read-only** data sources that provide context to the LLM. They're fetched before each Claude API call and embedded in the conversation.

### Standard Context Resources
#### 1. `context://user-profile`
**Purpose:** User's trading background and preferences

**MIME Type:** `text/plain`

**Example Content:**
```
User Profile:
- Trading experience: Intermediate
- Preferred timeframes: 1h, 4h, 1d
- Risk tolerance: Medium
- Focus: Swing trading with technical indicators
- Favorite indicators: RSI, MACD, Bollinger Bands
- Active pairs: BTC/USDT, ETH/USDT, SOL/USDT
```

**Implementation Notes:**
- Stored in the user's database `user_preferences` table
- Updated via preference management tools
- Includes inferred data from usage patterns

---
#### 2. `context://conversation-summary`
**Purpose:** Semantic summary of recent conversation with RAG-enhanced context

**MIME Type:** `text/plain`

**Example Content:**
```
Recent Conversation Summary:

Last 10 messages (summarized):
- User asked about moving average crossover strategies
- Discussed backtesting parameters for BTC/USDT
- Reviewed risk management with 2% position sizing
- Explored adding RSI filter to reduce false signals

Relevant past discussions (RAG search):
- 2 weeks ago: Similar strategy development on ETH/USDT
- 1 month ago: User prefers simple strategies over complex ones
- Past preference: Avoid strategies with >5 indicators

Current focus: Optimizing MA crossover with momentum filter
```

**Implementation Notes:**
- Last N messages stored in the `conversation_history` table
- RAG search against embeddings of past conversations
- Semantic search using the user's current message as the query
- ChromaDB/pgvector for embedding storage
- Summary generated on-demand (can be cached for 1-5 minutes)

**RAG Integration:**
```python
async def get_conversation_summary() -> str:
    # Get recent messages
    recent = await db.get_recent_messages(limit=50)

    # Semantic search, using the newest message as the query
    relevant = await rag.search_conversation_history(
        query=recent[-1].content,
        limit=5,
        min_score=0.7,
    )

    # Build summary
    return build_summary(recent[-10:], relevant)
```

---
#### 3. `context://workspace-state`
**Purpose:** Current trading workspace (chart, positions, watchlist)

**MIME Type:** `application/json`

**Example Content:**
```json
{
  "currentChart": {
    "ticker": "BINANCE:BTC/USDT",
    "timeframe": "1h",
    "indicators": ["SMA(20)", "RSI(14)", "MACD(12,26,9)"]
  },
  "watchlist": ["BTC/USDT", "ETH/USDT", "SOL/USDT"],
  "openPositions": [
    {
      "ticker": "BTC/USDT",
      "side": "long",
      "size": 0.1,
      "entryPrice": 45000,
      "currentPrice": 46500,
      "unrealizedPnL": 150
    }
  ],
  "recentAlerts": [
    {
      "type": "price_alert",
      "message": "BTC/USDT crossed above $46,000",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ]
}
```

**Implementation Notes:**
- Synced from the web client chart state
- Updated via the WebSocket sync protocol
- Includes active indicators on the current chart
- Position data from the trading system

---
#### 4. `context://system-prompt`
**Purpose:** User's custom instructions and preferences for AI behavior

**MIME Type:** `text/plain`

**Example Content:**
```
Custom Instructions:
- Be concise and data-driven
- Always show risk/reward ratios
- Prefer simple strategies over complex ones
- When suggesting trades, include stop-loss and take-profit levels
- Explain your reasoning in trading decisions
```

**Implementation Notes:**
- User-editable in the preferences UI
- Appended **last** to the system prompt (highest priority)
- Can override platform defaults
- Stored in the `user_preferences.custom_prompt` field
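The "appended last, highest priority" ordering can be sketched as a pure assembly function; the `base_prompt` argument and the resource-dict shape are assumptions for illustration, not a fixed gateway API.

```python
def build_system_prompt(base_prompt: str, resources: dict[str, str]) -> str:
    """Assemble the system prompt in priority order; context://system-prompt
    goes LAST so the user's custom instructions can override earlier sections."""
    ordered_uris = [
        "context://user-profile",
        "context://workspace-state",
        "context://system-prompt",   # user's custom instructions, appended last
    ]
    sections = [base_prompt]
    for uri in ordered_uris:
        if resources.get(uri):
            sections.append(resources[uri])
    return "\n\n".join(sections)
```

Note that `context://conversation-summary` is deliberately absent: per the harness flow below, it is embedded in the messages, not the system prompt.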
---
## MCP Tools for Actions

Tools are for **actions** that have side effects. They are **not** used for context fetching.

### Conversation Management
- `save_message(role, content, timestamp)` - Save message to history
- `search_conversation(query, limit)` - Explicit semantic search (for user queries like "what did we discuss about BTC?")

### Strategy & Indicators
- `list_strategies()` - List user's strategies
- `read_strategy(name)` - Get strategy code
- `write_strategy(name, code)` - Save strategy
- `run_backtest(strategy, params)` - Execute backtest

### Trading
- `get_watchlist()` - Get watchlist (an action that may trigger a sync)
- `execute_trade(params)` - Execute trade order
- `get_positions()` - Fetch current positions from exchange

### Sandbox
- `run_python(code)` - Execute Python code with data science libraries

---
## Gateway Harness Flow

```typescript
// gateway/src/harness/agent-harness.ts

async handleMessage(message: InboundMessage): Promise<OutboundMessage> {
  // 1. Fetch context resources from user's MCP
  const contextResources = await fetchContextResources([
    'context://user-profile',
    'context://conversation-summary', // <-- RAG happens here
    'context://workspace-state',
    'context://system-prompt',
  ]);

  // 2. Build system prompt from resources
  const systemPrompt = buildSystemPrompt(contextResources);

  // 3. Build messages with embedded conversation context
  const messages = buildMessages(message, contextResources);

  // 4. Get tools from MCP
  const tools = await mcpClient.listTools();

  // 5. Call Claude with embedded context
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    system: systemPrompt, // <-- User profile + workspace + custom prompt
    messages, // <-- Conversation summary from RAG
    tools,
  });
  // (Tool-use loop elided: while stop_reason === 'tool_use', route platform
  // tools locally, proxy user tools to the MCP, and call Claude again.)

  // 6. Save both sides of the turn to the user's MCP (tool calls)
  const responseText = response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: responseText });

  return response;
}
```

---
## User MCP Server Implementation (Python)

### Resource Handler

```python
# user-mcp/src/resources.py

from mcp.server import Server
from mcp.types import Resource

server = Server("dexorder-user")

@server.list_resources()
async def list_resources() -> list[Resource]:
    return [
        Resource(
            uri="context://user-profile",
            name="User Profile",
            description="Trading style, preferences, and background",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://conversation-summary",
            name="Conversation Summary",
            description="Recent conversation with RAG-enhanced context",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://workspace-state",
            name="Workspace State",
            description="Current chart, watchlist, positions",
            mimeType="application/json",
        ),
        Resource(
            uri="context://system-prompt",
            name="Custom System Prompt",
            description="User's custom AI instructions",
            mimeType="text/plain",
        ),
    ]

@server.read_resource()
async def read_resource(uri: str) -> str:
    if uri == "context://user-profile":
        return await build_user_profile()
    elif uri == "context://conversation-summary":
        return await build_conversation_summary()
    elif uri == "context://workspace-state":
        return await build_workspace_state()
    elif uri == "context://system-prompt":
        return await get_custom_prompt()
    else:
        raise ValueError(f"Unknown resource: {uri}")
```
### RAG Integration

```python
# user-mcp/src/rag.py

import chromadb
from sentence_transformers import SentenceTransformer

class ConversationRAG:
    def __init__(self, db_path: str):
        self.chroma = chromadb.PersistentClient(path=db_path)
        self.collection = self.chroma.get_or_create_collection("conversations")
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')

    async def search_conversation_history(
        self,
        query: str,
        limit: int = 5,
        min_score: float = 0.7
    ) -> list[dict]:
        """Semantic search over conversation history"""
        # Embed query
        query_embedding = self.embedder.encode(query).tolist()

        # Search
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=limit,
        )

        # Chroma returns distances (lower = closer); convert to a
        # similarity score before filtering against min_score
        relevant = []
        for i, distance in enumerate(results['distances'][0]):
            score = 1.0 - distance
            if score >= min_score:
                relevant.append({
                    'content': results['documents'][0][i],
                    'metadata': results['metadatas'][0][i],
                    'score': score,
                })

        return relevant

    async def add_message(self, message_id: str, role: str, content: str, metadata: dict):
        """Add message to RAG index"""
        embedding = self.embedder.encode(content).tolist()

        self.collection.add(
            ids=[message_id],
            embeddings=[embedding],
            documents=[content],
            metadatas=[{
                'role': role,
                'timestamp': metadata.get('timestamp'),
                **metadata
            }]
        )
```
### Conversation Summary Builder

```python
# user-mcp/src/context.py

async def build_conversation_summary(user_id: str) -> str:
    """Build conversation summary with RAG"""
    # 1. Get recent messages (newest first)
    recent_messages = await db.get_messages(
        user_id=user_id,
        limit=50,
        order='desc'
    )

    # 2. Get current focus (most recent user message; list is newest-first)
    last_user_msg = next(
        (m for m in recent_messages if m.role == 'user'),
        None
    )

    if not last_user_msg:
        return "No recent conversation history."

    # 3. RAG search for relevant context
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    relevant_context = await rag.search_conversation_history(
        query=last_user_msg.content,
        limit=5,
        min_score=0.7
    )

    # 4. Build summary
    summary = "Recent Conversation Summary:\n\n"

    # Last 10 messages, re-ordered oldest-to-newest for readability
    summary += "Last 10 messages:\n"
    for msg in reversed(recent_messages[:10]):
        summary += f"- {msg.role}: {msg.content[:100]}...\n"

    # Relevant past context
    if relevant_context:
        summary += "\nRelevant past discussions (RAG):\n"
        for ctx in relevant_context:
            timestamp = ctx['metadata'].get('timestamp', 'unknown')
            summary += f"- [{timestamp}] {ctx['content'][:150]}...\n"

    # Inferred focus
    summary += f"\nCurrent focus: {infer_topic(last_user_msg.content)}\n"

    return summary

def infer_topic(message: str) -> str:
    """Simple topic extraction"""
    keywords = {
        'strategy': ['strategy', 'backtest', 'trading system'],
        'indicator': ['indicator', 'rsi', 'macd', 'moving average'],
        'analysis': ['analyze', 'chart', 'price action'],
        'risk': ['risk', 'position size', 'stop loss'],
    }

    message_lower = message.lower()
    for topic, words in keywords.items():
        if any(word in message_lower for word in words):
            return topic

    return 'general trading discussion'
```

---
## Benefits of This Architecture

1. **Privacy**: Conversation history never leaves the user's container
2. **Customization**: Each user controls their RAG, embeddings, and prompt engineering
3. **Scalability**: The platform harness is stateless - horizontally scalable
4. **Cost Control**: Platform pays for Claude; users pay for their compute/storage
5. **Portability**: Users can export/migrate their entire context
6. **Development**: Users can test prompts/context locally without platform involvement

---
## Future Enhancements

### Dynamic Resource URIs

Support parameterized resources:
```
context://conversation/{session_id}
context://strategy/{strategy_name}
context://backtest/{backtest_id}/results
```
### Resource Templates

MCP supports resource templates for dynamic discovery:
```python
from mcp.types import ResourceTemplate

@server.list_resource_templates()
async def list_templates() -> list[ResourceTemplate]:
    return [
        ResourceTemplate(
            uriTemplate="context://strategy/{name}",
            name="Strategy Context",
            description="Context for a specific strategy",
        )
    ]
```
### Streaming Resources

For large context (e.g., full backtest results), support streaming:
```python
from collections.abc import AsyncIterator

@server.read_resource()
async def read_resource(uri: str) -> AsyncIterator[str]:
    if uri.startswith("context://backtest/"):
        async for chunk in stream_backtest_results(uri):
            yield chunk
```
---

## Migration Path

For users with existing conversation history in the platform DB:

1. **Export script**: Migrate platform history → user container DB
2. **RAG indexing**: Embed all historical messages into ChromaDB
3. **Preference migration**: Copy user preferences to the container
4. **Cutover**: Switch to resource-based context fetching

The platform can keep a read-only archive for compliance, but active context lives in the user container.
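Steps 1-2 above can be sketched as a one-shot migration; `db.insert_message`, `platform_rows`, and the row keys are hypothetical names, and `ConversationRAG` is the class sketched in the RAG Integration section.

```python
def batched(items: list, size: int) -> list[list]:
    """Split exported rows into fixed-size batches for bulk indexing."""
    return [items[i:i + size] for i in range(0, len(items), size)]

async def migrate_history(user_id: str, platform_rows: list[dict]) -> int:
    """Hypothetical migration: copy platform rows into the user container DB,
    then embed each into the ChromaDB index."""
    rag = ConversationRAG(f"/data/users/{user_id}/rag")  # from user-mcp/src/rag.py
    migrated = 0
    for batch in batched(platform_rows, 100):
        for row in batch:
            await db.insert_message(user_id=user_id, **row)  # container DB copy
            await rag.add_message(
                message_id=row["id"],
                role=row["role"],
                content=row["content"],
                metadata={"timestamp": row["timestamp"]},
            )
            migrated += 1
    return migrated
```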