container lifecycle management

2026-03-12 15:13:38 -04:00
parent e99ef5d2dd
commit b9cc397e05
61 changed files with 6880 additions and 31 deletions

gateway/README.md (new file)
# Dexorder Gateway
Multi-channel gateway with agent harness for the Dexorder AI platform.
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│ Platform Gateway │
│ (Node.js/Fastify) │
│ │
│ ┌────────────────────────────────────────────────┐ │
│ │ Channels │ │
│ │ - WebSocket (/ws/chat) │ │
│ │ - Telegram Webhook (/webhook/telegram) │ │
│ └────────────────────────────────────────────────┘ │
│ ↕ │
│ ┌────────────────────────────────────────────────┐ │
│ │ Authenticator │ │
│ │ - JWT verification (WebSocket) │ │
│ │ - Channel linking (Telegram) │ │
│ │ - User license lookup (PostgreSQL) │ │
│ └────────────────────────────────────────────────┘ │
│ ↕ │
│ ┌────────────────────────────────────────────────┐ │
│ │ Agent Harness (per-session) │ │
│ │ - Claude API integration │ │
│ │ - MCP client connector │ │
│ │ - Conversation state │ │
│ └────────────────────────────────────────────────┘ │
│ ↕ │
│ ┌────────────────────────────────────────────────┐ │
│ │ MCP Client │ │
│ │ - User container connection │ │
│ │ - Tool routing │ │
│ └────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
                ↕
┌───────────────────────────────┐
│ User MCP Server (Python)      │
│ - Strategies, indicators      │
│ - Memory, preferences         │
│ - Backtest sandbox            │
└───────────────────────────────┘
```
## Features
- **Automatic container provisioning**: Creates user agent containers on-demand via Kubernetes
- **Multi-channel support**: WebSocket and Telegram webhooks
- **Per-channel authentication**: JWT for web, channel linking for chat apps
- **User license management**: Feature flags and resource limits from PostgreSQL
- **Container lifecycle management**: Auto-shutdown on idle (handled by container sidecar)
- **License-based resources**: Different memory/CPU/storage limits per tier
- **Multi-model LLM support**: Anthropic Claude, OpenAI GPT, Google Gemini, OpenRouter (300+ models)
- **Zero vendor lock-in**: Switch models with one line, powered by LangChain.js
- **Intelligent routing**: Auto-select models based on complexity, license tier, or user preference
- **Streaming responses**: Real-time chat with WebSocket and Telegram
- **Complex workflows**: LangGraph for stateful trading analysis (backtest → risk → approval)
- **Agent harness**: Stateless orchestrator (all context lives in user's MCP container)
- **MCP resource integration**: User's RAG, conversation history, and preferences
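The intelligent-routing feature above can be sketched as a small selector. The routing table, model names, and the length-based complexity heuristic below are illustrative assumptions, not the gateway's actual policy:

```javascript
// Hypothetical routing table: which model each tier gets for simple vs. complex
// requests. Names are illustrative, not the gateway's real configuration.
const TIER_MODELS = {
  free:       { simple: 'claude-3-5-haiku-20241022',  complex: 'claude-3-5-haiku-20241022' },
  pro:        { simple: 'claude-3-5-haiku-20241022',  complex: 'claude-3-5-sonnet-20241022' },
  enterprise: { simple: 'claude-3-5-sonnet-20241022', complex: 'claude-3-5-sonnet-20241022' },
};

function selectModel(tier, message, preferred = null) {
  // Honor an explicit user preference on paid tiers
  if (preferred && tier !== 'free') return preferred;
  // Crude complexity heuristic: longer messages get the stronger model
  const complexity = message.length > 500 ? 'complex' : 'simple';
  return TIER_MODELS[tier][complexity];
}
```

In practice the complexity signal could come from the agent harness (tool use, conversation depth) rather than raw message length.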
## Container Management
When a user authenticates, the gateway:
1. **Checks for existing container**: Queries Kubernetes for deployment
2. **Creates if missing**: Renders YAML template based on license tier
3. **Waits for ready**: Polls deployment status until healthy
4. **Returns MCP endpoint**: Computed from service name
5. **Connects to MCP server**: Proceeds with normal authentication flow
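The five steps above can be sketched as follows, with the Kubernetes client injected so the flow stays testable. The deployment/service naming scheme, the `dexorder-users` namespace, the port, and the client method names are all assumptions for illustration:

```javascript
// Step 4: the MCP endpoint is derived from the per-user service name.
// Naming scheme and namespace are assumptions, not the gateway's real convention.
function mcpEndpointFor(userId) {
  return `http://mcp-${userId}.dexorder-users.svc.cluster.local:8080`;
}

// Steps 1-5 with an injected client (`k8s` is a stand-in for a real
// Kubernetes API wrapper; its method names are hypothetical).
async function ensureContainer(k8s, userId, tier) {
  const name = `mcp-${userId}`;
  // Step 1: check for an existing deployment
  if (!(await k8s.deploymentExists(name))) {
    // Step 2: create from the tier-specific template
    await k8s.applyTemplate(name, tier);
  }
  // Step 3: poll until the deployment reports ready
  while (!(await k8s.isReady(name))) {
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
  // Steps 4-5: hand back the endpoint for the MCP client to connect to
  return mcpEndpointFor(userId);
}
```

Injecting the client keeps the provisioning logic decoupled from any particular Kubernetes SDK.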
Container templates by license tier:
| Tier | Memory | CPU | Storage | Idle Timeout |
|------|--------|-----|---------|--------------|
| Free | 512Mi | 500m | 1Gi | 15min |
| Pro | 2Gi | 2000m | 10Gi | 60min |
| Enterprise | 4Gi | 4000m | 50Gi | Never |
Containers self-manage their lifecycle using the lifecycle sidecar (see `../lifecycle-sidecar/`).
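The tier table translates directly into a lookup the template renderer can consume. The values mirror the table; the object shape and `idleMinutes` field name are illustrative:

```javascript
// Resource limits per license tier, mirroring the table above.
// idleMinutes: null means the container never shuts down on idle.
const TIER_LIMITS = {
  free:       { memory: '512Mi', cpu: '500m',  storage: '1Gi',  idleMinutes: 15 },
  pro:        { memory: '2Gi',   cpu: '2000m', storage: '10Gi', idleMinutes: 60 },
  enterprise: { memory: '4Gi',   cpu: '4000m', storage: '50Gi', idleMinutes: null },
};

function limitsFor(tier) {
  const limits = TIER_LIMITS[tier];
  if (!limits) throw new Error(`Unknown license tier: ${tier}`);
  return limits;
}
```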
## Setup
### Prerequisites
- Node.js >= 22.0.0
- PostgreSQL database
- At least one LLM provider API key:
- Anthropic Claude
- OpenAI GPT
- Google Gemini
- OpenRouter (one key for 300+ models)
### Development
1. Install dependencies:
```bash
npm install
```
2. Copy environment template:
```bash
cp .env.example .env
```
3. Configure `.env` (see `.env.example`):
```bash
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/dexorder
# Configure at least one provider
ANTHROPIC_API_KEY=sk-ant-xxxxx
# OPENAI_API_KEY=sk-xxxxx
# GOOGLE_API_KEY=xxxxx
# OPENROUTER_API_KEY=sk-or-xxxxx
# Optional: Set default model
DEFAULT_MODEL_PROVIDER=anthropic
DEFAULT_MODEL=claude-3-5-sonnet-20241022
```
4. Run development server:
```bash
npm run dev
```
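The provider variables from step 3 can be resolved at startup with a small helper. The fallback order and the idea of validating that the chosen provider's key is actually set are assumptions, not documented gateway behavior:

```javascript
// Hypothetical helper: resolve the default LLM provider from environment
// variables, falling back to the first provider with a configured API key.
const PROVIDER_KEYS = {
  anthropic: 'ANTHROPIC_API_KEY',
  openai: 'OPENAI_API_KEY',
  google: 'GOOGLE_API_KEY',
  openrouter: 'OPENROUTER_API_KEY',
};

function resolveProvider(env) {
  // An explicit DEFAULT_MODEL_PROVIDER wins if its key is actually configured
  const chosen = env.DEFAULT_MODEL_PROVIDER;
  if (chosen && env[PROVIDER_KEYS[chosen]]) return chosen;
  // Otherwise fall back to the first configured provider
  for (const [provider, key] of Object.entries(PROVIDER_KEYS)) {
    if (env[key]) return provider;
  }
  throw new Error('No LLM provider API key configured');
}
```

Failing fast at boot on a missing key beats a confusing runtime error on the first chat message.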
### Production Build
```bash
npm run build
npm start
```
### Docker
```bash
docker build -t dexorder/gateway:latest .
docker run -p 3000:3000 --env-file .env dexorder/gateway:latest
```
## Database Schema
Required PostgreSQL tables (complete DDL will be documented separately):
### `user_licenses`
- `user_id` (text, primary key)
- `email` (text)
- `license_type` (text: 'free', 'pro', 'enterprise')
- `features` (jsonb)
- `resource_limits` (jsonb)
- `mcp_server_url` (text)
- `expires_at` (timestamp, nullable)
- `created_at` (timestamp)
- `updated_at` (timestamp)
### `user_channel_links`
- `id` (serial, primary key)
- `user_id` (text, foreign key)
- `channel_type` (text: 'telegram', 'slack', 'discord')
- `channel_user_id` (text)
- `created_at` (timestamp)
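A `user_licenses` row is interpreted at auth time roughly like this sketch: a license is active when `expires_at` is NULL or in the future, and feature flags come from the `features` jsonb column. The function names and boolean-flag shape of `features` are assumptions:

```javascript
// A license is active when expires_at is NULL (never expires) or still in the future.
function licenseActive(row, now = new Date()) {
  return row.expires_at === null || new Date(row.expires_at) > now;
}

// Feature flags live in the `features` jsonb column; an expired license grants nothing.
function hasFeature(row, feature) {
  return licenseActive(row) && row.features[feature] === true;
}
```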
## API Endpoints
### WebSocket
**`GET /ws/chat`**
- WebSocket connection for web client
- Auth: Bearer token in headers
- Protocol: JSON messages
Example:
```javascript
// Node.js client using the `ws` package (a browser WebSocket cannot set headers)
const WebSocket = require('ws');

const ws = new WebSocket('ws://localhost:3000/ws/chat', {
  headers: {
    'Authorization': 'Bearer your-jwt-token'
  }
});

ws.on('message', (data) => {
  const msg = JSON.parse(data);
  console.log(msg);
});

// Send only after the connection opens; ws throws if the socket is still connecting
ws.on('open', () => {
  ws.send(JSON.stringify({
    type: 'message',
    content: 'Hello, AI!'
  }));
});
```
### Telegram Webhook
**`POST /webhook/telegram`**
- Telegram bot webhook endpoint
- Auth: Telegram user linked to platform user
- Automatically processes incoming messages
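Resolving an incoming Telegram update to a platform user via `user_channel_links` can be sketched as below; `links` stands in for the database lookup, and the function name is hypothetical (the `update.message.from.id` / `chat.id` / `text` fields follow the Telegram Bot API Update shape):

```javascript
// Map a Telegram update to a platform user via the user_channel_links mapping.
// `links` is a stand-in for the PostgreSQL lookup.
function resolveTelegramUser(update, links) {
  const message = update.message;
  if (!message || !message.text) return null; // ignore non-text updates
  const channelUserId = String(message.from.id);
  const link = links.find(
    (l) => l.channel_type === 'telegram' && l.channel_user_id === channelUserId
  );
  if (!link) return null; // Telegram account not linked to a platform user
  return { userId: link.user_id, chatId: message.chat.id, text: message.text };
}
```

An unlinked sender would typically get a reply prompting them to link their account rather than a silent drop.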
### Health Check
**`GET /health`**
- Returns server health status
## TODO
- [ ] Implement JWT verification with JWKS
- [ ] Implement MCP HTTP/SSE transport
- [ ] Add Redis for session persistence
- [ ] Add rate limiting per user license
- [ ] Add message usage tracking
- [ ] Add streaming responses for WebSocket
- [ ] Add Slack and Discord channel handlers
- [ ] Add session cleanup/timeout logic