# Gateway Architecture: LangChain.js + LangGraph

## Why LangChain.js (Not Vercel AI SDK or Direct Anthropic SDK)?

### The Decision

After evaluating the Vercel AI SDK and LangChain.js, we chose **LangChain.js + LangGraph** for these reasons:

1. **Multi-model support**: 300+ models via OpenRouter, plus direct integrations
2. **Complex workflows**: LangGraph for stateful trading analysis pipelines
3. **No vendor lock-in**: switch between Anthropic, OpenAI, and Google with a one-line change
4. **Streaming**: the same `.stream()` interface the Vercel AI SDK offers
5. **Tool calling**: unified across all providers
6. **Trading-specific**: state management, conditional branching, human-in-the-loop

**We don't need the Vercel AI SDK because:**

- ❌ We use Vue (not React), so we don't need React hooks
- ❌ We run Node.js servers (not edge), so we don't need the edge runtime
- ✅ We **do need** complex workflows (strategy analysis, backtesting, approvals)
- ✅ We **do need** stateful execution (resume from failures)

---
## Architecture Layers

### Layer 1: Model Abstraction (`src/llm/`)

**Provider Factory** (`provider.ts`)

```typescript
const factory = new LLMProviderFactory(config, logger);

// Create any model
const claude = factory.createModel({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
});

const gpt4 = factory.createModel({
  provider: 'openai',
  model: 'gpt-4o',
});
```

**Model Router** (`router.ts`)

```typescript
const router = new ModelRouter(factory, logger);

// Intelligently routes based on:
// - User license (free → Gemini Flash, pro → GPT-4o, enterprise → Claude)
// - Query complexity (simple → cheap, complex → smart)
// - User preference (if set in license.preferredModel)
// - Cost optimization (always use the cheapest model)

const model = await router.route(
  message.content,
  userLicense,
  RoutingStrategy.COMPLEXITY
);
```
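Under the hood, the factory is essentially a lookup from provider name to a model constructor. A minimal sketch of that dispatch (the `ChatModel` interface and the registry shape here are illustrative stand-ins, not the real `provider.ts` or LangChain classes):

```typescript
// Illustrative sketch only: stand-in types, not the real provider.ts.
interface ChatModel {
  invoke(prompt: string): Promise<string>;
}

type ModelSpec = { provider: string; model: string };
type ModelCtor = (model: string) => ChatModel;

class ProviderRegistry {
  private ctors = new Map<string, ModelCtor>();

  register(provider: string, ctor: ModelCtor): void {
    this.ctors.set(provider, ctor);
  }

  // Dispatch on spec.provider; unknown providers fail fast.
  createModel(spec: ModelSpec): ChatModel {
    const ctor = this.ctors.get(spec.provider);
    if (!ctor) throw new Error(`Unknown provider: ${spec.provider}`);
    return ctor(spec.model);
  }
}
```

Registering each provider once keeps model creation a one-line change, which is what makes the "no vendor lock-in" claim above concrete.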
---
### Layer 2: Agent Harness (`src/harness/`)

**Stateless Orchestrator**

The harness holds **zero** conversation state. Everything lives in the user's MCP container.

**Flow:**

```typescript
async handleMessage(message: InboundMessage) {
  // 1. Fetch context from the user's MCP server (resources, not tools)
  const resources = await mcpClient.listResources(); // discover available context
  const context = await Promise.all([
    mcpClient.readResource('context://user-profile'),          // Trading style
    mcpClient.readResource('context://conversation-summary'),  // RAG summary
    mcpClient.readResource('context://workspace-state'),       // Current chart
    mcpClient.readResource('context://system-prompt'),         // Custom instructions
  ]);

  // 2. Route to the appropriate model
  const model = await modelRouter.route(message, license);

  // 3. Build messages with the embedded context
  const messages = buildLangChainMessages(context, message);

  // 4. Call the LLM
  const response = await model.invoke(messages);

  // 5. Persist both sides of the exchange to the user's MCP (tool calls)
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: response.content });

  return response;
}
```

**Streaming variant:**

```typescript
async *streamMessage(message: InboundMessage) {
  const model = await modelRouter.route(message, license);
  const messages = buildLangChainMessages(context, message);

  const stream = await model.stream(messages);

  let fullResponse = '';
  for await (const chunk of stream) {
    fullResponse += chunk.content;
    yield chunk.content; // Stream to WebSocket/Telegram
  }

  // Save after streaming completes
  await mcpClient.callTool('save_message', { /* ... */ });
}
```
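Both snippets assume a `buildLangChainMessages` helper that folds the fetched resources into the prompt. Its implementation isn't shown here; a plausible sketch, with plain role/content objects standing in for LangChain's `SystemMessage`/`HumanMessage` classes:

```typescript
// Hypothetical sketch: plain objects stand in for LangChain's
// SystemMessage/HumanMessage classes.
type Role = 'system' | 'user';
interface ChatMessage { role: Role; content: string; }

interface Inbound { content: string; }

// Concatenate each non-empty context resource into one system
// message, then append the user's message.
function buildLangChainMessages(context: string[], message: Inbound): ChatMessage[] {
  const system = context.filter((c) => c.length > 0).join('\n\n');
  return [
    { role: 'system', content: system },
    { role: 'user', content: message.content },
  ];
}
```

The key property is that all user-specific context arrives as plain strings from MCP resources, so the harness itself stays stateless.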
---
### Layer 3: Workflows (`src/workflows/`)

**LangGraph for Complex Trading Analysis**

```typescript
// Example: Strategy Analysis Pipeline
// (edges between nodes and the 'deploy'/'reject' targets omitted for brevity)
const workflow = new StateGraph(StrategyAnalysisState)
  .addNode('code_review', async (state) => {
    const model = new ChatAnthropic({ model: 'claude-3-opus' });
    const review = await model.invoke(`Review: ${state.strategyCode}`);
    return { codeReview: review.content };
  })
  .addNode('backtest', async (state) => {
    // Call the backtest tool on the user's MCP server
    const results = await mcpClient.callTool('run_backtest', {
      strategy: state.strategyCode,
      ticker: state.ticker,
    });
    return { backtestResults: results };
  })
  .addNode('risk_assessment', async (state) => {
    const model = new ChatAnthropic({ model: 'claude-3-5-sonnet' });
    const assessment = await model.invoke(
      `Analyze risk: ${JSON.stringify(state.backtestResults)}`
    );
    return { riskAssessment: assessment.content };
  })
  .addNode('human_approval', async (state) => {
    // Pause for user review (human-in-the-loop)
    return { humanApproved: await waitForUserApproval(state) };
  })
  .addConditionalEdges('human_approval', (state) => {
    return state.humanApproved ? 'deploy' : 'reject';
  })
  .compile();

// Execute
const result = await workflow.invoke({
  strategyCode: userCode,
  ticker: 'BTC/USDT',
  timeframe: '1h',
});
```
**Benefits:**

- **Stateful**: resume if the server crashes mid-analysis
- **Conditional**: route based on results (e.g. Sharpe > 2 → deploy, else reject)
- **Human-in-the-loop**: pause for user approval
- **Multi-step**: each node can use a different model
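The conditional routing above boils down to a plain predicate over the workflow state. A sketch, noting that the field names and thresholds here are illustrative assumptions, not taken from the real workflow:

```typescript
// Illustrative conditional-edge logic: field names and the Sharpe
// threshold are assumptions, not the real workflow's rules.
interface BacktestResults { sharpeRatio: number; maxDrawdown: number; }

function routeAfterBacktest(results: BacktestResults): 'deploy' | 'reject' {
  // Deploy only when risk-adjusted return clears the bar and
  // drawdown stays tolerable.
  if (results.sharpeRatio > 2 && results.maxDrawdown < 0.2) return 'deploy';
  return 'reject';
}
```

In LangGraph this predicate would be the function passed to `.addConditionalEdges`, returning the name of the next node.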
---
## User Context Architecture

### MCP Resources (Not Tools)

**The user's MCP server exposes resources** (read-only context):

```
context://user-profile          → Trading style, preferences
context://conversation-summary  → RAG-generated summary
context://workspace-state       → Current chart, positions
context://system-prompt         → User's custom AI instructions
```

**The gateway fetches these and embeds them in the LLM call:**

```typescript
const userProfile = await mcpClient.readResource('context://user-profile');
const conversationSummary = await mcpClient.readResource('context://conversation-summary');

// The user's MCP server runs a RAG search and returns a summary;
// the gateway embeds it in the Claude/GPT prompt.
```
**Why resources, not tools?**

- Resources = context injection (read-only)
- Tools = actions (write operations)
- Context should be fetched **before** the LLM call, not during it

---
## Model Routing Strategies

### 1. User Preference
```typescript
// The user's license can carry a preferred model;
// the router uses it whenever it is set.
const license = {
  preferredModel: {
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
  },
};
```
### 2. Complexity-Based
```typescript
const isComplex = message.includes('backtest') || message.length > 200;

if (isComplex) {
  return { provider: 'anthropic', model: 'claude-3-opus' }; // Smart
} else {
  return { provider: 'openai', model: 'gpt-4o-mini' }; // Fast
}
```
### 3. License Tier
```typescript
switch (license.licenseType) {
  case 'free':
    return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Cheap
  case 'pro':
    return { provider: 'openai', model: 'gpt-4o' }; // Balanced
  case 'enterprise':
    return { provider: 'anthropic', model: 'claude-3-5-sonnet' }; // Premium
}
```
### 4. Cost-Optimized
```typescript
return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Always the cheapest
```
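The four strategies compose into a single selector with a clear precedence: an explicit user preference always wins, then the chosen strategy decides. A self-contained sketch (the `License` shape and the complexity heuristic mirror the snippets above but are assumptions, not the real `router.ts`):

```typescript
// Self-contained sketch combining the four strategies above.
type ModelSpec = { provider: string; model: string };
type Strategy = 'preference' | 'complexity' | 'tier' | 'cost';

interface License {
  licenseType: 'free' | 'pro' | 'enterprise';
  preferredModel?: ModelSpec;
}

const CHEAPEST: ModelSpec = { provider: 'google', model: 'gemini-2.0-flash-exp' };

function selectModel(message: string, license: License, strategy: Strategy): ModelSpec {
  // An explicit user preference always wins.
  if (license.preferredModel) return license.preferredModel;

  switch (strategy) {
    case 'complexity': {
      const isComplex = message.includes('backtest') || message.length > 200;
      return isComplex
        ? { provider: 'anthropic', model: 'claude-3-opus' }
        : { provider: 'openai', model: 'gpt-4o-mini' };
    }
    case 'tier':
      if (license.licenseType === 'pro') return { provider: 'openai', model: 'gpt-4o' };
      if (license.licenseType === 'enterprise') return { provider: 'anthropic', model: 'claude-3-5-sonnet' };
      return CHEAPEST; // free tier
    case 'cost':
    default:
      return CHEAPEST;
  }
}
```

Keeping this as a pure function makes the routing policy trivially unit-testable, independent of any provider SDK.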
---
## When to Use What

### Simple Chat → Agent Harness
```typescript
// User: "What's the RSI on BTC?"
// → Fast streaming response via harness.streamMessage()
```

### Complex Analysis → LangGraph Workflow
```typescript
// User: "Analyze this strategy and backtest it"
// → Multi-step workflow: code review → backtest → risk → approval
```

### Direct Tool Call → MCP Client
```typescript
// User: "Get my watchlist"
// → Direct MCP tool call, no LLM needed
```
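Something in the gateway has to pick between these three paths. A keyword-based sketch of that dispatch (the hint lists are illustrative assumptions; the real dispatcher could just as well use an LLM classifier):

```typescript
// Illustrative dispatch heuristic; the keyword lists are
// assumptions, not the gateway's real routing rules.
type Route = 'harness' | 'workflow' | 'mcp-tool';

const WORKFLOW_HINTS = ['analyze', 'backtest', 'optimize'];
const DIRECT_TOOL_HINTS = ['get my watchlist', 'show my positions'];

function dispatch(message: string): Route {
  const text = message.toLowerCase();
  // Plain data fetches skip the LLM entirely.
  if (DIRECT_TOOL_HINTS.some((h) => text.includes(h))) return 'mcp-tool';
  // Multi-step analysis goes to a LangGraph workflow.
  if (WORKFLOW_HINTS.some((h) => text.includes(h))) return 'workflow';
  // Everything else: fast streaming chat via the harness.
  return 'harness';
}
```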
---
## Data Flow

```
User Message ("Analyze my strategy")
        ↓
Gateway → Route to workflow (not harness)
        ↓
LangGraph Workflow:
  ├─ Node 1: Code Review (Claude Opus)
  │    └─ Analyzes strategy code
  ├─ Node 2: Backtest (MCP tool call)
  │    └─ User's container runs backtest
  ├─ Node 3: Risk Assessment (Claude Sonnet)
  │    └─ Evaluates results
  ├─ Node 4: Human Approval (pause)
  │    └─ User reviews in UI
  └─ Node 5: Recommendation (GPT-4o-mini)
       └─ Final decision

Result → Return to user
```

---
## Benefits Summary

| Feature | LangChain.js | Vercel AI SDK | Direct Anthropic SDK |
|---------|--------------|---------------|----------------------|
| Multi-model | ✅ 300+ models | ✅ 100+ models | ❌ Anthropic only |
| Streaming | ✅ `.stream()` | ✅ `streamText()` | ✅ `.stream()` |
| Tool calling | ✅ Unified | ✅ Unified | ✅ Anthropic format |
| Complex workflows | ✅ LangGraph | ❌ Limited | ❌ DIY |
| Stateful agents | ✅ LangGraph | ❌ No | ❌ No |
| Human-in-the-loop | ✅ LangGraph | ❌ No | ❌ No |
| React hooks | ❌ N/A | ✅ `useChat()` | ❌ N/A |
| Bundle size | Large (~101 kB) | Small (~30 kB) | Medium (~60 kB) |
| **Dexorder needs** | **✅ Perfect fit** | **❌ Missing workflows** | **❌ Vendor lock-in** |

---
## Next Steps

1. **Implement tool calling** in the agent harness (bind MCP tools to LangChain)
2. **Add state persistence** for LangGraph (PostgreSQL checkpointer)
3. **Build more workflows**: market scanner, portfolio optimizer
4. **Add monitoring**: track model usage, costs, latency
5. **User container**: implement the Python MCP server with resources
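For step 2, the idea behind a checkpointer is to persist workflow state after every node so a crashed run can resume. A minimal in-memory sketch of that contract (simplified and illustrative — the real LangGraph checkpointer interface has more to it, and the map here stands in for a PostgreSQL table):

```typescript
// Simplified, illustrative checkpointer sketch; an in-memory map
// stands in for the PostgreSQL table. Not the real LangGraph API.
interface Checkpoint {
  threadId: string;                 // one workflow run
  node: string;                     // last completed node
  state: Record<string, unknown>;   // accumulated workflow state
}

class InMemoryCheckpointer {
  private store = new Map<string, Checkpoint>();

  // Persist after every node so a crashed run can resume.
  async put(cp: Checkpoint): Promise<void> {
    this.store.set(cp.threadId, cp);
  }

  // On restart, resume from the last completed node.
  async get(threadId: string): Promise<Checkpoint | undefined> {
    return this.store.get(threadId);
  }
}
```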