# Gateway Architecture: LangChain.js + LangGraph

## Why LangChain.js (Not Vercel AI SDK or Direct Anthropic SDK)?

### The Decision

After evaluating the Vercel AI SDK and LangChain.js, we chose **LangChain.js + LangGraph** for these reasons:

1. **Multi-model support**: 300+ models via OpenRouter, plus direct integrations
2. **Complex workflows**: LangGraph for stateful trading analysis pipelines
3. **No vendor lock-in**: switch between Anthropic, OpenAI, and Google with a one-line change
4. **Streaming**: the same `.stream()` interface the Vercel AI SDK offers
5. **Tool calling**: unified across all providers
6. **Trading-specific**: state management, conditional branching, human-in-the-loop

**We don't need the Vercel AI SDK because:**

- ❌ We use Vue (not React), so we don't need React hooks
- ❌ We run Node.js servers (not edge), so we don't need the edge runtime
- ✅ We **do need** complex workflows (strategy analysis, backtesting, approvals)
- ✅ We **do need** stateful execution (resume from failures)

---
## Architecture Layers

### Layer 1: Model Abstraction (`src/llm/`)

**Provider Factory** (`provider.ts`)

```typescript
const factory = new LLMProviderFactory(config, logger);

// Create any model
const claude = factory.createModel({
  provider: 'anthropic',
  model: 'claude-3-5-sonnet-20241022',
});

const gpt4 = factory.createModel({
  provider: 'openai',
  model: 'gpt-4o',
});
```

**Model Router** (`router.ts`)

```typescript
const router = new ModelRouter(factory, logger);

// Intelligently routes based on:
// - User license (free → Gemini Flash, pro → GPT-4o, enterprise → Claude)
// - Query complexity (simple → cheap, complex → smart)
// - User preference (if set in license.preferredModel)
// - Cost optimization (always use the cheapest model)

const model = await router.route(
  message.content,
  userLicense,
  RoutingStrategy.COMPLEXITY
);
```
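Under the hood, the factory is essentially a lookup from provider name to a model constructor. A minimal sketch of that dispatch (the `ChatModel` interface and the registry shape here are illustrative stand-ins, not the real `provider.ts` or LangChain classes):

```typescript
// Illustrative sketch only: stand-in types, not the real provider.ts.
interface ChatModel {
  invoke(prompt: string): Promise<string>;
}

type ModelSpec = { provider: string; model: string };
type ModelCtor = (model: string) => ChatModel;

class ProviderRegistry {
  private ctors = new Map<string, ModelCtor>();

  register(provider: string, ctor: ModelCtor): void {
    this.ctors.set(provider, ctor);
  }

  // Dispatch on spec.provider; unknown providers fail fast.
  createModel(spec: ModelSpec): ChatModel {
    const ctor = this.ctors.get(spec.provider);
    if (!ctor) throw new Error(`Unknown provider: ${spec.provider}`);
    return ctor(spec.model);
  }
}
```

Registering each provider once keeps model creation a one-line change, which is what makes the "no vendor lock-in" claim above concrete.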
---
### Layer 2: Agent Harness (`src/harness/`)

**Stateless Orchestrator**

The harness holds **zero** conversation state. Everything lives in the user's MCP container.

**Flow:**

```typescript
async handleMessage(message: InboundMessage) {
  // 1. Fetch context from the user's MCP server (resources, not tools)
  const resources = await mcpClient.listResources(); // discover available context
  const context = await Promise.all([
    mcpClient.readResource('context://user-profile'),          // Trading style
    mcpClient.readResource('context://conversation-summary'),  // RAG summary
    mcpClient.readResource('context://workspace-state'),       // Current chart
    mcpClient.readResource('context://system-prompt'),         // Custom instructions
  ]);

  // 2. Route to the appropriate model
  const model = await modelRouter.route(message, license);

  // 3. Build messages with the embedded context
  const messages = buildLangChainMessages(context, message);

  // 4. Call the LLM
  const response = await model.invoke(messages);

  // 5. Persist both sides of the exchange to the user's MCP (tool calls)
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: response.content });

  return response;
}
```

**Streaming variant:**

```typescript
async *streamMessage(message: InboundMessage) {
  const model = await modelRouter.route(message, license);
  const messages = buildLangChainMessages(context, message);

  const stream = await model.stream(messages);

  let fullResponse = '';
  for await (const chunk of stream) {
    fullResponse += chunk.content;
    yield chunk.content; // Stream to WebSocket/Telegram
  }

  // Save after streaming completes
  await mcpClient.callTool('save_message', { /* ... */ });
}
```
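Both snippets assume a `buildLangChainMessages` helper that folds the fetched resources into the prompt. Its implementation isn't shown here; a plausible sketch, with plain role/content objects standing in for LangChain's `SystemMessage`/`HumanMessage` classes:

```typescript
// Hypothetical sketch: plain objects stand in for LangChain's
// SystemMessage/HumanMessage classes.
type Role = 'system' | 'user';
interface ChatMessage { role: Role; content: string; }

interface Inbound { content: string; }

// Concatenate each non-empty context resource into one system
// message, then append the user's message.
function buildLangChainMessages(context: string[], message: Inbound): ChatMessage[] {
  const system = context.filter((c) => c.length > 0).join('\n\n');
  return [
    { role: 'system', content: system },
    { role: 'user', content: message.content },
  ];
}
```

The key property is that all user-specific context arrives as plain strings from MCP resources, so the harness itself stays stateless.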
---
### Layer 3: Workflows (`src/workflows/`)

**LangGraph for Complex Trading Analysis**

```typescript
// Example: Strategy Analysis Pipeline
// (edges between nodes and the 'deploy'/'reject' targets omitted for brevity)
const workflow = new StateGraph(StrategyAnalysisState)
  .addNode('code_review', async (state) => {
    const model = new ChatAnthropic({ model: 'claude-3-opus' });
    const review = await model.invoke(`Review: ${state.strategyCode}`);
    return { codeReview: review.content };
  })
  .addNode('backtest', async (state) => {
    // Call the backtest tool on the user's MCP server
    const results = await mcpClient.callTool('run_backtest', {
      strategy: state.strategyCode,
      ticker: state.ticker,
    });
    return { backtestResults: results };
  })
  .addNode('risk_assessment', async (state) => {
    const model = new ChatAnthropic({ model: 'claude-3-5-sonnet' });
    const assessment = await model.invoke(
      `Analyze risk: ${JSON.stringify(state.backtestResults)}`
    );
    return { riskAssessment: assessment.content };
  })
  .addNode('human_approval', async (state) => {
    // Pause for user review (human-in-the-loop)
    return { humanApproved: await waitForUserApproval(state) };
  })
  .addConditionalEdges('human_approval', (state) => {
    return state.humanApproved ? 'deploy' : 'reject';
  })
  .compile();

// Execute
const result = await workflow.invoke({
  strategyCode: userCode,
  ticker: 'BTC/USDT',
  timeframe: '1h',
});
```
**Benefits:**

- **Stateful**: resume if the server crashes mid-analysis
- **Conditional**: route based on results (e.g. Sharpe > 2 → deploy, else reject)
- **Human-in-the-loop**: pause for user approval
- **Multi-step**: each node can use a different model
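The conditional routing above boils down to a plain predicate over the workflow state. A sketch, noting that the field names and thresholds here are illustrative assumptions, not taken from the real workflow:

```typescript
// Illustrative conditional-edge logic: field names and the Sharpe
// threshold are assumptions, not the real workflow's rules.
interface BacktestResults { sharpeRatio: number; maxDrawdown: number; }

function routeAfterBacktest(results: BacktestResults): 'deploy' | 'reject' {
  // Deploy only when risk-adjusted return clears the bar and
  // drawdown stays tolerable.
  if (results.sharpeRatio > 2 && results.maxDrawdown < 0.2) return 'deploy';
  return 'reject';
}
```

In LangGraph this predicate would be the function passed to `.addConditionalEdges`, returning the name of the next node.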
---
## User Context Architecture

### MCP Resources (Not Tools)

**The user's MCP server exposes resources** (read-only context):

```
context://user-profile          → Trading style, preferences
context://conversation-summary  → RAG-generated summary
context://workspace-state       → Current chart, positions
context://system-prompt         → User's custom AI instructions
```

**The gateway fetches these and embeds them in the LLM call:**

```typescript
const userProfile = await mcpClient.readResource('context://user-profile');
const conversationSummary = await mcpClient.readResource('context://conversation-summary');

// The user's MCP server runs a RAG search and returns a summary;
// the gateway embeds it in the Claude/GPT prompt.
```
**Why resources, not tools?**

- Resources = context injection (read-only)
- Tools = actions (write operations)
- Context should be fetched **before** the LLM call, not during it

---
## Model Routing Strategies

### 1. User Preference
```typescript
// The user's license can carry a preferred model;
// the router uses it whenever it is set.
const license = {
  preferredModel: {
    provider: 'anthropic',
    model: 'claude-3-5-sonnet-20241022',
  },
};
```
### 2. Complexity-Based
```typescript
const isComplex = message.includes('backtest') || message.length > 200;

if (isComplex) {
  return { provider: 'anthropic', model: 'claude-3-opus' }; // Smart
} else {
  return { provider: 'openai', model: 'gpt-4o-mini' }; // Fast
}
```
### 3. License Tier
```typescript
switch (license.licenseType) {
  case 'free':
    return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Cheap
  case 'pro':
    return { provider: 'openai', model: 'gpt-4o' }; // Balanced
  case 'enterprise':
    return { provider: 'anthropic', model: 'claude-3-5-sonnet' }; // Premium
}
```
### 4. Cost-Optimized
```typescript
return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Always the cheapest
```
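The four strategies compose into a single selector with a clear precedence: an explicit user preference always wins, then the chosen strategy decides. A self-contained sketch (the `License` shape and the complexity heuristic mirror the snippets above but are assumptions, not the real `router.ts`):

```typescript
// Self-contained sketch combining the four strategies above.
type ModelSpec = { provider: string; model: string };
type Strategy = 'preference' | 'complexity' | 'tier' | 'cost';

interface License {
  licenseType: 'free' | 'pro' | 'enterprise';
  preferredModel?: ModelSpec;
}

const CHEAPEST: ModelSpec = { provider: 'google', model: 'gemini-2.0-flash-exp' };

function selectModel(message: string, license: License, strategy: Strategy): ModelSpec {
  // An explicit user preference always wins.
  if (license.preferredModel) return license.preferredModel;

  switch (strategy) {
    case 'complexity': {
      const isComplex = message.includes('backtest') || message.length > 200;
      return isComplex
        ? { provider: 'anthropic', model: 'claude-3-opus' }
        : { provider: 'openai', model: 'gpt-4o-mini' };
    }
    case 'tier':
      if (license.licenseType === 'pro') return { provider: 'openai', model: 'gpt-4o' };
      if (license.licenseType === 'enterprise') return { provider: 'anthropic', model: 'claude-3-5-sonnet' };
      return CHEAPEST; // free tier
    case 'cost':
    default:
      return CHEAPEST;
  }
}
```

Keeping this as a pure function makes the routing policy trivially unit-testable, independent of any provider SDK.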
---
## When to Use What

### Simple Chat → Agent Harness
```typescript
// User: "What's the RSI on BTC?"
// → Fast streaming response via harness.streamMessage()
```

### Complex Analysis → LangGraph Workflow
```typescript
// User: "Analyze this strategy and backtest it"
// → Multi-step workflow: code review → backtest → risk → approval
```

### Direct Tool Call → MCP Client
```typescript
// User: "Get my watchlist"
// → Direct MCP tool call, no LLM needed
```
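Something in the gateway has to pick between these three paths. A keyword-based sketch of that dispatch (the hint lists are illustrative assumptions; the real dispatcher could just as well use an LLM classifier):

```typescript
// Illustrative dispatch heuristic; the keyword lists are
// assumptions, not the gateway's real routing rules.
type Route = 'harness' | 'workflow' | 'mcp-tool';

const WORKFLOW_HINTS = ['analyze', 'backtest', 'optimize'];
const DIRECT_TOOL_HINTS = ['get my watchlist', 'show my positions'];

function dispatch(message: string): Route {
  const text = message.toLowerCase();
  // Plain data fetches skip the LLM entirely.
  if (DIRECT_TOOL_HINTS.some((h) => text.includes(h))) return 'mcp-tool';
  // Multi-step analysis goes to a LangGraph workflow.
  if (WORKFLOW_HINTS.some((h) => text.includes(h))) return 'workflow';
  // Everything else: fast streaming chat via the harness.
  return 'harness';
}
```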
---
## Data Flow

```
User Message ("Analyze my strategy")
        ↓
Gateway → Route to workflow (not harness)
        ↓
LangGraph Workflow:
  ├─ Node 1: Code Review (Claude Opus)
  │    └─ Analyzes strategy code
  ├─ Node 2: Backtest (MCP tool call)
  │    └─ User's container runs backtest
  ├─ Node 3: Risk Assessment (Claude Sonnet)
  │    └─ Evaluates results
  ├─ Node 4: Human Approval (pause)
  │    └─ User reviews in UI
  └─ Node 5: Recommendation (GPT-4o-mini)
       └─ Final decision

Result → Return to user
```

---
## Benefits Summary

| Feature | LangChain.js | Vercel AI SDK | Direct Anthropic SDK |
|---------|--------------|---------------|----------------------|
| Multi-model | ✅ 300+ models | ✅ 100+ models | ❌ Anthropic only |
| Streaming | ✅ `.stream()` | ✅ `streamText()` | ✅ `.stream()` |
| Tool calling | ✅ Unified | ✅ Unified | ✅ Anthropic format |
| Complex workflows | ✅ LangGraph | ❌ Limited | ❌ DIY |
| Stateful agents | ✅ LangGraph | ❌ No | ❌ No |
| Human-in-the-loop | ✅ LangGraph | ❌ No | ❌ No |
| React hooks | ❌ N/A | ✅ `useChat()` | ❌ N/A |
| Bundle size | Large (~101 kB) | Small (~30 kB) | Medium (~60 kB) |
| **Dexorder needs** | **✅ Perfect fit** | **❌ Missing workflows** | **❌ Vendor lock-in** |

---
## Next Steps

1. **Implement tool calling** in the agent harness (bind MCP tools to LangChain)
2. **Add state persistence** for LangGraph (PostgreSQL checkpointer)
3. **Build more workflows**: market scanner, portfolio optimizer
4. **Add monitoring**: track model usage, costs, latency
5. **User container**: implement the Python MCP server with resources
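For step 2, the idea behind a checkpointer is to persist workflow state after every node so a crashed run can resume. A minimal in-memory sketch of that contract (simplified and illustrative — the real LangGraph checkpointer interface has more to it, and the map here stands in for a PostgreSQL table):

```typescript
// Simplified, illustrative checkpointer sketch; an in-memory map
// stands in for the PostgreSQL table. Not the real LangGraph API.
interface Checkpoint {
  threadId: string;                 // one workflow run
  node: string;                     // last completed node
  state: Record<string, unknown>;   // accumulated workflow state
}

class InMemoryCheckpointer {
  private store = new Map<string, Checkpoint>();

  // Persist after every node so a crashed run can resume.
  async put(cp: Checkpoint): Promise<void> {
    this.store.set(cp.threadId, cp);
  }

  // On restart, resume from the last completed node.
  async get(threadId: string): Promise<Checkpoint | undefined> {
    return this.store.get(threadId);
  }
}
```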