Gateway Architecture: LangChain.js + LangGraph
Why LangChain.js (Not Vercel AI SDK or Direct Anthropic SDK)?
The Decision
After evaluating Vercel AI SDK and LangChain.js, we chose LangChain.js + LangGraph for these reasons:
- Multi-model support: 300+ models via OpenRouter, plus direct integrations
- Complex workflows: LangGraph for stateful trading analysis pipelines
- No vendor lock-in: Switch between Anthropic, OpenAI, Google with one line
- Streaming: Same as Vercel AI SDK (.stream() method)
- Tool calling: Unified across all providers
- Trading-specific: State management, conditional branching, human-in-the-loop
We don't need Vercel AI SDK because:
- ❌ We use Vue (not React) - don't need React hooks
- ❌ We have Node.js servers (not edge) - don't need edge runtime
- ✅ DO need complex workflows (strategy analysis, backtesting, approvals)
- ✅ DO need stateful execution (resume from failures)
Architecture Layers
Layer 1: Model Abstraction (src/llm/)
Provider Factory (provider.ts)
const factory = new LLMProviderFactory(config, logger);
// Create any model
const claude = factory.createModel({
provider: 'anthropic',
model: 'claude-3-5-sonnet-20241022',
});
const gpt4 = factory.createModel({
provider: 'openai',
model: 'gpt-4o',
});
Model Router (router.ts)
const router = new ModelRouter(factory, logger);
// Intelligently route based on:
// - User license (free → Gemini Flash, pro → GPT-4, enterprise → Claude)
// - Query complexity (simple → cheap, complex → smart)
// - User preference (if set in license.preferredModel)
// - Cost optimization (always use cheapest)
const model = await router.route(
message.content,
userLicense,
RoutingStrategy.COMPLEXITY
);
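A minimal sketch of how these strategies might compose inside `route()`. The function name, heuristics, and thresholds below are illustrative assumptions, not the actual router implementation:

```typescript
// Illustrative routing sketch: preference > complexity > license tier.
// Model names and the complexity heuristic are assumptions.
type ModelChoice = { provider: string; model: string };

interface License {
  licenseType: 'free' | 'pro' | 'enterprise';
  preferredModel?: ModelChoice;
}

function routeModel(content: string, license: License): ModelChoice {
  // 1. An explicit user preference always wins
  if (license.preferredModel) return license.preferredModel;

  // 2. Complexity heuristic: long or analysis-heavy queries get a smart model
  const isComplex = content.length > 200 || /backtest|strategy/i.test(content);
  if (isComplex) return { provider: 'anthropic', model: 'claude-3-opus' };

  // 3. Otherwise fall back to the license tier
  switch (license.licenseType) {
    case 'pro':
      return { provider: 'openai', model: 'gpt-4o' };
    case 'enterprise':
      return { provider: 'anthropic', model: 'claude-3-5-sonnet' };
    default:
      return { provider: 'google', model: 'gemini-2.0-flash-exp' };
  }
}
```

The ordering matters: preference short-circuits everything else, so a pro user who pins Claude never gets bounced to GPT-4o by the tier rule.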
Layer 2: Agent Harness (src/harness/)
Stateless Orchestrator
The harness has ZERO conversation state. Everything lives in the user's MCP container.
Flow:
async handleMessage(message: InboundMessage) {
// 1. Fetch context from user's MCP (resources, not tools)
const resources = await mcpClient.listResources();
const context = await Promise.all([
mcpClient.readResource('context://user-profile'), // Trading style
mcpClient.readResource('context://conversation-summary'), // RAG summary
mcpClient.readResource('context://workspace-state'), // Current chart
mcpClient.readResource('context://system-prompt'), // Custom instructions
]);
// 2. Route to appropriate model
const model = await modelRouter.route(message, license);
// 3. Build messages with embedded context
const messages = buildLangChainMessages(systemPrompt, context, message);
// 4. Call LLM
const response = await model.invoke(messages);
// 5. Save to user's MCP (tool call)
await mcpClient.callTool('save_message', { role: 'user', content: message.content });
await mcpClient.callTool('save_message', { role: 'assistant', content: response.content });
return response;
}
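How the fetched resources end up in the prompt can be sketched as follows. This is simplified to plain `{ role, content }` objects; real code would build `SystemMessage`/`HumanMessage` instances from `@langchain/core/messages`, and the `<resource>` wrapper format is an assumption:

```typescript
// Simplified sketch: embed MCP resource contents into the system message
// so the LLM sees user profile, summary, etc. as read-only context.
type ChatMessage = { role: 'system' | 'user'; content: string };

function buildMessages(
  systemPrompt: string,
  context: Record<string, string>, // resource URI -> resource contents
  userMessage: string,
): ChatMessage[] {
  const contextBlock = Object.entries(context)
    .map(([uri, text]) => `<resource uri="${uri}">\n${text}\n</resource>`)
    .join('\n');
  return [
    { role: 'system', content: `${systemPrompt}\n\n${contextBlock}` },
    { role: 'user', content: userMessage },
  ];
}
```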
Streaming variant:
async *streamMessage(message: InboundMessage) {
const model = await modelRouter.route(message, license);
const messages = buildMessages(context, message);
const stream = await model.stream(messages);
let fullResponse = '';
for await (const chunk of stream) {
fullResponse += chunk.content;
yield chunk.content; // Stream to WebSocket/Telegram
}
// Save after streaming completes
await mcpClient.callTool('save_message', { /* ... */ });
}
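The accumulate-while-forwarding pattern in `streamMessage` can be isolated as a small helper (a sketch, assuming chunks arrive as plain strings):

```typescript
// Sketch: forward each streamed chunk to the caller while accumulating
// the full response, then hand the final text to a completion callback
// (e.g. the save_message MCP tool call after streaming ends).
async function* forwardAndAccumulate(
  chunks: AsyncIterable<string>,
  onDone: (full: string) => void,
): AsyncGenerator<string> {
  let full = '';
  for await (const chunk of chunks) {
    full += chunk;
    yield chunk; // stream to WebSocket/Telegram immediately
  }
  onDone(full); // persist only once the stream is complete
}
```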
Layer 3: Workflows (src/workflows/)
LangGraph for Complex Trading Analysis
// Example: Strategy Analysis Pipeline
const workflow = new StateGraph(StrategyAnalysisState)
.addNode('code_review', async (state) => {
const model = new ChatAnthropic({ model: 'claude-3-opus' });
const review = await model.invoke(`Review: ${state.strategyCode}`);
return { codeReview: review.content };
})
.addNode('backtest', async (state) => {
// Call user's MCP backtest tool
const results = await mcpClient.callTool('run_backtest', {
strategy: state.strategyCode,
ticker: state.ticker,
});
return { backtestResults: results };
})
.addNode('risk_assessment', async (state) => {
const model = new ChatAnthropic({ model: 'claude-3-5-sonnet' });
const assessment = await model.invoke(
`Analyze risk: ${JSON.stringify(state.backtestResults)}`
);
return { riskAssessment: assessment.content };
})
.addNode('human_approval', async (state) => {
// Pause for user review (human-in-the-loop)
return { humanApproved: await waitForUserApproval(state) };
})
// Wire the nodes together (START comes from @langchain/langgraph)
.addEdge(START, 'code_review')
.addEdge('code_review', 'backtest')
.addEdge('backtest', 'risk_assessment')
.addEdge('risk_assessment', 'human_approval')
.addConditionalEdges('human_approval', (state) => {
// 'deploy' and 'reject' nodes omitted for brevity
return state.humanApproved ? 'deploy' : 'reject';
})
.compile();
// Execute
const result = await workflow.invoke({
strategyCode: userCode,
ticker: 'BTC/USDT',
timeframe: '1h',
});
Benefits:
- Stateful: Resume if server crashes mid-analysis
- Conditional: Route based on results (if Sharpe > 2 → deploy, else → reject)
- Human-in-the-loop: Pause for user approval
- Multi-step: Each node can use different models
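The "if Sharpe > 2 → deploy" conditional routing mentioned above is just an ordinary edge function over the workflow state. A sketch (the state shape and threshold are assumptions):

```typescript
// Hypothetical conditional-edge function gating on backtest quality.
interface AnalysisState {
  backtestResults: { sharpeRatio: number };
  humanApproved: boolean;
}

function routeAfterRisk(state: AnalysisState): 'human_approval' | 'reject' {
  // Only strategies with strong risk-adjusted returns reach human review
  return state.backtestResults.sharpeRatio > 2 ? 'human_approval' : 'reject';
}
```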
User Context Architecture
MCP Resources (Not Tools)
User's MCP server exposes resources (read-only context):
context://user-profile → Trading style, preferences
context://conversation-summary → RAG-generated summary
context://workspace-state → Current chart, positions
context://system-prompt → User's custom AI instructions
Gateway fetches and embeds in LLM call:
const userProfile = await mcpClient.readResource('context://user-profile');
const conversationSummary = await mcpClient.readResource('context://conversation-summary');
// User's MCP server runs RAG search and returns summary
// Gateway embeds this in Claude/GPT prompt
Why resources, not tools?
- Resources = context injection (read-only)
- Tools = actions (write operations)
- Context should be fetched before LLM call, not during
Model Routing Strategies
1. User Preference
// User's license has preferred model
{
"preferredModel": {
"provider": "anthropic",
"model": "claude-3-5-sonnet-20241022"
}
}
// Router uses this if set
2. Complexity-Based
const isComplex = message.includes('backtest') || message.length > 200;
if (isComplex) {
return { provider: 'anthropic', model: 'claude-3-opus' }; // Smart
} else {
return { provider: 'openai', model: 'gpt-4o-mini' }; // Fast
}
3. License Tier
switch (license.licenseType) {
case 'free':
return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Cheap
case 'pro':
return { provider: 'openai', model: 'gpt-4o' }; // Balanced
case 'enterprise':
return { provider: 'anthropic', model: 'claude-3-5-sonnet' }; // Premium
}
4. Cost-Optimized
return { provider: 'google', model: 'gemini-2.0-flash-exp' }; // Always cheapest
When to Use What
Simple Chat → Agent Harness
// User: "What's the RSI on BTC?"
// → Fast streaming response via harness.streamMessage()
Complex Analysis → LangGraph Workflow
// User: "Analyze this strategy and backtest it"
// → Multi-step workflow: code review → backtest → risk → approval
Direct Tool Call → MCP Client
// User: "Get my watchlist"
// → Direct MCP tool call, no LLM needed
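Putting the three cases together, the gateway's dispatch decision might look like this. The regex heuristics are purely illustrative; a production dispatcher would likely use an intent classifier:

```typescript
// Illustrative dispatcher: pick the execution path for an inbound message.
type Route = 'harness' | 'workflow' | 'direct_tool';

function dispatch(message: string): Route {
  // Known tool-style commands skip the LLM entirely
  if (/^(get|show) my /i.test(message)) return 'direct_tool';
  // Multi-step analysis goes through a LangGraph workflow
  if (/analyz|backtest|strategy/i.test(message)) return 'workflow';
  // Everything else is simple chat via the stateless harness
  return 'harness';
}
```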
Data Flow
User Message ("Analyze my strategy")
↓
Gateway → Route to workflow (not harness)
↓
LangGraph Workflow:
├─ Node 1: Code Review (Claude Opus)
│ └─ Analyzes strategy code
├─ Node 2: Backtest (MCP tool call)
│ └─ User's container runs backtest
├─ Node 3: Risk Assessment (Claude Sonnet)
│ └─ Evaluates results
├─ Node 4: Human Approval (pause)
│ └─ User reviews in UI
└─ Node 5: Recommendation (GPT-4o-mini)
└─ Final decision
Result → Return to user
Benefits Summary
| Feature | LangChain.js | Vercel AI SDK | Direct Anthropic SDK |
|---|---|---|---|
| Multi-model | ✅ 300+ models | ✅ 100+ models | ❌ Anthropic only |
| Streaming | ✅ .stream() | ✅ streamText() | ✅ .stream() |
| Tool calling | ✅ Unified | ✅ Unified | ✅ Anthropic format |
| Complex workflows | ✅ LangGraph | ❌ Limited | ❌ DIY |
| Stateful agents | ✅ LangGraph | ❌ No | ❌ No |
| Human-in-the-loop | ✅ LangGraph | ❌ No | ❌ No |
| React hooks | ❌ N/A | ✅ useChat() | ❌ N/A |
| Bundle size | Large (101kb) | Small (30kb) | Medium (60kb) |
| Dexorder needs | ✅ Perfect fit | ❌ Missing workflows | ❌ Vendor lock-in |
Next Steps
- Implement tool calling in agent harness (bind MCP tools to LangChain)
- Add state persistence for LangGraph (PostgreSQL checkpointer)
- Build more workflows: market scanner, portfolio optimizer
- Add monitoring: Track model usage, costs, latency
- User container: Implement Python MCP server with resources
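For the monitoring item, a usage tracker could start as simply as the sketch below. The per-1K-token prices are placeholders, not real provider pricing:

```typescript
// Minimal sketch of per-model usage and cost tracking.
// Prices here are placeholder values; real numbers belong in config
// sourced from provider rate cards.
interface UsageRecord {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

class UsageTracker {
  private records: UsageRecord[] = [];
  // Placeholder $/1K-token rates, keyed by model
  private prices: Record<string, { input: number; output: number }> = {
    'gpt-4o': { input: 0.0025, output: 0.01 },
  };

  record(r: UsageRecord): void {
    this.records.push(r);
  }

  // Total estimated spend for one model across all recorded calls
  costFor(model: string): number {
    const price = this.prices[model];
    if (!price) return 0;
    return this.records
      .filter((r) => r.model === model)
      .reduce(
        (sum, r) =>
          sum +
          (r.inputTokens / 1000) * price.input +
          (r.outputTokens / 1000) * price.output,
        0,
      );
  }
}
```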