# User MCP Server - Resource Architecture
The user's MCP server container owns all conversation history, RAG, and contextual data. The platform gateway is a thin, stateless orchestrator that only holds the Anthropic API key.
## Architecture Principle
### User Container = Fat Context
- Conversation history (PostgreSQL/SQLite)
- RAG system (embeddings, vector search)
- User preferences and custom prompts
- Trading context (positions, watchlists, alerts)
- All user-specific data
### Platform Gateway = Thin Orchestrator
- Anthropic API key (platform pays for LLM)
- Session management (WebSocket/Telegram connections)
- MCP client connection pooling
- Tool routing (platform vs user tools)
- Zero conversation state stored
## MCP Resources for Context Injection
Resources are read-only data sources that provide context to the LLM. They're fetched before each Claude API call and embedded in the conversation.
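As a minimal sketch of that pre-call fetch (the client class and resource bodies below are stand-ins, not a real MCP client), the gateway might gather every context resource concurrently:

```python
import asyncio

# Stand-in for a real MCP client; read_resource() would normally go
# over the wire to the user's container.
class StubMCPClient:
    async def read_resource(self, uri: str) -> str:
        bodies = {
            "context://user-profile": "User Profile: ...",
            "context://system-prompt": "Custom Instructions: ...",
        }
        return bodies[uri]

CONTEXT_URIS = ["context://user-profile", "context://system-prompt"]

async def fetch_context_resources(client) -> dict[str, str]:
    # Fetch all context resources concurrently before the Claude call
    texts = await asyncio.gather(*(client.read_resource(u) for u in CONTEXT_URIS))
    return dict(zip(CONTEXT_URIS, texts))

context = asyncio.run(fetch_context_resources(StubMCPClient()))
```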
### Standard Context Resources
#### 1. `context://user-profile`

**Purpose:** User's trading background and preferences
**MIME Type:** `text/plain`
**Example Content:**

```text
User Profile:
- Trading experience: Intermediate
- Preferred timeframes: 1h, 4h, 1d
- Risk tolerance: Medium
- Focus: Swing trading with technical indicators
- Favorite indicators: RSI, MACD, Bollinger Bands
- Active pairs: BTC/USDT, ETH/USDT, SOL/USDT
```
**Implementation Notes:**
- Stored in user's database `user_preferences` table
- Updated via preference management tools
- Includes inferred data from usage patterns
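A sketch of a `build_user_profile()` helper rendering a preferences row into the text format above; the field names here are illustrative, not the actual `user_preferences` schema:

```python
def build_user_profile(prefs: dict) -> str:
    """Render a preferences row as the text/plain profile resource."""
    lines = [
        "User Profile:",
        f"- Trading experience: {prefs.get('experience', 'Unknown')}",
        f"- Preferred timeframes: {', '.join(prefs.get('timeframes', []))}",
        f"- Risk tolerance: {prefs.get('risk_tolerance', 'Unknown')}",
        f"- Active pairs: {', '.join(prefs.get('active_pairs', []))}",
    ]
    return "\n".join(lines)

profile = build_user_profile({
    "experience": "Intermediate",
    "timeframes": ["1h", "4h", "1d"],
    "risk_tolerance": "Medium",
    "active_pairs": ["BTC/USDT", "ETH/USDT"],
})
```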
#### 2. `context://conversation-summary`

**Purpose:** Semantic summary of recent conversation with RAG-enhanced context
**MIME Type:** `text/plain`
**Example Content:**
```text
Recent Conversation Summary:

Last 10 messages (summarized):
- User asked about moving average crossover strategies
- Discussed backtesting parameters for BTC/USDT
- Reviewed risk management with 2% position sizing
- Explored adding RSI filter to reduce false signals

Relevant past discussions (RAG search):
- 2 weeks ago: Similar strategy development on ETH/USDT
- 1 month ago: User prefers simple strategies over complex ones
- Past preference: Avoid strategies with >5 indicators

Current focus: Optimizing MA crossover with momentum filter
```
**Implementation Notes:**
- Last N messages stored in `conversation_history` table
- RAG search against embeddings of past conversations
- Semantic search using user's current message as query
- ChromaDB/pgvector for embedding storage
- Summary generated on-demand (can be cached for 1-5 minutes)
**RAG Integration:**

```python
async def get_conversation_summary() -> str:
    # Get recent messages
    recent = await db.get_recent_messages(limit=50)

    # Semantic search for relevant context
    relevant = await rag.search_conversation_history(
        query=recent[-1].content,  # Last user message
        limit=5,
        min_score=0.7,
    )

    # Build summary
    return build_summary(recent[-10:], relevant)
```
#### 3. `context://workspace-state`

**Purpose:** Current trading workspace (chart, positions, watchlist)
**MIME Type:** `application/json`
**Example Content:**
```json
{
  "currentChart": {
    "ticker": "BINANCE:BTC/USDT",
    "timeframe": "1h",
    "indicators": ["SMA(20)", "RSI(14)", "MACD(12,26,9)"]
  },
  "watchlist": ["BTC/USDT", "ETH/USDT", "SOL/USDT"],
  "openPositions": [
    {
      "ticker": "BTC/USDT",
      "side": "long",
      "size": 0.1,
      "entryPrice": 45000,
      "currentPrice": 46500,
      "unrealizedPnL": 150
    }
  ],
  "recentAlerts": [
    {
      "type": "price_alert",
      "message": "BTC/USDT crossed above $46,000",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ]
}
```
**Implementation Notes:**
- Synced from web client chart state
- Updated via WebSocket sync protocol
- Includes active indicators on current chart
- Position data from trading system
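A sketch of serializing the synced state into the resource body (field names follow the example above; the inputs would come from the WebSocket sync and the trading system):

```python
import json

def build_workspace_state(chart: dict, watchlist: list,
                          positions: list, alerts: list) -> str:
    """Serialize the workspace snapshot as the application/json resource."""
    return json.dumps(
        {
            "currentChart": chart,
            "watchlist": watchlist,
            "openPositions": positions,
            "recentAlerts": alerts,
        },
        indent=2,
    )

state = build_workspace_state(
    chart={"ticker": "BINANCE:BTC/USDT", "timeframe": "1h"},
    watchlist=["BTC/USDT", "ETH/USDT"],
    positions=[],
    alerts=[],
)
parsed = json.loads(state)
```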
#### 4. `context://system-prompt`

**Purpose:** User's custom instructions and preferences for AI behavior
**MIME Type:** `text/plain`
**Example Content:**
```text
Custom Instructions:
- Be concise and data-driven
- Always show risk/reward ratios
- Prefer simple strategies over complex ones
- When suggesting trades, include stop-loss and take-profit levels
- Explain your reasoning in trading decisions
```
**Implementation Notes:**
- User-editable in preferences UI
- Appended last to system prompt (highest priority)
- Can override platform defaults
- Stored in `user_preferences.custom_prompt` field
## MCP Tools for Actions
Tools are for actions that have side effects. These are not used for context fetching.
### Conversation Management
- `save_message(role, content, timestamp)` - Save message to history
- `search_conversation(query, limit)` - Explicit semantic search (for user queries like "what did we discuss about BTC?")
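A sketch of the storage side of `save_message`, using SQLite (the schema here is illustrative):

```python
import sqlite3

def init_history(conn: sqlite3.Connection) -> None:
    # Minimal history table; a real schema would add indexes and a user/session key
    conn.execute(
        """CREATE TABLE IF NOT EXISTS conversation_history
           (id INTEGER PRIMARY KEY, role TEXT, content TEXT, ts TEXT)"""
    )

def save_message(conn: sqlite3.Connection, role: str,
                 content: str, timestamp: str) -> int:
    """Persist one message; returns its row id."""
    cur = conn.execute(
        "INSERT INTO conversation_history (role, content, ts) VALUES (?, ?, ?)",
        (role, content, timestamp),
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
init_history(conn)
row_id = save_message(conn, "user", "What did we discuss about BTC?",
                      "2025-01-15T10:30:00Z")
```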
### Strategy & Indicators
- `list_strategies()` - List user's strategies
- `read_strategy(name)` - Get strategy code
- `write_strategy(name, code)` - Save strategy
- `run_backtest(strategy, params)` - Execute backtest
### Trading
- `get_watchlist()` - Get watchlist (action that may trigger sync)
- `execute_trade(params)` - Execute trade order
- `get_positions()` - Fetch current positions from exchange
### Sandbox
- `run_python(code)` - Execute Python code with data science libraries
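One way to sketch `run_python` is a separate interpreter process with a wall-clock timeout; a real sandbox would also need memory/CPU limits and filesystem/network isolation (e.g. the container boundary itself):

```python
import subprocess
import sys

def run_python(code: str, timeout: float = 5.0) -> str:
    """Execute code in a fresh interpreter, returning captured stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # kills runaway code after the deadline
    )
    return result.stdout

output = run_python("print(2 + 2)")
```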
## Gateway Harness Flow
```typescript
// gateway/src/harness/agent-harness.ts
async handleMessage(message: InboundMessage): Promise<OutboundMessage> {
  // 1. Fetch context resources from user's MCP
  const contextResources = await fetchContextResources([
    'context://user-profile',
    'context://conversation-summary', // <-- RAG happens here
    'context://workspace-state',
    'context://system-prompt',
  ]);

  // 2. Build system prompt from resources
  const systemPrompt = buildSystemPrompt(contextResources);

  // 3. Build messages with embedded conversation context
  const messages = buildMessages(message, contextResources);

  // 4. Get tools from MCP
  const tools = await mcpClient.listTools();

  // 5. Call Claude with embedded context
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    system: systemPrompt, // <-- User profile + workspace + custom prompt
    messages,             // <-- Conversation summary from RAG
    tools,
  });

  // 6. Save both turns to user's MCP (tool calls). Persist the
  //    assistant's text, not the raw API response object;
  //    extractText() (helper not shown) concatenates the text blocks.
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: extractText(response) });

  return response;
}
```
## User MCP Server Implementation (Python)
### Resource Handler
```python
# user-mcp/src/resources.py
from mcp.server import Server
from mcp.types import Resource, ResourceTemplate
import asyncpg

server = Server("dexorder-user")

@server.list_resources()
async def list_resources() -> list[Resource]:
    return [
        Resource(
            uri="context://user-profile",
            name="User Profile",
            description="Trading style, preferences, and background",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://conversation-summary",
            name="Conversation Summary",
            description="Recent conversation with RAG-enhanced context",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://workspace-state",
            name="Workspace State",
            description="Current chart, watchlist, positions",
            mimeType="application/json",
        ),
        Resource(
            uri="context://system-prompt",
            name="Custom System Prompt",
            description="User's custom AI instructions",
            mimeType="text/plain",
        ),
    ]

@server.read_resource()
async def read_resource(uri: str) -> str:
    if uri == "context://user-profile":
        return await build_user_profile()
    elif uri == "context://conversation-summary":
        return await build_conversation_summary()
    elif uri == "context://workspace-state":
        return await build_workspace_state()
    elif uri == "context://system-prompt":
        return await get_custom_prompt()
    else:
        raise ValueError(f"Unknown resource: {uri}")
```
### RAG Integration
```python
# user-mcp/src/rag.py
import chromadb
from sentence_transformers import SentenceTransformer

class ConversationRAG:
    def __init__(self, db_path: str):
        self.chroma = chromadb.PersistentClient(path=db_path)
        # Use cosine space so distances convert cleanly to similarity scores
        self.collection = self.chroma.get_or_create_collection(
            "conversations", metadata={"hnsw:space": "cosine"}
        )
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')

    async def search_conversation_history(
        self,
        query: str,
        limit: int = 5,
        min_score: float = 0.7,
    ) -> list[dict]:
        """Semantic search over conversation history"""
        # Embed query
        query_embedding = self.embedder.encode(query).tolist()

        # Search
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=limit,
        )

        # ChromaDB returns distances (lower = closer), so convert to a
        # similarity score before applying the threshold
        relevant = []
        for i, distance in enumerate(results['distances'][0]):
            score = 1.0 - distance  # cosine distance -> similarity
            if score >= min_score:
                relevant.append({
                    'content': results['documents'][0][i],
                    'metadata': results['metadatas'][0][i],
                    'score': score,
                })
        return relevant

    async def add_message(self, message_id: str, role: str, content: str, metadata: dict):
        """Add message to RAG index"""
        embedding = self.embedder.encode(content).tolist()
        self.collection.add(
            ids=[message_id],
            embeddings=[embedding],
            documents=[content],
            metadatas=[{'role': role, **metadata}],  # metadata carries timestamp etc.
        )
```
### Conversation Summary Builder
```python
# user-mcp/src/context.py
async def build_conversation_summary(user_id: str) -> str:
    """Build conversation summary with RAG"""
    # 1. Get recent messages (newest first)
    recent_messages = await db.get_messages(
        user_id=user_id,
        limit=50,
        order='desc',
    )

    # 2. Get current focus (most recent user message)
    last_user_msg = next(
        (m for m in recent_messages if m.role == 'user'),
        None,
    )
    if not last_user_msg:
        return "No recent conversation history."

    # 3. RAG search for relevant context
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    relevant_context = await rag.search_conversation_history(
        query=last_user_msg.content,
        limit=5,
        min_score=0.7,
    )

    # 4. Build summary
    summary = "Recent Conversation Summary:\n\n"

    # Last 10 messages, re-ordered oldest-to-newest for readability
    # (messages arrive newest-first because of order='desc')
    summary += "Last 10 messages:\n"
    for msg in reversed(recent_messages[:10]):
        summary += f"- {msg.role}: {msg.content[:100]}...\n"

    # Relevant past context
    if relevant_context:
        summary += "\nRelevant past discussions (RAG):\n"
        for ctx in relevant_context:
            timestamp = ctx['metadata'].get('timestamp', 'unknown')
            summary += f"- [{timestamp}] {ctx['content'][:150]}...\n"

    # Inferred focus
    summary += f"\nCurrent focus: {infer_topic(last_user_msg.content)}\n"
    return summary

def infer_topic(message: str) -> str:
    """Simple keyword-based topic extraction"""
    keywords = {
        'strategy': ['strategy', 'backtest', 'trading system'],
        'indicator': ['indicator', 'rsi', 'macd', 'moving average'],
        'analysis': ['analyze', 'chart', 'price action'],
        'risk': ['risk', 'position size', 'stop loss'],
    }
    message_lower = message.lower()
    for topic, words in keywords.items():
        if any(word in message_lower for word in words):
            return topic
    return 'general trading discussion'
```
## Benefits of This Architecture
- Privacy: Conversation history never leaves user's container
- Customization: Each user controls their RAG, embeddings, prompt engineering
- Scalability: Platform harness is stateless - horizontally scalable
- Cost Control: Platform pays for Claude, users pay for their compute/storage
- Portability: Users can export/migrate their entire context
- Development: Users can test prompts/context locally without platform involvement
## Future Enhancements
### Dynamic Resource URIs

Support parameterized resources:

```text
context://conversation/{session_id}
context://strategy/{strategy_name}
context://backtest/{backtest_id}/results
```
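Parameterized URIs like these can be matched with a small template compiler (a sketch, not part of any MCP SDK):

```python
import re

def compile_template(template: str) -> re.Pattern:
    """Turn 'context://strategy/{name}' into a regex with named groups."""
    regex, pos = "", 0
    for m in re.finditer(r"\{(\w+)\}", template):
        regex += re.escape(template[pos:m.start()])   # literal part
        regex += f"(?P<{m.group(1)}>[^/]+)"           # one path segment
        pos = m.end()
    return re.compile(regex + re.escape(template[pos:]) + "$")

def match_resource(uri: str, template: str):
    """Return the extracted parameters, or None if the URI doesn't match."""
    m = compile_template(template).match(uri)
    return m.groupdict() if m else None

params = match_resource("context://backtest/bt-42/results",
                        "context://backtest/{backtest_id}/results")
```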
### Resource Templates

MCP supports resource templates for dynamic discovery:

```python
@server.list_resource_templates()
async def list_templates() -> list[ResourceTemplate]:
    return [
        ResourceTemplate(
            uriTemplate="context://strategy/{name}",
            name="Strategy Context",
            description="Context for specific strategy",
        )
    ]
```
### Streaming Resources

For large context (e.g., full backtest results), support streaming:

```python
@server.read_resource()
async def read_resource(uri: str) -> AsyncIterator[str]:
    if uri.startswith("context://backtest/"):
        async for chunk in stream_backtest_results(uri):
            yield chunk
```
## Migration Path
For users with existing conversation history in platform DB:
1. **Export script**: Migrate platform history → user container DB
2. **RAG indexing**: Embed all historical messages into ChromaDB
3. **Preference migration**: Copy user preferences to container
4. **Cutover**: Switch to resource-based context fetching
Platform can keep read-only archive for compliance, but active context lives in user container.
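The export step can be sketched with SQLite on both ends (the `messages` table and its columns are illustrative, not the actual platform schema):

```python
import sqlite3

def migrate_history(platform: sqlite3.Connection,
                    user: sqlite3.Connection) -> int:
    """Copy platform-side messages into the user container's history table."""
    user.execute(
        """CREATE TABLE IF NOT EXISTS conversation_history
           (id INTEGER PRIMARY KEY, role TEXT, content TEXT, ts TEXT)"""
    )
    rows = platform.execute(
        "SELECT role, content, ts FROM messages ORDER BY ts"
    ).fetchall()
    user.executemany(
        "INSERT INTO conversation_history (role, content, ts) VALUES (?, ?, ?)",
        rows,
    )
    user.commit()
    return len(rows)

# Demo with in-memory databases standing in for both sides
platform_db = sqlite3.connect(":memory:")
platform_db.execute("CREATE TABLE messages (role TEXT, content TEXT, ts TEXT)")
platform_db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("user", "hi", "2025-01-01T00:00:00Z"),
     ("assistant", "hello", "2025-01-01T00:00:05Z")],
)
user_db = sqlite3.connect(":memory:")
migrated = migrate_history(platform_db, user_db)
```

After migration, step 2 would feed the same rows through `ConversationRAG.add_message` to build the embedding index.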