# User MCP Server - Resource Architecture

The user's MCP server container owns **all** conversation history, RAG, and contextual data. The platform gateway is a thin, stateless orchestrator that only holds the Anthropic API key.

## Architecture Principle

**User Container = Fat Context**
- Conversation history (PostgreSQL/SQLite)
- RAG system (embeddings, vector search)
- User preferences and custom prompts
- Trading context (positions, watchlists, alerts)
- All user-specific data

**Platform Gateway = Thin Orchestrator**
- Anthropic API key (platform pays for LLM)
- Session management (WebSocket/Telegram connections)
- MCP client connection pooling
- Tool routing (platform vs user tools)
- **Zero conversation state stored**

## MCP Resources for Context Injection

Resources are **read-only** data sources that provide context to the LLM. They're fetched before each Claude API call and embedded in the conversation.

### Standard Context Resources

#### 1. `context://user-profile`

**Purpose:** User's trading background and preferences

**MIME Type:** `text/plain`

**Example Content:**
```
User Profile:
- Trading experience: Intermediate
- Preferred timeframes: 1h, 4h, 1d
- Risk tolerance: Medium
- Focus: Swing trading with technical indicators
- Favorite indicators: RSI, MACD, Bollinger Bands
- Active pairs: BTC/USDT, ETH/USDT, SOL/USDT
```

**Implementation Notes:**
- Stored in user's database `user_preferences` table
- Updated via preference management tools
- Includes inferred data from usage patterns

---

#### 2. `context://conversation-summary`

**Purpose:** Semantic summary of recent conversation with RAG-enhanced context

**MIME Type:** `text/plain`

**Example Content:**
```
Recent Conversation Summary:

Last 10 messages (summarized):
- User asked about moving average crossover strategies
- Discussed backtesting parameters for BTC/USDT
- Reviewed risk management with 2% position sizing
- Explored adding RSI filter to reduce false signals

Relevant past discussions (RAG search):
- 2 weeks ago: Similar strategy development on ETH/USDT
- 1 month ago: User prefers simple strategies over complex ones
- Past preference: Avoid strategies with >5 indicators

Current focus: Optimizing MA crossover with momentum filter
```

**Implementation Notes:**
- Last N messages stored in `conversation_history` table
- RAG search against embeddings of past conversations
- Semantic search using user's current message as query
- ChromaDB/pgvector for embedding storage
- Summary generated on-demand (can be cached for 1-5 minutes)

**RAG Integration:**
```python
async def get_conversation_summary() -> str:
    # Get recent messages
    recent = await db.get_recent_messages(limit=50)

    # Semantic search for relevant context
    relevant = await rag.search_conversation_history(
        query=recent[-1].content,  # Last user message
        limit=5,
        min_score=0.7
    )

    # Build summary
    return build_summary(recent[-10:], relevant)
```

---

#### 3. `context://workspace-state`

**Purpose:** Current trading workspace (chart, positions, watchlist)

**MIME Type:** `application/json`

**Example Content:**
```json
{
  "currentChart": {
    "ticker": "BINANCE:BTC/USDT",
    "timeframe": "1h",
    "indicators": ["SMA(20)", "RSI(14)", "MACD(12,26,9)"]
  },
  "watchlist": ["BTC/USDT", "ETH/USDT", "SOL/USDT"],
  "openPositions": [
    {
      "ticker": "BTC/USDT",
      "side": "long",
      "size": 0.1,
      "entryPrice": 45000,
      "currentPrice": 46500,
      "unrealizedPnL": 150
    }
  ],
  "recentAlerts": [
    {
      "type": "price_alert",
      "message": "BTC/USDT crossed above $46,000",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ]
}
```

**Implementation Notes:**
- Synced from web client chart state
- Updated via WebSocket sync protocol
- Includes active indicators on current chart
- Position data from trading system

---

#### 4. `context://system-prompt`

**Purpose:** User's custom instructions and preferences for AI behavior

**MIME Type:** `text/plain`

**Example Content:**
```
Custom Instructions:
- Be concise and data-driven
- Always show risk/reward ratios
- Prefer simple strategies over complex ones
- When suggesting trades, include stop-loss and take-profit levels
- Explain your reasoning in trading decisions
```

**Implementation Notes:**
- User-editable in preferences UI
- Appended **last** to system prompt (highest priority)
- Can override platform defaults
- Stored in `user_preferences.custom_prompt` field

---

## MCP Tools for Actions

Tools are for **actions** that have side effects. These are **not** used for context fetching.
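
To make the resource/tool split concrete, here is a minimal sketch of a side-effecting tool handler next to a read-only query. It assumes the SQLite backend and a `conversation_history` table; the column layout, the `open_history_db` and `recent_messages` helpers, and the return values are illustrative assumptions, not the server's actual implementation.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

# Hypothetical schema; the real conversation_history table may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS conversation_history (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    role      TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content   TEXT NOT NULL,
    timestamp TEXT NOT NULL
)
"""


def open_history_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_message(conn: sqlite3.Connection, role: str, content: str,
                 timestamp: Optional[str] = None) -> int:
    """Tool-style action: persists a message and returns its row id."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO conversation_history (role, content, timestamp) "
            "VALUES (?, ?, ?)",
            (role, content, timestamp),
        )
    return cur.lastrowid


def recent_messages(conn: sqlite3.Connection, limit: int = 10) -> list:
    """Resource-style read: no side effects, returns rows oldest to newest."""
    rows = conn.execute(
        "SELECT role, content FROM conversation_history "
        "ORDER BY id DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return rows[::-1]
```

Only `save_message` changes state, which is why it belongs behind a tool call, while a query like `recent_messages` is the kind of read that a resource such as `context://conversation-summary` would wrap.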

### Conversation Management
- `save_message(role, content, timestamp)` - Save message to history
- `search_conversation(query, limit)` - Explicit semantic search (for user queries like "what did we discuss about BTC?")

### Strategy & Indicators
- `list_strategies()` - List user's strategies
- `read_strategy(name)` - Get strategy code
- `write_strategy(name, code)` - Save strategy
- `run_backtest(strategy, params)` - Execute backtest

### Trading
- `get_watchlist()` - Get watchlist (action that may trigger sync)
- `execute_trade(params)` - Execute trade order
- `get_positions()` - Fetch current positions from exchange

### Sandbox
- `run_python(code)` - Execute Python code with data science libraries

---

## Gateway Harness Flow

```typescript
// gateway/src/harness/agent-harness.ts

async handleMessage(message: InboundMessage) {
  // 1. Fetch context resources from user's MCP
  const contextResources = await fetchContextResources([
    'context://user-profile',
    'context://conversation-summary',  // <-- RAG happens here
    'context://workspace-state',
    'context://system-prompt',
  ]);

  // 2. Build system prompt from resources
  const systemPrompt = buildSystemPrompt(contextResources);

  // 3. Build messages with embedded conversation context
  const messages = buildMessages(message, contextResources);

  // 4. Get tools from MCP
  const tools = await mcpClient.listTools();

  // 5. Call Claude with embedded context
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 4096,      // required by the Messages API; value is illustrative
    system: systemPrompt,  // <-- User profile + workspace + custom prompt
    messages,              // <-- Conversation summary from RAG
    tools,
  });

  // 6. Save to user's MCP (tool call). Store the assistant's text blocks,
  //    not the whole response object.
  const assistantText = response.content
    .flatMap((block) => (block.type === 'text' ? [block.text] : []))
    .join('');
  await mcpClient.callTool('save_message', {
    role: 'user',
    content: message.content
  });
  await mcpClient.callTool('save_message', {
    role: 'assistant',
    content: assistantText
  });

  return response;
}
```

---

## User MCP Server Implementation (Python)

### Resource Handler

```python
# user-mcp/src/resources.py
import os

from mcp.server import Server
from mcp.types import Resource, ResourceTemplate
from pydantic import AnyUrl
import asyncpg

server = Server("dexorder-user")

# Assumption: one container serves one user, identified by a
# (hypothetical) USER_ID environment variable.
USER_ID = os.environ["USER_ID"]

@server.list_resources()
async def list_resources() -> list[Resource]:
    return [
        Resource(
            uri="context://user-profile",
            name="User Profile",
            description="Trading style, preferences, and background",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://conversation-summary",
            name="Conversation Summary",
            description="Recent conversation with RAG-enhanced context",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://workspace-state",
            name="Workspace State",
            description="Current chart, watchlist, positions",
            mimeType="application/json",
        ),
        Resource(
            uri="context://system-prompt",
            name="Custom System Prompt",
            description="User's custom AI instructions",
            mimeType="text/plain",
        ),
    ]

@server.read_resource()
async def read_resource(uri: AnyUrl) -> str:
    # The SDK passes a pydantic AnyUrl, so compare against its string form
    uri_str = str(uri)
    if uri_str == "context://user-profile":
        return await build_user_profile()
    elif uri_str == "context://conversation-summary":
        return await build_conversation_summary(USER_ID)
    elif uri_str == "context://workspace-state":
        return await build_workspace_state()
    elif uri_str == "context://system-prompt":
        return await get_custom_prompt()
    else:
        raise ValueError(f"Unknown resource: {uri_str}")
```

### RAG Integration

```python
# user-mcp/src/rag.py
import chromadb
from sentence_transformers import SentenceTransformer

class ConversationRAG:
    def __init__(self, db_path: str):
        self.chroma = chromadb.PersistentClient(path=db_path)
        # Use cosine distance so that similarity = 1 - distance
        self.collection = self.chroma.get_or_create_collection(
            "conversations",
            metadata={"hnsw:space": "cosine"},
        )
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')

    async def search_conversation_history(
        self,
        query: str,
        limit: int = 5,
        min_score: float = 0.7
    ) -> list[dict]:
        """Semantic search over conversation history"""
        # Embed query
        query_embedding = self.embedder.encode(query).tolist()

        # Search
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=limit,
        )

        # Chroma returns distances (lower = closer); convert each one to a
        # similarity score before applying the min_score threshold
        relevant = []
        for i, distance in enumerate(results['distances'][0]):
            score = 1.0 - distance
            if score >= min_score:
                relevant.append({
                    'content': results['documents'][0][i],
                    'metadata': results['metadatas'][0][i],
                    'score': score,
                })

        return relevant

    async def add_message(self, message_id: str, role: str, content: str, metadata: dict):
        """Add message to RAG index"""
        embedding = self.embedder.encode(content).tolist()

        self.collection.add(
            ids=[message_id],
            embeddings=[embedding],
            documents=[content],
            # Caller metadata (including any timestamp) plus the role;
            # avoids writing a None timestamp, which Chroma rejects
            metadatas=[{**metadata, 'role': role}]
        )
```

### Conversation Summary Builder

```python
# user-mcp/src/context.py
from rag import ConversationRAG  # assumes rag.py is importable as a sibling module

# `db` is the container's database handle, defined elsewhere

async def build_conversation_summary(user_id: str) -> str:
    """Build conversation summary with RAG"""

    # 1. Get recent messages (newest first)
    recent_messages = await db.get_messages(
        user_id=user_id,
        limit=50,
        order='desc'
    )

    # 2. Get current focus (most recent user message)
    last_user_msg = next(
        (m for m in recent_messages if m.role == 'user'),
        None
    )

    if not last_user_msg:
        return "No recent conversation history."

    # 3. RAG search for relevant context
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    relevant_context = await rag.search_conversation_history(
        query=last_user_msg.content,
        limit=5,
        min_score=0.7
    )

    # 4. Build summary
    summary = "Recent Conversation Summary:\n\n"

    # Ten most recent messages, listed oldest to newest
    # (the query returns newest first, so slice the head and reverse it)
    summary += "Last 10 messages:\n"
    for msg in reversed(recent_messages[:10]):
        summary += f"- {msg.role}: {msg.content[:100]}...\n"

    # Relevant past context
    if relevant_context:
        summary += "\nRelevant past discussions (RAG):\n"
        for ctx in relevant_context:
            timestamp = ctx['metadata'].get('timestamp', 'unknown')
            summary += f"- [{timestamp}] {ctx['content'][:150]}...\n"

    # Inferred focus
    summary += f"\nCurrent focus: {infer_topic(last_user_msg.content)}\n"

    return summary

def infer_topic(message: str) -> str:
    """Simple keyword-based topic extraction"""
    keywords = {
        'strategy': ['strategy', 'backtest', 'trading system'],
        'indicator': ['indicator', 'rsi', 'macd', 'moving average'],
        'analysis': ['analyze', 'chart', 'price action'],
        'risk': ['risk', 'position size', 'stop loss'],
    }

    message_lower = message.lower()
    # First topic with a matching keyword wins (dict order is the priority)
    for topic, words in keywords.items():
        if any(word in message_lower for word in words):
            return topic

    return 'general trading discussion'
```

---

## Benefits of This Architecture

1. **Privacy**: Conversation history is stored only in the user's container; the gateway passes context through on each call but persists none of it
2. **Customization**: Each user controls their RAG, embeddings, and prompt engineering
3. **Scalability**: The platform harness is stateless and therefore horizontally scalable
4. **Cost Control**: Platform pays for Claude, users pay for their compute/storage
5. **Portability**: Users can export/migrate their entire context
6. **Development**: Users can test prompts/context locally without platform involvement

---

## Future Enhancements

### Dynamic Resource URIs

Support parameterized resources:
```
context://conversation/{session_id}
context://strategy/{strategy_name}
context://backtest/{backtest_id}/results
```

### Resource Templates

MCP supports resource templates for dynamic discovery:
```python
@server.list_resource_templates()
async def list_templates() -> list[ResourceTemplate]:
    return [
        ResourceTemplate(
            uriTemplate="context://strategy/{name}",
            name="Strategy Context",
            description="Context for specific strategy",
        )
    ]
```

### Streaming Resources

For large context (e.g., full backtest results), support streaming:
```python
from typing import AsyncIterator

# Proposed shape only: assumes the server can consume an async generator
# from a read handler, which the handler shown earlier does not do.
@server.read_resource()
async def read_resource(uri: str) -> AsyncIterator[str]:
    if str(uri).startswith("context://backtest/"):
        async for chunk in stream_backtest_results(uri):
            yield chunk
```

---

## Migration Path

For users with existing conversation history in platform DB:

1. **Export script**: Migrate platform history → user container DB
2. **RAG indexing**: Embed all historical messages into ChromaDB
3. **Preference migration**: Copy user preferences to container
4. **Cutover**: Switch to resource-based context fetching

The platform can keep a read-only archive for compliance, but active context lives in the user container.
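
The first two migration steps could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual export script: both databases are SQLite here, the `messages` and `conversation_history` schemas are hypothetical, and `index_message` is a placeholder for whatever embeds a message into the RAG index (for example, a thin synchronous wrapper around `ConversationRAG.add_message`).

```python
import sqlite3
from typing import Callable


def migrate_history(
    platform: sqlite3.Connection,
    container: sqlite3.Connection,
    user_id: str,
    index_message: Callable[[str, str, str, dict], None],
) -> int:
    """Copy one user's messages into the container DB and index each new one.

    Re-running is safe: rows that were already copied are skipped via the
    primary key, so they are neither duplicated nor re-indexed.
    Returns the number of newly migrated messages.
    """
    # Hypothetical target schema for the user container
    container.execute(
        """CREATE TABLE IF NOT EXISTS conversation_history (
               message_id TEXT PRIMARY KEY,
               role       TEXT NOT NULL,
               content    TEXT NOT NULL,
               timestamp  TEXT NOT NULL
           )"""
    )

    # Hypothetical source schema on the platform side
    rows = platform.execute(
        "SELECT message_id, role, content, timestamp FROM messages "
        "WHERE user_id = ? ORDER BY timestamp",
        (user_id,),
    ).fetchall()

    migrated = 0
    for message_id, role, content, timestamp in rows:
        with container:  # rolls the insert back if indexing raises
            cur = container.execute(
                "INSERT OR IGNORE INTO conversation_history "
                "VALUES (?, ?, ?, ?)",
                (message_id, role, content, timestamp),
            )
            if cur.rowcount == 1:  # newly inserted, so it still needs indexing
                index_message(message_id, role, content, {"timestamp": timestamp})
                migrated += 1
    return migrated
```

Keeping the insert and the indexing call in one transaction means a message is only marked as migrated once it has also been embedded, so a failed run can simply be restarted before cutover.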