# User MCP Server - Resource Architecture
The user's MCP server container owns all conversation history, RAG, and contextual data. The platform gateway is a thin, stateless orchestrator that only holds the Anthropic API key.
## Architecture Principle
### User Container = Fat Context
- Conversation history (PostgreSQL/SQLite)
- RAG system (embeddings, vector search)
- User preferences and custom prompts
- Trading context (positions, watchlists, alerts)
- All user-specific data
### Platform Gateway = Thin Orchestrator
- Anthropic API key (platform pays for LLM)
- Session management (WebSocket/Telegram connections)
- MCP client connection pooling
- Tool routing (platform vs user tools)
- Zero conversation state stored
## MCP Resources for Context Injection
Resources are read-only data sources that provide context to the LLM. They're fetched before each Claude API call and embedded in the conversation.
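As a minimal sketch of that pre-call fetch (the client class and resource bodies below are stand-ins, not a real MCP client), the gateway might gather every context resource concurrently:

```python
import asyncio

# Stand-in for a real MCP client; read_resource() would normally go
# over the wire to the user's container.
class StubMCPClient:
    async def read_resource(self, uri: str) -> str:
        bodies = {
            "context://user-profile": "User Profile: ...",
            "context://system-prompt": "Custom Instructions: ...",
        }
        return bodies[uri]

CONTEXT_URIS = ["context://user-profile", "context://system-prompt"]

async def fetch_context_resources(client) -> dict[str, str]:
    # Fetch all context resources concurrently before the Claude call
    texts = await asyncio.gather(*(client.read_resource(u) for u in CONTEXT_URIS))
    return dict(zip(CONTEXT_URIS, texts))

context = asyncio.run(fetch_context_resources(StubMCPClient()))
```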
### Standard Context Resources
#### 1. `context://user-profile`

**Purpose:** User's trading background and preferences
**MIME Type:** `text/plain`
**Example Content:**

```text
User Profile:
- Trading experience: Intermediate
- Preferred timeframes: 1h, 4h, 1d
- Risk tolerance: Medium
- Focus: Swing trading with technical indicators
- Favorite indicators: RSI, MACD, Bollinger Bands
- Active pairs: BTC/USDT, ETH/USDT, SOL/USDT
```
**Implementation Notes:**
- Stored in user's database `user_preferences` table
- Updated via preference management tools
- Includes inferred data from usage patterns
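A sketch of a `build_user_profile()` helper rendering a preferences row into the text format above; the field names here are illustrative, not the actual `user_preferences` schema:

```python
def build_user_profile(prefs: dict) -> str:
    """Render a preferences row as the text/plain profile resource."""
    lines = [
        "User Profile:",
        f"- Trading experience: {prefs.get('experience', 'Unknown')}",
        f"- Preferred timeframes: {', '.join(prefs.get('timeframes', []))}",
        f"- Risk tolerance: {prefs.get('risk_tolerance', 'Unknown')}",
        f"- Active pairs: {', '.join(prefs.get('active_pairs', []))}",
    ]
    return "\n".join(lines)

profile = build_user_profile({
    "experience": "Intermediate",
    "timeframes": ["1h", "4h", "1d"],
    "risk_tolerance": "Medium",
    "active_pairs": ["BTC/USDT", "ETH/USDT"],
})
```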
#### 2. `context://conversation-summary`

**Purpose:** Semantic summary of recent conversation with RAG-enhanced context
**MIME Type:** `text/plain`
**Example Content:**
```text
Recent Conversation Summary:

Last 10 messages (summarized):
- User asked about moving average crossover strategies
- Discussed backtesting parameters for BTC/USDT
- Reviewed risk management with 2% position sizing
- Explored adding RSI filter to reduce false signals

Relevant past discussions (RAG search):
- 2 weeks ago: Similar strategy development on ETH/USDT
- 1 month ago: User prefers simple strategies over complex ones
- Past preference: Avoid strategies with >5 indicators

Current focus: Optimizing MA crossover with momentum filter
```
**Implementation Notes:**
- Last N messages stored in `conversation_history` table
- RAG search against embeddings of past conversations
- Semantic search using user's current message as query
- ChromaDB/pgvector for embedding storage
- Summary generated on-demand (can be cached for 1-5 minutes)
**RAG Integration:**

```python
async def get_conversation_summary() -> str:
    # Get recent messages
    recent = await db.get_recent_messages(limit=50)

    # Semantic search for relevant context
    relevant = await rag.search_conversation_history(
        query=recent[-1].content,  # Last user message
        limit=5,
        min_score=0.7,
    )

    # Build summary
    return build_summary(recent[-10:], relevant)
```
#### 3. `context://workspace-state`

**Purpose:** Current trading workspace (chart, positions, watchlist)
**MIME Type:** `application/json`
**Example Content:**
```json
{
  "currentChart": {
    "ticker": "BINANCE:BTC/USDT",
    "timeframe": "1h",
    "indicators": ["SMA(20)", "RSI(14)", "MACD(12,26,9)"]
  },
  "watchlist": ["BTC/USDT", "ETH/USDT", "SOL/USDT"],
  "openPositions": [
    {
      "ticker": "BTC/USDT",
      "side": "long",
      "size": 0.1,
      "entryPrice": 45000,
      "currentPrice": 46500,
      "unrealizedPnL": 150
    }
  ],
  "recentAlerts": [
    {
      "type": "price_alert",
      "message": "BTC/USDT crossed above $46,000",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ]
}
```
**Implementation Notes:**
- Synced from web client chart state
- Updated via WebSocket sync protocol
- Includes active indicators on current chart
- Position data from trading system
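A sketch of serializing the synced state into the resource body (field names follow the example above; the inputs would come from the WebSocket sync and the trading system):

```python
import json

def build_workspace_state(chart: dict, watchlist: list,
                          positions: list, alerts: list) -> str:
    """Serialize the workspace snapshot as the application/json resource."""
    return json.dumps(
        {
            "currentChart": chart,
            "watchlist": watchlist,
            "openPositions": positions,
            "recentAlerts": alerts,
        },
        indent=2,
    )

state = build_workspace_state(
    chart={"ticker": "BINANCE:BTC/USDT", "timeframe": "1h"},
    watchlist=["BTC/USDT", "ETH/USDT"],
    positions=[],
    alerts=[],
)
parsed = json.loads(state)
```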
#### 4. `context://system-prompt`

**Purpose:** User's custom instructions and preferences for AI behavior
**MIME Type:** `text/plain`
**Example Content:**
```text
Custom Instructions:
- Be concise and data-driven
- Always show risk/reward ratios
- Prefer simple strategies over complex ones
- When suggesting trades, include stop-loss and take-profit levels
- Explain your reasoning in trading decisions
```
**Implementation Notes:**
- User-editable in preferences UI
- Appended last to system prompt (highest priority)
- Can override platform defaults
- Stored in `user_preferences.custom_prompt` field
## MCP Tools for Actions
Tools are for actions that have side effects. These are not used for context fetching.
### Conversation Management
- `save_message(role, content, timestamp)` - Save message to history
- `search_conversation(query, limit)` - Explicit semantic search (for user queries like "what did we discuss about BTC?")
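A sketch of the storage side of `save_message`, using SQLite (the schema here is illustrative):

```python
import sqlite3

def init_history(conn: sqlite3.Connection) -> None:
    # Minimal history table; a real schema would add indexes and a user/session key
    conn.execute(
        """CREATE TABLE IF NOT EXISTS conversation_history
           (id INTEGER PRIMARY KEY, role TEXT, content TEXT, ts TEXT)"""
    )

def save_message(conn: sqlite3.Connection, role: str,
                 content: str, timestamp: str) -> int:
    """Persist one message; returns its row id."""
    cur = conn.execute(
        "INSERT INTO conversation_history (role, content, ts) VALUES (?, ?, ?)",
        (role, content, timestamp),
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
init_history(conn)
row_id = save_message(conn, "user", "What did we discuss about BTC?",
                      "2025-01-15T10:30:00Z")
```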
### Strategy & Indicators
- `list_strategies()` - List user's strategies
- `read_strategy(name)` - Get strategy code
- `write_strategy(name, code)` - Save strategy
- `run_backtest(strategy, params)` - Execute backtest
### Trading
- `get_watchlist()` - Get watchlist (action that may trigger sync)
- `execute_trade(params)` - Execute trade order
- `get_positions()` - Fetch current positions from exchange
### Sandbox
- `run_python(code)` - Execute Python code with data science libraries
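One way to sketch `run_python` is a separate interpreter process with a wall-clock timeout; a real sandbox would also need memory/CPU limits and filesystem/network isolation (e.g. the container boundary itself):

```python
import subprocess
import sys

def run_python(code: str, timeout: float = 5.0) -> str:
    """Execute code in a fresh interpreter, returning captured stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # kills runaway code after the deadline
    )
    return result.stdout

output = run_python("print(2 + 2)")
```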
## Gateway Harness Flow
```typescript
// gateway/src/harness/agent-harness.ts
async handleMessage(message: InboundMessage): Promise<OutboundMessage> {
  // 1. Fetch context resources from user's MCP
  const contextResources = await fetchContextResources([
    'context://user-profile',
    'context://conversation-summary', // <-- RAG happens here
    'context://workspace-state',
    'context://system-prompt',
  ]);

  // 2. Build system prompt from resources
  const systemPrompt = buildSystemPrompt(contextResources);

  // 3. Build messages with embedded conversation context
  const messages = buildMessages(message, contextResources);

  // 4. Get tools from MCP
  const tools = await mcpClient.listTools();

  // 5. Call Claude with embedded context
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    system: systemPrompt, // <-- User profile + workspace + custom prompt
    messages,             // <-- Conversation summary from RAG
    tools,
  });

  // 6. Save both turns to user's MCP (tool calls). Persist the
  //    assistant's text, not the raw API response object;
  //    extractText() (helper not shown) concatenates the text blocks.
  await mcpClient.callTool('save_message', { role: 'user', content: message.content });
  await mcpClient.callTool('save_message', { role: 'assistant', content: extractText(response) });

  return response;
}
```
## User MCP Server Implementation (Python)
### Resource Handler
```python
# user-mcp/src/resources.py
from mcp.server import Server
from mcp.types import Resource, ResourceTemplate
import asyncpg

server = Server("dexorder-user")

@server.list_resources()
async def list_resources() -> list[Resource]:
    return [
        Resource(
            uri="context://user-profile",
            name="User Profile",
            description="Trading style, preferences, and background",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://conversation-summary",
            name="Conversation Summary",
            description="Recent conversation with RAG-enhanced context",
            mimeType="text/plain",
        ),
        Resource(
            uri="context://workspace-state",
            name="Workspace State",
            description="Current chart, watchlist, positions",
            mimeType="application/json",
        ),
        Resource(
            uri="context://system-prompt",
            name="Custom System Prompt",
            description="User's custom AI instructions",
            mimeType="text/plain",
        ),
    ]

@server.read_resource()
async def read_resource(uri: str) -> str:
    if uri == "context://user-profile":
        return await build_user_profile()
    elif uri == "context://conversation-summary":
        return await build_conversation_summary()
    elif uri == "context://workspace-state":
        return await build_workspace_state()
    elif uri == "context://system-prompt":
        return await get_custom_prompt()
    else:
        raise ValueError(f"Unknown resource: {uri}")
```
### RAG Integration
```python
# user-mcp/src/rag.py
import chromadb
from sentence_transformers import SentenceTransformer

class ConversationRAG:
    def __init__(self, db_path: str):
        self.chroma = chromadb.PersistentClient(path=db_path)
        # Use cosine space so distances convert cleanly to similarity scores
        self.collection = self.chroma.get_or_create_collection(
            "conversations", metadata={"hnsw:space": "cosine"}
        )
        self.embedder = SentenceTransformer('all-MiniLM-L6-v2')

    async def search_conversation_history(
        self,
        query: str,
        limit: int = 5,
        min_score: float = 0.7,
    ) -> list[dict]:
        """Semantic search over conversation history"""
        # Embed query
        query_embedding = self.embedder.encode(query).tolist()

        # Search
        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=limit,
        )

        # ChromaDB returns distances (lower = closer), so convert to a
        # similarity score before applying the threshold
        relevant = []
        for i, distance in enumerate(results['distances'][0]):
            score = 1.0 - distance  # cosine distance -> similarity
            if score >= min_score:
                relevant.append({
                    'content': results['documents'][0][i],
                    'metadata': results['metadatas'][0][i],
                    'score': score,
                })
        return relevant

    async def add_message(self, message_id: str, role: str, content: str, metadata: dict):
        """Add message to RAG index"""
        embedding = self.embedder.encode(content).tolist()
        self.collection.add(
            ids=[message_id],
            embeddings=[embedding],
            documents=[content],
            metadatas=[{'role': role, **metadata}],  # metadata carries timestamp etc.
        )
```
### Conversation Summary Builder
```python
# user-mcp/src/context.py
async def build_conversation_summary(user_id: str) -> str:
    """Build conversation summary with RAG"""
    # 1. Get recent messages (newest first)
    recent_messages = await db.get_messages(
        user_id=user_id,
        limit=50,
        order='desc',
    )

    # 2. Get current focus (most recent user message)
    last_user_msg = next(
        (m for m in recent_messages if m.role == 'user'),
        None,
    )
    if not last_user_msg:
        return "No recent conversation history."

    # 3. RAG search for relevant context
    rag = ConversationRAG(f"/data/users/{user_id}/rag")
    relevant_context = await rag.search_conversation_history(
        query=last_user_msg.content,
        limit=5,
        min_score=0.7,
    )

    # 4. Build summary
    summary = "Recent Conversation Summary:\n\n"

    # Last 10 messages, re-ordered oldest-to-newest for readability
    # (messages arrive newest-first because of order='desc')
    summary += "Last 10 messages:\n"
    for msg in reversed(recent_messages[:10]):
        summary += f"- {msg.role}: {msg.content[:100]}...\n"

    # Relevant past context
    if relevant_context:
        summary += "\nRelevant past discussions (RAG):\n"
        for ctx in relevant_context:
            timestamp = ctx['metadata'].get('timestamp', 'unknown')
            summary += f"- [{timestamp}] {ctx['content'][:150]}...\n"

    # Inferred focus
    summary += f"\nCurrent focus: {infer_topic(last_user_msg.content)}\n"
    return summary

def infer_topic(message: str) -> str:
    """Simple keyword-based topic extraction"""
    keywords = {
        'strategy': ['strategy', 'backtest', 'trading system'],
        'indicator': ['indicator', 'rsi', 'macd', 'moving average'],
        'analysis': ['analyze', 'chart', 'price action'],
        'risk': ['risk', 'position size', 'stop loss'],
    }
    message_lower = message.lower()
    for topic, words in keywords.items():
        if any(word in message_lower for word in words):
            return topic
    return 'general trading discussion'
```
## Benefits of This Architecture
- Privacy: Conversation history never leaves user's container
- Customization: Each user controls their RAG, embeddings, prompt engineering
- Scalability: Platform harness is stateless - horizontally scalable
- Cost Control: Platform pays for Claude, users pay for their compute/storage
- Portability: Users can export/migrate their entire context
- Development: Users can test prompts/context locally without platform involvement
## Future Enhancements
### Dynamic Resource URIs

Support parameterized resources:

```text
context://conversation/{session_id}
context://strategy/{strategy_name}
context://backtest/{backtest_id}/results
```
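Parameterized URIs like these can be matched with a small template compiler (a sketch, not part of any MCP SDK):

```python
import re

def compile_template(template: str) -> re.Pattern:
    """Turn 'context://strategy/{name}' into a regex with named groups."""
    regex, pos = "", 0
    for m in re.finditer(r"\{(\w+)\}", template):
        regex += re.escape(template[pos:m.start()])   # literal part
        regex += f"(?P<{m.group(1)}>[^/]+)"           # one path segment
        pos = m.end()
    return re.compile(regex + re.escape(template[pos:]) + "$")

def match_resource(uri: str, template: str):
    """Return the extracted parameters, or None if the URI doesn't match."""
    m = compile_template(template).match(uri)
    return m.groupdict() if m else None

params = match_resource("context://backtest/bt-42/results",
                        "context://backtest/{backtest_id}/results")
```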
### Resource Templates

MCP supports resource templates for dynamic discovery:

```python
@server.list_resource_templates()
async def list_templates() -> list[ResourceTemplate]:
    return [
        ResourceTemplate(
            uriTemplate="context://strategy/{name}",
            name="Strategy Context",
            description="Context for specific strategy",
        )
    ]
```
### Streaming Resources

For large context (e.g., full backtest results), support streaming:

```python
@server.read_resource()
async def read_resource(uri: str) -> AsyncIterator[str]:
    if uri.startswith("context://backtest/"):
        async for chunk in stream_backtest_results(uri):
            yield chunk
```
## Migration Path
For users with existing conversation history in platform DB:
1. **Export script**: Migrate platform history → user container DB
2. **RAG indexing**: Embed all historical messages into ChromaDB
3. **Preference migration**: Copy user preferences to container
4. **Cutover**: Switch to resource-based context fetching
Platform can keep read-only archive for compliance, but active context lives in user container.
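The export step can be sketched with SQLite on both ends (the `messages` table and its columns are illustrative, not the actual platform schema):

```python
import sqlite3

def migrate_history(platform: sqlite3.Connection,
                    user: sqlite3.Connection) -> int:
    """Copy platform-side messages into the user container's history table."""
    user.execute(
        """CREATE TABLE IF NOT EXISTS conversation_history
           (id INTEGER PRIMARY KEY, role TEXT, content TEXT, ts TEXT)"""
    )
    rows = platform.execute(
        "SELECT role, content, ts FROM messages ORDER BY ts"
    ).fetchall()
    user.executemany(
        "INSERT INTO conversation_history (role, content, ts) VALUES (?, ?, ?)",
        rows,
    )
    user.commit()
    return len(rows)

# Demo with in-memory databases standing in for both sides
platform_db = sqlite3.connect(":memory:")
platform_db.execute("CREATE TABLE messages (role TEXT, content TEXT, ts TEXT)")
platform_db.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [("user", "hi", "2025-01-01T00:00:00Z"),
     ("assistant", "hello", "2025-01-01T00:00:05Z")],
)
user_db = sqlite3.connect(":memory:")
migrated = migrate_history(platform_db, user_db)
```

After migration, step 2 would feed the same rows through `ConversationRAG.add_message` to build the embedding index.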