# Agent Harness Architecture

The Agent Harness is the core orchestration layer for the Dexorder AI platform, built on LangChain.js and LangGraph.js.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                     Gateway (Fastify)                       │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐    │
│   │  WebSocket   │   │   Telegram   │   │    Event     │    │
│   │   Handler    │   │   Handler    │   │    Router    │    │
│   └──────┬───────┘   └──────┬───────┘   └──────┬───────┘    │
│          │                  │                  │            │
│          └──────────────────┴──────────────────┘            │
│                             │                               │
│                    ┌───────▼────────┐                       │
│                    │  Agent Harness │                       │
│                    │  (Stateless)   │                       │
│                    └───────┬────────┘                       │
│                            │                                │
│         ┌──────────────────┼──────────────────┐             │
│         │                  │                  │             │
│    ┌────▼─────┐       ┌────▼─────┐       ┌────▼─────┐       │
│    │   MCP    │       │   LLM    │       │   RAG    │       │
│    │ Connector│       │  Router  │       │ Retriever│       │
│    └────┬─────┘       └────┬─────┘       └────┬─────┘       │
│         │                  │                  │             │
└─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
   ┌────────────┐      ┌───────────┐      ┌───────────┐
   │   User's   │      │    LLM    │      │  Qdrant   │
   │    MCP     │      │ Providers │      │ (Vectors) │
   │ Container  │      │(Anthropic,│      │ (k8s pod) │
   │            │      │  OpenAI,  │      │           │
   │            │      │   etc)    │      │ Global +  │
   │            │      │           │      │   User    │
   └────────────┘      └───────────┘      └───────────┘
```

## Message Processing Flow

When a user sends a message:

```
1. Gateway receives message via channel (WebSocket/Telegram)
   ↓
2. Authenticator validates user and gets license info
   ↓
3. Container Manager ensures user's MCP container is running
   ↓
4. Agent Harness processes message:
   │
   ├─→ a. MCPClientConnector fetches context resources:
   │      - context://user-profile
   │      - context://conversation-summary
   │      - context://workspace-state
   │      - context://system-prompt
   │
   ├─→ b. RAGRetriever searches for relevant memories:
   │      - Embeds user query
   │      - Searches Qdrant: user_id = current_user OR user_id = "0"
   │      - Returns user-specific + global platform knowledge
   │
   ├─→ c. Build system prompt:
   │      - Base platform prompt
   │      - User profile context
   │      - Workspace state
   │      - Custom user instructions
   │      - Relevant RAG memories
   │
   ├─→ d. ModelRouter selects LLM:
   │      - Based on license tier
   │      - Query complexity
   │      - Configured routing strategy
   │
   ├─→ e. LLM invocation with tool support:
   │      - Send messages to LLM
   │      - If tool calls requested:
   │        • Platform tools → handled by gateway
   │        • User tools → proxied to MCP container
   │      - Loop until no more tool calls
   │
   ├─→ f. Save conversation to MCP:
   │      - mcp.callTool('save_message', user_message)
   │      - mcp.callTool('save_message', assistant_message)
   │
   └─→ g. Return response to user via channel
```

## Core Components

### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`)

**Stateless orchestrator** - all state lives in the user's MCP container or RAG.

**Responsibilities:**
- Fetch context from user's MCP resources
- Query RAG for relevant memories
- Build prompts with full context
- Route to appropriate LLM
- Handle tool calls (platform vs user)
- Save conversation back to MCP
- Stream responses to user

**Key Methods:**
- `handleMessage()`: Process a single message (non-streaming)
- `streamMessage()`: Process with streaming response
- `initialize()`: Connect to user's MCP server

### 2. MCP Client Connector (`gateway/src/harness/mcp-client.ts`)

Connects to the user's MCP container using the Model Context Protocol.

**Features:**
- Resource reading (context://, indicators://, strategies://)
- Tool execution (save_message, run_backtest, etc.)
- Automatic reconnection on container restarts
- Error handling and fallbacks

### 3. Model Router (`gateway/src/llm/router.ts`)

Routes queries to the appropriate LLM based on:

- **License tier**: Free users → smaller models, paid → better models
- **Complexity**: Simple queries → fast models, complex → powerful models
- **Cost optimization**: Balance performance vs cost

**Routing Strategies:**
- `COST`: Minimize cost
- `COMPLEXITY`: Match model to query complexity
- `SPEED`: Prioritize fast responses
- `QUALITY`: Best available model
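As an illustration of the routing rules above, a minimal tier-aware router might look like the following sketch. The model names, tier labels, and the `selectModel` signature are hypothetical, not the actual `gateway/src/llm/router.ts` API.

```typescript
// Hypothetical sketch of tier-aware model routing. Model names and
// tiers are illustrative placeholders, not the real router's config.
type Tier = 'free' | 'standard' | 'enterprise';
type Strategy = 'COST' | 'COMPLEXITY' | 'SPEED' | 'QUALITY';

interface ModelChoice {
  model: string;
  reason: string;
}

// Each tier's pool, ordered cheapest/fastest → most capable.
const TIER_MODELS: Record<Tier, string[]> = {
  free: ['small-fast'],
  standard: ['small-fast', 'mid-balanced'],
  enterprise: ['small-fast', 'mid-balanced', 'large-quality'],
};

function selectModel(tier: Tier, strategy: Strategy, complexity: number): ModelChoice {
  const pool = TIER_MODELS[tier];
  switch (strategy) {
    case 'COST':
    case 'SPEED':
      // Cheapest model is also the fastest in this toy pool.
      return { model: pool[0], reason: strategy.toLowerCase() };
    case 'QUALITY':
      return { model: pool[pool.length - 1], reason: 'best available for tier' };
    case 'COMPLEXITY': {
      // Map a 0..1 complexity score onto the tier's model pool.
      const idx = Math.min(pool.length - 1, Math.floor(complexity * pool.length));
      return { model: pool[idx], reason: `complexity=${complexity}` };
    }
    default:
      throw new Error(`unknown strategy: ${strategy}`);
  }
}
```

Note how the license tier caps the pool: a free user asking for `QUALITY` still gets the small model, which is the behavior the tier table above implies.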
### 4. Memory Layer

#### Three-Tier Storage:

**Redis** (Hot Storage)
- Active session state
- Recent conversation history (last 50 messages)
- LangGraph checkpoints (1 hour TTL)
- Fast reads for active conversations

**Qdrant** (Vector Search)
- Conversation embeddings
- User-specific memories (user_id = actual user ID)
- **Global platform knowledge** (user_id = "0")
- RAG retrieval with cosine similarity
- GDPR-compliant (indexed by user_id for fast deletion)

**Iceberg** (Cold Storage)
- Full conversation history (partitioned by user_id, session_id)
- Checkpoint snapshots for replay
- Analytics and time-travel queries
- GDPR-compliant with compaction

#### RAG System:

**Global Knowledge** (user_id = "0"):
- Platform capabilities and architecture
- Trading concepts and fundamentals
- Indicator development guides
- Strategy patterns and examples
- Loaded from `gateway/knowledge/` markdown files

**User Knowledge** (user_id = specific user):
- Personal conversation history
- Trading preferences and style
- Custom indicators and strategies
- Workspace state and context

**Query Flow:**
1. User query is embedded using EmbeddingService
2. Qdrant searches: `user_id IN (current_user, "0")`
3. Top-K relevant chunks are returned
4. Added to LLM context automatically
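The query flow above can be sketched in-memory as follows. In the real system the filtering and similarity search are delegated to Qdrant; the `MemoryPoint` shape and the `retrieve` helper here are illustrative only.

```typescript
// In-memory sketch of the RAG query flow: filter points to
// user_id IN (currentUser, "0"), rank by cosine similarity,
// return top-K. Qdrant does this server-side in production.
interface MemoryPoint {
  userId: string;   // "0" marks global platform knowledge
  text: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function retrieve(query: number[], points: MemoryPoint[], userId: string, topK: number): MemoryPoint[] {
  return points
    .filter(p => p.userId === userId || p.userId === '0')  // user-specific + global
    .map(p => ({ p, score: cosine(query, p.vector) }))
    .sort((x, y) => y.score - x.score)                     // best match first
    .slice(0, topK)
    .map(x => x.p);
}
```

The filter is also what makes per-user isolation explicit: another user's memories never enter the candidate set, while `user_id = "0"` knowledge is shared by everyone.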
### 5. Skills vs Subagents

#### Skills (`gateway/src/harness/skills/`)

**Use for**: Well-defined, specific tasks
- Market analysis
- Strategy validation
- Single-purpose capabilities
- Defined in markdown + TypeScript

**Structure:**
```typescript
class MarketAnalysisSkill extends BaseSkill {
  async execute(context, parameters) {
    // Implementation
  }
}
```

#### Subagents (`gateway/src/harness/subagents/`)

**Use for**: Complex domain expertise with context
- Code reviewer with review guidelines
- Risk analyzer with risk models
- Multi-file knowledge base in memory/ directory
- Custom system prompts

**Structure:**
```
subagents/
  code-reviewer/
    config.yaml          # Model, memory files, capabilities
    system-prompt.md     # Specialized instructions
    memory/
      review-guidelines.md
      common-patterns.md
      best-practices.md
    index.ts
```

**Recommendation**: Prefer skills for most tasks. Use subagents when you need:
- Substantial domain-specific knowledge
- Multi-file context management
- Specialized system prompts

### 6. Workflows (`gateway/src/harness/workflows/`)

LangGraph state machines for multi-step orchestration:

**Features:**
- Validation loops (retry with fixes)
- Human-in-the-loop (approval gates)
- Error recovery
- State persistence via checkpoints

**Example Workflows:**
- Strategy validation: review → backtest → risk → approval
- Trading request: analysis → risk → approval → execute

## User Context Structure

Every interaction includes rich context:

```typescript
interface UserContext {
  userId: string;
  sessionId: string;
  license: UserLicense;

  // Multi-channel support
  activeChannel: {
    type: 'websocket' | 'telegram' | 'slack' | 'discord';
    channelUserId: string;
    capabilities: {
      supportsMarkdown: boolean;
      supportsImages: boolean;
      supportsButtons: boolean;
      maxMessageLength: number;
    };
  };

  // Retrieved from MCP + RAG
  conversationHistory: BaseMessage[];
  relevantMemories: MemoryChunk[];
  workspaceState: WorkspaceContext;
}
```

## User-Specific Files and Tools

The user's MCP container provides access to:
**Indicators** (`indicators/*.py`)
- Custom technical indicators
- Pure functions: DataFrame → Series/DataFrame
- Version controlled in the user's git repo

**Strategies** (`strategies/*.py`)
- Trading strategies with entry/exit rules
- Position sizing and risk management
- Backtestable and deployable

**Watchlists**
- Saved ticker lists
- Market monitoring

**Preferences**
- Trading style and risk tolerance
- Chart settings and colors
- Notification preferences

**Executors** (sub-strategies)
- Tactical order generators (TWAP, iceberg, etc.)
- Smart order routing

## Global Knowledge Management

### Document Loading

At gateway startup:
1. DocumentLoader scans the `gateway/knowledge/` directory
2. Markdown files are chunked by headers (~1000 tokens/chunk)
3. Embeddings are generated via EmbeddingService
4. Stored in Qdrant with user_id = "0"
5. Content hashing enables incremental updates

### Directory Structure

```
gateway/knowledge/
├── platform/     # Platform capabilities
├── trading/      # Trading fundamentals
├── indicators/   # Indicator development
└── strategies/   # Strategy patterns
```

### Updating Knowledge

**Development:**
```bash
curl -X POST http://localhost:3000/admin/reload-knowledge
```

**Production:**
- Update markdown files
- Deploy a new version
- Auto-loaded on startup

**Monitoring:**
```bash
curl http://localhost:3000/admin/knowledge-stats
```

## Container Lifecycle

### User Container Creation

When a user connects:
1. Gateway checks whether the container exists (ContainerManager)
2. If not, creates a Kubernetes pod with:
   - Agent container (Python + conda)
   - Lifecycle sidecar (container management)
   - Persistent volume (git repo)
3. Waits for the MCP server to be ready (~5-10s cold start)
4. Establishes the MCP connection
5. Begins message processing

### Container Shutdown

**Free users:** 15-minute idle timeout
**Paid users:** Longer timeout based on license

**On shutdown:**
- Graceful save of all state
- Persistent storage retained
- Fast restart on next connection
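The header-based chunking and content hashing described under Document Loading can be sketched as follows. The `chunkByHeaders` helper and `Chunk` shape are hypothetical, and the real DocumentLoader additionally enforces the ~1000-token chunk budget, which this simplified version omits.

```typescript
import { createHash } from 'node:crypto';

// Sketch of header-based chunking with content hashing. The hash lets
// the loader skip re-embedding chunks whose text has not changed.
interface Chunk {
  heading: string;
  text: string;
  hash: string;   // stable content hash → incremental updates
}

function chunkByHeaders(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = '(preamble)';
  let lines: string[] = [];

  const flush = () => {
    const text = lines.join('\n').trim();
    if (text) {
      chunks.push({
        heading,
        text,
        hash: createHash('sha256').update(text).digest('hex'),
      });
    }
    lines = [];
  };

  for (const line of markdown.split('\n')) {
    if (/^#{1,6}\s/.test(line)) {
      flush();                                   // close the previous chunk
      heading = line.replace(/^#{1,6}\s+/, '');  // new chunk starts here
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

On reload, a chunk whose hash matches the stored one can be skipped entirely, so only edited sections of the knowledge base pay the embedding cost.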
### MCP Authentication Modes

1. **Public Mode** (Free tier): No auth, read-only, anonymous session
2. **Gateway Auth** (Standard): Gateway authenticates, container trusts gateway
3. **Direct Auth** (Enterprise): User authenticates directly with container

## Implementation Status

### ✅ Completed
- Agent Harness with MCP integration
- Model routing with license tiers
- RAG retriever with Qdrant
- Document loader for global knowledge
- EmbeddingService (Ollama/OpenAI)
- Skills and subagents framework
- Multi-channel support (WebSocket, Telegram)
- Container lifecycle management
- Event system with ZeroMQ

### 🚧 In Progress
- Iceberg integration (checkpoint-saver, conversation-store)
- More subagents (risk-analyzer, market-analyst)
- LangGraph workflows with interrupts
- Platform tools (market data, charting)

### 📋 Planned
- File watcher for hot reload in development
- Advanced RAG strategies (hybrid search, re-ranking)
- Caching layer for expensive operations
- Performance monitoring and metrics

## References

- Implementation: `gateway/src/harness/`
- Documentation: `gateway/src/harness/README.md`
- Knowledge base: `gateway/knowledge/`
- LangGraph: https://langchain-ai.github.io/langgraphjs/
- Qdrant: https://qdrant.tech/documentation/
- MCP Spec: https://modelcontextprotocol.io/
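Finally, the tool-call loop from step (e) of the message-processing flow can be sketched as follows. Every type and name here is a simplified stand-in rather than the LangChain.js API, and the iteration cap is an assumed safeguard, not documented behavior.

```typescript
// Simplified sketch of step (e): invoke the model, dispatch any
// requested tool calls (platform tools in-process at the gateway,
// user tools proxied to the MCP container), feed results back, and
// repeat until the model stops requesting tools.
interface ToolCall { name: string; args: unknown; }
interface LLMTurn { text: string; toolCalls: ToolCall[]; }

type LLM = (messages: string[]) => LLMTurn;
type ToolRunner = (call: ToolCall) => string;

function runToolLoop(
  llm: LLM,
  isPlatformTool: (name: string) => boolean,
  platformTools: ToolRunner,   // handled by the gateway
  userTools: ToolRunner,       // proxied to the user's MCP container
  messages: string[],
  maxIterations = 10,          // assumed guard against endless tool loops
): string {
  for (let i = 0; i < maxIterations; i++) {
    const turn = llm(messages);
    if (turn.toolCalls.length === 0) return turn.text;  // model is done
    for (const call of turn.toolCalls) {
      const result = isPlatformTool(call.name) ? platformTools(call) : userTools(call);
      messages = [...messages, `tool:${call.name}:${result}`];  // feed result back
    }
  }
  throw new Error('tool loop exceeded maxIterations');
}
```

The platform-vs-user split mirrors the architecture diagram: the gateway never executes user code itself; it only forwards those calls over MCP.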