ai/doc/agent_harness.md

Agent Harness Architecture

The Agent Harness is the core orchestration layer for the Dexorder AI platform, built on LangChain.js and LangGraph.js.

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                    Gateway (Fastify)                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐     │
│  │  WebSocket   │  │  Telegram    │  │  Event       │     │
│  │  Handler     │  │  Handler     │  │  Router      │     │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘     │
│         │                  │                  │              │
│         └──────────────────┴──────────────────┘              │
│                            │                                 │
│                    ┌───────▼────────┐                        │
│                    │ Agent Harness  │                        │
│                    │  (Stateless)   │                        │
│                    └───────┬────────┘                        │
│                            │                                 │
│         ┌──────────────────┼──────────────────┐             │
│         │                  │                  │             │
│    ┌────▼─────┐      ┌────▼─────┐      ┌────▼─────┐       │
│    │   MCP    │      │   LLM    │      │   RAG    │       │
│    │ Connector│      │  Router  │      │ Retriever│       │
│    └────┬─────┘      └────┬─────┘      └────┬─────┘       │
│         │                  │                  │             │
└─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
   ┌────────────┐     ┌───────────┐     ┌───────────┐
   │   User's   │     │    LLM    │     │  Qdrant   │
   │    MCP     │     │ Providers │     │ (Vectors) │
   │ Container  │     │(Anthropic,│     │           │
   │ (k8s pod)  │     │  OpenAI,  │     │  Global + │
   │            │     │   etc)    │     │   User    │
   └────────────┘     └───────────┘     └───────────┘

Message Processing Flow

When a user sends a message:

1. Gateway receives message via channel (WebSocket/Telegram)
   ↓
2. Authenticator validates user and gets license info
   ↓
3. Container Manager ensures user's MCP container is running
   ↓
4. Agent Harness processes message:
   │
   ├─→ a. MCPClientConnector fetches context resources:
   │      - context://user-profile
   │      - context://conversation-summary
   │      - context://workspace-state
   │      - context://system-prompt
   │
   ├─→ b. RAGRetriever searches for relevant memories:
   │      - Embeds user query
   │      - Searches Qdrant: user_id = current_user OR user_id = "0"
   │      - Returns user-specific + global platform knowledge
   │
   ├─→ c. Build system prompt:
   │      - Base platform prompt
   │      - User profile context
   │      - Workspace state
   │      - Custom user instructions
   │      - Relevant RAG memories
   │
   ├─→ d. ModelRouter selects LLM:
   │      - Based on license tier
   │      - Query complexity
   │      - Configured routing strategy
   │
   ├─→ e. LLM invocation with tool support:
   │      - Send messages to LLM
   │      - If tool calls requested:
   │         • Platform tools → handled by gateway
   │         • User tools → proxied to MCP container
   │      - Loop until no more tool calls
   │
   ├─→ f. Save conversation to MCP:
   │      - mcp.callTool('save_message', user_message)
   │      - mcp.callTool('save_message', assistant_message)
   │
   └─→ g. Return response to user via channel
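The tool-call loop in step (e) can be sketched as follows. This is a minimal illustration, not the harness's actual API: names like `runToolLoop`, `invokeLlm`, and the entries in `PLATFORM_TOOLS` are hypothetical, and the real implementation uses LangChain.js message types rather than plain strings.

```typescript
interface ToolCall { name: string; args: Record<string, unknown>; }
interface LlmTurn { content: string; toolCalls: ToolCall[]; }

// Illustrative platform tool names; the real set lives in the gateway's registry.
const PLATFORM_TOOLS = new Set(["get_market_data", "render_chart"]);

async function runToolLoop(
  invokeLlm: (transcript: string[]) => Promise<LlmTurn>,
  callPlatformTool: (call: ToolCall) => Promise<string>,
  callUserTool: (call: ToolCall) => Promise<string>, // proxied to the MCP container
  transcript: string[],
  maxIterations = 10, // guard against runaway tool loops
): Promise<string> {
  for (let i = 0; i < maxIterations; i++) {
    const turn = await invokeLlm(transcript);
    if (turn.toolCalls.length === 0) return turn.content; // no more tool calls: done
    for (const call of turn.toolCalls) {
      const result = PLATFORM_TOOLS.has(call.name)
        ? await callPlatformTool(call) // platform tools → handled by gateway
        : await callUserTool(call);    // user tools → proxied to MCP container
      transcript.push(`tool:${call.name} -> ${result}`);
    }
  }
  throw new Error("tool loop exceeded maxIterations");
}
```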

Core Components

1. Agent Harness (gateway/src/harness/agent-harness.ts)

A stateless orchestrator: all state lives in the user's MCP container or in RAG.

Responsibilities:

  • Fetch context from user's MCP resources
  • Query RAG for relevant memories
  • Build prompts with full context
  • Route to appropriate LLM
  • Handle tool calls (platform vs user)
  • Save conversation back to MCP
  • Stream responses to user

Key Methods:

  • handleMessage(): Process single message (non-streaming)
  • streamMessage(): Process with streaming response
  • initialize(): Connect to user's MCP server

2. MCP Client Connector (gateway/src/harness/mcp-client.ts)

Connects to user's MCP container using Model Context Protocol.

Features:

  • Resource reading (context://, indicators://, strategies://)
  • Tool execution (save_message, run_backtest, etc.)
  • Automatic reconnection on container restarts
  • Error handling and fallbacks
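Automatic reconnection can be sketched as exponential backoff around the connect call. This is a sketch only: the real connector wraps the MCP SDK client, and the attempt/delay defaults here are illustrative assumptions.

```typescript
// Retry a connect() call with exponential backoff, e.g. after a container restart.
// sleep is injectable so the behavior is easy to test.
async function withReconnect<T>(
  connect: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      await sleep(baseDelayMs * 2 ** attempt); // 200ms, 400ms, 800ms, ...
    }
  }
  throw lastError;
}
```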

3. Model Router (gateway/src/llm/router.ts)

Routes queries to appropriate LLM based on:

  • License tier: Free users → smaller models, paid → better models
  • Complexity: Simple queries → fast models, complex → powerful models
  • Cost optimization: Balance performance vs cost

Routing Strategies:

  • COST: Minimize cost
  • COMPLEXITY: Match model to query complexity
  • SPEED: Prioritize fast responses
  • QUALITY: Best available model
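The routing decision can be sketched as below. The tier names other than `free`, the model table, and the scoring are all illustrative assumptions; the real router reads model IDs and tier caps from its configuration.

```typescript
type Tier = "free" | "pro" | "enterprise"; // "free" from the doc; others are assumptions
type Strategy = "COST" | "COMPLEXITY" | "SPEED" | "QUALITY";

// Hypothetical model table, ordered from cheapest to most capable.
const MODELS = {
  small: { speed: 3 },
  medium: { speed: 2 },
  large: { speed: 1 },
} as const;
type ModelName = keyof typeof MODELS;

function routeModel(tier: Tier, strategy: Strategy, complexity: number): ModelName {
  // Free tier is capped at the smaller models regardless of strategy.
  const allowed: ModelName[] =
    tier === "free" ? ["small", "medium"] : ["small", "medium", "large"];
  switch (strategy) {
    case "COST":
      return allowed[0]; // cheapest allowed model
    case "SPEED":
      return allowed.reduce((a, b) => (MODELS[b].speed > MODELS[a].speed ? b : a));
    case "QUALITY":
      return allowed[allowed.length - 1]; // best allowed model
    case "COMPLEXITY":
      // complexity in [0, 1] maps onto the allowed list
      return allowed[Math.min(allowed.length - 1, Math.floor(complexity * allowed.length))];
  }
}
```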

4. Memory Layer

Three-Tier Storage:

Redis (Hot Storage)

  • Active session state
  • Recent conversation history (last 50 messages)
  • LangGraph checkpoints (1 hour TTL)
  • Fast reads for active conversations

Qdrant (Vector Search)

  • Conversation embeddings
  • User-specific memories (user_id = actual user ID)
  • Global platform knowledge (user_id = "0")
  • RAG retrieval with cosine similarity
  • GDPR-compliant (indexed by user_id for fast deletion)

Iceberg (Cold Storage)

  • Full conversation history (partitioned by user_id, session_id)
  • Checkpoint snapshots for replay
  • Analytics and time-travel queries
  • GDPR-compliant with compaction

RAG System:

Global Knowledge (user_id="0"):

  • Platform capabilities and architecture
  • Trading concepts and fundamentals
  • Indicator development guides
  • Strategy patterns and examples
  • Loaded from gateway/knowledge/ markdown files

User Knowledge (user_id=specific user):

  • Personal conversation history
  • Trading preferences and style
  • Custom indicators and strategies
  • Workspace state and context

Query Flow:

  1. User query is embedded using EmbeddingService
  2. Qdrant searches: user_id IN (current_user, "0")
  3. Top-K relevant chunks returned
  4. Added to LLM context automatically
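The `user_id IN (current_user, "0")` condition in step 2 maps onto a Qdrant `should` filter (logical OR over `match` clauses). A minimal sketch, assuming the payload field is named `user_id` as described above:

```typescript
// Builds the Qdrant filter for step 2: the current user's memories OR the
// global platform knowledge stored under user_id = "0".
function buildMemoryFilter(userId: string) {
  return {
    should: [
      { key: "user_id", match: { value: userId } },
      { key: "user_id", match: { value: "0" } }, // global knowledge
    ],
  };
}

// With the Qdrant JS client, the search in steps 2-3 would look roughly like:
//   client.search(collection, { vector: embedding, filter: buildMemoryFilter(userId), limit: topK });
```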

5. Skills vs Subagents

Skills (gateway/src/harness/skills/)

Use for: Well-defined, specific tasks

  • Market analysis
  • Strategy validation
  • Single-purpose capabilities
  • Defined in markdown + TypeScript

Structure:

class MarketAnalysisSkill extends BaseSkill {
  async execute(context, parameters) {
    // Implementation
  }
}

Subagents (gateway/src/harness/subagents/)

Use for: Complex domain expertise with context

  • Code reviewer with review guidelines
  • Risk analyzer with risk models
  • Multi-file knowledge base in memory/ directory
  • Custom system prompts

Structure:

subagents/
  code-reviewer/
    config.yaml              # Model, memory files, capabilities
    system-prompt.md         # Specialized instructions
    memory/
      review-guidelines.md
      common-patterns.md
      best-practices.md
    index.ts

Recommendation: Prefer skills for most tasks. Use subagents when you need:

  • Substantial domain-specific knowledge
  • Multi-file context management
  • Specialized system prompts

6. Workflows (gateway/src/harness/workflows/)

LangGraph state machines for multi-step orchestration:

Features:

  • Validation loops (retry with fixes)
  • Human-in-the-loop (approval gates)
  • Error recovery
  • State persistence via checkpoints

Example Workflows:

  • Strategy validation: review → backtest → risk → approval
  • Trading request: analysis → risk → approval → execute
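The strategy-validation pipeline above can be sketched as a plain sequence of gates. This is only a shape sketch: the real implementation is a LangGraph state machine that checkpoints between nodes and can loop back with fixes or pause for human approval rather than rejecting outright.

```typescript
type Gate = "review" | "backtest" | "risk" | "approval";

// Gates run in order; any failure short-circuits to "rejected".
async function runValidation(
  gates: Record<Gate, () => Promise<boolean>>,
): Promise<"approved" | "rejected"> {
  for (const gate of ["review", "backtest", "risk", "approval"] as const) {
    if (!(await gates[gate]())) return "rejected";
  }
  return "approved";
}
```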

User Context Structure

Every interaction includes rich context:

interface UserContext {
  userId: string;
  sessionId: string;
  license: UserLicense;

  // Multi-channel support
  activeChannel: {
    type: 'websocket' | 'telegram' | 'slack' | 'discord';
    channelUserId: string;
    capabilities: {
      supportsMarkdown: boolean;
      supportsImages: boolean;
      supportsButtons: boolean;
      maxMessageLength: number;
    };
  };

  // Retrieved from MCP + RAG
  conversationHistory: BaseMessage[];
  relevantMemories: MemoryChunk[];
  workspaceState: WorkspaceContext;
}
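The channel capabilities let the harness adapt its output per channel. For example, a reply longer than `maxMessageLength` has to be split before delivery; a minimal sketch (the actual formatter is an assumption, not shown in this doc):

```typescript
// Split a long reply into chunks no longer than maxMessageLength, breaking on
// whole lines where possible so lists and code stay readable. A single line
// longer than the limit is passed through as one oversized chunk.
function splitForChannel(text: string, maxMessageLength: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const line of text.split("\n")) {
    const candidate = current === "" ? line : current + "\n" + line;
    if (candidate.length > maxMessageLength && current !== "") {
      chunks.push(current);
      current = line;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```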

User-Specific Files and Tools

User's MCP container provides access to:

Indicators (indicators/*.py)

  • Custom technical indicators
  • Pure functions: DataFrame → Series/DataFrame
  • Version controlled in user's git repo

Strategies (strategies/*.py)

  • Trading strategies with entry/exit rules
  • Position sizing and risk management
  • Backtestable and deployable

Watchlists

  • Saved ticker lists
  • Market monitoring

Preferences

  • Trading style and risk tolerance
  • Chart settings and colors
  • Notification preferences

Executors (sub-strategies)

  • Tactical order generators (TWAP, iceberg, etc.)
  • Smart order routing

Global Knowledge Management

Document Loading

At gateway startup:

  1. DocumentLoader scans gateway/knowledge/ directory
  2. Markdown files chunked by headers (~1000 tokens/chunk)
  3. Embeddings generated via EmbeddingService
  4. Stored in Qdrant with user_id="0"
  5. Content hashing enables incremental updates
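The incremental-update step can be sketched as hashing each chunk and re-embedding only unseen hashes. SHA-256 is an assumption here; the doc does not specify the hash function.

```typescript
import { createHash } from "node:crypto";

// Step 5: hash each chunk's content so unchanged chunks can be skipped on reload.
function contentHash(chunk: string): string {
  return createHash("sha256").update(chunk, "utf8").digest("hex");
}

// Only chunks whose hash is not already stored need re-embedding and upserting.
function changedChunks(chunks: string[], knownHashes: Set<string>): string[] {
  return chunks.filter((c) => !knownHashes.has(contentHash(c)));
}
```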

Directory Structure

gateway/knowledge/
  ├── platform/          # Platform capabilities
  ├── trading/           # Trading fundamentals
  ├── indicators/        # Indicator development
  └── strategies/        # Strategy patterns

Updating Knowledge

Development:

curl -X POST http://localhost:3000/admin/reload-knowledge

Production:

  • Update markdown files
  • Deploy new version
  • Auto-loaded on startup

Monitoring:

curl http://localhost:3000/admin/knowledge-stats

Container Lifecycle

User Container Creation

When user connects:

  1. Gateway checks if container exists (ContainerManager)
  2. If not, creates Kubernetes pod with:
    • Agent container (Python + conda)
    • Lifecycle sidecar (container management)
    • Persistent volume (git repo)
  3. Waits for MCP server ready (~5-10s cold start)
  4. Establishes MCP connection
  5. Begins message processing

Container Shutdown

Free users: 15-minute idle timeout
Paid users: longer timeout based on license tier

On shutdown:

  • Graceful save of all state
  • Persistent storage retained
  • Fast restart on next connection
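The idle-timeout check can be sketched as below. Only the free-tier value (15 minutes) comes from this doc; the paid-tier values are placeholders for license-derived configuration.

```typescript
// Idle timeouts per license tier, in milliseconds. "free" is from the doc;
// "pro" and "enterprise" are illustrative placeholders.
const IDLE_TIMEOUT_MS: Record<string, number> = {
  free: 15 * 60 * 1000,
  pro: 60 * 60 * 1000,
  enterprise: 4 * 60 * 60 * 1000,
};

// Returns true when the container has been idle long enough to shut down.
function shouldShutdown(tier: string, lastActivityMs: number, nowMs: number): boolean {
  const timeout = IDLE_TIMEOUT_MS[tier] ?? IDLE_TIMEOUT_MS.free;
  return nowMs - lastActivityMs >= timeout;
}
```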

MCP Authentication Modes

  1. Public Mode (Free tier): No auth, read-only, anonymous session
  2. Gateway Auth (Standard): Gateway authenticates, container trusts gateway
  3. Direct Auth (Enterprise): User authenticates directly with container

Implementation Status

✅ Completed

  • Agent Harness with MCP integration
  • Model routing with license tiers
  • RAG retriever with Qdrant
  • Document loader for global knowledge
  • EmbeddingService (Ollama/OpenAI)
  • Skills and subagents framework
  • Multi-channel support (WebSocket, Telegram)
  • Container lifecycle management
  • Event system with ZeroMQ

🚧 In Progress

  • Iceberg integration (checkpoint-saver, conversation-store)
  • More subagents (risk-analyzer, market-analyst)
  • LangGraph workflows with interrupts
  • Platform tools (market data, charting)

📋 Planned

  • File watcher for hot-reload in development
  • Advanced RAG strategies (hybrid search, re-ranking)
  • Caching layer for expensive operations
  • Performance monitoring and metrics
