# Agent Harness Architecture

The Agent Harness is the core orchestration layer for the Dexorder AI platform, built on LangChain.js and LangGraph.js.

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────┐
│                      Gateway (Fastify)                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │  WebSocket   │   │   Telegram   │   │    Event     │     │
│  │   Handler    │   │   Handler    │   │    Router    │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┴──────────────────┘             │
│                            │                                │
│                    ┌───────▼────────┐                       │
│                    │ Agent Harness  │                       │
│                    │  (Stateless)   │                       │
│                    └───────┬────────┘                       │
│                            │                                │
│         ┌──────────────────┼──────────────────┐             │
│         │                  │                  │             │
│    ┌────▼─────┐       ┌────▼─────┐       ┌────▼─────┐       │
│    │   MCP    │       │   LLM    │       │   RAG    │       │
│    │ Connector│       │  Router  │       │ Retriever│       │
│    └────┬─────┘       └────┬─────┘       └────┬─────┘       │
│         │                  │                  │             │
└─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
    ┌────────────┐     ┌───────────┐      ┌───────────┐
    │  User's    │     │ LLM       │      │ Qdrant    │
    │  MCP       │     │ Providers │      │ (Vectors) │
    │  Container │     │(Anthropic,│      │           │
    │  (k8s pod) │     │ OpenAI,   │      │ Global +  │
    │            │     │ etc)      │      │ User      │
    └────────────┘     └───────────┘      └───────────┘
```

## Message Processing Flow

When a user sends a message:

```
1. Gateway receives message via channel (WebSocket/Telegram)
   ↓
2. Authenticator validates user and gets license info
   ↓
3. Container Manager ensures user's MCP container is running
   ↓
4. Agent Harness processes message:
   │
   ├─→ a. MCPClientConnector fetches context resources:
   │      - context://user-profile
   │      - context://conversation-summary
   │      - context://workspace-state
   │      - context://system-prompt
   │
   ├─→ b. RAGRetriever searches for relevant memories:
   │      - Embeds user query
   │      - Searches Qdrant: user_id = current_user OR user_id = "0"
   │      - Returns user-specific + global platform knowledge
   │
   ├─→ c. Build system prompt:
   │      - Base platform prompt
   │      - User profile context
   │      - Workspace state
   │      - Custom user instructions
   │      - Relevant RAG memories
   │
   ├─→ d. ModelRouter selects LLM:
   │      - Based on license tier
   │      - Query complexity
   │      - Configured routing strategy
   │
   ├─→ e. LLM invocation with tool support:
   │      - Send messages to LLM
   │      - If tool calls requested:
   │        • Platform tools → handled by gateway
   │        • User tools → proxied to MCP container
   │      - Loop until no more tool calls
   │
   ├─→ f. Save conversation to MCP:
   │      - mcp.callTool('save_message', user_message)
   │      - mcp.callTool('save_message', assistant_message)
   │
   └─→ g. Return response to user via channel
```
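Step c of the flow above (prompt assembly) can be illustrated with a minimal sketch. The `PromptParts` shape, the `buildSystemPrompt` name, and the section titles are assumptions made for illustration; they are not taken from the harness source.

```typescript
// Hypothetical container for the pieces gathered in steps a and b.
interface PromptParts {
  basePrompt: string;
  userProfile?: string;
  workspaceState?: string;
  customInstructions?: string;
  memories: string[]; // RAG chunks, most relevant first
}

// Join the non-empty sections in the order listed in step c.
function buildSystemPrompt(parts: PromptParts): string {
  const sections: Array<[string, string | undefined]> = [
    ["Platform", parts.basePrompt],
    ["User profile", parts.userProfile],
    ["Workspace state", parts.workspaceState],
    ["Custom instructions", parts.customInstructions],
    ["Relevant memories", parts.memories.map((m) => `- ${m}`).join("\n")],
  ];
  return sections
    .filter(([, body]) => body !== undefined && body.trim() !== "")
    .map(([title, body]) => `## ${title}\n${body}`)
    .join("\n\n");
}
```

Skipping empty sections keeps the prompt short for new users who have no profile or memories yet.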

## Core Components

### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`)

**Stateless orchestrator** - all state lives in the user's MCP container or in RAG.

**Responsibilities:**
- Fetch context from the user's MCP resources
- Query RAG for relevant memories
- Build prompts with full context
- Route to the appropriate LLM
- Handle tool calls (platform vs user)
- Save the conversation back to MCP
- Stream responses to the user

**Key Methods:**
- `handleMessage()`: Process a single message (non-streaming)
- `streamMessage()`: Process a message with a streaming response
- `initialize()`: Connect to the user's MCP server
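The "platform vs user" split for tool calls could look roughly like the sketch below. The `routeToolCall` and `partitionToolCalls` helpers and the platform tool names are hypothetical; only `run_backtest` appears elsewhere in this document, as a tool served by the user's MCP container.

```typescript
type ToolTarget = "gateway" | "mcp";

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

// Placeholder platform tool names; the real set is defined by the gateway.
const PLATFORM_TOOLS: ReadonlySet<string> = new Set(["get_market_data", "render_chart"]);

// Platform tools run in the gateway; everything else is proxied to the user's MCP container.
function routeToolCall(
  call: ToolCall,
  platformTools: ReadonlySet<string> = PLATFORM_TOOLS,
): ToolTarget {
  return platformTools.has(call.name) ? "gateway" : "mcp";
}

// Group one LLM turn's tool calls by where they should execute.
function partitionToolCalls(calls: ToolCall[]): Record<ToolTarget, ToolCall[]> {
  const out: Record<ToolTarget, ToolCall[]> = { gateway: [], mcp: [] };
  for (const call of calls) {
    out[routeToolCall(call)].push(call);
  }
  return out;
}
```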

### 2. MCP Client Connector (`gateway/src/harness/mcp-client.ts`)

Connects to the user's MCP container using the Model Context Protocol.

**Features:**
- Resource reading (context://, indicators://, strategies://)
- Tool execution (save_message, run_backtest, etc.)
- Automatic reconnection on container restarts
- Error handling and fallbacks

### 3. Model Router (`gateway/src/llm/router.ts`)

Routes queries to the appropriate LLM based on:
- **License tier**: Free users → smaller models, paid → better models
- **Complexity**: Simple queries → fast models, complex → powerful models
- **Cost optimization**: Balance performance vs cost

**Routing Strategies:**
- `COST`: Minimize cost
- `COMPLEXITY`: Match model to query complexity
- `SPEED`: Prioritize fast responses
- `QUALITY`: Best available model
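As a rough illustration of how the four strategies might interact with license tiers, here is a minimal sketch. The model names, the rank fields, the two-tier license model, and the 0 to 1 complexity score are all assumptions; the actual router in `gateway/src/llm/router.ts` may work differently.

```typescript
type LicenseTier = "free" | "paid";
type RoutingStrategy = "COST" | "COMPLEXITY" | "SPEED" | "QUALITY";

interface ModelOption {
  name: string; // placeholder identifier, not a real model name
  costRank: number; // 1 = cheapest
  speedRank: number; // 1 = fastest
  qualityRank: number; // 1 = best
  minTier: LicenseTier;
}

const MODELS: ModelOption[] = [
  { name: "small-model", costRank: 1, speedRank: 1, qualityRank: 3, minTier: "free" },
  { name: "medium-model", costRank: 2, speedRank: 2, qualityRank: 2, minTier: "free" },
  { name: "large-model", costRank: 3, speedRank: 3, qualityRank: 1, minTier: "paid" },
];

// complexity is an assumed score in [0, 1]; it is only used by the COMPLEXITY strategy.
function selectModel(tier: LicenseTier, strategy: RoutingStrategy, complexity: number): ModelOption {
  // The license tier restricts the candidate set before any strategy is applied.
  const allowed = MODELS.filter((m) => tier === "paid" || m.minTier === "free");
  const best = (key: "costRank" | "speedRank" | "qualityRank"): ModelOption =>
    [...allowed].sort((a, b) => a[key] - b[key])[0];

  switch (strategy) {
    case "COST":
      return best("costRank");
    case "SPEED":
      return best("speedRank");
    case "QUALITY":
      return best("qualityRank");
    case "COMPLEXITY": {
      // Map the score onto the allowed models ordered from cheapest to most capable.
      const ordered = [...allowed].sort((a, b) => a.costRank - b.costRank);
      const idx = Math.floor(complexity * ordered.length);
      return ordered[Math.max(0, Math.min(ordered.length - 1, idx))];
    }
  }
}
```

Filtering by tier first means a free user asking for `QUALITY` still gets the best model their license allows, never a paid-only one.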

### 4. Memory Layer

#### Three-Tier Storage:

**Redis** (Hot Storage)
- Active session state
- Recent conversation history (last 50 messages)
- LangGraph checkpoints (1 hour TTL)
- Fast reads for active conversations

**Qdrant** (Vector Search)
- Conversation embeddings
- User-specific memories (user_id = actual user ID)
- **Global platform knowledge** (user_id = "0")
- RAG retrieval with cosine similarity
- GDPR-compliant (indexed by user_id for fast deletion)

**Iceberg** (Cold Storage)
- Full conversation history (partitioned by user_id, session_id)
- Checkpoint snapshots for replay
- Analytics and time-travel queries
- GDPR-compliant with compaction

#### RAG System:

**Global Knowledge** (user_id="0"):
- Platform capabilities and architecture
- Trading concepts and fundamentals
- Indicator development guides
- Strategy patterns and examples
- Loaded from `gateway/knowledge/` markdown files

**User Knowledge** (user_id=specific user):
- Personal conversation history
- Trading preferences and style
- Custom indicators and strategies
- Workspace state and context

**Query Flow:**
1. User query is embedded using EmbeddingService
2. Qdrant searches: `user_id IN (current_user, "0")`
3. Top-K relevant chunks returned
4. Added to LLM context automatically
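The `user_id IN (current_user, "0")` condition in step 2 maps naturally onto Qdrant's "match any" payload filter. The sketch below only builds the filter object; the `buildUserScopeFilter` name is hypothetical, and it assumes `user_id` is stored as a string payload field.

```typescript
// user_id "0" marks global platform knowledge, per the section above.
const GLOBAL_USER_ID = "0";

interface UserScopeFilter {
  must: Array<{ key: string; match: { any: string[] } }>;
}

// Restrict a vector search to the current user's memories plus global knowledge.
function buildUserScopeFilter(userId: string): UserScopeFilter {
  const ids = userId === GLOBAL_USER_ID ? [GLOBAL_USER_ID] : [userId, GLOBAL_USER_ID];
  return { must: [{ key: "user_id", match: { any: ids } }] };
}
```

The resulting object would be passed as the `filter` field of a Qdrant search request, alongside the query vector and a `limit` that sets top-K.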

### 5. Skills vs Subagents

#### Skills (`gateway/src/harness/skills/`)

**Use for**: Well-defined, specific tasks
- Market analysis
- Strategy validation
- Single-purpose capabilities
- Defined in markdown + TypeScript

**Structure:**
```typescript
class MarketAnalysisSkill extends BaseSkill {
  async execute(context, parameters) {
    // Implementation
  }
}
```

#### Subagents (`gateway/src/harness/subagents/`)

**Use for**: Complex domain expertise with context
- Code reviewer with review guidelines
- Risk analyzer with risk models
- Multi-file knowledge base in memory/ directory
- Custom system prompts

**Structure:**
```
subagents/
  code-reviewer/
    config.yaml          # Model, memory files, capabilities
    system-prompt.md     # Specialized instructions
    memory/
      review-guidelines.md
      common-patterns.md
      best-practices.md
    index.ts
```

**Recommendation**: Prefer skills for most tasks. Use subagents when you need:
- Substantial domain-specific knowledge
- Multi-file context management
- Specialized system prompts

### 6. Workflows (`gateway/src/harness/workflows/`)

LangGraph state machines for multi-step orchestration:

**Features:**
- Validation loops (retry with fixes)
- Human-in-the-loop (approval gates)
- Error recovery
- State persistence via checkpoints

**Example Workflows:**
- Strategy validation: review → backtest → risk → approval
- Trading request: analysis → risk → approval → execute
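To make the validation loop and approval gate concrete, here is a plain TypeScript state machine for the strategy validation workflow. It deliberately does not use the LangGraph API; the step handlers, the retry policy (a failed check restarts from review), and the `runStrategyValidation` name are illustrative assumptions.

```typescript
type ActiveStep = "review" | "backtest" | "risk" | "approval";
type Step = ActiveStep | "done" | "rejected";

interface WorkflowState {
  step: Step;
  retries: number; // validation retries used so far
  history: ActiveStep[]; // every step that was executed, in order
}

// A handler returns true when its step passes.
type StepHandler = (state: WorkflowState) => boolean;

const NEXT_STEP: Record<ActiveStep, Step> = {
  review: "backtest",
  backtest: "risk",
  risk: "approval",
  approval: "done",
};

function runStrategyValidation(
  handlers: Record<ActiveStep, StepHandler>,
  maxRetries = 2,
): WorkflowState {
  const state: WorkflowState = { step: "review", retries: 0, history: [] };
  for (;;) {
    const current = state.step;
    if (current === "done" || current === "rejected") return state;
    state.history.push(current);
    if (handlers[current](state)) {
      state.step = NEXT_STEP[current];
    } else if (current !== "approval" && state.retries < maxRetries) {
      // Validation loop: a failed check sends the strategy back to review.
      state.retries += 1;
      state.step = "review";
    } else {
      // A declined approval, or exhausted retries, ends the workflow.
      state.step = "rejected";
    }
  }
}
```

In a LangGraph implementation the approval handler would presumably be an interrupt that pauses the graph until a human responds, with the state saved in a checkpoint in the meantime.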

## User Context Structure

Every interaction includes rich context:

```typescript
interface UserContext {
  userId: string;
  sessionId: string;
  license: UserLicense;

  // Multi-channel support
  activeChannel: {
    type: 'websocket' | 'telegram' | 'slack' | 'discord';
    channelUserId: string;
    capabilities: {
      supportsMarkdown: boolean;
      supportsImages: boolean;
      supportsButtons: boolean;
      maxMessageLength: number;
    };
  };

  // Retrieved from MCP + RAG
  conversationHistory: BaseMessage[];
  relevantMemories: MemoryChunk[];
  workspaceState: WorkspaceContext;
}
```
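The `capabilities` block is what lets one response be adapted to different channels. A minimal sketch of that adaptation follows; the `formatForChannel` helper and its deliberately crude markdown stripping are assumptions, not part of the harness.

```typescript
interface ChannelCapabilities {
  supportsMarkdown: boolean;
  supportsImages: boolean;
  supportsButtons: boolean;
  maxMessageLength: number;
}

// Crude fallback: drops emphasis asterisks and backticks only.
function stripMarkdown(text: string): string {
  return text.replace(/[*`]/g, "");
}

// Returns the message parts to send, each no longer than the channel allows.
function formatForChannel(text: string, caps: ChannelCapabilities): string[] {
  const body = caps.supportsMarkdown ? text : stripMarkdown(text);
  const limit = Math.max(1, caps.maxMessageLength);
  const parts: string[] = [];
  for (let i = 0; i < body.length; i += limit) {
    parts.push(body.slice(i, i + limit));
  }
  return parts.length > 0 ? parts : [""];
}
```

A production version would more likely split on paragraph or sentence boundaries rather than at a fixed character offset.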

## User-Specific Files and Tools

The user's MCP container provides access to:

**Indicators** (`indicators/*.py`)
- Custom technical indicators
- Pure functions: DataFrame → Series/DataFrame
- Version controlled in the user's git repo

**Strategies** (`strategies/*.py`)
- Trading strategies with entry/exit rules
- Position sizing and risk management
- Backtestable and deployable

**Watchlists**
- Saved ticker lists
- Market monitoring

**Preferences**
- Trading style and risk tolerance
- Chart settings and colors
- Notification preferences

**Executors** (sub-strategies)
- Tactical order generators (TWAP, iceberg, etc.)
- Smart order routing

## Global Knowledge Management

### Document Loading

At gateway startup:
1. DocumentLoader scans the `gateway/knowledge/` directory
2. Markdown files are chunked by headers (~1000 tokens/chunk)
3. Embeddings are generated via EmbeddingService
4. Chunks are stored in Qdrant with user_id="0"
5. Content hashing enables incremental updates
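Steps 2 and 5 can be sketched as follows: split a markdown file at its headers, hash each chunk, and re-embed only chunks whose hash is not already known. The function names are hypothetical, SHA-256 is an assumed choice of hash, and the sketch omits the ~1000-token size cap and does not skip `#` lines inside fenced code blocks.

```typescript
import { createHash } from "node:crypto";

interface KnowledgeChunk {
  heading: string;
  text: string;
  hash: string; // SHA-256 of the chunk text, hex encoded
}

// Start a new chunk at every ATX header (#, ##, ...).
function chunkByHeaders(markdown: string): KnowledgeChunk[] {
  const chunks: KnowledgeChunk[] = [];
  let heading = "";
  let lines: string[] = [];

  const flush = (): void => {
    const text = lines.join("\n").trim();
    if (text !== "") {
      chunks.push({ heading, text, hash: createHash("sha256").update(text).digest("hex") });
    }
    lines = [];
  };

  for (const line of markdown.split("\n")) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[1] ?? "";
    }
    lines.push(line);
  }
  flush();
  return chunks;
}

// Incremental update: only chunks with an unseen hash need new embeddings.
function changedChunks(chunks: KnowledgeChunk[], knownHashes: ReadonlySet<string>): KnowledgeChunk[] {
  return chunks.filter((c) => !knownHashes.has(c.hash));
}
```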

### Directory Structure

```
gateway/knowledge/
├── platform/         # Platform capabilities
├── trading/          # Trading fundamentals
├── indicators/       # Indicator development
└── strategies/       # Strategy patterns
```

### Updating Knowledge

**Development:**
```bash
curl -X POST http://localhost:3000/admin/reload-knowledge
```

**Production:**
- Update markdown files
- Deploy new version
- Auto-loaded on startup

**Monitoring:**
```bash
curl http://localhost:3000/admin/knowledge-stats
```

## Container Lifecycle

### User Container Creation

When a user connects:
1. Gateway checks if the container exists (ContainerManager)
2. If not, creates a Kubernetes pod with:
   - Agent container (Python + conda)
   - Lifecycle sidecar (container management)
   - Persistent volume (git repo)
3. Waits for the MCP server to be ready (~5-10s cold start)
4. Establishes MCP connection
5. Begins message processing

### Container Shutdown

- **Free users:** 15-minute idle timeout
- **Paid users:** Longer timeout based on license

**On shutdown:**
- Graceful save of all state
- Persistent storage retained
- Fast restart on next connection
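The idle-timeout rule can be expressed as a small pure function. The 15-minute value for free users comes from the text above; the paid value, the two-tier `LicenseTier` type, and the `shouldShutDown` name are placeholders for illustration.

```typescript
type LicenseTier = "free" | "paid";

const MINUTE_MS = 60 * 1000;

// free: from the documented 15-minute idle timeout; paid: placeholder value.
const IDLE_TIMEOUT_MS: Record<LicenseTier, number> = {
  free: 15 * MINUTE_MS,
  paid: 120 * MINUTE_MS,
};

// True when the container has been idle for at least its tier's timeout.
function shouldShutDown(tier: LicenseTier, lastActivityMs: number, nowMs: number): boolean {
  return nowMs - lastActivityMs >= IDLE_TIMEOUT_MS[tier];
}
```

Passing the current time in as an argument, rather than calling `Date.now()` inside, keeps the check easy to test.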

### MCP Authentication Modes

1. **Public Mode** (Free tier): No auth, read-only, anonymous session
2. **Gateway Auth** (Standard): Gateway authenticates, container trusts gateway
3. **Direct Auth** (Enterprise): User authenticates directly with container

## Implementation Status

### ✅ Completed
- Agent Harness with MCP integration
- Model routing with license tiers
- RAG retriever with Qdrant
- Document loader for global knowledge
- EmbeddingService (Ollama/OpenAI)
- Skills and subagents framework
- Multi-channel support (WebSocket, Telegram)
- Container lifecycle management
- Event system with ZeroMQ

### 🚧 In Progress
- Iceberg integration (checkpoint-saver, conversation-store)
- More subagents (risk-analyzer, market-analyst)
- LangGraph workflows with interrupts
- Platform tools (market data, charting)

### 📋 Planned
- File watcher for hot-reload in development
- Advanced RAG strategies (hybrid search, re-ranking)
- Caching layer for expensive operations
- Performance monitoring and metrics

## References

- Implementation: `gateway/src/harness/`
- Documentation: `gateway/src/harness/README.md`
- Knowledge base: `gateway/knowledge/`
- LangGraph: https://langchain-ai.github.io/langgraphjs/
- Qdrant: https://qdrant.tech/documentation/
- MCP Spec: https://modelcontextprotocol.io/