redesign fully scaffolded and web login works

2026-03-17 20:10:47 -04:00
parent b9cc397e05
commit f6bd22a8ef
143 changed files with 17317 additions and 693 deletions

doc/agent_harness.md Normal file

@@ -0,0 +1,392 @@
# Agent Harness Architecture
The Agent Harness is the core orchestration layer for the Dexorder AI platform, built on LangChain.js and LangGraph.js.
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────┐
│                      Gateway (Fastify)                      │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐     │
│  │  WebSocket   │   │   Telegram   │   │    Event     │     │
│  │   Handler    │   │   Handler    │   │    Router    │     │
│  └──────┬───────┘   └──────┬───────┘   └──────┬───────┘     │
│         │                  │                  │             │
│         └──────────────────┴──────────────────┘             │
│                            │                                │
│                    ┌───────▼────────┐                       │
│                    │ Agent Harness  │                       │
│                    │  (Stateless)   │                       │
│                    └───────┬────────┘                       │
│                            │                                │
│         ┌──────────────────┼──────────────────┐             │
│         │                  │                  │             │
│    ┌────▼─────┐       ┌────▼─────┐       ┌────▼─────┐       │
│    │   MCP    │       │   LLM    │       │   RAG    │       │
│    │ Connector│       │  Router  │       │ Retriever│       │
│    └────┬─────┘       └────┬─────┘       └────┬─────┘       │
│         │                  │                  │             │
└─────────┼──────────────────┼──────────────────┼─────────────┘
          │                  │                  │
          ▼                  ▼                  ▼
    ┌────────────┐     ┌───────────┐      ┌───────────┐
    │   User's   │     │    LLM    │      │  Qdrant   │
    │    MCP     │     │ Providers │      │ (Vectors) │
    │ Container  │     │(Anthropic,│      │           │
    │ (k8s pod)  │     │  OpenAI,  │      │ Global +  │
    │            │     │   etc)    │      │   User    │
    └────────────┘     └───────────┘      └───────────┘
```
## Message Processing Flow
When a user sends a message:
```
1. Gateway receives message via channel (WebSocket/Telegram)
2. Authenticator validates user and gets license info
3. Container Manager ensures user's MCP container is running
4. Agent Harness processes message:
   ├─→ a. MCPClientConnector fetches context resources:
   │      - context://user-profile
   │      - context://conversation-summary
   │      - context://workspace-state
   │      - context://system-prompt
   ├─→ b. RAGRetriever searches for relevant memories:
   │      - Embeds user query
   │      - Searches Qdrant: user_id = current_user OR user_id = "0"
   │      - Returns user-specific + global platform knowledge
   ├─→ c. Build system prompt:
   │      - Base platform prompt
   │      - User profile context
   │      - Workspace state
   │      - Custom user instructions
   │      - Relevant RAG memories
   ├─→ d. ModelRouter selects LLM:
   │      - Based on license tier
   │      - Query complexity
   │      - Configured routing strategy
   ├─→ e. LLM invocation with tool support:
   │      - Send messages to LLM
   │      - If tool calls requested:
   │        • Platform tools → handled by gateway
   │        • User tools → proxied to MCP container
   │      - Loop until no more tool calls
   ├─→ f. Save conversation to MCP:
   │      - mcp.callTool('save_message', user_message)
   │      - mcp.callTool('save_message', assistant_message)
   └─→ g. Return response to user via channel
```
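
Step 4e is the only part of the flow with a loop, so a small sketch may help. This is an illustration under stated assumptions, not the harness code: `ToolCall`, `LLMTurn`, `runToolLoop`, the string-based history, and the `maxRounds` safety cap are hypothetical names and choices introduced here.

```typescript
// Hypothetical shapes for illustration; the real harness builds on LangChain.js.
interface ToolCall { id: string; name: string; args: Record<string, unknown>; }
interface LLMTurn { text: string; toolCalls: ToolCall[]; }

type Invoke = (history: string[]) => Promise<LLMTurn>;
type ToolHandler = (call: ToolCall) => Promise<string>;

async function runToolLoop(
  invoke: Invoke,
  platformTools: Set<string>,
  handlePlatform: ToolHandler, // executed inside the gateway
  proxyToMcp: ToolHandler,     // forwarded to the user's MCP container
  userMessage: string,
  maxRounds = 8,               // assumed safety cap, not taken from the source
): Promise<string> {
  const history: string[] = [`user: ${userMessage}`];
  for (let round = 0; round < maxRounds; round++) {
    const turn = await invoke(history);
    // No tool calls requested: the model's text is the final answer.
    if (turn.toolCalls.length === 0) return turn.text;
    for (const call of turn.toolCalls) {
      // Platform tools stay in the gateway; everything else goes to the user's container.
      const handler = platformTools.has(call.name) ? handlePlatform : proxyToMcp;
      const output = await handler(call);
      history.push(`tool(${call.name}): ${output}`);
    }
  }
  throw new Error(`tool loop exceeded ${maxRounds} rounds`);
}
```

The cap guards against a model that keeps requesting tools indefinitely; whether the actual harness enforces one is not stated in this document.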
## Core Components
### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`)
**Stateless orchestrator**: the harness holds no state of its own; all state lives in the user's MCP container or in RAG storage.
**Responsibilities:**
- Fetch context from user's MCP resources
- Query RAG for relevant memories
- Build prompts with full context
- Route to appropriate LLM
- Handle tool calls (platform vs user)
- Save conversation back to MCP
- Stream responses to user
**Key Methods:**
- `handleMessage()`: Process single message (non-streaming)
- `streamMessage()`: Process with streaming response
- `initialize()`: Connect to user's MCP server
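
The prompt-building responsibility (step 4c of the message flow) could look roughly like the sketch below. `PromptParts` and `buildSystemPrompt` are hypothetical names, and the `##` section headers are an assumed formatting choice. Only the order of the sections comes from this document.

```typescript
// Hypothetical input shape; field order mirrors step 4c of the message flow.
interface PromptParts {
  basePrompt: string;
  userProfile?: string;
  workspaceState?: string;
  customInstructions?: string;
  memories: string[];
}

function buildSystemPrompt(parts: PromptParts): string {
  const memoryBlock = parts.memories.length > 0
    ? parts.memories.map((m) => `- ${m}`).join('\n')
    : undefined;
  const sections: Array<[string, string | undefined]> = [
    ['', parts.basePrompt],
    ['User profile', parts.userProfile],
    ['Workspace state', parts.workspaceState],
    ['Custom instructions', parts.customInstructions],
    ['Relevant memories', memoryBlock],
  ];
  return sections
    // Drop sections that are missing or blank so the prompt has no empty headers.
    .filter((s): s is [string, string] => Boolean(s[1]?.trim()))
    .map(([title, body]) => (title ? `## ${title}\n${body.trim()}` : body.trim()))
    .join('\n\n');
}
```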
### 2. MCP Client Connector (`gateway/src/harness/mcp-client.ts`)
Connects to the user's MCP container using the Model Context Protocol.
**Features:**
- Resource reading (context://, indicators://, strategies://)
- Tool execution (save_message, run_backtest, etc.)
- Automatic reconnection on container restarts
- Error handling and fallbacks
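
As one possible reading of "error handling and fallbacks", the sketch below fetches the four context resources in parallel and substitutes an empty string for any read that fails. `ResourceReader` is a deliberately simplified, hypothetical interface, not the MCP SDK's client type, and the empty-string fallback is an assumption.

```typescript
// Hypothetical minimal view of an MCP client: only a resource read is needed here.
interface ResourceReader {
  readResource(uri: string): Promise<string>;
}

// Resource URIs listed in the message processing flow.
const CONTEXT_URIS = [
  'context://user-profile',
  'context://conversation-summary',
  'context://workspace-state',
  'context://system-prompt',
] as const;

async function fetchContext(reader: ResourceReader): Promise<Record<string, string>> {
  // allSettled lets one failing resource degrade gracefully instead of failing the request.
  const results = await Promise.allSettled(CONTEXT_URIS.map((uri) => reader.readResource(uri)));
  const context: Record<string, string> = {};
  results.forEach((result, i) => {
    context[CONTEXT_URIS[i]] = result.status === 'fulfilled' ? result.value : '';
  });
  return context;
}
```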
### 3. Model Router (`gateway/src/llm/router.ts`)
Routes each query to an appropriate LLM based on:
- **License tier**: Free users → smaller models, paid users → more capable models
- **Complexity**: Simple queries → fast models, complex → powerful models
- **Cost optimization**: Balance performance vs cost
**Routing Strategies:**
- `COST`: Minimize cost
- `COMPLEXITY`: Match model to query complexity
- `SPEED`: Prioritize fast responses
- `QUALITY`: Best available model
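
To make the four strategies concrete, here is a minimal selection sketch. The strategy names come from the list above. Everything else is assumed for illustration: the two-level `LicenseTier`, the `ModelSpec` fields, the placeholder model names and numbers, and the 0..1 complexity score.

```typescript
type RoutingStrategy = 'COST' | 'COMPLEXITY' | 'SPEED' | 'QUALITY';
type LicenseTier = 'free' | 'paid'; // hypothetical tier names

interface ModelSpec {
  name: string;      // placeholder names, not real model IDs
  costPer1k: number; // relative cost
  latencyMs: number; // typical latency
  quality: number;   // 1 (weakest) .. 10 (strongest)
  minTier: LicenseTier;
}

const MODELS: ModelSpec[] = [
  { name: 'small-fast', costPer1k: 1, latencyMs: 300, quality: 4, minTier: 'free' },
  { name: 'mid', costPer1k: 5, latencyMs: 800, quality: 7, minTier: 'free' },
  { name: 'large', costPer1k: 20, latencyMs: 2000, quality: 10, minTier: 'paid' },
];

function selectModel(strategy: RoutingStrategy, tier: LicenseTier, complexity: number): ModelSpec {
  // The license tier restricts the candidate set before any strategy is applied.
  const allowed = MODELS.filter((m) => tier === 'paid' || m.minTier === 'free');
  const lowestBy = (key: (m: ModelSpec) => number, pool: ModelSpec[] = allowed) =>
    [...pool].sort((a, b) => key(a) - key(b))[0];

  switch (strategy) {
    case 'COST': return lowestBy((m) => m.costPer1k);
    case 'SPEED': return lowestBy((m) => m.latencyMs);
    case 'QUALITY': return lowestBy((m) => -m.quality);
    case 'COMPLEXITY': {
      // Cheapest model whose quality covers the estimated need; otherwise the best allowed.
      const needed = Math.ceil(complexity * 10);
      const capable = allowed.filter((m) => m.quality >= needed);
      return capable.length > 0 ? lowestBy((m) => m.costPer1k, capable) : lowestBy((m) => -m.quality);
    }
  }
}
```

In this sketch a free-tier request with `QUALITY` resolves to `mid`, because `large` is filtered out by tier before the strategy runs.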
### 4. Memory Layer
#### Three-Tier Storage:
**Redis** (Hot Storage)
- Active session state
- Recent conversation history (last 50 messages)
- LangGraph checkpoints (1 hour TTL)
- Fast reads for active conversations
**Qdrant** (Vector Search)
- Conversation embeddings
- User-specific memories (user_id = actual user ID)
- **Global platform knowledge** (user_id = "0")
- RAG retrieval with cosine similarity
- GDPR-compliant (indexed by user_id for fast deletion)
**Iceberg** (Cold Storage)
- Full conversation history (partitioned by user_id, session_id)
- Checkpoint snapshots for replay
- Analytics and time-travel queries
- GDPR-compliant with compaction
#### RAG System:
**Global Knowledge** (user_id="0"):
- Platform capabilities and architecture
- Trading concepts and fundamentals
- Indicator development guides
- Strategy patterns and examples
- Loaded from `gateway/knowledge/` markdown files
**User Knowledge** (user_id=specific user):
- Personal conversation history
- Trading preferences and style
- Custom indicators and strategies
- Workspace state and context
**Query Flow:**
1. User query is embedded using EmbeddingService
2. Qdrant searches: `user_id IN (current_user, "0")`
3. Top-K relevant chunks returned
4. Added to LLM context automatically
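
The query flow can be sketched in two parts. `buildUserFilter` produces a filter in Qdrant's JSON filter syntax (a `match` condition with `any`) that expresses `user_id IN (current_user, "0")`. `retrieve` is an in-memory stand-in for the vector search, included only to show the filter-then-rank logic; it is not the Qdrant client. `MemoryChunk` here is a simplified, hypothetical shape.

```typescript
// Simplified, hypothetical chunk shape for illustration.
interface MemoryChunk { userId: string; text: string; vector: number[]; }

const GLOBAL_USER_ID = '0'; // global platform knowledge, as described above

// Qdrant filter: keep points whose user_id payload is the current user or the global ID.
function buildUserFilter(userId: string) {
  return { must: [{ key: 'user_id', match: { any: [userId, GLOBAL_USER_ID] } }] };
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory equivalent of steps 2 and 3: filter by owner, rank by cosine similarity, keep top K.
function retrieve(query: number[], chunks: MemoryChunk[], userId: string, topK = 3): MemoryChunk[] {
  return chunks
    .filter((c) => c.userId === userId || c.userId === GLOBAL_USER_ID)
    .map((c) => ({ chunk: c, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.chunk);
}
```

Chunks owned by other users never reach the ranking step, which is the isolation property the `user_id` filter provides.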
### 5. Skills vs Subagents
#### Skills (`gateway/src/harness/skills/`)
**Use for**: Well-defined, specific tasks
- Market analysis
- Strategy validation
- Single-purpose capabilities
- Defined in markdown + TypeScript
**Structure:**
```typescript
class MarketAnalysisSkill extends BaseSkill {
  async execute(context, parameters) {
    // Implementation
  }
}
```
#### Subagents (`gateway/src/harness/subagents/`)
**Use for**: Complex domain expertise with context
- Code reviewer with review guidelines
- Risk analyzer with risk models
- Multi-file knowledge base in memory/ directory
- Custom system prompts
**Structure:**
```
subagents/
  code-reviewer/
    config.yaml          # Model, memory files, capabilities
    system-prompt.md     # Specialized instructions
    memory/
      review-guidelines.md
      common-patterns.md
      best-practices.md
    index.ts
```
**Recommendation**: Prefer skills for most tasks. Use subagents when you need:
- Substantial domain-specific knowledge
- Multi-file context management
- Specialized system prompts
### 6. Workflows (`gateway/src/harness/workflows/`)
LangGraph state machines for multi-step orchestration:
**Features:**
- Validation loops (retry with fixes)
- Human-in-the-loop (approval gates)
- Error recovery
- State persistence via checkpoints
**Example Workflows:**
- Strategy validation: review → backtest → risk → approval
- Trading request: analysis → risk → approval → execute
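
The strategy validation workflow can be pictured as a small state machine with a retry loop and an approval gate. The sketch below is plain TypeScript and deliberately does not use the LangGraph API. `Step`, `WorkflowState`, `runStrategyValidation`, and the `maxAttempts` limit are hypothetical, and "go back to review on failure" is an assumed interpretation of "retry with fixes".

```typescript
const ORDER = ['review', 'backtest', 'risk', 'approval'] as const;
type ActiveStep = (typeof ORDER)[number];
type Step = ActiveStep | 'done' | 'rejected';

interface WorkflowState { step: Step; attempts: number; log: string[]; }

// A check returns true when its step passes. The approval check stands in for a human decision.
type Check = (state: WorkflowState) => boolean;

function runStrategyValidation(checks: Record<ActiveStep, Check>, maxAttempts = 3): WorkflowState {
  const state: WorkflowState = { step: 'review', attempts: 0, log: [] };
  while (state.step !== 'done' && state.step !== 'rejected') {
    const current: ActiveStep = state.step;
    const passed = checks[current](state);
    state.log.push(`${current}:${passed ? 'ok' : 'fail'}`);
    if (passed) {
      const next = ORDER.indexOf(current) + 1;
      state.step = next < ORDER.length ? ORDER[next] : 'done';
    } else if (current === 'approval' || ++state.attempts >= maxAttempts) {
      state.step = 'rejected'; // the human declined, or retries are exhausted
    } else {
      state.step = 'review';   // validation loop: start over with fixes applied
    }
  }
  return state;
}
```

In a LangGraph workflow the `state` object would be what gets checkpointed between steps, and the approval step would be an interrupt rather than a synchronous callback.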
## User Context Structure
Every interaction includes rich context:
```typescript
interface UserContext {
  userId: string;
  sessionId: string;
  license: UserLicense;

  // Multi-channel support
  activeChannel: {
    type: 'websocket' | 'telegram' | 'slack' | 'discord';
    channelUserId: string;
    capabilities: {
      supportsMarkdown: boolean;
      supportsImages: boolean;
      supportsButtons: boolean;
      maxMessageLength: number;
    };
  };

  // Retrieved from MCP + RAG
  conversationHistory: BaseMessage[];
  relevantMemories: MemoryChunk[];
  workspaceState: WorkspaceContext;
}
```
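
One use of the `capabilities` block is adapting a response to the active channel before sending it. The sketch below is an assumed example of that idea, not code from the gateway: `splitForChannel` is a hypothetical helper, the markdown stripping is intentionally crude, and it assumes `maxMessageLength` is positive.

```typescript
// Mirrors the capabilities fields of UserContext.activeChannel.
interface ChannelCapabilities {
  supportsMarkdown: boolean;
  supportsImages: boolean;
  supportsButtons: boolean;
  maxMessageLength: number; // assumed to be > 0
}

function splitForChannel(text: string, caps: ChannelCapabilities): string[] {
  // Crude fallback for plain-text channels: drop emphasis and code markers only.
  let rest = caps.supportsMarkdown ? text : text.replace(/[*`]/g, '');
  const parts: string[] = [];
  while (rest.length > caps.maxMessageLength) {
    // Prefer to break at the last newline or space that still fits.
    const head = rest.slice(0, caps.maxMessageLength);
    const cut = Math.max(head.lastIndexOf('\n'), head.lastIndexOf(' '));
    const end = cut > 0 ? cut : caps.maxMessageLength;
    parts.push(rest.slice(0, end).trimEnd());
    rest = rest.slice(end).trimStart();
  }
  if (rest.length > 0) parts.push(rest);
  return parts;
}
```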
## User-Specific Files and Tools
The user's MCP container provides access to:
**Indicators** (`indicators/*.py`)
- Custom technical indicators
- Pure functions: DataFrame → Series/DataFrame
- Version controlled in user's git repo
**Strategies** (`strategies/*.py`)
- Trading strategies with entry/exit rules
- Position sizing and risk management
- Backtestable and deployable
**Watchlists**
- Saved ticker lists
- Market monitoring
**Preferences**
- Trading style and risk tolerance
- Chart settings and colors
- Notification preferences
**Executors** (sub-strategies)
- Tactical order generators (TWAP, iceberg, etc.)
- Smart order routing
## Global Knowledge Management
### Document Loading
At gateway startup:
1. DocumentLoader scans `gateway/knowledge/` directory
2. Markdown files chunked by headers (~1000 tokens/chunk)
3. Embeddings generated via EmbeddingService
4. Stored in Qdrant with user_id="0"
5. Content hashing enables incremental updates
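
Steps 2 and 5 can be illustrated together: split on headers, hash each chunk, and re-embed only chunks whose hash is new. This is a simplified sketch with hypothetical names (`Chunk`, `chunkByHeaders`, `chunksToEmbed`). It omits the ~1000-token size cap, does not special-case `#` lines inside fenced code, and the choice of SHA-256 is an assumption.

```typescript
import { createHash } from 'node:crypto';

interface Chunk { heading: string; text: string; hash: string; }

function chunkByHeaders(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let heading = '';
  let lines: string[] = [];

  // Close the current chunk, skipping sections that have no body text.
  const flush = () => {
    const text = lines.join('\n').trim();
    if (text) {
      const hash = createHash('sha256').update(`${heading}\n${text}`).digest('hex');
      chunks.push({ heading, text, hash });
    }
    lines = [];
  };

  for (const line of markdown.split('\n')) {
    const match = /^#{1,6}\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[1].trim();
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}

// Incremental update: only chunks with an unseen hash need new embeddings.
function chunksToEmbed(chunks: Chunk[], storedHashes: Set<string>): Chunk[] {
  return chunks.filter((c) => !storedHashes.has(c.hash));
}
```

After an edit to one section, only that section's chunk changes hash, so a reload touches one embedding instead of the whole file.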
### Directory Structure
```
gateway/knowledge/
├── platform/ # Platform capabilities
├── trading/ # Trading fundamentals
├── indicators/ # Indicator development
└── strategies/ # Strategy patterns
```
### Updating Knowledge
**Development:**
```bash
curl -X POST http://localhost:3000/admin/reload-knowledge
```
**Production:**
- Update markdown files
- Deploy new version
- Auto-loaded on startup
**Monitoring:**
```bash
curl http://localhost:3000/admin/knowledge-stats
```
## Container Lifecycle
### User Container Creation
When a user connects:
1. Gateway checks whether the container exists (ContainerManager)
2. If not, it creates a Kubernetes pod with:
   - Agent container (Python + conda)
   - Lifecycle sidecar (container management)
   - Persistent volume (git repo)
3. Waits for MCP server ready (~5-10s cold start)
4. Establishes MCP connection
5. Begins message processing
### Container Shutdown
- **Free users:** 15 minutes idle timeout
- **Paid users:** Longer timeout based on license
**On shutdown:**
- Graceful save of all state
- Persistent storage retained
- Fast restart on next connection
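
The idle check itself is simple. In this sketch the 15-minute free-tier value comes from the text above, while the paid value, the two-level `Tier` type, and the `shouldShutDown` helper are placeholders assumed for illustration.

```typescript
type Tier = 'free' | 'paid'; // hypothetical tier names

const IDLE_TIMEOUT_MS: Record<Tier, number> = {
  free: 15 * 60_000,  // from this document
  paid: 120 * 60_000, // placeholder; the real value depends on the license
};

// True once the container has been idle for at least its tier's timeout.
function shouldShutDown(tier: Tier, lastActivityMs: number, nowMs: number): boolean {
  return nowMs - lastActivityMs >= IDLE_TIMEOUT_MS[tier];
}
```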
### MCP Authentication Modes
1. **Public Mode** (Free tier): No auth, read-only, anonymous session
2. **Gateway Auth** (Standard): Gateway authenticates, container trusts gateway
3. **Direct Auth** (Enterprise): User authenticates directly with container
## Implementation Status
### ✅ Completed
- Agent Harness with MCP integration
- Model routing with license tiers
- RAG retriever with Qdrant
- Document loader for global knowledge
- EmbeddingService (Ollama/OpenAI)
- Skills and subagents framework
- Multi-channel support (WebSocket, Telegram)
- Container lifecycle management
- Event system with ZeroMQ
### 🚧 In Progress
- Iceberg integration (checkpoint-saver, conversation-store)
- More subagents (risk-analyzer, market-analyst)
- LangGraph workflows with interrupts
- Platform tools (market data, charting)
### 📋 Planned
- File watcher for hot-reload in development
- Advanced RAG strategies (hybrid search, re-ranking)
- Caching layer for expensive operations
- Performance monitoring and metrics
## References
- Implementation: `gateway/src/harness/`
- Documentation: `gateway/src/harness/README.md`
- Knowledge base: `gateway/knowledge/`
- LangGraph: https://langchain-ai.github.io/langgraphjs/
- Qdrant: https://qdrant.tech/documentation/
- MCP Spec: https://modelcontextprotocol.io/