redesign fully scaffolded and web login works

2026-03-17 20:10:47 -04:00
parent b9cc397e05
commit f6bd22a8ef
143 changed files with 17317 additions and 693 deletions
--- a/doc/agent_harness.md
+++ b/doc/agent_harness.md
@@ -0,0 +1,392 @@
+# Agent Harness Architecture
+
+The Agent Harness is the core orchestration layer for the Dexorder AI platform, built on LangChain.js and LangGraph.js.
+
+## Architecture Overview
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    Gateway (Fastify)                        │
+│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
+│  │  WebSocket   │  │  Telegram    │  │  Event       │       │
+│  │  Handler     │  │  Handler     │  │  Router      │       │
+│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘       │
+│         │                 │                 │               │
+│         └─────────────────┴─────────────────┘               │
+│                           │                                 │
+│                   ┌───────▼────────┐                        │
+│                   │ Agent Harness  │                        │
+│                   │  (Stateless)   │                        │
+│                   └───────┬────────┘                        │
+│                           │                                 │
+│         ┌─────────────────┼─────────────────┐               │
+│         │                 │                 │               │
+│    ┌────▼─────┐      ┌────▼─────┐      ┌────▼─────┐         │
+│    │   MCP    │      │   LLM    │      │   RAG    │         │
+│    │ Connector│      │  Router  │      │ Retriever│         │
+│    └────┬─────┘      └────┬─────┘      └────┬─────┘         │
+│         │                 │                 │               │
+└─────────┼─────────────────┼─────────────────┼───────────────┘
+          │                 │                 │
+          ▼                 ▼                 ▼
+   ┌────────────┐     ┌───────────┐     ┌───────────┐
+   │   User's   │     │    LLM    │     │  Qdrant   │
+   │    MCP     │     │ Providers │     │ (Vectors) │
+   │ Container  │     │(Anthropic,│     │           │
+   │ (k8s pod)  │     │  OpenAI,  │     │  Global + │
+   │            │     │   etc)    │     │   User    │
+   └────────────┘     └───────────┘     └───────────┘
+```
+
+## Message Processing Flow
+
+When a user sends a message:
+
+```
+1. Gateway receives message via channel (WebSocket/Telegram)
+   ↓
+2. Authenticator validates user and gets license info
+   ↓
+3. Container Manager ensures user's MCP container is running
+   ↓
+4. Agent Harness processes message:
+   │
+   ├─→ a. MCPClientConnector fetches context resources:
+   │      - context://user-profile
+   │      - context://conversation-summary
+   │      - context://workspace-state
+   │      - context://system-prompt
+   │
+   ├─→ b. RAGRetriever searches for relevant memories:
+   │      - Embeds user query
+   │      - Searches Qdrant: user_id = current_user OR user_id = "0"
+   │      - Returns user-specific + global platform knowledge
+   │
+   ├─→ c. Build system prompt:
+   │      - Base platform prompt
+   │      - User profile context
+   │      - Workspace state
+   │      - Custom user instructions
+   │      - Relevant RAG memories
+   │
+   ├─→ d. ModelRouter selects LLM:
+   │      - Based on license tier
+   │      - Query complexity
+   │      - Configured routing strategy
+   │
+   ├─→ e. LLM invocation with tool support:
+   │      - Send messages to LLM
+   │      - If tool calls requested:
+   │         • Platform tools → handled by gateway
+   │         • User tools → proxied to MCP container
+   │      - Loop until no more tool calls
+   │
+   ├─→ f. Save conversation to MCP:
+   │      - mcp.callTool('save_message', user_message)
+   │      - mcp.callTool('save_message', assistant_message)
+   │
+   └─→ g. Return response to user via channel
+```
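+
+Step 4e is the part of this flow that benefits most from a concrete sketch. The version below is a minimal illustration, not the harness implementation: `ToolCall`, `LlmReply`, `ToolHandlers`, `runToolLoop`, and the `maxRounds` cap are all hypothetical names and choices introduced here. It shows platform tools being handled in the gateway, all other tools being proxied to the user's MCP container, and the loop ending once the model stops requesting tools.
+
+```typescript
+// Hypothetical shapes for illustration; the real gateway types may differ.
+interface ToolCall { name: string; args: Record<string, unknown>; }
+interface LlmReply { text: string; toolCalls: ToolCall[]; }
+type Message = { role: "user" | "assistant" | "tool"; content: string };
+
+interface ToolHandlers {
+  platformTools: Set<string>;                       // tool names handled inside the gateway
+  runPlatformTool(call: ToolCall): Promise<string>;
+  runUserTool(call: ToolCall): Promise<string>;     // proxied to the user's MCP container
+}
+
+async function runToolLoop(
+  invoke: (messages: Message[]) => Promise<LlmReply>,
+  handlers: ToolHandlers,
+  messages: Message[],
+  maxRounds = 8,                                    // assumed safety cap, not from the source
+): Promise<string> {
+  for (let round = 0; round < maxRounds; round++) {
+    const reply = await invoke(messages);
+    if (reply.toolCalls.length === 0) return reply.text; // no more tool calls: done
+    messages.push({ role: "assistant", content: reply.text });
+    for (const call of reply.toolCalls) {
+      // Route by tool name: platform tools locally, everything else to the container.
+      const result = handlers.platformTools.has(call.name)
+        ? await handlers.runPlatformTool(call)
+        : await handlers.runUserTool(call);
+      messages.push({ role: "tool", content: result });
+    }
+  }
+  throw new Error(`tool loop exceeded ${maxRounds} rounds`);
+}
+```
+
+The round cap is a design choice of this sketch: without one, a model that keeps requesting tools would keep the request open indefinitely.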
+
+## Core Components
+
+### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`)
+
+**Stateless orchestrator**: the harness keeps no state of its own. All state lives in the user's MCP container or in the memory layer (Redis, Qdrant, Iceberg).
+
+**Responsibilities:**
+- Fetch context from user's MCP resources
+- Query RAG for relevant memories
+- Build prompts with full context
+- Route to appropriate LLM
+- Handle tool calls (platform vs user)
+- Save conversation back to MCP
+- Stream responses to user
+
+**Key Methods:**
+- `handleMessage()`: Process single message (non-streaming)
+- `streamMessage()`: Process with streaming response
+- `initialize()`: Connect to user's MCP server
+
+### 2. MCP Client Connector (`gateway/src/harness/mcp-client.ts`)
+
+Connects to the user's MCP container using the Model Context Protocol (MCP).
+
+**Features:**
+- Resource reading (context://, indicators://, strategies://)
+- Tool execution (save_message, run_backtest, etc.)
+- Automatic reconnection on container restarts
+- Error handling and fallbacks
+
+### 3. Model Router (`gateway/src/llm/router.ts`)
+
+Routes each query to an appropriate LLM based on:
+- **License tier**: Free users → smaller models, paid → better models
+- **Complexity**: Simple queries → fast models, complex → powerful models
+- **Cost optimization**: Balance performance vs cost
+
+**Routing Strategies:**
+- `COST`: Minimize cost
+- `COMPLEXITY`: Match model to query complexity
+- `SPEED`: Prioritize fast responses
+- `QUALITY`: Best available model
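+
+To make the interaction between license tier and routing strategy concrete, here is a minimal sketch. The model IDs, cost and latency figures, the two-value `Tier` type, and `selectModel` itself are placeholders invented for this example, not the gateway's configuration. The tier narrows the candidate set first, and the strategy then picks among what remains.
+
+```typescript
+// All names and numbers below are placeholders, not the gateway's actual configuration.
+type Strategy = "COST" | "COMPLEXITY" | "SPEED" | "QUALITY";
+type Tier = "free" | "paid";
+
+interface ModelSpec {
+  id: string;
+  costPerMTok: number; // relative cost
+  latencyMs: number;   // typical latency
+  quality: number;     // 1 (weak) to 10 (strong)
+  minTier: Tier;
+}
+
+const MODELS: ModelSpec[] = [
+  { id: "small-fast", costPerMTok: 1, latencyMs: 300, quality: 4, minTier: "free" },
+  { id: "mid-general", costPerMTok: 5, latencyMs: 800, quality: 7, minTier: "free" },
+  { id: "large-best", costPerMTok: 20, latencyMs: 2000, quality: 10, minTier: "paid" },
+];
+
+function selectModel(tier: Tier, strategy: Strategy, complexity: number): ModelSpec {
+  // The license tier filters the candidate set before any strategy applies.
+  const allowed = MODELS.filter((m) => tier === "paid" || m.minTier === "free");
+  // Pick the allowed model with the smallest value of `key`.
+  const by = (key: (m: ModelSpec) => number) =>
+    [...allowed].sort((a, b) => key(a) - key(b))[0];
+  switch (strategy) {
+    case "COST": return by((m) => m.costPerMTok);
+    case "SPEED": return by((m) => m.latencyMs);
+    case "QUALITY": return by((m) => -m.quality);
+    case "COMPLEXITY": {
+      // Cheapest model whose quality covers the estimated complexity (1 to 10),
+      // falling back to the strongest allowed model when none qualifies.
+      const capable = allowed.filter((m) => m.quality >= complexity);
+      const pool = capable.length > 0 ? capable : [by((m) => -m.quality)];
+      return [...pool].sort((a, b) => a.costPerMTok - b.costPerMTok)[0];
+    }
+  }
+}
+```
+
+With this table, a free user asking for `QUALITY` gets `mid-general`, because `large-best` is filtered out before the strategy runs.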
+
+### 4. Memory Layer
+
+#### Three-Tier Storage:
+
+**Redis** (Hot Storage)
+- Active session state
+- Recent conversation history (last 50 messages)
+- LangGraph checkpoints (1 hour TTL)
+- Fast reads for active conversations
+
+**Qdrant** (Vector Search)
+- Conversation embeddings
+- User-specific memories (user_id = actual user ID)
+- **Global platform knowledge** (user_id = "0")
+- RAG retrieval with cosine similarity
+- Supports GDPR erasure requests (payload indexed by user_id for fast deletion)
+
+**Iceberg** (Cold Storage)
+- Full conversation history (partitioned by user_id, session_id)
+- Checkpoint snapshots for replay
+- Analytics and time-travel queries
+- Supports GDPR erasure requests (deletions are made physical by compaction)
+
+#### RAG System:
+
+**Global Knowledge** (user_id="0"):
+- Platform capabilities and architecture
+- Trading concepts and fundamentals
+- Indicator development guides
+- Strategy patterns and examples
+- Loaded from `gateway/knowledge/` markdown files
+
+**User Knowledge** (user_id=specific user):
+- Personal conversation history
+- Trading preferences and style
+- Custom indicators and strategies
+- Workspace state and context
+
+**Query Flow:**
+1. User query is embedded using EmbeddingService
+2. Qdrant searches: `user_id IN (current_user, "0")`
+3. Top-K relevant chunks returned
+4. Added to LLM context automatically
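+
+The filter in step 2 maps onto Qdrant's `should` clause, which acts as a logical OR over its conditions. The sketch below builds a search request body under the assumption that each point stores a `user_id` payload field as described above; `buildMemorySearch`, the `topK` default, and the interface names are illustrative and not taken from the RAGRetriever code.
+
+```typescript
+// Sketch only: buildMemorySearch is a hypothetical helper, not the gateway's RAGRetriever.
+const GLOBAL_USER_ID = "0"; // global platform knowledge, per the convention above
+
+interface QdrantCondition { key: string; match: { value: string }; }
+interface QdrantSearchBody {
+  vector: number[];
+  limit: number;
+  with_payload: boolean;
+  filter: { should: QdrantCondition[] };
+}
+
+function buildMemorySearch(queryVector: number[], userId: string, topK = 5): QdrantSearchBody {
+  // `should` is an OR: a point matches if any one of the conditions holds.
+  const ids = userId === GLOBAL_USER_ID ? [GLOBAL_USER_ID] : [userId, GLOBAL_USER_ID];
+  return {
+    vector: queryVector,
+    limit: topK,
+    with_payload: true,
+    filter: { should: ids.map((id) => ({ key: "user_id", match: { value: id } })) },
+  };
+}
+```
+
+Taking `userId` from the authenticated session rather than from message content is what keeps one user's memories out of another user's results.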
+
+### 5. Skills vs Subagents
+
+#### Skills (`gateway/src/harness/skills/`)
+
+**Use for**: Well-defined, specific tasks
+- Market analysis
+- Strategy validation
+- Single-purpose capabilities
+- Defined in markdown + TypeScript
+
+**Structure:**
+```typescript
+class MarketAnalysisSkill extends BaseSkill {
+  async execute(context, parameters) {
+    // Implementation
+  }
+}
+```
+
+#### Subagents (`gateway/src/harness/subagents/`)
+
+**Use for**: Complex domain expertise with context
+- Code reviewer with review guidelines
+- Risk analyzer with risk models
+- Multi-file knowledge base in memory/ directory
+- Custom system prompts
+
+**Structure:**
+```
+subagents/
+  code-reviewer/
+    config.yaml              # Model, memory files, capabilities
+    system-prompt.md         # Specialized instructions
+    memory/
+      review-guidelines.md
+      common-patterns.md
+      best-practices.md
+    index.ts
+```
+
+**Recommendation**: Prefer skills for most tasks. Use subagents when you need:
+- Substantial domain-specific knowledge
+- Multi-file context management
+- Specialized system prompts
+
+### 6. Workflows (`gateway/src/harness/workflows/`)
+
+LangGraph state machines for multi-step orchestration:
+
+**Features:**
+- Validation loops (retry with fixes)
+- Human-in-the-loop (approval gates)
+- Error recovery
+- State persistence via checkpoints
+
+**Example Workflows:**
+- Strategy validation: review → backtest → risk → approval
+- Trading request: analysis → risk → approval → execute
+
+## User Context Structure
+
+Every interaction carries a `UserContext` object:
+
+```typescript
+interface UserContext {
+  userId: string;
+  sessionId: string;
+  license: UserLicense;
+
+  // Multi-channel support
+  activeChannel: {
+    type: 'websocket' | 'telegram' | 'slack' | 'discord';
+    channelUserId: string;
+    capabilities: {
+      supportsMarkdown: boolean;
+      supportsImages: boolean;
+      supportsButtons: boolean;
+      maxMessageLength: number;
+    };
+  };
+
+  // Retrieved from MCP + RAG
+  conversationHistory: BaseMessage[];
+  relevantMemories: MemoryChunk[];
+  workspaceState: WorkspaceContext;
+}
+```
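+
+As one example of how channel capabilities might be used, the sketch below splits a long response to fit `maxMessageLength`. The helper `splitForChannel` is hypothetical; only the `maxMessageLength` field comes from the interface above. It prefers to break at a newline and falls back to a hard cut.
+
+```typescript
+// Hypothetical helper; only maxMessageLength comes from UserContext above.
+function splitForChannel(text: string, maxMessageLength: number): string[] {
+  const limit = Math.max(1, Math.floor(maxMessageLength)); // guard against zero or negative limits
+  const parts: string[] = [];
+  let rest = text;
+  while (rest.length > limit) {
+    // Prefer the last newline that still fits; otherwise cut exactly at the limit.
+    const cut = rest.lastIndexOf("\n", limit);
+    const end = cut > 0 ? cut : limit;
+    parts.push(rest.slice(0, end));
+    rest = rest.slice(end).replace(/^\n/, ""); // drop the newline we split on
+  }
+  if (rest.length > 0) parts.push(rest);
+  return parts;
+}
+```
+
+Telegram's Bot API, for instance, limits a text message to 4096 characters, so a long analysis would be delivered there as several messages.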
+
+## User-Specific Files and Tools
+
+The user's MCP container provides access to:
+
+**Indicators** (`indicators/*.py`)
+- Custom technical indicators
+- Pure functions: DataFrame → Series/DataFrame
+- Version controlled in user's git repo
+
+**Strategies** (`strategies/*.py`)
+- Trading strategies with entry/exit rules
+- Position sizing and risk management
+- Backtestable and deployable
+
+**Watchlists**
+- Saved ticker lists
+- Market monitoring
+
+**Preferences**
+- Trading style and risk tolerance
+- Chart settings and colors
+- Notification preferences
+
+**Executors** (sub-strategies)
+- Tactical order generators (TWAP, iceberg, etc.)
+- Smart order routing
+
+## Global Knowledge Management
+
+### Document Loading
+
+At gateway startup:
+1. DocumentLoader scans `gateway/knowledge/` directory
+2. Markdown files are chunked by header (~1000 tokens per chunk)
+3. Embeddings are generated via EmbeddingService
+4. Chunks are stored in Qdrant with user_id="0"
+5. Content hashes allow incremental updates: unchanged chunks are not re-embedded
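+
+Steps 2 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the DocumentLoader code: `chunkByHeaders`, `chunksToEmbed`, and the `Chunk` shape are invented here, and FNV-1a stands in for whatever content hash the loader really uses. The sketch also ignores the ~1000-token size target and would misread a `#` line inside a fenced code block as a header.
+
+```typescript
+// Illustrative only: the real DocumentLoader may chunk and hash differently.
+interface Chunk { heading: string; text: string; hash: string; }
+
+// FNV-1a (32-bit, over UTF-16 code units): a dependency-free stand-in for a real content hash.
+function fnv1a(input: string): string {
+  let h = 0x811c9dc5;
+  for (let i = 0; i < input.length; i++) {
+    h ^= input.charCodeAt(i);
+    h = Math.imul(h, 0x01000193) >>> 0;
+  }
+  return (h >>> 0).toString(16).padStart(8, "0");
+}
+
+function chunkByHeaders(markdown: string): Chunk[] {
+  const chunks: Chunk[] = [];
+  let heading = "";
+  let lines: string[] = [];
+  // Close the current chunk, skipping sections with no body text.
+  const flush = () => {
+    const text = lines.join("\n").trim();
+    if (text.length > 0) chunks.push({ heading, text, hash: fnv1a(heading + "\n" + text) });
+    lines = [];
+  };
+  for (const line of markdown.split("\n")) {
+    const m = /^#{1,6}\s+(.*)$/.exec(line);
+    if (m) { flush(); heading = m[1]; } else { lines.push(line); }
+  }
+  flush();
+  return chunks;
+}
+
+// Only chunks whose hash is not already stored need to be embedded again.
+function chunksToEmbed(chunks: Chunk[], storedHashes: Set<string>): Chunk[] {
+  return chunks.filter((c) => !storedHashes.has(c.hash));
+}
+```
+
+Hashing the heading together with the body means that renaming a section also triggers a re-embed, which is usually what is wanted because the heading is part of the retrievable context.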
+
+### Directory Structure
+
+```
+gateway/knowledge/
+  ├── platform/          # Platform capabilities
+  ├── trading/           # Trading fundamentals
+  ├── indicators/        # Indicator development
+  └── strategies/        # Strategy patterns
+```
+
+### Updating Knowledge
+
+**Development:**
+```bash
+curl -X POST http://localhost:3000/admin/reload-knowledge
+```
+
+**Production:**
+- Update markdown files
+- Deploy new version
+- Auto-loaded on startup
+
+**Monitoring:**
+```bash
+curl http://localhost:3000/admin/knowledge-stats
+```
+
+## Container Lifecycle
+
+### User Container Creation
+
+When a user connects:
+1. Gateway checks if container exists (ContainerManager)
+2. If not, creates Kubernetes pod with:
+   - Agent container (Python + conda)
+   - Lifecycle sidecar (container management)
+   - Persistent volume (git repo)
+3. Waits for the MCP server to become ready (~5-10 s cold start)
+4. Establishes MCP connection
+5. Begins message processing
+
+### Container Shutdown
+
+- **Free users:** 15-minute idle timeout
+- **Paid users:** Longer timeout based on license
+
+**On shutdown:**
+- Graceful save of all state
+- Persistent storage retained
+- Fast restart on next connection
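+
+The idle-timeout rule reduces to a single comparison. In the sketch below only the 15-minute free-tier value comes from this document; the paid value is a placeholder, and `LicenseTier`, `IDLE_TIMEOUT_MS`, and `shouldShutDown` are hypothetical names.
+
+```typescript
+// Hypothetical policy table: only the 15-minute free-tier value comes from the text above.
+type LicenseTier = "free" | "paid";
+
+const IDLE_TIMEOUT_MS: Record<LicenseTier, number> = {
+  free: 15 * 60 * 1000,
+  paid: 60 * 60 * 1000, // placeholder; the real value depends on the license
+};
+
+// True once the container has been idle for at least its tier's timeout.
+function shouldShutDown(tier: LicenseTier, lastActivityMs: number, nowMs: number): boolean {
+  return nowMs - lastActivityMs >= IDLE_TIMEOUT_MS[tier];
+}
+```
+
+Passing `nowMs` in explicitly, rather than calling `Date.now()` inside the function, keeps the check easy to test.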
+
+### MCP Authentication Modes
+
+1. **Public Mode** (Free tier): No auth, read-only, anonymous session
+2. **Gateway Auth** (Standard): Gateway authenticates, container trusts gateway
+3. **Direct Auth** (Enterprise): User authenticates directly with container
+
+## Implementation Status
+
+### ✅ Completed
+- Agent Harness with MCP integration
+- Model routing with license tiers
+- RAG retriever with Qdrant
+- Document loader for global knowledge
+- EmbeddingService (Ollama/OpenAI)
+- Skills and subagents framework
+- Multi-channel support (WebSocket, Telegram)
+- Container lifecycle management
+- Event system with ZeroMQ
+
+### 🚧 In Progress
+- Iceberg integration (checkpoint-saver, conversation-store)
+- More subagents (risk-analyzer, market-analyst)
+- LangGraph workflows with interrupts
+- Platform tools (market data, charting)
+
+### 📋 Planned
+- File watcher for hot-reload in development
+- Advanced RAG strategies (hybrid search, re-ranking)
+- Caching layer for expensive operations
+- Performance monitoring and metrics
+
+## References
+
+- Implementation: `gateway/src/harness/`
+- Documentation: `gateway/src/harness/README.md`
+- Knowledge base: `gateway/knowledge/`
+- LangGraph: https://langchain-ai.github.io/langgraphjs/
+- Qdrant: https://qdrant.tech/documentation/
+- MCP Spec: https://modelcontextprotocol.io/