feat: add @tag model override support and remove Qdrant dependencies

- Add model-tags parser for @Tag syntax in chat messages
- Support Anthropic models (Sonnet, Haiku, Opus) via @tag
- Remove Qdrant vector database from infrastructure and configs
- Simplify license model config to use null fallbacks
- Add greeting stream after model switch via @tag
- Fix protobuf field names to camelCase for v7 compatibility
- Add 429 rate limit retry logic with exponential backoff
- Remove RAG references from agent harness documentation
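
As a rough illustration of the @tag override described above, the sketch below pulls a leading tag off a chat message and resolves it to a model. It is not the parser added in this commit: the function name `parse_model_tag`, the `MODEL_TAGS` table and its placeholder identifiers, and the rule that a tag only counts as the first token are all assumptions made for the example.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical tag table. The commit names the three Anthropic families but not
# the identifiers they resolve to, so these values are placeholders.
MODEL_TAGS = {
    "sonnet": "anthropic:sonnet",
    "haiku": "anthropic:haiku",
    "opus": "anthropic:opus",
}

# Assumed rule: a tag is recognised only as the first token of the message.
TAG_PATTERN = re.compile(r"^\s*@(\w+)\b\s*")


class ParsedMessage(NamedTuple):
    model: Optional[str]  # None means no override, so default routing applies
    text: str             # message text with a recognised tag stripped


def parse_model_tag(message: str) -> ParsedMessage:
    """Split a leading @tag from a chat message, matching case-insensitively."""
    match = TAG_PATTERN.match(message)
    if match is None:
        return ParsedMessage(None, message)
    model = MODEL_TAGS.get(match.group(1).lower())
    if model is None:
        # Unknown tags (for example an @mention) stay in the text untouched.
        return ParsedMessage(None, message)
    return ParsedMessage(model, message[match.end():])
```

Returning `None` for the model when no known tag is present lets the caller fall through to its normal model selection instead of treating a missing tag as an error.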
2026-04-27 20:55:18 -04:00
parent 6f937f9e5e
commit d41fcd0499
50 changed files with 956 additions and 798 deletions

@@ -19,7 +19,6 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
│ • Authentication & session management │
│ • Agent Harness (LangChain/LangGraph orchestration) │
│ - MCP client connector to user containers │
│ - RAG retriever (Qdrant) │
│ - Model router (LLM selection) │
│ - Skills & subagents framework │
│ • Dynamic user container provisioning │
@@ -30,8 +29,7 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
┌──────────────────┐ ┌──────────────┐ ┌──────────────────────┐
│ User Containers │ │ Relay │ │ Infrastructure │
│ (per-user pods) │ │ (ZMQ Router) │ │ • DragonflyDB (cache)│
│ │ │ │ │ • Qdrant (vectors) │
│ • MCP Server │ │ • Market data│ │ • PostgreSQL (meta) │
│ • MCP Server │ │ • Market data│ │ • PostgreSQL (meta) │
│ • User files: │ │ fanout │ │ • MinIO (S3) │
│ - Indicators │ │ • Work queue │ │ │
│ - Strategies │ │ • Stateless │ │ │
@@ -86,18 +84,16 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
- **Agent Harness (LangChain/LangGraph):** ([[agent_harness]])
- Stateless LLM orchestration
- MCP client connector to user containers
- RAG retrieval from Qdrant (global + user-specific knowledge)
- Model routing based on license tier and complexity
- Skills and subagents framework
- Workflow state machines with validation loops
**Key Features:**
- **Stateless design:** All conversation state lives in user containers or Qdrant
- **Stateless design:** All conversation state lives in user containers
- **Multi-channel support:** WebSocket, Telegram (future: mobile, Discord, Slack)
- **Kubernetes-native:** Uses k8s API for container management
- **Three-tier memory:**
- Redis: Hot storage, active sessions, LangGraph checkpoints (1 hour TTL)
- Qdrant: Vector search, RAG, global + user knowledge, GDPR-compliant
- Iceberg: Cold storage, full history, analytics, time-travel queries
**Infrastructure:**
@@ -270,12 +266,6 @@ Exchange API → Ingestor → Kafka → Flink → Iceberg
- Redis-compatible in-memory cache
- Session state, rate limiting, hot data
#### Qdrant
- Vector database for RAG
- **Global knowledge** (user_id="0"): Platform capabilities, trading concepts, strategy patterns
- **User knowledge** (user_id=specific): Personal conversations, preferences, strategies
- GDPR-compliant (indexed by user_id for fast deletion)
#### PostgreSQL
- Iceberg catalog metadata
- User accounts and license info (gateway)
@@ -458,17 +448,11 @@ The gateway's agent harness (LangChain/LangGraph) orchestrates LLM interactions
│ - context://workspace-state
│ - context://system-prompt
├─→ b. RAGRetriever searches Qdrant for relevant memories:
│ - Embeds user query
│ - Searches: user_id IN (current_user, "0")
│ - Returns user-specific + global platform knowledge
├─→ c. Build system prompt:
├─→ b. Build system prompt:
│ - Base platform prompt
│ - User profile context
│ - Workspace state
│ - Custom user instructions
│ - Relevant RAG memories
├─→ d. ModelRouter selects LLM:
│ - Based on license tier
@@ -492,8 +476,6 @@ The gateway's agent harness (LangChain/LangGraph) orchestrates LLM interactions
**Key Architecture:**
- **Gateway is stateless:** No conversation history stored in gateway
- **User context in MCP:** All user-specific data lives in user's container
- **Global knowledge in Qdrant:** Platform documentation loaded from `gateway/knowledge/`
- **RAG at gateway level:** Semantic search combines global + user knowledge
- **Skills vs Subagents:**
- Skills: Well-defined, single-purpose tasks
- Subagents: Complex domain expertise with multi-file context
@@ -630,7 +612,6 @@ See [[backend_redesign]] for detailed notes.
- Historical backfill service
**Phase 3: Agent Features**
- RAG integration (Qdrant)
- Strategy backtesting
- Risk management tools
- Portfolio analytics