Dexorder Gateway

Multi-channel gateway with agent harness for the Dexorder AI platform.

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Platform Gateway                      │
│                   (Node.js/Fastify)                      │
│                                                          │
│  ┌────────────────────────────────────────────────┐    │
│  │  Channels                                       │    │
│  │  - WebSocket (/ws/chat)                         │    │
│  │  - Telegram Webhook (/webhook/telegram)        │    │
│  └────────────────────────────────────────────────┘    │
│                         ↕                                │
│  ┌────────────────────────────────────────────────┐    │
│  │  Authenticator                                  │    │
│  │  - JWT verification (WebSocket)                 │    │
│  │  - Channel linking (Telegram)                   │    │
│  │  - User license lookup (PostgreSQL)             │    │
│  └────────────────────────────────────────────────┘    │
│                         ↕                                │
│  ┌────────────────────────────────────────────────┐    │
│  │  Agent Harness (per-session)                    │    │
│  │  - Claude API integration                       │    │
│  │  - MCP client connector                         │    │
│  │  - Conversation state                           │    │
│  └────────────────────────────────────────────────┘    │
│                         ↕                                │
│  ┌────────────────────────────────────────────────┐    │
│  │  MCP Client                                     │    │
│  │  - User container connection                    │    │
│  │  - Tool routing                                 │    │
│  └────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────┘
                          ↕
          ┌─────────────────────────────────┐
          │  User MCP Server (Python)       │
          │  - Strategies, indicators       │
          │  - Memory, preferences          │
          │  - Backtest sandbox             │
          └─────────────────────────────────┘

Features

  • Automatic container provisioning: Creates user agent containers on-demand via Kubernetes
  • Multi-channel support: WebSocket and Telegram webhooks
  • Per-channel authentication: JWT for web, channel linking for chat apps
  • User license management: Feature flags and resource limits from PostgreSQL
  • Container lifecycle management: Auto-shutdown on idle (handled by container sidecar)
  • License-based resources: Different memory/CPU/storage limits per tier
  • Multi-model LLM support: Anthropic Claude, OpenAI GPT, Google Gemini, OpenRouter (300+ models)
  • Zero vendor lock-in: Switch models with one line, powered by LangChain.js
  • Intelligent routing: Auto-select models based on complexity, license tier, or user preference
  • Streaming responses: Real-time chat with WebSocket and Telegram
  • Complex workflows: LangGraph for stateful trading analysis (backtest → risk → approval)
  • Agent harness: Stateless orchestrator (all context lives in user's MCP container)
  • MCP resource integration: User's RAG, conversation history, and preferences
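
The "intelligent routing" feature can be made concrete with a small sketch. The tier names match the license tiers described below, but the model IDs and the complexity threshold are illustrative, not the gateway's actual routing table:

```typescript
// Illustrative model router: picks a model from license tier and a
// complexity score in [0, 1]. The model IDs and the 0.7 threshold are
// examples only, not the gateway's real configuration.
type Tier = "free" | "pro" | "enterprise";

function selectModel(tier: Tier, complexity: number): string {
  if (tier === "free") return "claude-3-5-haiku-20241022"; // free tier stays on a small model
  if (tier === "enterprise" || complexity > 0.7) return "claude-3-5-sonnet-20241022";
  return "claude-3-5-haiku-20241022"; // simple pro-tier tasks also use the small model
}
```

In practice the routing table would live in configuration (or per-user preferences) rather than in code, so tiers and models can change without a redeploy.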

Container Management

When a user authenticates, the gateway:

  1. Checks for existing container: Queries Kubernetes for deployment
  2. Creates if missing: Renders YAML template based on license tier
  3. Waits for ready: Polls deployment status until healthy
  4. Returns MCP endpoint: Computed from service name
  5. Connects to MCP server: Proceeds with normal authentication flow
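
The five steps above can be sketched as follows. The KubeClient interface, the template-rendering call, and the endpoint format are assumptions standing in for the gateway's actual Kubernetes integration:

```typescript
// Sketch of the provisioning flow, with the Kubernetes calls abstracted
// behind an assumed KubeClient interface.
interface KubeClient {
  getDeployment(name: string): Promise<{ readyReplicas: number } | null>;
  applyTemplate(name: string, tier: string): Promise<void>;
}

async function ensureUserContainer(
  kube: KubeClient,
  userId: string,
  tier: "free" | "pro" | "enterprise",
  pollMs = 1000,
  maxPolls = 60
): Promise<string> {
  const name = `agent-${userId}`; // deployment naming scheme is hypothetical

  // 1. Check for an existing deployment
  let dep = await kube.getDeployment(name);

  // 2. Create if missing, rendering the tier-specific template
  if (dep === null) {
    await kube.applyTemplate(name, tier);
  }

  // 3. Poll until the deployment reports a ready replica
  for (let i = 0; i < maxPolls; i++) {
    dep = await kube.getDeployment(name);
    if (dep && dep.readyReplicas > 0) {
      // 4./5. Return the MCP endpoint computed from the service name;
      // the caller then connects and continues the auth flow.
      return `http://${name}.user-agents.svc.cluster.local:8080/mcp`;
    }
    await new Promise((r) => setTimeout(r, pollMs));
  }
  throw new Error(`container ${name} not ready after ${maxPolls} polls`);
}
```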

Container templates by license tier:

Tier        Memory  CPU    Storage  Idle Timeout
Free        512Mi   500m   1Gi      15min
Pro         2Gi     2000m  10Gi     60min
Enterprise  4Gi     4000m  50Gi     Never

Containers self-manage their lifecycle using the lifecycle sidecar (see ../lifecycle-sidecar/).
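
As a rough illustration of how the table maps onto a rendered template, a Free-tier container might carry a fragment like the following (the IDLE_TIMEOUT variable name is hypothetical, consumed by the lifecycle sidecar):

```yaml
# Hypothetical fragment of the Free-tier container template.
# Values come from the table above; names are illustrative.
resources:
  requests:
    memory: "512Mi"
    cpu: "500m"
  limits:
    memory: "512Mi"
    cpu: "500m"
env:
- name: IDLE_TIMEOUT
  value: "15m"
```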

Setup

Prerequisites

  • Node.js >= 22.0.0
  • PostgreSQL database
  • At least one LLM provider API key:
    • Anthropic Claude
    • OpenAI GPT
    • Google Gemini
    • OpenRouter (one key for 300+ models)
  • Ollama (for embeddings): https://ollama.com/download
  • Redis (for session/hot storage)
  • Qdrant (for RAG vector search)
  • Kafka + Flink + Iceberg (for durable storage)

Development

  1. Install dependencies:
npm install
  2. Copy environment template:
cp .env.example .env
  3. Configure .env (see .env.example):
DATABASE_URL=postgresql://postgres:postgres@localhost:5432/dexorder

# Configure at least one provider
ANTHROPIC_API_KEY=sk-ant-xxxxx
# OPENAI_API_KEY=sk-xxxxx
# GOOGLE_API_KEY=xxxxx
# OPENROUTER_API_KEY=sk-or-xxxxx

# Optional: Set default model
DEFAULT_MODEL_PROVIDER=anthropic
DEFAULT_MODEL=claude-3-5-sonnet-20241022
  4. Start Ollama and pull embedding model:
# Install Ollama (one-time): https://ollama.com/download
# Or with Docker: docker run -d -p 11434:11434 ollama/ollama

# Pull the all-minilm embedding model (90MB, CPU-friendly)
ollama pull all-minilm

# Alternative models:
# ollama pull nomic-embed-text  # 8K context length
# ollama pull mxbai-embed-large  # Higher accuracy, slower
  5. Run development server:
npm run dev

Production Build

npm run build
npm start

Docker

docker build -t dexorder/gateway:latest .
docker run -p 3000:3000 --env-file .env dexorder/gateway:latest

Database Schema

Required PostgreSQL tables (migrations will be documented separately):

user_licenses

  • user_id (text, primary key)
  • email (text)
  • license_type (text: 'free', 'pro', 'enterprise')
  • features (jsonb)
  • resource_limits (jsonb)
  • mcp_server_url (text)
  • expires_at (timestamp, nullable)
  • created_at (timestamp)
  • updated_at (timestamp)

user_channel_links (table name assumed; maps chat-app identities to platform users)
  • id (serial, primary key)
  • user_id (text, foreign key)
  • channel_type (text: 'telegram', 'slack', 'discord')
  • channel_user_id (text)
  • created_at (timestamp)

API Endpoints

WebSocket

GET /ws/chat

  • WebSocket connection for web client
  • Auth: Bearer token in headers
  • Protocol: JSON messages

Example:

// Node.js example using the ws package (a browser WebSocket cannot set
// custom headers, so web clients must pass the token differently).
import WebSocket from 'ws';

const ws = new WebSocket('ws://localhost:3000/ws/chat', {
  headers: {
    'Authorization': 'Bearer your-jwt-token'
  }
});

ws.on('message', (data) => {
  const msg = JSON.parse(data.toString());
  console.log(msg);
});

// Send only after the connection opens
ws.on('open', () => {
  ws.send(JSON.stringify({
    type: 'message',
    content: 'Hello, AI!'
  }));
});

Telegram Webhook

POST /webhook/telegram

  • Telegram bot webhook endpoint
  • Auth: Telegram user linked to platform user
  • Automatically processes incoming messages
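
The first step of that processing can be sketched as pulling the fields the gateway needs out of a Telegram Update payload. Field names follow the Telegram Bot API; the ChatMessage shape and helper name are assumptions:

```typescript
// Hypothetical helper: extract the channel user id and message text
// from a Telegram Update object (Telegram Bot API field names).
interface ChatMessage {
  channelUserId: string; // Telegram numeric user id, stringified
  text: string;
}

function parseTelegramUpdate(update: any): ChatMessage | null {
  const msg = update?.message;
  if (!msg || typeof msg.text !== "string") return null; // ignore edits, stickers, etc.
  const id = msg.from?.id ?? msg.chat?.id;
  if (id === undefined) return null;
  return { channelUserId: String(id), text: msg.text };
}
```

The returned channelUserId is what the authenticator would look up in the channel-links table to resolve the platform user.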

Health Check

GET /health

  • Returns server health status

Ollama Deployment Options

The gateway requires Ollama for embedding generation in RAG queries. You have two deployment options:

Option 1: Ollama In-Container

Install Ollama directly in the gateway container. This keeps all dependencies local and simplifies networking.

Dockerfile additions:

FROM node:22-slim

# Install Ollama (curl and ca-certificates are not in the slim image)
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    curl -fsSL https://ollama.com/install.sh | sh && \
    rm -rf /var/lib/apt/lists/*

# Pull embedding model at build time (kill by PID; pkill is not in the slim image)
RUN ollama serve & \
    OLLAMA_PID=$! && \
    sleep 5 && \
    ollama pull all-minilm && \
    kill $OLLAMA_PID

# ... rest of your gateway Dockerfile

Start script (entrypoint.sh):

#!/bin/bash
# Start Ollama in background
ollama serve &

# Start gateway
node dist/main.js

Pros:

  • Simple networking (localhost:11434)
  • No extra K8s resources
  • Self-contained deployment

Cons:

  • Larger container image (~200MB extra)
  • CPU/memory shared with gateway process

Resource requirements:

  • Add +200MB memory
  • Add +0.2 CPU cores for embedding inference

Option 2: Ollama as Separate Pod/Sidecar

Deploy Ollama as a separate container in the same pod (sidecar) or as its own deployment.

K8s Deployment (sidecar pattern):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gateway
spec:
  selector:
    matchLabels:
      app: gateway
  template:
    metadata:
      labels:
        app: gateway
    spec:
      containers:
      - name: gateway
        image: ghcr.io/dexorder/gateway:latest
        env:
        - name: OLLAMA_URL
          value: http://localhost:11434

      - name: ollama
        image: ollama/ollama:latest
        command: ["/bin/sh", "-c"]
        args:
          - |
            ollama serve &
            sleep 5
            ollama pull all-minilm
            wait
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"

K8s Deployment (separate service):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        # ... same as above
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
spec:
  selector:
    app: ollama
  ports:
  - port: 11434

Gateway .env:

OLLAMA_URL=http://ollama:11434

Pros:

  • Isolated resource limits
  • Can scale separately
  • Easier to monitor/debug

Cons:

  • More K8s resources
  • Network hop (minimal latency)
  • More complex deployment

Recommendation

For most deployments: Use Option 1 (in-container) for simplicity, unless you need to:

  • Share Ollama across multiple services
  • Scale embedding inference independently
  • Run Ollama on GPU nodes (gateway on CPU nodes)
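
Whichever option you pick, the gateway reaches Ollama through OLLAMA_URL. A sketch of the embedding call against Ollama's /api/embeddings endpoint (the embedText helper and hard-coded all-minilm model are illustrative):

```typescript
// Build the request separately from sending it, so the URL/body logic
// is easy to test without a running Ollama instance.
function buildEmbeddingRequest(base: string, model: string, text: string) {
  return {
    url: `${base}/api/embeddings`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt: text }),
    },
  };
}

// Illustrative helper: embed one text via Ollama (Node >= 22 has global fetch).
async function embedText(text: string): Promise<number[]> {
  const { url, init } = buildEmbeddingRequest(
    process.env.OLLAMA_URL ?? "http://localhost:11434",
    "all-minilm",
    text
  );
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Ollama embedding failed: ${res.status}`);
  const { embedding } = (await res.json()) as { embedding: number[] };
  return embedding;
}
```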

TODO

  • Implement JWT verification with JWKS
  • Implement MCP HTTP/SSE transport
  • Add rate limiting per user license
  • Add message usage tracking
  • Add streaming responses for WebSocket
  • Add Slack and Discord channel handlers
  • Add session cleanup/timeout logic