This is not implemented yet; these are just notes for Tim
Overview
We need a realtime data system that is scalable and durable, so we have the following architecture:
- Protobufs over ZeroMQ for data streaming
- Ingestors
- Realtime data subscriptions (tick data)
- Historical data queries (OHLC)
- Everything pushes to Kafka topics
- Kafka
- Durable append logs for incoming and in-process data
- Topics maintained by Flink in redesign/flink/src/main/resources/topics.yaml
- Flink
- Raw ingestor streams are read from Kafka
- Deduplication
- Builds OHLCs
- Apache Iceberg
- Historical data storage
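The Flink stage above turns raw tick streams into OHLC bars. A minimal Python sketch of that aggregation logic (the `Tick` shape and the fixed bar period are assumptions, not the real protobuf schema):

```python
from dataclasses import dataclass

@dataclass
class Tick:
    # Hypothetical tick shape; the real protobuf schema may differ.
    time: float    # epoch seconds
    price: float
    volume: float

def build_ohlc(ticks, period=60):
    """Bucket ticks into fixed-width bars keyed by bar start time."""
    bars = {}
    for t in sorted(ticks, key=lambda t: t.time):
        start = int(t.time // period) * period
        bar = bars.get(start)
        if bar is None:
            bars[start] = {"open": t.price, "high": t.price,
                           "low": t.price, "close": t.price,
                           "volume": t.volume}
        else:
            bar["high"] = max(bar["high"], t.price)
            bar["low"] = min(bar["low"], t.price)
            bar["close"] = t.price
            bar["volume"] += t.volume
    return bars
```

In the real pipeline this would be a Flink windowed aggregation over the deduplicated Kafka stream; the sketch only shows the per-bar math.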
Configuration
All systems should use two YAML configuration files that are mounted by k8s from a ConfigMap and/or Secrets. Keep secrets separate from config.
When a configuration or secrets item is needed, describe it in redesign/doc/config.md
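One way the two-file split could look (file names, keys, and values below are illustrative, not decided):

```yaml
# config.yaml — mounted from a ConfigMap (non-sensitive)
kafka:
  bootstrap_servers: kafka:9092
zmq:
  pub_endpoint: tcp://0.0.0.0:5556

# secrets.yaml — mounted from a Secret, never in the ConfigMap
exchange_api:
  binance_api_key: "<from-secret>"
  binance_api_secret: "<from-secret>"
```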
Ingest
Ingestion API
- all symbols
- exchange id (BINANCE)
- market_id (BTC/USDT)
- market_type
- Spot
- description (Bitcoin/Tether on Binance)
- column names (['open', 'high', 'low', 'close', 'volume', 'taker_vol', 'maker_vol'])
- name
- exchange
- base asset
- quote asset
- earliest time
- tick size
- supported periods
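The per-symbol metadata above could travel as one record; a sketch where the field names follow the list and the types are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class SymbolInfo:
    # Field names mirror the list above; types are assumptions.
    exchange_id: str           # e.g. "BINANCE"
    market_id: str             # e.g. "BTC/USDT"
    market_type: str           # e.g. "spot"
    description: str           # e.g. "Bitcoin/Tether on Binance"
    name: str
    base_asset: str
    quote_asset: str
    earliest_time: int         # epoch seconds of earliest available data
    tick_size: float
    supported_periods: list = field(default_factory=lambda: ["1m", "1h", "1d"])
    columns: list = field(default_factory=lambda:
        ["open", "high", "low", "close", "volume", "taker_vol", "maker_vol"])
```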
-
Centralized data streaming backend
- Ingestion of tick, ohlc, news, etc. into Kafka by worker gatherers
- Flink with:
- zmq pubsub
- (seq, time) key for every row in a tick series
- every series also has seq->time and time->seq indexes
- Sequence tick series with strict sequence numbers AND a time index (seq can just be an auto-incrementing row counter)
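The (seq, time) keying above implies two lookups per series. A sketch of both indexes, assuming seq is a dense 0-based auto-increment (so seq->time is a plain list and time->seq is a binary search over non-decreasing times):

```python
import bisect

class TickIndex:
    """seq -> time and time -> seq lookups for one tick series.
    Assumes seq is a dense 0-based row counter and time is non-decreasing."""
    def __init__(self):
        self.times = []           # position == seq, value == time

    def append(self, time):
        seq = len(self.times)     # seq is just the row counter
        self.times.append(time)
        return seq

    def time_of(self, seq):
        return self.times[seq]

    def seq_at_or_after(self, time):
        # First seq whose time >= the requested time.
        return bisect.bisect_left(self.times, time)
```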
-
Historical data
- Apache Iceberg
- Clients query here first
- Backfill service
- Apache Iceberg
-
Quote Server
- Realtime current prices for selected quote currencies
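A quote server along these lines is essentially a last-value cache keyed by market; a minimal sketch (the Kafka consumer and thread-safety are omitted, and the market-id format is assumed to be "BASE/QUOTE"):

```python
class QuoteServer:
    """Last-value cache of current prices for selected quote currencies."""
    def __init__(self, quote_currencies):
        self.quote_currencies = set(quote_currencies)
        self.prices = {}                       # market_id -> last price

    def on_tick(self, market_id, price):
        # Only track markets quoted in a selected currency, e.g. "BTC/USDT".
        quote = market_id.split("/")[-1]
        if quote in self.quote_currencies:
            self.prices[market_id] = price

    def current(self, market_id):
        return self.prices.get(market_id)
```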
-
Workspace
- Current chart, indicators, drawings, etc.
- Always in context, so it must be brief. Data series are stored as references, not the actual data.
-
Analysis
- Analysis engines are short-running and always tied to a user
- Free users lose their pod and data when the session times out
- Conda available with many preinstalled packages
- Pip & Conda configured to install
- Src dir r/w with git
- Indicators
- Strategies
- Analysis
-
Request Context
- User ID
- Workspace ID
- Channel
- Telegram
- Web
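The request context could be one small record attached to every request; a sketch (types and validation are assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RequestContext:
    # Carried on every request; names follow the list above.
    user_id: str
    workspace_id: str
    channel: str        # "telegram" or "web"

    def __post_init__(self):
        if self.channel not in ("telegram", "web"):
            raise ValueError(f"unknown channel: {self.channel}")
```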
-
Website
- Current vue site
-
Gateway
- Websocket gateway
- Authentication
- User Featureset / License Info added to requests/headers
- Relays data pub/sub to web/mobile clients
- Routes agent chat to/from user container
- Authentication
- Active channel features
- TV Chart
- Text chat
- Plot out
- Voice/Audio
- Static file server
- Kafka
- Temp Gateway files (image responses, etc.)
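The gateway's job of stamping featureset/license info onto requests can be sketched as a small middleware step (the header names and the shape of `license_info` are illustrative, not a fixed contract):

```python
def enrich_headers(headers, license_info):
    """Add user featureset / license info to the request headers.
    Header names here are hypothetical placeholders."""
    out = dict(headers)   # don't mutate the caller's headers
    out["X-User-Features"] = ",".join(sorted(license_info["features"]))
    out["X-License-Tier"] = license_info["tier"]
    return out
```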
-
Logs
- Kafka
- Strategy Logs
- Order/Execution Logs
- Chat Logs
- User ID Topic has TTL based on license
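The license-based TTL on per-user log topics could be resolved like this; Kafka applies retention via the topic's `retention.ms` config, while the tier names and durations below are placeholders:

```python
# Placeholder retention values; real tiers/durations TBD.
LOG_TTL_DAYS = {"free": 7, "pro": 90, "enterprise": 365}

def log_retention_ms(license_tier):
    """retention.ms value for a user's log topic, by license tier."""
    days = LOG_TTL_DAYS.get(license_tier, LOG_TTL_DAYS["free"])
    return days * 24 * 60 * 60 * 1000
```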
-
Agent Framework
- Soul file
- Tool set (incl subagents)
- LLM choice
- RAG namespace
- Agents
- Top-level coordinator
- TradingView skill
- Indicators, Drawings, Annotations
- Research Agent
- Pandas/Polars analysis
- Plot generation
-
License Manager
-
Kafka Topics Doc w/ schemas
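The topics doc could start from the same shape as the Flink-maintained topics.yaml; a hypothetical entry (topic names, partition counts, and schema paths are illustrative):

```yaml
# Illustrative entries for redesign/flink/src/main/resources/topics.yaml
topics:
  - name: ticks.raw
    partitions: 12
    config:
      cleanup.policy: delete
    schema: proto/tick.proto      # protobuf message per record
  - name: ohlc.1m
    partitions: 12
    schema: proto/ohlc.proto
```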