# aiignore

This is not implemented yet; these are just notes for Tim.

# Overview

We need a realtime data system that is scalable and durable, so we have the following architecture:

* Protobufs over ZeroMQ for data streaming
* Ingestors
  * Realtime data subscriptions (tick data)
  * Historical data queries (OHLC)
  * Everything pushes to Kafka topics
* Kafka
  * Durable append logs for incoming and in-process data
  * Topics maintained by Flink in redesign/flink/src/main/resources/topics.yaml
* Flink
  * Raw ingestor streams are read from Kafka
  * Deduplication
  * Builds OHLCs
* Apache Iceberg
  * Historical data storage

# Configuration

All systems should use two YAML configuration files that are mounted by k8s from a ConfigMap and/or Secrets. Keep secrets separate from config. When a configuration or secrets item is needed, describe it in redesign/doc/config.md.

# Ingest

Ingestion API

* all symbols
  * exchange id (BINANCE)
  * market_id (BTC/USDT)
  * market_type
    * Spot
  * description (Bitcoin/Tether on Binance)
  * column names (['open', 'high', 'low', 'close', 'volume', 'taker_vol', 'maker_vol'])
  * name
  * exchange
  * base asset
  * quote asset
  * earliest time
  * tick size
  * supported periods
* Centralized data streaming backend
  * Ingestion of tick, OHLC, news, etc. into Kafka by worker gatherers
  * Flink with:
    * zmq pubsub
    * (seq, time) key for every row in a tick series
    * every series also has seq->time and time->seq indexes
    * Sequence tickers with strict seqs AND a time index (seq can just be a row-counter autoincrement)
* Historical data
  * Apache Iceberg
  * Clients query here first
  * Backfill service
* Quote Server
  * Realtime current prices for selected quote currencies
* Workspace
  * Current chart, indicators, drawings, etc.
  * Always in context, so it must be brief. Data series are a reference, not the actual data.
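The (seq, time) keying described under Ingest can be sketched as follows. This is a minimal illustration, not the Flink implementation: the class name, field layout, and in-memory dict indexes are assumptions; in production the indexes would live alongside the Kafka/Iceberg storage.

```python
class TickSeries:
    """Sketch of a tick series keyed by (seq, time).

    seq is a strict row-counter autoincrement; two side indexes map
    seq -> time and time -> seq so rows can be looked up in either
    direction. (Illustrative only; names are assumptions.)
    """

    def __init__(self) -> None:
        self._rows: list[tuple[int, int, float]] = []  # (seq, time_ns, price)
        self._seq_to_time: dict[int, int] = {}
        self._time_to_seq: dict[int, int] = {}  # first seq seen at each timestamp

    def append(self, time_ns: int, price: float) -> int:
        seq = len(self._rows)  # strict autoincrement row counter
        self._rows.append((seq, time_ns, price))
        self._seq_to_time[seq] = time_ns
        # Several ticks can share a timestamp; keep the first seq for that time.
        self._time_to_seq.setdefault(time_ns, seq)
        return seq

    def time_of(self, seq: int) -> int:
        return self._seq_to_time[seq]

    def seq_at(self, time_ns: int) -> int:
        return self._time_to_seq[time_ns]
```

Note the design point the notes imply: seq gives a total order even when multiple ticks arrive with the same timestamp, which is why both indexes are kept.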
* Analysis
  * Analysis engines are short-running and always tied to a user
  * Free users lose their pod and data when the session times out
  * Conda available with many preinstalled packages
  * Pip & Conda configured to install
  * Src dir r/w with git
    * Indicators
    * Strategies
    * Analysis
* Request Context
  * User ID
  * Workspace ID
  * Channel
    * Telegram
    * Web
* Website
  * Current Vue site
* Gateway
  * Websocket gateway
  * Authentication
  * User featureset / license info added to requests/headers
  * Relays data pub/sub to web/mobile clients
  * Routes agent chat to/from the user container
  * Active channel features
    * TV Chart
    * Text chat
    * Plot out
    * Voice/Audio
  * Static file server
    * Kafka
    * Temp Gateway files (image responses, etc.)
* Logs
  * Kafka
    * Strategy logs
    * Order/Execution logs
    * Chat logs
  * User ID topic has a TTL based on license
* Agent Framework
  * Soul file
  * Tool set (incl. subagents)
  * LLM choice
  * RAG namespace
* Agents
  * Top-level coordinator
  * TradingView skill
    * Indicators, Drawings, Annotations
  * Research Agent
    * Pandas/Polars analysis
    * Plot generation
* License Manager
* Kafka Topics Doc w/ schemas
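The Agent Framework bullet (soul file, tool set, LLM choice, RAG namespace) could be captured as a small spec record. This is a sketch under assumptions: the field names, paths, and the example coordinator values are all hypothetical, not part of the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """One agent definition in the framework outlined above.

    The notes list four pieces per agent: a soul file, a tool set
    (including subagents), an LLM choice, and a RAG namespace.
    Field names and example values here are assumptions.
    """

    name: str
    soul_file: str          # path to the agent's persona/instruction file
    tools: tuple[str, ...]  # tool names, including subagent handles
    llm: str                # model identifier for this agent
    rag_namespace: str      # isolates this agent's retrieval corpus


# Hypothetical top-level coordinator wiring the TradingView and Research agents.
coordinator = AgentSpec(
    name="coordinator",
    soul_file="souls/coordinator.md",
    tools=("tradingview_skill", "research_agent"),
    llm="example-model",
    rag_namespace="user-default",
)
```

A frozen dataclass keeps the spec immutable, which suits configuration that is loaded once per user container and shared across requests.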