# ZeroMQ Protocol Architecture

Our data transfer protocol uses ZeroMQ with Protobufs. Every message carries a small two-frame envelope: the first frame holds a single protocol version byte, and the second frame holds a one-byte message type ID followed immediately by the protobuf payload.

OHLC periods are represented as seconds.

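As a small illustration of the seconds convention, the sketch below maps a few common candle sizes to period values. The names `PERIOD_SECONDS` and `period_label` are hypothetical and not part of the protocol, which only states that periods are expressed in seconds.

```python
# Hypothetical helper names; the protocol itself only fixes the unit (seconds).
PERIOD_SECONDS = {
    "1m": 60,
    "5m": 5 * 60,
    "1h": 60 * 60,
    "1d": 24 * 60 * 60,
}


def period_label(seconds: int) -> str:
    """Return a readable label for a period, falling back to raw seconds."""
    for label, value in PERIOD_SECONDS.items():
        if value == seconds:
            return label
    return f"{seconds}s"
```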
## Data Flow Overview

**Relay as Gateway**: The Relay is a well-known bind point that all components connect to. It routes messages between clients, ingestors, and Flink.

### Historical Data Query Flow (Async Event-Driven Architecture)

* Client generates request_id and/or client_id (both are client-generated)
* Client computes notification topic: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
* **Client subscribes to notification topic BEFORE sending request (prevents race condition)**
* Client sends SubmitHistoricalRequest to Relay (REQ/REP)
* Relay returns immediate SubmitResponse with request_id and notification_topic (for confirmation)
* Relay publishes DataRequest to ingestor work queue with exchange prefix (PUB/SUB)
* Ingestor receives request, fetches data from exchange
* Ingestor writes OHLC data to Kafka with `__metadata` in first record
* Flink reads from Kafka, processes data, writes to Iceberg
* Flink publishes HistoryReadyNotification to ZMQ PUB socket (port 5557) with deterministic topic
* Relay proxies notification via XSUB → XPUB to clients
* Client receives notification (already subscribed) and queries Iceberg for data

**Key Architectural Change**: Relay is completely stateless. It performs no request/response correlation; all notification routing is topic-based (e.g., `RESPONSE:{client_id}`).

**Race Condition Prevention**: Notification topics are deterministic because they are derived from client-generated values (request_id or client_id). Clients MUST subscribe to the notification topic BEFORE submitting the request; a notification published before the subscription exists is dropped and cannot be recovered.

**Two Notification Patterns**:
1. **Per-client topic** (`RESPONSE:{client_id}`): Subscribe once during connection, reuse for all requests from this client. Recommended for most clients.
2. **Per-request topic** (`HISTORY_READY:{request_id}`): Subscribe immediately before each request. Use when you need per-request isolation or don't have a persistent client_id.

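The topic rules and the subscribe-before-send ordering can be sketched as follows. This is a minimal illustration, not the project's client code: `ToyPubSub` is a made-up in-memory stand-in for the PUB/SUB path, and the helper names are assumptions. Only the two topic formats come from the protocol.

```python
import uuid


def response_topic(client_id: str) -> str:
    """Per-client topic: subscribe once, reuse for every request."""
    return f"RESPONSE:{client_id}"


def history_ready_topic(request_id: str) -> str:
    """Per-request topic: subscribe immediately before each request."""
    return f"HISTORY_READY:{request_id}"


class ToyPubSub:
    """In-memory stand-in for PUB/SUB (hypothetical, for illustration only).

    A message published while no matching subscription exists is dropped,
    which is how a ZeroMQ PUB socket behaves.
    """

    def __init__(self):
        self.subscriptions = set()
        self.delivered = []

    def subscribe(self, prefix: str) -> None:
        self.subscriptions.add(prefix)

    def publish(self, topic: str, body: str) -> None:
        # SUB filters are prefix matches on the topic.
        if any(topic.startswith(p) for p in self.subscriptions):
            self.delivered.append((topic, body))


request_id = str(uuid.uuid4())          # client-generated
topic = history_ready_topic(request_id)

bus = ToyPubSub()
bus.publish(topic, "lost")              # published before subscribing: dropped
bus.subscribe(topic)                    # correct order: subscribe first
bus.publish(topic, "ready")             # this notification is delivered
```

After this runs, `bus.delivered` holds only the `"ready"` message, which is the reason the subscription has to exist before the request is submitted.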
### Realtime Data Flow (Flink → Relay → Clients)

* Ingestors write realtime ticks to Kafka
* Flink reads from Kafka, computes OHLC aggregations, and evaluates CEP triggers
* Flink publishes market data via ZMQ PUB
* Relay subscribes to Flink (XSUB) and fans out to clients (XPUB)
* Clients subscribe to specific tickers

### Data Processing (Kafka → Flink → Iceberg)

* All market data flows through Kafka (durable event log)
* Flink processes streams for aggregations and CEP
* Flink writes historical data to Apache Iceberg tables
* Clients can query Iceberg for historical data (alternative to ingestor backfill)

**Key Design Principles**:
* Relay is the well-known bind point - all other components connect to it
* Relay is completely stateless - no request tracking, only topic-based routing
* Exchange prefix filtering allows ingestor specialization (e.g., only BINANCE ingestors)
* Historical data flows through Kafka (durable processing) only - no direct response
* Async event-driven notifications via pub/sub (Flink → Relay → Clients)
* Protobufs over ZMQ for all inter-service communication
* Kafka for durability and Flink stream processing
* Iceberg for long-term historical storage and client queries

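## ZeroMQ Channels and Patterns

All client- and ingestor-facing sockets bind on the **Relay** (well-known endpoint), and those components connect to it. The exception is the Flink PUB socket on port 5557, which Flink binds and the Relay's XSUB socket connects to.
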
### 1. Client Request Channel (Clients → Relay)
**Pattern**: ROUTER (Relay binds, Clients use REQ)
- **Socket Type**: Relay uses ROUTER (bind), Clients use REQ (connect)
- **Endpoint**: `tcp://*:5559` (Relay binds)
- **Message Types**: `SubmitHistoricalRequest` → `SubmitResponse`
- **Behavior**:
  - Client generates request_id and/or client_id
  - Client computes notification topic deterministically
  - **Client subscribes to notification topic FIRST (prevents race)**
  - Client sends REQ for historical OHLC data
  - Relay validates request and returns immediate acknowledgment
  - Response includes notification_topic for client confirmation
  - Relay publishes DataRequest to ingestor work queue
  - No request tracking - relay is stateless

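Because the acknowledgment and the work-queue topic can both be derived from the request alone, the Relay needs no per-request state. The sketch below illustrates that idea with plain dicts standing in for the protobuf messages. The function names, the dict keys, and the rule for choosing between the two topic patterns are all assumptions made for illustration, not the Relay's actual logic.

```python
def exchange_prefix(ticker: str) -> str:
    """Return the exchange prefix, e.g. 'BINANCE:BTC/USDT' -> 'BINANCE:'."""
    exchange, sep, symbol = ticker.partition(":")
    if not (exchange and sep and symbol):
        raise ValueError(f"ticker lacks an exchange prefix: {ticker!r}")
    return exchange + ":"


def handle_submit(request: dict) -> tuple:
    """Derive (ack, work_topic) from one request without storing anything.

    `request` stands in for SubmitHistoricalRequest; its keys are
    hypothetical field names.
    """
    request_id = request["request_id"]
    client_id = request.get("client_id")
    # Assumption: prefer the per-client topic when a client_id is supplied.
    if client_id:
        notification_topic = f"RESPONSE:{client_id}"
    else:
        notification_topic = f"HISTORY_READY:{request_id}"
    ack = {"request_id": request_id, "notification_topic": notification_topic}
    return ack, exchange_prefix(request["ticker"])
```

The returned `ack` corresponds to the immediate SubmitResponse, and the returned prefix is the topic under which the DataRequest would be published to the ingestor work queue.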
### 2. Ingestor Work Queue (Relay → Ingestors)
**Pattern**: PUB/SUB with exchange prefix filtering
- **Socket Type**: Relay uses PUB (bind), Ingestors use SUB (connect)
- **Endpoint**: `tcp://*:5555` (Relay binds)
- **Message Types**: `DataRequest` (historical or realtime)
- **Topic Prefix**: Exchange name (e.g., `BINANCE:`, `COINBASE:`)
- **Behavior**:
  - Relay publishes work with exchange prefix from ticker
  - Ingestors subscribe only to exchanges they support
  - Multiple ingestors can subscribe to the same exchange prefix; PUB/SUB delivers each request to every matching subscriber, so they compete for the same work
  - Ingestors write data to Kafka only (no direct response)
  - Flink processes Kafka → Iceberg → notification

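ZeroMQ SUB sockets filter by byte prefix, which is what makes the exchange-name topic prefix work. The sketch below mimics that matching in plain Python, assuming the topic frame begins with the exchange prefix. `passes_filter` is a hypothetical helper, and `BINANCEUS` is a made-up exchange name used only to show why the trailing colon matters.

```python
def passes_filter(topic: bytes, subscriptions: list) -> bool:
    """Mimic SUB-side filtering: keep a message if its topic starts
    with any subscribed prefix."""
    return any(topic.startswith(prefix) for prefix in subscriptions)


binance_ingestor = [b"BINANCE:"]
print(passes_filter(b"BINANCE:BTC/USDT", binance_ingestor))   # True
print(passes_filter(b"COINBASE:BTC/USD", binance_ingestor))   # False

# Without the trailing colon, the prefix also matches longer exchange names.
print(passes_filter(b"BINANCEUS:BTC/USD", [b"BINANCE"]))      # True
print(passes_filter(b"BINANCEUS:BTC/USD", binance_ingestor))  # False
```

With pyzmq, the equivalent subscription on an ingestor's SUB socket would be set with `socket.setsockopt(zmq.SUBSCRIBE, b"BINANCE:")`.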
### 3. Market Data Fanout (Flink → Relay → Clients)
**Pattern**: XPUB/XSUB proxy
- **Socket Type**:
  - Relay XPUB (bind) ← Clients SUB (connect) - Port 5558
  - Relay XSUB (connect) → Flink PUB (bind) - Port 5557
- **Message Types**: `Tick`, `OHLC`, `HistoryReadyNotification`
- **Topic Formats**:
  - Market data: `{ticker}|{data_type}` (e.g., `BINANCE:BTC/USDT|tick`)
  - Notifications: `RESPONSE:{client_id}` or `HISTORY_READY:{request_id}`
- **Behavior**:
  - Clients subscribe to ticker topics and notification topics via Relay XPUB
  - Relay forwards subscriptions to Flink via XSUB
  - Flink publishes processed market data and notifications
  - Relay proxies data to subscribed clients (stateless forwarding)
  - Dynamic subscription management (no pre-registration)

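The topic formats above can be built and parsed with a few string helpers. This is a sketch with hypothetical function names; only the `{ticker}|{data_type}` layout and the two notification prefixes are taken from the protocol.

```python
NOTIFICATION_PREFIXES = ("RESPONSE:", "HISTORY_READY:")


def market_topic(ticker: str, data_type: str) -> str:
    """Build a market data topic such as 'BINANCE:BTC/USDT|tick'."""
    return f"{ticker}|{data_type}"


def parse_market_topic(topic: str) -> tuple:
    """Split a market data topic into (ticker, data_type)."""
    # Split on the last '|' so the ticker keeps its own ':' and '/' characters.
    ticker, sep, data_type = topic.rpartition("|")
    if not (ticker and sep and data_type):
        raise ValueError(f"not a market data topic: {topic!r}")
    return ticker, data_type


def is_notification_topic(topic: str) -> bool:
    """True for RESPONSE:{client_id} and HISTORY_READY:{request_id} topics."""
    return topic.startswith(NOTIFICATION_PREFIXES)
```

Because SUB filters are prefix matches, subscribing to `BINANCE:BTC/USDT|` would select every data type published for that ticker, while the full topic selects a single stream.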
### 4. Ingestor Control Channel (Optional - Future Use)
**Pattern**: PUB/SUB (Broadcast control)
- **Socket Type**: Relay uses PUB, Ingestors use SUB
- **Endpoint**: `tcp://*:5557` (Relay binds)
- **Message Types**: `IngestorControl` (cancel, config updates)
- **Behavior**:
  - Broadcast control messages to all ingestors
  - Used for realtime subscription cancellation
  - Configuration updates

## Message Envelope Format

The core protocol uses two ZeroMQ frames:
```
Frame 1: [1 byte: protocol version]
Frame 2: [1 byte: message type ID][N bytes: protobuf message]
```

This two-frame approach allows receivers to check the protocol version before parsing the message type and protobuf payload.

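A minimal sketch of building and parsing the two-frame envelope follows. The protobuf payload is treated as opaque bytes, the function names are hypothetical, and `PROTOCOL_VERSION = 1` is an assumed value because the current version number is not stated here. With pyzmq, the returned list could be passed to `socket.send_multipart(...)`.

```python
PROTOCOL_VERSION = 1  # assumed value; the current version is not stated


def encode_envelope(type_id: int, payload: bytes,
                    version: int = PROTOCOL_VERSION) -> list:
    """Build the two frames: [version] and [type ID + protobuf payload]."""
    return [bytes([version]), bytes([type_id]) + payload]


def decode_envelope(frames: list) -> tuple:
    """Return (version, type_id, payload), rejecting malformed envelopes."""
    if len(frames) != 2:
        raise ValueError(f"expected 2 frames, got {len(frames)}")
    version_frame, message_frame = frames
    if len(version_frame) != 1:
        raise ValueError("version frame must be exactly 1 byte")
    if len(message_frame) < 1:
        raise ValueError("message frame is missing the type ID byte")
    # The version is available before the type ID or payload is touched.
    return version_frame[0], message_frame[0], message_frame[1:]
```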
**Important**: Some ZeroMQ socket patterns carry additional frames ahead of the envelope. PUB/SUB and XPUB/XSUB messages start with a topic frame, and ROUTER sockets prepend identity frames. For example:
- **PUB/SUB with topic filtering**: SUB sockets receive `[topic frame][version frame][message frame]`
- **ROUTER sockets**: Prepend identity frames before the message (with REQ peers, an empty delimiter frame follows the identity)

Components must handle these additional frames appropriately:
- SUB sockets: Skip the first frame (topic), then parse the remaining frames as the standard 2-frame envelope
- ROUTER sockets: Extract identity frames, then parse the standard 2-frame envelope

The two-frame envelope is the **logical protocol format**, but physical transmission may include additional ZeroMQ transport frames.

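The frame handling described above can be sketched as two small helpers that separate transport frames from the logical envelope. The function names are hypothetical, and the ROUTER helper assumes REQ clients, whose messages arrive as identity, empty delimiter, then the envelope.

```python
def split_sub_message(frames: list) -> tuple:
    """SUB side: [topic][version][message] -> (topic, [version, message])."""
    if len(frames) != 3:
        raise ValueError(f"expected topic + 2 envelope frames, got {len(frames)}")
    topic, version_frame, message_frame = frames
    return topic, [version_frame, message_frame]


def split_router_message(frames: list) -> tuple:
    """ROUTER side: [identity...][empty][version][message]
    -> (routing_frames, [version, message])."""
    if b"" not in frames:
        raise ValueError("no empty delimiter frame found")
    cut = frames.index(b"") + 1          # keep the delimiter with the routing frames
    routing, envelope = frames[:cut], frames[cut:]
    if len(envelope) != 2:
        raise ValueError(f"expected 2 envelope frames, got {len(envelope)}")
    return routing, envelope
```

A ROUTER reply reuses the routing frames unchanged: sending `routing + reply_envelope` returns the response to the REQ client that issued the request.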
## Message Type IDs

| Type ID | Message Type              | Description                                           |
|---------|---------------------------|-------------------------------------------------------|
| 0x01    | DataRequest               | Request for historical or realtime data               |
| 0x02    | DataResponse (deprecated) | Historical data response (no longer used)             |
| 0x03    | IngestorControl           | Control messages for ingestors                        |
| 0x04    | Tick                      | Individual trade tick data                            |
| 0x05    | OHLC                      | Single OHLC candle with volume                        |
| 0x06    | Market                    | Market metadata                                       |
| 0x07    | OHLCRequest (deprecated)  | Client request (replaced by SubmitHistoricalRequest)  |
| 0x08    | Response (deprecated)     | Generic response (replaced by SubmitResponse)         |
| 0x09    | CEPTriggerRequest         | Register CEP trigger                                  |
| 0x0A    | CEPTriggerAck             | CEP trigger acknowledgment                            |
| 0x0B    | CEPTriggerEvent           | CEP trigger fired callback                            |
| 0x0C    | OHLCBatch                 | Batch of OHLC rows with metadata (Kafka)              |
| 0x10    | SubmitHistoricalRequest   | Client request for historical data (async)            |
| 0x11    | SubmitResponse            | Immediate ack with notification topic                 |
| 0x12    | HistoryReadyNotification  | Notification that data is ready in Iceberg            |

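One way to carry this table in code is an integer enum, so that an unknown type ID byte is rejected before any protobuf parsing is attempted. The enum member names and the `lookup` helper below are hypothetical; the numeric values are the ones listed in the table.

```python
from enum import IntEnum


class MessageType(IntEnum):
    DATA_REQUEST = 0x01
    DATA_RESPONSE = 0x02               # deprecated
    INGESTOR_CONTROL = 0x03
    TICK = 0x04
    OHLC = 0x05
    MARKET = 0x06
    OHLC_REQUEST = 0x07                # deprecated
    RESPONSE = 0x08                    # deprecated
    CEP_TRIGGER_REQUEST = 0x09
    CEP_TRIGGER_ACK = 0x0A
    CEP_TRIGGER_EVENT = 0x0B
    OHLC_BATCH = 0x0C
    SUBMIT_HISTORICAL_REQUEST = 0x10
    SUBMIT_RESPONSE = 0x11
    HISTORY_READY_NOTIFICATION = 0x12


DEPRECATED = frozenset({
    MessageType.DATA_RESPONSE,
    MessageType.OHLC_REQUEST,
    MessageType.RESPONSE,
})


def lookup(type_id: int) -> MessageType:
    """Map the type ID byte to a MessageType, rejecting unassigned IDs."""
    try:
        return MessageType(type_id)
    except ValueError:
        raise ValueError(f"unknown message type ID: 0x{type_id:02X}") from None
```

IDs 0x0D through 0x0F are unassigned in the table, so `lookup(0x0D)` raises `ValueError`.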
## Error Handling

**Async Architecture Error Handling**:
- Failed historical requests: ingestor writes error marker to Kafka
- Flink reads error marker and publishes HistoryReadyNotification with ERROR status
- Client timeout: if no notification is received within the client's timeout, the client assumes the request failed
- Realtime requests are cancelled via the control channel if an ingestor fails
- REQ/REP timeouts: 30 seconds default for client request submission
- PUB/SUB has no delivery guarantees (Kafka provides durability)
- No response routing needed - all notifications via topic-based pub/sub

**Durability**:
- All data flows through Kafka for durability
- Flink checkpointing provides exactly-once processing semantics
- Client can retry request with new request_id if notification not received

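The client-side timeout and retry rule can be sketched as below. This is an illustration under stated assumptions, not the project's client: `submit` is a hypothetical callback that subscribes to the notification topic and then sends the request, `inbox` is a hypothetical queue fed by the client's SUB socket, and the notification timeout is left to the caller because no default for it is specified.

```python
import queue
import uuid


def await_notification(inbox: queue.Queue, timeout_s: float):
    """Return the next notification, or None if none arrives in time."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        return None


def request_with_retry(submit, inbox: queue.Queue, attempts: int, timeout_s: float):
    """Submit up to `attempts` times, using a fresh request_id each time.

    Returns (request_id, notification) for the first attempt that is
    answered, or raises TimeoutError if every attempt times out.
    """
    for _ in range(attempts):
        request_id = str(uuid.uuid4())   # new ID per retry, as the protocol allows
        submit(request_id)               # hypothetical: subscribe, then send
        notification = await_notification(inbox, timeout_s)
        if notification is not None:
            return request_id, notification
    raise TimeoutError(f"no notification after {attempts} attempts")
```

A real client using the per-client topic would also need to check that a received notification belongs to the latest request_id, since a late notification for an earlier attempt can still arrive on the same topic.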