# ZMQ Relay Gateway

High-performance ZMQ relay/gateway that routes messages between clients, Flink, and ingestors.

## Architecture

The relay acts as a well-known bind point for all components:

```
┌─────────┐                    ┌───────┐                    ┌──────────┐
│ Clients │◄──────────────────►│ Relay │◄──────────────────►│ Ingestors│
└─────────┘                    └───┬───┘                    └──────────┘
                                   │
                                   │
                                   ▼
                               ┌────────┐
                               │ Flink  │
                               └────────┘
```

## Responsibilities

### 1. Client Request Routing

- **Socket**: ROUTER (bind on port 5559)
- **Flow**: Client REQ → Relay ROUTER → Relay PUB → Ingestor SUB
- Receives OHLC requests from clients
- Routes to appropriate ingestors using exchange prefix filtering
- Tracks pending requests and matches responses

### 2. Ingestor Work Distribution

- **Socket**: PUB (bind on port 5555)
- **Pattern**: Topic-based distribution with exchange prefixes
- Publishes work requests with exchange prefix (e.g., `BINANCE:`)
- Ingestors subscribe to exchanges they support

### 3. Response Routing

- **Socket**: ROUTER (bind on port 5556)
- **Flow**: Ingestor DEALER → Relay ROUTER → Client REQ
- Receives responses from ingestors
- Matches responses to pending client requests by request_id
- Returns data to waiting clients

### 4. Market Data Fanout

- **Sockets**: XPUB (bind on 5558) + XSUB (connect to Flink:5557)
- **Pattern**: XPUB/XSUB proxy
- Relays market data from Flink to multiple clients
- Manages subscriptions dynamically
- Forwards subscription messages upstream to Flink

## Message Flows

### Historical Data Request

```
1. Client → Relay
   Socket: REQ → ROUTER (5559)
   Message: OHLCRequest (0x07)

2. Relay → Ingestor
   Socket: PUB (5555)
   Topic: Exchange prefix (e.g., "BINANCE:")
   Message: DataRequest (0x01)

3. Ingestor fetches data from exchange

4. Ingestor → Relay
   Socket: DEALER → ROUTER (5556)
   Message: DataResponse (0x02)

5. Relay → Client
   Socket: ROUTER → REQ
   Message: Response (0x08)
```

### Market Data Subscription

```
1. Client subscribes to ticker
   Socket: SUB → XPUB (5558)
   Topic: "BINANCE:BTC/USDT|tick"

2. Relay forwards subscription
   Socket: XSUB → Flink PUB (5557)

3. Flink publishes data
   Socket: PUB (5557) → XSUB

4. Relay fans out to clients
   Socket: XPUB (5558) → SUB
```

## Configuration

Edit `config.yaml`:

```yaml
bind_address: "tcp://*"
client_request_port: 5559
market_data_pub_port: 5558
ingestor_work_port: 5555
ingestor_response_port: 5556
flink_market_data_endpoint: "tcp://flink-jobmanager:5557"
request_timeout_secs: 30
high_water_mark: 10000
```

## Building

```bash
cargo build --release
```

## Running

```bash
# With default config
./target/release/relay

# With custom config
CONFIG_PATH=/path/to/config.yaml ./target/release/relay

# With Docker
docker build -t relay .
docker run -p 5555-5559:5555-5559 relay
```

## Environment Variables

- `CONFIG_PATH`: Path to config file (default: `/config/config.yaml`)
- `RUST_LOG`: Log level (default: `relay=info`)

## Ports

| Port | Socket Type | Direction | Purpose |
|------|-------------|-----------|---------|
| 5555 | PUB | → Ingestors | Work distribution with exchange prefix |
| 5556 | ROUTER | ← Ingestors | Response collection |
| 5557 | - | (Flink) | Flink market data publication |
| 5558 | XPUB | → Clients | Market data fanout |
| 5559 | ROUTER | ← Clients | Client request handling |

## Monitoring

The relay logs all major events:

```
INFO relay: Client request routing
INFO relay: Forwarded request to ingestors: prefix=BINANCE:, request_id=...
INFO relay: Received response from ingestor: request_id=..., status=OK
INFO relay: Sent response to client: request_id=...
WARN relay: Request timed out: request_id=...
```

## Performance

- **High water mark**: Configurable per socket (default: 10,000 messages)
- **Request timeout**: Automatic cleanup of expired requests (default: 30s)
- **Zero-copy proxying**: XPUB/XSUB market data forwarding
- **Async cleanup**: Background task for timeout management

## Design Decisions

### Why Rust?

- **Performance**: Zero-cost abstractions, minimal overhead
- **Safety**: Memory safety without garbage collection
- **Concurrency**: Fearless concurrency with a strong type system
- **ZMQ Integration**: Excellent ZMQ bindings

### Why ROUTER for clients?

- Preserves client identity for request/response matching
- Allows async responses (no blocking)
- Handles multiple concurrent clients efficiently

### Why PUB for ingestor work?

- Topic-based filtering by exchange
- Multiple ingestors can compete for the same exchange
- Scales horizontally with ingestor count
- No single point of failure

### Why XPUB/XSUB for market data?

- Dynamic subscription management
- Efficient fanout to many clients
- Upstream subscription control
- Standard ZMQ proxy pattern

## Troubleshooting

### No response from ingestors

Check:

- Ingestors are connected to port 5555
- Ingestors have subscribed to exchange prefix
- Topic format: `EXCHANGE:` (e.g., `BINANCE:`)

### Client timeout

Check:

- Request timeout configuration
- Ingestor availability
- Network connectivity
- Pending requests map (logged on timeout)

### Market data not flowing

Check:

- Flink is publishing on port 5557
- Relay XSUB is connected to Flink
- Clients have subscribed to correct topics
- Topic format: `{ticker}|{data_type}`

## Testing

Run the test client:

```bash
cd ../test/history_client
python client.py
```

Expected flow:

1. Client sends request to relay:5559
2. Relay publishes to ingestors:5555
3. Ingestor fetches and responds to relay:5556
4. Relay returns to client

## Future Enhancements

- [ ] Metrics collection (Prometheus)
- [ ] Health check endpoint
- [ ] Request rate limiting
- [ ] Circuit breaker for failed ingestors
- [ ] Request deduplication
- [ ] Response caching
- [ ] Multi-part response support for large datasets
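
## Example: Request Tracking Sketch

The request routing described under Responsibilities (derive an exchange prefix, remember which client is waiting on each `request_id`, and drop entries after `request_timeout_secs`) can be illustrated with a small standalone model. This is a minimal sketch in Python, the language of the test client, and not the relay's actual Rust implementation. The names `exchange_prefix`, `topic_matches`, `Pending`, and `PendingRequests` are hypothetical, and the `EXCHANGE:SYMBOL` ticker layout is an assumption inferred from the example topic `BINANCE:BTC/USDT|tick`. No sockets are involved, so it runs with the standard library alone.

```python
import time
from dataclasses import dataclass


def exchange_prefix(ticker: str) -> bytes:
    """Return the work topic prefix for a ticker.

    Assumes tickers look like 'BINANCE:BTC/USDT' (hypothetical layout).
    """
    exchange, sep, _symbol = ticker.partition(":")
    if not sep or not exchange:
        raise ValueError(f"ticker has no exchange prefix: {ticker!r}")
    return f"{exchange}:".encode()


def topic_matches(subscription: bytes, topic: bytes) -> bool:
    """ZMQ SUB filters are byte-prefix matches on the first frame."""
    return topic.startswith(subscription)


@dataclass
class Pending:
    client_identity: bytes  # ROUTER identity frame of the waiting client
    deadline: float         # clock value after which the request expires


class PendingRequests:
    """In-memory map from request_id to the client awaiting a response."""

    def __init__(self, timeout_secs: float = 30.0, clock=time.monotonic):
        self._timeout = timeout_secs
        self._clock = clock   # injectable so tests can control time
        self._by_id = {}      # request_id -> Pending

    def track(self, request_id: str, client_identity: bytes) -> None:
        """Record a request that was just forwarded to the ingestors."""
        deadline = self._clock() + self._timeout
        self._by_id[request_id] = Pending(client_identity, deadline)

    def resolve(self, request_id: str):
        """Remove the entry and return the client identity.

        Returns None if the ID is unknown, for example because
        expire() already removed it.
        """
        pending = self._by_id.pop(request_id, None)
        return pending.client_identity if pending else None

    def expire(self) -> list:
        """Remove timed-out requests and return their IDs."""
        now = self._clock()
        expired = [rid for rid, p in self._by_id.items() if p.deadline <= now]
        for rid in expired:
            del self._by_id[rid]
        return expired
```

In this model, a relay loop would call `track()` when it publishes a request, `resolve()` when an ingestor response arrives, and `expire()` from a periodic background task, logging each expired ID as a timeout. Passing a fake `clock` makes the timeout path testable without sleeping.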