ai/relay/README.md
2026-03-11 18:47:11 -04:00

ZMQ Relay Gateway

High-performance ZMQ relay/gateway that routes messages between clients, Flink, and ingestors.

Architecture

The relay acts as a well-known bind point for all components:

┌─────────┐                    ┌───────┐                    ┌──────────┐
│ Clients │◄──────────────────►│ Relay │◄──────────────────►│ Ingestors│
└─────────┘                    └───┬───┘                    └──────────┘
                                   │
                                   │
                                   ▼
                              ┌────────┐
                              │  Flink │
                              └────────┘

Responsibilities

1. Client Request Routing

  • Socket: ROUTER (bind on port 5559)
  • Flow: Client REQ → Relay ROUTER → Relay PUB (5555) → Ingestor SUB
  • Receives OHLC requests from clients
  • Routes to appropriate ingestors using exchange prefix filtering
  • Tracks pending requests and matches responses

2. Ingestor Work Distribution

  • Socket: PUB (bind on port 5555)
  • Pattern: Topic-based distribution with exchange prefixes
  • Publishes work requests with exchange prefix (e.g., BINANCE:)
  • Ingestors subscribe to exchanges they support
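
The prefix filtering described above relies on the way ZMQ SUB sockets filter messages: a message is delivered when the subscription is a byte prefix of its first frame. The following is a minimal sketch of that rule using only the standard library, not the relay's actual code. The helper names, the upper-casing of the exchange name, and the content after the prefix in the sample topic frame are assumptions made for illustration.

```rust
/// Builds the exchange prefix used as the PUB topic, e.g. "BINANCE:".
/// Hypothetical helper; upper-casing the name is an assumption.
fn exchange_prefix(exchange: &str) -> String {
    format!("{}:", exchange.to_uppercase())
}

/// Mirrors ZMQ SUB filtering: a message matches when the subscription
/// is a byte prefix of the first (topic) frame.
fn subscription_matches(subscription: &[u8], topic_frame: &[u8]) -> bool {
    topic_frame.starts_with(subscription)
}

/// Returns the names of all ingestors whose subscriptions match the topic frame.
/// Each ingestor is modeled as (name, list of subscribed prefixes).
fn receivers<'a>(ingestors: &[(&'a str, Vec<String>)], topic_frame: &[u8]) -> Vec<&'a str> {
    ingestors
        .iter()
        .filter(|(_, subs)| {
            subs.iter()
                .any(|s| subscription_matches(s.as_bytes(), topic_frame))
        })
        .map(|(name, _)| *name)
        .collect()
}
```

With ingestors "a" (BINANCE:), "b" (KRAKEN:), and "c" (both), a frame starting with BINANCE: reaches "a" and "c". PUB delivers to every matching subscriber, so two ingestors subscribed to the same exchange both receive the request.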

3. Response Routing

  • Socket: ROUTER (bind on port 5556)
  • Flow: Ingestor DEALER → Relay ROUTER (5556) → Relay ROUTER (5559) → Client REQ
  • Receives responses from ingestors
  • Matches responses to pending client requests by request_id
  • Returns data to waiting clients
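
Matching responses to pending requests by request_id can be pictured as a map from request_id to the ROUTER identity frame of the waiting client. The sketch below is one possible shape for that bookkeeping, assuming string request IDs and an opaque byte identity. The PendingRequests type and its methods are hypothetical and are not taken from the relay source.

```rust
use std::collections::HashMap;

/// Maps request_id -> ROUTER identity frame of the client waiting for it.
/// Hypothetical structure for illustration.
#[derive(Default)]
struct PendingRequests {
    by_id: HashMap<String, Vec<u8>>,
}

impl PendingRequests {
    /// Records which client is waiting on `request_id`
    /// (called when a client request is forwarded to ingestors).
    fn track(&mut self, request_id: &str, client_identity: Vec<u8>) {
        self.by_id.insert(request_id.to_string(), client_identity);
    }

    /// Called when an ingestor response arrives. Removes the entry and
    /// returns the client identity, or None if the id is unknown or was
    /// already answered.
    fn resolve(&mut self, request_id: &str) -> Option<Vec<u8>> {
        self.by_id.remove(request_id)
    }
}
```

Because resolve removes the entry, a duplicate response for the same request_id yields None and can be dropped instead of being sent to the client twice.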

4. Market Data Fanout

  • Sockets: XPUB (bind on 5558) + XSUB (connect to Flink:5557)
  • Pattern: XPUB/XSUB proxy
  • Relays market data from Flink to multiple clients
  • Manages subscriptions dynamically
  • Forwards subscription messages upstream to Flink

Message Flows

Historical Data Request

1. Client → Relay
   Socket: REQ → ROUTER (5559)
   Message: OHLCRequest (0x07)

2. Relay → Ingestor
   Socket: PUB (5555)
   Topic: Exchange prefix (e.g., "BINANCE:")
   Message: DataRequest (0x01)

3. Ingestor fetches data from exchange

4. Ingestor → Relay
   Socket: DEALER → ROUTER (5556)
   Message: DataResponse (0x02)

5. Relay → Client
   Socket: ROUTER → REQ
   Message: Response (0x08)
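
The message type codes in the flow above can be captured in a small enum. This is a sketch under assumptions: only the four codes and their names come from this document. Where the type byte sits in the wire frame is not specified here, and the relayed_as helper is hypothetical.

```rust
use std::convert::TryFrom;

/// Message type codes listed in the historical data request flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum MessageType {
    DataRequest = 0x01,
    DataResponse = 0x02,
    OhlcRequest = 0x07,
    Response = 0x08,
}

impl TryFrom<u8> for MessageType {
    /// The unrecognized byte is returned as the error value.
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0x01 => Ok(MessageType::DataRequest),
            0x02 => Ok(MessageType::DataResponse),
            0x07 => Ok(MessageType::OhlcRequest),
            0x08 => Ok(MessageType::Response),
            other => Err(other),
        }
    }
}

/// Hypothetical helper: the message type the relay emits for a given
/// inbound type, following steps 1-2 and 4-5 of the flow.
fn relayed_as(inbound: MessageType) -> Option<MessageType> {
    match inbound {
        MessageType::OhlcRequest => Some(MessageType::DataRequest),
        MessageType::DataResponse => Some(MessageType::Response),
        _ => None,
    }
}
```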

Market Data Subscription

1. Client subscribes to ticker
   Socket: SUB → XPUB (5558)
   Topic: "BINANCE:BTC/USDT|tick"

2. Relay forwards subscription
   Socket: XSUB → Flink PUB (5557)

3. Flink publishes data
   Socket: PUB (5557) → XSUB

4. Relay fans out data to clients
   Socket: XPUB (5558) → SUB
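
The subscription topic in step 1 can be split into its parts as follows. This is a sketch, assuming the exchange ends at the first ':' and the data type follows the last '|', as the example "BINANCE:BTC/USDT|tick" suggests. The MarketTopic type and parse_market_topic function are hypothetical names.

```rust
/// Parts of a market data topic such as "BINANCE:BTC/USDT|tick".
#[derive(Debug, PartialEq, Eq)]
struct MarketTopic<'a> {
    exchange: &'a str,
    symbol: &'a str,
    data_type: &'a str,
}

/// Parses "{exchange}:{symbol}|{data_type}". Returns None when a
/// separator is missing or any part is empty.
fn parse_market_topic(topic: &str) -> Option<MarketTopic<'_>> {
    // The data type follows the last '|'.
    let (ticker, data_type) = topic.rsplit_once('|')?;
    // The exchange precedes the first ':'.
    let (exchange, symbol) = ticker.split_once(':')?;
    if exchange.is_empty() || symbol.is_empty() || data_type.is_empty() {
        return None;
    }
    Some(MarketTopic {
        exchange,
        symbol,
        data_type,
    })
}
```

A check like this on the client side can catch malformed topics before subscribing, since a SUB socket accepts any byte string as a subscription and simply receives nothing if it never matches.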

Configuration

Edit config.yaml:

bind_address: "tcp://*"
client_request_port: 5559
market_data_pub_port: 5558
ingestor_work_port: 5555
ingestor_response_port: 5556
flink_market_data_endpoint: "tcp://flink-jobmanager:5557"
request_timeout_secs: 30
high_water_mark: 10000
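
For orientation, the settings above could map onto a struct like the one below. This is a sketch, not the relay's actual configuration type: the RelayConfig name, the field types, and the two helper methods are assumptions, and YAML loading is omitted to keep the example dependency-free. The Default values mirror the sample config above.

```rust
use std::time::Duration;

/// Hypothetical in-memory form of config.yaml.
#[derive(Debug, Clone, PartialEq)]
struct RelayConfig {
    bind_address: String,
    client_request_port: u16,
    market_data_pub_port: u16,
    ingestor_work_port: u16,
    ingestor_response_port: u16,
    flink_market_data_endpoint: String,
    request_timeout_secs: u64,
    high_water_mark: i32,
}

impl Default for RelayConfig {
    /// Values copied from the sample config.yaml.
    fn default() -> Self {
        RelayConfig {
            bind_address: "tcp://*".to_string(),
            client_request_port: 5559,
            market_data_pub_port: 5558,
            ingestor_work_port: 5555,
            ingestor_response_port: 5556,
            flink_market_data_endpoint: "tcp://flink-jobmanager:5557".to_string(),
            request_timeout_secs: 30,
            high_water_mark: 10000,
        }
    }
}

impl RelayConfig {
    /// Joins bind_address and a port into a ZMQ endpoint, e.g. "tcp://*:5559".
    fn bind_endpoint(&self, port: u16) -> String {
        format!("{}:{}", self.bind_address, port)
    }

    /// request_timeout_secs as a Duration.
    fn request_timeout(&self) -> Duration {
        Duration::from_secs(self.request_timeout_secs)
    }
}
```

Note that the Flink endpoint is a full address the relay connects to, while the four ports are combined with bind_address for sockets the relay binds.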

Building

cargo build --release

Running

# With default config
./target/release/relay

# With custom config
CONFIG_PATH=/path/to/config.yaml ./target/release/relay

# With Docker
docker build -t relay .
docker run -p 5555-5559:5555-5559 relay

Environment Variables

  • CONFIG_PATH: Path to config file (default: /config/config.yaml)
  • RUST_LOG: Log level (default: relay=info)

Ports

Port  Socket Type  Direction    Purpose
5555  PUB          → Ingestors  Work distribution with exchange prefix
5556  ROUTER       ← Ingestors  Response collection
5557  -            (Flink)      Flink market data publication
5558  XPUB         → Clients    Market data fanout
5559  ROUTER       ← Clients    Client request handling

Monitoring

The relay logs all major events:

INFO relay: Client request routing
INFO relay: Forwarded request to ingestors: prefix=BINANCE:, request_id=...
INFO relay: Received response from ingestor: request_id=..., status=OK
INFO relay: Sent response to client: request_id=...
WARN relay: Request timed out: request_id=...

Performance

  • High water mark: Configurable per socket (default: 10,000 messages)
  • Request timeout: Automatic cleanup of expired requests (default: 30s)
  • Zero-copy proxying: XPUB/XSUB market data forwarding
  • Async cleanup: Background task for timeout management
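
The timeout cleanup can be sketched as a sweep over per-request deadlines. The example below shows only the sweep step, with the current time passed in so it can be exercised without waiting. The sweep_expired function and the deadline map are hypothetical, and how the relay schedules its background task is not shown.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Removes every request whose deadline has passed and returns the
/// expired request ids (sorted, so the result is deterministic).
fn sweep_expired(deadlines: &mut HashMap<String, Instant>, now: Instant) -> Vec<String> {
    let mut expired = Vec::new();
    // Keep only entries whose deadline is still in the future.
    deadlines.retain(|id, deadline| {
        if *deadline <= now {
            expired.push(id.clone());
            false
        } else {
            true
        }
    });
    expired.sort();
    expired
}
```

A background task would call a function like this periodically with Instant::now() and log a timeout warning for each returned id.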

Design Decisions

Why Rust?

  • Performance: Zero-cost abstractions, minimal overhead
  • Safety: Memory safety without garbage collection
  • Concurrency: Fearless concurrency with a strong type system
  • ZMQ Integration: Excellent ZMQ bindings

Why ROUTER for clients?

  • Preserves client identity for request/response matching
  • Allows async responses (no blocking)
  • Handles multiple concurrent clients efficiently

Why PUB for ingestor work?

  • Topic-based filtering by exchange
  • Multiple ingestors can serve the same exchange (PUB delivers each request to every matching subscriber)
  • Scales horizontally with ingestor count
  • No single point of failure

Why XPUB/XSUB for market data?

  • Dynamic subscription management
  • Efficient fanout to many clients
  • Upstream subscription control
  • Standard ZMQ proxy pattern

Troubleshooting

No response from ingestors

Check:

  • Ingestors are connected to relay port 5555
  • Ingestors have subscribed to the exchange prefix being requested
  • Subscription topics use the format EXCHANGE: (e.g., BINANCE:)

Client timeout

Check:

  • Request timeout configuration
  • Ingestor availability
  • Network connectivity
  • Pending requests map (logged on timeout)

Market data not flowing

Check:

  • Flink is publishing on port 5557
  • Relay XSUB is connected to Flink
  • Clients have subscribed to correct topics
  • Topic format: {ticker}|{data_type} (e.g., BINANCE:BTC/USDT|tick)

Testing

Run the test client:

cd ../test/history_client
python client.py

Expected flow:

  1. Client sends request to relay:5559
  2. Relay publishes the request to ingestors on port 5555
  3. Ingestor fetches and responds to relay:5556
  4. Relay returns to client

Future Enhancements

  • Metrics collection (Prometheus)
  • Health check endpoint
  • Request rate limiting
  • Circuit breaker for failed ingestors
  • Request deduplication
  • Response caching
  • Multi-part response support for large datasets