# ZMQ Relay Gateway

High-performance ZMQ relay/gateway that routes messages between clients, Flink, and ingestors.

## Architecture

The relay acts as a well-known bind point for all components:

```
┌─────────┐                    ┌───────┐                    ┌──────────┐
│ Clients │◄──────────────────►│ Relay │◄──────────────────►│ Ingestors│
└─────────┘                    └───┬───┘                    └──────────┘
                                   │
                                   │
                                   ▼
                               ┌────────┐
                               │ Flink  │
                               └────────┘
```

## Responsibilities

### 1. Client Request Routing
- **Socket**: ROUTER (bind on port 5559)
- **Flow**: Client REQ → Relay ROUTER → Relay PUB → Ingestor SUB
- Receives OHLC requests from clients
- Routes to appropriate ingestors using exchange prefix filtering
- Tracks pending requests and matches responses

### 2. Ingestor Work Distribution
- **Socket**: PUB (bind on port 5555)
- **Pattern**: Topic-based distribution with exchange prefixes
- Publishes work requests with exchange prefix (e.g., `BINANCE:`)
- Ingestors subscribe to exchanges they support

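ZMQ SUB sockets filter by byte-prefix match on the published topic, which is what makes the `BINANCE:` convention work. The following is a minimal sketch of deriving the topic from a ticker and checking a subscription against it, assuming tickers look like `BINANCE:BTC/USDT`. The helper names `exchange_prefix` and `subscription_matches` are illustrative and not taken from the relay source.

```rust
/// Returns the exchange prefix including the trailing colon,
/// e.g. "BINANCE:" for "BINANCE:BTC/USDT". Hypothetical helper.
fn exchange_prefix(ticker: &str) -> Option<&str> {
    let colon = ticker.find(':')?;
    if colon == 0 {
        return None; // empty exchange name
    }
    Some(&ticker[..=colon])
}

/// Mirrors the SUB-side filter: a subscription matches when it is a
/// byte prefix of the published topic. An empty subscription matches all.
fn subscription_matches(subscription: &[u8], topic: &[u8]) -> bool {
    topic.starts_with(subscription)
}
```

An ingestor that supports only Binance would subscribe to `BINANCE:` and never see work published under another exchange's prefix.
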
### 3. Response Routing
- **Socket**: ROUTER (bind on port 5556)
- **Flow**: Ingestor DEALER → Relay ROUTER → Client REQ
- Receives responses from ingestors
- Matches responses to pending client requests by request_id
- Returns data to waiting clients

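Matching a response to its waiting client can be pictured as a map from `request_id` to the client's ROUTER identity. This is a sketch only: the `PendingRequests` type, the string-typed request id, and the method names are assumptions for illustration, and the relay's real bookkeeping may differ.

```rust
use std::collections::HashMap;

/// Client identity frame as handed out by a ROUTER socket.
type ClientId = Vec<u8>;

/// Tracks which client is waiting on which request. Hypothetical type.
#[derive(Default)]
struct PendingRequests {
    by_request_id: HashMap<String, ClientId>,
}

impl PendingRequests {
    /// Remember the client when its request is forwarded to ingestors.
    fn register(&mut self, request_id: &str, client: ClientId) {
        self.by_request_id.insert(request_id.to_string(), client);
    }

    /// Called when an ingestor response arrives. Returns the client to
    /// reply to, or None for unknown, duplicate, or already removed ids.
    fn complete(&mut self, request_id: &str) -> Option<ClientId> {
        self.by_request_id.remove(request_id)
    }
}
```

Because `complete` removes the entry, a second response carrying the same `request_id` finds nothing and can be dropped.
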
### 4. Market Data Fanout
- **Sockets**: XPUB (bind on 5558) + XSUB (connect to Flink:5557)
- **Pattern**: XPUB/XSUB proxy
- Relays market data from Flink to multiple clients
- Manages subscriptions dynamically
- Forwards subscription messages upstream to Flink

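On an XPUB socket, client subscriptions arrive as ordinary frames: the first byte is `1` for subscribe or `0` for unsubscribe, and the remaining bytes are the topic prefix. A proxy forwards these frames unchanged to its XSUB side. The sketch below decodes such a frame, for example for logging. The `SubscriptionEvent` type and `parse_subscription_frame` function are illustrative names, not part of the relay.

```rust
/// A subscription event as seen on the XPUB side of the proxy.
#[derive(Debug, PartialEq)]
enum SubscriptionEvent<'a> {
    Subscribe(&'a [u8]),
    Unsubscribe(&'a [u8]),
}

/// Decodes a frame received from an XPUB socket: the first byte is
/// 1 for subscribe or 0 for unsubscribe, the rest is the topic prefix.
fn parse_subscription_frame(frame: &[u8]) -> Option<SubscriptionEvent<'_>> {
    let (&flag, topic) = frame.split_first()?;
    match flag {
        1 => Some(SubscriptionEvent::Subscribe(topic)),
        0 => Some(SubscriptionEvent::Unsubscribe(topic)),
        _ => None, // not a subscription message
    }
}
```
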
## Message Flows

### Historical Data Request

```
1. Client → Relay
   Socket: REQ → ROUTER (5559)
   Message: OHLCRequest (0x07)

2. Relay → Ingestor
   Socket: PUB (5555)
   Topic: Exchange prefix (e.g., "BINANCE:")
   Message: DataRequest (0x01)

3. Ingestor fetches data from exchange

4. Ingestor → Relay
   Socket: DEALER → ROUTER (5556)
   Message: DataResponse (0x02)

5. Relay → Client
   Socket: ROUTER → REQ
   Message: Response (0x08)
```

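The message type codes in the flow above can be modeled as a small enum. This sketch assumes the code travels as a single byte; the README does not specify the wire framing, and the `MessageType` enum and its methods are illustrative names.

```rust
/// Message type codes taken from the historical data request flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MessageType {
    DataRequest = 0x01,
    DataResponse = 0x02,
    OhlcRequest = 0x07,
    Response = 0x08,
}

impl MessageType {
    /// Decodes a type code, returning None for codes not listed in the flow.
    fn from_byte(byte: u8) -> Option<Self> {
        match byte {
            0x01 => Some(Self::DataRequest),
            0x02 => Some(Self::DataResponse),
            0x07 => Some(Self::OhlcRequest),
            0x08 => Some(Self::Response),
            _ => None,
        }
    }

    /// The type the relay emits after receiving a message of this type:
    /// steps 1 → 2 and 4 → 5 of the flow. Other types are not relayed.
    fn relayed_as(self) -> Option<Self> {
        match self {
            Self::OhlcRequest => Some(Self::DataRequest),
            Self::DataResponse => Some(Self::Response),
            _ => None,
        }
    }
}
```
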
### Market Data Subscription

```
1. Client subscribes to ticker
   Socket: SUB → XPUB (5558)
   Topic: "BINANCE:BTC/USDT|tick"

2. Relay forwards subscription
   Socket: XSUB → Flink PUB (5557)

3. Flink publishes data
   Socket: PUB (5557) → XSUB

4. Relay fanout to clients
   Socket: XPUB (5558) → SUB
```

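Market data topics follow the `{ticker}|{data_type}` shape, as in `BINANCE:BTC/USDT|tick`. A minimal sketch of building and splitting such a topic is shown here; the `Topic` struct and both function names are hypothetical.

```rust
/// A market data topic of the form `{ticker}|{data_type}`.
#[derive(Debug, PartialEq)]
struct Topic<'a> {
    ticker: &'a str,
    data_type: &'a str,
}

/// Splits on the last '|' so the ticker keeps any earlier characters.
/// Returns None when either part is missing.
fn parse_topic(topic: &str) -> Option<Topic<'_>> {
    let (ticker, data_type) = topic.rsplit_once('|')?;
    if ticker.is_empty() || data_type.is_empty() {
        return None;
    }
    Some(Topic { ticker, data_type })
}

/// Builds the topic string a client would subscribe to.
fn format_topic(ticker: &str, data_type: &str) -> String {
    format!("{}|{}", ticker, data_type)
}
```

Since ZMQ subscriptions are prefix matches, subscribing to `BINANCE:BTC/USDT|` would receive every data type published for that ticker.
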
## Configuration

Edit `config.yaml`:

```yaml
bind_address: "tcp://*"
client_request_port: 5559
market_data_pub_port: 5558
ingestor_work_port: 5555
ingestor_response_port: 5556
flink_market_data_endpoint: "tcp://flink-jobmanager:5557"
request_timeout_secs: 30
high_water_mark: 10000
```

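As a rough picture of how these keys might map onto a Rust type, the sketch below mirrors the YAML fields and shows how a bind endpoint is assembled from `bind_address` and a port. The `RelayConfig` struct, its field types, and the use of the sample values as defaults are assumptions. YAML parsing is left out because it would need an external crate.

```rust
use std::time::Duration;

/// Mirrors the keys in `config.yaml`. Hypothetical type; the defaults
/// copy the sample values shown in the README.
#[derive(Debug, Clone)]
struct RelayConfig {
    bind_address: String,
    client_request_port: u16,
    market_data_pub_port: u16,
    ingestor_work_port: u16,
    ingestor_response_port: u16,
    flink_market_data_endpoint: String,
    request_timeout_secs: u64,
    high_water_mark: i32,
}

impl Default for RelayConfig {
    fn default() -> Self {
        Self {
            bind_address: "tcp://*".to_string(),
            client_request_port: 5559,
            market_data_pub_port: 5558,
            ingestor_work_port: 5555,
            ingestor_response_port: 5556,
            flink_market_data_endpoint: "tcp://flink-jobmanager:5557".to_string(),
            request_timeout_secs: 30,
            high_water_mark: 10_000,
        }
    }
}

impl RelayConfig {
    /// Builds a bind endpoint such as "tcp://*:5559".
    fn bind_endpoint(&self, port: u16) -> String {
        format!("{}:{}", self.bind_address, port)
    }

    /// Request timeout as a Duration for use with timers.
    fn request_timeout(&self) -> Duration {
        Duration::from_secs(self.request_timeout_secs)
    }
}
```
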
## Building

```bash
cargo build --release
```

## Running

```bash
# With default config
./target/release/relay

# With custom config
CONFIG_PATH=/path/to/config.yaml ./target/release/relay

# With Docker
docker build -t relay .
docker run -p 5555-5559:5555-5559 relay
```

## Environment Variables

- `CONFIG_PATH`: Path to config file (default: `/config/config.yaml`)
- `RUST_LOG`: Log level (default: `relay=info`)

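Resolving these variables with their documented defaults could look like the sketch below. Treating an empty value as unset is an assumption, and the helper names are illustrative rather than the relay's actual startup code.

```rust
use std::env;
use std::path::PathBuf;

const DEFAULT_CONFIG_PATH: &str = "/config/config.yaml";
const DEFAULT_LOG_FILTER: &str = "relay=info";

/// Picks the override when it is set and non-blank, else the default.
fn or_default(value: Option<String>, default: &str) -> String {
    match value {
        Some(v) if !v.trim().is_empty() => v,
        _ => default.to_string(),
    }
}

/// Config file location taken from CONFIG_PATH.
fn config_path() -> PathBuf {
    PathBuf::from(or_default(env::var("CONFIG_PATH").ok(), DEFAULT_CONFIG_PATH))
}

/// Log filter taken from RUST_LOG.
fn log_filter() -> String {
    or_default(env::var("RUST_LOG").ok(), DEFAULT_LOG_FILTER)
}
```
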
## Ports

| Port | Socket Type | Direction | Purpose |
|------|------------|-----------|---------|
| 5555 | PUB | → Ingestors | Work distribution with exchange prefix |
| 5556 | ROUTER | ← Ingestors | Response collection |
| 5557 | - | (Flink) | Flink market data publication |
| 5558 | XPUB | → Clients | Market data fanout |
| 5559 | ROUTER | ← Clients | Client request handling |

## Monitoring

The relay logs all major events:

```
INFO relay: Client request routing
INFO relay: Forwarded request to ingestors: prefix=BINANCE:, request_id=...
INFO relay: Received response from ingestor: request_id=..., status=OK
INFO relay: Sent response to client: request_id=...
WARN relay: Request timed out: request_id=...
```

## Performance

- **High water mark**: Configurable per socket (default: 10,000 messages)
- **Request timeout**: Automatic cleanup of expired requests (default: 30s)
- **Zero-copy proxying**: XPUB/XSUB market data forwarding
- **Async cleanup**: Background task for timeout management

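The timeout cleanup can be sketched as a periodic sweep over request deadlines. This assumes each pending request stores an `Instant` deadline keyed by request id. The `sweep_expired` function is a hypothetical helper, and passing `now` explicitly is a choice made here to keep the logic easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Removes every request whose deadline has passed and returns the
/// expired ids in sorted order so they can be logged.
fn sweep_expired(deadlines: &mut HashMap<String, Instant>, now: Instant) -> Vec<String> {
    let mut expired: Vec<String> = deadlines
        .iter()
        .filter(|(_, deadline)| **deadline <= now)
        .map(|(id, _)| id.clone())
        .collect();
    for id in &expired {
        deadlines.remove(id);
    }
    expired.sort();
    expired
}

/// Deadline for a request accepted at `received_at`.
fn deadline_for(received_at: Instant, timeout: Duration) -> Instant {
    received_at + timeout
}
```

A background task would call `sweep_expired(&mut deadlines, Instant::now())` on an interval and emit one timeout warning per returned id.
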
## Design Decisions

### Why Rust?

- **Performance**: Zero-cost abstractions, minimal overhead
- **Safety**: Memory safety without garbage collection
- **Concurrency**: Fearless concurrency with strong type system
- **ZMQ Integration**: Excellent ZMQ bindings

### Why ROUTER for clients?

- Preserves client identity for request/response matching
- Allows async responses (no blocking)
- Handles multiple concurrent clients efficiently

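The identity a ROUTER socket preserves is visible in the message envelope: a request from a REQ peer arrives as an identity frame, an empty delimiter frame, then the payload frames. A reply must carry the same envelope to reach the same client. The sketch below shows that framing on plain byte vectors; the function names are illustrative and no ZMQ binding is involved.

```rust
/// Splits a multipart message received on a ROUTER socket from a REQ
/// peer into (identity, payload frames). REQ inserts an empty delimiter
/// frame, and ROUTER prepends the sender's identity.
fn split_envelope(mut frames: Vec<Vec<u8>>) -> Option<(Vec<u8>, Vec<Vec<u8>>)> {
    if frames.len() < 3 || !frames[1].is_empty() {
        return None; // malformed envelope
    }
    let payload = frames.split_off(2); // frames now holds [identity, delimiter]
    let identity = frames.swap_remove(0);
    Some((identity, payload))
}

/// Rebuilds the envelope so the reply is routed back to the same client.
fn wrap_reply(identity: Vec<u8>, payload: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    let mut frames = Vec::with_capacity(payload.len() + 2);
    frames.push(identity);
    frames.push(Vec::new()); // empty delimiter expected by REQ
    frames.extend(payload);
    frames
}
```

Storing the identity while the request is in flight is what lets the relay answer later, out of order, without blocking other clients.
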
### Why PUB for ingestor work?

- Topic-based filtering by exchange
- Multiple ingestors can subscribe to the same exchange; PUB delivers each request to every matching subscriber rather than load-balancing across them
- Scales horizontally with ingestor count
- No single ingestor is a point of failure

### Why XPUB/XSUB for market data?

- Dynamic subscription management
- Efficient fanout to many clients
- Upstream subscription control
- Standard ZMQ proxy pattern

## Troubleshooting

### No response from ingestors

Check:
- Ingestors are connected to port 5555
- Ingestors have subscribed to exchange prefix
- Topic format: `EXCHANGE:` (e.g., `BINANCE:`)

### Client timeout

Check:
- Request timeout configuration
- Ingestor availability
- Network connectivity
- Pending requests map (logged on timeout)

### Market data not flowing

Check:
- Flink is publishing on port 5557
- Relay XSUB is connected to Flink
- Clients have subscribed to correct topics
- Topic format: `{ticker}|{data_type}` (e.g., `BINANCE:BTC/USDT|tick`)

## Testing

Run the test client:

```bash
cd ../test/history_client
python client.py
```

Expected flow:
1. Client sends request to relay:5559
2. Relay publishes the work request on port 5555, where ingestors are subscribed
3. Ingestor fetches and responds to relay:5556
4. Relay returns to client

## Future Enhancements

- [ ] Metrics collection (Prometheus)
- [ ] Health check endpoint
- [ ] Request rate limiting
- [ ] Circuit breaker for failed ingestors
- [ ] Request deduplication
- [ ] Response caching
- [ ] Multi-part response support for large datasets