backend redesign

2026-03-11 18:47:11 -04:00
parent 8ff277c8c6
commit e99ef5d2dd
210 changed files with 12147 additions and 155 deletions

relay/README.md Normal file

@@ -0,0 +1,238 @@
# ZMQ Relay Gateway
High-performance ZMQ relay/gateway that routes messages between clients, Flink, and ingestors.
## Architecture
The relay acts as a well-known bind point for all components:
```
┌─────────┐                    ┌───────┐                    ┌──────────┐
│ Clients │◄──────────────────►│ Relay │◄──────────────────►│ Ingestors│
└─────────┘                    └───┬───┘                    └──────────┘
                                   │
                               ┌────────┐
                               │ Flink  │
                               └────────┘
```
## Responsibilities
### 1. Client Request Routing
- **Socket**: ROUTER (bind on port 5559)
- **Flow**: Client REQ → Relay ROUTER → Relay PUB → Ingestor SUB
- Receives OHLC requests from clients
- Routes to appropriate ingestors using exchange prefix filtering
- Tracks pending requests and matches responses
### 2. Ingestor Work Distribution
- **Socket**: PUB (bind on port 5555)
- **Pattern**: Topic-based distribution with exchange prefixes
- Publishes work requests with exchange prefix (e.g., `BINANCE:`)
- Ingestors subscribe to exchanges they support
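
ZeroMQ SUB sockets filter on a byte prefix of the first message frame, which is what makes the exchange prefix usable as a routing key. The sketch below mimics that matching in plain Python. The helper names are hypothetical, `KRAKEN:` is only an illustrative second exchange, and the `EXCHANGE:SYMBOL` ticker shape is assumed from the `BINANCE:BTC/USDT` example used elsewhere in this README.

```python
from typing import List


def exchange_prefix(ticker: str) -> bytes:
    """Derive the work topic (e.g. b"BINANCE:") from an EXCHANGE:SYMBOL ticker."""
    exchange, sep, _symbol = ticker.partition(":")
    if not sep or not exchange:
        raise ValueError(f"ticker has no exchange prefix: {ticker!r}")
    return f"{exchange}:".encode()


def is_delivered(topic: bytes, subscriptions: List[bytes]) -> bool:
    """Mimic SUB-side filtering: a message passes if any subscription
    is a byte prefix of its topic frame (an empty subscription matches all)."""
    return any(topic.startswith(sub) for sub in subscriptions)


topic = exchange_prefix("BINANCE:BTC/USDT")
print(topic)                                            # b'BINANCE:'
print(is_delivered(topic, [b"BINANCE:", b"KRAKEN:"]))   # True
print(is_delivered(topic, [b"KRAKEN:"]))                # False
```

Because matching is prefix-based, an ingestor that subscribes to an empty prefix would receive work for every exchange.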
### 3. Response Routing
- **Socket**: ROUTER (bind on port 5556)
- **Flow**: Ingestor DEALER → Relay ROUTER → Client REQ
- Receives responses from ingestors
- Matches responses to pending client requests by request_id
- Returns data to waiting clients
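
As a rough illustration of the matching step, the pending-request table can be thought of as a map from `request_id` to the client identity frame that the client-facing ROUTER socket recorded. This is a minimal sketch, not the relay's actual (Rust) implementation; the class and method names are hypothetical.

```python
from typing import Dict, Optional


class PendingRequests:
    """Maps request_id -> client identity frame (hypothetical helper)."""

    def __init__(self) -> None:
        self._by_id: Dict[str, bytes] = {}

    def register(self, request_id: str, client_identity: bytes) -> None:
        # Called when a client request arrives on the client-facing ROUTER.
        self._by_id[request_id] = client_identity

    def resolve(self, request_id: str) -> Optional[bytes]:
        # Called when an ingestor response arrives. Popping the entry means a
        # duplicate or late response for the same request_id resolves to None.
        return self._by_id.pop(request_id, None)


pending = PendingRequests()
pending.register("req-1", b"client-A")
print(pending.resolve("req-1"))  # b'client-A'
print(pending.resolve("req-1"))  # None
```

Removing the entry on first match is one simple way to ignore a second response when more than one ingestor answers the same request.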
### 4. Market Data Fanout
- **Sockets**: XPUB (bind on 5558) + XSUB (connect to Flink:5557)
- **Pattern**: XPUB/XSUB proxy
- Relays market data from Flink to multiple clients
- Manages subscriptions dynamically
- Forwards subscription messages upstream to Flink
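
On the wire, an XPUB socket surfaces subscriptions as ordinary messages: a first byte of `0x01` (subscribe) or `0x00` (unsubscribe) followed by the topic bytes. The sketch below decodes such frames and tracks the set of active topics. It is illustrative only; the function names are hypothetical and the relay may simply forward these frames to the XSUB side without inspecting them.

```python
from typing import Optional, Set, Tuple


def parse_subscription(frame: bytes) -> Optional[Tuple[bool, bytes]]:
    """Decode an XPUB subscription frame into (is_subscribe, topic)."""
    if not frame or frame[0] not in (0, 1):
        return None  # not a subscription message
    return frame[0] == 1, frame[1:]


def apply_subscription(active: Set[bytes], frame: bytes) -> None:
    """Update the set of active topics in place from one subscription frame."""
    parsed = parse_subscription(frame)
    if parsed is None:
        return
    is_subscribe, topic = parsed
    if is_subscribe:
        active.add(topic)
    else:
        active.discard(topic)


topics: Set[bytes] = set()
apply_subscription(topics, b"\x01BINANCE:BTC/USDT|tick")
print(topics)  # {b'BINANCE:BTC/USDT|tick'}
apply_subscription(topics, b"\x00BINANCE:BTC/USDT|tick")
print(topics)  # set()
```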
## Message Flows
### Historical Data Request
```
1. Client → Relay
Socket: REQ → ROUTER (5559)
Message: OHLCRequest (0x07)
2. Relay → Ingestor
Socket: PUB (5555)
Topic: Exchange prefix (e.g., "BINANCE:")
Message: DataRequest (0x01)
3. Ingestor fetches data from exchange
4. Ingestor → Relay
Socket: DEALER → ROUTER (5556)
Message: DataResponse (0x02)
5. Relay → Client
Socket: ROUTER → REQ
Message: Response (0x08)
```
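
Steps 1 and 5 rely on the standard REQ/ROUTER envelope: a ROUTER socket receives `[identity, empty delimiter, body...]` from a REQ peer and must send the same envelope back for the reply to reach that client. The sketch below names the message type codes listed in the flow and shows the envelope handling. The payload encoding is not described in this README, so body frames are treated as opaque bytes, and the enum and function names are hypothetical.

```python
from enum import IntEnum
from typing import List, Tuple


class MessageType(IntEnum):
    """Type codes as listed in the flow above."""
    DATA_REQUEST = 0x01   # relay -> ingestor
    DATA_RESPONSE = 0x02  # ingestor -> relay
    OHLC_REQUEST = 0x07   # client -> relay
    RESPONSE = 0x08       # relay -> client


def split_envelope(frames: List[bytes]) -> Tuple[bytes, List[bytes]]:
    """Split frames received on a ROUTER from a REQ peer into (identity, body)."""
    if len(frames) < 3 or frames[1] != b"":
        raise ValueError("expected [identity, empty delimiter, body...]")
    return frames[0], frames[2:]


def build_reply(identity: bytes, body: List[bytes]) -> List[bytes]:
    """Rebuild the envelope so the ROUTER can route the reply to the REQ client."""
    return [identity, b"", *body]


identity, body = split_envelope([b"client-A", b"", b"<opaque request bytes>"])
print(build_reply(identity, [b"<opaque response bytes>"]))
# [b'client-A', b'', b'<opaque response bytes>']
```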
### Market Data Subscription
```
1. Client subscribes to ticker
Socket: SUB → XPUB (5558)
Topic: "BINANCE:BTC/USDT|tick"
2. Relay forwards subscription
Socket: XSUB → Flink PUB (5557)
3. Flink publishes data
Socket: PUB (5557) → XSUB
4. Relay fans out to clients
Socket: XPUB (5558) → SUB
```
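
A client needs to produce exactly the topic bytes the publisher uses, so it can help to build and parse topics in one place. This sketch assumes the `{ticker}|{data_type}` layout seen in the example topic `BINANCE:BTC/USDT|tick`; the helper names are hypothetical.

```python
from typing import NamedTuple


class Topic(NamedTuple):
    ticker: str
    data_type: str


def build_topic(ticker: str, data_type: str) -> bytes:
    """Format a subscription topic such as b"BINANCE:BTC/USDT|tick"."""
    if "|" in ticker or "|" in data_type:
        raise ValueError("'|' is reserved as the topic separator")
    return f"{ticker}|{data_type}".encode()


def parse_topic(raw: bytes) -> Topic:
    """Split a topic frame back into its ticker and data type."""
    ticker, sep, data_type = raw.decode().rpartition("|")
    if not sep or not ticker or not data_type:
        raise ValueError(f"malformed topic: {raw!r}")
    return Topic(ticker, data_type)


topic = build_topic("BINANCE:BTC/USDT", "tick")
print(topic)               # b'BINANCE:BTC/USDT|tick'
print(parse_topic(topic))  # Topic(ticker='BINANCE:BTC/USDT', data_type='tick')
```

With pyzmq, the resulting bytes could be passed to `socket.setsockopt(zmq.SUBSCRIBE, topic)` on a SUB socket connected to port 5558.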
## Configuration
Edit `config.yaml`:
```yaml
bind_address: "tcp://*"
client_request_port: 5559
market_data_pub_port: 5558
ingestor_work_port: 5555
ingestor_response_port: 5556
flink_market_data_endpoint: "tcp://flink-jobmanager:5557"
request_timeout_secs: 30
high_water_mark: 10000
```
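
To make the relationship between these keys concrete, the sketch below mirrors the config as a dataclass and derives the endpoints the relay would bind. It assumes each bind endpoint is formed as `bind_address` followed by `:` and the port, which is not confirmed by this README, and the class is hypothetical rather than the relay's own config type.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RelayConfig:
    """Mirror of config.yaml with the values shown above as defaults."""
    bind_address: str = "tcp://*"
    client_request_port: int = 5559
    market_data_pub_port: int = 5558
    ingestor_work_port: int = 5555
    ingestor_response_port: int = 5556
    flink_market_data_endpoint: str = "tcp://flink-jobmanager:5557"
    request_timeout_secs: int = 30
    high_water_mark: int = 10000

    def bind_endpoints(self) -> Dict[str, str]:
        # Assumption: endpoint = "<bind_address>:<port>" for every bound socket.
        ports = {
            "client_request": self.client_request_port,
            "market_data_pub": self.market_data_pub_port,
            "ingestor_work": self.ingestor_work_port,
            "ingestor_response": self.ingestor_response_port,
        }
        return {name: f"{self.bind_address}:{port}" for name, port in ports.items()}


cfg = RelayConfig()
print(cfg.bind_endpoints()["client_request"])  # tcp://*:5559
print(cfg.flink_market_data_endpoint)          # tcp://flink-jobmanager:5557
```

Note that the Flink endpoint is a connect target rather than a bind address, so it is kept as a full URL.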
## Building
```bash
cargo build --release
```
## Running
```bash
# With default config
./target/release/relay
# With custom config
CONFIG_PATH=/path/to/config.yaml ./target/release/relay
# With Docker
docker build -t relay .
docker run -p 5555-5559:5555-5559 relay
```
## Environment Variables
- `CONFIG_PATH`: Path to config file (default: `/config/config.yaml`)
- `RUST_LOG`: Log level (default: `relay=info`)
## Ports
| Port | Socket Type | Direction | Purpose |
|------|------------|-----------|---------|
| 5555 | PUB | → Ingestors | Work distribution with exchange prefix |
| 5556 | ROUTER | ← Ingestors | Response collection |
| 5557 | - | (Flink) | Flink market data publication |
| 5558 | XPUB | → Clients | Market data fanout |
| 5559 | ROUTER | ← Clients | Client request handling |
## Monitoring
The relay logs all major events:
```
INFO relay: Client request routing
INFO relay: Forwarded request to ingestors: prefix=BINANCE:, request_id=...
INFO relay: Received response from ingestor: request_id=..., status=OK
INFO relay: Sent response to client: request_id=...
WARN relay: Request timed out: request_id=...
```
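
When following one request through the logs, it can help to pull the `key=value` fields out of each line. The sketch below assumes the field layout shown in the sample lines above (comma-separated `key=value` pairs); the `req-42` identifier is made up for illustration and the real log format may differ.

```python
import re
from typing import Dict

# Matches key=value pairs; a value runs until the next comma or whitespace.
_FIELD = re.compile(r"(\w+)=([^,\s]+)")


def parse_fields(line: str) -> Dict[str, str]:
    """Extract key=value fields from one relay log line."""
    return dict(_FIELD.findall(line))


line = "INFO relay: Forwarded request to ingestors: prefix=BINANCE:, request_id=req-42"
print(parse_fields(line))  # {'prefix': 'BINANCE:', 'request_id': 'req-42'}
```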
## Performance
- **High water mark**: Configurable via `high_water_mark` (default: 10,000 messages per socket)
- **Request timeout**: Automatic cleanup of expired requests (default: 30s)
- **Zero-copy proxying**: XPUB/XSUB market data forwarding
- **Async cleanup**: Background task for timeout management
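
The timeout cleanup can be pictured as a periodic sweep over the pending requests. The sketch below is a simplified Python illustration, not the relay's Rust background task: it assumes each pending entry records a creation timestamp, and the function name is hypothetical.

```python
import time
from typing import Dict, List, Optional


def expire_requests(
    pending: Dict[str, float],
    timeout_secs: float,
    now: Optional[float] = None,
) -> List[str]:
    """Remove and return the request_ids older than timeout_secs.

    `pending` maps request_id -> creation time from time.monotonic().
    """
    now = time.monotonic() if now is None else now
    expired = [rid for rid, created in pending.items() if now - created >= timeout_secs]
    for rid in expired:
        del pending[rid]
    return expired


pending = {"req-1": 0.0, "req-2": 25.0}
print(expire_requests(pending, timeout_secs=30, now=31.0))  # ['req-1']
print(pending)                                              # {'req-2': 25.0}
```

Using a monotonic clock avoids spurious expiries when the wall clock is adjusted.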
## Design Decisions
### Why Rust?
- **Performance**: Zero-cost abstractions, minimal overhead
- **Safety**: Memory safety without garbage collection
- **Concurrency**: Fearless concurrency with strong type system
- **ZMQ Integration**: Excellent ZMQ bindings
### Why ROUTER for clients?
- Preserves client identity for request/response matching
- Allows async responses (no blocking)
- Handles multiple concurrent clients efficiently
### Why PUB for ingestor work?
- Topic-based filtering by exchange
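- Multiple ingestors can subscribe to the same exchange; PUB delivers each request to every matching subscriber rather than load-balancing, so they compete to answer it
- Scales horizontally with ingestor count
- No single ingestor is a point of failure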
### Why XPUB/XSUB for market data?
- Dynamic subscription management
- Efficient fanout to many clients
- Upstream subscription control
- Standard ZMQ proxy pattern
## Troubleshooting
### No response from ingestors
Check:
- Ingestors are connected to the relay's work port (5555)
- Ingestors have subscribed to the correct exchange prefix
- Topic format: `EXCHANGE:` (e.g., `BINANCE:`)
### Client timeout
Check:
- Request timeout configuration
- Ingestor availability
- Network connectivity
- Pending requests map (logged on timeout)
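
For the network connectivity check, a plain TCP connect against each relay port is a quick first test. This sketch only shows that something is listening on the port, not that the ZeroMQ handshake succeeds. The function names are hypothetical and the host is whatever address the relay is reachable at.

```python
import socket
from typing import Dict

# Ports bound by the relay, per the Ports table.
RELAY_PORTS = {
    5555: "ingestor work (PUB)",
    5556: "ingestor responses (ROUTER)",
    5558: "market data (XPUB)",
    5559: "client requests (ROUTER)",
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_relay(host: str) -> Dict[int, bool]:
    """Probe every relay port and report which ones accept connections."""
    return {port: port_open(host, port) for port in RELAY_PORTS}
```

Calling `check_relay("localhost")` on the relay host returns a port-to-bool map; a `False` entry points at a bind failure, a firewall rule, or a missing Docker port mapping.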
### Market data not flowing
Check:
- Flink is publishing on port 5557
- Relay XSUB is connected to Flink
- Clients have subscribed to correct topics
- Topic format: `{ticker}|{data_type}`
## Testing
Run the test client:
```bash
cd ../test/history_client
python client.py
```
Expected flow:
1. Client sends request to relay:5559
2. Relay publishes to ingestors:5555
3. Ingestor fetches and responds to relay:5556
4. Relay returns to client
## Future Enhancements
- [ ] Metrics collection (Prometheus)
- [ ] Health check endpoint
- [ ] Request rate limiting
- [ ] Circuit breaker for failed ingestors
- [ ] Request deduplication
- [ ] Response caching
- [ ] Multi-part response support for large datasets