CCXT Market Data Ingestor
A Node.js market data ingestor that uses CCXT to fetch historical OHLC data and realtime tick data from cryptocurrency exchanges. It integrates with Apache Flink via ZeroMQ for work distribution and writes data to Kafka.
Architecture
The ingestor is a worker process that:
- Connects to Flink's ZMQ work queue (PULL socket) to receive data requests
- Connects to Flink's ZMQ control channel (SUB socket) to receive control messages
- Fetches market data from exchanges using CCXT
- Writes data to Kafka using the protobuf protocol
Data Request Types
Historical OHLC
- Fetches historical candlestick data for a specified time range
- Uses CCXT's fetchOHLCV method
- Writes OHLC messages to Kafka
- Request is completed and removed from queue after processing
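Exchanges usually cap how many candles one call returns, so a long time range often has to be split into several fetchOHLCV calls. The sketch below shows one way to plan those calls. The helper name planOhlcvRequests and its parameters are hypothetical and not taken from src/ccxt-fetcher.js; the per-call candle cap is an assumed input, since it varies by exchange.

```javascript
// Hypothetical helper: split [startMs, endMs) into (since, limit) pairs.
// Each pair is meant to be passed to CCXT as
//   exchange.fetchOHLCV(symbol, timeframe, since, limit)
function planOhlcvRequests(startMs, endMs, timeframeMs, maxCandles) {
  if (timeframeMs <= 0 || maxCandles <= 0) {
    throw new Error('timeframeMs and maxCandles must be positive');
  }
  const requests = [];
  const windowMs = timeframeMs * maxCandles;
  for (let since = startMs; since < endMs; since += windowMs) {
    // Candles still needed from this window's start to the end of the range.
    const remaining = Math.ceil((endMs - since) / timeframeMs);
    requests.push({ since, limit: Math.min(maxCandles, remaining) });
  }
  return requests;
}

// 2500 one-minute candles with an assumed cap of 1000 per call.
const plan = planOhlcvRequests(0, 2500 * 60000, 60000, 1000);
console.log(plan.length); // 3
```

The last window asks only for the candles that remain, so the plan does not request data past the end of the range.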
Realtime Ticks
- Subscribes to realtime trade data
- Uses 10-second polling to fetch recent trades via fetchTrades
- Writes Tick messages to Kafka market-0 topic
- Subscription persists until cancelled by Flink control message
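Because each poll fetches recent trades, consecutive polls can return overlapping results. This README does not say how the poller handles that, so the following is only a sketch of one possible approach: remember trade ids already seen and forward the rest. The helper name filterNewTrades is hypothetical; the id field is part of CCXT's unified trade structure.

```javascript
// Hypothetical helper: keep only trades whose id was not seen in earlier polls.
// `seenIds` is a Set owned by the caller, one per subscription.
function filterNewTrades(trades, seenIds) {
  const fresh = [];
  for (const trade of trades) {
    if (seenIds.has(trade.id)) continue; // already forwarded in a previous poll
    seenIds.add(trade.id);
    fresh.push(trade);
  }
  return fresh;
}

// Two polls that overlap on trade "2".
const seen = new Set();
const firstPoll = filterNewTrades([{ id: '1' }, { id: '2' }], seen);
const secondPoll = filterNewTrades([{ id: '2' }, { id: '3' }], seen);
console.log(firstPoll.length, secondPoll.length); // 2 1
```

In a long-running subscription the set would need pruning (for example, dropping ids older than a few poll intervals) to keep memory bounded.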
Installation
npm install
Configuration
Create config.yaml based on config.example.yaml:
# Flink ZMQ endpoints
flink_hostname: localhost
ingestor_work_port: 5555
ingestor_control_port: 5556
# Kafka configuration
kafka_brokers:
- localhost:9092
kafka_topic: market-0
# Worker configuration
max_concurrent: 10
poll_interval_ms: 10000
An optional secrets.yaml can hold sensitive configuration.
Usage
Development
npm run dev
Production
npm start
Docker
docker build -t ccxt-ingestor .
docker run -v /path/to/config:/config ccxt-ingestor
Ticker Format
Tickers must be in the format: EXCHANGE:SYMBOL
Examples:
- BINANCE:BTC/USDT
- COINBASE:ETH/USD
- KRAKEN:XRP/EUR
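A minimal sketch of parsing this format follows. The function name parseTicker is hypothetical, and lowercasing the exchange part to match CCXT's lowercase exchange ids (such as binance) is an assumption about how the ingestor maps names.

```javascript
// Hypothetical helper: split "EXCHANGE:SYMBOL" into its two parts.
function parseTicker(ticker) {
  // Split on the first colon only, so a symbol that itself contains a
  // colon is preserved intact.
  const idx = ticker.indexOf(':');
  if (idx <= 0 || idx === ticker.length - 1) {
    throw new Error(`Invalid ticker "${ticker}", expected EXCHANGE:SYMBOL`);
  }
  return {
    exchangeId: ticker.slice(0, idx).toLowerCase(), // assumed CCXT id mapping
    symbol: ticker.slice(idx + 1),
  };
}

console.log(parseTicker('BINANCE:BTC/USDT'));
// { exchangeId: 'binance', symbol: 'BTC/USDT' }
```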
Protocol
ZeroMQ Message Format
All messages use a two-frame envelope:
Frame 1: [1 byte: protocol version = 0x01]
Frame 2: [1 byte: message type ID][N bytes: protobuf message]
Message Type IDs
- 0x01: DataRequest
- 0x02: IngestorControl
- 0x03: Tick
- 0x04: OHLC
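The two-frame envelope and type IDs above can be sketched with plain Node.js Buffers. The function and constant names are hypothetical and not taken from src/zmq-client.js; the payload is treated as already-encoded protobuf bytes.

```javascript
const PROTOCOL_VERSION = 0x01;
const MESSAGE_TYPES = {
  DATA_REQUEST: 0x01,
  INGESTOR_CONTROL: 0x02,
  TICK: 0x03,
  OHLC: 0x04,
};

// Build the two frames: [version] and [type ID + protobuf bytes].
function encodeEnvelope(typeId, payload) {
  const versionFrame = Buffer.from([PROTOCOL_VERSION]);
  const messageFrame = Buffer.concat([Buffer.from([typeId]), payload]);
  return [versionFrame, messageFrame];
}

// Validate the version frame and split the message frame back apart.
function decodeEnvelope(frames) {
  if (frames.length !== 2) {
    throw new Error(`Expected 2 frames, got ${frames.length}`);
  }
  const [versionFrame, messageFrame] = frames;
  if (versionFrame.length !== 1 || versionFrame[0] !== PROTOCOL_VERSION) {
    throw new Error('Unsupported protocol version');
  }
  if (messageFrame.length < 1) {
    throw new Error('Message frame is missing its type ID');
  }
  return { typeId: messageFrame[0], payload: messageFrame.subarray(1) };
}

const frames = encodeEnvelope(MESSAGE_TYPES.TICK, Buffer.from([0xaa, 0xbb]));
console.log(decodeEnvelope(frames).typeId); // 3
```

The returned array has the shape of a ZeroMQ multipart message, one Buffer per frame.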
DataRequest (from Flink)
message DataRequest {
  string request_id = 1;
  RequestType type = 2; // HISTORICAL_OHLC or REALTIME_TICKS
  string ticker = 3;
  optional HistoricalParams historical = 4;
  optional RealtimeParams realtime = 5;
}
IngestorControl (from Flink)
message IngestorControl {
  ControlAction action = 1; // CANCEL, SHUTDOWN, CONFIG_UPDATE, HEARTBEAT
  optional string request_id = 2;
  optional IngestorConfig config = 3;
}
Tick (to Kafka)
message Tick {
  string trade_id = 1;
  string ticker = 2;
  uint64 timestamp = 3; // microseconds
  int64 price = 4; // fixed-point (10^8)
  int64 amount = 5; // fixed-point (10^8)
  int64 quote_amount = 6; // fixed-point (10^8)
  bool taker_buy = 7;
}
OHLC (to Kafka)
message OHLC {
  int64 open = 2; // fixed-point (10^8)
  int64 high = 3;
  int64 low = 4;
  int64 close = 5;
  optional int64 volume = 6;
  optional int64 open_time = 9; // microseconds
  optional int64 close_time = 12;
  string ticker = 14;
}
Fixed-Point Encoding
All prices and amounts are encoded as fixed-point integers using 8 decimal places (denominator = 10^8):
- Example: 123.45678901 → 12345678901
- Storing values as integers keeps all 8 decimal places exact and avoids floating-point rounding errors
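A minimal sketch of this encoding, assuming values arrive as decimal strings and are held as BigInt. The function names are hypothetical and not taken from src/proto/messages.js, and truncating digits beyond the eighth decimal place (rather than rounding) is an assumption.

```javascript
const DECIMALS = 8;

// Parse a plain decimal string into a BigInt scaled by 10^8.
// Working on the string avoids any intermediate floating-point value.
function toFixedPoint(value) {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(String(value).trim());
  if (!match) throw new Error(`Not a plain decimal: ${value}`);
  const [, sign, whole, frac = ''] = match;
  // Assumption: extra fractional digits are truncated, not rounded.
  const scaledFrac = frac.slice(0, DECIMALS).padEnd(DECIMALS, '0');
  const scaled = BigInt(whole + scaledFrac);
  return sign === '-' ? -scaled : scaled;
}

// Format a scaled BigInt back into a decimal string with 8 places.
function fromFixedPoint(fixed) {
  const negative = fixed < 0n;
  const digits = (negative ? -fixed : fixed)
    .toString()
    .padStart(DECIMALS + 1, '0');
  const whole = digits.slice(0, -DECIMALS);
  const frac = digits.slice(-DECIMALS);
  return `${negative ? '-' : ''}${whole}.${frac}`;
}

console.log(toFixedPoint('123.45678901')); // 12345678901n
console.log(fromFixedPoint(12345678901n)); // 123.45678901
```

BigInt is used here because a value scaled by 10^8 can exceed Number.MAX_SAFE_INTEGER: a quantity of 100,000,000 units already scales to 10^16.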
Components
src/index.js
Main worker process that coordinates all components and handles the work loop.
src/zmq-client.js
ZeroMQ client for connecting to Flink's work queue and control channel.
src/kafka-producer.js
Kafka producer for writing protobuf-encoded messages to Kafka topics.
src/ccxt-fetcher.js
CCXT wrapper for fetching historical OHLC and recent trades from exchanges.
src/realtime-poller.js
Manages realtime subscriptions with 10-second polling for trade updates.
src/proto/messages.js
Protobuf message definitions and encoding/decoding utilities.
Error Handling
- Failed requests automatically return to the Flink work queue
- Realtime subscriptions are cancelled after 5 consecutive errors
- Worker logs all errors with context for debugging
- Graceful shutdown on SIGINT/SIGTERM
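The 5-consecutive-error rule for realtime subscriptions can be sketched as a per-request counter that resets on any successful poll. The class name ErrorTracker and its methods are hypothetical and not taken from src/realtime-poller.js.

```javascript
const MAX_CONSECUTIVE_ERRORS = 5;

// Hypothetical tracker: counts consecutive failures per request ID.
class ErrorTracker {
  constructor(limit = MAX_CONSECUTIVE_ERRORS) {
    this.limit = limit;
    this.counts = new Map();
  }

  // A successful poll clears the streak for that subscription.
  recordSuccess(requestId) {
    this.counts.delete(requestId);
  }

  // Returns true once the streak reaches the limit, meaning the
  // caller should cancel the subscription.
  recordError(requestId) {
    const streak = (this.counts.get(requestId) ?? 0) + 1;
    this.counts.set(requestId, streak);
    return streak >= this.limit;
  }
}

const tracker = new ErrorTracker();
let shouldCancel = false;
for (let i = 0; i < 5; i++) {
  shouldCancel = tracker.recordError('req-1');
}
console.log(shouldCancel); // true
```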
Monitoring
The worker logs status information every 60 seconds including:
- Number of active requests
- Realtime subscription statistics
- Error counts
Environment Variables
- CONFIG_PATH: Path to config.yaml (default: /config/config.yaml)
- SECRETS_PATH: Path to secrets.yaml (default: /config/secrets.yaml)
- LOG_LEVEL: Log level (default: info)
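As an illustration, these variables and their defaults could be resolved as follows. The function name resolveSettings and the returned field names are hypothetical; only the variable names and defaults come from this README.

```javascript
// Hypothetical helper: read the three variables, falling back to the
// documented defaults when a variable is unset or empty.
function resolveSettings(env = process.env) {
  return {
    configPath: env.CONFIG_PATH || '/config/config.yaml',
    secretsPath: env.SECRETS_PATH || '/config/secrets.yaml',
    logLevel: env.LOG_LEVEL || 'info',
  };
}

console.log(resolveSettings({ LOG_LEVEL: 'debug' }));
// { configPath: '/config/config.yaml',
//   secretsPath: '/config/secrets.yaml',
//   logLevel: 'debug' }
```

Passing the environment as a parameter keeps the helper easy to test without modifying process.env.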
Supported Exchanges
All exchanges supported by CCXT can be used. Popular exchanges include:
- Binance
- Coinbase
- Kraken
- Bitfinex
- Huobi
- And 100+ more
Development
Project Structure
redesign/ingestor/
├── src/
│   ├── index.js              # Main worker
│   ├── zmq-client.js         # ZMQ client
│   ├── kafka-producer.js     # Kafka producer
│   ├── ccxt-fetcher.js       # CCXT wrapper
│   ├── realtime-poller.js    # Realtime poller
│   └── proto/
│       └── messages.js       # Protobuf definitions
├── config.example.yaml
├── Dockerfile
├── package.json
└── README.md
License
ISC