# DexOrder Python Client Library

High-level Python API for accessing historical OHLC data from the DexOrder trading platform.

## Features

- **Smart Caching**: Automatically checks the Iceberg warehouse before requesting new data
- **Async Request/Response**: Non-blocking historical data requests via the relay
- **Gap Detection**: Identifies and requests only the missing data ranges
- **Transparent Access**: Single API for both cached and on-demand data

## Installation

```bash
cd redesign/client-py
pip install -e .
```

## Quick Start

```python
import asyncio
from dexorder import OHLCClient

async def main():
    # Initialize client
    client = OHLCClient(
        iceberg_catalog_uri="http://iceberg-catalog:8181",
        relay_endpoint="tcp://relay:5555",
        notification_endpoint="tcp://flink:5557"
    )

    # Start background notification listener
    await client.start()

    try:
        # Fetch OHLC data (automatically checks cache and requests missing data)
        df = await client.fetch_ohlc(
            ticker="BINANCE:BTC/USDT",
            period_seconds=3600,            # 1-hour candles
            start_time=1735689600000000,    # microseconds
            end_time=1736294399000000
        )

        print(f"Fetched {len(df)} candles")
        print(df.head())
    finally:
        await client.stop()

# Run
asyncio.run(main())
```

## Using Context Manager

```python
async def main():
    async with OHLCClient(...) as client:
        df = await client.fetch_ohlc(...)
```

## Architecture

### Components

1. **OHLCClient**: High-level API with smart caching
2. **IcebergClient**: Direct queries to the Iceberg warehouse
3. **HistoryClient**: Submits requests via the relay and waits for notifications

### Data Flow

```
┌─────────┐
│ Client  │
└────┬────┘
     │ 1. fetch_ohlc()
     ▼
┌─────────────────┐
│   OHLCClient    │
└────┬────────────┘
     │ 2. Check Iceberg
     ▼
┌─────────────────┐      ┌──────────┐
│  IcebergClient  │─────▶│ Iceberg  │
└─────────────────┘      └──────────┘
     │ 3. Missing data?
     ▼
┌─────────────────┐      ┌──────────┐
│  HistoryClient  │─────▶│  Relay   │
└────┬────────────┘      └──────────┘
     │                        │
     │ 4. Wait for notification
     │◀───────────────────────┘
     │ 5. Query Iceberg again
     ▼
┌─────────────────┐
│  Return data    │
└─────────────────┘
```

## API Reference

### OHLCClient

#### `__init__(iceberg_catalog_uri, relay_endpoint, notification_endpoint, namespace="trading")`

Initialize the client with connection parameters.

#### `async fetch_ohlc(ticker, period_seconds, start_time, end_time, request_timeout=30.0)`

Fetch OHLC data with smart caching.

**Parameters:**
- `ticker` (str): Market identifier (e.g., "BINANCE:BTC/USDT")
- `period_seconds` (int): OHLC period in seconds (60, 300, 3600, 86400, etc.)
- `start_time` (int): Start timestamp in microseconds
- `end_time` (int): End timestamp in microseconds
- `request_timeout` (float): Timeout for historical requests in seconds

**Returns:** `pd.DataFrame` with columns:
- `ticker`: Market identifier
- `period_seconds`: Period in seconds
- `timestamp`: Candle timestamp (microseconds)
- `open`, `high`, `low`, `close`: Prices (integer format)
- `volume`: Trading volume
- Additional fields: `buy_vol`, `sell_vol`, `open_interest`, etc.

### IcebergClient

Direct access to the Iceberg warehouse.

#### `query_ohlc(ticker, period_seconds, start_time, end_time)`

Query OHLC data directly from Iceberg.

#### `find_missing_ranges(ticker, period_seconds, start_time, end_time)`

Identify missing data ranges. Returns a list of `(start_time, end_time)` tuples.

#### `has_data(ticker, period_seconds, start_time, end_time)`

Check whether any data exists for the given parameters.

### HistoryClient

Low-level client for submitting historical data requests.

**IMPORTANT**: Always call `connect()` before making requests to prevent a race condition.

#### `async connect()`

Connect to the relay and start the notification listener. **MUST be called before making any requests.** This subscribes to the notification topic `RESPONSE:{client_id}` *before* any requests are sent, preventing the race condition where a notification arrives before the subscription exists.
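Why subscribe-first ordering matters can be illustrated without ZMQ at all. The toy broker below mimics PUB/SUB semantics (a message on a topic with no subscribers is silently dropped, as in ZMQ); the `Broker` class and topic name are illustrative stand-ins, not part of the library:

```python
import asyncio

class Broker:
    """Toy PUB/SUB broker: like ZMQ, a message published on a topic
    with no subscribers is silently dropped."""
    def __init__(self):
        self.subs = {}  # topic -> list of asyncio.Queue

    def subscribe(self, topic):
        q = asyncio.Queue()
        self.subs.setdefault(topic, []).append(q)
        return q

    def publish(self, topic, msg):
        for q in self.subs.get(topic, []):  # no subscribers -> dropped
            q.put_nowait(msg)

async def demo():
    broker = Broker()
    topic = "RESPONSE:client-1"

    # Wrong order: notification published before we subscribe -> lost,
    # and a client awaiting it would block until its timeout.
    broker.publish(topic, "done")
    q = broker.subscribe(topic)
    lost = q.empty()  # True: the notification is gone

    # Right order (what connect() enforces): subscribe first.
    q2 = broker.subscribe(topic)
    broker.publish(topic, "done")
    received = await q2.get()  # "done"
    return lost, received

lost, received = asyncio.run(demo())
print(lost, received)  # True done
```

This is why `connect()` must complete before the first `request_historical_ohlc()` call: the subscription has to exist before the relay can possibly publish the completion notification.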
#### `async request_historical_ohlc(ticker, period_seconds, start_time, end_time, timeout=30.0, limit=None)`

Submit a historical data request and wait for the completion notification.

**Returns:** dict with keys:
- `request_id`: The request ID
- `status`: 'OK', 'NOT_FOUND', or 'ERROR'
- `error_message`: Error message if status is 'ERROR'
- `iceberg_namespace`, `iceberg_table`, `row_count`: Available when status is 'OK'

**Example:**

```python
from dexorder import HistoryClient

client = HistoryClient(
    relay_endpoint="tcp://relay:5559",
    notification_endpoint="tcp://relay:5558"
)

# CRITICAL: Connect first to prevent the race condition
await client.connect()

# Now safe to make requests
result = await client.request_historical_ohlc(
    ticker="BINANCE:BTC/USDT",
    period_seconds=3600,
    start_time=1735689600000000,
    end_time=1736294399000000
)

await client.close()
```

## Configuration

The client requires the following endpoints:

- **Iceberg Catalog URI**: REST API endpoint for Iceberg metadata (default: `http://iceberg-catalog:8181`)
- **Relay Endpoint**: ZMQ REQ/REP endpoint for submitting requests (default: `tcp://relay:5555`)
- **Notification Endpoint**: ZMQ PUB/SUB endpoint for receiving notifications (default: `tcp://flink:5557`)

## Development

### Generate Protobuf Files

```bash
cd redesign/protobuf
protoc -I . --python_out=../client-py/dexorder ingestor.proto ohlc.proto
```

### Run Tests

```bash
pytest tests/
```

## Examples

See `../relay/test/async_client.py` for a complete example.
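## Gap Detection Sketch

The actual gap scan lives inside `IcebergClient.find_missing_ranges` and is not reproduced here. Assuming candles are stored at timestamps aligned to period boundaries (all values in microseconds), a minimal standalone version of the idea might look like:

```python
def find_missing_ranges(cached, start, end, period_us):
    """Return (start, end) ranges not covered by the cached candles.

    cached:    iterable of candle timestamps already in the warehouse,
               aligned to period boundaries (microseconds since epoch)
    start/end: requested window in microseconds, end exclusive
    period_us: candle period in microseconds
    """
    have = set(cached)
    missing = []
    gap_start = None
    t = start
    while t < end:
        if t not in have:
            if gap_start is None:
                gap_start = t       # a new gap begins here
        elif gap_start is not None:
            missing.append((gap_start, t))  # gap closed by a cached candle
            gap_start = None
        t += period_us
    if gap_start is not None:
        missing.append((gap_start, end))    # gap runs to the window end
    return missing

hour = 3_600 * 1_000_000
# Candles for hours 0 and 2 are cached; hours 1 and 3 are missing.
print(find_missing_ranges([0, 2 * hour], 0, 4 * hour, hour))
# [(3600000000, 7200000000), (10800000000, 14400000000)]
```

`fetch_ohlc` can then submit one `request_historical_ohlc` per returned range instead of re-requesting the whole window, which is what makes the caching layer cheap for mostly-warm queries.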
## Timestamp Format

All timestamps are in **microseconds since epoch**:

```python
from datetime import datetime, timezone

# Convert from datetime
dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
timestamp_micros = int(dt.timestamp() * 1_000_000)

# Convert back to datetime
dt = datetime.fromtimestamp(timestamp_micros / 1_000_000, tz=timezone.utc)
```

## Period Seconds

Common period values:

- `60` - 1 minute
- `300` - 5 minutes
- `900` - 15 minutes
- `3600` - 1 hour
- `14400` - 4 hours
- `86400` - 1 day
- `604800` - 1 week

## Error Handling

```python
try:
    df = await client.fetch_ohlc(...)
except TimeoutError:
    print("Request timed out")
except ValueError as e:
    print(f"Request failed: {e}")
except ConnectionError:
    print("Unable to connect to relay")
```

## License

Internal use only.