# DexOrder Python Client Library

High-level Python API for accessing historical OHLC data from the DexOrder trading platform.

## Features

- **Smart Caching**: Automatically checks Iceberg warehouse before requesting new data
- **Async Request/Response**: Non-blocking historical data requests via relay
- **Gap Detection**: Identifies and requests only missing data ranges
- **Transparent Access**: Single API for both cached and on-demand data

## Installation

```bash
cd redesign/client-py
pip install -e .
```

## Quick Start

```python
import asyncio
from dexorder import OHLCClient

async def main():
    # Initialize client
    client = OHLCClient(
        iceberg_catalog_uri="http://iceberg-catalog:8181",
        relay_endpoint="tcp://relay:5555",
        notification_endpoint="tcp://flink:5557"
    )

    # Start background notification listener
    await client.start()

    try:
        # Fetch OHLC data (automatically checks cache and requests missing data)
        df = await client.fetch_ohlc(
            ticker="BINANCE:BTC/USDT",
            period_seconds=3600,  # 1-hour candles
            start_time=1735689600000000,  # microseconds
            end_time=1736294399000000
        )

        print(f"Fetched {len(df)} candles")
        print(df.head())

    finally:
        await client.stop()

# Run
asyncio.run(main())
```

## Using Context Manager

```python
async def main():
    async with OHLCClient(...) as client:
        df = await client.fetch_ohlc(...)
```

## Architecture

### Components

1. **OHLCClient**: High-level API with smart caching
2. **IcebergClient**: Direct queries to Iceberg warehouse
3. **HistoryClient**: Submit requests via relay and wait for notifications

### Data Flow

```
┌─────────┐
│ Client  │
└────┬────┘
     │ 1. fetch_ohlc()
     ▼
┌─────────────────┐
│ OHLCClient      │
└────┬────────────┘
     │ 2. Check Iceberg
     ▼
┌─────────────────┐      ┌──────────┐
│ IcebergClient   │─────▶│ Iceberg  │
└─────────────────┘      └──────────┘
     │ 3. Missing data?
     ▼
┌─────────────────┐      ┌──────────┐
│ HistoryClient   │─────▶│ Relay    │
└────┬────────────┘      └──────────┘
     │                          │
     │ 4. Wait for notification │
     │◀─────────────────────────┘
     │ 5. Query Iceberg again
     ▼
┌─────────────────┐
│ Return data     │
└─────────────────┘
```
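
The five steps in the diagram can be sketched with in-memory stand-ins. `FakeWarehouse` and `FakeRelay` below are hypothetical placeholders for the Iceberg warehouse and the relay; they are not part of the library, and only the order of the steps is meant to mirror the diagram.

```python
import asyncio


class FakeWarehouse:
    """Hypothetical stand-in for the Iceberg warehouse: a set of candle timestamps."""

    def __init__(self):
        self.candles = set()

    def query(self, start, end):
        return sorted(t for t in self.candles if start <= t <= end)

    def missing(self, start, end, step):
        return [t for t in range(start, end + 1, step) if t not in self.candles]


class FakeRelay:
    """Hypothetical stand-in for relay + ingestion: backfills the warehouse on request."""

    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.requests = 0

    async def request(self, timestamps):
        self.requests += 1
        await asyncio.sleep(0)  # stands in for waiting on the completion notification
        self.warehouse.candles.update(timestamps)


async def fetch(warehouse, relay, start, end, step):
    # 2. Check the warehouse first.
    gaps = warehouse.missing(start, end, step)
    # 3-4. Ask the relay only when something is missing, then wait.
    if gaps:
        await relay.request(gaps)
    # 5. Query the warehouse again and return the full range.
    return warehouse.query(start, end)


async def demo():
    warehouse = FakeWarehouse()
    relay = FakeRelay(warehouse)
    first = await fetch(warehouse, relay, 0, 300, 60)
    second = await fetch(warehouse, relay, 0, 300, 60)  # served entirely from the warehouse
    return first, second, relay.requests


first, second, relay_requests = asyncio.run(demo())
```

The second call returns the same candles without touching the relay, which is the behavior the "Smart Caching" feature describes.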

## API Reference

### OHLCClient

#### `__init__(iceberg_catalog_uri, relay_endpoint, notification_endpoint, namespace="trading")`

Initialize the client with connection parameters.

#### `async fetch_ohlc(ticker, period_seconds, start_time, end_time, request_timeout=30.0)`

Fetch OHLC data with smart caching.

**Parameters:**
- `ticker` (str): Market identifier (e.g., "BINANCE:BTC/USDT")
- `period_seconds` (int): OHLC period in seconds (60, 300, 3600, 86400, etc.)
- `start_time` (int): Start timestamp in microseconds
- `end_time` (int): End timestamp in microseconds
- `request_timeout` (float): Timeout for historical requests in seconds

**Returns:** `pd.DataFrame` with columns:
- `ticker`: Market identifier
- `period_seconds`: Period in seconds
- `timestamp`: Candle timestamp (microseconds)
- `open`, `high`, `low`, `close`: Prices (integer format)
- `volume`: Trading volume
- Additional fields: `buy_vol`, `sell_vol`, `open_interest`, etc.

### IcebergClient

Direct access to the Iceberg warehouse.

#### `query_ohlc(ticker, period_seconds, start_time, end_time)`

Query OHLC data directly from Iceberg.

#### `find_missing_ranges(ticker, period_seconds, start_time, end_time)`

Identify missing data ranges. Returns a list of `(start_time, end_time)` tuples.
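
One way gap detection could work is sketched below. It illustrates the idea only and is not the library's implementation: `missing_ranges` is a hypothetical helper that assumes candle timestamps are aligned to the period and that both range bounds are inclusive.

```python
def missing_ranges(present, period_seconds, start_time, end_time):
    """Return inclusive (start, end) tuples, in microseconds, for gaps in `present`.

    Assumption: timestamps are aligned to the period and bounds are inclusive.
    """
    step = period_seconds * 1_000_000
    have = set(present)
    ranges = []
    gap_start = None
    t = start_time
    while t <= end_time:
        if t not in have:
            if gap_start is None:
                gap_start = t  # a new gap opens here
        elif gap_start is not None:
            ranges.append((gap_start, t - step))  # gap closed by an existing candle
            gap_start = None
        t += step
    if gap_start is not None:
        ranges.append((gap_start, t - step))  # gap runs to the end of the range
    return ranges


HOUR = 3600 * 1_000_000
start = 1735689600000000
present = [start, start + HOUR, start + 4 * HOUR]
gaps = missing_ranges(present, 3600, start, start + 5 * HOUR)
```

With candles present at hours 0, 1 and 4, the helper reports two gaps: hours 2 to 3, and hour 5 on its own.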

#### `has_data(ticker, period_seconds, start_time, end_time)`

Check if any data exists for the given parameters.

### HistoryClient

Low-level client for submitting historical data requests.

**IMPORTANT**: Always call `connect()` before making requests to prevent a race condition.

#### `async connect()`

Connect to the relay and start the notification listener. **MUST be called before making any requests.**

This subscribes to the notification topic `RESPONSE:{client_id}` BEFORE any requests are sent,
preventing the race condition where notifications arrive before the subscription exists.
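
The race can be reproduced with a toy pub/sub bus. This is a minimal sketch, assuming a bus that drops messages published to a topic nobody has subscribed to yet; the `Bus` class and the `client-1` id are hypothetical and do not reflect how the library talks to the relay.

```python
import asyncio


class Bus:
    """Toy pub/sub bus that drops messages for topics without a subscriber."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic):
        queue = asyncio.Queue()
        self.subscribers[topic] = queue
        return queue

    def publish(self, topic, message):
        queue = self.subscribers.get(topic)
        if queue is not None:
            queue.put_nowait(message)
        # Otherwise the message is silently lost.


async def wait_for(queue, timeout):
    try:
        return await asyncio.wait_for(queue.get(), timeout)
    except asyncio.TimeoutError:
        return None


async def demo():
    topic = "RESPONSE:client-1"  # hypothetical client_id

    # Wrong order: the notification is published before we subscribe.
    bus = Bus()
    bus.publish(topic, "done")
    late = await wait_for(bus.subscribe(topic), timeout=0.05)

    # Right order: subscribe first, as connect() is described as doing.
    bus = Bus()
    queue = bus.subscribe(topic)
    bus.publish(topic, "done")
    early = await wait_for(queue, timeout=0.05)
    return late, early


late, early = asyncio.run(demo())
```

In the wrong order the waiter times out and gets `None`; in the right order it receives the notification.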

#### `async request_historical_ohlc(ticker, period_seconds, start_time, end_time, timeout=30.0, limit=None)`

Submit a historical data request and wait for the completion notification.

**Returns:** dict with keys:
- `request_id`: The request ID
- `status`: 'OK', 'NOT_FOUND', or 'ERROR'
- `error_message`: Error message if status is 'ERROR'
- `iceberg_namespace`, `iceberg_table`, `row_count`: Available when status is 'OK'
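
A caller will usually branch on `status`. The helper below is a hypothetical sketch of that branching, built only from the keys listed above; the sample dicts are hand-written with made-up values and are not real relay output.

```python
def rows_from_result(result):
    """Return the row count for 'OK', 0 for 'NOT_FOUND', and raise on 'ERROR'."""
    status = result.get("status")
    if status == "OK":
        return result.get("row_count", 0)
    if status == "NOT_FOUND":
        return 0
    if status == "ERROR":
        raise RuntimeError(result.get("error_message") or "unknown error")
    raise ValueError(f"unexpected status: {status!r}")


# Hand-written samples shaped like the documented return value (values are made up).
ok = {"request_id": "r1", "status": "OK", "iceberg_namespace": "trading",
      "iceberg_table": "ohlc", "row_count": 168}
not_found = {"request_id": "r2", "status": "NOT_FOUND"}
```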

**Example:**
```python
import asyncio
from dexorder import HistoryClient

async def main():
    client = HistoryClient(
        relay_endpoint="tcp://relay:5559",
        notification_endpoint="tcp://relay:5558"
    )

    # CRITICAL: Connect first to prevent the race condition
    await client.connect()

    try:
        # Now safe to make requests
        result = await client.request_historical_ohlc(
            ticker="BINANCE:BTC/USDT",
            period_seconds=3600,
            start_time=1735689600000000,
            end_time=1736294399000000
        )
        print(result["status"])
    finally:
        # Close the client even if the request fails
        await client.close()

asyncio.run(main())
```

## Configuration

The client requires the following endpoints:

- **Iceberg Catalog URI**: REST API endpoint for Iceberg metadata (default: `http://iceberg-catalog:8181`)
- **Relay Endpoint**: ZMQ REQ/REP endpoint for submitting requests (default: `tcp://relay:5555`)
- **Notification Endpoint**: ZMQ PUB/SUB endpoint for receiving notifications (default: `tcp://flink:5557`)
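
If the endpoints differ between environments, one option is to read them from environment variables and fall back to the values listed above. This is a sketch only: the `DEXORDER_*` variable names and the `load_endpoints` helper are hypothetical and are not read by the library itself.

```python
import os

# Fallbacks are the values documented above; the variable names are hypothetical.
DEFAULTS = {
    "iceberg_catalog_uri": ("DEXORDER_ICEBERG_CATALOG_URI", "http://iceberg-catalog:8181"),
    "relay_endpoint": ("DEXORDER_RELAY_ENDPOINT", "tcp://relay:5555"),
    "notification_endpoint": ("DEXORDER_NOTIFICATION_ENDPOINT", "tcp://flink:5557"),
}


def load_endpoints(env=None):
    """Build keyword arguments for OHLCClient from environment variables."""
    env = os.environ if env is None else env
    return {key: env.get(var, default) for key, (var, default) in DEFAULTS.items()}


kwargs = load_endpoints({"DEXORDER_RELAY_ENDPOINT": "tcp://localhost:5555"})
```

The result can then be passed along as `OHLCClient(**load_endpoints())`.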

## Development

### Generate Protobuf Files

```bash
cd redesign/protobuf
protoc -I . --python_out=../client-py/dexorder ingestor.proto ohlc.proto
```

### Run Tests

```bash
pytest tests/
```

## Examples

See `../relay/test/async_client.py` for a complete example.

## Timestamp Format

All timestamps are in **microseconds since epoch**:

```python
# Convert from datetime
from datetime import datetime, timezone

dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
timestamp_micros = int(dt.timestamp() * 1_000_000)

# Convert to datetime
dt = datetime.fromtimestamp(timestamp_micros / 1_000_000, tz=timezone.utc)
```
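
Because `dt.timestamp()` returns a float, multiplying by one million can in principle be off by a microsecond for datetimes that carry a fractional-second part. A variant using integer `timedelta` arithmetic avoids that. The helper names `to_micros` and `from_micros` are illustrative and not part of the library.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(dt):
    """Convert a timezone-aware datetime to integer microseconds since epoch."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_micros(micros):
    """Convert integer microseconds since epoch to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


# The range used in the Quick Start: 2025-01-01 00:00:00 to 2025-01-07 23:59:59 UTC.
start = to_micros(datetime(2025, 1, 1, tzinfo=timezone.utc))
end = to_micros(datetime(2025, 1, 7, 23, 59, 59, tzinfo=timezone.utc))
```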

## Period Seconds

Common period values:
- `60` - 1 minute
- `300` - 5 minutes
- `900` - 15 minutes
- `3600` - 1 hour
- `14400` - 4 hours
- `86400` - 1 day
- `604800` - 1 week
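
The period also determines how many candles a time range should contain, which is a handy sanity check on a result. The sketch below assumes candle timestamps fall on multiples of the period counted from the epoch; that is an assumption, not something this document states, and it may not hold for weekly candles. Both helpers are hypothetical.

```python
def align_down(timestamp_micros, period_seconds):
    """Round a microsecond timestamp down to the start of its candle."""
    step = period_seconds * 1_000_000
    return timestamp_micros - timestamp_micros % step


def expected_candles(period_seconds, start_time, end_time):
    """Count candle start times in [start_time, end_time], both inclusive."""
    step = period_seconds * 1_000_000
    first = -(-start_time // step) * step  # round the start up to a boundary
    if first > end_time:
        return 0
    return (end_time - first) // step + 1


# Quick Start range: seven days of 1-hour candles.
count = expected_candles(3600, 1735689600000000, 1736294399000000)
```

For the Quick Start range this gives 168 candles (7 days × 24 hours).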

## Error Handling

```python
try:
    df = await client.fetch_ohlc(...)
except TimeoutError:
    print("Request timed out")
except ValueError as e:
    print(f"Request failed: {e}")
except ConnectionError:
    print("Unable to connect to relay")
```
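
Timeouts are often transient, so a caller may want to retry before giving up. Below is a minimal sketch of a retry wrapper with exponential backoff. `fetch_with_retry` is a hypothetical helper, and `flaky` stands in for a real `client.fetch_ohlc(...)` call.

```python
import asyncio


async def fetch_with_retry(fetch, attempts=3, base_delay=0.5):
    """Call the zero-argument coroutine function `fetch`, retrying on TimeoutError."""
    for attempt in range(attempts):
        try:
            return await fetch()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            await asyncio.sleep(base_delay * 2 ** attempt)


async def demo():
    calls = 0

    async def flaky():  # hypothetical stand-in for client.fetch_ohlc(...)
        nonlocal calls
        calls += 1
        if calls < 3:
            raise TimeoutError("simulated timeout")
        return "data"

    result = await fetch_with_retry(flaky, attempts=3, base_delay=0.01)
    return result, calls


result, calls = asyncio.run(demo())
```

With a real client the call would look like `await fetch_with_retry(lambda: client.fetch_ohlc(...))`. `ValueError` and `ConnectionError` are deliberately not retried here, since they are less likely to resolve on their own.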

## License

Internal use only.