backend redesign

2026-03-11 18:47:11 -04:00
parent 8ff277c8c6
commit e99ef5d2dd
210 changed files with 12147 additions and 155 deletions

client-py/README.md Normal file

@@ -0,0 +1,259 @@
# DexOrder Python Client Library
High-level Python API for accessing historical OHLC data from the DexOrder trading platform.
## Features
- **Smart Caching**: Automatically checks Iceberg warehouse before requesting new data
- **Async Request/Response**: Non-blocking historical data requests via relay
- **Gap Detection**: Identifies and requests only missing data ranges
- **Transparent Access**: Single API for both cached and on-demand data
## Installation
```bash
cd redesign/client-py
pip install -e .
```
## Quick Start
```python
import asyncio

from dexorder import OHLCClient


async def main():
    # Initialize client
    client = OHLCClient(
        iceberg_catalog_uri="http://iceberg-catalog:8181",
        relay_endpoint="tcp://relay:5555",
        notification_endpoint="tcp://flink:5557"
    )

    # Start background notification listener
    await client.start()

    try:
        # Fetch OHLC data (automatically checks cache and requests missing data)
        df = await client.fetch_ohlc(
            ticker="BINANCE:BTC/USDT",
            period_seconds=3600,  # 1-hour candles
            start_time=1735689600000000,  # microseconds
            end_time=1736294399000000
        )
        print(f"Fetched {len(df)} candles")
        print(df.head())
    finally:
        await client.stop()


# Run
asyncio.run(main())
```
## Using Context Manager
```python
async def main():
    async with OHLCClient(...) as client:
        df = await client.fetch_ohlc(...)
```
## Architecture
### Components
1. **OHLCClient**: High-level API with smart caching
2. **IcebergClient**: Direct queries to Iceberg warehouse
3. **HistoryClient**: Submit requests via relay and wait for notifications
### Data Flow
```
┌─────────┐
│ Client  │
└────┬────┘
     │ 1. fetch_ohlc()
┌─────────────────┐
│   OHLCClient    │
└────┬────────────┘
     │ 2. Check Iceberg
┌─────────────────┐      ┌──────────┐
│ IcebergClient   │─────▶│ Iceberg  │
└─────────────────┘      └──────────┘
     │ 3. Missing data?
┌─────────────────┐      ┌──────────┐
│ HistoryClient   │─────▶│  Relay   │
└────┬────────────┘      └──────────┘
     │                          │
     │ 4. Wait for notification │
     │◀─────────────────────────┘
     │ 5. Query Iceberg again
┌─────────────────┐
│ Return data     │
└─────────────────┘
```
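The five steps in the diagram can be sketched with in-memory stand-ins. Everything here is hypothetical: `FakeWarehouse`, `FakeRelay`, and `fetch_with_backfill` are illustrative names that are not part of the `dexorder` package, and the real client performs these steps against Iceberg and the relay rather than a dict.

```python
import asyncio


class FakeWarehouse:
    """Hypothetical stand-in for the Iceberg table: maps timestamp -> candle."""

    def __init__(self):
        self.rows = {}

    def query(self, start, end):
        return [self.rows[t] for t in sorted(self.rows) if start <= t <= end]


class FakeRelay:
    """Hypothetical stand-in for relay + ingestion: writes candles, then returns."""

    def __init__(self, warehouse, period_us):
        self.warehouse = warehouse
        self.period_us = period_us

    async def request(self, start, end):
        await asyncio.sleep(0)  # stands in for waiting on the notification
        for t in range(start, end + 1, self.period_us):
            self.warehouse.rows[t] = {"timestamp": t}
        return {"status": "OK"}


async def fetch_with_backfill(warehouse, relay, start, end, period_us):
    # Step 2: check what is already stored.
    cached = {row["timestamp"] for row in warehouse.query(start, end)}
    missing = [t for t in range(start, end + 1, period_us) if t not in cached]
    # Steps 3-4: request only when something is missing, then wait.
    if missing:
        result = await relay.request(missing[0], missing[-1])
        if result["status"] != "OK":
            raise ValueError(f"backfill failed: {result}")
    # Step 5: query again and return the combined data.
    return warehouse.query(start, end)
```

In this sketch a single request covers the span from the first to the last missing candle. Whether the real client issues one request per gap is not stated in this README.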
## API Reference
### OHLCClient
#### `__init__(iceberg_catalog_uri, relay_endpoint, notification_endpoint, namespace="trading")`
Initialize the client with connection parameters.
#### `async fetch_ohlc(ticker, period_seconds, start_time, end_time, request_timeout=30.0)`
Fetch OHLC data with smart caching.
**Parameters:**
- `ticker` (str): Market identifier (e.g., "BINANCE:BTC/USDT")
- `period_seconds` (int): OHLC period in seconds (60, 300, 3600, 86400, etc.)
- `start_time` (int): Start timestamp in microseconds
- `end_time` (int): End timestamp in microseconds
- `request_timeout` (float): Timeout for historical requests in seconds
**Returns:** `pd.DataFrame` with columns:
- `ticker`: Market identifier
- `period_seconds`: Period in seconds
- `timestamp`: Candle timestamp (microseconds)
- `open`, `high`, `low`, `close`: Prices (integer format)
- `volume`: Trading volume
- Additional fields: `buy_vol`, `sell_vol`, `open_interest`, etc.
### IcebergClient
Direct access to Iceberg warehouse.
#### `query_ohlc(ticker, period_seconds, start_time, end_time)`
Query OHLC data directly from Iceberg.
#### `find_missing_ranges(ticker, period_seconds, start_time, end_time)`
Identify missing data ranges. Returns list of `(start_time, end_time)` tuples.
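As a rough illustration of gap detection, the sketch below groups absent candle timestamps into contiguous ranges. It is not the library's implementation: the function name `missing_ranges`, the inclusive end bound, and the assumption that candles sit at `start_time + k * period` are all choices made for this example.

```python
def missing_ranges(present, period_us, start_time, end_time):
    """Return inclusive (start, end) ranges of candle timestamps that are absent.

    `present` is an iterable of candle timestamps already stored.
    Assumes candles fall on start_time + k * period_us (an assumption of this sketch).
    """
    have = set(present)
    ranges = []
    gap_start = None
    last_missing = None
    for ts in range(start_time, end_time + 1, period_us):
        if ts not in have:
            if gap_start is None:
                gap_start = ts  # a new gap opens here
            last_missing = ts
        elif gap_start is not None:
            ranges.append((gap_start, last_missing))  # gap closed by a present candle
            gap_start = None
    if gap_start is not None:
        ranges.append((gap_start, last_missing))  # gap runs to the end of the window
    return ranges
```

For example, with hourly candles present at hours 0, 1, and 4 of a six-hour window, the sketch reports two gaps: hours 2 to 3, and hour 5.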
#### `has_data(ticker, period_seconds, start_time, end_time)`
Check if any data exists for the given parameters.
### HistoryClient
Low-level client for submitting historical data requests.
**IMPORTANT**: Always call `connect()` before making requests to prevent a race condition.
#### `async connect()`
Connect to relay and start notification listener. **MUST be called before making any requests.**
This subscribes to the notification topic `RESPONSE:{client_id}` BEFORE any requests are sent,
preventing the race condition where notifications arrive before subscription.
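
To see why the ordering matters, here is a minimal simulation using only `asyncio`. `FakePubSub` is a hypothetical stand-in for a PUB/SUB channel in which messages published to a topic with no subscriber are dropped; it does not use ZMQ or any `dexorder` code.

```python
import asyncio


class FakePubSub:
    """Hypothetical bus: messages on topics with no subscriber are dropped."""

    def __init__(self):
        self.queues = {}

    def subscribe(self, topic):
        self.queues[topic] = asyncio.Queue()
        return self.queues[topic]

    def publish(self, topic, message):
        queue = self.queues.get(topic)
        if queue is not None:  # no subscriber yet: the message is lost
            queue.put_nowait(message)


async def request_then_subscribe(bus, topic):
    bus.publish(topic, "done")    # notification fires before anyone listens
    inbox = bus.subscribe(topic)  # too late: nothing was queued
    try:
        return await asyncio.wait_for(inbox.get(), timeout=0.05)
    except asyncio.TimeoutError:
        return None


async def subscribe_then_request(bus, topic):
    inbox = bus.subscribe(topic)  # subscribe first, as connect() is described to do
    bus.publish(topic, "done")
    return await asyncio.wait_for(inbox.get(), timeout=0.05)
```

In the first ordering the waiter times out even though the work completed. In the second, the notification is delivered.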
#### `async request_historical_ohlc(ticker, period_seconds, start_time, end_time, timeout=30.0, limit=None)`
Submit historical data request and wait for completion notification.
**Returns:** dict with keys:
- `request_id`: The request ID
- `status`: 'OK', 'NOT_FOUND', or 'ERROR'
- `error_message`: Error message if status is 'ERROR'
- `iceberg_namespace`, `iceberg_table`, `row_count`: Available when status is 'OK'
**Example:**
```python
import asyncio

from dexorder import HistoryClient


async def main():
    client = HistoryClient(
        relay_endpoint="tcp://relay:5559",
        notification_endpoint="tcp://relay:5558"
    )
    # CRITICAL: Connect first to prevent a race condition
    await client.connect()
    try:
        # Now safe to make requests
        result = await client.request_historical_ohlc(
            ticker="BINANCE:BTC/USDT",
            period_seconds=3600,
            start_time=1735689600000000,
            end_time=1736294399000000
        )
        print(result["status"])
    finally:
        # Close the connection even if the request fails
        await client.close()


asyncio.run(main())
```
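The `status` values listed above can be handled with a small helper. This is a sketch, not part of the library: `rows_from_result` is a hypothetical name, and treating `'NOT_FOUND'` as zero rows rather than as an error is an assumption made for illustration.

```python
def rows_from_result(result):
    """Return row_count for 'OK', 0 for 'NOT_FOUND', and raise for anything else."""
    status = result.get("status")
    if status == "OK":
        return result.get("row_count", 0)
    if status == "NOT_FOUND":
        # Assumption: no data for the range is not an error for the caller.
        return 0
    if status == "ERROR":
        raise ValueError(result.get("error_message") or "unknown error")
    raise ValueError(f"unexpected status: {status!r}")
```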
## Configuration
The client requires the following endpoints:
- **Iceberg Catalog URI**: REST API endpoint for Iceberg metadata (default: `http://iceberg-catalog:8181`)
- **Relay Endpoint**: ZMQ REQ/REP endpoint for submitting requests (default: `tcp://relay:5555`)
- **Notification Endpoint**: ZMQ PUB/SUB endpoint for receiving notifications (default: `tcp://flink:5557`)
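
One way to avoid hard-coding these endpoints is to read them from the environment and fall back to the defaults listed above. The environment variable names below are hypothetical (the client does not document any), as are `Endpoints` and `load_endpoints`.

```python
import os
from dataclasses import asdict, dataclass

# Hypothetical variable names; pick whatever fits your deployment.
ENV_VARS = {
    "iceberg_catalog_uri": "DEXORDER_ICEBERG_CATALOG_URI",
    "relay_endpoint": "DEXORDER_RELAY_ENDPOINT",
    "notification_endpoint": "DEXORDER_NOTIFICATION_ENDPOINT",
}


@dataclass(frozen=True)
class Endpoints:
    iceberg_catalog_uri: str = "http://iceberg-catalog:8181"
    relay_endpoint: str = "tcp://relay:5555"
    notification_endpoint: str = "tcp://flink:5557"


def load_endpoints(env=None):
    """Build Endpoints from a mapping (os.environ by default); unset keys keep defaults."""
    env = os.environ if env is None else env
    overrides = {field: env[var] for field, var in ENV_VARS.items() if env.get(var)}
    return Endpoints(**overrides)
```

Because the field names match the `OHLCClient.__init__` parameters documented above, the result could presumably be passed as `OHLCClient(**asdict(load_endpoints()))`.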
## Development
### Generate Protobuf Files
```bash
cd redesign/protobuf
protoc -I . --python_out=../client-py/dexorder ingestor.proto ohlc.proto
```
### Run Tests
```bash
pytest tests/
```
## Examples
See `../relay/test/async_client.py` for a complete example.
## Timestamp Format
All timestamps are in **microseconds since epoch**:
```python
# Convert from datetime
from datetime import datetime, timezone
dt = datetime(2024, 1, 1, tzinfo=timezone.utc)
timestamp_micros = int(dt.timestamp() * 1_000_000)
# Convert to datetime
dt = datetime.fromtimestamp(timestamp_micros / 1_000_000, tz=timezone.utc)
```
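Multiplying a float timestamp by `1_000_000` can be off by a microsecond for datetimes with fractional seconds, because a float cannot represent every microsecond at current epoch values. The helpers below are an alternative sketch that stays in integer arithmetic; `to_micros` and `from_micros` are hypothetical names, not part of the client.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(dt):
    """Convert a timezone-aware datetime to integer microseconds since epoch."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a timezone first")
    # timedelta // timedelta yields an exact int, with no float rounding.
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_micros(micros):
    """Convert integer microseconds since epoch to a UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)
```

With these helpers, `to_micros(datetime(2025, 1, 1, tzinfo=timezone.utc))` gives `1735689600000000`, the `start_time` used in the Quick Start example.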
## Period Seconds
Common period values:
- `60` - 1 minute
- `300` - 5 minutes
- `900` - 15 minutes
- `3600` - 1 hour
- `14400` - 4 hours
- `86400` - 1 day
- `604800` - 1 week
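
Since periods are given in seconds and timestamps in microseconds, a small helper can align a timestamp to a candle boundary or estimate how many candles a range should contain. This sketch assumes candles are aligned to multiples of the period since the epoch, which this README does not confirm (weekly candles in particular may be aligned differently, since the epoch fell on a Thursday). The function names are hypothetical.

```python
MICROS_PER_SECOND = 1_000_000


def align_down(timestamp_us, period_seconds):
    """Round a microsecond timestamp down to the start of its candle."""
    period_us = period_seconds * MICROS_PER_SECOND
    return timestamp_us - timestamp_us % period_us


def expected_candles(start_time, end_time, period_seconds):
    """Count candle start timestamps in [start_time, end_time], both inclusive."""
    period_us = period_seconds * MICROS_PER_SECOND
    first = -(-start_time // period_us) * period_us  # round start up to a boundary
    if first > end_time:
        return 0
    return (end_time - first) // period_us + 1
```

Under that assumption, the Quick Start range (`1735689600000000` to `1736294399000000`) spans 168 one-hour candles, or 7 daily candles.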
## Error Handling
```python
try:
    df = await client.fetch_ohlc(...)
except TimeoutError:
    print("Request timed out")
except ValueError as e:
    print(f"Request failed: {e}")
except ConnectionError:
    print("Unable to connect to relay")
```
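For transient timeouts, one option is to retry with exponential backoff. The wrapper below is a generic sketch and not something the client provides: `with_retries` is a hypothetical helper, and it assumes a timed-out request is safe to repeat.

```python
import asyncio


async def with_retries(make_call, attempts=3, base_delay=0.5):
    """Await make_call() up to `attempts` times, retrying only on timeouts."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except (TimeoutError, asyncio.TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            # Back off: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** attempt)
```

`make_call` is a zero-argument callable that returns a fresh awaitable on each call, for example a `lambda` wrapping `client.fetch_ohlc` with its arguments. `ValueError` and `ConnectionError` are deliberately not retried here.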
## License
Internal use only.