25 Commits

Author SHA1 Message Date
43aeba0b25 bump sandbox; bin/user-activity 2026-04-28 20:09:23 -04:00
b4e99744d8 Support custom column selection in OHLC queries and extend CCXT with configurable exchange-specific fields
- Add `columns` parameter to `get_ohlc_async` and pass through to Iceberg queries
- Replace hardcoded Binance field extraction with declarative `EXCHANGE_OHLCV_EXTENSIONS` config
- Add `applyScale` helper for field-specific transformations (ms_to_ns, price, size, int)
- Support `complementOf` spec for derived fields (e.g., sell_vol from total - buy_vol)
- Apply extensions dynamically in `convertToOHLC` and gap-filling logic
- Remove redundant column filtering in DataAPI (now handled upstream)
2026-04-28 20:00:10 -04:00
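If it helps to picture the design, here is a minimal TypeScript sketch of what a declarative `EXCHANGE_OHLCV_EXTENSIONS` table with an `applyScale` helper and a `complementOf` spec might look like. Only those three names come from the commit above; the field names, scale kinds, and exact spec shape are assumptions for illustration, not the repo's actual definitions.

```typescript
// Hypothetical sketch of a declarative per-exchange OHLCV extension table.
// Field names, scale kinds, and the spec shape are illustrative assumptions.
type ScaleKind = "ms_to_ns" | "price" | "size" | "int";

interface ExtensionSpec {
  source?: string;                 // raw CCXT field to read
  scale: ScaleKind;                // transformation applied by applyScale
  complementOf?: [string, string]; // derived: first minus second (e.g. sell_vol = volume - buy_vol)
}

const EXCHANGE_OHLCV_EXTENSIONS: Record<string, Record<string, ExtensionSpec>> = {
  BINANCE: {
    num_trades: { source: "count", scale: "int" },
    quote_volume: { source: "quoteVolume", scale: "size" },
    buy_vol: { source: "takerBuyBaseVolume", scale: "size" },
    sell_vol: { scale: "size", complementOf: ["volume", "buy_vol"] },
  },
};

// Apply a field-specific transformation; priceScale/sizeScale would come from symbol metadata.
function applyScale(value: number, kind: ScaleKind, priceScale = 1, sizeScale = 1): number {
  switch (kind) {
    case "ms_to_ns": return value * 1_000_000;
    case "price":    return Math.round(value * priceScale);
    case "size":     return Math.round(value * sizeScale);
    case "int":      return Math.trunc(value);
    default:         throw new Error(`unknown scale kind: ${kind}`);
  }
}

// Resolve one extended field for a converted bar, using either a raw source field
// or a complementOf derivation over already-resolved fields.
function resolveExtension(
  spec: ExtensionSpec,
  raw: Record<string, number>,
  bar: Record<string, number>,
): number {
  if (spec.complementOf) {
    const [total, part] = spec.complementOf;
    return (bar[total] ?? 0) - (bar[part] ?? 0);
  }
  return applyScale(raw[spec.source!] ?? 0, spec.scale);
}
```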
77e9ad7f68 bump sandbox 2026-04-28 18:57:05 -04:00
2fded95b31 Add synthetic taker flow, timestamps, open interest, and metadata to indicator harness; improve error messages; tweak theme
- Generate buy/sell volume split with random fractions
- Add nanosecond timestamps for OHLC extremes within 1-minute bars
- Include open interest as separate random walk
- Add num_trades and quote_volume derived fields
- Derive extra_columns dynamically from indicator input requirements instead of hardcoded volume check
- Improve input_series error message clarity
- Adjust ChatPanel background color for user messages
2026-04-28 18:47:41 -04:00
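For context, a rough sketch of the synthetic per-bar fields described above, written in TypeScript purely for illustration (the harness itself is not shown in this diff, so everything beyond the field names listed in the commit bullets is an assumption):

```typescript
// Illustrative shape of the synthetic fields the harness generates per 1-minute bar.
interface SyntheticBar {
  time: number;          // bar open, nanoseconds since epoch
  open: number; high: number; low: number; close: number;
  volume: number;
  buy_vol: number;       // random fraction of volume
  sell_vol: number;      // complement of buy_vol
  high_time: number;     // ns timestamp of the high within the bar
  low_time: number;      // ns timestamp of the low within the bar
  open_interest: number; // separate random walk
  num_trades: number;
  quote_volume: number;
}

const NS_PER_MIN = 60_000_000_000;

function syntheticBar(prev: SyntheticBar | null, timeNs: number): SyntheticBar {
  const open = prev ? prev.close : 100;
  const close = open * (1 + (Math.random() - 0.5) * 0.01);         // small random step
  const high = Math.max(open, close) * (1 + Math.random() * 0.005);
  const low = Math.min(open, close) * (1 - Math.random() * 0.005);
  const volume = 50 + Math.random() * 100;
  const buyFrac = Math.random();                                    // buy/sell split
  const oiPrev = prev ? prev.open_interest : 1_000;
  return {
    time: timeNs,
    open, high, low, close, volume,
    buy_vol: volume * buyFrac,
    sell_vol: volume * (1 - buyFrac),
    high_time: timeNs + Math.floor(Math.random() * NS_PER_MIN),     // extreme somewhere in the bar
    low_time: timeNs + Math.floor(Math.random() * NS_PER_MIN),
    open_interest: Math.max(0, oiPrev + (Math.random() - 0.5) * 20),
    num_trades: Math.round(volume * 3),                             // derived field
    quote_volume: volume * (open + close) / 2,                      // derived field
  };
}
```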
6482cfa347 indicator timeout increased 2026-04-28 17:13:35 -04:00
7039f0357b bump sandbox 2026-04-28 16:48:59 -04:00
a0248540e0 Optimize OHLC queries: run Iceberg scans in threads and reuse DataFrames to avoid redundant scans 2026-04-28 16:36:41 -04:00
47471b7700 Expand model tag support: add GLM-5.1, simplify Anthropic IDs, scan tags anywhere in message
- Flink update_bars debouncing
- update_bars subscription idempotency bugfix
- Fix the price decimal correction from the previous commit
- Add GLM-5.1 model tag alongside renamed GLM-5
- Use short Anthropic model IDs (sonnet/haiku/opus) instead of full version strings
- Allow @tags anywhere in message content, not just at start
- Return hasOtherContent flag instead of trimmed rest string
- Only trigger greeting stream when tag has no other content
- Update workspace knowledge base references to platform/workspace and platform/shapes
- Hierarchical knowledge base catalog
- 151 Trading Strategies knowledge base articles
- Shapes knowledge base article
- MutateShapes tool instead of workspace patch
2026-04-28 15:05:15 -04:00
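The tag-scanning behavior described above (tags recognized anywhere in the message, with a `hasOtherContent` flag instead of a trimmed rest string) could be sketched roughly as below; the tag list, function name, and return shape are assumptions, not the gateway's actual code:

```typescript
// Illustrative sketch: scan @tags anywhere in a message and report whether any
// non-tag content remains. Tag names here mirror the commit bullets above.
const MODEL_TAGS = new Set(["sonnet", "haiku", "opus", "glm-5", "glm-5.1"]);

interface TagScan {
  model?: string;           // last recognized @tag, if any
  hasOtherContent: boolean; // true if the message contains anything besides tags and whitespace
}

function scanModelTags(message: string): TagScan {
  let model: string | undefined;
  // Match @tag tokens anywhere, not just at the start of the message.
  const stripped = message.replace(/@([\w.-]+)/g, (match, tag: string) => {
    if (MODEL_TAGS.has(tag.toLowerCase())) {
      model = tag.toLowerCase();
      return ""; // drop recognized tags before checking for remaining content
    }
    return match; // leave unknown @mentions untouched
  });
  return { model, hasOtherContent: stripped.trim().length > 0 };
}

// Only a bare tag (no other content) would trigger the greeting stream:
//   scanModelTags("@opus")                 -> { model: "opus", hasOtherContent: false }
//   scanModelTags("plot BTC @opus please") -> { model: "opus", hasOtherContent: true }
```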
d41fcd0499 feat: add @tag model override support and remove Qdrant dependencies
- Add model-tags parser for @Tag syntax in chat messages
- Support Anthropic models (Sonnet, Haiku, Opus) via @tag
- Remove Qdrant vector database from infrastructure and configs
- Simplify license model config to use null fallbacks
- Add greeting stream after model switch via @tag
- Fix protobuf field names to camelCase for v7 compatibility
- Add 429 rate limit retry logic with exponential backoff
- Remove RAG references from agent harness documentation
2026-04-27 20:55:18 -04:00
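A generic sketch of the 429 retry with exponential backoff mentioned in the bullets above; the helper name, attempt limit, and base delay are assumptions rather than the gateway's actual values:

```typescript
// Retry a call on HTTP 429 with exponential backoff and a little jitter.
async function withRateLimitRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 1_000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      const status = err?.status ?? err?.response?.status;
      if (status !== 429 || attempt >= maxAttempts - 1) throw err;
      // Backoff: 1s, 2s, 4s, ... plus up to 250ms of noise.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * 250;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```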
6f937f9e5e bump sandbox 2026-04-26 18:54:17 -04:00
0178b5d29d Add Ticker24h support: hourly market snapshots with USD-normalized volume filtering 2026-04-26 18:39:52 -04:00
85fcbe1330 indicator validation checks for all-NaN and all-zero series 2026-04-26 11:51:29 -04:00
2268ef0d3f Update prod_deployment.md 2026-04-24 21:16:29 -04:00
fecefa15ef bumped sandbox 2026-04-24 20:54:06 -04:00
319d81c41f data timeout fixes; research agent improvements 2026-04-24 20:43:42 -04:00
1800363566 sandbox bump 2026-04-23 18:31:30 -04:00
44a1688657 major agent refactoring: wiki knowledge base, no RAG, no Qdrant, no Ollama 2026-04-21 21:03:24 -04:00
7e4b54d701 top-level agent workspace update fix 2026-04-20 16:16:32 -04:00
9736a3b44e indicator refresh on detail edit 2026-04-20 15:46:45 -04:00
eabc307a2a saving details modal 2026-04-20 15:39:08 -04:00
810d2ca14f agent python_delete 2026-04-20 15:34:45 -04:00
b1d4459809 subagent thinking accordion; indicator fixes; script details & edit 2026-04-20 15:09:37 -04:00
a188268906 sandbox bump 2026-04-17 22:44:13 -04:00
7649811762 research price fix 2026-04-17 22:33:38 -04:00
7abbd2e5c7 store sync bugfix 2026-04-17 22:27:06 -04:00
336 changed files with 17672 additions and 5911 deletions

.aiignore Normal file
View File

@@ -0,0 +1,6 @@
ingestor/protobuf
flink/protobuf
relay/protobuf
gateway/protobuf
deploy/k8s/dev/configs/gateway-config.yaml
deploy/k8s/prod/configs/gateway-config.yaml

.idea/ai.iml generated
View File

@@ -20,6 +20,11 @@
<excludeFolder url="file://$MODULE_DIR$/doc/competition" /> <excludeFolder url="file://$MODULE_DIR$/doc/competition" />
<excludeFolder url="file://$MODULE_DIR$/sandbox/dexorder_sandbox.egg-info" /> <excludeFolder url="file://$MODULE_DIR$/sandbox/dexorder_sandbox.egg-info" />
<excludeFolder url="file://$MODULE_DIR$/sandbox/protobuf" /> <excludeFolder url="file://$MODULE_DIR$/sandbox/protobuf" />
<excludeFolder url="file://$MODULE_DIR$/.idea/runConfigurations" />
<excludeFolder url="file://$MODULE_DIR$/chat" />
<excludeFolder url="file://$MODULE_DIR$/gateway/protobuf" />
<excludeFolder url="file://$MODULE_DIR$/gateway/src/generated" />
<excludeFolder url="file://$MODULE_DIR$/web/protobuf" />
</content> </content>
<orderEntry type="jdk" jdkName="Python 3.12 (ai)" jdkType="Python SDK" /> <orderEntry type="jdk" jdkName="Python 3.12 (ai)" jdkType="Python SDK" />
<orderEntry type="sourceFolder" forTests="false" /> <orderEntry type="sourceFolder" forTests="false" />

View File

@@ -109,72 +109,6 @@ if [ "$PROJECT" != "lifecycle-sidecar" ]; then
rsync -a --checksum --delete protobuf/ $PROJECT/protobuf/ rsync -a --checksum --delete protobuf/ $PROJECT/protobuf/
fi fi
# For gateway: copy Python API files for research subagent
if [ "$PROJECT" == "gateway" ]; then
echo "Copying Python API files for research subagent..."
# Create api-source directory
mkdir -p gateway/src/harness/subagents/research/api-source
# Copy all Python API files (for easy future expansion)
cp sandbox/dexorder/api/*.py gateway/src/harness/subagents/research/api-source/
# Generate api-reference.md with verbatim Python source code
API_REF="gateway/src/harness/subagents/research/memory/api-reference.md"
cat > "$API_REF" << 'HEADER'
# Dexorder Research API Reference
This file contains the complete Python API source code with full docstrings.
These files are copied verbatim from `sandbox/dexorder/api/`.
The API provides access to market data and charting capabilities for research scripts.
---
## Overview
Research scripts access the API via:
```python
from dexorder.api import get_api
api = get_api()
```
The API instance provides:
- `api.data` - DataAPI for fetching OHLC market data
- `api.charting` - ChartingAPI for creating financial charts
---
## Complete API Source Code
The following sections contain the verbatim Python source files with complete
type hints, docstrings, and examples.
HEADER
# Append each Python file
for py_file in api.py data_api.py charting_api.py __init__.py; do
if [ -f "sandbox/dexorder/api/$py_file" ]; then
echo "" >> "$API_REF"
echo "### $py_file" >> "$API_REF"
echo '```python' >> "$API_REF"
cat "sandbox/dexorder/api/$py_file" >> "$API_REF"
echo '```' >> "$API_REF"
echo "" >> "$API_REF"
fi
done
cat >> "$API_REF" << 'FOOTER'
---
For practical usage patterns and complete working examples, see `usage-examples.md`.
FOOTER
echo "Generated api-reference.md with Python API source code"
fi
docker build $NO_CACHE -f $PROJECT/Dockerfile --build-arg="CONFIG=$CONFIG" --build-arg="DEPLOYMENT=$DEPLOYMENT" -t dexorder/ai-$PROJECT:latest $PROJECT || exit 1 docker build $NO_CACHE -f $PROJECT/Dockerfile --build-arg="CONFIG=$CONFIG" --build-arg="DEPLOYMENT=$DEPLOYMENT" -t dexorder/ai-$PROJECT:latest $PROJECT || exit 1
# Cleanup is handled by trap # Cleanup is handled by trap

bin/dev
View File

@@ -20,10 +20,10 @@ usage() {
echo " start Start minikube and deploy all services" echo " start Start minikube and deploy all services"
echo " stop [--keep-data] Stop minikube (deletes PVCs by default)" echo " stop [--keep-data] Stop minikube (deletes PVCs by default)"
echo " restart [svc] Rebuild and redeploy all services, or just one (relay|ingestor|flink|gateway|sidecar|web|sandbox)" echo " restart [svc] Rebuild and redeploy all services, or just one (relay|ingestor|flink|gateway|sidecar|web|sandbox)"
echo " deep-restart [svc] Restart StatefulSet(s) and delete their PVCs (kafka|postgres|minio|qdrant|all)" echo " deep-restart [svc] Restart StatefulSet(s) and delete their PVCs (kafka|postgres|minio|all)"
echo " rebuild [svc] Rebuild all custom images, or just one" echo " rebuild [svc] Rebuild all custom images, or just one"
echo " deploy [svc] Deploy/update all services, or just one" echo " deploy [svc] Deploy/update all services, or just one"
echo " delete-pvcs [svc] Delete PVCs for specific service or all (kafka|postgres|minio|qdrant|all)" echo " delete-pvcs [svc] Delete PVCs for specific service or all (kafka|postgres|minio|all)"
echo " status Show status of all services" echo " status Show status of all services"
echo " logs Tail logs for a service" echo " logs Tail logs for a service"
echo " shell Open a shell in a service pod" echo " shell Open a shell in a service pod"
@@ -446,19 +446,15 @@ delete_pvcs() {
minio) minio)
kubectl delete pvc -l app=minio || true kubectl delete pvc -l app=minio || true
;; ;;
qdrant)
kubectl delete pvc -l app=qdrant || true
;;
all) all)
echo -e "${YELLOW}Deleting all StatefulSet PVCs...${NC}" echo -e "${YELLOW}Deleting all StatefulSet PVCs...${NC}"
kubectl delete pvc -l app=kafka 2>/dev/null || true kubectl delete pvc -l app=kafka 2>/dev/null || true
kubectl delete pvc -l app=postgres 2>/dev/null || true kubectl delete pvc -l app=postgres 2>/dev/null || true
kubectl delete pvc -l app=minio 2>/dev/null || true kubectl delete pvc -l app=minio 2>/dev/null || true
kubectl delete pvc -l app=qdrant 2>/dev/null || true
;; ;;
*) *)
echo -e "${RED}Error: Unknown service '$service'${NC}" echo -e "${RED}Error: Unknown service '$service'${NC}"
echo "Valid services: kafka, postgres, minio, qdrant, all" echo "Valid services: kafka, postgres, minio, all"
exit 1 exit 1
;; ;;
esac esac
@@ -497,27 +493,21 @@ deep_restart() {
echo -e "${GREEN}→${NC} Force restarting iceberg-catalog (depends on minio)..." echo -e "${GREEN}→${NC} Force restarting iceberg-catalog (depends on minio)..."
kubectl delete pod -l app=iceberg-catalog 2>/dev/null || true kubectl delete pod -l app=iceberg-catalog 2>/dev/null || true
;; ;;
qdrant)
echo -e "${GREEN}→${NC} Deleting qdrant StatefulSet..."
kubectl delete statefulset qdrant || true
sleep 2
delete_pvcs qdrant
;;
all) all)
echo -e "${GREEN}→${NC} Deleting all StatefulSets..." echo -e "${GREEN}→${NC} Deleting all StatefulSets..."
kubectl delete statefulset kafka postgres minio qdrant || true kubectl delete statefulset kafka postgres minio || true
sleep 2 sleep 2
delete_pvcs all delete_pvcs all
# Force restart iceberg-catalog since it depends on postgres and minio # Force restart iceberg-catalog since it depends on postgres and minio
echo -e "${GREEN}→${NC} Force restarting iceberg-catalog (depends on postgres/minio)..." echo -e "${GREEN}→${NC} Force restarting iceberg-catalog (depends on postgres/minio)..."
kubectl delete pod -l app=iceberg-catalog 2>/dev/null || true kubectl delete pod -l app=iceberg-catalog 2>/dev/null || true
# Remove all sandbox deployments and services to free quota # Remove all sandbox deployments, services, and PVCs to fully reset user state
echo -e "${GREEN}→${NC} Removing all sandbox deployments and services..." echo -e "${GREEN}→${NC} Removing all sandbox deployments, services, and PVCs..."
kubectl delete deployments,services --all -n sandbox 2>/dev/null || true kubectl delete deployments,services,pvc --all -n sandbox 2>/dev/null || true
;; ;;
*) *)
echo -e "${RED}Error: Unknown service '$service'${NC}" echo -e "${RED}Error: Unknown service '$service'${NC}"
echo "Valid services: kafka, postgres, minio, qdrant, all" echo "Valid services: kafka, postgres, minio, all"
exit 1 exit 1
;; ;;
esac esac
@@ -642,13 +632,12 @@ case "$COMMAND" in
echo -e "${BLUE}Stopping minikube and deleting PVCs...${NC}" echo -e "${BLUE}Stopping minikube and deleting PVCs...${NC}"
# Scale down StatefulSets first to release PVCs # Scale down StatefulSets first to release PVCs
echo -e "${GREEN}→${NC} Scaling down StatefulSets..." echo -e "${GREEN}→${NC} Scaling down StatefulSets..."
kubectl scale statefulset kafka postgres minio qdrant --replicas=0 2>/dev/null || true kubectl scale statefulset kafka postgres minio --replicas=0 2>/dev/null || true
# Wait for pods to terminate # Wait for pods to terminate
echo -e "${GREEN}→${NC} Waiting for pods to terminate..." echo -e "${GREEN}→${NC} Waiting for pods to terminate..."
kubectl wait --for=delete pod -l app=kafka --timeout=60s 2>/dev/null || true kubectl wait --for=delete pod -l app=kafka --timeout=60s 2>/dev/null || true
kubectl wait --for=delete pod -l app=postgres --timeout=60s 2>/dev/null || true kubectl wait --for=delete pod -l app=postgres --timeout=60s 2>/dev/null || true
kubectl wait --for=delete pod -l app=minio --timeout=60s 2>/dev/null || true kubectl wait --for=delete pod -l app=minio --timeout=60s 2>/dev/null || true
kubectl wait --for=delete pod -l app=qdrant --timeout=60s 2>/dev/null || true
# Now delete PVCs # Now delete PVCs
delete_pvcs all delete_pvcs all
# Delete sandbox namespace # Delete sandbox namespace

bin/user-activity Executable file
View File

@@ -0,0 +1,21 @@
#!/bin/bash
CONTEXT=${KUBECTL_CONTEXT:-prod}
kubectl --context "$CONTEXT" exec -n ai postgres-0 -- psql -U postgres -d iceberg -t -A -F $'\t' -c "
SELECT u.name,
MAX(s.\"createdAt\") as last_login,
MAX(s.\"updatedAt\") as last_active
FROM \"user\" u
LEFT JOIN session s ON s.\"userId\" = u.id
GROUP BY u.name
ORDER BY last_active DESC NULLS LAST;" 2>/dev/null \
| awk -F'\t' '
BEGIN {
fmt = "%-16s %-24s %-24s\n"
printf fmt, "Name", "Last Login", "Last Active"
printf fmt, "----------------", "------------------------", "------------------------"
}
{
printf fmt, $1, $2, $3
}'

View File

@@ -13,6 +13,10 @@ spec:
protocol: TCP protocol: TCP
port: 3000 port: 3000
targetPort: http targetPort: http
- name: zmq-events
protocol: TCP
port: 5571
targetPort: 5571
type: ClusterIP type: ClusterIP
--- ---
apiVersion: apps/v1 apiVersion: apps/v1
@@ -40,9 +44,6 @@ spec:
- name: wait-for-dragonfly - name: wait-for-dragonfly
image: busybox:1.36 image: busybox:1.36
command: ['sh', '-c', 'until nc -z dragonfly 6379; do echo waiting for dragonfly; sleep 2; done;'] command: ['sh', '-c', 'until nc -z dragonfly 6379; do echo waiting for dragonfly; sleep 2; done;']
- name: wait-for-qdrant
image: busybox:1.36
command: ['sh', '-c', 'until nc -z qdrant 6333; do echo waiting for qdrant; sleep 2; done;']
- name: wait-for-iceberg-catalog - name: wait-for-iceberg-catalog
image: busybox:1.36 image: busybox:1.36
command: ['sh', '-c', 'until nc -z iceberg-catalog 8181; do echo waiting for iceberg-catalog; sleep 2; done;'] command: ['sh', '-c', 'until nc -z iceberg-catalog 8181; do echo waiting for iceberg-catalog; sleep 2; done;']
@@ -64,6 +65,9 @@ spec:
- name: http - name: http
containerPort: 3000 containerPort: 3000
protocol: TCP protocol: TCP
- name: zmq-events
containerPort: 5571
protocol: TCP
volumeMounts: volumeMounts:
- name: config - name: config

View File

@@ -69,6 +69,8 @@ spec:
ports: ports:
- protocol: TCP - protocol: TCP
port: 3000 port: 3000
- protocol: TCP
port: 5571
# External HTTPS (for exchange APIs, LLM APIs) # External HTTPS (for exchange APIs, LLM APIs)
- to: - to:
- ipBlock: - ipBlock:
@@ -102,3 +104,5 @@ spec:
ports: ports:
- protocol: TCP - protocol: TCP
port: 3000 port: 3000
- protocol: TCP
port: 5571

View File

@@ -27,29 +27,22 @@ data:
model_provider: deepinfra model_provider: deepinfra
model: zai-org/GLM-5 model: zai-org/GLM-5
# License tier model configuration # License tier model configuration (null = fall back to defaults.model)
license_models: license_models:
# Free tier models
free: free:
default: zai-org/GLM-5 default: ~
cost_optimized: zai-org/GLM-5 cost_optimized: ~
complex: zai-org/GLM-5 complex: ~
allowed_models:
- zai-org/GLM-5
# Pro tier models
pro: pro:
default: zai-org/GLM-5 default: ~
cost_optimized: zai-org/GLM-5 cost_optimized: ~
complex: zai-org/GLM-5 complex: ~
blocked_models:
- Qwen/Qwen3-235B-A22B-Instruct-2507
# Enterprise tier models
enterprise: enterprise:
default: zai-org/GLM-5 default: ~
cost_optimized: zai-org/GLM-5 cost_optimized: ~
complex: Qwen/Qwen3-235B-A22B-Instruct-2507 complex: ~
# Kubernetes configuration # Kubernetes configuration
kubernetes: kubernetes:
@@ -70,11 +63,6 @@ data:
redis: redis:
url: redis://dragonfly:6379 url: redis://dragonfly:6379
# Qdrant (for RAG vector search)
qdrant:
url: http://qdrant:6333
collection: gateway_memory
# Iceberg (for durable storage via REST catalog) # Iceberg (for durable storage via REST catalog)
iceberg: iceberg:
catalog_uri: http://iceberg-catalog:8181 catalog_uri: http://iceberg-catalog:8181

View File

@@ -16,13 +16,13 @@ supported_exchanges:
# limits and connection constraints — these are conservative starting values. # limits and connection constraints — these are conservative starting values.
exchange_capacity: exchange_capacity:
BINANCE: BINANCE:
historical_slots: 1 historical_slots: 2
realtime_slots: 5 realtime_slots: 5
COINBASE: COINBASE:
historical_slots: 1 historical_slots: 2
realtime_slots: 4 realtime_slots: 4
KRAKEN: KRAKEN:
historical_slots: 1 historical_slots: 2
realtime_slots: 3 realtime_slots: 3
# Kafka configuration # Kafka configuration

View File

@@ -45,68 +45,6 @@ spec:
memory: "512Mi" memory: "512Mi"
cpu: "500m" cpu: "500m"
--- ---
# Qdrant (Vector database for RAG)
apiVersion: v1
kind: Service
metadata:
name: qdrant
spec:
selector:
app: qdrant
ports:
- name: http
protocol: TCP
port: 6333
targetPort: 6333
- name: grpc
protocol: TCP
port: 6334
targetPort: 6334
type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: qdrant
spec:
serviceName: qdrant
replicas: 1
selector:
matchLabels:
app: qdrant
template:
metadata:
labels:
app: qdrant
spec:
containers:
- name: qdrant
image: qdrant/qdrant:latest
ports:
- containerPort: 6333
name: http
- containerPort: 6334
name: grpc
resources:
requests:
memory: "512Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "1000m"
volumeMounts:
- name: qdrant-data
mountPath: /qdrant/storage
volumeClaimTemplates:
- metadata:
name: qdrant-data
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: dev-ephemeral
resources:
requests:
storage: 10Gi
---
# Kafka (KRaft mode - no Zookeeper needed) # Kafka (KRaft mode - no Zookeeper needed)
# Using apache/kafka:3.9.0 instead of confluentinc/cp-kafka because: # Using apache/kafka:3.9.0 instead of confluentinc/cp-kafka because:
# - cp-kafka's entrypoint script has issues with KRaft configuration # - cp-kafka's entrypoint script has issues with KRaft configuration

View File

@@ -21,36 +21,12 @@ data:
model_provider: deepinfra model_provider: deepinfra
model: zai-org/GLM-5 model: zai-org/GLM-5
# License tier model configuration
license_models:
# Free tier models
free:
default: zai-org/GLM-5
cost_optimized: zai-org/GLM-5
complex: zai-org/GLM-5
allowed_models:
- zai-org/GLM-5
# Pro tier models
pro:
default: zai-org/GLM-5
cost_optimized: zai-org/GLM-5
complex: zai-org/GLM-5
blocked_models:
- Qwen/Qwen3-235B-A22B-Instruct-2507
# Enterprise tier models
enterprise:
default: zai-org/GLM-5
cost_optimized: zai-org/GLM-5
complex: Qwen/Qwen3-235B-A22B-Instruct-2507
# Kubernetes configuration # Kubernetes configuration
kubernetes: kubernetes:
namespace: sandbox namespace: sandbox
service_namespace: ai service_namespace: ai
in_cluster: true in_cluster: true
sandbox_image: git.dxod.org/dexorder/dexorder/ai-sandbox:27c603e sandbox_image: git.dxod.org/dexorder/dexorder/ai-sandbox:b4e99744
sidecar_image: git.dxod.org/dexorder/dexorder/ai-lifecycle-sidecar:latest sidecar_image: git.dxod.org/dexorder/dexorder/ai-lifecycle-sidecar:latest
image_pull_policy: Always image_pull_policy: Always
storage_class: ceph-block storage_class: ceph-block
@@ -59,11 +35,6 @@ data:
redis: redis:
url: redis://dragonfly:6379 url: redis://dragonfly:6379
# Qdrant (for RAG vector search)
qdrant:
url: http://qdrant:6333
collection: gateway_memory
# Agent configuration # Agent configuration
agent: agent:
# Number of prior conversation turns loaded as LLM context and flushed to Iceberg at session end # Number of prior conversation turns loaded as LLM context and flushed to Iceberg at session end

View File

@@ -45,67 +45,6 @@ spec:
memory: "512Mi" memory: "512Mi"
cpu: "500m" cpu: "500m"
--- ---
# Qdrant (Vector database for RAG)
apiVersion: v1
kind: Service
metadata:
name: qdrant
spec:
selector:
app: qdrant
ports:
- name: http
protocol: TCP
port: 6333
targetPort: 6333
- name: grpc
protocol: TCP
port: 6334
targetPort: 6334
type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: qdrant
spec:
serviceName: qdrant
replicas: 1
selector:
matchLabels:
app: qdrant
template:
metadata:
labels:
app: qdrant
spec:
containers:
- name: qdrant
image: qdrant/qdrant:latest
ports:
- containerPort: 6333
name: http
- containerPort: 6334
name: grpc
resources:
requests:
memory: "512Mi"
cpu: "200m"
limits:
memory: "1Gi"
cpu: "1000m"
volumeMounts:
- name: qdrant-data
mountPath: /qdrant/storage
volumeClaimTemplates:
- metadata:
name: qdrant-data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
---
# Kafka (KRaft mode - no Zookeeper needed) # Kafka (KRaft mode - no Zookeeper needed)
apiVersion: v1 apiVersion: v1
kind: Service kind: Service

View File

@@ -11,7 +11,7 @@ resources:
- ../base - ../base
# Add the 'ai' namespace (base only creates 'sandbox') # Add the 'ai' namespace (base only creates 'sandbox')
- namespaces.yaml - namespaces.yaml
# Prod infrastructure (postgres, minio, kafka, flink, relay, ingestor, qdrant, dragonfly, iceberg) # Prod infrastructure (postgres, minio, kafka, flink, relay, ingestor, dragonfly, iceberg)
- infrastructure.yaml - infrastructure.yaml
# Sandbox namespace resources (go to sandbox namespace, not ai) # Sandbox namespace resources (go to sandbox namespace, not ai)
- sandbox-config.yaml - sandbox-config.yaml

View File

@@ -19,6 +19,7 @@ stringData:
# LLM Provider API Keys # LLM Provider API Keys
llm_providers: llm_providers:
deepinfra_api_key: "{{ op://AI Prod/Gateway/deepinfra_api_key }}" deepinfra_api_key: "{{ op://AI Prod/Gateway/deepinfra_api_key }}"
anthropic_api_key: "{{ op://AI Prod/Gateway/anthropic_api_key }}"
# Search API Keys # Search API Keys
search: search:
@@ -36,10 +37,6 @@ stringData:
push: push:
service_key: "" service_key: ""
# Qdrant API key (optional, for hosted Qdrant)
qdrant:
api_key: ""
# Iceberg S3 credentials (must match minio-secret) # Iceberg S3 credentials (must match minio-secret)
iceberg: iceberg:
s3_access_key: "{{ op://AI Prod/MinIO/access_key }}" s3_access_key: "{{ op://AI Prod/MinIO/access_key }}"

View File

@@ -10,7 +10,7 @@ The platform runs across two namespaces:
| Namespace | Contents | | Namespace | Contents |
|-----------|----------| |-----------|----------|
| `ai` | Gateway, web UI, all infrastructure services (postgres, minio, kafka, flink, relay, ingestor, qdrant, dragonfly, iceberg-catalog) | | `ai` | Gateway, web UI, all infrastructure services (postgres, minio, kafka, flink, relay, ingestor, dragonfly, iceberg-catalog) |
| `sandbox` | Per-user sandbox containers (created dynamically by the gateway) | | `sandbox` | Per-user sandbox containers (created dynamically by the gateway) |
Secrets are managed via 1Password CLI (`op inject`). All `.tpl.yaml` files in `deploy/k8s/prod/secrets/` contain `op://` references and are safe to commit; actual values are never stored in git. Secrets are managed via 1Password CLI (`op inject`). All `.tpl.yaml` files in `deploy/k8s/prod/secrets/` contain `op://` references and are safe to commit; actual values are never stored in git.
@@ -217,7 +217,7 @@ kubectl --context=prod -n ai get configmaps
## Step 7 — Deploy Infrastructure ## Step 7 — Deploy Infrastructure
Infrastructure services (postgres, minio, kafka, iceberg-catalog, dragonfly, qdrant, relay, ingestor, flink) are defined in `deploy/k8s/prod/infrastructure.yaml` and were applied in Step 4. Infrastructure services (postgres, minio, kafka, iceberg-catalog, dragonfly, relay, ingestor, flink) are defined in `deploy/k8s/prod/infrastructure.yaml` and were applied in Step 4.
Wait for the StatefulSets and Deployments to become ready: Wait for the StatefulSets and Deployments to become ready:
@@ -225,7 +225,6 @@ Wait for the StatefulSets and Deployments to become ready:
kubectl --context=prod -n ai rollout status statefulset/postgres kubectl --context=prod -n ai rollout status statefulset/postgres
kubectl --context=prod -n ai rollout status statefulset/minio kubectl --context=prod -n ai rollout status statefulset/minio
kubectl --context=prod -n ai rollout status statefulset/kafka kubectl --context=prod -n ai rollout status statefulset/kafka
kubectl --context=prod -n ai rollout status statefulset/qdrant
kubectl --context=prod -n ai rollout status deployment/dragonfly kubectl --context=prod -n ai rollout status deployment/dragonfly
kubectl --context=prod -n ai rollout status deployment/iceberg-catalog kubectl --context=prod -n ai rollout status deployment/iceberg-catalog
kubectl --context=prod -n ai rollout status deployment/relay kubectl --context=prod -n ai rollout status deployment/relay

View File

@@ -22,20 +22,20 @@ The Agent Harness is the core orchestration layer for the Dexorder AI platform,
[ASCII architecture diagram updated: the RAG Retriever box and its Qdrant (Vectors) backend are removed; the MCP Connector and LLM Router remain, connecting to the user's MCP container (k8s pod) and external LLM providers (Anthropic, OpenAI, etc).]
## Message Processing Flow ## Message Processing Flow
@@ -57,17 +57,11 @@ When a user sends a message:
│ - context://workspace-state │ - context://workspace-state
│ - context://system-prompt │ - context://system-prompt
├─→ b. RAGRetriever searches for relevant memories: ├─→ b. Build system prompt:
│ - Embeds user query
│ - Searches Qdrant: user_id = current_user OR user_id = "0"
│ - Returns user-specific + global platform knowledge
├─→ c. Build system prompt:
│ - Base platform prompt │ - Base platform prompt
│ - User profile context │ - User profile context
│ - Workspace state │ - Workspace state
│ - Custom user instructions │ - Custom user instructions
│ - Relevant RAG memories
├─→ d. ModelRouter selects LLM: ├─→ d. ModelRouter selects LLM:
│ - Based on license tier │ - Based on license tier
@@ -92,11 +86,10 @@ When a user sends a message:
### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`) ### 1. Agent Harness (`gateway/src/harness/agent-harness.ts`)
**Stateless orchestrator** - all state lives in user's MCP container or RAG. **Stateless orchestrator** - all state lives in user's MCP container.
**Responsibilities:** **Responsibilities:**
- Fetch context from user's MCP resources - Fetch context from user's MCP resources
- Query RAG for relevant memories
- Build prompts with full context - Build prompts with full context
- Route to appropriate LLM - Route to appropriate LLM
- Handle tool calls (platform vs user) - Handle tool calls (platform vs user)
@@ -141,40 +134,12 @@ Routes queries to appropriate LLM based on:
- LangGraph checkpoints (1 hour TTL) - LangGraph checkpoints (1 hour TTL)
- Fast reads for active conversations - Fast reads for active conversations
**Qdrant** (Vector Search)
- Conversation embeddings
- User-specific memories (user_id = actual user ID)
- **Global platform knowledge** (user_id = "0")
- RAG retrieval with cosine similarity
- GDPR-compliant (indexed by user_id for fast deletion)
**Iceberg** (Cold Storage) **Iceberg** (Cold Storage)
- Full conversation history (partitioned by user_id, session_id) - Full conversation history (partitioned by user_id, session_id)
- Checkpoint snapshots for replay - Checkpoint snapshots for replay
- Analytics and time-travel queries - Analytics and time-travel queries
- GDPR-compliant with compaction - GDPR-compliant with compaction
#### RAG System:
**Global Knowledge** (user_id="0"):
- Platform capabilities and architecture
- Trading concepts and fundamentals
- Indicator development guides
- Strategy patterns and examples
- Loaded from `gateway/knowledge/` markdown files
**User Knowledge** (user_id=specific user):
- Personal conversation history
- Trading preferences and style
- Custom indicators and strategies
- Workspace state and context
**Query Flow:**
1. User query is embedded using EmbeddingService
2. Qdrant searches: `user_id IN (current_user, "0")`
3. Top-K relevant chunks returned
4. Added to LLM context automatically
### 5. Skills vs Subagents ### 5. Skills vs Subagents
#### Skills (`gateway/src/harness/skills/`) #### Skills (`gateway/src/harness/skills/`)
@@ -290,44 +255,6 @@ User's MCP container provides access to:
- Tactical order generators (TWAP, iceberg, etc.) - Tactical order generators (TWAP, iceberg, etc.)
- Smart order routing - Smart order routing
## Global Knowledge Management
### Document Loading
At gateway startup:
1. DocumentLoader scans `gateway/knowledge/` directory
2. Markdown files chunked by headers (~1000 tokens/chunk)
3. Embeddings generated via EmbeddingService
4. Stored in Qdrant with user_id="0"
5. Content hashing enables incremental updates
### Directory Structure
```
gateway/knowledge/
├── platform/ # Platform capabilities
├── trading/ # Trading fundamentals
├── indicators/ # Indicator development
└── strategies/ # Strategy patterns
```
### Updating Knowledge
**Development:**
```bash
curl -X POST http://localhost:3000/admin/reload-knowledge
```
**Production:**
- Update markdown files
- Deploy new version
- Auto-loaded on startup
**Monitoring:**
```bash
curl http://localhost:3000/admin/knowledge-stats
```
## Container Lifecycle ## Container Lifecycle
### User Container Creation ### User Container Creation
@@ -362,7 +289,6 @@ When user connects:
### ✅ Completed ### ✅ Completed
- Agent Harness with MCP integration - Agent Harness with MCP integration
- Model routing with license tiers - Model routing with license tiers
- RAG retriever with Qdrant
- Document loader for global knowledge - Document loader for global knowledge
- EmbeddingService (Ollama/OpenAI) - EmbeddingService (Ollama/OpenAI)
- Skills and subagents framework - Skills and subagents framework
@@ -388,5 +314,4 @@ When user connects:
- Documentation: `gateway/src/harness/README.md` - Documentation: `gateway/src/harness/README.md`
- Knowledge base: `gateway/knowledge/` - Knowledge base: `gateway/knowledge/`
- LangGraph: https://langchain-ai.github.io/langgraphjs/ - LangGraph: https://langchain-ai.github.io/langgraphjs/
- Qdrant: https://qdrant.tech/documentation/
- MCP Spec: https://modelcontextprotocol.io/ - MCP Spec: https://modelcontextprotocol.io/

View File

@@ -19,7 +19,6 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
│ • Authentication & session management │ │ • Authentication & session management │
│ • Agent Harness (LangChain/LangGraph orchestration) │ │ • Agent Harness (LangChain/LangGraph orchestration) │
│ - MCP client connector to user containers │ │ - MCP client connector to user containers │
│ - RAG retriever (Qdrant) │
│ - Model router (LLM selection) │ │ - Model router (LLM selection) │
│ - Skills & subagents framework │ │ - Skills & subagents framework │
│ • Dynamic user container provisioning │ │ • Dynamic user container provisioning │
@@ -30,8 +29,7 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
[ASCII component diagram updated: the Infrastructure box drops its `Qdrant (vectors)` entry; User Containers (MCP Server; user files: indicators, strategies), Relay (ZMQ Router: market data fanout, work queue, stateless), and Infrastructure (DragonflyDB cache, PostgreSQL metadata, MinIO S3) are otherwise unchanged.]
@@ -86,18 +84,16 @@ Dexorder is an AI-powered trading platform that combines real-time market data p
- **Agent Harness (LangChain/LangGraph):** ([[agent_harness]]) - **Agent Harness (LangChain/LangGraph):** ([[agent_harness]])
- Stateless LLM orchestration - Stateless LLM orchestration
- MCP client connector to user containers - MCP client connector to user containers
- RAG retrieval from Qdrant (global + user-specific knowledge)
- Model routing based on license tier and complexity - Model routing based on license tier and complexity
- Skills and subagents framework - Skills and subagents framework
- Workflow state machines with validation loops - Workflow state machines with validation loops
**Key Features:** **Key Features:**
- **Stateless design:** All conversation state lives in user containers or Qdrant - **Stateless design:** All conversation state lives in user containers
- **Multi-channel support:** WebSocket, Telegram (future: mobile, Discord, Slack) - **Multi-channel support:** WebSocket, Telegram (future: mobile, Discord, Slack)
- **Kubernetes-native:** Uses k8s API for container management - **Kubernetes-native:** Uses k8s API for container management
- **Three-tier memory:** - **Three-tier memory:**
- Redis: Hot storage, active sessions, LangGraph checkpoints (1 hour TTL) - Redis: Hot storage, active sessions, LangGraph checkpoints (1 hour TTL)
- Qdrant: Vector search, RAG, global + user knowledge, GDPR-compliant
- Iceberg: Cold storage, full history, analytics, time-travel queries - Iceberg: Cold storage, full history, analytics, time-travel queries
**Infrastructure:** **Infrastructure:**
@@ -270,12 +266,6 @@ Exchange API → Ingestor → Kafka → Flink → Iceberg
- Redis-compatible in-memory cache - Redis-compatible in-memory cache
- Session state, rate limiting, hot data - Session state, rate limiting, hot data
#### Qdrant
- Vector database for RAG
- **Global knowledge** (user_id="0"): Platform capabilities, trading concepts, strategy patterns
- **User knowledge** (user_id=specific): Personal conversations, preferences, strategies
- GDPR-compliant (indexed by user_id for fast deletion)
#### PostgreSQL #### PostgreSQL
- Iceberg catalog metadata - Iceberg catalog metadata
- User accounts and license info (gateway) - User accounts and license info (gateway)
@@ -458,17 +448,11 @@ The gateway's agent harness (LangChain/LangGraph) orchestrates LLM interactions
│ - context://workspace-state │ - context://workspace-state
│ - context://system-prompt │ - context://system-prompt
├─→ b. RAGRetriever searches Qdrant for relevant memories: ├─→ b. Build system prompt:
│ - Embeds user query
│ - Searches: user_id IN (current_user, "0")
│ - Returns user-specific + global platform knowledge
├─→ c. Build system prompt:
│ - Base platform prompt │ - Base platform prompt
│ - User profile context │ - User profile context
│ - Workspace state │ - Workspace state
│ - Custom user instructions │ - Custom user instructions
│ - Relevant RAG memories
├─→ d. ModelRouter selects LLM: ├─→ d. ModelRouter selects LLM:
│ - Based on license tier │ - Based on license tier
@@ -492,8 +476,6 @@ The gateway's agent harness (LangChain/LangGraph) orchestrates LLM interactions
**Key Architecture:** **Key Architecture:**
- **Gateway is stateless:** No conversation history stored in gateway - **Gateway is stateless:** No conversation history stored in gateway
- **User context in MCP:** All user-specific data lives in user's container - **User context in MCP:** All user-specific data lives in user's container
- **Global knowledge in Qdrant:** Platform documentation loaded from `gateway/knowledge/`
- **RAG at gateway level:** Semantic search combines global + user knowledge
- **Skills vs Subagents:** - **Skills vs Subagents:**
- Skills: Well-defined, single-purpose tasks - Skills: Well-defined, single-purpose tasks
- Subagents: Complex domain expertise with multi-file context - Subagents: Complex domain expertise with multi-file context
@@ -630,7 +612,6 @@ See [[backend_redesign]] for detailed notes.
- Historical backfill service - Historical backfill service
**Phase 3: Agent Features** **Phase 3: Agent Features**
- RAG integration (Qdrant)
- Strategy backtesting - Strategy backtesting
- Risk management tools - Risk management tools
- Portfolio analytics - Portfolio analytics

View File

@@ -1,8 +1,9 @@
# Development Plan # Development Plan
* Single conversation in gateway
* Realtime data
* Triggers * Triggers
* Alerts
* Needs Email/Telegram channels
* Screeners
* Strategy UI * Strategy UI
* Backtesting TV integration * Backtesting TV integration
* Paper Trading * Paper Trading
@@ -10,7 +11,21 @@
* Live Execution * Live Execution
* Sandbox <=> Dexorder auth * Sandbox <=> Dexorder auth
* Chat channels * Chat channels
* MCP channel (with or without images) * MCP channel (with or without images)
* Telegram cookie binding
* Needs user secrets vault
* TradingView indicator import tool * TradingView indicator import tool
* Trader preferences tool * Results persistence: ~~research analysis~~, backtests, strategy performance metrics, etc.
* * Free tier with token limits and sandbox shutdown
* Registration
* System email
* Performance analysis
* Custom pre-session scanners / summaries
* Saved prompts (Create /presession prompt command for easy re-use)
https://github.com/wangzhe3224/awesome-systematic-trading
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3247865 151 trading strategies
https://vectorbt.dev/
https://github.com/shiyu-coder/Kronos
https://x.com/RohOnChain/status/2041180375838498950?s=20 combining signals

View File

@@ -90,15 +90,26 @@ kubectl --context prod -n ai exec minio-0 -- mc alias set local http://localhost
kubectl --context prod -n ai exec minio-0 -- mc rm --recursive --force local/warehouse/ kubectl --context prod -n ai exec minio-0 -- mc rm --recursive --force local/warehouse/
``` ```
#### 4. Run the full deploy #### 4. Delete sandbox deployments and wipe sandbox PVCs
Sandbox PVCs have a finalizer that prevents deletion until the sandbox pod is gone. Delete the deployments first, then the PVCs:
```bash
kubectl --context prod -n sandbox delete deployments --all
kubectl --context prod -n sandbox delete pvc --all
```
The PVC deletion will complete once the pods finish terminating (Ceph cleanup can take ~30s). You can proceed to the deploy immediately — it does not depend on PVC termination completing.
#### 5. Run the full deploy
```bash ```bash
bin/deploy-all --sandboxes bin/deploy-all --sandboxes
``` ```
This rebuilds and redeploys all services, including `iceberg-catalog`, `flink-jobmanager`, and `flink-taskmanager` (which were scaled to zero above — `deploy-all` will restore them to their manifest replica counts). This rebuilds and redeploys all services, including `iceberg-catalog`, `flink-jobmanager`, and `flink-taskmanager` (which were scaled to zero above — `deploy-all` will restore them to their manifest replica counts). The `--sandboxes` flag also cleans up any remaining sandbox Services.
#### 5. Re-apply the gateway database schema #### 6. Re-apply the gateway database schema
The gateway does **not** auto-migrate. After the `iceberg` database is recreated, the schema must be applied manually: The gateway does **not** auto-migrate. After the `iceberg` database is recreated, the schema must be applied manually:
@@ -108,7 +119,7 @@ kubectl --context prod -n ai exec -i postgres-0 -- psql -U postgres -d iceberg <
This creates the `user`, `session`, `user_licenses`, and related tables. This creates the `user`, `session`, `user_licenses`, and related tables.
#### 6. Recreate all users #### 7. Recreate all users
```bash ```bash
bin/create-all-users prod bin/create-all-users prod
@@ -142,7 +153,7 @@ kubectl --context prod -n ai logs deployment/gateway --tail=100
**Cause:** Dropping the `iceberg` database removes the gateway's auth tables along with the Iceberg catalog metadata — they share the same database. **Cause:** Dropping the `iceberg` database removes the gateway's auth tables along with the Iceberg catalog metadata — they share the same database.
**Fix:** Re-apply the schema and recreate users (steps 5 and 6 above). **Fix:** Re-apply the schema and recreate users (steps 6 and 7 above).
### Gateway shows `42P01` errors but pod is running ### Gateway shows `42P01` errors but pod is running
@@ -164,3 +175,25 @@ The gateway does not auto-migrate on startup. The schema file must be applied ma
1Password's `op inject` requires interactive desktop authentication. Running it via `echo "yes" | bin/secret-update prod` or any background/piped invocation will fail silently (the script prints `✓` even though `kubectl apply` received empty input). 1Password's `op inject` requires interactive desktop authentication. Running it via `echo "yes" | bin/secret-update prod` or any background/piped invocation will fail silently (the script prints `✓` even though `kubectl apply` received empty input).
**Fix:** Run `bin/secret-update prod` in an interactive terminal with 1Password unlocked. **Fix:** Run `bin/secret-update prod` in an interactive terminal with 1Password unlocked.
### Config validation warnings during `bin/deploy-all`
**Symptom:** Step 3 (config update) prints errors like:
```
error: error validating "deploy/k8s/prod/configs/relay-config.yaml": error validating data: [apiVersion not set, kind not set]
```
for `relay-config`, `ingestor-config`, and `flink-config`.
**Cause:** These config files are raw data files (not Kubernetes manifests), so `kubectl` can't validate their structure. The underlying `kubectl create configmap` command succeeds regardless.
**Impact:** None — the configs are applied correctly and the script reports `✓ All configs updated successfully`. These warnings are expected and can be ignored.
### Flink image build produces many Maven shading warnings
**Symptom:** During Step 4, the Flink image build outputs dozens of `[WARNING] Discovered module-info.class` and overlapping class/resource warnings from Maven.
**Impact:** None — these are pre-existing warnings from bundling Iceberg, AWS SDK, and Flink dependencies together into a shaded JAR. The build completes successfully.
### `bin/deploy-all` confirmation prompt
Unlike `bin/secret-update`, the `bin/deploy-all` confirmation prompt (`Are you sure you want to continue? (yes/no)`) works fine with `echo "yes" | bin/deploy-all --sandboxes` from a script or non-interactive context.

View File

@@ -1 +1,11 @@
---
For the following series of analysis questions, use 5 years of 15 minute data from `ETH/USDT.BINANCE`
what conclusions can you make by analyzing historical data on ETH price direction changes near market session overlaps and market sessions changes on monday and tuesday? what conclusions can you make by analyzing historical data on ETH price direction changes near market session overlaps and market sessions changes on monday and tuesday?
---
do the same price direction change analysis but specifically compare two-hour ranges (1 hour before and after 9am EST, NY open; the 11am to 1pm range; and the 4:30 to 7pm EST range) to look for potential price direction changes, and compare the relative probability for Monday and Tuesday against all other times between Wednesday and Sunday.
---

View File

@@ -16,6 +16,12 @@ import com.dexorder.flink.publisher.RealtimeBarFunction;
import com.dexorder.flink.publisher.RealtimeBarPublisher; import com.dexorder.flink.publisher.RealtimeBarPublisher;
import com.dexorder.flink.publisher.TickWrapper; import com.dexorder.flink.publisher.TickWrapper;
import com.dexorder.flink.publisher.TickDeserializer; import com.dexorder.flink.publisher.TickDeserializer;
import com.dexorder.flink.quotes.Ticker24hFunction;
import com.dexorder.flink.quotes.Ticker24hPublisher;
import com.dexorder.flink.quotes.Ticker24hScheduler;
import com.dexorder.flink.quotes.Ticker24hWrapper;
import com.dexorder.flink.quotes.TickerBatchDeserializer;
import com.dexorder.flink.quotes.TickerBatchWrapper;
import com.dexorder.flink.sink.HistoricalBatchWriter; import com.dexorder.flink.sink.HistoricalBatchWriter;
import com.dexorder.flink.sink.SymbolMetadataWriter; import com.dexorder.flink.sink.SymbolMetadataWriter;
import com.dexorder.flink.zmq.ZmqChannelManager; import com.dexorder.flink.zmq.ZmqChannelManager;
@@ -263,7 +269,7 @@ public class TradingFlinkApp {
DataStream<RealtimeBar> barStream = tickStream DataStream<RealtimeBar> barStream = tickStream
.keyBy(TickWrapper::getTicker) .keyBy(TickWrapper::getTicker)
.flatMap(new RealtimeBarFunction(periods)) .process(new RealtimeBarFunction(periods))
.setParallelism(1); .setParallelism(1);
barStream.addSink(new RealtimeBarPublisher(notificationEndpoint)) barStream.addSink(new RealtimeBarPublisher(notificationEndpoint))
@@ -273,6 +279,35 @@ public class TradingFlinkApp {
LOG.info("Realtime tick pipeline configured: market-tick → OHLC bars → clients (periods={})", LOG.info("Realtime tick pipeline configured: market-tick → OHLC bars → clients (periods={})",
java.util.Arrays.toString(periods)); java.util.Arrays.toString(periods));
// Ticker24h pipeline: market-ticker Kafka → QuoteCurrencyIndex → ZMQ XPUB
KafkaSource<TickerBatchWrapper> tickerSource = KafkaSource.<TickerBatchWrapper>builder()
.setBootstrapServers(config.getKafkaBootstrapServers())
.setTopics(config.getKafkaTickerTopic())
.setGroupId("flink-ticker24h-consumer")
.setStartingOffsets(OffsetsInitializer.latest())
.setValueOnlyDeserializer(new TickerBatchDeserializer())
.build();
DataStream<TickerBatchWrapper> tickerBatchStream = env
.fromSource(tickerSource, WatermarkStrategy.noWatermarks(), "TickerBatch Kafka Source");
DataStream<Ticker24hWrapper> ticker24hStream = tickerBatchStream
.flatMap(new Ticker24hFunction())
.setParallelism(1)
.name("Ticker24hFunction");
ticker24hStream.addSink(new Ticker24hPublisher(notificationEndpoint))
.setParallelism(1)
.name("Ticker24hPublisher");
LOG.info("Ticker24h pipeline configured: market-ticker → Ticker24hFunction → clients");
// Start Ticker24h scheduler (fires on startup + hourly for all configured exchanges)
Ticker24hScheduler ticker24hScheduler = new Ticker24hScheduler(
broker, config.getSupportedExchanges());
ticker24hScheduler.start();
LOG.info("Ticker24hScheduler started for exchanges: {}", config.getSupportedExchanges());
// TODO: Set up CEP patterns and triggers // TODO: Set up CEP patterns and triggers
LOG.info("Flink job configured, starting execution"); LOG.info("Flink job configured, starting execution");
@@ -281,6 +316,7 @@ public class TradingFlinkApp {
Runtime.getRuntime().addShutdownHook(new Thread(() -> { Runtime.getRuntime().addShutdownHook(new Thread(() -> {
LOG.info("Shutting down Trading Flink Application"); LOG.info("Shutting down Trading Flink Application");
try { try {
ticker24hScheduler.stop();
notificationForwarder.close(); notificationForwarder.close();
subscriptionManager.stop(); subscriptionManager.stop();
broker.stop(); broker.stop();

View File

@@ -4,7 +4,10 @@ import org.yaml.snakeyaml.Yaml;
import java.io.FileInputStream; import java.io.FileInputStream;
import java.io.IOException; import java.io.IOException;
import java.io.InputStream; import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap; import java.util.HashMap;
import java.util.List;
import java.util.Map; import java.util.Map;
/** /**
@@ -136,6 +139,26 @@ public class AppConfig {
return getString("kafka_ohlc_topic", "market-ohlc"); return getString("kafka_ohlc_topic", "market-ohlc");
} }
public String getKafkaTickerTopic() {
return getString("kafka_ticker_topic", "market-ticker");
}
/**
* Comma-separated list of exchange IDs to fetch Ticker24h snapshots for.
* Default: BINANCE only.
*/
public List<String> getSupportedExchanges() {
String raw = getString("supported_exchanges", "BINANCE");
List<String> result = new ArrayList<>();
for (String part : raw.split(",")) {
String trimmed = part.trim().toUpperCase();
if (!trimmed.isEmpty()) {
result.add(trimmed);
}
}
return result;
}
// Notification config: // Notification config:
// Task managers PUSH notifications to this endpoint (job manager PULL address) // Task managers PUSH notifications to this endpoint (job manager PULL address)
public String getNotificationPublishEndpoint() { public String getNotificationPublishEndpoint() {

View File

@@ -123,10 +123,10 @@ public class SchemaInitializer {
/** /**
* Initialize the OHLC table if it doesn't exist. * Initialize the OHLC table if it doesn't exist.
*/ */
// Bump this when the schema changes. Tables with a different (or missing) version // Bump this when the schema changes. Increment by 1 for each change.
// will be dropped and recreated. Increment by 1 for each incompatible change.
// v1: open/high/low/close required; ingestor forward-fills interior gaps with previous close // v1: open/high/low/close required; ingestor forward-fills interior gaps with previous close
private static final String OHLC_SCHEMA_VERSION = "1"; // v2: added num_trades and quote_volume (appended; backward-compatible via Iceberg schema evolution)
private static final String OHLC_SCHEMA_VERSION = "2";
private static final String SCHEMA_VERSION_PROP = "app.schema.version"; private static final String SCHEMA_VERSION_PROP = "app.schema.version";
private void initializeOhlcTable() { private void initializeOhlcTable() {
@@ -154,11 +154,13 @@ public class SchemaInitializer {
if (tableExists) { if (tableExists) {
Table existing = catalog.loadTable(tableId); Table existing = catalog.loadTable(tableId);
String existingVersion = existing.properties().get(SCHEMA_VERSION_PROP); String existingVersion = existing.properties().get(SCHEMA_VERSION_PROP);
LOG.info("Table {} already exists at schema version {}", tableId, existingVersion);
if (!OHLC_SCHEMA_VERSION.equals(existingVersion)) { if (!OHLC_SCHEMA_VERSION.equals(existingVersion)) {
LOG.warn("Table {} has schema version '{}', expected '{}' — skipping (manual migration required if needed)", LOG.info("Evolving table {} from version '{}' to '{}'", tableId, existingVersion, OHLC_SCHEMA_VERSION);
tableId, existingVersion, OHLC_SCHEMA_VERSION); evolveOhlcSchema(existing);
existing.updateProperties().set(SCHEMA_VERSION_PROP, OHLC_SCHEMA_VERSION).commit();
LOG.info("Schema evolution complete for {}", tableId);
} }
LOG.info("Table {} already exists at schema version {} — skipping creation", tableId, existingVersion);
return; return;
} }
@@ -195,7 +197,11 @@ public class SchemaInitializer {
// Metadata fields // Metadata fields
optional(16, "request_id", Types.StringType.get(), "Request ID that generated this data"), optional(16, "request_id", Types.StringType.get(), "Request ID that generated this data"),
required(17, "ingested_at", Types.LongType.get(), "Timestamp when data was ingested by Flink (nanoseconds since epoch)") required(17, "ingested_at", Types.LongType.get(), "Timestamp when data was ingested by Flink (nanoseconds since epoch)"),
// Extended exchange fields — appended for backward-compatible schema evolution (v2)
optional(18, "num_trades", Types.LongType.get(), "Number of trades in the candle"),
optional(19, "quote_volume", Types.LongType.get(), "Total quote asset volume (scaled by price precision)")
); );
// Create the table with partitioning and properties // Create the table with partitioning and properties
@@ -218,6 +224,30 @@ public class SchemaInitializer {
} }
} }
/**
* Add any columns missing from a v1 OHLC table to bring it to v2.
* Iceberg schema evolution is safe and non-destructive — existing rows get null for new columns.
*/
private void evolveOhlcSchema(Table table) {
org.apache.iceberg.UpdateSchema update = table.updateSchema();
boolean changed = false;
java.util.Set<String> existing = new java.util.HashSet<>();
for (org.apache.iceberg.types.Types.NestedField f : table.schema().columns()) {
existing.add(f.name());
}
if (!existing.contains("num_trades")) {
update.addColumn("num_trades", Types.LongType.get(), "Number of trades in the candle");
changed = true;
}
if (!existing.contains("quote_volume")) {
update.addColumn("quote_volume", Types.LongType.get(), "Total quote asset volume (scaled by price precision)");
changed = true;
}
if (changed) {
update.commit();
}
}
/** /**
* Initialize the symbol_metadata table if it doesn't exist. * Initialize the symbol_metadata table if it doesn't exist.
*/ */

View File

@@ -60,7 +60,9 @@ public class IngestorBroker implements AutoCloseable {
/** Re-queue realtime job if no heartbeat received within this window (ms) */ /** Re-queue realtime job if no heartbeat received within this window (ms) */
private static final long HEARTBEAT_TIMEOUT_MS = 25_000; private static final long HEARTBEAT_TIMEOUT_MS = 25_000;
/** Re-queue historical job if not completed within this window (ms) */ /** Re-queue historical job if not completed within this window (ms) */
private static final long HISTORICAL_TIMEOUT_MS = 60_000; private static final long HISTORICAL_TIMEOUT_MS = 120_000;
/** Re-queue ticker snapshot job if not completed within this window (ms) */
private static final long TICKER_SNAPSHOT_TIMEOUT_MS = 30_000;
private final ZmqChannelManager zmqManager; private final ZmqChannelManager zmqManager;
private volatile boolean running; private volatile boolean running;
@@ -113,6 +115,23 @@ public class IngestorBroker implements AutoCloseable {
LOG.info("IngestorBroker stopped"); LOG.info("IngestorBroker stopped");
} }
/**
* Submit a TICKER_SNAPSHOT request from outside the broker thread (thread-safe).
* Called by Ticker24hScheduler on startup and hourly.
* Uses sentinel ticker "@TICKER24H.{EXCHANGE}" (e.g., "@TICKER24H.BINANCE").
*/
public void submitTicker24hRequest(String exchange) {
String jobId = UUID.randomUUID().toString();
DataRequest request = DataRequest.newBuilder()
.setRequestId(jobId)
.setJobId(jobId)
.setType(DataRequest.RequestType.TICKER_SNAPSHOT)
.setTicker("@TICKER24H." + exchange.toUpperCase())
.build();
externalSubmissions.add(request);
LOG.info("Enqueued TICKER_SNAPSHOT request: exchange={}, jobId={}", exchange, jobId);
}
/** /**
* Submit a realtime data request from outside the broker thread (thread-safe). * Submit a realtime data request from outside the broker thread (thread-safe).
* Called by RealtimeSubscriptionManager when subscription ref count goes 0→1. * Called by RealtimeSubscriptionManager when subscription ref count goes 0→1.
@@ -219,21 +238,38 @@ public class IngestorBroker implements AutoCloseable {
try {
    SubmitHistoricalRequest req = SubmitHistoricalRequest.parseFrom(payload);
    String jobId = UUID.randomUUID().toString();
    String ticker = req.getTicker();
    String clientId = req.hasClientId() ? req.getClientId() : "";

    DataRequest dataRequest;
    if (ticker.startsWith("@TICKER24H.")) {
        // Client-initiated ticker snapshot — route to TICKER_SNAPSHOT, not OHLC
        dataRequest = DataRequest.newBuilder()
            .setRequestId(req.getRequestId())
            .setJobId(jobId)
            .setType(DataRequest.RequestType.TICKER_SNAPSHOT)
            .setTicker(ticker)
            .setClientId(clientId)
            .build();
        LOG.info("Routing client-initiated TICKER_SNAPSHOT: request_id={}, ticker={}, client_id={}",
            req.getRequestId(), ticker, clientId);
    } else {
        dataRequest = DataRequest.newBuilder()
            .setRequestId(req.getRequestId())
            .setJobId(jobId)
            .setType(DataRequest.RequestType.HISTORICAL_OHLC)
            .setTicker(ticker)
            .setHistorical(com.dexorder.proto.HistoricalParams.newBuilder()
                .setStartTime(req.getStartTime())
                .setEndTime(req.getEndTime())
                .setPeriodSeconds(req.getPeriodSeconds())
                .build())
            .setClientId(clientId)
            .build();
        LOG.info("Received historical request from relay: request_id={}, ticker={}",
            req.getRequestId(), ticker);
    }
    enqueueJob(dataRequest);
} catch (Exception e) {
    LOG.error("Failed to parse SubmitHistoricalRequest from relay", e);
}
@@ -411,8 +447,14 @@ public class IngestorBroker implements AutoCloseable {
for (Map.Entry<String, ActiveJob> entry : activeJobs.entrySet()) {
    ActiveJob job = entry.getValue();
    long timeout;
    if (job.type == DataRequest.RequestType.REALTIME_TICKS) {
        timeout = HEARTBEAT_TIMEOUT_MS;
    } else if (job.type == DataRequest.RequestType.TICKER_SNAPSHOT) {
        timeout = TICKER_SNAPSHOT_TIMEOUT_MS;
    } else {
        timeout = HISTORICAL_TIMEOUT_MS;
    }
    if (now - job.lastHeartbeat > timeout) {
        timedOut.add(entry.getKey());
    }
@@ -460,7 +502,8 @@ public class IngestorBroker implements AutoCloseable {
boolean exchangeMatch = exchange.isEmpty() || slot.exchange.equals(exchange);
boolean typeMatch = slot.slotType == SlotType.ANY
        || (slot.slotType == SlotType.HISTORICAL
            && (requestType == DataRequest.RequestType.HISTORICAL_OHLC
                || requestType == DataRequest.RequestType.TICKER_SNAPSHOT))
        || (slot.slotType == SlotType.REALTIME
            && requestType == DataRequest.RequestType.REALTIME_TICKS);
if (exchangeMatch && typeMatch) {

View File

@@ -19,17 +19,21 @@ import java.util.regex.Pattern;
 * must go through {@link #enqueuePublish(byte[]...)} so they are sent from the single loop
 * thread — ZMQ sockets are not thread-safe.
 *
 * Topic formats:
 *   Closed bars: {@code {ticker}|ohlc:{period_seconds}}       (strategies, existing consumers)
 *   Open bars:   {@code {ticker}|ohlc:{period_seconds}:open}  (chart, live price updates)
 *
 * Both topic forms map to the same underlying ingestor activation for that ticker.
 *
 * Reference counting:
 *   tickerRefs — across all subscribed topics for a ticker; 0→1 triggers ingestor activation
 *   topicRefs  — per topic string; consulted by RealtimeOHLCPublisher to filter output
 */
public class RealtimeSubscriptionManager implements AutoCloseable {
    private static final Logger LOG = LoggerFactory.getLogger(RealtimeSubscriptionManager.class);
    // Matches both "{ticker}|ohlc:{period}" and "{ticker}|ohlc:{period}:open"
    private static final Pattern TOPIC_PATTERN = Pattern.compile("^(.+)\\|ohlc:(\\d+)(:open)?$");

    private final ZmqChannelManager zmqManager;
    private final ZMQ.Socket xpubSocket;
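
A minimal, self-contained sketch of how the updated TOPIC_PATTERN resolves both topic forms; the class name is illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TopicPatternDemo {
    private static final Pattern TOPIC_PATTERN = Pattern.compile("^(.+)\\|ohlc:(\\d+)(:open)?$");

    public static void main(String[] args) {
        for (String topic : new String[]{"BTC/USDT.BINANCE|ohlc:60", "BTC/USDT.BINANCE|ohlc:60:open"}) {
            Matcher m = TOPIC_PATTERN.matcher(topic);
            if (m.matches()) {
                String ticker = m.group(1);                        // same ticker for both forms
                int periodSeconds = Integer.parseInt(m.group(2));  // 60
                boolean openBars = m.group(3) != null;             // distinguishes the ":open" stream
                System.out.printf("%s -> ticker=%s period=%d open=%b%n", topic, ticker, periodSeconds, openBars);
            }
        }
    }
}
```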

View File

@@ -69,7 +69,13 @@ public class OHLCBatchDeserializer implements DeserializationSchema<OHLCBatchWra
    row.getHigh(),
    row.getLow(),
    row.getClose(),
    row.hasVolume() ? row.getVolume() : null,
    row.hasBuyVol() ? row.getBuyVol() : null,
    row.hasSellVol() ? row.getSellVol() : null,
    row.hasOpenTime() ? row.getOpenTime() : null,
    row.hasCloseTime() ? row.getCloseTime() : null,
    row.hasNumTrades() ? row.getNumTrades() : null,
    row.hasQuoteVolume() ? row.getQuoteVolume() : null
));
}

View File

@@ -116,57 +116,58 @@ public class OHLCBatchWrapper implements Serializable {
/**
 * Single OHLC row. open/high/low/close/volume are nullable to support gap bars
 * (periods where no trades occurred). All extended fields are nullable and only
 * populated when the exchange provides them (e.g. Binance klines).
 */
public static class OHLCRow implements Serializable {
    private static final long serialVersionUID = 1L;
    private final long timestamp;
    private final String ticker;
    private final Long open;   // null for gap bars
    private final Long high;   // null for gap bars
    private final Long low;    // null for gap bars
    private final Long close;  // null for gap bars
    private final Long volume;
    private final Long buyVol;
    private final Long sellVol;
    private final Long openTime;
    private final Long closeTime;
    private final Long numTrades;
    private final Long quoteVolume;

    public OHLCRow(long timestamp, String ticker, Long open, Long high,
                   Long low, Long close, Long volume,
                   Long buyVol, Long sellVol, Long openTime, Long closeTime,
                   Long numTrades, Long quoteVolume) {
        this.timestamp = timestamp;
        this.ticker = ticker;
        this.open = open;
        this.high = high;
        this.low = low;
        this.close = close;
        this.volume = volume;
        this.buyVol = buyVol;
        this.sellVol = sellVol;
        this.openTime = openTime;
        this.closeTime = closeTime;
        this.numTrades = numTrades;
        this.quoteVolume = quoteVolume;
    }

    public long getTimestamp() { return timestamp; }
    public String getTicker() { return ticker; }
    public Long getOpen() { return open; }
    public Long getHigh() { return high; }
    public Long getLow() { return low; }
    public Long getClose() { return close; }
    public Long getVolume() { return volume; }
    public Long getBuyVol() { return buyVol; }
    public Long getSellVol() { return sellVol; }
    public Long getOpenTime() { return openTime; }
    public Long getCloseTime() { return closeTime; }
    public Long getNumTrades() { return numTrades; }
    public Long getQuoteVolume() { return quoteVolume; }

    public boolean isGapBar() {
        return open == null && high == null && low == null && close == null;

@@ -180,6 +181,9 @@ public class OHLCBatchWrapper implements Serializable {
            (isGapBar() ? ", gap=true" :
                ", open=" + open + ", high=" + high + ", low=" + low + ", close=" + close) +
            ", volume=" + volume +
            ", buyVol=" + buyVol +
            ", sellVol=" + sellVol +
            ", numTrades=" + numTrades +
            '}';
    }
}
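
A small usage sketch of the extended OHLCRow constructor — one fully populated Binance-style bar and one gap bar. The numbers are illustrative scaled longs, not real market data.

```java
OHLCBatchWrapper.OHLCRow filled = new OHLCBatchWrapper.OHLCRow(
        1714329600000L, "BTC/USDT.BINANCE",
        6_500_000L, 6_510_000L, 6_495_000L, 6_505_000L,  // O H L C
        1_200L,                                          // volume
        700L, 500L,                                      // buyVol, sellVol (sellVol = volume - buyVol)
        1714329600000000000L, 1714329659999000000L,      // openTime, closeTime (ns)
        42L, 7_800_000L);                                // numTrades, quoteVolume

OHLCBatchWrapper.OHLCRow gap = new OHLCBatchWrapper.OHLCRow(
        1714329660000L, "BTC/USDT.BINANCE",
        null, null, null, null, null, null, null, null, null, null, null);
assert gap.isGapBar();
```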

View File

@@ -3,8 +3,11 @@ package com.dexorder.flink.publisher;
import java.io.Serializable;

/**
 * A single OHLC bar for a given ticker and period.
 * Output type of RealtimeBarFunction, input type of RealtimeBarPublisher.
 *
 * isClosed=true  → window fully closed; published on topic "{ticker}|ohlc:{period}"
 * isClosed=false → window still open (snapshot); published on "{ticker}|ohlc:{period}:open"
 */
public class RealtimeBar implements Serializable {
    private static final long serialVersionUID = 1L;

@@ -23,11 +26,14 @@ public class RealtimeBar implements Serializable {
    private long volume;
    /** Number of ticks in this window */
    private int tickCount;
    /** True if this bar's time window has fully closed; false if still accumulating. */
    private boolean isClosed;

    public RealtimeBar() {}

    public RealtimeBar(String ticker, int periodSeconds, long windowStartMs,
                       long open, long high, long low, long close, long volume, int tickCount,
                       boolean isClosed) {
        this.ticker = ticker;
        this.periodSeconds = periodSeconds;
        this.windowStartMs = windowStartMs;

@@ -37,6 +43,7 @@ public class RealtimeBar implements Serializable {
        this.close = close;
        this.volume = volume;
        this.tickCount = tickCount;
        this.isClosed = isClosed;
    }

    public String getTicker() { return ticker; }

@@ -48,6 +55,7 @@ public class RealtimeBar implements Serializable {
    public long getClose() { return close; }
    public long getVolume() { return volume; }
    public int getTickCount() { return tickCount; }
    public boolean isClosed() { return isClosed; }

    public void setTicker(String ticker) { this.ticker = ticker; }
    public void setPeriodSeconds(int periodSeconds) { this.periodSeconds = periodSeconds; }

@@ -58,16 +66,22 @@ public class RealtimeBar implements Serializable {
    public void setClose(long close) { this.close = close; }
    public void setVolume(long volume) { this.volume = volume; }
    public void setTickCount(int tickCount) { this.tickCount = tickCount; }
    public void setClosed(boolean closed) { this.isClosed = closed; }

    /**
     * ZMQ topic for this bar.
     * Closed bars: "{ticker}|ohlc:{period}"      (strategies, existing consumers)
     * Open bars:   "{ticker}|ohlc:{period}:open" (chart, live price updates)
     */
    public String topic() {
        return ticker + "|ohlc:" + periodSeconds + (isClosed ? "" : ":open");
    }

    @Override
    public String toString() {
        return "RealtimeBar{ticker='" + ticker + "', period=" + periodSeconds +
            "s, windowStart=" + windowStartMs + ", O=" + open + " H=" + high +
            " L=" + low + " C=" + close + ", ticks=" + tickCount +
            ", closed=" + isClosed + '}';
    }
}
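
Usage sketch showing how isClosed drives the topic routing; all values are illustrative.

```java
RealtimeBar open = new RealtimeBar("BTC/USDT.BINANCE", 60, 1714329600000L,
        6_500_000L, 6_510_000L, 6_495_000L, 6_505_000L, 1_200L, 17, false);
RealtimeBar closed = new RealtimeBar("BTC/USDT.BINANCE", 60, 1714329600000L,
        6_500_000L, 6_512_000L, 6_495_000L, 6_508_000L, 1_450L, 23, true);

open.topic();    // "BTC/USDT.BINANCE|ohlc:60:open"  — chart / live price stream
closed.topic();  // "BTC/USDT.BINANCE|ohlc:60"       — strategy / trigger stream
```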

View File

@@ -1,11 +1,13 @@
package com.dexorder.flink.publisher;

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -19,10 +21,23 @@ import org.slf4j.LoggerFactory;
 * emitted immediately when the boundary is crossed, so bars are delayed by at most
 * one tick interval (~10s for realtime polling).
 *
 * Emits two types of bars per tick:
 *  - Open bar (isClosed=false): current accumulator state, debounced via processing-time
 *    timer. Multiple ticks within DEBOUNCE_MS share a single emission. Emitted on
 *    topic "{ticker}|ohlc:{period}:open" — consumed by charts for live price display.
 *  - Closed bar (isClosed=true): emitted immediately when a window boundary is crossed.
 *    Topic: "{ticker}|ohlc:{period}" — consumed by strategies/triggers.
 *
 * Debouncing: open bars are not emitted per-tick. Instead the first tick in a batch
 * registers a processing-time timer (DEBOUNCE_MS in the future). onTimer() emits
 * the final accumulated state once, after the Kafka poll queue drains.
 *
 * Replay protection: ticks whose trade timestamp predates a period's current window start
 * are discarded (prevents Kafka replay from contaminating current bars). Open bars are
 * additionally suppressed until the first live tick (within LIVE_TICK_THRESHOLD_MS of now)
 * is processed, so Kafka catch-up produces a single bar rather than a flood.
 *
 * Accumulator layout (long[8]):
 *  [0] open
 *  [1] high
 *  [2] low

@@ -30,13 +45,23 @@ import org.slf4j.LoggerFactory;
 *  [4] volume (sum of base amount)
 *  [5] windowStartMs (epoch ms)
 *  [6] tickCount
 *  [7] valid (1 = seeded or fresh window, 0 = mid-window cold start — open bars suppressed)
 */
public class RealtimeBarFunction extends KeyedProcessFunction<String, TickWrapper, RealtimeBar> {
    private static final Logger LOG = LoggerFactory.getLogger(RealtimeBarFunction.class);
    private static final long serialVersionUID = 1L;

    // Ticks within this many ms of wall-clock time are considered live (vs. Kafka catch-up).
    private static final long LIVE_TICK_THRESHOLD_MS = 30_000L;
    // Open bars are debounced: first tick in a batch registers a timer; onTimer emits once.
    private static final long DEBOUNCE_MS = 50L;

    private final int[] periods;
    private transient MapState<Integer, long[]> accumState;
    // Tracks the timestamp of the pending debounce timer (null if none registered).
    private transient ValueState<Long> pendingTimerTs;
    // Suppresses open bar emissions during Kafka catch-up; set to true on first live tick.
    private transient boolean caughtUp = false;

    /**
     * @param periods Period lengths in seconds (e.g., 60, 300, 900, 3600)

@@ -53,13 +78,31 @@ public class RealtimeBarFunction extends RichFlatMapFunction<TickWrapper, Realti
            PrimitiveArrayTypeInfo.LONG_PRIMITIVE_ARRAY_TYPE_INFO
        );
        accumState = getRuntimeContext().getMapState(desc);
        pendingTimerTs = getRuntimeContext().getState(
            new ValueStateDescriptor<>("pendingTimerTs", Long.class));
    }

    @Override
    public void processElement(TickWrapper tick, Context ctx, Collector<RealtimeBar> out) throws Exception {
        if (tick == null) return;
        long nowMs = System.currentTimeMillis();

        // Seeds use Long.MAX_VALUE so they always pass the per-period timestamp gate below.
        long tickTimestampMs = tick.isSeed() ? Long.MAX_VALUE : (tick.getTimestamp() / 1_000_000L);

        if (tick.isSeed()) {
            LOG.info("Seed tick received: ticker={}, seedPeriod={}, seedWindowStart={}, seedHigh={}, nowMs={}",
                tick.getTicker(), tick.getSeedPeriodSeconds(), tick.getSeedWindowStartMs(),
                tick.getSeedHigh(), nowMs);
        }

        // Advance catch-up flag on the first live tick (within threshold of wall-clock time).
        if (!caughtUp && !tick.isSeed() && (nowMs - tickTimestampMs) < LIVE_TICK_THRESHOLD_MS) {
            caughtUp = true;
            LOG.info("Caught up to live data: ticker={}", tick.getTicker());
        }

        boolean needsOpenBarTimer = false;

        for (int period : periods) {
            long periodMs = period * 1000L;

@@ -67,50 +110,129 @@ public class RealtimeBarFunction extends RichFlatMapFunction<TickWrapper, Realti
            long[] accum = accumState.get(period);

            // Seed ticks pre-populate the accumulator from historical OHLC.
            // Only apply when the accumulator is absent and the seed targets this period's current window.
            if (tick.isSeed()) {
                if (tick.getSeedPeriodSeconds() == period && accum == null
                        && tick.getSeedWindowStartMs() == windowStart) {
                    long[] seeded = {
                        tick.getPrice(),     // open
                        tick.getSeedHigh(),  // high
                        tick.getSeedLow(),   // low
                        tick.getSeedClose(), // close
                        tick.getAmount(),    // volume
                        windowStart,
                        0L,                  // tickCount (no live ticks yet)
                        1L                   // valid
                    };
                    accumState.put(period, seeded);
                    LOG.info("Applied seed: ticker={}, period={}s, windowStart={}", tick.getTicker(), period, windowStart);
                } else if (tick.getSeedPeriodSeconds() == period) {
                    LOG.info("Seed not applied: ticker={}, period={}s, accumNull={}, seedWindow={}, currentWindow={}",
                        tick.getTicker(), period, accum == null, tick.getSeedWindowStartMs(), windowStart);
                }
                continue;
            }

            // Discard ticks whose trade timestamp predates this period's current window.
            // Prevents Kafka replay of historical trades from contaminating current bars.
            if (tickTimestampMs < windowStart) {
                continue;
            }

            if (accum == null) {
                // First live tick for this period, no seed — open mid-window, suppress open bars
                long[] newAccum = openWindow(tick, windowStart, false);
                accumState.put(period, newAccum);
                LOG.info("Cold-start (no seed): ticker={}, period={}s, valid=0, open bars suppressed", tick.getTicker(), period);
            } else if (accum[5] != windowStart) {
                // Window boundary crossed — emit closed bar immediately, then start a fresh valid window
                if (accum[6] > 0) {
                    out.collect(toBar(tick.getTicker(), period, accum, true));
                    LOG.debug("Emitted closed bar: ticker={}, period={}s, windowStart={}, ticks={}",
                        tick.getTicker(), period, accum[5], accum[6]);
                }
                long[] newAccum = openWindow(tick, windowStart, true);
                accumState.put(period, newAccum);
                if (caughtUp) {
                    needsOpenBarTimer = true;
                }
            } else {
                // Same window — update accumulator
                accum[1] = Math.max(accum[1], tick.getPrice()); // high
                accum[2] = Math.min(accum[2], tick.getPrice()); // low
                accum[3] = tick.getPrice();                     // close
                accum[4] += tick.getAmount();                   // volume
                accum[6]++;                                     // tick count
                accumState.put(period, accum);
                if (accum[7] == 1 && caughtUp) {
                    needsOpenBarTimer = true;
                } else if (accum[7] == 0 && caughtUp) {
                    LOG.debug("Open bar suppressed (valid=0, no seed): ticker={}, period={}s", tick.getTicker(), period);
                }
            }
        }

        // Register a debounce timer for open bar emission (if not already pending).
        // The timer fires after DEBOUNCE_MS, emitting the final accumulated state once
        // for all ticks that arrived in the same Kafka poll batch.
        if (needsOpenBarTimer && pendingTimerTs.value() == null) {
            long timerTs = ctx.timerService().currentProcessingTime() + DEBOUNCE_MS;
            ctx.timerService().registerProcessingTimeTimer(timerTs);
            pendingTimerTs.update(timerTs);
        }
    }

    /**
     * Fires after DEBOUNCE_MS: emits the current open bar state for all valid periods.
     * By this point the Kafka poll queue has drained, so this represents one combined
     * update for all ticks that arrived in the batch.
     */
    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<RealtimeBar> out) throws Exception {
        pendingTimerTs.clear();
        if (!caughtUp) return;
        String ticker = ctx.getCurrentKey();
        long nowMs = System.currentTimeMillis();
        for (int period : periods) {
            long[] accum = accumState.get(period);
            if (accum == null || accum[7] != 1) continue;
            // Verify accumulator is still in the current window (guard against stale state)
            long periodMs = period * 1000L;
            long windowStart = (nowMs / periodMs) * periodMs;
            if (accum[5] != windowStart) continue;
            out.collect(toBar(ticker, period, accum, false));
            LOG.debug("Debounced open bar emitted: ticker={}, period={}s", ticker, period);
        }
    }

    private static long[] openWindow(TickWrapper tick, long windowStart, boolean valid) {
        return new long[]{
            tick.getPrice(),   // open
            tick.getPrice(),   // high
            tick.getPrice(),   // low
            tick.getPrice(),   // close
            tick.getAmount(),  // volume
            windowStart,
            1L,                // tickCount
            valid ? 1L : 0L    // valid flag
        };
    }

    private static RealtimeBar toBar(String ticker, int period, long[] accum, boolean isClosed) {
        return new RealtimeBar(
            ticker, period,
            accum[5],                               // windowStartMs
            accum[0], accum[1], accum[2], accum[3], // O H L C
            accum[4],                               // volume
            (int) accum[6],                         // tickCount
            isClosed
        );
    }
}
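
An assumed wiring sketch for the function above: keying the tick stream by ticker is what gives onTimer() and the MapState their per-ticker scope. The surrounding job, source and sink names are placeholders, not taken from the diff.

```java
package com.dexorder.flink.publisher;

import org.apache.flink.streaming.api.datastream.DataStream;

public class RealtimeBarJobSketch {
    public static void wire(DataStream<TickWrapper> ticks) {
        DataStream<RealtimeBar> bars = ticks
                .keyBy(TickWrapper::getTicker)   // per-ticker state and debounce timers
                .process(new RealtimeBarFunction(new int[]{60, 300, 900, 3600}));
        bars.print();                            // in the real job this feeds the realtime bar publisher
    }
}
```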

View File

@@ -40,7 +40,7 @@ public class TickDeserializer implements DeserializationSchema<TickWrapper> {
Tick tick = Tick.parseFrom(payload);
TickWrapper tw = new TickWrapper(
    tick.getTicker(),
    tick.getTradeId(),
    tick.getTimestamp(),

@@ -49,6 +49,15 @@ public class TickDeserializer implements DeserializationSchema<TickWrapper> {
    tick.getQuoteAmount(),
    tick.getTakerBuy()
);
if (tick.hasIsSeed() && tick.getIsSeed()) {
    tw.setIsSeed(true);
    tw.setSeedHigh(tick.getSeedHigh());
    tw.setSeedLow(tick.getSeedLow());
    tw.setSeedClose(tick.getSeedClose());
    tw.setSeedWindowStartMs(tick.getSeedWindowStartMs());
    tw.setSeedPeriodSeconds(tick.getSeedPeriodSeconds());
}
return tw;
} catch (Exception e) {
    LOG.warn("Failed to deserialize Tick, skipping: {}", e.getMessage());

View File

@@ -20,6 +20,12 @@ public class TickWrapper implements Serializable {
/** Quote amount as scaled integer */
private long quoteAmount;
private boolean takerBuy;

private boolean isSeed;
private long seedHigh;
private long seedLow;
private long seedClose;
private long seedWindowStartMs;
private int seedPeriodSeconds;

public TickWrapper() {}

@@ -41,6 +47,12 @@ public class TickWrapper implements Serializable {
public long getAmount() { return amount; }
public long getQuoteAmount() { return quoteAmount; }
public boolean isTakerBuy() { return takerBuy; }
public boolean isSeed() { return isSeed; }
public long getSeedHigh() { return seedHigh; }
public long getSeedLow() { return seedLow; }
public long getSeedClose() { return seedClose; }
public long getSeedWindowStartMs() { return seedWindowStartMs; }
public int getSeedPeriodSeconds() { return seedPeriodSeconds; }

public void setTicker(String ticker) { this.ticker = ticker; }
public void setTradeId(String tradeId) { this.tradeId = tradeId; }

@@ -49,6 +61,12 @@ public class TickWrapper implements Serializable {
public void setAmount(long amount) { this.amount = amount; }
public void setQuoteAmount(long quoteAmount) { this.quoteAmount = quoteAmount; }
public void setTakerBuy(boolean takerBuy) { this.takerBuy = takerBuy; }
public void setIsSeed(boolean isSeed) { this.isSeed = isSeed; }
public void setSeedHigh(long seedHigh) { this.seedHigh = seedHigh; }
public void setSeedLow(long seedLow) { this.seedLow = seedLow; }
public void setSeedClose(long seedClose) { this.seedClose = seedClose; }
public void setSeedWindowStartMs(long seedWindowStartMs) { this.seedWindowStartMs = seedWindowStartMs; }
public void setSeedPeriodSeconds(int seedPeriodSeconds) { this.seedPeriodSeconds = seedPeriodSeconds; }

@Override
public String toString() {
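
A sketch of how a seed tick for the current one-minute window might be populated before being injected ahead of live ticks. setPrice() is assumed to exist alongside the setters shown above, and the values are illustrative.

```java
TickWrapper seed = new TickWrapper();
seed.setTicker("BTC/USDT.BINANCE");
seed.setPrice(6_505_000L);               // becomes the open of the seeded accumulator
seed.setAmount(1_200L);                  // seeds the volume
seed.setIsSeed(true);
seed.setSeedPeriodSeconds(60);
seed.setSeedWindowStartMs(1714329600000L);
seed.setSeedHigh(6_510_000L);
seed.setSeedLow(6_495_000L);
seed.setSeedClose(6_505_000L);
```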

View File

@@ -0,0 +1,172 @@
package com.dexorder.flink.quotes;
import com.dexorder.proto.QuoteCurrencyIndex;
import com.dexorder.proto.QuoteCurrencyRate;
import com.dexorder.proto.Ticker24h;
import com.dexorder.proto.TickerStats;
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.*;
/**
* Flink function that converts TickerBatch messages into Ticker24h snapshots.
*
* Maintains an in-memory cross-exchange price index to compute std_quote_volume
* (quote volume normalized to USD) for each ticker. USD stablecoins are hardcoded
* to 1.0; crypto quote currencies are looked up from the exchange price index
* using priority order: BINANCE → COINBASE → KRAKEN → others.
*
* Must run with parallelism=1 (maintains non-replicated cross-exchange state).
*/
public class Ticker24hFunction extends RichFlatMapFunction<TickerBatchWrapper, Ticker24hWrapper> {
private static final long serialVersionUID = 1L;
private static final Logger LOG = LoggerFactory.getLogger(Ticker24hFunction.class);
private static final Set<String> USD_STABLECOINS = new HashSet<>(Arrays.asList(
"USDT", "USDC", "BUSD", "TUSD", "DAI", "USDP", "GUSD"
));
// Exchanges checked in priority order when looking up cross-currency rates
private static final List<String> EXCHANGE_PRIORITY = Arrays.asList(
"BINANCE", "COINBASE", "KRAKEN"
);
// exchange → (ticker → lastPrice), maintained across all received batches
private transient Map<String, Map<String, Double>> exchangePriceIndex;
@Override
public void open(Configuration parameters) {
exchangePriceIndex = new HashMap<>();
}
@Override
public void flatMap(TickerBatchWrapper batch, Collector<Ticker24hWrapper> out) {
String exchangeId = batch.getExchangeId();
long fetchedAt = batch.getFetchedAt();
List<TickerBatchWrapper.TickerStatsRow> rows = batch.getTickers();
// Update cross-exchange price index with this batch's prices
Map<String, Double> priceMap = new HashMap<>(rows.size() * 2);
for (TickerBatchWrapper.TickerStatsRow row : rows) {
if (row.lastPrice > 0) {
priceMap.put(row.ticker, row.lastPrice);
}
}
exchangePriceIndex.put(exchangeId, priceMap);
// Build QuoteCurrencyIndex from all unique quote assets in this batch
Set<String> quoteAssets = new LinkedHashSet<>();
for (TickerBatchWrapper.TickerStatsRow row : rows) {
quoteAssets.add(row.quoteAsset);
}
Map<String, Double> usdRates = new HashMap<>();
Map<String, String> usdSources = new HashMap<>();
QuoteCurrencyIndex.Builder indexBuilder = QuoteCurrencyIndex.newBuilder()
.setGeneratedAt(fetchedAt);
for (String quoteAsset : quoteAssets) {
QuoteCurrencyRate rate = buildRate(quoteAsset, fetchedAt);
if (rate != null) {
indexBuilder.addRates(rate);
usdRates.put(quoteAsset, rate.getUsdRate());
usdSources.put(quoteAsset, rate.getSourceTicker());
}
}
QuoteCurrencyIndex currencyIndex = indexBuilder.build();
// Build Ticker24h with std_quote_volume for each ticker
Ticker24h.Builder ticker24hBuilder = Ticker24h.newBuilder()
.setExchangeId(exchangeId)
.setGeneratedAt(fetchedAt)
.setCurrencyIndex(currencyIndex);
for (TickerBatchWrapper.TickerStatsRow row : rows) {
TickerStats.Builder tsBuilder = TickerStats.newBuilder()
.setTicker(row.ticker)
.setExchangeId(row.exchangeId)
.setBaseAsset(row.baseAsset)
.setQuoteAsset(row.quoteAsset)
.setLastPrice(row.lastPrice)
.setPriceChangePct(row.priceChangePct)
.setQuoteVolume24H(row.quoteVolume24h)
.setTimestamp(row.timestamp);
if (row.bidPrice != null) tsBuilder.setBidPrice(row.bidPrice);
if (row.askPrice != null) tsBuilder.setAskPrice(row.askPrice);
if (row.open24h != null) tsBuilder.setOpen24H(row.open24h);
if (row.high24h != null) tsBuilder.setHigh24H(row.high24h);
if (row.low24h != null) tsBuilder.setLow24H(row.low24h);
if (row.volume24h != null) tsBuilder.setVolume24H(row.volume24h);
if (row.numTrades != null) tsBuilder.setNumTrades(row.numTrades);
Double usdRate = usdRates.get(row.quoteAsset);
if (usdRate != null) {
tsBuilder.setStdQuoteVolume(row.quoteVolume24h * usdRate);
}
ticker24hBuilder.addTickers(tsBuilder.build());
}
byte[] protoBytes = ticker24hBuilder.build().toByteArray();
String clientId = batch.getClientId();
String topic = (clientId != null && !clientId.isEmpty())
? "RESPONSE:" + clientId
: exchangeId + "|ticker24h";
LOG.info("Built Ticker24h snapshot: exchange={}, tickers={}, bytes={}, topic={}",
exchangeId, rows.size(), protoBytes.length, topic);
out.collect(new Ticker24hWrapper(exchangeId, topic, protoBytes));
}
/**
* Build a USD rate for a quote currency.
* Returns null if no conversion path is known (fiat, or crypto with no available pair).
*/
private QuoteCurrencyRate buildRate(String currency, long timestampNs) {
if (USD_STABLECOINS.contains(currency)) {
return QuoteCurrencyRate.newBuilder()
.setCurrency(currency)
.setUsdRate(1.0)
.setSourceTicker("hardcoded")
.setTimestamp(timestampNs)
.build();
}
// Try priority exchanges first, then any remaining exchange
List<String> orderedExchanges = new ArrayList<>(EXCHANGE_PRIORITY);
for (String ex : exchangePriceIndex.keySet()) {
if (!orderedExchanges.contains(ex)) {
orderedExchanges.add(ex);
}
}
for (String exchange : orderedExchanges) {
Map<String, Double> priceMap = exchangePriceIndex.get(exchange);
if (priceMap == null) continue;
for (String stablecoin : Arrays.asList("USDT", "USDC")) {
String pairTicker = currency + "/" + stablecoin + "." + exchange;
Double price = priceMap.get(pairTicker);
if (price != null && price > 0) {
return QuoteCurrencyRate.newBuilder()
.setCurrency(currency)
.setUsdRate(price)
.setSourceTicker(pairTicker)
.setTimestamp(timestampNs)
.build();
}
}
}
LOG.debug("No USD conversion path for quote currency: {}", currency);
return null;
}
}
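
A worked example of the std_quote_volume normalization performed above, with illustrative numbers: ETH/BTC on Kraken reports a 24h quote volume of 850 BTC, and the cross-exchange index holds BTC/USDT.BINANCE at 65,000 (USDT is hardcoded to 1.0), so BTC resolves to a 65,000 USD rate via the priority list.

```java
double quoteVolume24h = 850.0;                      // in BTC (quote asset)
double usdRate = 65_000.0;                          // from BTC/USDT.BINANCE
double stdQuoteVolume = quoteVolume24h * usdRate;   // 55,250,000 — USD-normalized volume used for filtering
```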

View File

@@ -0,0 +1,78 @@
package com.dexorder.flink.quotes;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;
/**
* Flink sink that publishes Ticker24h snapshots to subscribers via ZMQ.
*
* Connects a ZMQ PUSH socket to the job manager's notification PULL endpoint.
* HistoryNotificationForwarder receives these frames and enqueues them to
* RealtimeSubscriptionManager, which publishes them on the MARKET_DATA_PUB XPUB socket.
* Clients subscribed to "{exchange_id}|ticker24h" receive the snapshot.
*
* Wire format (matches other notification publishers):
* Frame 1: topic bytes (e.g., "BINANCE|ticker24h")
* Frame 2: [0x01] (protocol version)
* Frame 3: [0x0D][Ticker24h protobuf bytes] (type 0x0D = TICKER_24H)
*
* Parallelism MUST be 1.
*/
public class Ticker24hPublisher extends RichSinkFunction<Ticker24hWrapper> {
private static final Logger LOG = LoggerFactory.getLogger(Ticker24hPublisher.class);
private static final long serialVersionUID = 1L;
private static final byte PROTOCOL_VERSION = 0x01;
private static final byte MSG_TYPE_TICKER_24H = 0x0D;
private final String jobManagerPullEndpoint;
private transient ZContext context;
private transient ZMQ.Socket pushSocket;
public Ticker24hPublisher(String jobManagerPullEndpoint) {
this.jobManagerPullEndpoint = jobManagerPullEndpoint;
}
@Override
public void open(Configuration parameters) {
context = new ZContext();
pushSocket = context.createSocket(SocketType.PUSH);
pushSocket.setLinger(1000);
pushSocket.setSndHWM(10000);
pushSocket.connect(jobManagerPullEndpoint);
LOG.info("Ticker24hPublisher PUSH connected to {}", jobManagerPullEndpoint);
}
@Override
public void invoke(Ticker24hWrapper wrapper, Context context) {
try {
byte[] protoBytes = wrapper.getProtoBytes();
byte[] messageFrame = new byte[1 + protoBytes.length];
messageFrame[0] = MSG_TYPE_TICKER_24H;
System.arraycopy(protoBytes, 0, messageFrame, 1, protoBytes.length);
String topic = wrapper.getZmqTopic();
pushSocket.sendMore(topic.getBytes(ZMQ.CHARSET));
pushSocket.sendMore(new byte[]{PROTOCOL_VERSION});
pushSocket.send(messageFrame, 0);
LOG.info("Published Ticker24h snapshot: topic={}, bytes={}", topic, protoBytes.length);
} catch (Exception e) {
LOG.error("Failed to publish Ticker24h: exchange={}", wrapper.getExchangeId(), e);
}
}
@Override
public void close() {
if (pushSocket != null) pushSocket.close();
if (context != null) context.close();
LOG.info("Ticker24hPublisher closed");
}
}
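
A sketch of the subscriber side decoding the three-frame layout described above. The endpoint and class name are placeholders; the frame layout follows the wire format documented in the publisher.

```java
import com.dexorder.proto.Ticker24h;
import org.zeromq.SocketType;
import org.zeromq.ZContext;
import org.zeromq.ZMQ;

public class Ticker24hSubscriberSketch {
    public static void main(String[] args) throws Exception {
        try (ZContext ctx = new ZContext()) {
            ZMQ.Socket sub = ctx.createSocket(SocketType.SUB);
            sub.connect("tcp://localhost:5556");                   // placeholder MARKET_DATA_PUB endpoint
            sub.subscribe("BINANCE|ticker24h".getBytes(ZMQ.CHARSET));

            byte[] topic = sub.recv();                              // Frame 1: topic
            byte[] version = sub.recv();                            // Frame 2: [0x01]
            byte[] payload = sub.recv();                            // Frame 3: [0x0D][Ticker24h proto]
            if (version[0] == 0x01 && payload[0] == 0x0D) {
                Ticker24h snapshot = Ticker24h.parseFrom(
                        java.util.Arrays.copyOfRange(payload, 1, payload.length));
                System.out.println(new String(topic, ZMQ.CHARSET) + ": " + snapshot.getTickersCount() + " tickers");
            }
        }
    }
}
```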

View File

@@ -0,0 +1,71 @@
package com.dexorder.flink.quotes;
import com.dexorder.flink.ingestor.IngestorBroker;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
/**
* Schedules periodic TICKER_SNAPSHOT requests for all configured exchanges.
*
* Fires immediately on startup, then at the top of each hour.
* The IngestorBroker dispatches the requests to ingestor workers, which call
* fetchTickers() and publish TickerBatch messages to the market-ticker Kafka topic.
*/
public class Ticker24hScheduler {
private static final Logger LOG = LoggerFactory.getLogger(Ticker24hScheduler.class);
private final IngestorBroker broker;
private final List<String> exchanges;
private final ScheduledExecutorService scheduler;
public Ticker24hScheduler(IngestorBroker broker, List<String> exchanges) {
this.broker = broker;
this.exchanges = exchanges;
this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
Thread t = new Thread(r, "Ticker24hScheduler");
t.setDaemon(true);
return t;
});
}
public void start() {
// Fire immediately for all exchanges
fireAll();
// Schedule next firing at top of next hour, then every hour after that
long delayMs = msUntilNextHour();
scheduler.scheduleAtFixedRate(this::fireAll, delayMs, 3_600_000L, TimeUnit.MILLISECONDS);
long delayMin = delayMs / 60_000;
LOG.info("Ticker24hScheduler started: fired immediately for {}, next firing in ~{}min",
exchanges, delayMin);
}
public void stop() {
scheduler.shutdownNow();
LOG.info("Ticker24hScheduler stopped");
}
private void fireAll() {
LOG.info("Ticker24hScheduler firing TICKER_SNAPSHOT requests for exchanges: {}", exchanges);
for (String exchange : exchanges) {
try {
broker.submitTicker24hRequest(exchange);
} catch (Exception e) {
LOG.error("Failed to submit TICKER_SNAPSHOT for exchange={}", exchange, e);
}
}
}
/** Milliseconds until the next full-hour boundary (e.g., 14:00:00.000). */
private static long msUntilNextHour() {
long now = System.currentTimeMillis();
long nextHour = (now / 3_600_000L + 1) * 3_600_000L;
return nextHour - now;
}
}
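
Usage sketch — the broker instance and exchange list are placeholders — plus a worked timing example for msUntilNextHour().

```java
Ticker24hScheduler scheduler = new Ticker24hScheduler(broker, List.of("BINANCE", "COINBASE", "KRAKEN"));
scheduler.start();   // fires immediately, then at 15:00:00.000, 16:00:00.000, ...

// e.g. now = 14:23:10.500 → msUntilNextHour() = 36 * 60_000 + 49_500 = 2_209_500 ms (~36.8 min)
```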

View File

@@ -0,0 +1,24 @@
package com.dexorder.flink.quotes;
import java.io.Serializable;
/**
* Wrapper for a serialized Ticker24h proto message, ready for ZMQ publication.
*/
public class Ticker24hWrapper implements Serializable {
private static final long serialVersionUID = 1L;
private final String exchangeId;
private final String zmqTopic; // "RESPONSE:{clientId}" or "{exchange}|ticker24h"
private final byte[] protoBytes; // serialized Ticker24h proto
public Ticker24hWrapper(String exchangeId, String zmqTopic, byte[] protoBytes) {
this.exchangeId = exchangeId;
this.zmqTopic = zmqTopic;
this.protoBytes = protoBytes;
}
public String getExchangeId() { return exchangeId; }
public String getZmqTopic() { return zmqTopic; }
public byte[] getProtoBytes() { return protoBytes; }
}

View File

@@ -0,0 +1,95 @@
package com.dexorder.flink.quotes;
import com.dexorder.proto.TickerBatch;
import com.dexorder.proto.TickerStats;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
/**
* Kafka deserializer for TickerBatch protobuf messages from the market-ticker topic.
* Wire format: [0x01 version][0x0C type][protobuf bytes]
*/
public class TickerBatchDeserializer implements DeserializationSchema<TickerBatchWrapper> {
private static final long serialVersionUID = 1L;
private static final Logger LOG = LoggerFactory.getLogger(TickerBatchDeserializer.class);
private static final byte PROTOCOL_VERSION = 0x01;
private static final byte MSG_TYPE_TICKER_BATCH = 0x0C;
@Override
public TickerBatchWrapper deserialize(byte[] message) throws IOException {
try {
if (message.length < 2) {
throw new IOException("Message too short: " + message.length + " bytes");
}
byte version = message[0];
if (version != PROTOCOL_VERSION) {
throw new IOException("Unsupported protocol version: " + version);
}
byte messageType = message[1];
if (messageType != MSG_TYPE_TICKER_BATCH) {
throw new IOException("Unexpected message type: 0x" + Integer.toHexString(messageType & 0xFF));
}
byte[] protoPayload = new byte[message.length - 2];
System.arraycopy(message, 2, protoPayload, 0, protoPayload.length);
TickerBatchWrapper wrapper = parseTickerBatch(protoPayload);
LOG.debug("Deserialized TickerBatch: exchange={}, tickers={}",
wrapper.getExchangeId(), wrapper.getTickerCount());
return wrapper;
} catch (Exception e) {
LOG.error("Failed to deserialize TickerBatch", e);
throw new IOException("Failed to deserialize TickerBatch", e);
}
}
private TickerBatchWrapper parseTickerBatch(byte[] payload) throws Exception {
TickerBatch batch = TickerBatch.parseFrom(payload);
List<TickerBatchWrapper.TickerStatsRow> rows = new ArrayList<>(batch.getTickersCount());
for (TickerStats ts : batch.getTickersList()) {
rows.add(new TickerBatchWrapper.TickerStatsRow(
ts.getTicker(),
ts.getExchangeId(),
ts.getBaseAsset(),
ts.getQuoteAsset(),
ts.getLastPrice(),
ts.getPriceChangePct(),
ts.getQuoteVolume24H(),
ts.getTimestamp(),
ts.hasBidPrice() ? ts.getBidPrice() : null,
ts.hasAskPrice() ? ts.getAskPrice() : null,
ts.hasOpen24H() ? ts.getOpen24H() : null,
ts.hasHigh24H() ? ts.getHigh24H() : null,
ts.hasLow24H() ? ts.getLow24H() : null,
ts.hasVolume24H() ? ts.getVolume24H() : null,
ts.hasNumTrades() ? ts.getNumTrades() : null
));
}
return new TickerBatchWrapper(
batch.getExchangeId(), rows, batch.getFetchedAt(),
batch.hasClientId() ? batch.getClientId() : "",
batch.hasRequestId() ? batch.getRequestId() : "");
}
@Override
public boolean isEndOfStream(TickerBatchWrapper nextElement) {
return false;
}
@Override
public TypeInformation<TickerBatchWrapper> getProducedType() {
return TypeInformation.of(TickerBatchWrapper.class);
}
}
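
For reference, a sketch of the producer-side framing this deserializer expects on the market-ticker topic; tickerBatch is a placeholder for a built com.dexorder.proto.TickerBatch message.

```java
byte[] proto = tickerBatch.toByteArray();
byte[] framed = new byte[proto.length + 2];
framed[0] = 0x01;                                   // protocol version
framed[1] = 0x0C;                                   // MSG_TYPE_TICKER_BATCH
System.arraycopy(proto, 0, framed, 2, proto.length);
// framed is the Kafka record value; key/partitioning are outside this sketch
```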

View File

@@ -0,0 +1,86 @@
package com.dexorder.flink.quotes;
import java.io.Serializable;
import java.util.List;
/**
* POJO wrapper for TickerBatch Kafka messages from market-ticker topic.
* Unwraps the protobuf into plain Java fields for Flink processing.
*/
public class TickerBatchWrapper implements Serializable {
private static final long serialVersionUID = 1L;
private final String exchangeId;
private final List<TickerStatsRow> tickers;
private final long fetchedAt; // nanoseconds
private final String clientId; // non-empty = client-initiated; "" = scheduled broadcast
private final String requestId; // echoed for tracing
public TickerBatchWrapper(String exchangeId, List<TickerStatsRow> tickers, long fetchedAt,
String clientId, String requestId) {
this.exchangeId = exchangeId;
this.tickers = tickers;
this.fetchedAt = fetchedAt;
this.clientId = clientId != null ? clientId : "";
this.requestId = requestId != null ? requestId : "";
}
public String getExchangeId() { return exchangeId; }
public List<TickerStatsRow> getTickers() { return tickers; }
public long getFetchedAt() { return fetchedAt; }
public String getClientId() { return clientId; }
public String getRequestId() { return requestId; }
public int getTickerCount() { return tickers != null ? tickers.size() : 0; }
@Override
public String toString() {
return "TickerBatchWrapper{exchangeId='" + exchangeId + "', count=" + getTickerCount() + '}';
}
/**
* Single ticker stats row. Optional fields are null when the exchange did not provide them.
*/
public static class TickerStatsRow implements Serializable {
private static final long serialVersionUID = 1L;
public final String ticker;
public final String exchangeId;
public final String baseAsset;
public final String quoteAsset;
public final double lastPrice;
public final double priceChangePct;
public final double quoteVolume24h;
public final long timestamp; // nanoseconds
// Optional fields — null if not provided by exchange
public final Double bidPrice;
public final Double askPrice;
public final Double open24h;
public final Double high24h;
public final Double low24h;
public final Double volume24h;
public final Integer numTrades;
public TickerStatsRow(
String ticker, String exchangeId, String baseAsset, String quoteAsset,
double lastPrice, double priceChangePct, double quoteVolume24h, long timestamp,
Double bidPrice, Double askPrice,
Double open24h, Double high24h, Double low24h, Double volume24h,
Integer numTrades) {
this.ticker = ticker;
this.exchangeId = exchangeId;
this.baseAsset = baseAsset;
this.quoteAsset = quoteAsset;
this.lastPrice = lastPrice;
this.priceChangePct = priceChangePct;
this.quoteVolume24h = quoteVolume24h;
this.timestamp = timestamp;
this.bidPrice = bidPrice;
this.askPrice = askPrice;
this.open24h = open24h;
this.high24h = high24h;
this.low24h = low24h;
this.volume24h = volume24h;
this.numTrades = numTrades;
}
}
}

View File

@@ -79,10 +79,9 @@ public class IcebergOHLCSink {
// Emit one RowData for each OHLC row in the batch
for (OHLCBatchWrapper.OHLCRow row : batch.getRows()) {
    GenericRowData rowData = new GenericRowData(RowKind.INSERT, 19);

    // Natural key fields (ticker, period_seconds, timestamp)
    rowData.setField(0, StringData.fromString(ticker));
    rowData.setField(1, periodSeconds);
    rowData.setField(2, row.getTimestamp());

@@ -95,22 +94,26 @@ public class IcebergOHLCSink {
    // Volume data
    rowData.setField(7, row.getVolume());
    rowData.setField(8, row.getBuyVol());
    rowData.setField(9, row.getSellVol());

    // Timing data
    rowData.setField(10, row.getOpenTime());
    rowData.setField(11, null); // high_time — not provided by exchanges
    rowData.setField(12, null); // low_time — not provided by exchanges
    rowData.setField(13, row.getCloseTime());

    // Additional fields
    rowData.setField(14, null); // open_interest (futures only, not yet fetched)

    // Metadata fields
    rowData.setField(15, StringData.fromString(requestId));
    rowData.setField(16, ingestedAt);

    // Extended exchange fields (appended at end for backward-compatible schema evolution)
    rowData.setField(17, row.getNumTrades());
    rowData.setField(18, row.getQuoteVolume());

    out.collect(rowData);
}

View File

@@ -28,6 +28,16 @@ topics:
      compression.type: snappy
      cleanup.policy: delete

  # 24-hour rolling ticker snapshots for all symbols on an exchange.
  # Written by ingestors on TICKER_SNAPSHOT requests; consumed by Ticker24hConsumer.
  - name: market-ticker
    partitions: 3
    replication: 2
    config:
      retention.ms: 7200000      # 2 hours (hourly refresh; keep one backup)
      compression.type: snappy
      cleanup.policy: delete

  # Symbol metadata from ingestors
  - name: symbol-metadata
    partitions: 3

View File

@@ -38,10 +38,6 @@ SANDBOX_STORAGE_CLASS=standard
# Redis (for hot storage and session management)
REDIS_URL=redis://localhost:6379
# Qdrant (for RAG vector search)
QDRANT_URL=http://localhost:6333
QDRANT_API_KEY= # optional, leave empty for local dev
# Iceberg (for durable storage via REST catalog)
ICEBERG_CATALOG_URI=http://iceberg-catalog:8181
ICEBERG_NAMESPACE=gateway

View File

@@ -18,38 +18,18 @@ COPY src ./src
# Build (includes protobuf generation)
RUN npm run build

# Note: Python API files for research subagent are copied by bin/build script
# to src/harness/subagents/research/api-source/ before docker build

# Production image
FROM node:22-slim

WORKDIR /app

RUN apt-get update && apt-get install -y bash zstd ca-certificates && rm -rf /var/lib/apt/lists/*

# Create non-root user
RUN groupadd --gid 1001 nodejs && \
    useradd --uid 1001 --gid nodejs --shell /bin/bash --create-home nodejs && \
    chown -R nodejs:nodejs /app

# Copy package files
COPY package*.json ./

@@ -65,16 +45,14 @@ COPY protobuf ./protobuf
# Copy k8s templates (not included in TypeScript build)
COPY src/k8s/templates ./dist/k8s/templates

# Copy harness prompts (welcome.md, etc.)
COPY src/harness/prompts ./dist/harness/prompts

# Copy wiki knowledge base
COPY knowledge ./knowledge

# Copy agent prompt pages (agent-*.md, index.md, tools.md)
COPY prompt ./prompt

# Copy entrypoint script
COPY entrypoint.sh ./

View File

@@ -58,7 +58,6 @@ Multi-channel gateway with agent harness for the Dexorder AI platform.
- **Streaming responses**: Real-time chat with WebSocket and Telegram
- **Complex workflows**: LangGraph for stateful trading analysis (backtest → risk → approval)
- **Agent harness**: Stateless orchestrator (all context lives in user's MCP container)

## Container Management
@@ -91,9 +90,7 @@ Containers self-manage their lifecycle using the lifecycle sidecar (see `../life
- OpenAI GPT
- Google Gemini
- OpenRouter (one key for 300+ models)
- Redis (for session/hot storage)
- Kafka + Flink + Iceberg (for durable storage)

### Development
@@ -123,20 +120,7 @@ DEFAULT_MODEL_PROVIDER=anthropic
DEFAULT_MODEL=claude-sonnet-4-6
```

4. Run development server:
   ```bash
   npm run dev
   ```
@@ -217,138 +201,6 @@ ws.send(JSON.stringify({
**`GET /health`**
- Returns server health status
## Ollama Deployment Options
The gateway requires Ollama for embedding generation in RAG queries. You have two deployment options:
### Option 1: Ollama in Gateway Container (Recommended for simplicity)
Install Ollama directly in the gateway container. This keeps all dependencies local and simplifies networking.
**Dockerfile additions:**
```dockerfile
FROM node:22-slim
# Install Ollama
RUN curl -fsSL https://ollama.com/install.sh | sh
# Pull embedding model at build time
RUN ollama serve & \
sleep 5 && \
ollama pull all-minilm && \
pkill ollama
# ... rest of your gateway Dockerfile
```
**Start script (entrypoint.sh):**
```bash
#!/bin/bash
# Start Ollama in background
ollama serve &
# Start gateway
node dist/main.js
```
**Pros:**
- Simple networking (localhost:11434)
- No extra K8s resources
- Self-contained deployment
**Cons:**
- Larger container image (~200MB extra)
- CPU/memory shared with gateway process
**Resource requirements:**
- Add +200MB memory
- Add +0.2 CPU cores for embedding inference
### Option 2: Ollama as Separate Pod/Sidecar
Deploy Ollama as a separate container in the same pod (sidecar) or as its own deployment.
**K8s Deployment (sidecar pattern):**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: gateway
spec:
template:
spec:
containers:
- name: gateway
image: ghcr.io/dexorder/gateway:latest
env:
- name: OLLAMA_URL
value: http://localhost:11434
- name: ollama
image: ollama/ollama:latest
command: ["/bin/sh", "-c"]
args:
- |
ollama serve &
sleep 5
ollama pull all-minilm
wait
resources:
requests:
memory: "512Mi"
cpu: "500m"
limits:
memory: "1Gi"
cpu: "1000m"
```
**K8s Deployment (separate service):**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
spec:
replicas: 1
template:
spec:
containers:
- name: ollama
image: ollama/ollama:latest
# ... same as above
---
apiVersion: v1
kind: Service
metadata:
name: ollama
spec:
selector:
app: ollama
ports:
- port: 11434
```
Gateway `.env`:
```bash
OLLAMA_URL=http://ollama:11434
```
**Pros:**
- Isolated resource limits
- Can scale separately
- Easier to monitor/debug
**Cons:**
- More K8s resources
- Network hop (minimal latency)
- More complex deployment
### Recommendation
For most deployments: **Use Option 1 (in-container)** for simplicity, unless you need to:
- Share Ollama across multiple services
- Scale embedding inference independently
- Run Ollama on GPU nodes (gateway on CPU nodes)
## TODO

View File

@@ -58,11 +58,6 @@ kubernetes:
redis:
  url: redis://localhost:6379
# Qdrant (for RAG vector search)
qdrant:
url: http://localhost:6333
collection: gateway_memory
# Iceberg (for durable storage via REST catalog)
iceberg:
  catalog_uri: http://iceberg-catalog:8181

View File

@@ -0,0 +1,113 @@
# Details Edit Protocol
This document describes the WebSocket message protocol for reading and editing the `details` field of category items (indicators, strategies, research scripts) from the web client.
## Background
Every category item stored in the sandbox has a `details` field: a full markdown description of the implementation with enough detail that another coding agent could reproduce the code from it alone. The web client can display this field, allow the user to edit it in plain text, and submit the revised version — the gateway then diffs the old vs new details and instructs the appropriate subagent to update the Python code accordingly.
The `details` field is intentionally **filtered out of the workspace `_types` stores** (see `mcp-tool-wrapper.ts:filterTypeStoreState`) because it can be several kilobytes of markdown. The read/update protocol below provides direct, on-demand access.
---
## Message Flow
### 1. Read Details
**Client → Server**
```json
{
"type": "read_details",
"category": "indicator" | "strategy" | "research",
"name": "My Indicator Name"
}
```
**Server → Client (success)**
```json
{
"type": "details_data",
"category": "indicator",
"name": "My Indicator Name",
"details": "## My Indicator\n\nFull markdown description..."
}
```
**Server → Client (error)**
```json
{
"type": "details_error",
"category": "indicator",
"name": "My Indicator Name",
"error": "Item not found or has no details"
}
```
---
### 2. Submit Updated Details
**Client → Server**
```json
{
"type": "update_details",
"category": "indicator" | "strategy" | "research",
"name": "My Indicator Name",
"details": "## My Indicator\n\nRevised full markdown description..."
}
```
The gateway will:
1. Read the current `details` from the sandbox via `PythonRead`
2. Compute a unified diff between the old and new text
3. If no changes are detected, reply immediately with `details_updated` (success)
4. Otherwise, invoke the appropriate subagent (indicator / strategy / research) with instructions to update the Python code according to the diff, and also persist the new `details` text
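Steps 2 and 3 boil down to a diff-or-short-circuit check. The gateway implements this in TypeScript (`buildUnifiedDiff` in `agent-harness.ts`); the Python sketch below only illustrates the logic and is not the production code:
```python
import difflib

def build_details_diff(old: str, new: str, name: str) -> str | None:
    """Return a unified diff of the details text, or None when nothing changed
    (the no-op case that replies immediately with details_updated)."""
    if old == new:
        return None
    lines = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"{name} (current)",
        tofile=f"{name} (revised)",
    )
    return "".join(lines)
```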
**While the subagent is running**, the server streams progress events using the same event types as normal agent interactions:
```json
{ "type": "subagent_chunk", "agentName": "indicator", "content": "Reading current implementation..." }
{ "type": "subagent_tool_call", "agentName": "indicator", "toolName": "PythonRead", "label": "PythonRead" }
{ "type": "subagent_tool_call", "agentName": "indicator", "toolName": "PythonEdit", "label": "PythonEdit" }
{ "type": "subagent_chunk", "agentName": "indicator", "content": "Applied patch. Validation passed." }
```
**Server → Client (completion)**
```json
{
"type": "details_updated",
"category": "indicator",
"name": "My Indicator Name",
"success": true
}
```
or on failure:
```json
{
"type": "details_updated",
"category": "indicator",
"name": "My Indicator Name",
"success": false,
"error": "Failed to update details"
}
```
---
## Workspace Sync After Update
When the subagent calls `PythonEdit`, the sandbox returns a `_workspace_sync` payload in the MCP response. The gateway automatically applies this to the `{category}_types` workspace store and sends a WebSocket `patch` message to the client (the normal workspace sync path). The client should listen for these patches to refresh any UI that displays list metadata (name, description).
The `details` field itself is **not** in the workspace store — the client must call `read_details` again if it needs the refreshed details text after an update.
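For reference, a minimal client-side round trip looks like the sketch below. The real web client is TypeScript; this Python version (using the `websockets` package, with a placeholder gateway URL) only illustrates the message flow described above:
```python
import asyncio
import json
import websockets  # assumed client library for this sketch

async def read_details(url: str, category: str, name: str) -> str:
    """Send read_details and wait for the matching details_data / details_error."""
    async with websockets.connect(url) as ws:
        await ws.send(json.dumps({"type": "read_details", "category": category, "name": name}))
        while True:
            msg = json.loads(await ws.recv())
            if msg.get("type") == "details_data" and msg.get("name") == name:
                return msg["details"]
            if msg.get("type") == "details_error" and msg.get("name") == name:
                raise RuntimeError(msg.get("error", "unknown error"))

# details = asyncio.run(read_details("ws://localhost:8080/ws", "indicator", "VW RSI"))
```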
---
## Implementation Notes
| Component | File | Responsibility |
|---|---|---|
| WebSocket routing | `src/channels/websocket-handler.ts` | Parse `read_details` / `update_details`, stream subagent events, send `details_data` / `details_updated` |
| Harness methods | `src/harness/agent-harness.ts` | `readDetails()`, `streamDetailsUpdate()` |
| Diff utility | `src/harness/agent-harness.ts` | `buildUnifiedDiff()`, `computeLCS()` (module-level helpers) |
| Instruction builder | `src/harness/agent-harness.ts` | `buildDetailsUpdateInstruction()` |
| Details filter | `src/tools/mcp/mcp-tool-wrapper.ts` | `filterTypeStoreState()` — strips `details` before workspace sync |

View File

@@ -1,25 +1,6 @@
#!/bin/bash #!/bin/bash
set -e set -e
# Start Ollama server in background
echo "Starting Ollama server..."
ollama serve &
OLLAMA_PID=$!
# Wait for Ollama to be ready
echo "Waiting for Ollama to be ready..."
for i in {1..30}; do
if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
echo "Ollama is ready!"
break
fi
if [ $i -eq 30 ]; then
echo "Ollama failed to start within 30 seconds"
exit 1
fi
sleep 1
done
# Start the Node.js gateway application # Start the Node.js gateway application
echo "Starting gateway..." echo "Starting gateway..."
exec node dist/main.js exec node dist/main.js

View File

@@ -1,6 +1,6 @@
# Dexorder Knowledge Base # Dexorder Knowledge Base
This directory contains global knowledge documents that are automatically loaded into the RAG system as platform-wide knowledge (user_id="0"). This directory contains global knowledge documents that are automatically loaded into the agent's context at startup.
## Structure ## Structure
@@ -17,14 +17,21 @@ Documents should be in Markdown format with:
- Code examples where relevant - Code examples where relevant
- Cross-references to other docs - Cross-references to other docs
### Frontmatter Fields
`description` (required) — One or two sentences describing what the article covers. This is injected into every agent's system prompt as a KB catalog so agents know what to look up without making an extra tool call.
`tags` (optional) — List of topic tags for categorization.
### Example with Frontmatter ### Example with Frontmatter
```markdown ```markdown
--- ---
tags: [trading, risk-management, position-sizing] description: "Patterns for writing custom Python indicator scripts that compute values from OHLCV data and plot live on the chart."
tags: [indicators, python, development]
--- ---
# Risk Management # Custom Indicator Development
Content here... Content here...
``` ```
@@ -33,9 +40,7 @@ Content here...
1. At gateway startup, the DocumentLoader scans this directory 1. At gateway startup, the DocumentLoader scans this directory
2. Each markdown file is chunked by headers (max ~1000 tokens per chunk) 2. Each markdown file is chunked by headers (max ~1000 tokens per chunk)
3. Chunks are embedded using the configured embedding service 3. Content hash tracking enables incremental updates
4. Embeddings are stored in Qdrant with user_id="0" (global namespace)
5. Content hash tracking enables incremental updates
## Updating Documents ## Updating Documents
@@ -48,14 +53,6 @@ Content here...
- Deploy new version - Deploy new version
- Gateway will detect changes and update vectors automatically - Gateway will detect changes and update vectors automatically
## RAG Integration
When users query the agent:
1. Their query is embedded
2. Qdrant searches both global (user_id="0") and user-specific vectors
3. Relevant chunks from these docs are included in context
4. LLM generates response with platform knowledge
## Adding New Documents ## Adding New Documents
1. Create markdown file in appropriate subdirectory 1. Create markdown file in appropriate subdirectory
@@ -83,12 +80,3 @@ Check logs for load statistics:
``` ```
Knowledge documents loaded: { loaded: 5, updated: 2, skipped: 3 } Knowledge documents loaded: { loaded: 5, updated: 2, skipped: 3 }
``` ```
Monitor Qdrant collection stats:
```
GET /health
{
"qdrantVectors": 1234,
"qdrantIndexed": 1234
}
```

View File

@@ -1,3 +1,7 @@
---
description: "Complete Python API reference (DataAPI and ChartingAPI) with full source and docstrings for use in research scripts."
---
# Dexorder Research API Reference # Dexorder Research API Reference
This file contains the complete Python API source code with full docstrings. This file contains the complete Python API source code with full docstrings.
@@ -124,11 +128,14 @@ class DataAPI(ABC):
- pandas Timestamp: pd.Timestamp("2021-12-20") - pandas Timestamp: pd.Timestamp("2021-12-20")
end_time: End of time range. Same formats as start_time. end_time: End of time range. Same formats as start_time.
extra_columns: Optional additional columns to include beyond the standard extra_columns: Optional additional columns to include beyond the standard
OHLC columns. Available options: OHLC columns. Available options (all populated for Binance data):
- "volume" - Total volume (decimal float) - "volume" - Total base-asset volume (decimal float)
- "buy_vol" - Buy-side volume (decimal float) - "buy_vol" - Taker buy volume in base asset (decimal float)
- "sell_vol" - Sell-side volume (decimal float) - "sell_vol" - Taker sell volume in base asset (decimal float)
- "open_time", "high_time", "low_time", "close_time" (timestamps) - "quote_volume" - Total quote-asset volume, e.g. USDT (decimal float)
- "num_trades" - Number of trades in the candle (integer)
- "open_time", "close_time" (nanosecond timestamps; Binance only)
- "high_time", "low_time" (not provided by any exchange; always null)
- "open_interest" (for futures markets) - "open_interest" (for futures markets)
- "ticker", "period_seconds" - "ticker", "period_seconds"
@@ -201,8 +208,8 @@ class DataAPI(ABC):
length: Number of most recent candles to return (default: 1) length: Number of most recent candles to return (default: 1)
extra_columns: Optional list of additional column names to include. extra_columns: Optional list of additional column names to include.
Same column options as historical_ohlc: Same column options as historical_ohlc:
- "volume", "buy_vol", "sell_vol" - "volume", "buy_vol", "sell_vol", "quote_volume", "num_trades"
- "open_time", "high_time", "low_time", "close_time" - "open_time", "close_time", "high_time", "low_time"
- "open_interest", "ticker", "period_seconds" - "open_interest", "ticker", "period_seconds"
Returns: Returns:
@@ -240,6 +247,63 @@ class DataAPI(ABC):
""" """
pass pass
@abstractmethod
async def get_ticker_24h(
self,
exchange: str,
limit: Optional[int] = None,
min_std_quote_volume: Optional[float] = None,
market_type: Optional[str] = None,
base_asset_contains: Optional[str] = None,
) -> pd.DataFrame:
"""
Retrieve 24h rolling market stats for all symbols on an exchange.
Data is refreshed hourly by the ingestor pipeline. Use this to build a
pre-filtered symbol universe before running a scanner — it avoids requesting
per-symbol OHLC data for thousands of symbols.
Args:
exchange: Exchange name (e.g., "BINANCE", "COINBASE", "KRAKEN")
limit: If set, return only the top N symbols by volume. None = return all.
min_std_quote_volume: Exclude symbols with USD volume below this threshold.
market_type: Filter by market type: "spot" or "perp". None = return all.
base_asset_contains: Filter to symbols whose base asset contains this string
(case-insensitive). E.g., "BTC" matches "BTC/USDT".
Returns:
DataFrame sorted by std_quote_volume descending (NULLs last). Columns:
- ticker: Full ticker (e.g., "BTC/USDT.BINANCE")
- exchange_id: Exchange name
- base_asset: Base currency (e.g., "BTC")
- quote_asset: Quote currency (e.g., "USDT")
- last_price: Last traded price in quote currency
- price_change_pct: 24h price change as percentage (e.g. 2.5 = +2.5%)
- quote_volume_24h: Raw 24h volume in quote asset
- std_quote_volume: quote_volume_24h normalized to USD (NaN if conversion unknown)
- bid_price, ask_price: Current best bid/ask (NaN if not provided by exchange)
- open_24h, high_24h, low_24h: 24h OHLC prices (NaN if not provided)
- volume_24h: Base-asset volume (NaN if not provided)
- num_trades: 24h trade count (NaN if not provided)
- timestamp_ms: Snapshot timestamp in milliseconds
Returns empty DataFrame if no data is available (e.g., not yet fetched).
Examples:
# Top 50 most liquid Binance spot symbols
df = await api.data.get_ticker_24h("BINANCE", limit=50, market_type="spot")
# All BTC pairs with at least $10M daily volume
df = await api.data.get_ticker_24h("BINANCE",
base_asset_contains="BTC",
min_std_quote_volume=10_000_000)
# Build a scanner universe: all Binance symbols, sorted by volume
universe = await api.data.get_ticker_24h("BINANCE")
top_100 = universe.head(100)["ticker"].tolist()
"""
pass
``` ```
@@ -495,4 +559,6 @@ __all__ = ['API', 'ChartingAPI', 'DataAPI', 'get_api', 'set_api']
--- ---
For practical usage patterns and complete working examples, see `usage-examples.md`. For practical usage patterns and complete working examples, see [`usage-examples.md`](usage-examples.md).
For the pandas-ta indicator catalog used in research scripts, see [`pandas-ta-reference.md`](pandas-ta-reference.md).

View File

@@ -0,0 +1,168 @@
---
description: "API and patterns for writing custom Python indicator scripts that compute values from OHLCV data and plot live on the chart."
---
# Custom Indicator Development
Custom indicators are Python scripts saved in the `indicator` category. They compute values from OHLCV data and are plotted live on the TradingView chart alongside built-in indicators.
See [`pandas-ta-reference`](pandas-ta-reference.md) for the full catalog of built-in indicators available via `pandas_ta`.
---
## Function Signature
A custom indicator must define a **top-level function** whose name is the lowercase, snake_case form of the `name` passed to `PythonWrite`. For example, `name="VW RSI"` → function `def vw_rsi(...)`.
The function receives the OHLCV columns listed in `input_series` as positional arguments and must return either:
- A `pd.Series` (single-output indicator), or
- A `pd.DataFrame` with column names matching `output_columns` in the metadata (multi-output)
```python
import pandas as pd
import pandas_ta as ta
# Single-output: volume-weighted RSI
def vw_rsi(close: pd.Series, volume: pd.Series, length: int = 14) -> pd.Series:
    rsi = ta.rsi(close, length=length)
    vol_weight = volume / volume.rolling(length).mean()
    return (rsi * vol_weight).rolling(3).mean()
```
```python
import pandas as pd
import pandas_ta as ta
# Multi-output: custom Bollinger Bands
def vol_bands(close: pd.Series, length: int = 20, std: float = 2.0) -> pd.DataFrame:
    bb = ta.bbands(close, length=length, std=std)
    return pd.DataFrame({
        "upper": bb.iloc[:, 2],
        "mid": bb.iloc[:, 1],
        "lower": bb.iloc[:, 0],
    })
```
**Always use `pandas_ta` for standard indicator calculations.** Never write manual `rolling().mean()` or `ewm()` implementations — use `ta.sma()`, `ta.ema()`, `ta.rsi()`, etc.
---
## Required Metadata
When writing a custom indicator with `PythonWrite`, supply complete metadata so the web client can build the TradingView plotter automatically:
```python
PythonWrite(
    category="indicator",
    name="VW RSI",
    description="RSI weighted by relative volume.",
    details="""## Volume-Weighted RSI
Computes RSI on close prices, scales by relative volume, applies 3-bar smoothing.
**Formula:** (rsi * (volume / volume.rolling(length).mean())).rolling(3).mean()
**Inputs:** close, volume
**Output:** single Series — smoothed volume-weighted RSI (separate pane)
**Parameters:** length (int, default 14)""",
    code="""...""",
    metadata={
        "parameters": {
            "length": {"type": "int", "default": 14, "min": 2, "max": 200, "description": "RSI period"}
        },
        "input_series": ["close", "volume"],
        "output_columns": [
            {"name": "value", "display_name": "VW-RSI", "plot": {"style": 0}}
        ],
        "pane": "separate"  # "price" = overlay on candles; "separate" = sub-pane
    }
)
```
### Plot styles
| Value | Renders as |
|---|---|
| `0` | Line (default) |
| `1` | Histogram bars |
| `4` | Area (filled under line) |
| `5` | Columns (vertical bars) |
| `9` | Step line |
### Filled areas (shaded bands)
To shade between two output series (e.g. upper/lower bands), add a `filled_areas` list. The two bounding series must appear at consecutive even/odd positions in `output_columns`:
```python
"filled_areas": [
{"id": "fill", "type": "plot_plot", "series1": "upper", "series2": "lower",
"color": "#2196F3", "opacity": 0.08}
]
```
---
## Workflow
1. **Check for existing indicators** before writing: `PythonList(category="indicator")`. If one already exists with the same sanitized name, update it with `PythonEdit` rather than creating a duplicate.
2. **Write** with `PythonWrite(category="indicator", ...)`. The system automatically runs the script against synthetic test data to catch compile/runtime errors — no separate validation call needed.
3. **Add to workspace** with `WorkspacePatch("indicators", ...)` using `pandas_ta_name: "custom_<sanitized_name>"`. Include `custom_metadata` in the patch value so the web client can render it.
4. **Use in strategies** via `ta.custom_<sanitized_name>(...)`. See [`strategy-development`](strategy-development.md) for details.
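For step 3, the patch value is an entry in the workspace `indicators` store described in `platform/workspace`. A sketch of what that value might look like for the VW RSI example (field names follow the workspace schema; the exact JSON Patch path is an assumption):
```python
import time

instance = {
    "id": "vw-rsi-1",
    "pandas_ta_name": "custom_vw_rsi",
    "instance_name": "VW RSI (14)",
    "parameters": {"length": 14},
    "visible": True,
    "pane": "separate",
    "symbol": "BTC/USDT.BINANCE",
    "created_at": int(time.time()),
    "custom_metadata": {
        "display_name": "VW RSI",
        "parameters": {"length": {"type": "int", "default": 14}},
        "input_series": ["close", "volume"],
        "output_columns": [{"name": "value", "display_name": "VW-RSI", "plot": {"style": 0}}],
        "pane": "separate",
    },
}
patch = [{"op": "add", "path": "/vw-rsi-1", "value": instance}]
# WorkspacePatch("indicators", patch)
```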
---
## Naming Conventions
The workspace `pandas_ta_name` is `"custom_"` + the sanitized indicator name. Sanitization: lowercase + spaces/hyphens → underscores. For example:
| `name` | function name | `pandas_ta_name` |
|---|---|---|
| `"VW RSI"` | `vw_rsi` | `custom_vw_rsi` |
| `"TrendFlex"` | `trendflex` | `custom_trendflex` |
| `"Vol-Bands"` | `vol_bands` | `custom_vol_bands` |
Two names that sanitize to the same value will conflict — check with `PythonList` first.
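One way to reproduce the sanitization rule in code (a sketch; the platform's exact implementation may differ in edge cases):
```python
import re

def sanitize(name: str) -> str:
    """Lowercase and convert runs of spaces/hyphens to underscores."""
    return re.sub(r"[ \-]+", "_", name.strip().lower())

assert sanitize("VW RSI") == "vw_rsi"
assert sanitize("TrendFlex") == "trendflex"
assert sanitize("Vol-Bands") == "vol_bands"
```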
---
## Common Pitfalls
### Look-ahead bias
Never use future data in the computation. Indicator values for bar N may only depend on data available at bar N or earlier.
```python
# WRONG — uses future price
signal = close.shift(-1) > close
# CORRECT — only past data
signal = close > close.shift(1)
```
### Repainting
Indicator values for already-closed bars should not change as new bars arrive. Avoid calculations whose sliding window can retroactively revise values for bars that have already closed.
### NaN handling
Indicators need a warm-up period. The first `length - 1` values will be `NaN`. Strategies that consume custom indicators should guard with:
```python
if vw_rsi.isna().all() or len(df) < min_required:
    return
```
### Overfitting
- Keep indicator logic simple and parameter-lean
- Validate on out-of-sample data, not the same window used to tune parameters
- Prefer indicators with a clear mechanical rationale over curve-fit formulas
---
## See Also
- [`pandas-ta-reference`](pandas-ta-reference.md) — Full catalog of built-in indicators and calling conventions
- [`api-reference`](api-reference.md) — DataAPI and ChartingAPI for research scripts
- [`strategy-development`](strategy-development.md) — Using custom indicators in strategies via `ta.custom_*`

View File

@@ -1,142 +0,0 @@
# Indicator Development Guide
Custom indicators in Dexorder are Python functions that process OHLCV data and return signals or values.
## Indicator Structure
```python
def my_indicator(df, **params):
"""
Calculate custom indicator
Args:
df: DataFrame with columns [open, high, low, close, volume]
**params: Indicator parameters
Returns:
Series or DataFrame with indicator values
"""
# Implementation
return result
```
## Common Patterns
### Simple Moving Average
```python
def sma(df, period=20):
return df['close'].rolling(window=period).mean()
```
### Exponential Moving Average
```python
def ema(df, period=20):
return df['close'].ewm(span=period, adjust=False).mean()
```
### RSI (Relative Strength Index)
```python
def rsi(df, period=14):
delta = df['close'].diff()
gain = delta.where(delta > 0, 0).rolling(window=period).mean()
loss = -delta.where(delta < 0, 0).rolling(window=period).mean()
rs = gain / loss
return 100 - (100 / (1 + rs))
```
### MACD
```python
def macd(df, fast=12, slow=26, signal=9):
ema_fast = df['close'].ewm(span=fast).mean()
ema_slow = df['close'].ewm(span=slow).mean()
macd_line = ema_fast - ema_slow
signal_line = macd_line.ewm(span=signal).mean()
histogram = macd_line - signal_line
return pd.DataFrame({
'macd': macd_line,
'signal': signal_line,
'histogram': histogram
})
```
## Best Practices
### Data Handling
- Always validate input DataFrame has required columns
- Handle NaN values appropriately
- Use `.copy()` to avoid modifying original data
- Consider edge cases (not enough data, etc.)
### Performance
- Vectorize operations when possible (avoid loops)
- Use pandas/numpy built-in functions
- Cache expensive calculations
- Test on large datasets
### Parameters
- Provide sensible defaults
- Document parameter ranges
- Validate parameter values
- Consider optimization bounds
### Testing
```python
def test_indicator():
# Create sample data
df = pd.DataFrame({
'close': [100, 102, 101, 103, 105]
})
# Test calculation
result = my_indicator(df, param=10)
# Validate output
assert not result.isna().all()
assert len(result) == len(df)
```
## Common Pitfalls
### Look-Ahead Bias
Never use future data:
```python
# WRONG - uses future data
df['signal'] = df['close'].shift(-1) > df['close']
# CORRECT - only past data
df['signal'] = df['close'] > df['close'].shift(1)
```
### Repainting
Indicator values should not change for closed bars:
```python
# Ensure calculations are based on closed candles
# Avoid using unstable data sources
```
### Overfitting
- Don't optimize on same data you test on
- Use separate train/validation/test sets
- Walk-forward analysis for robustness
- Simple is often better than complex
## Integration with Strategies
Indicators are used in strategy signals:
```python
def my_strategy(df):
# Calculate indicators
df['rsi'] = rsi(df, period=14)
df['sma_fast'] = sma(df, period=20)
df['sma_slow'] = sma(df, period=50)
# Generate signals
df['signal'] = 0
df.loc[(df['rsi'] < 30) & (df['sma_fast'] > df['sma_slow']), 'signal'] = 1
df.loc[(df['rsi'] > 70) & (df['sma_fast'] < df['sma_slow']), 'signal'] = -1
return df
```
Store indicators in your git repository under `indicators/` directory.

View File

@@ -1,5 +1,11 @@
---
description: "Full catalog of technical indicators available via pandas-ta, with parameters and usage for research scripts and custom indicators."
---
# pandas-ta Reference for Research Scripts # pandas-ta Reference for Research Scripts
This catalog applies to both research scripts and custom indicators. For usage in research scripts see [`usage-examples.md`](usage-examples.md). For writing custom indicator scripts (with metadata for the TradingView plotter) see [`indicators/indicator-development.md`](indicators/indicator-development.md).
The sandbox environment uses **pandas-ta** as the standard indicator library. Always use it for technical indicator calculations; do not write manual rolling/ewm implementations. The sandbox environment uses **pandas-ta** as the standard indicator library. Always use it for technical indicator calculations; do not write manual rolling/ewm implementations.
```python ```python

View File

@@ -1,71 +0,0 @@
# Agent System Architecture
The Dexorder AI platform uses a sophisticated agent harness that orchestrates between user interactions, LLM models, and user-specific tools.
## Core Components
### Gateway
Multi-channel gateway supporting:
- WebSocket connections for web/mobile
- Telegram integration
- Real-time event streaming
### Agent Harness
Stateless orchestrator that:
1. Fetches context from user's MCP server
2. Routes to appropriate LLM model based on license
3. Calls LLM with embedded context
4. Routes tool calls to user's MCP or platform tools
5. Saves conversation history back to MCP
### Memory Architecture
Three-tier storage system:
- **Redis**: Hot state for active sessions and checkpoints
- **Qdrant**: Vector search for RAG and semantic memory
- **Iceberg**: Cold storage for durable conversations and analytics
### User Context
Every interaction includes:
- User ID and license information
- Active channel (websocket, telegram, etc.)
- Channel capabilities (markdown, images, buttons)
- Conversation history
- Relevant memories from RAG
- Workspace state
## Skills vs Subagents
### Skills
Self-contained capabilities for specific tasks:
- Market analysis
- Strategy validation
- Indicator development
- Defined in markdown + TypeScript
- Use when task is well-defined and scoped
### Subagents
Specialized agents with dedicated memory:
- Code reviewer with review guidelines
- Risk analyzer with risk models
- Multi-file knowledge base
- Custom system prompts
- Use when domain expertise is needed
## Global vs User Memory
### Global Memory (user_id="0")
Platform-wide knowledge available to all users:
- Trading concepts and terminology
- Platform capabilities
- Indicator documentation
- Strategy patterns
- Best practices
### User Memory
Personal context specific to each user:
- Conversation history
- Preferences and trading style
- Custom indicators and strategies
- Workspace state
All RAG queries automatically search both global and user-specific memories.

View File

@@ -0,0 +1,4 @@
---
description: "Platform documentation index: workspace API reference, chart shape types, and MCP sandbox lifecycle."
tags: [platform, index]
---

View File

@@ -1,88 +1,21 @@
# Model Context Protocol (MCP) Integration ---
description: "User sandbox lifecycle, persistent script storage categories, and session management for indicator, strategy, and research scripts."
---
Dexorder uses the Model Context Protocol for user-specific tool execution and state management. # User Sandbox
## Container Architecture Each user has a dedicated sandbox environment that persists their data across sessions.
Each user has a dedicated Kubernetes pod running: ## Persistent Storage
- **Agent Container**: Python environment with conda packages
- **Lifecycle Sidecar**: Manages container lifecycle and communication
- **Persistent Storage**: User's git repository with indicators/strategies
## Authentication Modes User scripts (indicators, strategies, research) are stored in a git repository inside the user's sandbox. They survive session disconnects and reconnections.
Three MCP authentication modes: - Indicators are in the `indicator` category and can be listed with `PythonList(category="indicator")`
- Strategies are in the `strategy` category and can be listed with `PythonList(category="strategy")`
- Research scripts are in the `research` category and can be listed with `PythonList(category="research")`
### 1. Public Mode (Free Tier) ## Session Lifecycle
- No authentication required
- Container creates anonymous session
- Limited to read-only resources
- Session expires after timeout
### 2. Gateway Auth Mode (Standard) - Sandbox starts automatically when the user connects
- Gateway authenticates user - Cold start takes a few seconds if the sandbox was idle
- Passes verified user ID to container - All workspace state and scripts are preserved across reconnects
- Container trusts gateway's authentication
- Full access to user's tools and data
### 3. Direct Auth Mode (Enterprise)
- User authenticates directly with container
- Gateway forwards encrypted credentials
- Container validates credentials independently
- Highest security for sensitive operations
## MCP Resources
The container exposes standard resources:
### context://user-profile
User preferences and trading style
### context://conversation-summary
Recent conversation context and history
### context://workspace-state
Current chart, indicators, and analysis state
### context://system-prompt
User's custom agent instructions
### indicators://list
Available indicators with signatures
### strategies://list
User's trading strategies
## Tool Execution Flow
1. User sends message to gateway
2. Gateway queries user's MCP resources for context
3. LLM generates response with tool calls
4. Gateway routes tool calls:
- Platform tools → handled by gateway
- User tools → proxied to MCP container
5. Tool results returned to LLM
6. Final response sent to user
7. Conversation saved to MCP container
## Container Lifecycle
### Startup
1. Gateway receives user connection
2. Checks if container exists
3. Creates pod if needed (cold start ~5-10s)
4. Waits for container ready
5. Establishes MCP connection
### Active
- Container stays alive during active session
- Receives tool calls via MCP
- Maintains workspace state
- Saves files to persistent storage
### Shutdown
- Free users: timeout after 15 minutes idle
- Paid users: longer timeout based on license
- Graceful shutdown saves state
- Persistent storage retained
- Fast restart on next connection

View File

@@ -0,0 +1,131 @@
---
description: "Chart shape types (trend lines, Fibonacci, rectangles, channels, etc.), point requirements, override properties, and WorkspacePatch patterns for adding/modifying/deleting shapes on the TradingView chart."
---
# Chart Shapes
Shapes are persistent TradingView chart drawings stored in the `shapes` workspace store. Read them with `WorkspaceRead("shapes")` and create/modify/delete them with `ShapesMutate`. Do **not** use `WorkspacePatch` for shapes — it requires knowledge of the internal path structure and is error-prone.
Always read `chartState` first to get the current `symbol` and visible `start_time`/`end_time` for placing points correctly.
---
## Shape Object
```json
{
"id": "string — unique ID you assign (e.g. 'trendline-btc-1')",
"type": "string — see type table below",
"points": [{ "time": 1700000000, "price": 45000.0, "channel": "optional" }],
"color": "#2962FF",
"line_width": 2,
"line_style": "solid",
"properties": {},
"symbol": "BTC/USDT.BINANCE",
"created_at": 1700000000,
"modified_at": 1700000000
}
```
- `line_style`: `"solid"` | `"dashed"` | `"dotted"`
- `properties`: passed directly as TradingView overrides (see [Drawings Overrides](https://www.tradingview.com/charting-library-docs/latest/customization/overrides/Drawings-Overrides))
- `time` values must be Unix timestamps in **seconds**; they are snapped to the nearest candle boundary automatically
---
## Supported Shape Types
| `type` | Description | Points |
|---|---|---|
| `trend_line` | Trend line between two price/time points | 2 |
| `horizontal_line` | Horizontal price level across the chart | 1 |
| `vertical_line` | Vertical time marker | 1 |
| `rectangle` | Price/time rectangle (two corners) | 2 |
| `circle` | Circle centered at first point, edge at second | 2 |
| `arrow` | Arrow from point 1 to point 2 | 2 |
| `fib_retracement` | Fibonacci retracement levels between two points | 2 |
| `fib_trend_ext` | Trend-based Fibonacci extension (A→B→C) | 3 |
| `parallel_channel` | Parallel channel (two-line + channel width) | 3 |
| `pitchfork` | Andrews pitchfork (handle + two tines) | 3 |
| `gannbox_fan` | Gann fan from a pivot point | 2 |
| `path` | Free-form polyline through 2+ points | 2+ |
| `text` | Text label anchored at a price/time location | 1 |
| `head_and_shoulders` | Head and shoulders pattern overlay | 7 |
> For the full TradingView drawing catalog (including Elliott waves, patterns, annotations, etc.) see [Drawings List](https://www.tradingview.com/charting-library-docs/latest/ui_elements/drawings/Drawings-List/) and [CreateShapeOptions](https://www.tradingview.com/charting-library-docs/latest/api/interfaces/Charting_Library.CreateShapeOptions/#shape).
---
## ShapesMutate Patterns
Use `ShapesMutate` — not `WorkspacePatch` — to add, update, or remove shapes. Any combination of operations can be sent in a single call.
### Add a shape
```
ShapesMutate({
  add: [{
    id: "trendline-1",
    type: "trend_line",
    points: [
      { time: 1700000000, price: 42000 },
      { time: 1700172800, price: 45000 }
    ],
    color: "#2962FF",
    line_width: 2,
    line_style: "solid",
    symbol: "BTC/USDT.BINANCE"
  }]
})
```
### Update a property
```
ShapesMutate({ update: [{ id: "trendline-1", color: "#FF5722" }] })
```
### Delete a shape
```
ShapesMutate({ remove: ["trendline-1"] })
```
### Combined (add + remove in one call)
```
ShapesMutate({
  add: [{ id: "hline-support", type: "horizontal_line", points: [{ time: 0, price: 42000 }], symbol: "BTC/USDT.BINANCE" }],
  remove: ["trendline-1"]
})
```
---
## Override Properties
These map to TradingView drawing override keys passed in the `properties` field:
| Property key | Type | Notes |
|---|---|---|
| `linecolor` | string (hex) | Same as top-level `color` — prefer `color` |
| `linewidth` | number | Same as top-level `line_width` — prefer `line_width` |
| `linestyle` | number | 0 = solid, 1 = dashed, 2 = dotted — prefer `line_style` |
| `fillBackground` | boolean | Fill enclosed areas (rectangles, circles, etc.) |
| `backgroundColor` | string (hex) | Fill color when `fillBackground` is true |
| `transparency` | number | Fill transparency 0–100 |
| `extendLeft` | boolean | Extend line left (rays, horizontal lines) |
| `extendRight` | boolean | Extend line right |
| `showLabel` | boolean | Show price/time label on the shape |
For the complete per-type override reference, consult [Drawings Overrides](https://www.tradingview.com/charting-library-docs/latest/customization/overrides/Drawings-Overrides).
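As an illustration, a hypothetical override bundle for a shaded support zone (keys taken from the table above) could be built like this and passed in a rectangle's `properties` field via `ShapesMutate`:
```python
zone_properties = {
    "fillBackground": True,       # shade the rectangle interior
    "backgroundColor": "#4CAF50",
    "transparency": 80,           # 0-100; higher = more transparent
    "extendRight": True,          # project the zone into the future
    "showLabel": True,
}
```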
---
## Notes
- **ID collisions**: read the `shapes` store first to check existing IDs before adding
- **Symbol filter**: the web client only renders shapes where `shape.symbol` matches the current chart symbol — always set it
- **Horizontal lines** only need a `price` in their single point; `time` is ignored
- **Vertical lines** only need a `time` in their single point; `price` is ignored
- **Text shapes**: set `properties.text` to the label string

View File

@@ -0,0 +1,123 @@
---
description: "Workspace store schema: chartState (symbol/period/time range), indicators, shapes (chart drawings/annotations — see platform/shapes), and WorkspaceRead/WorkspacePatch usage."
---
# Workspace
The Workspace is the user's current UI context — what they are looking at, what is selected, and what persistent state belongs to their session. It is a collection of named **stores** that are kept in sync between the web client, the gateway, and the user's sandbox container.
Use `WorkspaceRead(store_name)` to read any store and `WorkspacePatch(store_name, patch)` to update it. Patches use JSON Patch (RFC 6902) format.
---
## Stores
### `chartState` — Current chart view (persistent)
Tracks what the user is currently looking at on the TradingView chart.
| Field | Type | Description |
|---|---|---|
| `symbol` | string | Active trading pair in `SYMBOL.EXCHANGE` format (e.g. `BTC/USDT.BINANCE`) |
| `period` | number | OHLC bar period in seconds (e.g. `900` = 15 min, `3600` = 1 h) |
| `start_time` | number \| null | Unix timestamp of left edge of visible range, or null for auto |
| `end_time` | number \| null | Unix timestamp of right edge of visible range, or null for auto |
| `selected_shapes` | string[] | IDs of currently selected drawing/annotation shapes |
When the user says "the current chart" or "what's selected", read `chartState` first.
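For example, switching the user's chart to 1-hour bars is a single JSON Patch against this store (a sketch in this document's tool-call style; the patch path assumes the field names in the table above):
```python
state = WorkspaceRead("chartState")          # current symbol, period, visible range
WorkspacePatch("chartState", [
    {"op": "replace", "path": "/period", "value": 3600},
])
```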
---
### `indicators` — Active indicators on the chart (persistent)
A flat map of `indicator_id → IndicatorInstance`. Each entry represents one study currently plotted on the TradingView chart.
**`IndicatorInstance` fields:**
| Field | Type | Description |
|---|---|---|
| `id` | string | Unique ID for this instance |
| `pandas_ta_name` | string | Internal name used in strategy/indicator scripts (e.g. `RSI_14`, `custom_MyIndicator`) |
| `instance_name` | string | Human-readable label shown on chart |
| `parameters` | object | Key/value parameter map (e.g. `{ length: 14 }`) |
| `tv_study_id` | string? | TradingView study ID (assigned by TV after the study is added) |
| `tv_indicator_name` | string? | TradingView indicator name for built-in studies |
| `tv_inputs` | object? | TradingView input overrides keyed by TV input name |
| `visible` | boolean | Whether the study is visible on the chart |
| `pane` | string | `"price"` to overlay on price pane, `"separate"` for its own panel |
| `symbol` | string? | Override symbol if different from `chartState.symbol` |
| `created_at` | number? | Unix timestamp when added |
| `modified_at` | number? | Unix timestamp when last changed |
| `custom_metadata` | object? | Present only for `custom_*` indicators; drives TradingView custom study construction (see below) |
**`custom_metadata` sub-fields** (for custom indicators only):
| Field | Type | Description |
|---|---|---|
| `display_name` | string | Human-readable indicator title shown in TV |
| `parameters` | object | Parameter schema: `{ name: { type, default, description, min?, max? } }` |
| `input_series` | string[] | Input price series required (e.g. `["close"]`) |
| `output_columns` | array | Each entry: `{ name, display_name?, description?, plot? }` where `plot` has `{ style, color?, linewidth?, visible? }` |
| `pane` | `"price"` \| `"separate"` | Default pane placement |
| `filled_areas` | array? | Shaded regions between two plots or hlines |
| `bands` | array? | Horizontal reference lines (e.g. RSI 70/30) |
---
### `shapes` — Chart drawings and annotations (persistent)
```json
{ "shapes": { "<shape_id>": Shape } }
```
For the complete shapes reference — all supported types, point counts, override properties, and WorkspacePatch examples — see **`platform/shapes`** (`MemoryLookup({page: "platform/shapes"})`).
---
### `indicator_types` — Custom indicator registry (persistent)
```json
{ "types": { "<script_name>": CustomIndicatorMetadata } }
```
Maps custom indicator script names to their `CustomIndicatorMetadata` (same structure as `custom_metadata` above). Populated when a custom indicator is created or updated by the indicator agent. The web client uses this to register custom TradingView studies.
---
### `strategy_types` — Strategy registry (persistent)
```json
{ "types": { "<script_name>": StrategyMetadata } }
```
Maps strategy script names to their metadata. Used by the web client to know which strategies are available.
---
### `research_types` — Research script registry (persistent)
```json
{ "types": { "<script_name>": ResearchMetadata } }
```
Maps research script names to their metadata.
---
### `channelState` — Connected channels (transient, gateway-only)
Tracks which communication channels (WebSocket, Telegram, etc.) are connected to the current session. **Not synced to web clients.**
```json
{ "connected": { "<channel_id>": { type, connectedAt, capabilities } } }
```
---
## Sync Protocol
Stores are kept in sync using JSON Patch (RFC 6902) messages:
- **snapshot** — full state dump sent on connect or after missed patches
- **patch** — incremental change with a monotonic sequence number
Stores marked `persistent` are saved to the user's container at `/data/workspace/{store_name}.json` and survive session reconnects.

View File

@@ -1,188 +0,0 @@
# Strategy Development Guide
Trading strategies in Dexorder define entry/exit rules and position management logic.
## Strategy Structure
```python
class Strategy:
def __init__(self, **params):
"""Initialize strategy with parameters"""
self.params = params
def generate_signals(self, df):
"""
Generate trading signals
Args:
df: DataFrame with OHLCV + indicator columns
Returns:
DataFrame with 'signal' column:
1 = long entry
-1 = short entry
0 = no action
"""
pass
def calculate_position_size(self, capital, price, risk_pct):
"""Calculate position size based on risk"""
pass
def get_stop_loss(self, entry_price, direction):
"""Calculate stop loss level"""
pass
def get_take_profit(self, entry_price, direction):
"""Calculate take profit level"""
pass
```
## Example: Simple Moving Average Crossover
```python
class SMACrossoverStrategy:
def __init__(self, fast_period=20, slow_period=50, risk_pct=0.02):
self.fast_period = fast_period
self.slow_period = slow_period
self.risk_pct = risk_pct
def generate_signals(self, df):
# Calculate moving averages
df['sma_fast'] = df['close'].rolling(self.fast_period).mean()
df['sma_slow'] = df['close'].rolling(self.slow_period).mean()
# Generate signals
df['signal'] = 0
# Long when fast crosses above slow
df.loc[
(df['sma_fast'] > df['sma_slow']) &
(df['sma_fast'].shift(1) <= df['sma_slow'].shift(1)),
'signal'
] = 1
# Short when fast crosses below slow
df.loc[
(df['sma_fast'] < df['sma_slow']) &
(df['sma_fast'].shift(1) >= df['sma_slow'].shift(1)),
'signal'
] = -1
return df
def calculate_position_size(self, capital, price, atr):
# Risk-based position sizing
risk_amount = capital * self.risk_pct
stop_distance = 2 * atr
position_size = risk_amount / stop_distance
return position_size
def get_stop_loss(self, entry_price, direction, atr):
if direction == 1: # Long
return entry_price - (2 * atr)
else: # Short
return entry_price + (2 * atr)
def get_take_profit(self, entry_price, direction, atr):
if direction == 1: # Long
return entry_price + (4 * atr) # 2:1 risk/reward
else: # Short
return entry_price - (4 * atr)
```
## Strategy Components
### Signal Generation
Entry conditions based on:
- Indicator crossovers
- Price patterns
- Volume confirmation
- Multiple timeframe confluence
### Risk Management
Essential elements:
- **Position Sizing**: Based on account risk percentage
- **Stop Losses**: ATR-based or support/resistance
- **Take Profits**: Multiple targets or trailing stops
- **Max Positions**: Limit concurrent trades
### Filters
Reduce false signals:
- **Trend Filter**: Only trade with the trend
- **Volatility Filter**: Avoid low volatility periods
- **Time Filter**: Specific trading hours
- **Volume Filter**: Minimum volume requirements
### Exit Rules
Multiple exit types:
- **Stop Loss**: Protect capital
- **Take Profit**: Lock in gains
- **Trailing Stop**: Follow profitable moves
- **Time Exit**: Close at end of period
- **Signal Exit**: Opposite signal
## Backtesting Considerations
### Data Quality
- Use clean, validated data
- Handle missing data appropriately
- Account for survivorship bias
- Include realistic spreads and slippage
### Performance Metrics
Track key metrics:
- **Total Return**: Cumulative profit/loss
- **Sharpe Ratio**: Risk-adjusted returns
- **Max Drawdown**: Largest peak-to-trough decline
- **Win Rate**: Percentage of profitable trades
- **Profit Factor**: Gross profit / gross loss
- **Expectancy**: Average $ per trade
### Validation
Prevent overfitting:
- **Train/Test Split**: 70/30 or 60/40
- **Walk-Forward**: Rolling windows
- **Out-of-Sample**: Test on recent unseen data
- **Monte Carlo**: Randomize trade order
- **Paper Trading**: Live validation
## Common Strategy Types
### Trend Following
Follow sustained price movements:
- Moving average crossovers
- Breakout strategies
- Trend channels
- Works best in trending markets
### Mean Reversion
Profit from price returning to average:
- Bollinger Band reversals
- RSI extremes
- Statistical arbitrage
- Works best in ranging markets
### Momentum
Trade in direction of strong moves:
- Relative strength
- Price acceleration
- Volume surges
- Breakout confirmation
### Arbitrage
Exploit price discrepancies:
- Cross-exchange spreads
- Funding rate arbitrage
- Statistical pairs trading
- Requires low latency
## Integration with Platform
Store strategies in your git repository under `strategies/` directory.
Test using the backtesting tools provided by the platform.
Deploy live strategies through the execution engine with proper risk controls.
Monitor performance and adjust parameters as market conditions change.

View File

@@ -0,0 +1,266 @@
---
description: "PandasStrategy class API, order placement, backtesting, and paper trading patterns for automated crypto strategy development."
---
# Strategy Development Guide
Strategies on Dexorder are `PandasStrategy` subclasses that receive a live stream of OHLCV bars and call `self.buy()` / `self.sell()` / `self.flatten()` to place orders.
See [`api-reference`](api-reference.md) for the DataAPI and ChartingAPI used in research scripts. For indicator calculations, see [`pandas-ta-reference`](pandas-ta-reference.md).
---
## PandasStrategy API
```python
from dexorder.nautilus.pandas_strategy import PandasStrategy, PandasStrategyConfig
class MyStrategy(PandasStrategy):
    def evaluate(self, dfs: dict[str, pd.DataFrame]) -> None:
        """Called after every new bar across all feeds.

        Args:
            dfs: dict mapping feed_key → pd.DataFrame
                Columns: timestamp (ns), open, high, low, close, volume,
                         buy_vol, sell_vol, open_interest
                Rows accumulate over time — last row = latest bar.
        """
        df = dfs.get("BTC/USDT.BINANCE:300")
        if df is None or len(df) < 20:
            return  # not enough data yet
        close = df["close"]
        # ... compute signals ...
        if buy_signal:
            self.buy(quantity=0.1)
        elif sell_signal:
            self.sell(quantity=0.1)
```
### Feed key format
`"{SYMBOL.EXCHANGE}:{period_seconds}"` — e.g. `"BTC/USDT.BINANCE:900"` for 15-minute bars.
Access all feeds via `self.config.feed_keys` (tuple of strings).
### Order methods
```python
self.buy(quantity: float, feed_key: str = None)
self.sell(quantity: float, feed_key: str = None)
self.flatten(feed_key: str = None) # close all open positions
```
If `feed_key` is omitted, the first feed in `feed_keys` is used. `quantity` is in base currency units (e.g. 0.1 BTC).
### Available data
Strategies may only use data in the `dfs` feeds: crypto OHLCV + buy/sell volume split + open interest. The following are **not available**:
- TradFi data (equities, forex, bonds, options, macro indicators)
- Alternative data (news, social sentiment, on-chain metrics, economic calendars)
---
## Using pandas_ta
Use `import pandas_ta as ta` for all indicator calculations. Never write manual `rolling()` or `ewm()` implementations.
```python
import pandas_ta as ta
rsi = ta.rsi(df["close"], length=14)
macd_df = ta.macd(df["close"], fast=12, slow=26, signal=9)
hist = macd_df.iloc[:, 2] # histogram column
ema = ta.ema(df["close"], length=20)
atr = ta.atr(df["high"], df["low"], df["close"], length=14)
```
See [`pandas-ta-reference`](pandas-ta-reference.md) for the full indicator catalog and multi-output column extraction patterns.
---
## Using Custom Indicators
Prefer referencing a custom indicator that already exists in the `indicator` category rather than duplicating the logic inline. Custom indicators appear on the user's chart, making the signal transparent.
```python
import pandas_ta as ta
def evaluate(self, dfs):
    df = dfs.get("BTC/USDT.BINANCE:3600")
    if df is None or len(df) < 20:
        return
    vw_rsi = ta.custom_vw_rsi(df["close"], df["volume"], length=14)
    if vw_rsi is None or vw_rsi.isna().all():
        return
    if vw_rsi.iloc[-1] < 30:
        self.buy(0.01)
    elif vw_rsi.iloc[-1] > 70:
        self.sell(0.01)
```
Custom indicator names follow the pattern `ta.custom_{sanitized_name}`. See [`indicator-development`](indicator-development.md) for naming rules and how to create custom indicators.
---
## Strategy Metadata
When writing a strategy with `PythonWrite(category="strategy", ...)`, always provide:
| Field | Required | Description |
|-------|----------|-------------|
| `description` | yes | One-sentence summary |
| `details` | yes | Full markdown: algorithm, entry/exit logic, parameters, data feeds, position sizing. Enough detail to reproduce the code from scratch. |
```python
PythonWrite(
    category="strategy",
    name="RSI Mean Reversion",
    description="Buy oversold, sell overbought based on RSI(14) on BTC/USDT 5m bars.",
    details="""## RSI Mean Reversion
...""",
    code="""...""",
    metadata={
        "data_feeds": [
            {"symbol": "BTC/USDT.BINANCE", "period_seconds": 300, "description": "BTC/USDT 5m"}
        ],
        "parameters": {
            "rsi_length": {"default": 14, "description": "RSI lookback period"},
            "oversold": {"default": 30, "description": "Buy threshold"},
            "overbought": {"default": 70, "description": "Sell threshold"},
            "trade_qty": {"default": 0.01, "description": "Trade quantity in BTC"}
        }
    }
)
```
---
## Backtest Workflow
1. **Check existing indicators** first: `PythonList(category="indicator")` — reuse signals already on the chart.
2. **Write** the strategy: `PythonWrite(...)` — runs against synthetic data automatically.
3. **Run a backtest** targeting 100,000–200,000 bars (max 5 years):
```
BacktestStrategy(
    strategy_name="RSI Mean Reversion",
    feeds=[{"symbol": "BTC/USDT.BINANCE", "period_seconds": 900}],
    from_time="2023-01-01",
    to_time="2024-12-31",
    initial_capital=10000
)
```
4. **Interpret results**:
- `summary.total_return` — total fractional return (0.15 = +15%)
- `summary.sharpe_ratio` — annualized Sharpe (>1.0 good, >2.0 excellent)
- `summary.max_drawdown` — maximum peak-to-trough loss
- `summary.win_rate` — fraction of profitable trades
- `statistics.profit_factor` — gross profit / gross loss (>1.5 good)
5. **Iterate** with `PythonEdit`, re-run backtest.
6. **Activate** (paper first): `ActivateStrategy(..., paper=True)`
### Bar resolution and backtest window
Choose the resolution appropriate to the strategy's signal frequency, then set the date range to hit 100k–200k bars:
| Resolution | ~100k bars | ~200k bars |
|---|---|---|
| 5m | 1 year | 2 years |
| 15m | 2.9 years | 5 years |
| 1h | cap at 5 yr (≈44k bars) | — |
| 4h | cap at 5 yr (≈11k bars) | — |
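The table follows directly from `bars = range_seconds / period_seconds`. A small helper for picking a window length, assuming the 100k–200k target and 5-year cap above:
```python
def window_days(period_seconds: int, target_bars: int = 150_000) -> float:
    """Days of history needed for target_bars at this resolution, capped at ~5 years."""
    days = target_bars * period_seconds / 86_400
    return min(days, 5 * 365)

print(window_days(300))    # 5m bars  -> ~521 days (~1.4 years)
print(window_days(900))    # 15m bars -> ~1562 days (~4.3 years)
print(window_days(3600))   # 1h bars  -> capped at 1825 days (5 years)
```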
---
## Strategy Patterns
### Trend following
Follow sustained price movements using moving average crossovers, breakout of price channels, or trend-direction filters:
```python
ema_fast = ta.ema(df["close"], length=20)
ema_slow = ta.ema(df["close"], length=50)
bullish = ema_fast.iloc[-1] > ema_slow.iloc[-1]
crossover = ema_fast.iloc[-2] <= ema_slow.iloc[-2]
if bullish and crossover:
    self.buy(qty)
```
### Mean reversion
Profit from price returning to an average after extremes:
```python
rsi = ta.rsi(df["close"], length=14)
if rsi.iloc[-1] < 30:
    self.buy(qty)
elif rsi.iloc[-1] > 70:
    self.sell(qty)
```
### Multi-timeframe confluence
Use a higher-timeframe trend filter with a lower-timeframe entry signal:
```python
df_4h = dfs.get("BTC/USDT.BINANCE:14400")
df_15m = dfs.get("BTC/USDT.BINANCE:900")
if df_4h is None or df_15m is None:
    return
ema_4h = ta.ema(df_4h["close"], length=20)
bullish_trend = df_4h["close"].iloc[-1] > ema_4h.iloc[-1]
macd_df = ta.macd(df_15m["close"])
hist = macd_df.iloc[:, 2]
if bullish_trend and hist.iloc[-1] > 0 and hist.iloc[-2] <= 0:
    self.buy(qty, feed_key="BTC/USDT.BINANCE:900")
```
---
## Important Rules
- **`evaluate()` must be fast, lightweight, and deterministic** — no model inference, file I/O, network calls, or randomness. It runs on every bar during backtests over potentially hundreds of thousands of bars.
- **No LLM calls inside strategies** — strategies must be fully reproducible.
- **Guard for insufficient data** — always check `len(df) >= min_required` before computing indicators with a lookback period.
- **Use `.get()` for feeds** — multi-feed strategies may have feeds missing during warm-up.
- **Size conservatively** — a typical trade quantity is `0.001–0.01 * initial_capital / price`.
- **No `import` from `dexorder` inside `evaluate()`** — the strategy file is exec'd in a sandbox; PandasStrategy and pandas_ta are pre-loaded.
---
## Performance Metrics Reference
| Metric | Good | Excellent |
|---|---|---|
| Sharpe ratio | > 1.0 | > 2.0 |
| Profit factor | > 1.5 | > 2.0 |
| Max drawdown | < 20% | < 10% |
| Win rate | context-dependent | — |
A strategy with a lower win rate can still be profitable if winners are larger than losers (profit factor > 1). Focus on Sharpe and max drawdown as primary quality metrics.
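A quick worked example with made-up numbers: a 40% win rate with 2R average winners and 1R average losers is still positive expectancy:
```python
win_rate, avg_win, avg_loss = 0.40, 2.0, 1.0   # illustrative values, in R multiples

expectancy    = win_rate * avg_win - (1 - win_rate) * avg_loss       # +0.20R per trade
profit_factor = (win_rate * avg_win) / ((1 - win_rate) * avg_loss)   # ~1.33

print(expectancy, profit_factor)
```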
### Avoiding overfitting
- Do not optimize parameters on the same data used for validation
- Use a held-out out-of-sample period to verify results
- Prefer fewer parameters — simpler strategies generalize better
- Walk-forward analysis: re-fit on a rolling window, evaluate on the next
---
## See Also
- [`pandas-ta-reference`](pandas-ta-reference.md) — Indicator catalog and usage examples
- [`indicator-development`](indicator-development.md) — Creating custom indicators
- [`api-reference`](api-reference.md) — DataAPI reference (for research scripts)
- [`usage-examples`](usage-examples.md) — Research script patterns

View File

@@ -0,0 +1,4 @@
---
description: "Trading knowledge index: signal combination theory, technical analysis, and a master catalog of 150+ strategies by asset class."
tags: [trading, index]
---

View File

@@ -0,0 +1,121 @@
---
description: "Institutional alpha combination: how to merge multiple weak signals into a single high-conviction output using the 11-step procedure and the Fundamental Law of Active Management."
tags: [signals, alpha, portfolio, kelly, statistics]
---
# Signal Combination — Alpha Stacking
## Core Law
Do not search for one perfect signal. Combine many weak, independent signals.
**Fundamental Law of Active Management:**
```
IR = IC × √N
```
- `IR` = Information Ratio of the combined system (risk-adjusted edge)
- `IC` = average Information Coefficient per signal (correlation of prediction to outcome)
- `N` = number of *genuinely independent* signals
Real institutional signals have IC = 0.05–0.15. A single signal at IC=0.10 is outperformed by 50 signals at IC=0.05 (IR = 0.05 × √50 = 0.354, over 3× better).
**Critical:** N counts *effective independent signals*, not raw signal count. Fifty correlated signals may yield only 10–15 effective ones. The 11-step procedure below forces honest accounting.
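The worked comparison from the text, as arithmetic:
```python
from math import sqrt

ir_single  = 0.10 * sqrt(1)    # one strong signal:  IR = 0.100
ir_stacked = 0.05 * sqrt(50)   # fifty weak signals: IR ≈ 0.354
print(ir_stacked / ir_single)  # ≈ 3.5x
```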
---
## Five Signal Categories
| Category | What it measures | Why it persists |
|---|---|---|
| **Momentum / Price** | Direction/rate of price movement over lookback `d` | Underreaction causes short-term trend persistence |
| **Mean Reversion** | Deviation from cross-sectional fair value | Related instruments maintain consistent relative pricing |
| **Volatility** | Implied vs. realized volatility gap | Vol risk premium: sellers demand compensation |
| **Factor** | Value, momentum, carry, quality, low-vol premiums | Persistent behavioral/structural inefficiencies |
| **Microstructure** | Order book imbalance, bid-ask spread, VPIN | Informed order flow leads price movement |
> **Dexorder scope**: Only crypto OHLCV data is available. Factor signals (value, carry, quality) require TradFi data not available here. Momentum, mean reversion, volatility, and microstructure signals are all applicable.
---
## 11-Step Combination Engine
Given N signals with historical returns R(i,s) over M periods:
**Step 1.** Collect realized return series R(i,s) for each signal i, each period s.
**Step 2.** Remove drift — serially demean:
```
X(i,s) = R(i,s) − mean(R(i,·))
```
**Step 3.** Compute variance per signal:
```
σ(i)² = (1/M) × Σ X(i,s)²
```
**Step 4.** Normalize to common scale:
```
Y(i,s) = X(i,s) / σ(i)
```
Makes signals with different magnitudes directly comparable.
**Step 5.** Drop the most recent observation from Y — use only out-of-sample history.
**Step 6.** Cross-sectionally demean at each time period:
```
Λ(i,s) = Y(i,s) − avg_j(Y(j,s))
```
Removes any market-wide effect driving all signals simultaneously at that moment.
**Step 7.** Drop the final period from Λ to eliminate residual look-ahead.
**Step 8.** Estimate expected forward return per signal using d-day moving average, normalize:
```
E(i) = (1/d) × Σ R(i,s) over recent d periods
E_normalized(i) = E(i) / σ(i)
```
**Step 9. (Critical)** Regress E_normalized over Λ(i,s) without intercept, unit weights. Residuals `ε(i)` are each signal's *independent* forward-looking contribution — the component not explained by any other signal in the stack.
**Step 10.** Set weights:
```
w(i) = η × ε(i) / σ(i)
```
High independent edge + low noise → high weight. No subjective judgment.
**Step 11.** Normalize: scale η so `Σ|w(i)| = 1`. No unintended leverage.
**Combined output = Σ w(i) × signal_i_current_value**
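A literal NumPy sketch of the procedure, for an `(N signals × M periods)` matrix of realized signal returns. Treat it as illustrative only: in particular, when there are many more periods than signals the step-9 regression can fit exactly, and a real implementation would regularize or restrict Λ.
```python
import numpy as np

def combine_signal_weights(R: np.ndarray, d: int = 20, eta: float = 1.0) -> np.ndarray:
    """Weights w(i) from an (N, M) matrix of per-signal realized returns."""
    X = R - R.mean(axis=1, keepdims=True)             # 2. serially demean
    sigma = np.sqrt((X ** 2).mean(axis=1))            # 3. per-signal volatility
    Y = X / sigma[:, None]                            # 4. normalize to common scale
    Y = Y[:, :-1]                                     # 5. drop most recent observation
    lam = Y - Y.mean(axis=0, keepdims=True)           # 6. cross-sectional demean
    lam = lam[:, :-1]                                 # 7. drop final period again
    E = R[:, -d:].mean(axis=1) / sigma                # 8. d-period expected return, normalized
    beta, *_ = np.linalg.lstsq(lam, E, rcond=None)    # 9. regress E on Lambda, no intercept
    eps = E - lam @ beta                              #    residual = independent contribution
    w = eta * eps / sigma                             # 10. raw weights
    total = np.abs(w).sum()
    return w / total if total > 0 else np.zeros_like(w)  # 11. scale so sum(|w|) = 1
```
The combined score is then `w @ current_signal_values`.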
---
## Empirical Kelly Sizing
```
f_empirical = f_kelly × (1 − CV_edge)
f_kelly = (p × b − q) / b
```
- `CV_edge` = coefficient of variation of edge estimates across 10,000 Monte Carlo path simulations of historical returns
- Higher uncertainty → smaller fraction. The formula automatically scales confidence to what is warranted.
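A sketch of the shrinkage, estimating `CV_edge` by bootstrapping the historical return series (one plausible reading of the Monte Carlo step; the text does not pin down the simulation details):
```python
import numpy as np

def empirical_kelly(returns: np.ndarray, p: float, b: float,
                    n_paths: int = 10_000, seed: int = 0) -> float:
    """Kelly fraction shrunk by the uncertainty of the edge estimate."""
    rng = np.random.default_rng(seed)
    f_kelly = (p * b - (1.0 - p)) / b
    idx = rng.integers(0, len(returns), size=(n_paths, len(returns)))
    edges = returns[idx].mean(axis=1)        # edge estimate per resampled path
    mean_edge = edges.mean()
    if mean_edge == 0:
        return 0.0
    cv_edge = edges.std() / abs(mean_edge)
    return f_kelly * max(0.0, 1.0 - cv_edge)
```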
---
## Key Failure Mode: Correlation Blindness
Believing you have 3 independent reasons for a trade when you have 1 reason expressed 3 times, sized as if for 3. This is the mechanism behind most systematic blowups where the trader was directionally correct but over-sized.
The cross-sectional demeaning (Step 6) and regression residualization (Step 9) structurally prevent this by exposing shared variance before weights are assigned.
---
## Dexorder Application Note
When combining multiple indicators into a single entry/exit signal:
1. Each indicator (momentum, RSI divergence, volume profile, spread, etc.) is a signal producing a score or directional estimate.
2. Run the 11-step engine over backtested signal histories to derive weights.
3. Combined score = weighted sum of current signal outputs.
4. Size the resulting position using empirical Kelly with CV_edge from simulation.
If computing probability estimates (e.g. probability of upward breakout), substitute probability estimates for return estimates at each step — the math is identical.

View File

@@ -0,0 +1,51 @@
---
description: "A portfolio strategy that optimally determines how much cash to hold to meet unforeseen liquidity demands while minimizing the drag from uninvested capital."
tags: [cash, liquidity, risk-management]
---
# Liquidity Management
**Section**: 17.3 | **Asset Class**: Cash | **Type**: Risk management / Portfolio construction
## Overview
From a portfolio management perspective, this strategy amounts to optimally defining the amount of cash to be held in the portfolio to meet liquidity demands generated by unforeseen events. Cash provides immediate liquidity, whereas other assets would have to be liquidated first, which can be associated with substantial transaction costs — especially if liquidation is abrupt. From a corporate perspective, holding cash can be a precautionary measure aimed at avoiding cash flow shortfalls that can yield, inter alia, loss of investment opportunities, financial distress, etc.
## Construction / Mechanics
The strategy involves three distinct roles for cash as an asset:
1. **Risk management tool** — Cash held as a buffer mitigates drawdowns and volatility by providing a shock absorber when other assets decline or become illiquid.
2. **Opportunity management tool** — A cash reserve allows the investor to take advantage of specific or unusual situations (e.g., distressed asset purchases, sudden dislocations) without needing to liquidate existing positions.
3. **Liquidity management tool** — In unexpected situations requiring liquid funds, cash is the only immediately available resource without liquidation costs.
Liquid cash equivalents that can be held in a portfolio include:
- U.S. Treasury bills
- Bank deposit certificates (CDs)
- Commercial paper
- Banker's acceptances
- Eurodollars
- Repurchase agreements (repos)
## Return Profile / Objective
The direct return on cash is low (near the risk-free rate). The strategy's value lies in its option-like properties: the ability to act quickly when opportunities arise and to avoid forced liquidation at disadvantageous prices. The optimal cash level trades off the opportunity cost of holding cash against the cost of unexpected forced liquidation.
## Key Parameters / Signals
- **Liquidity buffer target**: sized to expected maximum drawdown of cash needs over some horizon
- **Opportunity reserve**: discretionary allocation kept available for tactical deployment
- **Kelly-based sizing**: related to Kelly criterion strategies; the cash fraction can be derived from portfolio growth optimization
- **Liquidation cost model**: transaction costs and market impact of rapidly liquidating other assets inform the minimum required cash buffer
## Variations
- **Corporate treasury liquidity management**: focused on operating cash flow needs, payroll, and debt service
- **Hedge fund liquidity management**: sized to potential investor redemptions and margin call scenarios
- **Tactical cash allocation**: dynamically increasing cash weight as a defensive signal during high-volatility regimes
## Notes
The rationale for the optimal cash holding is related to, but not determined by, the rationale behind Kelly strategies: the appropriate cash level depends on the investor's specific liability structure, redemption profile, and the liquidity of the remaining portfolio. Holding too little cash forces costly liquidation; holding too much creates an unacceptable opportunity-cost drag. The strategy is foundational to virtually all portfolio management frameworks.

View File

@@ -0,0 +1,49 @@
---
description: "An illegal lending practice involving loans at excessively high interest rates without collateral, enforced through coercion; documented here for educational and awareness purposes only."
tags: [cash, illegal, lending, awareness]
---
# Loan Sharking
**Section**: 17.6 | **Asset Class**: Cash | **Type**: Illegal activity (documented for educational/awareness purposes only)
## Overview
Loan sharking consists of offering loans at excessively high — often usurious — interest rates. Unlike pawnbroking, loan sharking in many jurisdictions is illegal, and the loan is not necessarily secured by collateral. This activity is documented here solely for educational and awareness purposes. It is not a legitimate trading strategy and constitutes criminal conduct in most jurisdictions.
## Construction / Mechanics
**Loan structure:**
- Cash is lent to a borrower at interest rates far above legal usury limits — rates can be expressed in weekly or daily terms (e.g., "2 for 1" means repay double within a fixed period)
- No formal legal documentation is typically used; the arrangement is informal and unenforceable through normal legal channels
- The loan is typically unsecured — there is no collateral pledge
**Enforcement:**
- Because the loan cannot be enforced through the legal system, the lender (loan shark) may resort to extralegal means of enforcement
- This can include blackmail, threats, and physical violence to compel repayment
- The borrower has no legal recourse against abusive enforcement methods
**Relationship to legitimate lending:**
- Loan sharks operate in the same economic niche as payday lenders and pawnbrokers but without legal constraints on rates or enforcement methods
- They typically serve borrowers with no access to formal credit (e.g., due to criminal records, immigration status, or existing debt)
## Return Profile / Objective
The stated "return" is the extremely high interest charged. In practice, the loan shark often profits more from the coercive control over the borrower than from pure interest income — borrowers may be exploited for labor or other services. The high nominal returns are offset by significant legal, personal safety, and operational risks.
## Key Parameters / Signals
- **Interest rate**: typically far above legal usury limits; often quoted in short-period terms to obscure the true APR
- **Enforcement mechanism**: the primary differentiator from legal lending; ranges from social pressure to physical violence
- **Borrower desperation**: loan sharks target individuals with no alternative credit access
- **Rollover/compound traps**: unpaid interest may compound rapidly, trapping borrowers in escalating debt
## Variations
- **Organized crime lending**: loan sharking as a service offered by criminal organizations, often linked to other illegal activities
- **Predatory lending (grey area)**: legal but extremely high-rate lenders (payday loans, rent-to-own) operating at the edge of legality
- **Salary lending / advance-fee lending**: informal arrangements common in some developing markets
## Notes
Loan sharking is **illegal** in most jurisdictions. It is documented here exclusively for educational purposes — to illustrate the spectrum of cash-based financial activities, support awareness of predatory lending, and assist in compliance or regulatory analysis. Any actual participation in loan sharking carries severe criminal penalties including imprisonment. The key distinction from legal high-rate lending (e.g., payday loans) is the use of illegal coercion and the absence of legal licensing and rate-cap compliance.

View File

@@ -0,0 +1,42 @@
---
description: "A three-stage process (placement, layering, integration) by which illegal cash is transformed into legitimate-appearing assets; documented here for educational and awareness purposes only."
tags: [cash, illegal, awareness]
---
# Money Laundering — The Dark Side of Cash
**Section**: 17.2 | **Asset Class**: Cash | **Type**: Illegal activity (documented for educational/awareness purposes only)
## Overview
Money laundering is an activity wherein cash is used as a vehicle to transform illegal profits into legitimate-appearing assets. It is documented here solely for educational and awareness purposes — this is an illegal activity in virtually all jurisdictions and is not a legitimate trading strategy. Understanding its mechanics is relevant for compliance, AML (anti-money-laundering) system design, and regulatory awareness.
## Construction / Mechanics
There are three main steps in a money laundering process:
1. **Placement** — The first and riskiest step. Illegal funds are introduced into the legal economy via fraudulent means, e.g., by dividing funds into small amounts and depositing them into multiple bank accounts (structuring/smurfing) to avoid detection thresholds.
2. **Layering** — Moving the money around between different accounts and even countries, thereby creating complexity and separating the money from its source by several degrees. The goal is to obscure the audit trail.
3. **Integration** — Money launderers recover the funds via legitimate-looking sources, e.g., cash-intensive businesses such as bars, restaurants, car washes, hotels (in some countries), gambling establishments, and parking garages.
## Return Profile / Objective
The "return" is the successful conversion of illicit proceeds into spendable, untraceable wealth. The primary risk is detection and prosecution at the placement stage, which is why smurfing and structuring techniques are employed to stay below reporting thresholds.
## Key Parameters / Signals
- **Placement risk**: highest at initial deposit stage; mitigated by structuring below reporting thresholds
- **Layering complexity**: number of intermediary accounts, jurisdictions, and transactions used to obscure origin
- **Integration vehicles**: choice of business type affects detectability (high-cash-volume businesses preferred)
## Variations
- **Trade-based laundering**: over- or under-invoicing international trade transactions
- **Real estate laundering**: purchasing and reselling property through shell companies
- **Cryptocurrency layering**: use of mixers, privacy coins, and cross-chain swaps to layer funds
## Notes
This strategy is **illegal** in all major jurisdictions. It is documented here exclusively for educational purposes, AML awareness, and to support the design of detection and compliance systems. Financial institutions are required by law (e.g., BSA/AML regulations, FATF guidelines) to implement controls to detect and report suspicious activity consistent with these patterns.

View File

@@ -0,0 +1,51 @@
---
description: "A secured short-term lending strategy where a pawnbroker extends a cash loan against physical collateral, retaining the right to sell the collateral if the loan is not repaid."
tags: [cash, lending, collateral, alternative]
---
# Pawnbroking
**Section**: 17.5 | **Asset Class**: Cash | **Type**: Collateralized lending
## Overview
Pawnbroking is conceptually similar to repurchase agreements (REPOs) but operates in retail/consumer markets and has ancient historical roots. A pawnbroker extends a secured cash loan with a pre-agreed interest rate and period (which can sometimes be extended). The loan is secured with a collateral item of value; if the loan is not repaid with interest as agreed, the collateral is forfeited by the borrower and the pawnbroker can keep it or sell it.
## Construction / Mechanics
**Loan origination:**
- Borrower presents a physical item of value as collateral (jewelry, electronics, vehicles, rare books, musical instruments, etc.)
- Pawnbroker appraises the item and offers a loan amount at a significant discount to appraised value (e.g., 25–60% of estimated resale value)
- Borrower receives cash; pawnbroker retains physical possession of the item
- A loan ticket is issued specifying the principal, interest rate, fees, and redemption deadline
**Redemption or forfeiture:**
- If borrower repays principal plus interest within the agreed period, the item is returned
- If borrower fails to repay, the pawnbroker takes full ownership of the collateral and may sell it to recover the loan amount plus a profit margin
**From an investment perspective:**
- The pawnbroker's strategy profits from: (a) interest income on repaid loans, and (b) resale margin on forfeited collateral
- The deep discount on collateral valuation provides a cushion against mispriced or illiquid items
## Return Profile / Objective
Returns come from two sources: interest income on performing loans (typically high, reflecting the high-risk, unbanked borrower profile) and trading profit on forfeited collateral items sold at or above the appraised value. The high interest rates compensate for the non-recourse nature of many pawn loans (the lender's only recourse is the collateral, not the borrower personally).
## Key Parameters / Signals
- **Loan-to-value (LTV) ratio**: loan amount as a fraction of collateral's estimated resale value; typically 25–60%
- **Interest rate / fees**: high relative to bank rates; regulated in many jurisdictions with rate caps
- **Loan term**: typically 1–4 months; extensions often available
- **Collateral liquidity**: items with active resale markets (gold jewelry, electronics) command better LTV ratios
- **Forfeiture rate**: the fraction of loans that are not redeemed; drives the resale revenue component
## Variations
- **Online pawnbroking**: digital platforms for luxury goods, collectibles, and watches
- **Commodity pawnbroking**: pawnbrokers dealing specifically in precious metals and gems (overlap with commodity trading)
- **Title lending / auto pawn**: loans secured against vehicle titles; borrower retains use of the vehicle while the title is held
- **Jewelry/gold dealers**: effectively pawnbrokers who specialize in precious metals with spot-price-linked valuations
## Notes
Pawnbroking is legal and regulated in most jurisdictions, with interest rates and practices governed by consumer lending laws. The pawnbroker trades physical commodities such as silver and gold as a byproduct of forfeited collateral. The strategy is highly local and operationally intensive. It is conceptually the retail analogue of institutional repo markets — both involve a cash loan secured by an asset with a right to liquidate the asset upon default.

View File

@@ -0,0 +1,53 @@
---
description: "A repurchase agreement strategy that borrows or lends cash at a preset interest rate for 1 day to 6 months using securities as collateral, providing immediate liquidity."
tags: [cash, fixed-income, collateral, short-term]
---
# Repurchase Agreement (REPO)
**Section**: 17.4 | **Asset Class**: Cash | **Type**: Collateralized lending / Cash equivalent
## Overview
A repurchase agreement (REPO) is a cash-equivalent asset that provides immediate liquidity at a preset interest rate for a specific period of time in exchange for another asset used as collateral. A REPO strategy amounts to borrowing (or lending) cash with interest in exchange for securities, with the commitment of repurchasing them from (or reselling them to) the counterparty at the end of the term. This type of transaction typically spans from 1 day to 6 months.
## Construction / Mechanics
**From the borrower's perspective (classic repo):**
- The borrower sells securities to the lender at a spot price
- Simultaneously agrees to repurchase those securities at a future date at a higher price
- The difference in prices represents the repo interest (the "repo rate")
- The securities serve as collateral; the lender has recourse to them if the borrower defaults
**From the lender's perspective (reverse repo):**
- The lender buys securities and simultaneously agrees to resell them at a later date
- Effectively a collateralized cash loan earning the repo rate
- Counterparty credit risk is mitigated by holding the collateral
**Mechanics summary:**
- Term: overnight (O/N) to 6 months; "open" repos have no fixed term
- Collateral: typically government securities, though agency bonds and other high-grade paper are used
- Haircut: the collateral is valued at a discount to market price to provide a buffer against collateral price declines
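A small sketch of the two repo legs described above, assuming an ACT/360 money-market day count; the function name, haircut, and rate in the example are illustrative, not from the source.
```
def repo_legs(collateral_value, haircut, repo_rate, days, day_count=360):
    # Cash lent = collateral market value reduced by the haircut buffer
    cash = collateral_value * (1.0 - haircut)
    # Repurchase price = cash plus repo interest for the term
    interest = cash * repo_rate * days / day_count
    return cash, cash + interest

# e.g. $10mm of Treasuries, 2% haircut, 5% overnight GC rate (illustrative)
cash_lent, repurchase_price = repo_legs(10_000_000, 0.02, 0.05, days=1)
```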
## Return Profile / Objective
The strategy earns (or pays) the repo rate, which is typically close to but slightly below the risk-free rate (for general collateral) or can be significantly below (even negative) for "special" securities in high demand. The primary objective is efficient short-term cash management — deploying idle cash at near-risk-free rates while maintaining near-immediate liquidity.
## Key Parameters / Signals
- **Repo rate**: the annualized interest rate on the transaction; general collateral (GC) rate vs. specific/special rates
- **Haircut**: discount applied to collateral value (e.g., 2% for Treasuries, higher for lower-grade collateral)
- **Term**: overnight, term (fixed date), or open
- **Collateral type**: determines applicable haircut and rate; GC repos use any acceptable security vs. specific repos tied to a named security
- **Margin calls**: triggered if collateral value falls below the required threshold during the term
## Variations
- **Reverse repo**: the lender's side; used by central banks as a monetary policy tool and by money market funds as an investment
- **Tri-party repo**: a clearing bank acts as intermediary, handling collateral management for both parties
- **Securities lending**: conceptually similar; the security owner lends it out for a fee, receiving cash or other securities as collateral
- **GC pooling**: centralized clearing of general collateral repos to improve netting and efficiency
## Notes
REPOs are a foundational instrument in money markets and are used by banks, broker-dealers, money market funds, and central banks. The 2008 financial crisis revealed the fragility of repo markets when collateral quality deteriorated rapidly (the "run on repo"). Counterparty credit risk, collateral quality, and the potential for "fire-sale" dynamics during stress are the primary risk considerations. REPOs are conceptually similar to pawnbroking but operate in institutional markets at vastly larger scale.

View File

@@ -0,0 +1,56 @@
---
description: "Commodity futures strategy that uses CFTC Commitments of Traders (COT) hedging pressure data to identify long/short opportunities based on hedger and speculator positioning."
tags: [commodities, futures, cot, hedging-pressure, positioning]
---
# Trading Based on Hedging Pressure
**Section**: 9.2 | **Asset Class**: Commodities | **Type**: Positioning / Sentiment
## Overview
Hedgers and speculators have systematically different objectives in commodity futures markets. High hedger long positioning signals contango (excess hedging demand pushes futures prices up); high speculator long positioning signals backwardation. By reading the CFTC Commitments of Traders (COT) report, a trader can construct a zero-cost portfolio that exploits these positioning signals with a 6-month typical holding period.
## Construction / Mechanics
The "hedging pressure" (HP) for each group is defined as:
```
HP = (number of long contracts) / (total contracts: long + short)
```
HP lies between 0 and 1.
**Interpretation:**
- High hedgers' HP → indicative of contango
- Low hedgers' HP → indicative of backwardation
- High speculators' HP → indicative of backwardation
- Low speculators' HP → indicative of contango
**Portfolio construction:**
1. Rank all commodity futures by speculators' HP; divide the cross-section into upper and lower halves.
2. Within the upper half (higher speculator HP, i.e., backwardation signal):
- **Buy** futures that are in the **bottom quintile** by hedgers' HP (confirming low hedger demand, strong backwardation signal)
3. Within the lower half (lower speculator HP, i.e., contango signal):
- **Sell** futures that are in the **top quintile** by hedgers' HP (confirming high hedger demand, strong contango signal)
The portfolio is zero-cost and rebalanced with typical formation and holding periods of 6 months.
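A pandas sketch of the selection rule above, assuming `cot` is a DataFrame indexed by commodity with long/short contract counts for hedgers and speculators from the COT report; column and function names are illustrative.
```
import pandas as pd

def hedging_pressure_portfolio(cot):
    hp_hedgers = cot['hedger_long'] / (cot['hedger_long'] + cot['hedger_short'])
    hp_specs = cot['spec_long'] / (cot['spec_long'] + cot['spec_short'])
    upper = hp_specs >= hp_specs.median()      # backwardation candidates
    lower = ~upper                             # contango candidates
    # Long the bottom quintile of hedgers' HP within the upper half
    longs = hp_hedgers[upper]
    longs = longs[longs <= longs.quantile(0.2)].index
    # Short the top quintile of hedgers' HP within the lower half
    shorts = hp_hedgers[lower]
    shorts = shorts[shorts >= shorts.quantile(0.8)].index
    return list(longs), list(shorts)
```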
## Return Profile
Profits when commodity futures that show strong backwardation signals (low hedger HP, high speculator HP) outperform those with strong contango signals. The strategy earns a risk premium for providing liquidity to hedgers who are willing to pay above-fair-value forward prices.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| HP (hedgers) | Long / (long + short) for commercial hedgers from COT report |
| HP (speculators) | Long / (long + short) for non-commercial speculators from COT |
| Holding period | Typically 6 months |
| Data source | CFTC Commitments of Traders (weekly) |
## Variations
- Use the net position (long minus short) as the signal rather than the ratio HP.
- Combine COT positioning with the roll-yield signal (Section 9.1) for a multi-factor commodity model.
## Notes
- COT data is published weekly with a 3-day lag, so the signal has limited use for high-frequency trading.
- The classification of "hedger" vs. "speculator" in COT data is self-reported and can be noisy; large commodity index funds are classified differently across report types (legacy vs. disaggregated COT).
- The 6-month holding period smooths over reporting noise but requires patience through short-term adverse moves.
- Strategy performance can degrade when large commodity index investors distort the COT positioning signals.

View File

@@ -0,0 +1,46 @@
---
description: "Portfolio diversification strategy that adds commodity exposure to equity portfolios to exploit their historically low cross-asset correlation and improve risk-adjusted returns."
tags: [commodities, diversification, portfolio-construction, asset-allocation]
---
# Portfolio Diversification with Commodities
**Section**: 9.3 | **Asset Class**: Commodities | **Type**: Portfolio Construction / Asset Allocation
## Overview
Commodity markets typically exhibit low correlation with equity markets. Adding commodity exposure can improve the return-to-risk characteristics of equity-dominant portfolios. Two broad approaches exist: a passive buy-and-hold allocation, and an active tactical allocation that adjusts commodity exposure based on macroeconomic signals such as the Federal Reserve discount rate.
## Construction / Mechanics
### Passive Approach
1. Allocate a preset fraction of available capital to commodity futures (or commodity indices).
2. Hold the commodity position and rebalance periodically (e.g., monthly or annually) back to the target weight.
3. No active signal required; the diversification benefit arises purely from low cross-asset correlation.
### Active (Tactical) Approach
1. Monitor the Federal Reserve discount rate (or a proxy monetary policy indicator).
2. **Increase** commodity exposure when the discount rate decreases (accommodative policy), since commodity returns are empirically positively correlated with monetary easing.
3. **Decrease** commodity exposure when the discount rate increases (tightening policy).
4. The tactical adjustment exploits the empirical link between commodity returns and Fed monetary policy.
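A minimal sketch of the active rule above; the step size and weight bounds are assumptions chosen for illustration, not parameters from the source.
```
def tactical_commodity_weight(current_weight, rate_change, step=0.05,
                              lo=0.0, hi=0.30):
    # Easing (rate cut) -> add commodity exposure; tightening -> reduce it
    if rate_change < 0:
        w = current_weight + step
    elif rate_change > 0:
        w = current_weight - step
    else:
        w = current_weight
    return min(max(w, lo), hi)
```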
## Return Profile
The passive approach targets improved risk-adjusted returns through diversification without requiring any predictive signal. The active approach additionally aims to capture the positive correlation between commodity returns and accommodative monetary conditions, increasing commodity weights when they are most likely to outperform.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| Commodity allocation (passive) | Fixed % of portfolio (e.g., 5–20%) |
| Rebalancing frequency | Monthly or annual for passive; signal-triggered for active |
| Fed discount rate | Primary macro signal for active tactical allocation |
| Cross-asset correlation | Empirically low between commodities and equities; drives diversification benefit |
## Variations
- Use commodity indices (e.g., GSCI, BCOM) for passive exposure rather than individual futures contracts.
- Active allocation can use other macro signals: inflation expectations, industrial production growth, credit spreads.
- Risk-parity weighting (equalising volatility contribution of commodities and equities) rather than fixed notional allocation.
## Notes
- The low equity-commodity correlation is not constant; during crisis periods (e.g., 2008), correlations can spike, reducing diversification benefit at exactly the wrong time.
- The empirical link to Fed policy is regime-dependent; the relationship may be weaker during prolonged zero-rate environments.
- Commodity exposure via futures introduces roll costs (see Section 9.1); the net diversification benefit must be assessed after roll costs.
- Inflation-sensitive commodities (energy, metals) may provide additional value as inflation hedges alongside diversification benefits.

View File

@@ -0,0 +1,69 @@
---
description: "Commodity futures pricing strategy that fits a mean-reverting stochastic model to the term structure and trades futures identified as rich or cheap relative to model-implied fair value."
tags: [commodities, futures, stochastic-model, ornstein-uhlenbeck, pricing, term-structure]
---
# Trading with Pricing Models
**Section**: 9.6 | **Asset Class**: Commodities | **Type**: Relative Value / Model-Based
## Overview
Commodity futures term structures are non-trivial and can be modelled via stochastic processes. Fitting a parametric model (e.g., the Ornstein-Uhlenbeck mean-reverting process) to historical data allows the identification of futures that are rich (sell signal) or cheap (buy signal) relative to the model's predicted fair value. The approach acknowledges that structural mean reversion is a reasonable property for commodity prices.
## Construction / Mechanics
Let S(t) be the spot price and X(t) = ln(S(t)). Model X(t) as a mean-reverting Brownian motion (Ornstein-Uhlenbeck):
```
dX(t) = κ[a - X(t)] dt + σ dW(t) (459)
```
Parameters:
- κ: mean-reversion speed
- a: long-run mean of ln(S)
- σ: log-volatility
- W(t): Q-Brownian motion under risk-free measure Q
Under the standard pricing argument, the futures price F(t,T) is:
```
F(t,T) = E_t(S(T)) (460)
ln(F(t,T)) = E_t(X(T)) + (1/2) V_t(X(T)) (461)
```
This gives the closed-form futures price:
```
ln(F(t,T)) = exp(-κ(T-t)) X(t) + a[1 - exp(-κ(T-t))]
+ (σ²/4κ)[1 - exp(-2κ(T-t))] (462)
```
**Calibration and trading:**
1. Fit κ, a, σ to historical data (e.g., nonlinear least squares on observed futures prices).
2. Compute the model-implied futures price for each contract.
3. Compare market price to model price:
- Market price > model price: **sell signal** (futures is rich)
- Market price < model price: **buy signal** (futures is cheap)
Note: as κ → 0 and a → ∞ with κa held fixed, this model reduces to the Black-Scholes model.
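A sketch of the rich/cheap signal based on Eq. (462), assuming calibrated `kappa`, `a`, `sigma` are already available; names are illustrative.
```
import numpy as np

def model_futures_price(x_t, kappa, a, sigma, ttm):
    # Eq. (462): model-implied ln F(t,T); x_t = ln(spot), ttm = T - t in years
    decay = np.exp(-kappa * ttm)
    ln_f = (decay * x_t
            + a * (1.0 - decay)
            + (sigma ** 2 / (4.0 * kappa)) * (1.0 - np.exp(-2.0 * kappa * ttm)))
    return np.exp(ln_f)

def rich_cheap_signal(market_price, x_t, kappa, a, sigma, ttm):
    model_price = model_futures_price(x_t, kappa, a, sigma, ttm)
    # Market above model -> rich (sell); market below model -> cheap (buy)
    return 'sell' if market_price > model_price else 'buy'
```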
## Return Profile
Profits when market prices revert toward the model-implied fair values. Returns are driven by mean-reversion in the spread between market and model prices. In-sample fit may be strong but out-of-sample predictive power is model-dependent.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| κ | Mean-reversion speed; higher κ → faster reversion |
| a | Long-run mean of log-spot price |
| σ | Log-volatility of the spot price |
| F(t,T) model vs. market | Rich/cheap signal: sell if market > model, buy if market < model |
## Variations
- **Multifactor models**: add stochastic convenience yield or stochastic volatility for richer term structure fitting.
- **Black-box / ML models**: fit any model with desirable qualitative properties (e.g., mean reversion) using machine learning, without explicit stochastic dynamics; valid as long as out-of-sample predictive power is demonstrated.
- Combine with roll-yield (Section 9.1) as a complementary signal.
## Notes
- In-sample fit can be excellent even for models with poor predictive power; out-of-sample backtesting is essential (see Paschke and Prokopczuk, 2012).
- Model mis-specification risk: the true dynamics may not be OU; using a flexible model without theoretical grounding is equally valid if it works out-of-sample.
- Parameter instability: κ, a, σ estimated on historical data may shift during structural changes (supply shocks, geopolitical events).
- "Fancy does not equal better" complex models do not necessarily outperform simple ones out-of-sample.

View File

@@ -0,0 +1,51 @@
---
description: "Commodity futures roll-yield strategy that goes long backwardated and short contangoed futures based on the ratio of front-month to second-month prices."
tags: [commodities, futures, roll-yield, term-structure, carry]
---
# Roll Yields
**Section**: 9.1 | **Asset Class**: Commodities | **Type**: Carry / Term Structure
## Overview
When commodity futures are in backwardation (downward-sloping term structure), long futures positions generate positive roll yield because as contracts approach expiry they roll up toward the higher spot price. In contango (upward-sloping term structure), the roll yield is negative. A zero-cost long-short portfolio can be constructed by going long commodities in backwardation and short those in contango.
## Construction / Mechanics
Define the backwardation/contango ratio for each commodity:
```
φ = P₁ / P₂ (454)
```
where P₁ is the front-month futures price and P₂ is the second-month futures price.
- φ > 1: backwardation (front-month > second-month); long futures position earns positive roll yield
- φ < 1: contango (front-month < second-month); short futures position earns positive roll yield
**Portfolio construction:**
- Rank all N commodity futures by φ
- Buy futures with higher values of φ (stronger backwardation)
- Sell futures with lower values of φ (deeper contango)
- Dollar-neutral (zero-cost) implementation
Roll yield is realised when the near-expiry contract is sold (covered) and a longer-dated contract is purchased, or vice versa for short positions.
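A pandas sketch of the ranking rule above, assuming `front` and `second` are Series of front- and second-month futures prices indexed by commodity; the quantile cut-off is an illustrative choice.
```
import pandas as pd

def roll_yield_portfolio(front, second, q=0.2):
    phi = front / second                              # Eq. (454)
    longs = phi[phi >= phi.quantile(1 - q)].index     # strongest backwardation
    shorts = phi[phi <= phi.quantile(q)].index        # deepest contango
    return list(longs), list(shorts)
```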
## Return Profile
Profits from the periodic rolling of positions: as a backwardated contract approaches expiry, its price converges upward to the spot, generating a positive roll return. In contango the opposite holds and short positions benefit. Roll yield is distinct from spot price returns.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| φ = P₁/P₂ | Backwardation ratio; φ > 1 → backwardation, φ < 1 → contango |
| Ranking quantile | Top/bottom quantile cut-off for long/short selection |
| Roll frequency | Determined by contract expiry calendar |
## Variations
- Extend the ratio beyond the first two contracts to capture the broader term structure slope.
- Combine with hedging pressure (Section 9.2) or momentum signals for a multi-factor commodity strategy.
## Notes
- Roll yields can be substantial in commodities with high storage costs (energy) or seasonal supply/demand patterns (agricultural).
- The ratio φ is a snapshot measure; persistent backwardation or contango is more reliable than transient conditions.
- Transaction costs from rolling (bid-ask spreads on each roll) must be weighed against the expected roll yield.
- Convenience yield (the benefit of holding physical inventory) is the economic driver of backwardation in many commodity markets.

View File

@@ -0,0 +1,55 @@
---
description: "Commodity futures strategy that captures the negative skewness premium by buying low-skewness and selling high-skewness commodity futures, exploiting the empirical negative relationship between return skewness and expected returns."
tags: [commodities, futures, skewness, premium, cross-sectional]
---
# Skewness Premium
**Section**: 9.5 | **Asset Class**: Commodities | **Type**: Skewness / Risk Premium
## Overview
There is an empirically observed negative correlation between the skewness of historical returns and future expected returns across commodity futures. Commodities with highly negatively skewed returns have, on average, higher future expected returns, while those with positively skewed returns have lower expected returns. This mirrors the skewness premium observed in equity options markets and reflects investor preference for positive skewness ("lottery" demand).
## Construction / Mechanics
The skewness of returns for commodity i (i = 1,...,N) over T observations is:
```
S_i = (1 / (σ_i³ T)) Σ [R_is - R̄_i]³ (456)
```
where:
```
R̄_i = (1/T) Σ R_is (457)
σ_i² = (1/(T-1)) Σ [R_is - R̄_i]² (458)
```
and R_is are the historical return observations.
**Portfolio construction:**
- Rank all N commodity futures by S_i
- **Buy** futures in the **bottom quintile** by skewness (most negatively skewed, highest expected return)
- **Sell** futures in the **top quintile** by skewness (most positively skewed, lowest expected return)
- Zero-cost portfolio; rebalanced periodically
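A sketch of Eqs. (456)–(458) and the quintile rule above, assuming `returns` is a DataFrame of T return observations (rows) for N commodities (columns); names are illustrative.
```
import pandas as pd

def skewness_portfolio(returns, q=0.2):
    T = len(returns)
    mean = returns.mean()                       # Eq. (457)
    sigma = returns.std(ddof=1)                 # Eq. (458)
    skew = ((returns - mean) ** 3).sum() / (sigma ** 3 * T)    # Eq. (456)
    longs = skew[skew <= skew.quantile(q)].index        # most negatively skewed
    shorts = skew[skew >= skew.quantile(1 - q)].index   # most positively skewed
    return list(longs), list(shorts)
```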
## Return Profile
Profits when the negative skewness-expected return relationship holds out-of-sample: low-skewness (left-tail-heavy) commodities outperform high-skewness (right-tail-heavy) ones. The premium compensates investors for bearing left-tail (crash) risk.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| S_i | Third standardised moment of historical returns |
| T | Estimation window length (number of return observations) |
| Quintile cut-offs | Bottom quintile (buy) vs. top quintile (sell) |
| Rebalancing | Periodic (monthly or quarterly) |
## Variations
- Use option-implied skewness (from commodity options) instead of realised skewness for a forward-looking signal.
- Combine with value (Section 9.4) or roll yield (Section 9.1) in a multi-factor commodity model.
## Notes
- Realised skewness is estimated with substantial noise, particularly for commodities with short or infrequently traded histories.
- The skewness premium can be concentrated in a small number of time periods; the strategy may have poor risk-adjusted returns in normal markets and large gains during commodity stress events.
- Tail risk is inherent in this strategy: buying low-skewness commodities means accepting left-tail exposure.
- Sufficient sample size T is needed for reliable skewness estimates; skewness estimation requires more data than mean or variance estimation.

View File

@@ -0,0 +1,52 @@
---
description: "Commodity value strategy that buys commodities whose current spot price is low relative to their spot price five years ago, and sells those with relatively high current prices."
tags: [commodities, value, mean-reversion, cross-sectional]
---
# Value
**Section**: 9.4 | **Asset Class**: Commodities | **Type**: Value / Mean-Reversion
## Overview
Analogous to the value strategy in equities (Section 3.3), the commodity value strategy is based on the premise that commodities with currently depressed prices relative to their historical levels are cheap and likely to revert upward, while those at elevated prices are expensive and likely to revert downward. The value ratio uses the 5-year-ago spot price as the benchmark for fair value.
## Construction / Mechanics
The value signal for each commodity is defined as:
```
v = P₅ / P₀ (455)
```
where:
- P₅ is the spot price 5 years ago (alternatively, the average spot price between 4.5 and 5.5 years ago)
- P₀ is the current spot price
A high v means the commodity is currently cheap relative to its 5-year-ago price (good value); a low v means the commodity is currently expensive.
**Portfolio construction:**
- Rank all N commodity futures by v
- **Buy** futures in the top tercile by v (cheapest relative to 5-year history)
- **Sell** futures in the bottom tercile by v (most expensive relative to 5-year history)
- Rebalance monthly
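A sketch of Eq. (455) and the tercile rule above, assuming `spot_now` and `spot_5y_ago` are Series indexed by commodity; names are illustrative.
```
import pandas as pd

def commodity_value_portfolio(spot_now, spot_5y_ago):
    v = spot_5y_ago / spot_now                    # Eq. (455): high v -> cheap
    longs = v[v >= v.quantile(2 / 3)].index       # top tercile (cheapest)
    shorts = v[v <= v.quantile(1 / 3)].index      # bottom tercile (most expensive)
    return list(longs), list(shorts)
```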
## Return Profile
Profits when commodity prices exhibit long-term mean reversion to their historical levels. The strategy is contrarian over a 5-year horizon, expecting that extreme deviations from historical prices will eventually correct.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| v = P₅/P₀ | Value ratio; high v → cheap (buy), low v → expensive (sell) |
| Look-back period | 5 years (or average between 4.5 and 5.5 years ago) |
| Portfolio terciles | Top tercile long, bottom tercile short |
| Rebalancing | Monthly |
## Variations
- Use different look-back horizons (e.g., 3 years or 7 years) to capture different mean-reversion cycles.
- Combine value with momentum (e.g., buy commodities with high v AND positive recent momentum) to avoid "value traps".
- Apply to commodity sub-sectors (energy, metals, agriculture) separately to account for different structural price cycles.
## Notes
- Commodity prices are subject to structural breaks (technological change, supply shocks) that can make historical prices poor benchmarks for fair value.
- The 5-year look-back is long enough to smooth business cycle effects but may include obsolete price regimes.
- Unlike equities, commodities have no earnings or book value; the purely price-based value measure has higher model risk.
- Roll costs from maintaining long futures positions in contangoed commodities can erode value-strategy returns.

View File

@@ -0,0 +1,44 @@
---
description: "Buy an undervalued convertible bond and short the underlying stock using a delta-based hedge ratio to capture the mispricing between the convertible's market price and its fair value."
tags: [convertibles, arbitrage]
---
# Convertible Arbitrage
**Section**: 12.1 | **Asset Class**: Convertibles (Hybrid: Fixed Income + Equity) | **Type**: Arbitrage
## Overview
A convertible bond is a hybrid security with an embedded option to convert the bond to a preset number of the issuer's shares (the conversion ratio) when the stock price reaches the conversion price. Empirically, convertibles at issuance tend to be undervalued relative to their fair value, creating arbitrage opportunities. The strategy buys the convertible bond and simultaneously shorts the underlying stock to hedge the equity exposure.
## Construction / Mechanics
The hedge ratio (number of shares to short) is:
```
h = Δ × C (492)
Δ = ∂V/∂S (493)
```
- `C` = conversion ratio (number of shares per bond)
- `V` = value of the conversion option (model-dependent)
- `S` = underlying stock price
- `Δ` = option delta (model-dependent)
The position is typically held for 6–12 months starting at the convertible's issuance date. The hedge ratio is updated daily as delta changes with the stock price.
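A minimal sketch of the hedge ratio in Eqs. (492)–(493); `delta` is assumed to come from whatever model is used to price the conversion option, and the numbers in the example are illustrative.
```
def convertible_hedge_shares(delta, conversion_ratio, n_bonds):
    # Eq. (492): h = delta * C shares to short per bond held
    return delta * conversion_ratio * n_bonds

# e.g. delta 0.55, conversion ratio 20 shares per bond, 500 bonds held
shares_to_short = convertible_hedge_shares(0.55, 20, 500)   # 5,500 shares
```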
## Return Profile
Profits when the market price of the convertible converges toward its theoretical fair value. The long convertible position captures the undervaluation premium. The short stock position hedges directional equity risk, leaving exposure primarily to the convergence of the mispricing.
## Key Parameters / Signals
- **Conversion ratio** `C`: fixed at issuance
- **Delta** `Δ = ∂V/∂S`: requires a model for the conversion option value `V`; changes daily with stock price `S`
- **Gamma hedging**: since delta itself changes with `S`, the option gamma can be used to refine dynamic hedging (see Section 7.4.1)
- **Entry timing**: position typically initiated at issuance when undervaluation is most pronounced
## Variations
- **Gamma hedging overlay**: use gamma to dynamically adjust the hedge ratio as the stock moves, capturing additional convexity profits
## Notes
- Hedge ratios are model-dependent; model risk is a key concern
- Nonparametric hedge estimation from historical data is an alternative to model-based delta (analogous to the constrained regression of MBS price P on swap rate R used in MBS hedging)
- Liquidity risk: convertible bonds are less liquid than the underlying stock
- Crowding risk: convertible arbitrage is a well-known strategy; forced unwinds by other funds can cause losses

View File

@@ -0,0 +1,46 @@
---
description: "Simultaneously buy and sell two convertible bonds from the same issuer, long the higher option-adjusted spread and short the lower, profiting when the spreads converge."
tags: [convertibles, arbitrage, fixed-income]
---
# Convertible Option-Adjusted Spread (OAS) Arbitrage
**Section**: 12.2 | **Asset Class**: Convertibles (Hybrid: Fixed Income + Equity) | **Type**: Relative Value / Arbitrage
## Overview
This strategy simultaneously buys and sells two different convertible bonds issued by the same company. The long position is in the bond with the higher option-adjusted spread (OAS) and the short position is in the bond with the lower OAS. The trade is profitable when the two spreads converge toward each other.
## Construction / Mechanics
The price of a convertible bond is decomposed as:
```
P_C = P_B + V (494)
```
- `P_C` = convertible bond price
- `P_B` = straight bond price (the bond without the embedded option), computed via standard discounting of future cash flows
- `V` = value of the conversion option (a call option on the issuer's stock)
**OAS Calculation Procedure:**
1. At the initial iteration, compute `V^(0)` using a call option pricing model with the zero-coupon government Treasury curve as the risk-free rate
2. Check if `V^(0)` matches the market-implied option value `P_C^mkt - P_B`
3. If not, iteratively parallel-shift the Treasury curve (e.g., using bisection) until the computed `V` equals `P_C^mkt - P_B`
4. The parallel shift obtained is the OAS
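A bisection sketch of the iterative procedure above, assuming `option_value(shift)` reprices the conversion option `V` with the Treasury curve parallel-shifted by `shift`; the search bounds and tolerance are illustrative.
```
def solve_oas(option_value, market_option_value, lo=-0.05, hi=0.05,
              tol=1e-8, max_iter=100):
    # Find the parallel shift at which the model V matches P_C^mkt - P_B
    f = lambda s: option_value(s) - market_option_value
    f_lo = f(lo)
    oas = 0.5 * (lo + hi)
    for _ in range(max_iter):
        oas = 0.5 * (lo + hi)
        f_mid = f(oas)
        if abs(f_mid) < tol:
            break
        # Keep the half-interval that still brackets the sign change
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = oas, f_mid
        else:
            hi = oas
    return oas
```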
## Return Profile
Profits when the OAS of the long bond decreases (price rises) and/or the OAS of the short bond increases (price falls), i.e., when the two spreads converge. Returns are driven by relative mispricing between the two convertibles of the same issuer, not by the absolute level of spreads or interest rates.
## Key Parameters / Signals
- **OAS differential**: the spread between the two bonds' OAS values; wider differential implies larger potential profit but also higher risk if divergence continues
- **Same-issuer requirement**: both bonds must be from the same issuer to neutralize credit risk
- **Convergence horizon**: the expected time for OAS convergence to occur
## Variations
- **Multi-bond basket**: extend to a basket of convertibles from the same issuer, weighting by OAS rank
- **Cross-issuer OAS**: relax the same-issuer constraint and use credit hedges to neutralize issuer-level credit risk
## Notes
- The OAS computation requires an option pricing model for `V`; model risk affects both legs
- The iterative parallel-shift procedure assumes the Treasury curve shape is fixed; actual curve shape changes can affect the OAS estimate
- Liquidity mismatch between the two convertible bonds can create mark-to-market losses even when the fundamental trade thesis is correct
- This strategy is distinct from straight convertible arbitrage (12.1): there is no stock short; both legs are bonds from the same issuer

View File

@@ -0,0 +1,175 @@
---
description: "An artificial neural network (ANN) strategy that forecasts short-term BTC price movements using technical indicators (EMA, EMSD, RSI) as inputs and quantile-based classification as output."
tags: [crypto, machine-learning, ann, bitcoin, technical-analysis]
---
# Artificial Neural Network (ANN) Strategy
**Section**: 18.2 | **Asset Class**: Cryptocurrencies | **Type**: Machine learning / Price prediction
## Overview
This strategy uses an ANN to forecast short-term movements of BTC price based on input technical indicators. Unlike equities, cryptocurrencies have no evident "fundamentals" on which to build value-based strategies, so cryptocurrency trading strategies tend to rely on trend data mining via machine learning techniques. The ANN classifies the future normalized return into quantile buckets and generates buy/sell signals accordingly.
## Construction / Mechanics
### Price and Return Normalization
Let `P(t)` be the BTC price at time `t`, where `t = 1, 2, ...` is measured in some units (e.g., 15-minute intervals; `t = 1` is the most recent time).
**Return:**
```
R(t) = P(t)/P(t+1) - 1 (521)
```
**Serial mean return** over T₁ periods:
```
R_bar(t, T₁) = (1/T₁) * sum_{t'=t+1}^{t+T₁} R(t') (523)
```
**Serially demeaned return:**
```
R_tilde(t, T₁) = R(t) - R_bar(t, T₁) (522)
```
**Variance:**
```
[sigma(t, T₁)]² = (1/(T₁-1)) * sum_{t'=t+1}^{t+T₁} [R_tilde(t', T₁)]² (525)
```
**Normalized (serially demeaned) return:**
```
R_hat(t, T₁) = R_tilde(t, T₁) / sigma(t, T₁) (524)
```
For notational simplicity the T₁ parameter is omitted below and `R_hat(t)` denotes the normalized return. T₁ should be chosen long enough to provide a reasonable volatility estimate.
### Input Layer: Technical Indicators
**Exponential Moving Average (EMA):**
```
EMA(t, lambda, tau) = ((1-lambda)/(1-lambda^tau)) * sum_{t'=t+1}^{t+tau} lambda^{t'-t-1} * R_hat(t') (526)
```
**Exponential Moving Standard Deviation (EMSD):**
```
[EMSD(t, lambda, tau)]² = ((1-lambda)/(lambda - lambda^tau)) * sum_{t'=t+1}^{t+tau} lambda^{t'-t-1} * [R_hat(t') - EMA(t, lambda, tau)]² (527)
```
**Relative Strength Index (RSI):**
```
RSI(t, tau) = nu_+(t, tau) / [nu_+(t, tau) + nu_-(t, tau)] (528)
nu_±(t, tau) = sum_{t'=t+1}^{t+tau} max(±R_hat(t'), 0) (529)
```
Where: `tau` is the moving average length; `lambda` is the exponential smoothing parameter (to reduce parameters, one can set `lambda = (tau-1)/(tau+1)`).
Typically RSI > 0.7 is interpreted as overbought; RSI < 0.3 as oversold.
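A numpy sketch of Eqs. (526)–(529), assuming `r_hat` is an array of normalized returns with `r_hat[0]` corresponding to `t = 1` (the most recent bar), so `R_hat(t')` for `t' = t+1, ..., t+tau` maps to the slice `r_hat[t : t+tau]`; function names are illustrative.
```
import numpy as np

def ema(r_hat, t, tau, lam):
    # Eq. (526): exponentially weighted mean of the tau returns before time t
    # (requires len(r_hat) >= t + tau)
    w = lam ** np.arange(tau)
    return (1 - lam) / (1 - lam ** tau) * np.dot(w, r_hat[t:t + tau])

def emsd(r_hat, t, tau, lam):
    # Eq. (527): exponentially weighted standard deviation around the EMA
    w = lam ** np.arange(tau)
    dev = r_hat[t:t + tau] - ema(r_hat, t, tau, lam)
    return np.sqrt((1 - lam) / (lam - lam ** tau) * np.dot(w, dev ** 2))

def rsi(r_hat, t, tau):
    # Eqs. (528)-(529): share of up-moves in total absolute movement
    window = r_hat[t:t + tau]
    nu_plus = np.maximum(window, 0.0).sum()
    nu_minus = np.maximum(-window, 0.0).sum()
    return nu_plus / (nu_plus + nu_minus)
```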
### Input Layer Construction
The input layer consists of:
- `R_hat(t)` the current normalized return
- `EMA(t, lambda_a, tau_a)` for `a = 1, ..., m`
- `EMSD(t, lambda_a, tau_a)` for `a = 1, ..., m`
- `RSI(t, tau_{a'})` for `a' = 1, ..., m'`
Example parameter choices (from the literature):
- `tau_a` corresponding to 30 min, 1 hr, 3 hrs, 6 hrs (so `m = 4`)
- `tau_{a'}` corresponding to 3 hrs, 6 hrs, 12 hrs (so `m' = 3`)
### Output Layer: Quantile Classification
The objective is to forecast which quantile the future normalized return `R_hat(t)` will belong to.
Let `K` be the number of quantiles. For training dataset `D_train`, compute the `(K-1)` quantile values `q_alpha`, `alpha = 1, ..., K-1`, of `R_hat(t)`, `t in D_train`.
Define supervisory K-vectors `S_alpha(t)`, `alpha = 1, ..., K`:
```
S_1(t) = 1, if R_hat(t) <= q_1
S_alpha(t) = 1, if q_{alpha-1} <= R_hat(t) < q_alpha, for 1 < alpha < K (530)
S_K(t) = 1, if q_{K-1} <= R_hat(t)
S_alpha(t) = 0, otherwise
```
The output layer produces a nonnegative K-vector `p_alpha(t)` of class probabilities:
```
sum_{alpha=1}^{K} p_alpha(t) = 1 (531)
```
### Network Architecture
The ANN has `L` layers labeled `l = 1, ..., L`:
- `l = 1`: input layer
- `l = L`: output layer
- Intermediate layers: hidden layers
At each layer `l`, there are `N^(l)` nodes with vectors `X_vec^(l)` having components `X_{i(l)}^(l)`:
**Forward propagation:**
```
X_{i(l)}^(l) = h_{i(l)}^(l)(Y_vec^(l)), l = 2, ..., L (532)
Y_{i(l)}^(l) = sum_{j(l-1)=1}^{N^(l-1)} A_{i(l)j(l-1)}^(l) * X_{j(l-1)}^(l-1) + B_{i(l)}^(l) (533)
```
Where: `A_{i(l)j(l-1)}^(l)` are the weights; `B_{i(l)}^(l)` are the biases (both determined via training).
**Activation functions:**
Hidden layers use ReLU:
```
h_{i(l)}^(l)(Y_vec^(l)) = max(Y_{i(l)}^(l), 0), l = 2, ..., L-1 (534)
```
Output layer uses softmax (ensuring probabilities sum to 1):
```
h_{i(L)}^(L)(Y_vec^(L)) = Y_{i(L)}^(L) * [sum_{j(L)=1}^{N(L)} Y_{j(L)}^(L)]^{-1} (535)
```
ReLU fires a neuron only if `Y_{i(l)}^(l) > 0`; softmax enforces condition (531).
### Training: Cross-Entropy Loss
The error function to minimize is the cross-entropy:
```
E = - sum_{t in D_train} sum_{alpha=1}^{K} S_alpha(t) * ln(p_alpha(t)) (536)
```
Minimized via stochastic gradient descent (SGD), which iterates until convergence.
### Trading Signal
```
Signal = Buy, iff max(p_alpha(t)) = p_K(t) (537)
Sell, iff max(p_alpha(t)) = p_1(t)
```
The trader buys BTC if the predicted class is `p_K(t)` (the top quantile) and sells if it is `p_1(t)` (the bottom quantile). This rule can be modified, e.g., buy on the top 2 quantiles and sell on the bottom 2.
## Return Profile / Objective
The strategy profits when the ANN correctly classifies the direction and magnitude of short-term BTC price movements. Returns are driven by the quality of the technical indicator signals and the ability of the trained network to generalize out-of-sample. Given BTC's high volatility, even modest directional accuracy can produce significant returns.
## Key Parameters / Signals
- `T₁`: lookback for return normalization (volatility estimation window)
- `tau_a`: EMA/EMSD lookback periods (e.g., 30 min, 1 hr, 3 hrs, 6 hrs)
- `tau_{a'}`: RSI lookback periods (e.g., 3 hrs, 6 hrs, 12 hrs)
- `lambda`: exponential smoothing factor; can be set to `(tau-1)/(tau+1)`
- `K`: number of quantile classes (e.g., K=2 for simple up/down)
- `N^(l)`: number of nodes at each hidden layer
- `L`: total number of layers
- `d_1`: number of most-recent time points excluded from training data to ensure all indicators are computed on sufficient data
## Variations
- **K=2 binary classification**: simple up/down forecast; buy/sell signal directly
- **K>2 multi-quantile**: more granular signal strength; trade only on extreme quantiles
- **Extended indicator set**: add MACD, Bollinger Bands, volume indicators to the input layer
- **LSTM/RNN variant**: replace feedforward ANN with recurrent architecture to better capture time-series dependencies
## Notes
The primary risk is overfitting: many free parameters (`tau_a`, `lambda_a`, `tau_{a'}`, `N^(l)`, `K`) must be chosen, and the ever-present danger of overfitting them necessitates careful out-of-sample backtesting. The training dataset must exclude the most recent `d_1` time points to ensure all EMA, EMSD, and RSI values are computed using the required number of data points. This strategy is conceptually similar to the single-stock KNN trading strategy (Section 3.17) but uses an ANN instead of k-nearest neighbors. No fundamental valuation of BTC is implied.

View File

@@ -0,0 +1,128 @@
---
description: "A naïve Bayes Bernoulli classifier applied to Twitter sentiment data to forecast BTC price direction, generating buy/sell signals from keyword-frequency feature vectors."
tags: [crypto, machine-learning, nlp, sentiment, bitcoin, naive-bayes]
---
# Sentiment Analysis — Naïve Bayes Bernoulli
**Section**: 18.3 | **Asset Class**: Cryptocurrencies | **Type**: Machine learning / NLP sentiment
## Overview
This strategy applies a social media sentiment analysis classification scheme to forecast the direction (or quantile) of BTC price movements based on Twitter data. It uses the naïve Bayes Bernoulli model to classify tweets into outcome classes and generate trading signals. The premise is that aggregate social media sentiment contains predictive information about short-term crypto price movements.
## Construction / Mechanics
### Data Collection and Preprocessing
1. Collect all tweets containing at least one keyword from a pertinent learning vocabulary `V` over some timeframe
2. Clean data: remove duplicate tweets from bots, remove stop-words (e.g., "the", "is", "in", "which"), perform stemming (reduce words to base forms, e.g., "investing" and "invested" → "invest")
3. Stemming can be performed using the Porter stemming algorithm
Let:
- `M = |V|` = number of keywords in the learning vocabulary
- `N` = number of tweets in the dataset
- `i = 1, ..., N` labels tweets
- `a = 1, ..., M` labels words `w_a` in `V`
### Feature Vector Construction (Bernoulli Model)
Assign a feature M-vector `X_i` to each tweet `i`:
**Bernoulli (binary presence/absence):**
```
X_{ia} = 0 if word w_a not present in tweet T_i
X_{ia} = 1 if word w_a is present in tweet T_i (Bernoulli)
```
Alternative (multinomial): `X_{ia} = n_{ia}`, the number of times `w_a` appears in `T_i`.
The Bernoulli case is the focus of this strategy.
### Classification Framework
Define `K` outcome classes `C_alpha`, `alpha = 1, ..., K`:
- Simplest case: `K = 2` (BTC goes up or down) — provides buy/sell signal
- Alternative: `K` quantiles of the normalized return `R_hat(t)` (as in the ANN strategy, Section 18.2)
**Goal:** given the `N` feature vectors `X_1, ..., X_N`, predict class `C_alpha`.
### Bayesian Foundation
By Bayes' theorem:
```
P(A|B) = P(B|A) * P(A) / P(B) (538)
```
The posterior probability of class `C_alpha` given features `X_1, ..., X_N`:
```
P(C_alpha | X_1, ..., X_N) = P(X_1, ..., X_N | C_alpha) * P(C_alpha) / P(X_1, ..., X_N) (539)
```
Note `P(X_1, ..., X_N)` is independent of `C_alpha` and acts only as a normalization constant.
### Naïve (Conditional Independence) Assumption
The naïve Bayes simplification assumes that for a given class `C_alpha`, all features `X_i` are conditionally independent:
```
P(X_i | C_alpha, X_1, ..., X_{i-1}, X_{i+1}, ..., X_N) = P(X_i | C_alpha) (540)
```
This gives:
```
P(C_alpha | X_1, ..., X_N) = gamma * P(C_alpha) * prod_{i=1}^{N} P(X_i | C_alpha) (541)
gamma = 1 / P(X_1, ..., X_N) (542)
```
### Bernoulli Likelihood
For the Bernoulli model, the conditional probability of feature vector `X_i` given class `C_alpha`:
```
P(X_i | C_alpha) = prod_{a=1}^{M} Q_{ia alpha} (543)
```
Where:
```
Q_{ia alpha} = P(w_a | C_alpha), if X_{ia} = 1 (544)
Q_{ia alpha} = 1 - P(w_a | C_alpha), if X_{ia} = 0 (545)
```
The conditional probabilities `P(w_a | C_alpha)` are estimated from word occurrence frequencies in the training data. Similarly, `P(C_alpha)` is estimated from the training data class frequencies.
### Prediction Rule
Set the forecasted class `C_pred` to the one with maximum posterior probability:
```
C_pred = argmax_{C_alpha in {1,...,K}} P(C_alpha) * prod_{i=1}^{N} prod_{a=1}^{M} [P(w_a | C_alpha)]^{X_{ia}} * [1 - P(w_a | C_alpha)]^{1 - X_{ia}} (546)
```
### Trading Signal
- For `K = 2`: `C_pred = 1` → Sell; `C_pred = 2` → Buy (consistent with ANN signal convention)
- For `K` quantiles: trade on extreme quantile predictions analogously to the ANN strategy
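A numpy sketch of the Bernoulli model in Eqs. (543)–(546) with Laplace smoothing (see Notes), assuming `X` is the binary tweet-by-keyword matrix and `y` assigns each training tweet the class of the window it falls in; names are illustrative.
```
import numpy as np

def train_bernoulli_nb(X, y, n_classes, alpha=1.0):
    # P(C_alpha) from class frequencies; P(w_a | C_alpha) from smoothed
    # keyword-presence frequencies within each class (Laplace pseudo-count alpha)
    priors = np.array([(y == c).mean() for c in range(n_classes)])
    word_probs = np.array([
        (X[y == c].sum(axis=0) + alpha) / ((y == c).sum() + 2 * alpha)
        for c in range(n_classes)
    ])
    return priors, word_probs

def predict_window(X_new, priors, word_probs):
    # Eq. (546) in log space: one predicted class for the whole tweet window
    log_lik = (X_new @ np.log(word_probs).T
               + (1 - X_new) @ np.log(1 - word_probs).T)
    return int(np.argmax(np.log(priors) + log_lik.sum(axis=0)))
```
For `K = 2`, a predicted class index of 0 maps to Sell and 1 to Buy under the convention above (the document's 1-indexed classes shifted to 0-indexed arrays).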
## Return Profile / Objective
Returns are driven by the predictive content of Twitter sentiment about short-term BTC price direction. The strategy profits when aggregate social media tone (bullish vs. bearish language in the vocabulary) correlates with subsequent price movements. High crypto volatility means that even modest accuracy gains over random can generate meaningful returns.
## Key Parameters / Signals
- **Vocabulary `V`**: the set of keywords relevant to BTC price forecasting; quality of `V` critically affects performance
- **`K`**: number of outcome classes (2 for binary up/down, or quantile-based)
- **`M = |V|`**: vocabulary size; larger vocabulary increases model complexity and overfitting risk
- **Training window**: timeframe for estimating `P(C_alpha)` and `P(w_a | C_alpha)`
- **Stemming algorithm**: Porter or similar; affects effective vocabulary size
- **Stop-word list**: removal of common non-informative words reduces noise
## Variations
- **Multinomial naïve Bayes**: uses word count `n_{ia}` instead of binary presence; can better capture emphasis
- **Support vector machine (SVM)**: alternative classifier on the same feature vectors
- **Logistic regression**: another popular alternative to naïve Bayes for text classification
- **Tree boosting**: gradient boosting applied to tweet feature vectors for BTC direction prediction
- **Multi-source sentiment**: extend beyond Twitter to Reddit (r/bitcoin), news feeds, Telegram channels
## Notes
The naïve (conditional independence) assumption is rarely exactly true but often works well in practice for text classification. The key challenge is vocabulary construction — the learning vocabulary `V` must be chosen to be informative about BTC price movements specifically, not just generally related to Bitcoin. Laplace smoothing (adding a pseudo-count) is typically applied to avoid zero probabilities for words not observed in a given class. Compared to the ANN strategy, naïve Bayes is more interpretable and computationally cheaper to train. Social media strategies are vulnerable to coordinated manipulation (pumping sentiment with bots) and regime changes in platform usage patterns.

View File

@@ -0,0 +1,50 @@
---
description: "Actively acquire distressed assets with the goal of obtaining management control, then drive reorganization through planning a restructuring, buying outstanding debt for equity conversion, or providing secured loans that convert to equity upon reorganization."
tags: [distressed-assets, credit, active-investing, private-equity]
---
# Active Distressed Investing
**Section**: 15.2 | **Asset Class**: Distressed Assets | **Type**: Active / Control-Oriented
## Overview
Unlike passive distressed debt buying (Section 15.1), active distressed investing aims to acquire sufficient ownership or control to influence the management and direction of the distressed company. When a company faces financial distress, it can file for Chapter 11 bankruptcy protection to reorganize under U.S. court supervision, or it can work directly with creditors outside of court. Active investors participate in or drive this reorganization process to generate returns.
## Construction / Mechanics
The active investor accumulates a significant position in the distressed company's debt or equity to obtain standing and leverage in the reorganization process. Larger debt holders tend to submit more competitive reorganization plans. Three primary sub-strategies are described below (as Variations).
## Return Profile
Returns are driven by the successful increase in the company's enterprise value through the reorganization process. Active investors can extract value through:
- Control of the reorganized entity
- Conversion of debt to equity at favorable terms
- Secured loan conversion to equity with control rights
Returns can be very large but require significant legal, operational, and financial expertise. The investment horizon is typically multi-year.
## Key Parameters / Signals
- **Debt claim size**: larger positions give more influence in Court and in out-of-court negotiations
- **Debt seniority**: senior secured creditors have more leverage; subordinated debt holders have less but more upside on equity conversion
- **Chapter 11 vs. out-of-court**: Chapter 11 provides a formal legal framework; out-of-court is faster but requires creditor consensus
- **Enterprise valuation**: ability to assess the reorganized company's value is critical to determine fair exchange ratios
## Variations
### 15.2.1 Planning a Reorganization
An investor submits a reorganization plan to Court with the objective of obtaining participation in the management of the company, increasing its value, and generating profits. Plans submitted by significant debt holders tend to be more competitive and are more likely to be approved by the Court and creditors.
### 15.2.2 Buying Outstanding Debt
The investor buys outstanding debt of the distressed firm at a discount with the view that, after reorganization, part of this debt will be converted into the firm's equity, thereby giving the investor a certain level of control of the reorganized company. The discount at which debt is purchased represents the potential upside if the company's equity value post-reorganization exceeds the implied value in the purchase price.
### 15.2.3 Loan-to-Own
The investor finances (via secured loans) a distressed firm that is not yet bankrupt, with the view that the firm either:
- (i) overcomes its distress, avoids bankruptcy, and increases its equity value (the secured loan is repaid at a premium), or
- (ii) files for Chapter 11 protection, in which case the secured loan is converted into the firm's equity with control rights upon reorganization
This strategy is particularly attractive when the investor believes the firm's assets are worth more under new management.
## Notes
- Requires deep operational, legal, and financial restructuring expertise; typically executed by specialist hedge funds and private equity firms
- Intercreditor conflicts: multiple classes of creditors with competing interests can delay or complicate the reorganization
- Regulatory considerations: acquiring control of a company through debt conversion may trigger regulatory approvals (antitrust, industry-specific)
- Time horizon uncertainty: Chapter 11 cases can last from months to several years; secured loans may be tied up indefinitely if bankruptcy is protracted
- The "loan-to-own" strategy (15.2.3) has attracted regulatory scrutiny in some jurisdictions as potentially predatory toward the distressed borrower

View File

@@ -0,0 +1,44 @@
---
description: "Passively buy a diversified portfolio of deeply discounted distressed debt (yield spread >1,000 bps over Treasuries) and hold through reorganization, expecting high returns on the subset of positions that recover."
tags: [distressed-assets, credit, fixed-income, value]
---
# Buying and Holding Distressed Debt (Passive)
**Section**: 15.1 | **Asset Class**: Distressed Assets (Fixed Income / Credit) | **Type**: Value / Passive
## Overview
Distressed securities are those whose issuers are undergoing financial or operational distress, default, or bankruptcy. A common definition of distressed debt is when the yield spread between the issuer's bonds and Treasury bonds exceeds a preset threshold (e.g., 1,000 basis points). This passive strategy buys distressed debt at a steep discount and holds it, expecting (hoping) the company will repay its debt. The portfolio is diversified across industries, entities, and debt seniority levels.
## Construction / Mechanics
- **Definition**: distressed debt = yield spread over Treasuries > ~1,000 basis points
- **Diversification**: spread across industries, issuers, and debt seniority levels (senior secured, senior unsecured, subordinated)
- **Entry timing**: two common approaches:
1. At the end of the default month
2. At the end of the bankruptcy-filing month
— both aim to exploit overreaction in the distressed debt market at these key dates
- **Hold**: position is held passively through the reorganization/recovery process
Passive strategies may also use models (see Section 15.3) to pre-screen assets and predict which companies are likely to declare bankruptcy, selecting only those positioned for successful reorganization.
## Return Profile
Only a small fraction of held assets are expected to have positive returns, but those that do provide high rates of return (e.g., full par recovery from a deeply discounted purchase). Returns are highly skewed and non-normal. The driver of returns is successful company reorganization — either an out-of-court debt restructuring or a Chapter 11 bankruptcy reorganization.
## Key Parameters / Signals
- **Yield spread threshold**: typically >1,000 bps over comparable-maturity Treasuries as a distress indicator
- **Entry timing**: end of default month or end of bankruptcy-filing month captures the overreaction premium
- **Debt seniority**: senior secured debt has higher recovery rates; subordinated debt offers higher upside if the company fully recovers
- **Industry and issuer diversification**: essential due to high idiosyncratic default risk; a single large default can dominate portfolio returns
- **Bankruptcy prediction models**: logistic regression or similar models on financial ratios to pre-screen for likely successful reorganizations (see Section 15.3)
## Variations
- **Focus on defaults**: buy at the end of the default month, targeting market overreaction to default events
- **Focus on bankruptcy filings**: buy at the end of the bankruptcy-filing month, targeting overreaction to Chapter 11 filings
- **Seniority-focused**: concentrate in senior secured debt for higher recovery certainty (lower return, lower variance)
## Notes
- Illiquidity: distressed debt is highly illiquid; exit before resolution may require large price concessions
- Workout timeline: bankruptcy proceedings can take years; capital is tied up for an uncertain duration
- Legal complexity: debt holders in bankruptcy proceedings face complex intercreditor disputes, cram-down risks, and professional fees
- The expected value of the total portfolio is positive but depends heavily on the few positions that recover fully
- This is a passive strategy — the investor does not seek to influence the reorganization process (contrast with Section 15.2)

View File

@@ -0,0 +1,58 @@
---
description: "Exploit the distress risk puzzle by going long the safest (lowest bankruptcy probability) stocks and short the riskiest, constructing a zero-cost HMD (healthy-minus-distressed) portfolio rebalanced monthly."
tags: [distressed-assets, equities, factor-investing, long-short]
---
# Distress Risk Puzzle
**Section**: 15.3 | **Asset Class**: Distressed Assets (Equities) | **Type**: Factor / Long-Short
## Overview
Early studies suggested that companies more prone to bankruptcy offer higher returns as a risk premium. However, more recent and robust studies find the opposite: distressed (high bankruptcy probability) companies do not outperform healthier ones, and healthier companies actually offer higher returns. This is the "distress risk puzzle." The strategy exploits it by buying the safest companies and selling the riskiest (a healthy-minus-distressed, or HMD, zero-cost long-short portfolio).
## Construction / Mechanics
1. **Estimate bankruptcy probability** `P_i` for each stock `i = 1, ..., N` using, e.g., logistic regression on financial variables:
```
logit(P_i) = β_0 + β_1 × (leverage) + β_2 × (profitability) + ...
```
2. **Sort stocks** into deciles by `P_i`
3. **Construct zero-cost portfolio**:
- **Short** the top decile (highest `P_i` — most distressed)
- **Long** the bottom decile (lowest `P_i` — healthiest)
4. **Rebalance** monthly (annual rebalancing produces similar returns)
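A minimal sketch of one rebalance of the decile construction, assuming the bankruptcy probabilities `P_i` have already been fitted upstream (e.g., by the logistic regression above); names are illustrative:
```python
import pandas as pd

def hmd_return(bankruptcy_prob: pd.Series, next_period_returns: pd.Series) -> float:
    """One rebalance of the healthy-minus-distressed (HMD) portfolio.

    bankruptcy_prob:     estimated P_i per stock
    next_period_returns: realized returns of the same stocks over the holding period
    """
    deciles = pd.qcut(bankruptcy_prob, 10, labels=False)
    long_leg = next_period_returns[deciles == 0].mean()    # lowest P_i: healthiest decile
    short_leg = next_period_returns[deciles == 9].mean()   # highest P_i: most distressed decile
    return long_leg - short_leg                            # zero-cost HMD return
```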
## Return Profile
The HMD portfolio profits when healthy stocks outperform distressed stocks, which empirical evidence suggests happens persistently. Returns are driven by the cross-sectional spread in returns between the safest and riskiest firms. The strategy has equity-like volatility and is exposed to periods of market stress.
## Key Parameters / Signals
- **Bankruptcy probability `P_i`**: core signal; modeled via logistic regression or other classification methods on financial variables (leverage, profitability, coverage ratios, market-to-book, etc.)
- **Decile cutoffs**: top and bottom decile are standard; tighter cutoffs increase signal strength but reduce breadth
- **Rebalancing frequency**: monthly is standard; annual rebalancing yields similar returns with lower turnover
## Variations
### 15.3.1 Distress Risk Puzzle — Risk Management
The standard HMD strategy has a high time-varying market beta that turns significantly negative following market downturns (associated with increased volatility). This can cause large losses when the market rebounds abruptly. To mitigate this, a volatility-scaled version is used:
```
HMD* = (σ_target / σ_hat) × HMD (519)
```
- `HMD` = standard HMD portfolio return (from Section 15.3)
- `σ_target` = target volatility level, typically 10%-15% (set per trader preferences)
- `σ_hat` = estimated realized volatility over the prior year using daily data
**Interpretation**:
- If `σ_hat = σ_target`: 100% of the investment is allocated
- If `σ_hat > σ_target`: allocation is reduced below 100% (de-leverage in high-vol regimes)
- If `σ_hat < σ_target`: allocation exceeds 100% (leverage in low-vol regimes)
Alternatively, the allocation can be capped at 100% (`min(σ_target / σ_hat, 1)`) to avoid leverage entirely.
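A minimal sketch of the volatility-scaled allocation in the spirit of Eq. (519), with an assumed 12% annualized target and a trailing one-year window of daily HMD returns:
```python
import numpy as np

def vol_scaled_allocation(hmd_daily_returns, sigma_target=0.12, cap_at_100pct=True):
    """Volatility-scaled HMD allocation: sigma_target / sigma_hat, optionally capped at 1."""
    sigma_hat = np.std(hmd_daily_returns[-252:], ddof=1) * np.sqrt(252)  # trailing 1-year realized vol
    weight = sigma_target / sigma_hat
    return min(weight, 1.0) if cap_at_100pct else weight
```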
## Notes
- The distress risk puzzle is a well-documented anomaly but its persistence is debated; it may partly reflect data-mining or survivorship bias
- The HMD strategy has high time-varying beta to the equity market; risk management via volatility scaling (15.3.1) is essential for production use
- Bankruptcy probability models require regular recalibration; financial ratios used as inputs are sensitive to accounting changes
- Short-selling the most distressed stocks can be expensive (high borrow costs) and difficult (low float, high short interest)
- Regulatory restrictions on short selling may limit implementation in certain markets or during market stress periods
- Similar time-varying beta behavior is observed in other factor-based strategies (momentum, value, etc.), suggesting a common structural risk

View File

@@ -0,0 +1,46 @@
---
description: "Rotates into sector ETFs with the highest Jensen's alpha estimated from a Fama-French factor regression, replacing raw cumulative return momentum with factor-adjusted alpha as the ranking signal."
tags: [etfs, alpha, momentum, fama-french, rotation]
---
# Alpha Rotation
**Section**: 4.2 | **Asset Class**: ETFs | **Type**: Momentum / Factor-Adjusted Rotation
## Overview
Alpha rotation is structurally the same as the sector momentum rotation strategy (Section 4.1), but replaces cumulative ETF returns `R_i^cum` with ETF alphas `alpha_i`. These alphas are the Jensen's alpha regression coefficients from a serial regression of each ETF's returns on the Fama-French factors, representing the ETF's return unexplained by common risk factors.
## Construction / Signal
Run a serial regression of ETF returns `R_i(t)` on the 3 Fama-French factors (MKT, SMB, HML):
```
R_i(t) = alpha_i + beta_{1,i} MKT(t) + beta_{2,i} SMB(t) + beta_{3,i} HML(t) + epsilon_i(t) (364)
```
The regression coefficient `alpha_i` (Jensen's alpha) corresponds to the intercept and measures the ETF's risk-adjusted excess return relative to the Fama-French model. This alpha replaces `R_i^cum` as the ranking criterion.
ETFs are ranked by `alpha_i` (descending). Buy top-decile ETFs (highest alpha) and optionally short bottom-decile ETFs (lowest/most-negative alpha).
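A minimal sketch of the alpha estimate via ordinary least squares on Eq. (364); the factor panel is assumed to be aligned with the ETF's return series:
```python
import numpy as np

def jensens_alpha(etf_returns, factor_returns):
    """Intercept of the Fama-French regression in Eq. (364).

    etf_returns:    (T,) array of daily or weekly ETF returns
    factor_returns: (T, 3) array with columns [MKT, SMB, HML]
    """
    X = np.column_stack([np.ones(len(factor_returns)), factor_returns])  # intercept + factors
    coef, *_ = np.linalg.lstsq(X, etf_returns, rcond=None)
    return coef[0]  # alpha_i: the ranking criterion for rotation
```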
## Entry / Exit Rules
- **Entry**: At rebalance, estimate alpha for each ETF over the estimation period; rank and enter positions in top-decile (long) and optionally bottom-decile (short).
- **Exit**: Hold for the standard holding period; rebalance at next scheduled interval.
- **Estimation period**: Typically 1 year; returns `R_i(t)` are daily or weekly.
## Key Parameters
- **Factor model**: 3 Fama-French factors (MKT, SMB, HML); note alpha here is Jensen's alpha for ETF returns, not mutual fund alpha
- **Estimation period**: Typically 1 year
- **Return frequency for regression**: Daily or weekly `R_i(t)`
- **Holding period**: Same as sector momentum rotation (1-3 months)
- **Ranking criterion**: `alpha_i` (intercept of Fama-French regression)
## Variations
- **4-factor model**: Add Carhart momentum factor MOM(t) to regression for a 4-factor alpha
- **R-squared augmentation**: Combine alpha ranking with R-squared selectivity measure (see Section 4.3)
- **Long-only**: Buy only top-decile ETFs by alpha
## Notes
- Estimation period is typically 1 year with daily or weekly return data.
- Jensen's alpha here is defined for ETF returns (not mutual fund returns as in Jensen, 1968).
- Alpha rotation is analytically cleaner than raw momentum rotation as it removes systematic factor exposures from the ranking signal.
- The MA filter and dual-momentum variations from Section 4.1.1 and 4.1.2 can also be applied here.
- Can be combined with R-squared (Section 4.3) to further refine ETF selection.

View File

@@ -0,0 +1,49 @@
---
description: "Exploits the negative drift of leveraged ETF pairs by simultaneously shorting both a leveraged ETF and its inverse counterpart tracking the same index, capturing decay from daily rebalancing compounding."
tags: [etfs, leveraged-etf, letf, short-volatility, decay]
---
# Leveraged ETFs (LETFs)
**Section**: 4.5 | **Asset Class**: ETFs | **Type**: Short-Volatility / Structural Decay
## Overview
A leveraged (or inverse) ETF seeks to deliver a fixed multiple (2x, 3x) or the inverse (-1x, -2x, -3x) of the daily return of its underlying index. To maintain the target daily leverage, LETFs must rebalance every day — buying when the market is up and selling when it is down. This daily rebalancing creates a negative drift (volatility decay) in the long run, which can be exploited by shorting both a leveraged ETF and its corresponding leveraged inverse ETF on the same underlying index.
## Construction / Signal
A leveraged ETF with leverage factor L rebalances daily to maintain L × (daily index return). This requires:
- **On up days**: Buy more of the underlying index
- **On down days**: Sell the underlying index
The compounding of daily returns with daily rebalancing creates a path-dependent negative drift over time:
```
LETF cumulative return < L × (index cumulative return) [for L > 1 or L < -1]
```
**Strategy**: Short both a leveraged ETF (e.g., 2x) and its leveraged inverse ETF (-2x) on the same underlying index. Both positions decay in value over time due to daily rebalancing, generating profit from the combined negative drift.
Proceeds from both short positions can be invested in an uncorrelated asset (e.g., a Treasury ETF).
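A small simulation sketch of the short pair on a synthetic zero-drift index (illustrative parameters), showing the decay captured in a range-bound market:
```python
import numpy as np

def short_letf_pair_pnl(index_daily_returns, leverage=2.0):
    """P&L from shorting $1 each of an L-x ETF and its -L-x counterpart at inception,
    both rebalanced daily to the target leverage."""
    letf = np.cumprod(1.0 + leverage * index_daily_returns)[-1]
    inverse = np.cumprod(1.0 - leverage * index_daily_returns)[-1]
    return 2.0 - (letf + inverse)   # positive when both legs have decayed

rng = np.random.default_rng(0)
sideways = rng.normal(0.0, 0.012, size=252)         # zero-drift index, ~19% annualized vol
print(short_letf_pair_pnl(sideways, leverage=2.0))  # typically positive in range-bound markets
```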
## Entry / Exit Rules
- **Entry**: Simultaneously short a leveraged ETF (e.g., 2x long) and its corresponding inverse leveraged ETF (e.g., 2x inverse) on the same underlying index.
- **Exit**: Positions are held as long as both ETFs continue to decay; may require periodic rebalancing of the short pair as relative prices change.
- **Capital deployment**: Invest the short proceeds into a Treasury ETF or other low-risk asset.
## Key Parameters
- **Leverage factor**: 2x or 3x (and their -2x or -3x inverses)
- **Underlying index**: Same index for both the leveraged and inverse leveraged ETF
- **Rebalancing of short pair**: Periodically rebalance the short positions to maintain equal dollar exposure
- **Volatility regime**: Decay is larger in high-volatility regimes
## Variations
- **3x pair**: Short a 3x leveraged ETF and its -3x inverse (higher decay, higher risk)
- **Single-leg short**: Short only the leveraged (not inverse) ETF when directional bias exists
- **Volatility regime filter**: Enter positions only in high-volatility environments where decay is expected to be larger
## Notes
- The negative drift from daily rebalancing accrues over time for both the leveraged and inverse ETF, and is largest in volatile, range-bound markets, making this a structural (not purely alpha-dependent) source of return.
- **Key risk**: In the short term, if one leg of the short pair (e.g., the inverse ETF) has a large positive return (the market rallies strongly), the short position in the inverse ETF suffers a sizable loss. This short-term risk can be significant even though the long-term drift is negative.
- The strategy can have a significant downside in the short term if one short leg moves sharply against the position.
- Transaction costs (borrow costs for short selling LETFs, bid-ask spreads) must be carefully considered; LETF borrow rates can be elevated.
- Volatility decay is proportional to variance: approximately `L(L-1)/2 × sigma^2` per period for a leverage factor L.

View File

@@ -0,0 +1,57 @@
---
description: "Constructs a dollar-neutral ETF portfolio by selling ETFs with high Internal Bar Strength (IBS, close near daily high) and buying ETFs with low IBS (close near daily low), exploiting short-term mean-reversion."
tags: [etfs, mean-reversion, ibs, internal-bar-strength]
---
# Mean-Reversion (ETFs)
**Section**: 4.4 | **Asset Class**: ETFs | **Type**: Mean-Reversion
## Overview
This strategy applies mean-reversion to ETFs using the Internal Bar Strength (IBS) indicator, derived from the previous day's close, high, and low prices. ETFs with a close near their daily high (high IBS) are considered "rich" and likely to revert downward; ETFs with a close near their daily low (low IBS) are "cheap" and likely to revert upward. A dollar-neutral portfolio sells high-IBS ETFs and buys low-IBS ETFs.
## Construction / Signal
**Internal Bar Strength (IBS)**:
```
IBS = (P_C - P_L) / (P_H - P_L) (370)
```
Where:
- `P_C` = previous day's closing price
- `P_H` = previous day's high price
- `P_L` = previous day's low price
IBS ranges from 0 to 1:
- IBS close to 1: price closed near the daily high → ETF is "rich"
- IBS close to 0: price closed near the daily low → ETF is "cheap"
An equivalent symmetric measure: `Y = IBS - 1/2 = (P_C - P_*) / (P_H - P_L)` where `P_* = (P_H + P_L) / 2`; Y ranges from -1/2 to +1/2.
**Portfolio construction**:
- Sort ETFs cross-sectionally by IBS.
- Sell ETFs in the top decile (high IBS, "rich").
- Buy ETFs in the bottom decile (low IBS, "cheap").
- Dollar-neutral construction.
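A minimal sketch of the daily signal and decile weights, assuming one row of previous-day close/high/low per ETF (column names are illustrative):
```python
import pandas as pd

def ibs_weights(prev_day: pd.DataFrame) -> pd.Series:
    """Dollar-neutral decile weights from IBS (Eq. 370).

    prev_day: one row per ETF with columns 'close', 'high', 'low' from the previous session.
    """
    ibs = (prev_day['close'] - prev_day['low']) / (prev_day['high'] - prev_day['low'])
    deciles = pd.qcut(ibs, 10, labels=False)
    weights = pd.Series(0.0, index=ibs.index)
    weights[deciles == 0] = 1.0 / (deciles == 0).sum()    # long "cheap" ETFs (close near the low)
    weights[deciles == 9] = -1.0 / (deciles == 9).sum()   # short "rich" ETFs (close near the high)
    return weights
```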
## Entry / Exit Rules
- **Entry**: Each day after the close, compute IBS for all ETFs, rank, and enter positions for the next day's open or close.
- **Exit**: Typically hold for 1 day (short-term mean-reversion); close at next day's close.
- **Rebalance**: Daily.
## Key Parameters
- **IBS computation**: Daily, using previous day's high, low, and close
- **Holding period**: Short-term (typically 1 day)
- **Portfolio construction**: Dollar-neutral long/short decile
- **Weights**: Uniform for all long and all short ETFs, or volatility-weighted
## Variations
- **Volatility-weighted positions**: Weight positions by historical ETF volatility rather than equal-weighting
- **Stock mean-reversion methods**: Mean-reversion strategies from Section 3 (cluster, weighted regression) can also be adapted to ETFs
- **IBS threshold**: Instead of top/bottom decile, use a fixed IBS threshold (e.g., IBS > 0.8 = short, IBS < 0.2 = long)
## Notes
- IBS is a simple, daily-bar indicator requiring only OHLC (open-high-low-close) data.
- Mean-reversion in ETFs can be stronger than in individual stocks because ETFs represent diversified baskets where idiosyncratic volatility is reduced, and market-maker arbitrage constrains large deviations from NAV.
- Holding period is very short (1 day); transaction costs can be significant for daily rebalancing.
- The strategy can be combined with other signals (e.g., sector momentum) for confirmation.
- All stock-based mean-reversion strategies (clusters, weighted regression) can be adapted for ETF universes.

View File

@@ -0,0 +1,70 @@
---
description: "Builds a long-only trend-following portfolio across multiple asset classes using ETFs, allocating weights proportional to cumulative momentum and optionally risk-adjusted by historical volatility, with an optional MA filter."
tags: [etfs, trend-following, multi-asset, momentum, long-only]
---
# Multi-Asset Trend Following
**Section**: 4.6 | **Asset Class**: ETFs | **Type**: Trend-Following / Multi-Asset
## Overview
ETFs allow efficient diversification across sectors, countries, asset classes, and factors in a relatively small number of instruments. This strategy constructs a long-only trend-following portfolio across multiple ETFs (and thus multiple asset classes) by allocating weights based on cumulative momentum, optionally filtered by a moving average, and weighted by historical volatility to manage risk.
## Construction / Signal
**Step 1 — Compute cumulative returns** over a T-month formation period (T = 6-12 months):
```
R_i^cum = P_i(t) / P_i(t+T) - 1
```
**Step 2 — Filter**: Keep only ETFs with positive `R_i^cum` (positive momentum required for long-only).
**Step 3 — Optional MA filter**: Additionally keep only ETFs whose last closing price P_i exceeds their moving average MA_i(T') (typically T' = 100-200 days):
```
P_i > MA_i(T')
```
**Step 4 — Assign weights** to all surviving ETFs (not just top decile, since the universe is small):
Option A — proportional to cumulative return:
```
w_i = gamma_1 * R_i^cum (371)
```
Option B — momentum divided by volatility (Sharpe-like weighting):
```
w_i = gamma_2 * R_i^cum / sigma_i (372)
```
Option C — momentum divided by variance (Sharpe ratio optimization for diagonal covariance):
```
w_i = gamma_3 * R_i^cum / sigma_i^2 (373)
```
where `sigma_i` is historical ETF volatility and normalization coefficients `gamma_1`, `gamma_2`, `gamma_3` are computed to satisfy `sum_{i=1}^{N} w_i = 1` (N = number of ETFs with nonzero weights after filtering).
Option C (Eq. 373) optimizes the Sharpe ratio of the ETF portfolio assuming a diagonal covariance matrix `C_ij = diag(sigma_i^2)` (ignoring cross-ETF correlations).
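A sketch of the three weighting options (Eqs. 371-373) applied to the ETFs that survive the filters; inputs are assumed to be aligned arrays of cumulative returns and historical volatilities:
```python
import numpy as np

def trend_following_weights(cum_returns, vols, scheme="sharpe"):
    """Weights over surviving ETFs (positive momentum, MA filter already applied)."""
    cum_returns = np.asarray(cum_returns, dtype=float)
    vols = np.asarray(vols, dtype=float)
    if scheme == "momentum":          # Eq. (371): proportional to cumulative return
        raw = cum_returns
    elif scheme == "vol_adjusted":    # Eq. (372): momentum / volatility
        raw = cum_returns / vols
    elif scheme == "sharpe":          # Eq. (373): momentum / variance (diagonal-covariance optimum)
        raw = cum_returns / vols**2
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return raw / raw.sum()            # gamma normalization so that sum(w_i) = 1
```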
## Entry / Exit Rules
- **Entry**: At each rebalance, apply momentum and MA filters, compute weights, and enter long positions in all surviving ETFs.
- **Exit**: Rebalance monthly (or per the formation period schedule); ETFs with negative cumulative momentum or below their MA are dropped (weight set to zero).
- **Position cap**: Bounds `w_i <= w_i^max` can be imposed to prevent overweighting of any single volatile ETF.
## Key Parameters
- **Formation period T**: 6-12 months
- **MA filter length T'**: 100-200 days (optional; aligns with sector momentum rotation MA filter)
- **Weighting scheme**: Momentum-proportional (Eq. 371), volatility-adjusted (Eq. 372), or variance-adjusted/Sharpe-optimal (Eq. 373)
- **Position cap**: Maximum weight per ETF (optional; mitigates concentration risk)
- **Holding period**: Monthly rebalancing typical
## Variations
- **No MA filter**: Use only positive cumulative return filter
- **With position caps**: Add `w_i <= w_i^max` to prevent overweighting high-momentum volatile ETFs
- **Sector rotation overlay**: Combine with sector momentum rotation (Section 4.1) by restricting the universe to top-ranked sectors
## Notes
- Eq. (371) is the simplest weighting; it overweights volatile ETFs since on average `R_i^cum ∝ sigma_i`.
- Eq. (372) mitigates volatility overweighting by dividing by sigma_i.
- Eq. (373) is the optimal Sharpe ratio solution under the assumption of uncorrelated (diagonal covariance) ETF returns.
- The key advantage of ETFs for multi-asset trend following: a small number of instruments (tens of ETFs) can provide exposure to many asset classes, sectors, geographies, and factors simultaneously.
- Long-only construction avoids shorting complexity; the MA filter prevents buying ETFs in absolute downtrends even if they have relative momentum.
- For some literature on multi-asset portfolios, dynamic asset allocation, and related topics: Bekkers, Doeswijk and Lam (2009), Black and Litterman (1992), Faber (2015, 2016), Mladina (2014).

View File

@@ -0,0 +1,63 @@
---
description: "Overweights ETFs with high selectivity (low R-squared against factor model) and high alpha, and underweights ETFs with low selectivity (high R-squared), using a two-dimensional sort on R-squared and alpha."
tags: [etfs, r-squared, alpha, selectivity, factor-model]
---
# R-Squared
**Section**: 4.3 | **Asset Class**: ETFs | **Type**: Factor-Based / Selectivity
## Overview
Empirical studies suggest that augmenting Jensen's alpha with an indicator based on R-squared from a factor model regression adds predictive value for future ETF returns. R-squared measures how much of an ETF's return variance is explained by common factors; low R-squared (high "selectivity") combined with high alpha predicts strong future performance. High R-squared (low selectivity) combined with low alpha predicts weak future performance.
## Construction / Signal
Run a serial regression of ETF returns `R_i(t)` on 4 factors (Fama-French 3 + Carhart momentum):
```
R_i(t) = alpha_i + beta_{1,i} MKT(t) + beta_{2,i} SMB(t) + beta_{3,i} HML(t) + beta_{4,i} MOM(t) + epsilon_i(t) (365)
```
Compute regression R-squared:
```
R^2 = 1 - SS_res / SS_tot (366)
SS_res = sum_{t=1}^{N} epsilon_i(t)^2 (367)
SS_tot = sum_{t=1}^{N} (R_i(t) - R_bar_i)^2 (368)
R_bar_i = (1/N) * sum_{t=1}^{N} R_i(t) (369)
```
**Selectivity** = `1 - R^2` [Amihud and Goyenko, 2013]. High selectivity = low R-squared = returns less explained by common factors.
**Two-dimensional sort strategy**:
1. Sort ETFs into quintiles by R-squared (5 groups).
2. Within each R-squared quintile, sort ETFs into sub-quintiles by alpha (5 sub-groups).
3. This creates 25 groups of ETFs.
4. **Buy** ETFs in the group with lowest R-squared quintile and highest alpha sub-quintile.
5. **Sell** ETFs in the group with highest R-squared quintile and lowest alpha sub-quintile.
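A sketch of the regression and the 5×5 sort, assuming aligned return and factor panels (names are illustrative):
```python
import numpy as np
import pandas as pd

def r2_alpha_sort(returns: pd.DataFrame, factors: pd.DataFrame):
    """5x5 sort on R-squared and alpha from the 4-factor regression (Eq. 365).

    returns: (T x n_ETFs) return panel; factors: (T x 4) panel [MKT, SMB, HML, MOM].
    Returns the long list (low R^2, high alpha) and short list (high R^2, low alpha).
    """
    X = np.column_stack([np.ones(len(factors)), factors.values])
    rows = {}
    for etf in returns.columns:
        y = returns[etf].values
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        rows[etf] = {'r2': 1.0 - resid.var() / y.var(), 'alpha': coef[0]}
    df = pd.DataFrame(rows).T
    df['r2_q'] = pd.qcut(df['r2'], 5, labels=False)
    df['alpha_q'] = df.groupby('r2_q')['alpha'].transform(lambda a: pd.qcut(a, 5, labels=False))
    longs = df[(df['r2_q'] == 0) & (df['alpha_q'] == 4)].index.tolist()
    shorts = df[(df['r2_q'] == 4) & (df['alpha_q'] == 0)].index.tolist()
    return longs, shorts
```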
## Entry / Exit Rules
- **Entry**: At rebalance, run regression, compute R-squared and alpha for each ETF, perform 5×5 sort, enter long/short positions.
- **Exit**: Hold for estimation period or holding period; rebalance periodically.
- **Estimation period**: Same as alpha rotation (typically 1 year); longer estimation periods can be used, especially for monthly returns.
## Key Parameters
- **Factor model**: 4-factor (Fama-French 3 + Carhart MOM); 3-factor also usable
- **Estimation period**: Typically 1 year; can be longer for monthly return data
- **Sort dimensions**: R-squared quintiles × alpha sub-quintiles (5×5 = 25 groups)
- **Holding period**: Similar to alpha rotation strategy (1-3 months)
- **Selectivity definition**: `1 - R^2`
## Variations
- **3-factor model**: Use Fama-French 3 factors without momentum factor MOM
- **Different quintile splits**: Use deciles instead of quintiles for finer grouping
- **R-squared only**: Sort purely by R-squared without the alpha sub-sort
- **Estimation period alignment**: Use same estimation period as alpha rotation strategy (Section 4.2) for consistency
## Notes
- R-squared as a measure of active management: in Amihud and Goyenko (2013), R-squared is applied to mutual funds; Garyn-Tal (2014a, 2014b) applies it to actively managed ETFs.
- Low R-squared means the ETF has high "active share" — its returns are driven more by the manager's specific bets than by passive factor exposure.
- The estimation period and return frequency for R-squared can be the same as for alpha rotation (see Section 4.2 and fn. 77).
- Longer estimation periods are particularly appropriate if R_i(t) are monthly returns.
- Can be combined with the MA filter (Section 4.1.1) as an additional condition.

View File

@@ -0,0 +1,74 @@
---
description: "Overweights ETFs from outperforming sectors and underweights those from underperforming sectors based on T-month cumulative return momentum, with optional MA filter and dual-momentum variants."
tags: [etfs, momentum, sector-rotation]
---
# Sector Momentum Rotation
**Section**: 4.1 / 4.1.1 / 4.1.2 | **Asset Class**: ETFs | **Type**: Momentum / Sector Rotation
## Overview
Empirical evidence shows that the momentum effect exists not only for individual stocks but also for sectors and industries. The sector momentum rotation strategy overweights ETFs from outperforming sectors and underweights those from underperforming sectors, using ETFs concentrated in specific sectors/industries to implement sector/industry rotation without buying or selling large numbers of underlying stocks.
## Construction / Signal
Similarly to stock price-momentum (Section 3.1), use each sector ETF's cumulative return as the momentum measure. Let `P_i(t)` be the price of ETF labeled by i:
```
R_i^cum(t) = P_i(t) / P_i(t + T) - 1 (361)
```
Here `t + T` is T months in the past w.r.t. t. After time t, buy ETFs in the top decile by `R_i^cum(t)` and hold for a holding period (typically 1-3 months).
**Dollar-neutral construction**: Buy top-decile ETFs and short bottom-decile ETFs (ETFs can be shorted).
**Long-only construction**: Buy only top-decile ETFs, equal-weight or volatility-weight.
## Entry / Exit Rules
- **Entry**: At rebalance, rank all sector ETFs by cumulative return `R_i^cum`; buy top-decile, optionally short bottom-decile.
- **Exit**: Hold for 1-3 months; rebalance at the next scheduled interval.
- **Formation period T**: Typically 6-12 months.
## Key Parameters
- **Formation period T**: 6-12 months
- **Holding period**: 1-3 months
- **Portfolio construction**: Long-only (top decile) or dollar-neutral (top long, bottom short)
- **Weights**: Uniform or volatility-adjusted
## Variations
### 4.1.1 — Sector Momentum Rotation with MA Filter
A refinement that requires an ETF to pass a moving average filter before entering a position, preventing buys in sectors with downward price trends even if they rank high by relative momentum.
```
Rule = { Buy top-decile ETFs only if P > MA(T')
{ Short bottom-decile ETFs only if P < MA(T') (362)
```
- `P` = ETF's current price at transaction time
- `MA(T')` = moving average of ETF's daily prices over T' days (T' can differ from formation period T; typically T' = 100-200 days)
This ensures the absolute price level (trend) also supports the trade direction.
### 4.1.2 — Dual-Momentum Sector Rotation
In long-only strategies, mitigates the risk of buying sector ETFs when the broad market is trending down. Augments relative (cross-sectional) momentum with absolute (time-series) momentum of a broad market index ETF:
```
Rule = { Buy top-decile ETFs if broad market P > MA(T')
{ Buy an uncorrelated ETF (e.g., gold, Treasury) if broad market P <= MA(T') (363)
```
- `P` = broad market index ETF's price at transaction time
- `MA(T')` = moving average of the broad market index ETF's price; typically T' = 100-200 days
If the broad market is below its moving average (downtrend), capital is rotated into an ETF uncorrelated with the broad market (e.g., gold or Treasury ETF) instead of sector ETFs.
Reference: Antonacci (2014, 2017).
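A minimal sketch of the dual-momentum switch (Eq. 363); the safe-asset ticker is a placeholder and the top-decile sector list is assumed to come from the base rotation:
```python
import pandas as pd

def dual_momentum_allocation(top_sector_etfs: list, broad_market_prices: pd.Series,
                             safe_etf: str = "GLD_or_TLT", ma_days: int = 200):
    """Hold the top-decile sector ETFs only while the broad market index ETF trades
    above its T'-day moving average; otherwise rotate into an uncorrelated ETF."""
    ma = broad_market_prices.rolling(ma_days).mean().iloc[-1]
    if broad_market_prices.iloc[-1] > ma:
        return top_sector_etfs        # broad market uptrend: keep sector momentum longs
    return [safe_etf]                 # downtrend: rotate into gold/Treasury ETF
```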
## Notes
- ETF-based sector rotation is simpler to implement than stock-level sector rotation: one ETF trade per sector instead of dozens of stock trades.
- The MA filter (4.1.1) reduces the chance of buying momentum in a sector that is in absolute decline.
- Dual-momentum (4.1.2) addresses the long-only strategy's vulnerability to broad market drawdowns.
- Typical formation period: 6-12 months; typical holding period: 1-3 months.
- Dollar-neutral construction removes broad market exposure but requires shorting ETFs (feasible in practice).

View File

@@ -0,0 +1,50 @@
---
description: "A barbell portfolio holds bonds at two extreme maturities (short and long) to achieve a target duration while gaining higher convexity than an equivalent bullet, providing better protection against parallel yield curve shifts."
tags: [fixed-income, duration, convexity, barbell, yield-curve]
---
# Barbells
**Section**: 5.3 | **Asset Class**: Fixed Income | **Type**: Duration / Convexity
## Overview
A barbell concentrates holdings at two maturities: a short maturity T_1 and a long maturity T_2. It is a combination of two bullet strategies. For a given modified duration (matching a bullet at intermediate maturity T_*), the barbell achieves higher convexity, providing better protection against parallel yield shifts at the cost of lower overall yield.
## Construction / Mechanics
For a simple barbell of w_1 dollars in zero-coupon bonds with maturity T_1 and w_2 dollars with maturity T_2 (continuous compounding, constant yield Y), with price-adjusted weights w̃_1 = w_1·exp(-T_1·Y) and w̃_2 = w_2·exp(-T_2·Y):
**Duration** (equals a bullet at T_*):
```
D = (w̃_1·T_1 + w̃_2·T_2) / (w̃_1 + w̃_2) (390)
T_* = D_* = D (391)
```
**Convexity** (exceeds the equivalent bullet):
```
C = (w̃_1·T_1² + w̃_2·T_2²) / (w̃_1 + w̃_2) (392)
C_* = T_*² (393)
```
The convexity advantage:
```
C - C_* = (w̃_1·w̃_2 / (w̃_1 + w̃_2)²) · (T_2 - T_1)² > 0 (394)
```
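A quick numerical check of Eqs. (390), (392), and (394) for a two-zero barbell; the inputs are illustrative:
```python
import numpy as np

def barbell_duration_convexity(w1, w2, T1, T2, Y):
    """Duration, convexity, and convexity advantage of the two-zero barbell
    under continuous compounding with constant yield Y."""
    wt1 = w1 * np.exp(-T1 * Y)          # price-adjusted weights
    wt2 = w2 * np.exp(-T2 * Y)
    D = (wt1 * T1 + wt2 * T2) / (wt1 + wt2)        # Eq. (390); matched bullet at T_* = D
    C = (wt1 * T1**2 + wt2 * T2**2) / (wt1 + wt2)  # Eq. (392)
    return D, C, C - D**2                          # C - C_* with C_* = T_*^2 = D^2 (Eq. 394)

# Example: $1 in a 2-year zero and $1 in a 10-year zero at 4% yield
print(barbell_duration_convexity(1.0, 1.0, 2.0, 10.0, 0.04))
```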
## Payoff / Return Profile
- Higher convexity than an equivalent bullet means the barbell outperforms when yields move significantly in either direction (parallel shifts).
- The long-maturity bonds benefit from high yields; the short-maturity bonds provide protection if rates rise (proceeds reinvested at higher rates).
- Flattening of the yield curve (short-term rates rise relative to long-term) has a positive impact; steepening has a negative impact.
## Key Parameters / Signals
- T_1 (short maturity), T_2 (long maturity): the two maturities defining the barbell
- w_1, w_2: dollar allocations to short and long maturities
- Target duration D: matched to the equivalent bullet at T_*
- Convexity advantage C - C_*: the larger the maturity spread T_2 - T_1, the greater the convexity benefit (it grows as (T_2 - T_1)², per Eq. 394)
## Variations
- Combine with duration matching to an intermediate bullet for controlled rate exposure.
## Notes
- Higher convexity comes at the expense of lower overall yield (yield curve typically slopes upward, so the mid-point bullet earns more carry).
- Duration scales approximately linearly with maturity; convexity scales quadratically — this is why the barbell's convexity exceeds the equivalent bullet.
- The barbell is more complex to manage than a bullet due to two distinct maturity exposures.

View File

@@ -0,0 +1,66 @@
---
description: "Bond immunization constructs a portfolio whose duration matches a future cash obligation's maturity, protecting the portfolio value against parallel yield curve shifts to meet a predetermined liability."
tags: [fixed-income, duration, immunization, liability-matching, convexity]
---
# Bond Immunization
**Section**: 5.5 | **Asset Class**: Fixed Income | **Type**: Duration / Liability Matching
## Overview
Bond immunization is used to ensure a portfolio can meet a predetermined future cash obligation F at time T_*. A portfolio is constructed so that its duration matches T_*, making its value insensitive to parallel shifts in the yield curve. It extends to matching convexity for additional protection with three bonds.
## Construction / Mechanics
**Total investment** P given a future obligation F at time T_*, constant yield Y, periodic compounding with period δ:
```
P = F / (1 + Yδ)^(T_*/δ) (396)
```
**Two-bond immunization** (matches duration only):
With two bonds of maturities T_1, T_2 and modified durations D_1, D_2, dollar allocations P_1, P_2:
```
P_1 + P_2 = P (397)
P_1·D_1 + P_2·D_2 = P·D (398)
```
where the target modified duration:
```
D = T_* / (1 + Yδ) (399)
```
**Three-bond immunization** (matches duration and convexity):
With three bonds, durations D_1, D_2, D_3 and convexities C_1, C_2, C_3:
```
P_1 + P_2 + P_3 = P (400)
P_1·D_1 + P_2·D_2 + P_3·D_3 = P·D (401)
P_1·C_1 + P_2·C_2 + P_3·C_3 = P·C (402)
```
where the target convexity:
```
C = T_*(T_* + δ) / (1 + Yδ)² (403)
```
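A sketch solving the three-bond system (Eqs. 400-402) with the targets from Eqs. (396), (399), and (403); the bonds' durations and convexities are assumed given:
```python
import numpy as np

def immunize_three_bonds(F, T_star, Y, delta, durations, convexities):
    """Dollar allocations P_1, P_2, P_3 matching present value, duration, and convexity.

    durations, convexities: length-3 sequences for the three candidate bonds.
    """
    P = F / (1 + Y * delta) ** (T_star / delta)                # Eq. (396)
    D = T_star / (1 + Y * delta)                               # Eq. (399)
    C = T_star * (T_star + delta) / (1 + Y * delta) ** 2       # Eq. (403)
    A = np.array([[1.0, 1.0, 1.0], durations, convexities], dtype=float)
    b = np.array([P, P * D, P * C])
    return np.linalg.solve(A, b)                               # [P_1, P_2, P_3]
```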
## Payoff / Return Profile
- Immunized portfolio is protected against parallel yield curve shifts: the gain/loss from price changes offsets the loss/gain from reinvestment rate changes.
- Matching convexity (three-bond) provides additional protection against larger rate moves.
- The portfolio value converges to F at time T_* under parallel shifts.
## Key Parameters / Signals
- T_*: maturity of the future cash obligation (target duration)
- F: size of the future obligation
- Y: assumed constant yield (all bonds assumed same yield — a simplification)
- D, C: target modified duration and convexity
## Variations
- **Zero-coupon immunization**: purchase a single zero-coupon bond with maturity T_* — the simplest solution, but may not be available.
- **Two-bond**: matches duration only; sufficient for small parallel shifts.
- **Three-bond**: matches both duration and convexity; handles larger shifts.
- Extension to non-parallel yield curve changes requires additional sophistication.
## Notes
- The assumption that all bonds have the same yield is a simplification; in practice yields differ across maturities and issuers.
- The portfolio must be periodically rebalanced as the yield curve changes, incurring transaction costs.
- Immunization protects against parallel shifts only; slope and curvature changes can still cause losses.
- Non-parallel shifts, credit spread changes, and transaction costs all introduce complexity in practice.

View File

@@ -0,0 +1,36 @@
---
description: "A bullet portfolio concentrates all bond holdings at a single target maturity, used to express a directional view on interest rates at a specific point on the yield curve."
tags: [fixed-income, duration, bullet, yield-curve]
---
# Bullets
**Section**: 5.2 | **Asset Class**: Fixed Income | **Type**: Duration / Directional
## Overview
In a bullet portfolio all bonds share the same maturity date T, targeting a specific segment of the yield curve. The strategy expresses a view on the direction of interest rates at that maturity. Bonds are typically purchased over time to mitigate timing risk from rate fluctuations.
## Construction / Mechanics
- Select a target maturity T based on the trader's interest rate outlook.
- Purchase bonds of that maturity, potentially accumulating positions over time.
- Hold to maturity or until the rate view is realized.
Purchasing over time mitigates interest rate risk: if rates rise, later purchases capture higher yields; if rates fall, earlier purchases lock in higher yields.
## Payoff / Return Profile
- **Rates expected to fall** (bond prices rise): pick a longer maturity — longer bonds gain more in price from a given yield decline (higher duration).
- **Rates expected to rise** (bond prices fall): pick a shorter maturity — shorter bonds lose less.
- **Uncertain outlook**: diversify across maturities (barbell or ladder preferred).
## Key Parameters / Signals
- Target maturity T: the single maturity determining duration exposure
- Modified duration: scales with T; determines price sensitivity to rate changes
- Interest rate forecast: the primary signal driving maturity selection
## Variations
- Building the portfolio gradually over time to average in across different rate environments.
## Notes
- Concentrating at one maturity creates pure duration exposure with no convexity advantage.
- Compared to a barbell with the same duration, a bullet has lower convexity, meaning it is more exposed to parallel yield curve shifts.
- Suitable when the trader has a strong directional view on a specific maturity segment.

View File

@@ -0,0 +1,56 @@
---
description: "The carry factor strategy buys bonds with the highest carry — the return earned as a bond rolls down the yield curve — combining bond yield income with the roll-down return from the yield curve's slope."
tags: [fixed-income, factor, carry, roll-down, yield-curve]
---
# Carry Factor
**Section**: 5.11 | **Asset Class**: Fixed Income | **Type**: Factor / Carry
## Overview
Carry in fixed income is the return from holding a bond as it "rolls down" the yield curve toward maturity. If the term structure is upward-sloping and stable, a bond's yield declines as its maturity shortens, causing a price appreciation on top of the coupon income. The carry factor strategy buys bonds in the top decile by carry and sells those in the bottom decile.
## Construction / Mechanics
**Carry** over horizon Δt for a bond with current maturity T:
```
C(t, t+Δt, T) = [P(t+Δt, T) - P(t, T)] / P(t, T) (413)
```
Under the assumption that the yield curve shape is constant (R(t,T) = f(T-t) only), the yield at t+Δt is R(t+Δt, T) = R(t, T-Δt), giving:
```
C(t, t+Δt, T) = R(t,T)·Δt + C_roll(t, t+Δt, T) (414)
```
Two components:
1. **Yield income**: R(t,T)·Δt — the bond's current yield times the holding period
2. **Roll-down return**:
```
C_roll(t, t+Δt, T) ≈ -ModD(t,T) · [R(t, T-Δt) - R(t, T)] (415)
```
This is the price appreciation as the bond shortens in maturity by Δt along a static yield curve, estimated using modified duration.
**Portfolio construction**: rank all bonds by C(t, t+Δt, T); long top decile, short bottom decile (zero-cost version).
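A minimal sketch of the carry decomposition (Eqs. 414-415) for a single bond, with yields expressed as decimals:
```python
def bond_carry(yield_T, yield_T_minus_dt, mod_duration, dt):
    """Carry over horizon dt under a static yield curve.

    yield_T:          current yield R(t, T)
    yield_T_minus_dt: yield at the rolled-down maturity R(t, T - dt)
    """
    income = yield_T * dt                                      # yield income component
    roll_down = -mod_duration * (yield_T_minus_dt - yield_T)   # Eq. (415) roll-down return
    return income + roll_down
```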
## Payoff / Return Profile
- Earns yield income plus roll-down return when the yield curve is upward-sloping and stable.
- Roll-down return is greatest in the steepest segments of the yield curve.
- Underperforms or loses when the yield curve flattens, inverts, or shifts upward unexpectedly.
## Key Parameters / Signals
- R(t,T): current yield (income component)
- ModD(t,T): modified duration (scales the roll-down component)
- R(t, T-Δt) - R(t, T): slope of the yield curve at maturity T (steeper = more roll-down)
- Δt: carry horizon (e.g., 1 month)
## Variations
- Long-only: buy top decile by carry (no short sales required).
- Cross-asset carry: extend the same framework to other fixed income markets (government bonds, credit, etc.).
## Notes
- The static yield curve assumption simplifies computation; actual carry will differ if the curve shifts.
- For financed portfolios, R(t,T) is replaced by R(t,T) - r_f (excess yield over the risk-free rate) in the income component, but this does not affect portfolio weights.
- High-carry bonds tend to have longer maturities in an upward-sloping curve environment, so the carry factor has implicit duration exposure.
- Carry and roll-down are sometimes separated as distinct signals; roll-down alone favors bonds in the steepest curve segments regardless of yield level.

View File

@@ -0,0 +1,55 @@
---
description: "CDS basis arbitrage exploits the mispricing between a bond's credit spread and its CDS spread — when the CDS basis is negative (bond spread too high), buy the bond and buy CDS protection to lock in a risk-free profit."
tags: [fixed-income, arbitrage, cds, credit-spread, basis]
---
# CDS Basis Arbitrage
**Section**: 5.14 | **Asset Class**: Fixed Income | **Type**: Arbitrage / Credit
## Overview
A credit default swap (CDS) provides insurance against default on a bond. In theory, the CDS spread should equal the bond yield spread over the risk-free rate, making the insured bond equivalent to a risk-free instrument. The CDS basis is the difference between these two spreads, and deviations from zero create arbitrage opportunities.
## Construction / Mechanics
**CDS basis**:
```
CDS basis = CDS spread - bond spread (417)
```
where bond spread = bond yield - risk-free rate.
**Arbitrage logic**:
- CDS spread should ≈ bond spread (both represent compensation for default risk)
- If CDS basis ≠ 0 (and |basis| exceeds transaction costs), an arbitrage opportunity exists
**Negative basis trade** (most common):
- CDS basis < 0: bond spread > CDS spread → bond is relatively cheap
- Trade: **buy the bond** (receive the high spread) + **buy CDS protection** (pay the lower CDS spread)
- Net P&L per dollar of insured debt: bond spread - CDS spread = -basis > 0
- Result: a nearly risk-free positive carry, since the CDS makes the bond effectively risk-free
**Positive basis trade** (less common in practice):
- CDS basis > 0: CDS spread > bond spread → CDS protection is expensive relative to bond
- Trade: sell the bond + sell CDS protection (write CDS)
- In practice, this often means unwinding an existing position (already owning both the bond and CDS)
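A toy decision rule based on Eq. (417); the 10 bp threshold is an assumed placeholder for transaction and financing costs:
```python
def cds_basis_signal(bond_yield, risk_free_rate, cds_spread, cost_threshold=0.0010):
    """Classify the basis trade from the sign and size of the CDS basis."""
    basis = cds_spread - (bond_yield - risk_free_rate)
    if basis < -cost_threshold:
        return "negative basis: buy bond + buy CDS protection", -basis
    if basis > cost_threshold:
        return "positive basis: sell bond + sell CDS protection", basis
    return "no trade", 0.0
```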
## Payoff / Return Profile
- Earns the absolute value of the CDS basis as a near-riskless spread.
- Position closed when basis converges back to zero.
- The trade is essentially a carry trade: positive carry from the basis for as long as it persists.
## Key Parameters / Signals
- CDS basis = CDS spread - bond spread: the primary signal
- Transaction cost threshold: |basis| must exceed bid-ask spreads and financing costs
- Sign of basis: negative → buy bond + buy CDS; positive → sell bond + sell CDS
## Variations
- **Synthetic bond replication**: CDS + risk-free bond (e.g., Treasury repo) replicates a corporate bond; mispricing between the two creates the arbitrage.
## Notes
- CDS protection makes the bond synthetically risk-free, but counterparty risk on the CDS remains.
- Negative basis arbitrage requires financing the bond purchase (repo market); the repo rate affects net P&L.
- The CDS basis can persist or widen during stress periods (e.g., 2008 financial crisis) before eventually converging, creating significant mark-to-market losses in the interim.
- Liquidity risk: corporate bonds may be illiquid, making it difficult to close the position at fair value.
- In the positive basis case, selling a corporate bond short is operationally challenging.

View File

@@ -0,0 +1,44 @@
---
description: "A dollar-duration-neutral butterfly combines a long barbell (short T_1 and long T_3 maturities) with a short bullet (intermediate T_2) at zero net cost, immunizing against parallel yield curve shifts to profit from yield curve curvature changes."
tags: [fixed-income, butterfly, duration-neutral, yield-curve, curvature]
---
# Dollar-Duration-Neutral Butterfly
**Section**: 5.6 | **Asset Class**: Fixed Income | **Type**: Yield Curve / Curvature
## Overview
The dollar-duration-neutral butterfly is a zero-cost combination of a long barbell (long T_1 and T_3 maturity bonds) and a short bullet (short the T_2 intermediate maturity bond), where T_1 < T_2 < T_3. Both zero cost (dollar neutrality) and dollar-duration neutrality conditions are imposed, immunizing the portfolio against parallel yield curve shifts. The strategy profits from changes in yield curve curvature.
## Construction / Mechanics
Let P_1, P_2, P_3 be the dollar amounts invested in the three bonds, and D_1, D_2, D_3 their modified durations.
**Zero-cost** (dollar neutrality): the long barbell finances the short bullet position:
```
P_1 + P_3 = P_2 (404)
```
**Dollar-duration neutrality** (parallel shift immunity):
```
P_1·D_1 + P_3·D_3 = P_2·D_2 (405)
```
These two equations determine P_1 and P_3 given P_2.
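Solving the two constraints for the wings gives a closed form, sketched below (a direct consequence of Eqs. 404-405):
```python
def dollar_duration_neutral_wings(P2, D1, D2, D3):
    """Wing positions of the zero-cost, dollar-duration-neutral butterfly,
    given the body size P2 (all values in dollars of market value)."""
    P1 = P2 * (D3 - D2) / (D3 - D1)   # short-maturity wing
    P3 = P2 * (D2 - D1) / (D3 - D1)   # long-maturity wing
    return P1, P3
```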
## Payoff / Return Profile
- Profits when the yield curve becomes more curved (humped): the intermediate yield rises relative to the wings, or the wings fall relative to the body.
- Immune to small parallel shifts in the yield curve (both level and dollar-duration matched).
- Exposed to changes in the slope and curvature of the yield curve.
## Key Parameters / Signals
- T_1 (short wing), T_2 (body), T_3 (long wing): the three maturities; T_1 < T_2 < T_3
- D_1, D_2, D_3: modified durations of the three bonds
- P_2: the reference position size (determines P_1 and P_3 via the two constraints)
## Variations
- See also: fifty-fifty butterfly (5.7) and regression-weighted butterfly (5.8), which relax the zero-cost condition.
## Notes
- Dollar-duration neutrality (Eq. 405) protects against parallel shifts only; non-parallel changes in slope or curvature can still generate losses or gains.
- The zero-cost constraint (Eq. 404) means no initial capital is required, making it attractive as an overlay strategy.
- In practice, bid-ask spreads, financing costs, and liquidity differences across maturities affect profitability.

View File

@@ -0,0 +1,46 @@
---
description: "A fifty-fifty butterfly sets equal dollar durations on both wings of the barbell, making it approximately neutral to small yield curve steepening and flattening while remaining dollar-duration neutral, trading zero-cost for curve-neutrality."
tags: [fixed-income, butterfly, duration-neutral, yield-curve, curvature]
---
# Fifty-Fifty Butterfly
**Section**: 5.7 | **Asset Class**: Fixed Income | **Type**: Yield Curve / Curvature
## Overview
The fifty-fifty butterfly is a variation of the dollar-duration-neutral butterfly that equalizes the dollar durations of the two wings (short-maturity and long-maturity positions). This makes the strategy approximately neutral to small steepening and flattening of the yield curve (not just parallel shifts), at the cost of no longer being dollar-neutral (it is not zero-cost). It is also known as the "neutral curve butterfly."
## Construction / Mechanics
Using the same notation as the dollar-duration-neutral butterfly (Section 5.6), with modified durations D_1, D_2, D_3 and dollar positions P_1, P_2, P_3:
**Equal wing dollar durations**:
```
P_1·D_1 = P_3·D_3 = (1/2)·P_2·D_2 (406)
```
This implies dollar-duration neutrality is preserved:
```
P_1·D_1 + P_3·D_3 = P_2·D_2
```
But the zero-cost condition P_1 + P_3 = P_2 is generally not satisfied.
## Payoff / Return Profile
- Approximately neutral to small steepening and flattening of the yield curve: the spread change between the body (T_2) and the short wing (T_1) equals the spread change between the body and the long wing (T_3).
- Still immune to parallel shifts (dollar-duration neutral).
- Profits from curvature changes: if the body cheapens relative to both wings, the position gains.
## Key Parameters / Signals
- P_1·D_1 = P_3·D_3 = (1/2)·P_2·D_2: the defining equal-wing constraint
- T_1 < T_2 < T_3: the three maturities
- Net cost P_2 - P_1 - P_3: non-zero unlike the dollar-duration-neutral butterfly
## Variations
- Dollar-duration-neutral butterfly (Section 5.6): zero-cost but not curve-neutral.
- Regression-weighted butterfly (Section 5.8): uses empirical β to account for differential yield volatility across the curve.
## Notes
- The name "fifty-fifty" refers to the equal split of the body's dollar duration between the two wings.
- Curve-neutrality is approximate and holds only for small parallel steepening/flattening moves.
- The non-zero cost means the trader must finance the net position, which has carry implications.
- Short-term rates are empirically more volatile than long-term rates, which limits the curve-neutrality assumption; this motivates the regression-weighted butterfly.

View File

@@ -0,0 +1,100 @@
---
description: "Background concepts for fixed income instruments: zero-coupon bonds, coupon bonds, floating rate bonds, swaps, duration, and convexity — the foundational mechanics underlying all fixed income strategies."
tags: [fixed-income, background, duration, convexity, swaps]
---
# Fixed Income Generalities
**Section**: 5.1 | **Asset Class**: Fixed Income | **Type**: Background / Reference
## Overview
Fixed income instruments are promises to pay cash flows at future dates, priced today as the present value of those flows. The yield of a bond summarizes its return as a single annualized rate. Duration and convexity characterize how bond prices respond to interest rate changes, and are the primary risk-management tools for fixed income portfolios.
## Construction / Mechanics
### 5.1.1 Zero-Coupon Bonds
A zero-coupon (discount) bond with maturity T pays $1 at time T. Its price at time t is P(t,T), with P(T,T) = 1. The continuously compounded yield is:
```
R(t,T) = -ln(P(t,T)) / (T - t) (374)
```
### 5.1.2 Coupon Bonds
A coupon bond pays principal $1 at maturity T plus n coupon payments of amount kδ at times T_i = T_0 + iδ (i = 1,...,n), where δ is the payment period. Price at time t:
```
P_c(t,T) = P(t,T) + kδ Σ_{i=I(t)}^n P(t,T_i) (375)
```
where I(t) = min(i : t < T_i). At issuance (t = T_0), the par coupon rate is:
```
k = (1 - P(T_0,T)) / (δ Σ_{i=1}^n P(T_0,T_i)) (377)
```
### 5.1.3 Floating Rate Bonds
Coupon payments are based on LIBOR. The LIBOR rate at T_{i-1} for period [T_{i-1}, T_i] is:
```
L(T_{i-1}) = (1/δ) [1/P(T_{i-1},T_i) - 1] (378)
```
The coupon paid at T_i is X_i = L(T_{i-1})δ = 1/P(T_{i-1},T_i) - 1. The total value at T_0:
```
V_0 = 1 - [P(T_0,T_n) - P(T_0,T)] (380)
```
If T = T_n then V_0 = 1 (the bond prices at par).
### 5.1.4 Swaps
An interest rate swap exchanges fixed rate payments for floating (LIBOR) payments. A long swap = long fixed coupon bond + short floating rate bond. The fixed rate giving zero initial value:
```
k = (1 - P(T_0,T_n)) / (δ Σ_{i=1}^n P(T_0,T_i)) (383)
```
### 5.1.5 Duration and Convexity
**Macaulay duration** is the present-value-weighted average maturity of cash flows:
```
MacD(t,T) = (1/P_c(t,T)) [(T-t)P(t,T) + kδ Σ_{i=I(t)}^n (T_i-t)P(t,T_i)] (384)
```
**Modified duration** measures relative price sensitivity to parallel yield shifts:
```
ModD(t,T) = -∂ln(P_c(t,T)) / ∂R(t,T) (385)
```
For constant yield Y with periodic compounding: ModD = MacD / (1 + Yδ).
Approximate price change: ΔP_c/P_c ≈ -ModD · ΔR
**Dollar duration** measures absolute price sensitivity:
```
DD(t,T) = -∂P_c(t,T)/∂R(t,T) = ModD(t,T) · P_c(t,T) (387)
```
**Convexity** captures nonlinear (second-order) effects:
```
C(t,T) = -(1/P_c(t,T)) · ∂²P_c(t,T)/∂R(t,T)² (388)
```
Full second-order approximation:
```
ΔP_c/P_c ≈ -ModD·ΔR + (1/2)·C·(ΔR)² (389)
```
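A quick numerical illustration of Eq. (389) with assumed duration and convexity values:
```python
def bond_price_change(price, mod_duration, convexity, dR):
    """Second-order estimate of the price change for a parallel yield shift dR."""
    return price * (-mod_duration * dR + 0.5 * convexity * dR**2)

# Example: price 100, duration 10, convexity 120, +50 bp parallel shift
print(bond_price_change(100.0, 10.0, 120.0, 0.005))   # ≈ -4.85
```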
## Key Parameters / Signals
- **Yield R(t,T)**: inverse of price; drives all valuation
- **Modified duration**: primary interest rate risk metric; scales approximately linearly with maturity
- **Dollar duration**: used for hedging and portfolio construction
- **Convexity**: scales approximately quadratically with maturity; higher convexity = better protection against parallel yield shifts at the cost of lower yield
## Notes
- Duration and convexity formulas assume parallel shifts in the yield curve; non-parallel shifts require more sophisticated treatment.
- Floating rate bonds priced at par (V_0 = 1) when T = T_n because the variable coupons replicate rolling T-bond investments.
- Periodic vs. continuous compounding: MacD and ModD coincide under continuous compounding; under periodic compounding they differ by the factor (1 + Yδ).

View File

@@ -0,0 +1,43 @@
---
description: "A ladder portfolio holds bonds spread evenly across n equidistant maturities to diversify interest rate and reinvestment risk while maintaining an approximately constant duration through systematic roll-down."
tags: [fixed-income, duration, ladder, diversification, yield-curve]
---
# Ladders
**Section**: 5.4 | **Asset Class**: Fixed Income | **Type**: Duration-Targeting / Diversification
## Overview
A ladder holds bonds with (roughly) equal capital allocations across n different maturities T_i (i = 1,...,n), where maturities are equidistant: T_{i+1} = T_i + δ. The strategy maintains an approximately constant duration by selling shorter-maturity bonds as they near maturity and replacing them with new longer-maturity bonds. It diversifies both interest rate risk and reinvestment risk.
## Construction / Mechanics
- Allocate roughly equal capital to each rung T_i, i = 1,...,n (n is sizable, e.g., n = 10).
- Equidistant maturities: T_{i+1} = T_i + δ.
- Average (effective) maturity of the portfolio:
```
T = (1/n) Σ_{i=1}^n T_i (395)
```
- As the shortest rung approaches maturity, sell it and purchase a new bond at the longest maturity, maintaining the ladder structure.
- Also generates regular income from coupon payments across all rungs.
## Payoff / Return Profile
- Higher average maturity T → higher income (upward-sloping yield curve), but also higher interest rate risk.
- Rolling shorter bonds into longer bonds continuously captures roll-down return.
- Diversification across maturities smooths the impact of rate moves: if rates rise, maturing short bonds are reinvested at higher rates; if rates fall, longer bonds appreciate.
## Key Parameters / Signals
- n: number of rungs (more rungs = more diversification)
- δ: spacing between maturities
- T (average maturity): determines the income/risk trade-off
- Equal capital allocation per rung: ensures no concentration at any maturity
## Variations
- Unequal allocations tilting toward shorter or longer maturities (incorporating a partial bullet or barbell bias).
## Notes
- The ladder avoids the concentration risk of bullets and barbells, making it suitable for investors uncertain about the rate environment.
- The constant-duration property is approximate; exact duration changes as bonds age and are replaced.
- Reinvestment risk is diversified: proceeds from maturing bonds are spread across the yield curve over time rather than all reinvested at once.
- Transaction costs from regular rolling must be weighed against the diversification and roll-down benefits.

View File

@@ -0,0 +1,44 @@
---
description: "The low-risk factor strategy buys bonds with lower risk (shorter maturity and higher credit rating) within a credit tier, exploiting the empirical anomaly that lower-risk bonds outperform higher-risk bonds on a risk-adjusted basis."
tags: [fixed-income, factor, low-risk, credit, anomaly]
---
# Low-Risk Factor
**Section**: 5.9 | **Asset Class**: Fixed Income | **Type**: Factor / Anomaly
## Overview
Empirical evidence suggests that lower-risk bonds tend to outperform higher-risk bonds on a risk-adjusted basis (the "low-risk anomaly"), mirroring a similar effect in equities. "Riskiness" in fixed income is measured by credit rating and maturity. The strategy builds long portfolios of the lowest-risk bonds within a given credit tier.
## Construction / Mechanics
Portfolio construction uses two risk dimensions:
1. **Credit rating**: separates the investment universe into quality tiers.
- Investment Grade (IG): credit ratings AAA through A-.
- High Yield (HY): credit ratings BB+ through B-.
2. **Maturity (duration)**: within each credit tier, rank bonds by maturity and take the **bottom decile** (shortest maturities = lowest duration risk).
Example portfolios:
- IG low-risk: Investment Grade bonds (AAA through A-), bottom decile by maturity.
- HY low-risk: High Yield bonds (BB+ through B-), bottom decile by maturity.
## Payoff / Return Profile
- Earns a risk-adjusted premium by being long the lowest-risk bonds in each tier.
- Outperforms the broad credit market on a Sharpe ratio basis due to the low-risk anomaly.
- Returns are driven by credit spread compression and coupon income, with lower sensitivity to interest rate moves (short maturity).
## Key Parameters / Signals
- Credit rating tier: AAA to A- (IG) or BB+ to B- (HY)
- Maturity rank: bottom decile selects shortest-maturity bonds
- Risk-adjusted return (Sharpe ratio): primary evaluation metric
## Variations
- Can be combined with a short position in the top-risk decile (highest maturity within the tier) to create a long-short low-risk factor.
- Risk metrics beyond credit rating and maturity (e.g., option-adjusted spread, liquidity) can be incorporated.
## Notes
- The low-risk anomaly in bonds mirrors the similar effect documented in equities but is driven by different mechanisms (credit and duration rather than beta).
- Separating IG and HY tiers is important because the risk-return relationship differs significantly between investment grade and speculative grade.
- Liquidity may be lower for short-maturity high-yield bonds, increasing transaction costs.
- The strategy is typically implemented as a long-only portfolio; short positions in corporate bonds are operationally difficult.

View File

@@ -0,0 +1,53 @@
---
description: "A regression-weighted butterfly uses an empirically estimated β to account for the higher volatility of short-term rates relative to long-term rates, improving yield-curve-neutrality beyond the fifty-fifty butterfly."
tags: [fixed-income, butterfly, duration-neutral, yield-curve, curvature, regression]
---
# Regression-Weighted Butterfly
**Section**: 5.8 | **Asset Class**: Fixed Income | **Type**: Yield Curve / Curvature
## Overview
Empirically, short-term interest rates are significantly more volatile than long-term rates. The regression-weighted butterfly accounts for this by weighting the short wing's dollar duration by a factor β > 1, estimated from historical data via regression. This produces better curve-neutrality than the fifty-fifty butterfly in practice.
## Construction / Mechanics
Using positions P_1, P_2, P_3 with modified durations D_1, D_2, D_3 (T_1 < T_2 < T_3):
**Dollar-duration neutrality** (parallel shift immunity):
```
P_1·D_1 + P_3·D_3 = P_2·D_2 (407)
```
**Regression-weighted curve-neutrality**:
```
P_1·D_1 = β · P_3·D_3 (408)
```
where β > 1 is the regression coefficient from regressing the spread change between the body (T_2) and the short wing (T_1) on the spread change between the body and the long wing (T_3), using historical data.
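A minimal sketch that solves (407) and (408) for the wing sizes given the body position; the durations and β below are illustrative assumptions, with β in practice coming from the historical regression:
```python
def butterfly_wings(P2, D1, D2, D3, beta):
    """Wing market values P1, P3 satisfying Eqs. (407) and (408)."""
    body_dd = P2 * D2                            # dollar duration of the body
    P3 = body_dd / ((1.0 + beta) * D3)           # long wing
    P1 = beta * body_dd / ((1.0 + beta) * D1)    # short wing
    return P1, P3

# Example: 2y/5y/10y butterfly with a hypothetical beta = 1.3.
P1, P3 = butterfly_wings(P2=100.0, D1=1.9, D2=4.6, D3=8.7, beta=1.3)
# Checks: P1*D1 + P3*D3 == P2*D2 (407) and P1*D1 == beta * P3*D3 (408)
```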
## Payoff / Return Profile
- Immune to parallel shifts via (407) and, approximately, to yield curve slope changes via the β-weighted condition (408).
- Profits from yield curve curvature moves: gains when the body yields rise relative to the wings.
- More robust curve-neutrality than the fifty-fifty butterfly in practice due to the empirically calibrated β.
## Key Parameters / Signals
- β: regression coefficient (typically β > 1, calibrated from historical spread data)
- P_1, P_3: determined by solving (407) and (408) given P_2
- T_1, T_2, T_3: the three maturity points on the yield curve
## Variations
### 5.8.1 Maturity-Weighted Butterfly
Instead of estimating β from historical regressions, it is set analytically from the three bond maturities:
```
β = (T_2 - T_1) / (T_3 - T_2) (409)
```
This equals the ratio of the short wing's maturity distance from the body to the long wing's maturity distance from the body. It is a simpler, model-based alternative that does not require historical calibration.
## Notes
- β is empirically greater than 1 because short-term rates fluctuate more than long-term rates; the short wing therefore needs less dollar duration to hedge the same spread move.
- The regression β should be re-estimated periodically as the volatility relationship between short and long rates can change over time.
- The maturity-weighted variant (5.8.1) provides a model-based β that requires no estimation but may not capture the true empirical volatility asymmetry.
- All butterfly strategies share the exposure to transaction costs, financing costs, and bid-ask spreads that can erode theoretical curve-neutrality profits.

View File

@@ -0,0 +1,52 @@
---
description: "Rolling down the yield curve buys long- or medium-term bonds in the steepest segment of the yield curve and holds them while they appreciate as they shorten in maturity, then reinvests proceeds into new steepest-segment bonds."
tags: [fixed-income, roll-down, yield-curve, carry, duration]
---
# Rolling Down the Yield Curve
**Section**: 5.12 | **Asset Class**: Fixed Income | **Type**: Carry / Roll-Down
## Overview
The rolling down the yield curve strategy captures the roll-down component C_roll of bond returns by purchasing bonds in the steepest segments of the yield curve and holding them while their maturity shortens, causing price appreciation. Bonds are sold before maturity and the proceeds reinvested in new long/medium-term bonds from the same steep segment.
## Construction / Mechanics
The roll-down return over horizon Δt:
```
C_roll(t, t+Δt, T) ≈ -ModD(t,T) · [R(t, T-Δt) - R(t, T)] (415)
```
This is maximized when:
- **ModD(t,T)** is large (longer-maturity bonds have higher duration)
- **R(t, T-Δt) - R(t, T) < 0** (yield declines as maturity shortens on an upward-sloping curve)
- The magnitude |R(t, T-Δt) - R(t, T)| is large (steep segment of the curve)
**Strategy mechanics**:
1. Identify the steepest segment(s) of the yield curve.
2. Buy long- or medium-term bonds from those segments.
3. Hold while they "roll down" the curve (their maturity shrinks and yield declines).
4. Sell before maturity approaches (before they enter a flatter/shorter segment).
5. Reinvest proceeds into new long/medium-term bonds from the steep segment.
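A minimal sketch of Eq. (415), assuming a hypothetical fitted yield-curve function and an illustrative modified duration:
```python
import numpy as np

def roll_down_return(curve, T, dt, mod_dur):
    """Eq. (415): C_roll over horizon dt for a bond of maturity T."""
    return -mod_dur * (curve(T - dt) - curve(T))

# Example: an upward-sloping toy curve, yield = 2% + 0.3% * ln(1 + T).
curve = lambda T: 0.02 + 0.003 * np.log1p(T)
c_roll = roll_down_return(curve, T=7.0, dt=1.0, mod_dur=6.2)
# curve(6) < curve(7), so c_roll > 0 on this upward-sloping curve
```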
## Payoff / Return Profile
- Earns roll-down return C_roll in addition to yield income R(t,T)·Δt.
- Total carry C(t, t+Δt, T) = R(t,T)·Δt + C_roll(t, t+Δt, T).
- Profits maximized in steeply upward-sloping yield curves.
- Loses money when the yield curve flattens, inverts, or when long-end yields rise (parallel upward shift).
## Key Parameters / Signals
- Yield curve slope: identifies which segments offer the most roll-down return
- Modified duration: amplifies the roll-down return
- Holding horizon Δt: determines how far down the curve the bond rolls before sale
- Curve stability: strategy depends on curve shape remaining approximately stable
## Variations
- Pure roll-down: focus exclusively on C_roll, ignoring yield income (selects steepest curve segments regardless of absolute yield level).
- Combined carry + roll: as in the carry factor strategy (5.11), which uses total C as the signal.
## Notes
- The yield curve must be upward-sloping for roll-down to be positive; in a flat or inverted curve the roll-down may be zero or negative.
- Transaction costs from repeated roll-overs must be weighed against the roll-down income.
- The strategy has implicit duration risk: long/medium bonds lose value in a rising rate environment, which can more than offset the roll-down gain.
- Steepest curve segments often occur at the short to medium end (e.g., 2-10 year part of the Treasury curve) and can shift over time with monetary policy.

View File

@@ -0,0 +1,59 @@
---
description: "Swap spread arbitrage takes a long (short) position in an interest rate swap versus a short (long) position in a Treasury bond of the same maturity, profiting from the difference between the swap rate, Treasury yield, LIBOR, and the repo rate."
tags: [fixed-income, arbitrage, swap, libor, treasury, spread]
---
# Swap Spread Arbitrage
**Section**: 5.15 | **Asset Class**: Fixed Income | **Type**: Arbitrage / Rates
## Overview
The swap spread arbitrage is a dollar-neutral strategy that combines a long (or short) position in an interest rate swap with an offsetting short (or long) position in a Treasury bond of the same maturity. It profits from the spread between the swap fixed rate and the Treasury yield, net of financing costs (LIBOR vs. repo rate). The strategy is essentially a bet on the direction of LIBOR relative to the repo rate.
## Construction / Mechanics
**Instruments**:
- Interest rate swap: receive fixed rate r_swap, pay floating LIBOR L(t)
- Treasury bond: short the bond (financed at repo rate r(t))
**Per-dollar-invested P&L rate**:
```
C(t) = ±[C_1 - C_2(t)] (418)
C_1 = r_swap - Y_Treasury (419)
C_2(t) = L(t) - r(t) (420)
```
where:
- C_1: constant spread = swap fixed rate minus Treasury yield (the swap spread)
- C_2(t): floating spread = LIBOR minus repo rate
- Plus sign: long swap strategy (receive fixed, short Treasury)
- Minus sign: short swap strategy (pay fixed, long Treasury)
**Long swap strategy** (plus sign):
- Receive r_swap and pay LIBOR L(t) on the swap; short the Treasury, paying Y_Treasury and earning the repo rate r(t) on the short-sale proceeds → net carry C_1 - C_2(t)
- Profitable if C_2(t) = L(t) - r(t) < C_1
**Short swap strategy** (minus sign):
- Pay r_swap and receive LIBOR L(t) on the swap; buy the Treasury, receiving Y_Treasury and financing the purchase at the repo rate r(t) → net carry C_2(t) - C_1
- Profitable if C_2(t) = L(t) - r(t) > C_1
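A minimal sketch of Eqs. (418)-(420); the rate levels are illustrative assumptions, not market data:
```python
def swap_spread_carry(r_swap, y_treasury, libor, repo, sign=+1):
    """Per-dollar P&L rate, Eq. (418); sign = +1 long swap, -1 short swap."""
    C1 = r_swap - y_treasury        # swap spread, fixed at inception, Eq. (419)
    C2 = libor - repo               # floating LIBOR-repo spread, Eq. (420)
    return sign * (C1 - C2)

# Example: the long swap strategy earns carry while L(t) - r(t) < C_1.
carry = swap_spread_carry(r_swap=0.042, y_treasury=0.040, libor=0.041, repo=0.040)
# C_1 = 20 bp, C_2(t) = 10 bp -> carry = +10 bp per year per dollar of notional
```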
## Payoff / Return Profile
- The long swap strategy profits if LIBOR falls (C_2 decreases below C_1).
- The short swap strategy profits if LIBOR rises (C_2 increases above C_1).
- This is fundamentally a **LIBOR bet**: the trade profits or loses based on the LIBOR-repo spread relative to the constant swap spread C_1.
## Key Parameters / Signals
- C_1 = r_swap - Y_Treasury: the swap spread (constant at trade inception)
- C_2(t) = L(t) - r(t): the LIBOR-repo spread (time-varying)
- Net P&L driver: ±(C_1 - C_2(t)); direction depends on long vs. short swap position
## Variations
- Adjust maturity of the swap and Treasury bond to target different parts of the yield curve.
- Pair with CDS basis trades for multi-leg credit/rates arbitrage.
## Notes
- The strategy is dollar-neutral (the swap and Treasury position offset each other in notional terms).
- LIBOR risk is the dominant risk: unexpected changes in LIBOR (e.g., central bank policy shifts, bank credit stress) directly affect P&L.
- The repo rate r(t) can vary and introduces additional uncertainty in C_2(t).
- With the transition away from LIBOR to SOFR and other risk-free rates, the mechanics of this strategy are evolving.
- Counterparty risk on the swap and margin requirements must be accounted for in practice.

View File

@@ -0,0 +1,66 @@
---
description: "The value factor strategy for bonds selects bonds with the highest actual credit spread relative to a theoretically predicted spread from a cross-sectional regression, going long undervalued bonds in the top decile."
tags: [fixed-income, factor, value, credit-spread, regression]
---
# Value Factor
**Section**: 5.10 | **Asset Class**: Fixed Income | **Type**: Factor / Value
## Overview
"Value" in fixed income is defined by comparing a bond's observed credit spread to a theoretically predicted (fair value) credit spread. Bonds trading with a spread significantly above their predicted fair value are cheap (high value); those below are expensive. The strategy buys the top-decile bonds by value score.
## Construction / Mechanics
**Step 1: Estimate fair value spreads** via a cross-sectional linear regression across N bonds (i = 1,...,N):
```
S_i = Σ_{r=1}^K β_r · I_{ir} + γ · T_i + ε_i (410)
```
where:
- S_i: observed credit spread of bond i (bond yield minus risk-free rate)
- I_{ir}: dummy variable = 1 if bond i has credit rating r, 0 otherwise (K ≤ 21 ratings)
- T_i: maturity of bond i
- β_r, γ: regression coefficients (note: no separate intercept since Σ_r I_{ir} = 1 for each bond)
- ε_i: regression residual
The constraint:
```
Σ_{r=1}^K I_{ir} = 1 for all i (412)
```
(each bond has exactly one credit rating, so the intercept is absorbed into the rating dummies)
**Step 2: Compute fair value spread**:
```
S_i* = S_i - ε_i (411)
```
(the fitted value from the regression)
**Step 3: Compute value score** — either:
- V_i = ln(S_i / S_i*), or
- V_i = ε_i / S_i* = S_i / S_i* - 1
**Step 4: Select portfolio** — long bonds in the top decile by V_i (most undervalued).
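A minimal numpy/pandas sketch of Steps 1-4, assuming a hypothetical universe DataFrame with `spread`, `rating`, and `maturity` columns:
```python
import numpy as np
import pandas as pd

def value_scores(bonds: pd.DataFrame) -> pd.Series:
    """Eqs. (410)-(411): fair-value regression and value score V_i = ln(S_i / S_i*)."""
    dummies = pd.get_dummies(bonds['rating'], dtype=float)   # I_{ir}; each row sums to 1
    X = np.column_stack([dummies.values, bonds['maturity'].values])
    coef, *_ = np.linalg.lstsq(X, bonds['spread'].values, rcond=None)
    fair = X @ coef                                          # S_i* = S_i - eps_i
    return pd.Series(np.log(bonds['spread'].values / fair), index=bonds.index)

# v = value_scores(bonds)
# portfolio = bonds[v >= v.quantile(0.9)]      # long the top decile (cheapest bonds)
```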
## Payoff / Return Profile
- Profits when cheap bonds (high V_i) revert toward fair value, compressing their spreads.
- Returns driven by credit spread compression and coupon income.
- The strategy assumes mean-reversion in credit spreads around their rating- and maturity-implied fair value.
## Key Parameters / Signals
- S_i: observed credit spread (bond yield minus risk-free rate)
- S_i*: fair value credit spread from cross-sectional regression
- V_i = ln(S_i/S_i*) or V_i = S_i/S_i* - 1: value score
- Top decile by V_i: the portfolio selection criterion
## Variations
- Long-short: long top decile (cheap bonds), short bottom decile (expensive bonds).
- Separate regressions for Investment Grade and High Yield universes.
- Additional cross-sectional controls (e.g., industry, liquidity) can be added as regressors.
## Notes
- "Value" in fixed income is harder to define than in equities because bonds have finite lifetimes and their spreads are heavily influenced by credit ratings and maturity.
- The cross-sectional regression should be run on bonds within a comparable universe (e.g., only IG or only HY) to ensure meaningful comparisons.
- Credit spread data may be noisy; outliers from bonds near distress can distort the regression.
- Shorting corporate bonds is operationally challenging; the strategy is often implemented long-only.

View File

@@ -0,0 +1,56 @@
---
description: "Yield curve spread strategies (flatteners and steepeners) trade the difference in yields between two maturities of the same issuer, going short the spread when rates are expected to rise and long when rates are expected to fall."
tags: [fixed-income, yield-curve, spread, flattener, steepener, duration]
---
# Yield Curve Spread (Flatteners & Steepeners)
**Section**: 5.13 | **Asset Class**: Fixed Income | **Type**: Yield Curve / Spread
## Overview
Yield curve spread strategies trade the yield spread between two bonds of the same issuer at different maturities. If interest rates are expected to rise, the yield curve is expected to flatten (short end rises more than long end); if rates are expected to fall, the curve steepens. The strategy goes short the spread (flattener) or long the spread (steepener) accordingly.
## Construction / Mechanics
**Yield curve spread**: the difference in yields between a longer-maturity bond (back leg) and a shorter-maturity bond (front leg) of the same issuer:
```
Spread = Y(back leg, long maturity) - Y(front leg, short maturity)
```
**Trading rule**:
```
Rule = { Flattener: Short spread if interest rates expected to rise
{ Steepener: Buy spread if interest rates expected to fall (416)
```
**Position construction**:
- **Short the spread (flattener)**: sell shorter-maturity bonds (front leg) + buy longer-maturity bonds (back leg).
- **Buy the spread (steepener)**: buy shorter-maturity bonds (front leg) + sell longer-maturity bonds (back leg).
**Dollar-duration matching**: to immunize against small parallel shifts, match the dollar durations of the front and back legs:
```
P_front · D_front = P_back · D_back
```
Without duration matching, a parallel shift in the yield curve can generate significant losses.
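A minimal sketch of the dollar-duration matching step; the notionals and durations below are illustrative assumptions:
```python
def match_front_leg(P_back, D_back, D_front):
    """Front-leg market value P_front such that P_front * D_front = P_back * D_back."""
    return P_back * D_back / D_front

# Example flattener: buy $10mm of the 10y (D ~ 8.5) and size the short 2y (D ~ 1.9).
P_front = match_front_leg(P_back=10_000_000, D_back=8.5, D_front=1.9)
# ~ $44.7mm of the 2y offsets the 10y's dollar duration against small parallel shifts
```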
## Payoff / Return Profile
- **Flattener profits** when the curve flattens: short-end yields rise more than long-end yields (or long-end falls more than short-end).
- **Steepener profits** when the curve steepens: long-end yields rise more than short-end (or short-end falls more than long-end).
- Dollar-duration-neutral construction limits losses from parallel yield curve moves.
## Key Parameters / Signals
- Yield curve slope: R(long maturity) - R(short maturity) — the key signal
- Front leg maturity T_1, back leg maturity T_2: define the segment being traded
- Modified durations D_1, D_2: used for dollar-duration matching
- Interest rate outlook: the primary driver of direction (flattener vs. steepener)
## Variations
- **Curve trades across issuers**: trading the slope difference between two issuers (adds credit spread risk).
- **Butterfly trades**: extend the two-leg spread to a three-leg position to trade curvature rather than slope (see Sections 5.6-5.8).
## Notes
- Parallel yield curve shifts can cause losses if dollar durations are not matched; duration-matching is essential for a pure slope bet.
- Even with duration matching, large parallel moves (exceeding the immunization approximation) can generate losses due to convexity differences between legs.
- The strategy is exposed to idiosyncratic supply-and-demand effects at specific maturities (e.g., Treasury auction effects, central bank purchases).
- Financing costs (repo rates for the short leg) affect the net P&L of the strategy.

View File

@@ -0,0 +1,48 @@
---
description: "Futures calendar spread strategy that takes simultaneous long/short positions in near-month and deferred-month contracts to bet on supply/demand fundamentals while reducing overall market volatility exposure."
tags: [futures, calendar-spread, term-structure, spread-trading]
---
# Calendar Spread
**Section**: 10.2 | **Asset Class**: Futures | **Type**: Spread Trading / Relative Value
## Overview
A calendar spread (also called a time spread or intra-commodity spread) involves simultaneously buying and selling futures contracts on the same underlying commodity or asset but with different delivery months. By taking offsetting positions, the trader reduces exposure to outright price moves and focuses on the relative pricing of near versus deferred contracts, which reflects supply-and-demand fundamentals and storage costs.
## Construction / Mechanics
**Bull spread**: Buy a near-month futures contract, sell a deferred-month futures contract.
- P&L = price change of near-month - price change of deferred-month
- Benefits when near-month appreciates relative to deferred (supply tightening, demand surge)
**Bear spread**: Sell a near-month futures contract, buy a deferred-month futures contract.
- P&L = price change of deferred-month - price change of near-month
- Benefits when deferred-month appreciates relative to near-month (supply glut, weak demand)
**Economic rationale**: For commodity futures, near-month contracts react more strongly to current supply and demand imbalances than deferred contracts. Therefore:
- Expect low supply + high demand → use a **bull spread**
- Expect high supply + low demand → use a **bear spread**
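A minimal sketch of the bull-spread P&L arithmetic, with illustrative prices:
```python
def bull_spread_pnl(near_entry, near_exit, defer_entry, defer_exit, multiplier=1.0):
    """P&L of long near-month / short deferred-month, per contract pair."""
    return multiplier * ((near_exit - near_entry) - (defer_exit - defer_entry))

# Example: near rallies 1.50, deferred rallies 0.90 -> the spread gains 0.60 per unit.
pnl = bull_spread_pnl(near_entry=100.0, near_exit=101.5,
                      defer_entry=99.0, defer_exit=99.9)
```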
## Return Profile
Profits from changes in the spread between near and deferred contract prices. The outright directional market risk is substantially reduced (though not fully eliminated) relative to an outright futures position. The strategy is driven by term structure dynamics, convenience yield changes, storage cost changes, and short-term supply/demand imbalances.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| Near-month contract | The shorter-dated futures leg |
| Deferred-month contract | The longer-dated futures leg |
| Spread = near - deferred | Positive → backwardation; negative → contango |
| Bull signal | Expected low supply and high demand (buy spread) |
| Bear signal | Expected high supply and low demand (sell spread) |
## Variations
- **Skip-month spread**: skip one contract month between the two legs to amplify the spread move.
- **Butterfly spread**: three legs (buy near, sell middle, buy far) to isolate curvature of the term structure.
- **Crack spread** (energy): spread between crude oil and refined product futures (captures refining margin rather than a pure calendar spread).
- **Inter-commodity spread**: similar mechanics but between related but different commodities (e.g., corn vs. wheat).
## Notes
- While market exposure is reduced relative to outright futures, calendar spreads are not market-neutral; correlation between legs can break down during stress events.
- Margin requirements for calendar spreads are typically lower than for outright futures because exchanges recognise the reduced directional risk.
- Liquidity in deferred contracts is typically lower than in near-month contracts; wide bid-ask spreads on the deferred leg can erode profits.
- For financial futures (equity index, interest rate), the spread is primarily driven by carry (financing cost and dividend/coupon income) rather than physical supply and demand.

View File

@@ -0,0 +1,79 @@
---
description: "Futures mean-reversion strategy that buys recent underperformers and sells recent outperformers relative to an equally-weighted futures market index, with an extension using volume and open interest filters."
tags: [futures, mean-reversion, contrarian, market-index, dollar-neutral]
---
# Contrarian Trading (Mean-Reversion)
**Section**: 10.3 | **Asset Class**: Futures | **Type**: Mean-Reversion / Contrarian
## Overview
Analogous to the equity mean-reversion strategy (Section 3.9), this futures strategy bets that recent losers will rebound and recent winners will give back gains. Returns of individual futures are measured relative to an equally-weighted market index, and capital is allocated inversely to the deviation from that index. The result is a dollar-neutral, automatically constructed contrarian portfolio rebalanced weekly.
## Construction / Mechanics
Within a universe of N futures labeled i = 1,...,N, define the "market index" return as the equally-weighted average:
```
R_m = (1/N) Σ R_i (469)
```
where R_i are individual futures returns, typically measured over the trailing week.
The capital allocation weights are:
```
w_i = -γ [R_i - R_m] (470)
```
where γ > 0 is fixed via the dollar-neutral normalization condition:
```
Σ |w_i| = 1 (471)
```
- Futures below the market index (R_i < R_m): positive weight (long)
- Futures above the market index (R_i > R_m): negative weight (short)
- The portfolio is automatically dollar-neutral (Σ w_i = 0)
- The strategy buys losers and sells winners relative to the market index
**Volatility adjustment**: To mitigate overinvestment in volatile futures, suppress w_i by 1/σ_i or 1/σ_i², where σ_i are the historical volatilities.
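A minimal numpy sketch of Eqs. (469)-(471), including the optional volatility suppression noted above:
```python
import numpy as np

def contrarian_weights(returns, vols=None):
    """Eq. (470) weights with gamma fixed by Eq. (471)."""
    R = np.asarray(returns, dtype=float)
    dev = R - R.mean()                      # R_i - R_m, with R_m from Eq. (469)
    if vols is not None:                    # optional 1/sigma_i^2 suppression; note the
        dev = dev / np.asarray(vols)**2     # portfolio is then no longer exactly dollar-neutral
    return -dev / np.abs(dev).sum()         # normalize so that sum |w_i| = 1

# Example: the biggest loser gets the largest long weight.
w = contrarian_weights([-0.02, 0.00, 0.01, 0.03])
# w ~ [0.417, 0.083, -0.083, -0.417]; sum(w) = 0, sum(|w|) = 1
```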
## Return Profile
Profits when futures returns mean-revert toward the market index over a one-week horizon. Returns are driven by short-term overreaction and subsequent correction. The strategy is market-neutral at the index level.
## Key Parameters / Signals
| Parameter | Description |
|-----------|-------------|
| R_i | Individual futures return over the last week |
| R_m | Equally-weighted market index return (Eq. 469) |
| w_i = -γ[R_i - R_m] | Allocation weight; negative for winners, positive for losers |
| γ | Scaling parameter fixed by Eq. (471) |
| σ_i | Historical volatility; used to suppress w_i optionally |
| Rebalancing | Weekly |
## Variations
### 10.3.1 Contrarian Trading — Market Activity
Volume and open interest filters can improve the basic mean-reversion signal. Define:
```
v_i = ln(V_i / V_i') (472)
u_i = ln(U_i / U_i') (473)
```
where V_i is total volume for futures i over the last week, V_i' is total volume over the prior week, and U_i, U_i' are the analogous open interest quantities.
**Construction:**
1. Take the upper half of futures by volume factor v_i (higher recent volume relative to prior week).
2. Within that subset, take the lower half by open interest factor u_i.
3. Apply the contrarian weights from Eq. (470) to this filtered subset.
**Rationale:**
- Larger volume changes indicate greater overreaction (a stronger snap-back is expected).
- A decrease in open interest (low u_i) signals hedger withdrawal, leaving more room for the mean-reversion to play out.
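A minimal numpy sketch of the Eqs. (472)-(473) filter; the contrarian weights from Eq. (470) would then be applied only to the surviving subset:
```python
import numpy as np

def activity_filter(volume, volume_prev, oi, oi_prev):
    """Boolean mask: upper half by v_i, then lower half by u_i within that subset."""
    v = np.log(np.asarray(volume, dtype=float) / np.asarray(volume_prev, dtype=float))  # Eq. (472)
    u = np.log(np.asarray(oi, dtype=float) / np.asarray(oi_prev, dtype=float))          # Eq. (473)
    high_volume = v >= np.median(v)
    low_oi = u <= np.median(u[high_volume])
    return high_volume & low_oi
```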
## Notes
- The simple weighting scheme (Eq. 470) can overinvest in highly volatile futures; volatility scaling (1/σ_i or 1/σ_i²) is recommended in practice.
- Weekly rebalancing incurs transaction costs; the net alpha must exceed round-trip costs across all positions.
- Contrarian strategies can suffer sustained losses during trending regimes; combining with a trend-following overlay (Section 10.4) may reduce drawdowns.
- The market-index return R_m links this strategy to the broader futures universe; changing the universe composition changes the benchmark and alters all weights.
