ai/gateway/knowledge/trading/signal-combination.md
Tim Olson 47471b7700 Expand model tag support: add GLM-5.1, simplify Anthropic IDs, scan tags anywhere in message
2026-04-28 15:05:15 -04:00

---
description: "Institutional alpha combination: how to merge multiple weak signals into a single high-conviction output using the 11-step procedure and the Fundamental Law of Active Management."
tags: [signals, alpha, portfolio, kelly, statistics]
---
# Signal Combination — Alpha Stacking
## Core Law
Do not search for one perfect signal. Combine many weak, independent signals.
**Fundamental Law of Active Management:**
```
IR = IC × √N
```
- `IR` = Information Ratio of the combined system (risk-adjusted edge)
- `IC` = average Information Coefficient per signal (correlation of prediction to outcome)
- `N` = number of *genuinely independent* signals
Real institutional signals have IC = 0.05 to 0.15. A single signal at IC = 0.10 (IR = 0.10) is outperformed by 50 independent signals at IC = 0.05 (IR = 0.05 × √50 ≈ 0.354, over 3× better).
**Critical:** N counts *effective independent signals*, not raw signal count. Fifty correlated signals may yield only 10 to 15 effective ones. The 11-step procedure below forces honest accounting.
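The arithmetic above can be checked with a minimal sketch. The function name `information_ratio` is illustrative, and the haircut to 12 effective signals is just one value picked from the 10 to 15 range mentioned above.

```python
import math

def information_ratio(ic: float, n_independent: float) -> float:
    """Fundamental Law of Active Management: IR = IC * sqrt(N)."""
    return ic * math.sqrt(n_independent)

single = information_ratio(0.10, 1)     # one signal at IC = 0.10
stacked = information_ratio(0.05, 50)   # fifty truly independent signals at IC = 0.05
haircut = information_ratio(0.05, 12)   # same fifty, if only ~12 are effectively independent

print(f"single={single:.3f} stacked={stacked:.3f} haircut={haircut:.3f}")
# single=0.100 stacked=0.354 haircut=0.173
```

Even after the haircut the stack beats the single stronger signal, but by far less than the raw count suggests.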
---
## Five Signal Categories
| Category | What it measures | Why it persists |
|---|---|---|
| **Momentum / Price** | Direction/rate of price movement over lookback `d` | Underreaction causes short-term trend persistence |
| **Mean Reversion** | Deviation from cross-sectional fair value | Related instruments maintain consistent relative pricing |
| **Volatility** | Implied vs. realized volatility gap | Vol risk premium: sellers demand compensation |
| **Factor** | Value, momentum, carry, quality, low-vol premiums | Persistent behavioral/structural inefficiencies |
| **Microstructure** | Order book imbalance, bid-ask spread, VPIN | Informed order flow leads price movement |
> **Dexorder scope**: Only crypto OHLCV data is available. Factor signals (value, carry, quality) require TradFi data not available here. Momentum, mean reversion, volatility, and microstructure signals are all applicable.
---
## 11-Step Combination Engine
Given N signals with historical returns R(i,s) over M periods:
**Step 1.** Collect realized return series R(i,s) for each signal i, each period s.
**Step 2.** Remove drift — serially demean:
```
X(i,s) = R(i,s) - mean(R(i,·))
```
**Step 3.** Compute variance per signal:
```
σ(i)² = (1/M) × Σ X(i,s)²
```
**Step 4.** Normalize to common scale:
```
Y(i,s) = X(i,s) / σ(i)
```
Makes signals with different magnitudes directly comparable.
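Steps 2 to 4 might look like this in NumPy. This is a sketch, not a reference implementation: the name `normalize_returns` and the array layout (one row per signal, one column per period) are assumptions, and a signal with zero variance would need special handling before the division.

```python
import numpy as np

def normalize_returns(R: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """R has shape (N signals, periods). Returns (Y, sigma)."""
    X = R - R.mean(axis=1, keepdims=True)     # Step 2: serial demeaning per signal
    sigma = np.sqrt((X ** 2).mean(axis=1))    # Step 3: sigma(i)^2 = mean of X(i,s)^2
    Y = X / sigma[:, None]                    # Step 4: unit-variance rows
    return Y, sigma

# Two hypothetical signals: the second is the first scaled by 100.
R = np.array([[0.01, -0.02, 0.03, 0.00],
              [1.00, -2.00, 3.00, 0.00]])
Y, sigma = normalize_returns(R)
print(np.allclose(Y[0], Y[1]))  # the scale difference disappears after normalization
```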
**Step 5.** Drop the most recent observation from Y — use only out-of-sample history.
**Step 6.** Cross-sectionally demean at each time period:
```
Λ(i,s) = Y(i,s) - avg_j(Y(j,s))
```
Removes any market-wide effect driving all signals simultaneously at that moment.
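A small illustration of Step 6, assuming the same layout (rows are signals, columns are periods). The matrix values are made up, and `cross_sectional_demean` is an illustrative name.

```python
import numpy as np

def cross_sectional_demean(Y: np.ndarray) -> np.ndarray:
    """At each period s, subtract the average of Y(j, s) taken across signals j."""
    return Y - Y.mean(axis=0, keepdims=True)

# Hypothetical normalized returns; the first row happens to equal the column averages.
Y = np.array([[1.0, 2.0, -1.0],
              [0.5, 2.5, -1.5],
              [1.5, 1.5, -0.5]])
Lam = cross_sectional_demean(Y)
print(Lam)
```

The first signal carries nothing beyond the common move, so its row in `Lam` is all zeros; the other two rows keep only what distinguishes them from the group.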
**Step 7.** Drop the final period from Λ to eliminate residual look-ahead.
**Step 8.** Estimate the expected forward return per signal as a d-period moving average of its realized returns, then normalize by σ(i):
```
E(i) = (1/d) × Σ R(i,s) over recent d periods
E_normalized(i) = E(i) / σ(i)
```
**Step 9. (Critical)** Regress E_normalized(i) on Λ(i,s) across signals i, treating each retained period s as one regressor, with no intercept and unit weights. The residuals `ε(i)` are each signal's *independent* forward-looking contribution: the component not explained by any other signal in the stack.
**Step 10.** Set weights:
```
w(i) = η × ε(i) / σ(i)
```
High independent edge + low noise → high weight. No subjective judgment.
**Step 11.** Normalize: scale η so `Σ|w(i)| = 1`. No unintended leverage.
**Combined output = Σ w(i) × signal_i_current_value**
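The eleven steps can be strung together roughly as follows. This is a sketch under stated assumptions: `combination_weights` is a hypothetical name, `R` is an (N signals × periods) NumPy array ordered oldest to newest, the regression uses `np.linalg.lstsq`, and the demo data is synthetic noise rather than real signal returns.

```python
import numpy as np

def combination_weights(R: np.ndarray, d: int) -> np.ndarray:
    """Return weights w(i) with sum(|w|) = 1 from realized returns R (Step 1)."""
    X = R - R.mean(axis=1, keepdims=True)          # Step 2: serial demean
    sigma = np.sqrt((X ** 2).mean(axis=1))         # Step 3: per-signal volatility
    Y = X / sigma[:, None]                         # Step 4: common scale
    Y = Y[:, :-1]                                  # Step 5: drop most recent observation
    Lam = Y - Y.mean(axis=0, keepdims=True)        # Step 6: cross-sectional demean
    Lam = Lam[:, :-1]                              # Step 7: drop final period
    E_norm = R[:, -d:].mean(axis=1) / sigma        # Step 8: d-period average, normalized
    # Step 9: cross-sectional regression of E_norm on the columns of Lam,
    # no intercept, unit weights; keep the residuals.
    beta, *_ = np.linalg.lstsq(Lam, E_norm, rcond=None)
    eps = E_norm - Lam @ beta
    raw = eps / sigma                              # Step 10 with eta = 1
    return raw / np.abs(raw).sum()                 # Step 11: choose eta so sum|w| = 1

rng = np.random.default_rng(0)
R = rng.normal(0.001, 0.02, size=(20, 8))          # 20 synthetic signals, 8 periods
w = combination_weights(R, d=3)
current = rng.normal(size=20)                      # placeholder current signal values
combined = float(w @ current)                      # combined output = sum of w(i) * signal_i
print(np.abs(w).sum())
```

One property worth knowing: every column of Λ sums to zero across signals, so once the number of retained periods reaches N - 1 the regression generally absorbs everything except the common mean and the residuals stop discriminating between signals. The sketch therefore uses many more signals than periods.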
---
## Empirical Kelly Sizing
```
f_empirical = f_kelly × (1 - CV_edge)
f_kelly = (p × b - q) / b
```
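- `p` = probability of a winning trade, `q` = 1 - p, `b` = payoff ratio (average win / average loss)
- `CV_edge` = coefficient of variation of edge estimates across 10,000 Monte Carlo path simulations of historical returns
- Higher uncertainty → smaller fraction. The formula automatically scales position size down to what the evidence supports.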
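A minimal sketch of the sizing rule using only the standard library. Several choices here are assumptions rather than part of the formula above: the Monte Carlo step is a bootstrap resample of historical returns, "edge" is taken to be the mean return of each resampled path, and the result is clamped at zero when `CV_edge` is 1 or more. The return history is invented for illustration.

```python
import random
import statistics

def kelly_fraction(p: float, b: float) -> float:
    """Classic Kelly: f = (p*b - q) / b, with q = 1 - p."""
    q = 1.0 - p
    return (p * b - q) / b

def cv_of_edge(returns: list[float], n_paths: int = 10_000, seed: int = 0) -> float:
    """Bootstrap the history; CV = std / |mean| of the per-path mean return."""
    rng = random.Random(seed)
    n = len(returns)
    edges = [statistics.fmean(rng.choices(returns, k=n)) for _ in range(n_paths)]
    return statistics.pstdev(edges) / abs(statistics.fmean(edges))

def empirical_kelly(p: float, b: float, cv_edge: float) -> float:
    """f_empirical = f_kelly * (1 - CV_edge); clamping at 0 is an assumption."""
    return max(0.0, kelly_fraction(p, b) * (1.0 - cv_edge))

history = [0.02, -0.01, 0.03, -0.02, 0.01, 0.04, -0.01, 0.02]  # hypothetical trade returns
cv = cv_of_edge(history)
print(empirical_kelly(p=0.55, b=1.0, cv_edge=cv))
```

With `p = 0.55` and `b = 1.0` the plain Kelly fraction is 0.10; a `CV_edge` of 0.4 would cut that to 0.06. `cv_of_edge` divides by the mean edge, so a history whose resampled mean is near zero needs separate handling.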
---
## Key Failure Mode: Correlation Blindness
Correlation blindness is believing you have 3 independent reasons for a trade when you have 1 reason expressed 3 times, and sizing the position as if for 3. This is the mechanism behind most systematic blowups where the trader was directionally correct but over-sized.
The cross-sectional demeaning (Step 6) and regression residualization (Step 9) structurally prevent this by exposing shared variance before weights are assigned.
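For a quick sanity check before running the full engine, one common rule of thumb (not part of the 11-step procedure, and assuming equal-variance signals that share a single average pairwise correlation) estimates the effective signal count as N / (1 + (N - 1) × ρ). The function name `effective_n` is illustrative.

```python
def effective_n(n: int, avg_corr: float) -> float:
    """Effective independent count for n equicorrelated, equal-variance signals."""
    return n / (1.0 + (n - 1) * avg_corr)

print(effective_n(3, 1.0))    # three copies of one reason → 1.0
print(effective_n(3, 0.0))    # three truly independent reasons → 3.0
print(effective_n(50, 0.06))  # modest correlation already shrinks fifty signals sharply
```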
---
## Dexorder Application Note
When combining multiple indicators into a single entry/exit signal:
1. Each indicator (momentum, RSI divergence, volume profile, spread, etc.) is a signal producing a score or directional estimate.
2. Run the 11-step engine over backtested signal histories to derive weights.
3. Combined score = weighted sum of current signal outputs.
4. Size the resulting position using empirical Kelly with CV_edge from simulation.
If computing probability estimates (e.g. probability of upward breakout), substitute probability estimates for return estimates at each step — the math is identical.
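The weighted sum in item 3 is a one-liner once weights exist. In this sketch the indicator names, weights, and current readings are all hypothetical placeholders; real weights would come from running the engine on backtested histories.

```python
def combined_score(weights: dict[str, float], signals: dict[str, float]) -> float:
    """Weighted sum of current signal outputs, matching weights to signals by name."""
    return sum(weights[name] * signals[name] for name in weights)

# Hypothetical weights with sum(|w|) = 1, and hypothetical current readings.
weights = {"momentum": 0.5, "rsi_divergence": -0.2, "volume_profile": 0.3}
signals = {"momentum": 1.0, "rsi_divergence": 0.5, "volume_profile": -1.0}

score = combined_score(weights, signals)
print(score)  # positive leans long, negative leans short
```

Keying both dictionaries by indicator name avoids silently misaligning weights and signals when an indicator is added or removed.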