Expand model tag support: add GLM-5.1, simplify Anthropic IDs, scan tags anywhere in message

- Flink update_bars debouncing
- update_bars subscription idempotency bugfix
- Price decimal correction bugfix of previous commit
- Add GLM-5.1 model tag alongside renamed GLM-5
- Use short Anthropic model IDs (sonnet/haiku/opus) instead of full version strings
- Allow @tags anywhere in message content, not just at start
- Return hasOtherContent flag instead of trimmed rest string
- Only trigger greeting stream when tag has no other content
- Update workspace knowledge base references to platform/workspace and platform/shapes
- Hierarchical knowledge base catalog
- 151 Trading Strategies knowledge base articles
- Shapes knowledge base article
- MutateShapes tool instead of workspace patch
2026-04-28 15:05:15 -04:00
parent d41fcd0499
commit 47471b7700
184 changed files with 9044 additions and 170 deletions
--- a/gateway/knowledge/trading/signal-combination.md
+++ b/gateway/knowledge/trading/signal-combination.md
@@ -0,0 +1,121 @@
+---
+description: "Institutional alpha combination: how to merge multiple weak signals into a single high-conviction output using the 11-step procedure and the Fundamental Law of Active Management."
+tags: [signals, alpha, portfolio, kelly, statistics]
+---
+
+# Signal Combination — Alpha Stacking
+
+## Core Law
+
+Do not search for one perfect signal. Combine many weak, independent signals.
+
+**Fundamental Law of Active Management:**
+```
+IR = IC × √N
+```
+- `IR` = Information Ratio of the combined system (risk-adjusted edge)
+- `IC` = average Information Coefficient per signal (correlation of prediction to outcome)
+- `N` = number of *genuinely independent* signals
+
+Real institutional signals have IC = 0.05–0.15. A single signal at IC = 0.10 gives IR = 0.10 × √1 = 0.10, while 50 independent signals at IC = 0.05 give IR = 0.05 × √50 ≈ 0.354, about 3.5× better.
+
+**Critical:** N counts *effective independent signals*, not raw signal count. Fifty correlated signals may yield only 10–15 effective ones. The 11-step procedure below forces honest accounting.
+
+---
+
+## Five Signal Categories
+
+| Category | What it measures | Why it persists |
+|---|---|---|
+| **Momentum / Price** | Direction/rate of price movement over lookback `d` | Underreaction causes short-term trend persistence |
+| **Mean Reversion** | Deviation from cross-sectional fair value | Related instruments maintain consistent relative pricing |
+| **Volatility** | Implied vs. realized volatility gap | Vol risk premium: sellers demand compensation |
+| **Factor** | Value, momentum, carry, quality, low-vol premiums | Persistent behavioral/structural inefficiencies |
+| **Microstructure** | Order book imbalance, bid-ask spread, VPIN | Informed order flow leads price movement |
+
+> **Dexorder scope**: Only crypto OHLCV data is available. Factor signals (value, carry, quality) require TradFi data not available here. Momentum, mean reversion, volatility, and microstructure signals are all applicable.
+
+---
+
+## 11-Step Combination Engine
+
+Given N signals with historical returns R(i,s) over M periods:
+
+**Step 1.** Collect realized return series R(i,s) for each signal i, each period s.
+
+**Step 2.** Remove drift — serially demean:
+```
+X(i,s) = R(i,s) − mean(R(i,·))
+```
+
+**Step 3.** Compute variance per signal:
+```
+σ(i)² = (1/M) × Σ X(i,s)²
+```
+
+**Step 4.** Normalize to common scale:
+```
+Y(i,s) = X(i,s) / σ(i)
+```
+Makes signals with different magnitudes directly comparable.
+
+**Step 5.** Drop the most recent observation from Y — use only out-of-sample history.
+
+**Step 6.** Cross-sectionally demean at each time period:
+```
+Λ(i,s) = Y(i,s) − avg_j(Y(j,s))
+```
+Removes any market-wide effect driving all signals simultaneously at that moment.
+
+**Step 7.** Drop the final period from Λ to eliminate residual look-ahead.
+
+**Step 8.** Estimate the expected forward return per signal as a moving average over the most recent d periods, then normalize:
+```
+E(i) = (1/d) × Σ R(i,s) over recent d periods
+E_normalized(i) = E(i) / σ(i)
+```
+
+**Step 9. (Critical)** Regress E_normalized(i) on Λ(i,s) across signals i, with one regressor per retained period s, no intercept, and unit weights. The residuals `ε(i)` are each signal's *independent* forward-looking contribution: the component not explained by any other signal in the stack.
+
+**Step 10.** Set weights:
+```
+w(i) = η × ε(i) / σ(i)
+```
+High independent edge + low noise → high weight. No subjective judgment.
+
+**Step 11.** Normalize: scale η so `Σ|w(i)| = 1`. No unintended leverage.
+
+**Combined output = Σ w(i) × signal_i_current_value**
+
+---
+
+## Empirical Kelly Sizing
+
+```
+f_empirical = f_kelly × (1 − CV_edge)
+
+f_kelly = (p × b − q) / b    where p = win probability, q = 1 − p, b = net payoff odds (profit per unit risked on a win)
+```
+- `CV_edge` = coefficient of variation of edge estimates across 10,000 Monte Carlo path simulations of historical returns
+- Higher uncertainty → smaller fraction. The formula automatically scales confidence to what is warranted.
+
+---
+
+## Key Failure Mode: Correlation Blindness
+
+Correlation blindness is believing you have 3 independent reasons for a trade when you have 1 reason expressed 3 times, then sizing the position as if for 3. This is the mechanism behind most systematic blowups where the trader was directionally correct but over-sized.
+
+The cross-sectional demeaning (Step 6) and regression residualization (Step 9) structurally prevent this by exposing shared variance before weights are assigned.
+
+---
+
+## Dexorder Application Note
+
+When combining multiple indicators into a single entry/exit signal:
+
+1. Each indicator (momentum, RSI divergence, volume profile, spread, etc.) is a signal producing a score or directional estimate.
+2. Run the 11-step engine over backtested signal histories to derive weights.
+3. Combined score = weighted sum of current signal outputs.
+4. Size the resulting position using empirical Kelly with CV_edge from simulation.
+
+If computing probability estimates (e.g. probability of upward breakout), substitute probability estimates for return estimates at each step — the math is identical.
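
The 11-step combination engine added in `signal-combination.md` can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's implementation: the function name `combine_signals`, the `(N, M)` array layout (oldest period first), and the random test data are all assumptions made for the example. It also assumes N − 1 > M − 2 (more signals than retained periods); otherwise the regression in Step 9 absorbs everything except a constant and the residuals stop being informative.

```python
import numpy as np


def combine_signals(R: np.ndarray, d: int) -> np.ndarray:
    """Return weights w(i) for N signals from realized returns R of shape (N, M).

    Hypothetical helper: columns of R are periods, oldest first.
    """
    N, M = R.shape
    if M < 3 or not 1 <= d <= M:
        raise ValueError("need at least 3 periods and 1 <= d <= M")

    # Step 2: serially demean each signal's return series.
    X = R - R.mean(axis=1, keepdims=True)
    # Step 3: per-signal variance with the 1/M convention, kept as sigma.
    sigma = np.sqrt((X ** 2).mean(axis=1))
    # Step 4: normalize to a common scale.
    Y = X / sigma[:, None]
    # Step 5: drop the most recent observation.
    Y = Y[:, :-1]
    # Step 6: cross-sectionally demean at each period.
    Lam = Y - Y.mean(axis=0, keepdims=True)
    # Step 7: drop the final period of Lambda.
    Lam = Lam[:, :-1]
    # Step 8: d-period moving average of returns, normalized by sigma.
    E_norm = R[:, -d:].mean(axis=1) / sigma
    # Step 9: regress E_norm on the columns of Lambda, no intercept,
    # unit weights; the residuals are eps(i).
    beta, *_ = np.linalg.lstsq(Lam, E_norm, rcond=None)
    eps = E_norm - Lam @ beta
    # Step 10: raw weights proportional to eps(i) / sigma(i).
    w = eps / sigma
    # Step 11: choose eta so that sum |w(i)| = 1.
    return w / np.abs(w).sum()


# Illustrative data only: 20 signals, 8 periods of synthetic returns.
rng = np.random.default_rng(0)
R = rng.normal(0.0, 0.01, size=(20, 8))
w = combine_signals(R, d=5)

# Combined output = sum of w(i) * current signal value (synthetic here).
current = rng.normal(size=20)
combined = float(w @ current)
```

The least-squares solve is used instead of an explicit normal-equation inverse because the columns of Λ each sum to zero across signals, which can make the design matrix rank-deficient.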
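
The empirical Kelly formula in the article leaves the simulation method open. The sketch below is one possible reading, under stated assumptions: the "edge" of a path is taken to be its mean per-trade return, the Monte Carlo paths are bootstrap resamples of the historical trade returns, `b` is estimated as average win over average loss, and the haircut is clamped at zero when `CV_edge` exceeds 1. The names `kelly_fraction` and `empirical_kelly` and the sample returns are hypothetical.

```python
import numpy as np


def kelly_fraction(p: float, b: float) -> float:
    """Classic Kelly fraction: f = (p*b - q) / b with q = 1 - p."""
    q = 1.0 - p
    return (p * b - q) / b


def empirical_kelly(returns, n_paths: int = 10_000, seed: int = 0):
    """Return (f_kelly, cv_edge, f_empirical) from per-trade returns.

    Assumption: paths are bootstrap resamples and edge = mean return.
    """
    r = np.asarray(returns, dtype=float)
    wins = r[r > 0]
    losses = -r[r < 0]
    p = len(wins) / len(r)            # empirical win probability
    b = wins.mean() / losses.mean()   # average win / average loss
    f_kelly = kelly_fraction(p, b)

    # Resample the trade history with replacement, one row per path.
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(r), size=(n_paths, len(r)))
    edges = r[idx].mean(axis=1)

    # Coefficient of variation of the edge estimate across paths.
    cv_edge = edges.std() / abs(edges.mean())
    # Clamp so that a very uncertain edge yields zero size, not a short.
    f_empirical = f_kelly * max(0.0, 1.0 - cv_edge)
    return f_kelly, cv_edge, f_empirical


# Illustrative trade history: 200 trades built from a repeating pattern.
sample = np.array([0.02, -0.01, 0.03, -0.01, 0.01, 0.02, -0.02, 0.015] * 25)
f_kelly, cv_edge, f_emp = empirical_kelly(sample)
```

A longer or more consistent trade history lowers `cv_edge` and moves the empirical fraction closer to the plain Kelly fraction, which matches the article's point that higher uncertainty should shrink the position.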