ai/gateway/knowledge/trading/strategies/stocks/mean-reversion-cluster.md
Tim Olson 47471b7700 Expand model tag support: add GLM-5.1, simplify Anthropic IDs, scan tags anywhere in message
2026-04-28 15:05:15 -04:00

Description: Generalizes pairs trading to N>2 historically correlated stocks within a cluster (e.g., an industry), buying underperformers and shorting overperformers relative to the cluster mean.
Tags: stocks, mean-reversion, cluster, statistical-arbitrage

Mean-Reversion — Single Cluster (and Multiple Clusters)

Section: 3.9 / 3.9.1 | Asset Class: Stocks | Type: Mean-Reversion / Statistical Arbitrage

Overview

This strategy generalizes pairs trading to N > 2 stocks that are historically highly correlated — for example, stocks belonging to the same industry or sector. Each stock's return is demeaned relative to the cluster mean, and positions are taken proportional to negative demeaned returns: buy stocks that underperformed the cluster and short stocks that outperformed it.

Construction / Signal

Let R_i = ln(P_i(t_2) / P_i(t_1)) be the log return of stock i from time t_1 to a later time t_2, for each of the N stocks in the cluster.

Cluster mean return and demeaned returns:

R_bar = (1/N) * sum_{i=1}^{N} R_i                         (293)
R_tilde_i = R_i - R_bar                                   (294)

Short stocks with positive R_tilde_i (outperformers), buy stocks with negative R_tilde_i (underperformers).
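
As a minimal sketch of Eqs. (293) and (294), the snippet below computes log returns, the cluster mean, and the demeaned returns for one cluster. The function name `demeaned_log_returns` and the price values are hypothetical, chosen only for illustration.

```python
import math

def demeaned_log_returns(p_start, p_end):
    """Return (R, R_bar, R_tilde) for one cluster, following Eqs. (293)-(294)."""
    # Log return R_i = ln(P_i(t_2) / P_i(t_1)) for each stock.
    r = [math.log(p2 / p1) for p1, p2 in zip(p_start, p_end)]
    # Cluster mean return R_bar, Eq. (293).
    r_bar = sum(r) / len(r)
    # Demeaned returns R_tilde_i, Eq. (294).
    r_tilde = [ri - r_bar for ri in r]
    return r, r_bar, r_tilde

# Hypothetical prices for a three-stock cluster (illustrative numbers only).
p_start = [100.0, 50.0, 20.0]
p_end = [105.0, 50.0, 19.0]

r, r_bar, r_tilde = demeaned_log_returns(p_start, p_end)

# Positive demeaned return -> short; negative -> buy.
signals = ["short" if x > 0 else "buy" if x < 0 else "flat" for x in r_tilde]
```

In this example the second stock's price is unchanged, yet it is still a (small) short, because the cluster mean is slightly negative. The signal depends on performance relative to the cluster, not on the raw return.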

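Total-investment (295) and dollar-neutrality (296) constraints, where Q_i is the number of shares held in stock i (negative for short positions) and I is the total dollar investment: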

sum_{i=1}^{N} P_i |Q_i| = I                               (295)
sum_{i=1}^{N} P_i Q_i = 0                                 (296)

A simple prescription for dollar positions D_i = P_i Q_i proportional to demeaned returns:

D_i = -gamma * R_tilde_i                                  (297)

where gamma > 0 (short outperformers, buy underperformers). Eq. (296) is satisfied automatically because the demeaned returns sum to zero by construction; Eq. (295) then fixes gamma:

gamma = I / sum_{i=1}^{N} |R_tilde_i|                     (298)
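
The sizing rule in Eqs. (297) and (298) can be sketched as follows. The function name `dollar_positions`, the zero-dispersion guard, and the input numbers are assumptions added for illustration; they are not part of the original prescription.

```python
def dollar_positions(r_tilde, investment):
    """Dollar positions D_i = -gamma * R_tilde_i, per Eqs. (297)-(298)."""
    gross = sum(abs(x) for x in r_tilde)
    if gross == 0.0:
        # Assumed guard: all stocks moved identically, so there is no signal.
        return [0.0 for _ in r_tilde]
    gamma = investment / gross  # Eq. (298)
    return [-gamma * x for x in r_tilde]  # Eq. (297)

# Hypothetical demeaned returns (they sum to zero) and total investment I.
r_tilde = [0.03, 0.01, -0.04]
d = dollar_positions(r_tilde, investment=1_000_000.0)

gross_exposure = sum(abs(x) for x in d)  # should match I, Eq. (295)
net_exposure = sum(d)                    # should be ~0, Eq. (296)
```

With these inputs the positions come out to roughly -375,000, -125,000, and +500,000 dollars. Share quantities would then follow from Q_i = D_i / P_i.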

Entry / Exit Rules

  • Entry: Compute demeaned returns over a short measurement window; enter positions D_i = -gamma * R_tilde_i.
  • Exit: Close when demeaned returns converge back toward zero, or at a predefined time horizon.
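
The exit rule above leaves the convergence tolerance and the time horizon open. One possible reading is sketched below; the function name `should_exit` and the default values of `tol` and `max_days` are purely hypothetical placeholders and would need calibration.

```python
def should_exit(r_tilde_now, days_held, tol=0.005, max_days=10):
    """Exit if every demeaned return is within `tol` of zero, or the horizon is hit.

    `tol` and `max_days` are assumed parameters, not values from the strategy text.
    """
    converged = max(abs(x) for x in r_tilde_now) <= tol
    timed_out = days_held >= max_days
    return converged or timed_out

exit_converged = should_exit([0.001, -0.002, 0.001], days_held=2)
exit_not_yet = should_exit([0.03, -0.03], days_held=2)
exit_timeout = should_exit([0.03, -0.03], days_held=10)
```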

Key Parameters

  • Cluster definition: Industry group, sector, or any set of historically correlated stocks
  • Measurement window: Short-term (days to weeks) for mean-reversion
  • Position sizing: Dollar-neutral via gamma normalization (Eq. 298)
  • Weights: Uniform modulus, or non-uniform (e.g., suppressed by volatility)

Variations

3.9.1 — Mean-Reversion: Multiple Clusters

Generalize to K > 1 clusters, where the stocks within each cluster are historically highly correlated. The clusters can be treated independently, or handled together in a unified way via linear regression, as follows.

Let Lambda_{iA} be the N×K binary loadings matrix: Lambda_{iA} = 1 if stock i belongs to cluster A, else 0. Cluster sizes: N_A = sum_{i=1}^{N} Lambda_{iA} > 0, N = sum_{A=1}^{K} N_A.
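
A minimal sketch of building the binary loadings matrix from one cluster label per stock. The function name `loadings_matrix`, the 0-based labels, and the example assignment are assumptions for illustration.

```python
def loadings_matrix(cluster_of, n_clusters):
    """Build the N x K binary matrix with Lambda[i][A] = 1 iff stock i is in cluster A."""
    lam = [[0] * n_clusters for _ in cluster_of]
    for i, a in enumerate(cluster_of):
        lam[i][a] = 1
    return lam

# Hypothetical assignment of N = 5 stocks to K = 2 clusters (0-based labels).
cluster_of = [0, 0, 1, 1, 1]
lam = loadings_matrix(cluster_of, n_clusters=2)

# Cluster sizes N_A are the column sums; they add up to N.
sizes = [sum(row[a] for row in lam) for a in range(2)]
```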

Run a linear regression of stock returns R_i on Lambda_{iA} (no intercept, unit weights):

R_i = sum_{A=1}^{K} Lambda_{iA} f_A + epsilon_i           (303)

Regression coefficients (cluster mean returns):

f = Q^{-1} Lambda^T R,   Q = Lambda^T Lambda               (304, 305)
Q_{AB} = N_A delta_{AB}                                   (307)
f_A = R_bar_A = (1/N_A) sum_{j in J_A} R_j                (308)
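
To illustrate why the regression coefficients reduce to cluster means, the sketch below forms Q = Lambda^T Lambda and Lambda^T R explicitly in plain Python. The function name `regress_on_clusters` and the return values are hypothetical; the diagonal shortcut for Q^{-1} relies on the non-overlapping-clusters assumption.

```python
def regress_on_clusters(lam, r):
    """Return (Q, f) with Q = Lambda^T Lambda and f = Q^{-1} Lambda^T R."""
    n, k = len(lam), len(lam[0])
    # Q_{AB} = sum_i Lambda_{iA} Lambda_{iB}; diagonal when clusters do not overlap.
    q = [[sum(lam[i][a] * lam[i][b] for i in range(n)) for b in range(k)]
         for a in range(k)]
    # (Lambda^T R)_A = sum of returns in cluster A.
    lt_r = [sum(lam[i][a] * r[i] for i in range(n)) for a in range(k)]
    # Since Q is diagonal with entries N_A, inverting it is a division by N_A.
    f = [lt_r[a] / q[a][a] for a in range(k)]
    return q, f

# Hypothetical loadings (2 clusters of sizes 2 and 3) and returns.
lam = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
r = [0.02, -0.01, 0.03, 0.00, -0.06]

q, f = regress_on_clusters(lam, r)

# Direct cluster means for comparison.
mean_0 = (r[0] + r[1]) / 2
mean_1 = (r[2] + r[3] + r[4]) / 3
```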

Demeaned return (residual) for stock i:

epsilon_i = R_i - R_bar_{G(i)} = R_tilde_i                (309)

where G(i) is the cluster to which stock i belongs, so that J_A = {i : G(i) = A} is the set of stocks in cluster A. These residuals are cluster-neutral:

sum_{i=1}^{N} R_tilde_i Lambda_{iA} = 0,  A = 1,...,K    (310)

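Summing Eq. (310) over A and using sum_{A=1}^{K} Lambda_{iA} = 1 gives sum_{i=1}^{N} R_tilde_i = 0, so positions proportional to R_tilde_i are automatically dollar-neutral.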
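
The cluster-neutrality property can be checked numerically. The sketch below computes residuals by subtracting each stock's own cluster mean and then sums them per cluster; the function name `cluster_residuals` and the data are hypothetical.

```python
def cluster_residuals(r, cluster_of):
    """Residuals epsilon_i = R_i - (mean return of stock i's cluster)."""
    sums, counts = {}, {}
    for ri, a in zip(r, cluster_of):
        sums[a] = sums.get(a, 0.0) + ri
        counts[a] = counts.get(a, 0) + 1
    means = {a: sums[a] / counts[a] for a in sums}
    return [ri - means[a] for ri, a in zip(r, cluster_of)]

# Hypothetical returns and cluster labels (0-based).
r = [0.02, -0.01, 0.03, 0.00, -0.06]
cluster_of = [0, 0, 1, 1, 1]

eps = cluster_residuals(r, cluster_of)

# Sum of residuals within each cluster (left-hand side of Eq. (310)).
per_cluster_sum = {
    a: sum(e for e, c in zip(eps, cluster_of) if c == a)
    for a in set(cluster_of)
}
total = sum(eps)
```

Each per-cluster sum and the overall total should vanish up to floating-point rounding.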

Investments can be allocated uniformly across the K independent cluster strategies.
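
One way to read the uniform allocation is to give each cluster an equal budget of I/K and apply the single-cluster sizing rule within it. The sketch below assumes that reading; the function name `multi_cluster_positions` and the inputs are hypothetical.

```python
def multi_cluster_positions(r_tilde, cluster_of, investment):
    """Split `investment` equally across clusters, then size as in Eqs. (297)-(298)."""
    clusters = sorted(set(cluster_of))
    budget = investment / len(clusters)  # assumed equal budget I/K per cluster
    positions = [0.0] * len(r_tilde)
    for a in clusters:
        members = [i for i, c in enumerate(cluster_of) if c == a]
        gross = sum(abs(r_tilde[i]) for i in members)
        if gross == 0.0:
            continue  # assumed guard: no dispersion in this cluster, no trade
        gamma_a = budget / gross
        for i in members:
            positions[i] = -gamma_a * r_tilde[i]
    return positions

# Hypothetical cluster-demeaned returns (each cluster sums to zero).
cluster_of = [0, 0, 1, 1, 1]
r_tilde = [0.015, -0.015, 0.04, 0.01, -0.05]

d = multi_cluster_positions(r_tilde, cluster_of, investment=1_000_000.0)

gross_by_cluster = [sum(abs(d[i]) for i in range(5) if cluster_of[i] == a) for a in (0, 1)]
net_by_cluster = [sum(d[i] for i in range(5) if cluster_of[i] == a) for a in (0, 1)]
```

Under this allocation each cluster is dollar-neutral on its own, so the combined book is both cluster-neutral and dollar-neutral.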

Notes

  • The single-cluster strategy (3.9) is the natural generalization of pairs trading to N stocks.
  • The multiple-cluster (3.9.1) formulation uses linear regression to compute all cluster means simultaneously in a unified framework.
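  • Each row of the binary loadings matrix Lambda_{iA} has exactly one nonzero entry, i.e., each stock belongs to exactly one cluster (no overlap assumed).
  • No separate intercept is needed in the regression: because sum_A Lambda_{iA} = 1 for every stock, the unit vector is a linear combination of the columns of Lambda, so the intercept is already subsumed.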
  • Mean-reversion strategies rely on the stocks being cointegrated or at least highly correlated; sector and industry groupings provide natural clusters.