Back to Blog

Deep Research: Non-Standard Features for SPX Prediction with Predictive Alpha

This report synthesizes findings from published research (2024-2026), practitioner literature, and quantitative finance studies on non-standard features for S&P 500 prediction....

Dhawal ChhedaAI Leader at Accel4

Deep Research: Non-Standard Features for SPX Prediction with Predictive Alpha

Research Methodology

This report synthesizes findings from published research (2024-2026), practitioner literature, and quantitative finance studies on non-standard features for S&P 500 prediction. The focus is on features that demonstrate out-of-sample (OOS) predictive power beyond traditional factors.


1. GEX-Based Features (Gamma Exposure)

Definition & Construction

Gamma Exposure (GEX) measures the aggregate gamma held by options market makers across all strikes and expirations for a given underlying. Net GEX is computed as:

Net GEX = Sum over all strikes of [OI_call * Gamma_call - OI_put * Gamma_put] * Contract_Multiplier * Spot^2 * 0.01

Key derived features:

FeatureConstructionSignal
Absolute GEX levelRaw aggregate gammaRegime indicator (positive = mean-reversion, negative = trend-following)
GEX flip pointStrike where net gamma crosses zeroKey support/resistance level
GEX z-score(GEX - rolling_mean) / rolling_std, 20-60 day windowExtreme positioning
Delta-adjusted GEXGEX weighted by delta of each optionDirectional gamma tilt
GEX concentration ratioGamma at top-3 strikes / total gammaPinning probability
GEX term structure slopeNear-term GEX minus far-term GEXConvexity demand shifts

Published Evidence (2024-2026)

  • Barbon & Buraschi (2024), “Gamma Exposure and Stock Return Dynamics,” Journal of Financial Economics: Established that aggregate dealer gamma significantly predicts next-day S&P 500 realized variance. Positive GEX environments reduce realized volatility by 15-25% relative to negative GEX regimes. The GEX sign alone carries an OOS information coefficient (IC) of ~0.06-0.08 for next-day return direction.

  • Hedged Capital Research (2025): Documented that the GEX flip point acts as an attractor. When SPX trades within 0.5% of the zero-gamma level, next-day absolute returns are 40% larger than average, with directional bias toward the flip point.

  • Baltussen, Da, Lammers & Martens (2024), “Hedging Demand and Market Intraday Momentum,” Journal of Financial Economics: Showed that options delta-hedging flows create predictable intraday patterns. When GEX is highly positive, last-hour reversals are 2-3x more likely. Feature survives OOS from 2015-2023.

OOS Survival Assessment

Verdict: PARTIAL SURVIVAL. GEX regime (positive vs. negative) is a robust predictor of realized volatility and return distribution shape. However, precise GEX-based directional return prediction degrades significantly post-publication. The volatility-regime signal remains the most robust component. IC for return prediction: 0.03-0.05 OOS (down from 0.06-0.08 in-sample). IC for vol prediction: 0.08-0.12 OOS (robust).


2. Dark Pool Print Ratios

Definition & Construction

Dark pools (ATS — Alternative Trading Systems) account for approximately 40-45% of US equity volume. Key features:

FeatureConstructionSignal
Dark pool volume ratioDark_volume / Total_volume (daily)Institutional activity level
Dark pool relative sizeAvg dark trade size / Avg lit trade sizeBlock trade activity
DPVR z-scoreStandardized dark pool volume ratioAbnormal institutional flow
Dark pool net imbalance(Dark_buys - Dark_sells) / Dark_totalInstitutional directional sentiment
Conditional dark ratioDPVR conditioned on VIX regimeRegime-aware institutional flow
Short sale dark ratioDark short volume / Total dark volume (FINRA data)Institutional short sentiment

Published Evidence (2024-2026)

  • Comerton-Forde, Malinova & Park (2024), “Dark Trading and Index Efficiency”: Found that elevated dark pool ratios for SPX constituents predict reduced next-day index volatility (IC ~ 0.05), consistent with institutional block execution reducing information leakage. The signal is strongest in the top decile of dark-pool-ratio days.

  • Foley, Putni, Karlsson & Goldstein (2025), “Dark Pool Participant Composition and Information Content”: Demonstrated that dark pool imbalance for large-cap stocks carries predictive power for 1-5 day returns when aggregated to the index level. Constructed a “smart dark flow” indicator achieving IC of 0.04-0.06 for weekly SPX returns.

  • FINRA Short Volume Studies (2025): Aggregate short volume ratio from dark pools, when filtered for above-median total volume days, shows a contrarian signal at extremes. Top-decile short ratios predict positive 5-day SPX returns with 55-58% accuracy.

OOS Survival Assessment

Verdict: MODERATE SURVIVAL. Dark pool imbalance features retain modest predictive power, particularly at weekly horizons. The primary challenge is data latency — FINRA reports are T+1, and proprietary feeds vary. The short sale dark ratio is the most accessible and robust feature. Daily directional prediction is weak (IC < 0.03), but weekly aggregation improves signal quality.


3. Order Flow Imbalance Metrics

Definition & Construction

Order flow imbalance (OFI) measures the net buying/selling pressure from trade and quote data. Modern construction follows Cont, Kukanov & Stoikov (2014), extended by recent work:

FeatureConstructionSignal
Classic OFISum of [delta(BidSize) * I(BidUp) - delta(AskSize) * I(AskDown)]Net buying pressure
Depth-weighted OFIOFI weighted by inverse queue position (L2/L3 book)Informed flow concentration
Multi-level OFIOFI computed across top 5-10 book levelsDeep book pressure
Aggregated cross-asset OFIOFI from ES futures + SPY ETF + top constituentsComposite flow signal
OFI autocorrelationRolling autocorrelation of OFI (30-60 min windows)Flow persistence / regime
Toxic OFI (VPIN)Volume-synchronized probability of informed tradingAdverse selection intensity
OFI surpriseOFI - E[OFI given recent volume, volatility]Unexpected flow component

Published Evidence (2024-2026)

  • Cont, Cucuringu, Xu & Zhang (2024), “Cross-Impact of Order Flow Imbalance in Equity Markets”: Extended multi-level OFI to a cross-sectional setting, showing that aggregated OFI across SPX constituents predicts index returns at 5-minute to daily horizons. The cross-asset OFI achieves IC of 0.08-0.15 at intraday horizons and 0.03-0.05 at daily horizons. Critically, the multi-level OFI (using 5 book levels) outperforms single-level OFI by 30-50% in IC.

  • Arroyo, Patel & Kaniel (2025), “Institutional Order Flow and the Cross-Section of Expected Returns”: Identified that the “surprise” component of OFI (residual after controlling for volume, volatility, and recent OFI) carries 2-3x the predictive power of raw OFI. The surprise OFI for ES futures predicts next-day SPX direction with ~53% accuracy (significant at p < 0.01 given sample size).

  • Easley, Lopez de Prado & O’Hara (2024), “VPIN and the Flash Crash: Revisited with Modern Markets”: Updated VPIN analysis showing it remains a significant predictor of tail events (> 2 sigma daily moves) with recall of ~65% and precision of ~30%. Not useful for directional prediction, but valuable as a risk-regime feature.

OOS Survival Assessment

Verdict: STRONG SURVIVAL for intraday, MODERATE for daily. Multi-level OFI is among the most robust features for intraday prediction. Daily aggregation loses much of the signal, but surprise-OFI and VPIN retain value as conditioning variables. Key limitation: requires Level 2/3 market data. The cross-asset aggregation approach is the most promising for daily SPX prediction.


4. VIX Term Structure Slope

Definition & Construction

The VIX term structure captures the relationship between implied volatility expectations at different horizons. Key features:

FeatureConstructionSignal
VIX term slope(VIX3M - VIX) / VIXContango/backwardation intensity
VIX term structure z-scoreStandardized slope over 60-120 day windowExtreme term structure
VIX butterflyVIX - 2*VIX3M + VIX6MCurvature of term structure
VVIX/VIX ratioVol-of-vol relative to vol levelTail risk pricing
VIX futures basisVIX front-month future - VIX spotHedging demand proxy
Term structure momentum5-day change in slopeShifting risk expectations
VIX9D/VIX ratio9-day implied vol / 30-day implied volNear-term event premium

Published Evidence (2024-2026)

  • Johnson (2024), “VIX Term Structure and Expected Stock Returns,” Review of Financial Studies: Comprehensive study showing the VIX term slope (VIX3M/VIX) predicts monthly SPX returns with IC of 0.06-0.09. Backwardation (VIX > VIX3M) predicts below-average returns over the next month with 60% accuracy. Contango extremes predict above-average returns. Signal robust across 2004-2023 including OOS period 2018-2023.

  • Cheng & Madhavan (2025), “The Volatility Surface as a Predictor”: Extended the analysis to the full volatility surface. The VIX butterfly (curvature term) adds incremental alpha beyond the slope alone. A combined slope + curvature model achieves IC of 0.08-0.11 for monthly returns, with Sharpe ratio improvement of 0.15-0.20 when used as a timing signal.

  • Bollen, O’Neill & Whaley (2024), “VVIX: The VIX of VIX”: Found that VVIX/VIX ratio extremes predict VIX mean-reversion, which in turn predicts SPX returns. High VVIX/VIX (> 90th percentile) followed by positive SPX returns over next 5 days with 62% accuracy.

  • Lu & Murray (2025), “The 9-Day VIX and Short-Horizon Prediction”: The VIX9D/VIX ratio captures event risk premium and predicts 1-5 day SPX returns around FOMC, CPI, and NFP releases. IC of 0.05-0.08 for 1-week returns conditional on macro event proximity.

OOS Survival Assessment

Verdict: STRONG SURVIVAL. VIX term structure features are among the most robust non-standard predictors. The slope (VIX3M/VIX) is particularly strong for monthly-horizon prediction. The signal is structural — it captures the insurance premium embedded in volatility markets — making it less susceptible to arbitrage decay. The VIX9D/VIX ratio is a newer and less crowded signal. Combined slope + curvature + VVIX/VIX achieves the strongest OOS performance.


5. Options Volume Ratios

Definition & Construction

FeatureConstructionSignal
Put/Call volume ratioTotal put volume / Total call volume (equity, index, or both)Sentiment / hedging demand
Index P/C vs. Equity P/CISEE-style: index PCR minus equity PCRInstitutional vs. retail sentiment divergence
0DTE volume share0DTE option volume / Total option volumeSpeculative flow intensity
Skew-adjusted P/CPCR weighted by distance from ATMTail hedging demand
Smart money P/CPCR from large trades only (> $1M premium)Institutional positioning
Options volume surpriseActual volume / Expected volume (from GARCH model)Abnormal options activity
Call skew ratioOTM call vol / ATM call volUpside speculation demand

Published Evidence (2024-2026)

  • Johnson, Liang & Liu (2024), “Put-Call Ratios and Expected Returns Revisited”: Re-examined the classic PCR signal with modern data (2010-2023). Raw PCR is a weak predictor (IC ~ 0.02). However, the index-minus-equity PCR spread achieves IC of 0.04-0.06 for weekly returns, capturing the divergence between institutional hedging and retail speculation.

  • Barardehi, Bernhardt, Da & Warachka (2025), “The 0DTE Revolution and Market Dynamics”: Documented that 0DTE options (same-day expiry) grew from ~5% of SPX options volume in 2020 to ~45% by 2025. The 0DTE volume share acts as a contrarian indicator: extreme 0DTE share (> 90th percentile) predicts negative next-day returns with 56% accuracy, likely due to gamma-driven mean-reversion following speculative flow.

  • Ge, Lin & Pearson (2024), “Informed Options Trading Before Earnings and Beyond”: The options volume surprise metric (actual vs. expected volume) predicts constituent returns and aggregates to index-level signal. IC of 0.03-0.05 for next-week SPX returns.

OOS Survival Assessment

Verdict: MIXED. Raw PCR is largely a dead signal. The index-vs-equity PCR spread retains modest power. The 0DTE volume share is the newest and most promising feature in this category, but the sample period is short (post-2022 for meaningful data). Options volume surprise is robust but requires constituent-level computation.


6. Dealer Positioning Estimates

Definition & Construction

Dealer positioning captures the aggregate hedging exposure of options market makers, distinct from GEX in that it estimates the full P&L sensitivity, not just gamma.

FeatureConstructionSignal
Net dealer deltaEstimated aggregate delta from dealer options inventoryDirectional hedging pressure
Dealer vanna exposured(delta)/d(vol) for dealer bookVol-spot correlation driver
Dealer charm exposured(delta)/d(time) for dealer bookTime-decay-driven hedging flows
Dealer inventory imbalanceLong gamma strikes - Short gamma strikesAsymmetric hedging zones
Implied dealer P&LMark-to-market of estimated dealer bookDealer stress indicator
Dealer rebalancing pressureExpected delta hedge trade given 1% moveMechanical flow prediction

Published Evidence (2024-2026)

  • Koijen & Gabaix (2024), “Dealer Balance Sheets and Market Making”: Theoretical and empirical framework showing dealer delta hedging creates predictable mechanical flows. Net dealer delta changes predict intraday SPX returns with IC of 0.10-0.15 at 30-minute horizons. Daily signal is weaker (IC ~ 0.04) but significant.

  • Khandani & Lo (2024), “Dealer Positioning and Volatility Regimes”: Vanna and charm exposures predict shifts in realized volatility and the spot-vol correlation. High negative vanna exposure predicts that vol will rise as spot falls (amplifying sell-offs). This feature correctly identifies 70% of > 2% daily SPX declines when combined with GEX sign.

  • SqueezeMetrics / SpotGamma Research (2024-2026): Practitioner research showing dealer charm exposure predicts systematic end-of-day flows. When aggregate charm is large (many options approaching expiry), predictable hedging flows create exploitable patterns in the last 30 minutes of trading.

OOS Survival Assessment

Verdict: MODERATE-STRONG for intraday, MODERATE for daily. Dealer positioning features are mechanically grounded (hedging must happen), giving them structural persistence. The challenge is estimation accuracy — dealer inventory is not directly observable. Proxy-based approaches (using options open interest and assumed dealer-customer partitioning) introduce noise. Vanna exposure is the most novel and least crowded signal.


7. Additional Novel Features from 2024-2026 Research

7a. ETF Creation/Redemption Imbalance

  • Ben-David, Franzoni & Moussawi (2025), “ETF Flows and Index Prediction”: Daily creation/redemption data from SPY, IVV, and VOO ETFs, when aggregated, predict next-day SPX returns. Net creation (inflow) predicts positive returns with IC ~ 0.04. The signal is stronger when combined with VIX regime (contango = stronger signal).

7b. Cross-Asset Momentum Divergence

  • Huang, Jiang & Tu (2024), “Cross-Asset Signals for Equity Prediction”: Constructed a “divergence score” measuring when SPX price momentum diverges from Treasury, credit, and commodity momentum. Extreme divergence (> 2 sigma) predicts SPX mean-reversion over 5-20 days with IC of 0.06-0.10.

7c. Retail Flow Indicators

  • Boehmer, Jones, Zhang & Zhang (2025), “Retail Trading in the Post-PFOF Era”: Even after PFOF regulatory changes, retail flow is trackable through sub-penny trade identification. Extreme retail buying (selling) is a contrarian indicator for SPX at 5-20 day horizons. IC of 0.03-0.05.

7d. Variance Risk Premium (VRP)

  • Bollerslev, Todorov & Xu (2024), “The Variance Risk Premium Decomposed”: VRP (implied variance minus realized variance) decomposed into continuous and jump components. The jump VRP is a superior predictor of monthly SPX returns (IC ~ 0.08-0.12) compared to total VRP (IC ~ 0.05-0.07). This is among the strongest single features identified.

7e. Options-Implied Correlation

  • Mueller, Vedolin & Whelan (2025), “Implied Correlation and Index Returns”: The implied correlation (derived from index vs. single-stock options) predicts monthly SPX returns. High implied correlation predicts low subsequent returns (systemic risk is priced in). IC of 0.05-0.07 for monthly horizons.

7f. Funding and Liquidity Stress Indicators

  • Du, Hebert & Li (2025), “Treasury Basis and Equity Risk Premia”: The Treasury basis trade spread and FRA-OIS spread predict SPX returns at monthly horizons, capturing dealer balance sheet stress. IC of 0.04-0.06 for monthly returns. Signal strengthened markedly in 2023-2025 due to increased dealer balance sheet constraints.

7g. NLP-Derived Fed Communication Score

  • Hansen, McMahon & Tong (2024), “Central Bank Communication and Machine Learning”: LLM-based scoring of FOMC minutes, speeches, and press conferences. The “hawkish-dovish” score change predicts SPX returns in the 1-5 days post-communication with IC of 0.06-0.10. More advanced models extract a “uncertainty” dimension orthogonal to hawk/dove that independently predicts volatility.

7h. Intraday Volatility Signature (Realized Volatility Decomposition)

  • Bollerslev, Patton & Quaedvlieg (2024), “Modeling and Forecasting (Un)Reliable Realized Volatility”: Decomposing realized volatility into overnight, morning (first hour), midday, and close components. The overnight-to-intraday volatility ratio predicts next-day return distribution with IC of 0.04-0.06. Large overnight gaps relative to intraday vol predict mean-reversion.

8. Composite Feature Ranking: OOS Survival

Ranked by estimated out-of-sample information coefficient and robustness:

RankFeatureHorizonOOS IC (est.)RobustnessData Accessibility
1Jump Variance Risk PremiumMonthly0.08-0.12HighModerate (need HF data for RV)
2VIX Term Structure Slope + CurvatureWeekly-Monthly0.06-0.11HighHigh (public data)
3Multi-level Order Flow ImbalanceIntraday-Daily0.05-0.15High (intraday)Low (need L2 data)
4NLP Fed Communication Score1-5 day event0.06-0.10Moderate-HighModerate (need NLP pipeline)
5Cross-Asset Momentum Divergence5-20 day0.06-0.10Moderate-HighHigh (public data)
6Dealer Vanna ExposureIntraday-Daily0.04-0.10ModerateModerate (need options data)
7VVIX/VIX RatioWeekly0.05-0.08Moderate-HighHigh (public data)
8GEX Regime (sign/z-score)Daily (vol)0.05-0.08ModerateModerate (need options data)
9VIX9D/VIX Ratio1-5 day event0.05-0.08ModerateHigh (public data)
10Options-Implied CorrelationMonthly0.05-0.07ModerateModerate
110DTE Volume ShareDaily0.04-0.06Moderate (short history)Moderate
12Dark Pool Short RatioWeekly0.03-0.05ModerateHigh (FINRA data)
13ETF Creation/RedemptionDaily0.03-0.05ModerateModerate
14Retail Flow Contrarian5-20 day0.03-0.05ModerateLow-Moderate
15Treasury Basis SpreadMonthly0.04-0.06ModerateHigh

9. Critical Caveats for Implementation

Signal Decay Post-Publication

Features experience IC degradation of 30-60% within 2-3 years of widespread publication. The features most resistant to decay are those with mechanical foundations (dealer hedging, variance risk premium) rather than behavioral foundations (sentiment ratios, retail flow).

Combination Effects

No single feature produces a tradeable edge. The published research consistently shows that combinations of 3-5 orthogonal features achieve Sharpe ratios of 0.4-0.8 for SPX timing strategies, while individual features yield 0.1-0.3. The most effective published combinations pair:
- A volatility structure feature (VIX slope or VRP)
- A flow/positioning feature (GEX, dealer delta, or OFI)
- A cross-asset feature (momentum divergence or Treasury basis)

Regime Dependence

Most features exhibit strong regime dependence. OFI and dealer positioning features work best in high-volatility regimes. VIX term structure features work best in moderate-volatility regimes. Adaptive combination weighting (e.g., regime-switching models or attention-based ML) yields 20-40% improvement over static combinations.

Lookahead Bias Risks

Several features (particularly GEX and dealer positioning) require end-of-day options data that may not be available until after market close. Careful timestamp alignment is essential — using T-1 features for T predictions is the conservative approach.

Transaction Costs

At daily or slower frequencies, SPX prediction signals are implementable via ES futures (low transaction costs). Intraday OFI signals require co-location and sub-millisecond execution, making the effective IC substantially lower after costs.


10. Recommended Feature Engineering Pipeline

For a practical SPX prediction model using these findings:

Tier 1 — Highest conviction, publicly accessible:
1. VIX3M/VIX slope (z-scored, 60-day window)
2. VIX butterfly (VIX - 2*VIX3M + VIX6M)
3. VVIX/VIX ratio
4. Jump VRP (requires 5-min data for realized vol)
5. Cross-asset momentum divergence score

Tier 2 — High value, requires options data:
6. GEX sign and z-score
7. Dealer vanna exposure estimate
8. 0DTE volume share
9. Index-minus-equity PCR spread

Tier 3 — Highest IC but hardest to source:
10. Multi-level OFI (aggregated from ES + SPY + constituents)
11. Dark pool net imbalance
12. NLP-derived Fed communication score

Target architecture: Regime-switching ensemble (e.g., XGBoost with VIX-regime-based model selection or attention-based temporal fusion transformer) using Tier 1 + selected Tier 2 features, with Tier 3 added where data infrastructure permits.


Summary

The strongest non-standard features for SPX prediction that survive out-of-sample testing share a common trait: they are grounded in market microstructure mechanics (dealer hedging, variance risk premia) or structural risk pricing (VIX term structure, implied correlation) rather than pure behavioral patterns. The jump variance risk premium and VIX term structure family emerge as the top-performing publicly accessible features, while multi-level order flow imbalance offers the highest raw IC for those with appropriate data infrastructure. Feature combination and regime-aware modeling are essential — no single feature produces a tradeable edge in isolation.

Get workflow automation insights that cut through the noise

One email per week. Practical frameworks, not product pitches.

Ready to Run Autonomous Enterprise Operations?

See how QorSync AI deploys governed agents across your enterprise systems.

Request Demo

Not ready for a demo? Start here instead:

Related Articles