
Validation & Methods

Canonical, public methodology for Market State Detector (MSD) evaluation and statistics.

Last Updated: November 28, 2025

Performance Definitions

Hit rate = hits ÷ (hits + false positives)

Lead time = business days from alert timestamp to first occurrence of the event threshold

Alerts/year = total state activations ÷ years in validation period

BD (business days) = NYSE/Nasdaq trading days

0 BD (zero business days) = alert timestamp on the same trading day as the first hit, published pre‑market before the regular NYSE/Nasdaq cash-session open.

Validation period shown = 2012–2024 (extended research 1990–2024 under NDA)

FP (false positive) = an alert episode for which the applicable state’s pre‑published hit criteria are not met within that state’s scoring horizon (in BD), after episode collapsing and cooldown. FPs are evaluated at the episode level (not per trigger day).
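
As a minimal illustration (not the production implementation), the Python sketch below computes hit rate, lead time, and alerts/year from hypothetical episode‑level records. The record layout, field names, and the default Mon–Fri weekmask in numpy are assumptions for the example; the production pipeline uses the NYSE/Nasdaq trading calendar.

```python
import numpy as np

# Hypothetical episode-level records (ISO 8601 dates): each alert episode carries
# its state, the alert date, and, if it was a hit, the date of the first hit.
episodes = [
    {"state": "volatility_spike", "alert_date": "2020-02-24", "first_hit": "2020-02-25"},
    {"state": "volatility_spike", "alert_date": "2021-09-17", "first_hit": None},  # false positive
]

hits = [e for e in episodes if e["first_hit"] is not None]
false_positives = [e for e in episodes if e["first_hit"] is None]

# Hit rate = hits / (hits + false positives), evaluated per episode.
hit_rate = len(hits) / (len(hits) + len(false_positives))

# Lead time = business days from alert to first hit.  busday_count uses a
# Mon-Fri weekmask by default; pass holidays=[...] for the NYSE/Nasdaq calendar.
lead_times = [int(np.busday_count(e["alert_date"], e["first_hit"])) for e in hits]

# Alerts/year = total episodes / years in the validation period (2012-2024 -> 13 years).
alerts_per_year = len(episodes) / 13

print(hit_rate, lead_times, alerts_per_year)
```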

Event Criteria (what counts as a “hit”)

Systemic Stress

Hit if any of: VIX ≥ 35; SPX 5‑day ≤ −3.0%; SPX 10‑day ≤ −6.0%; SPX 20‑day ≤ −10.0%.

Volatility Spike

Hit if any of: VIX ≥ 30; SPX 5‑day ≤ −3.0%; SPX 10‑day ≤ −5.0%.

Stress (Advisory)

Hit if any of: VIX ≥ 30; SPX 5‑day ≤ −3.0%; SPX 10‑day ≤ −5.0%.

Turning

Hit if any of: SPX 10‑day ≤ −2.0%; SPX drawdown ≤ −4.0%; VIX ≥ 25 for 2 consecutive trading days.

Calm

Active when all other alarms are inactive (informational regime classification).
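
For concreteness, the hit criteria above can be written as simple boolean rules over daily VIX closes and SPX trailing returns/drawdown. The sketch below is illustrative only; the input field names (vix, vix_prev, spx_ret_5d, spx_drawdown, etc.) are assumed conventions, not the production schema.

```python
def systemic_stress_hit(row) -> bool:
    """Hit if any of: VIX >= 35; SPX 5d <= -3.0%; SPX 10d <= -6.0%; SPX 20d <= -10.0%."""
    return (row["vix"] >= 35
            or row["spx_ret_5d"] <= -0.03
            or row["spx_ret_10d"] <= -0.06
            or row["spx_ret_20d"] <= -0.10)

def volatility_spike_hit(row) -> bool:
    """Hit if any of: VIX >= 30; SPX 5d <= -3.0%; SPX 10d <= -5.0%."""
    return row["vix"] >= 30 or row["spx_ret_5d"] <= -0.03 or row["spx_ret_10d"] <= -0.05

# Stress (Advisory) shares the Volatility Spike hit criteria.
stress_advisory_hit = volatility_spike_hit

def turning_hit(row) -> bool:
    """Hit if any of: SPX 10d <= -2.0%; drawdown <= -4.0%; VIX >= 25 on 2 consecutive trading days."""
    return (row["spx_ret_10d"] <= -0.02
            or row["spx_drawdown"] <= -0.04
            or (row["vix"] >= 25 and row["vix_prev"] >= 25))
```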

Event Horizons

  • Volatility Spike: 5 BD
  • Stress (Advisory): 7 BD
  • Systemic Stress: 10 BD
  • Turning: 20 BD
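
These horizons map directly to a forward‑looking scoring window per alert episode. The sketch below shows one way to compute the window end in business days; the Mon–Fri weekmask is a simplification, since the production calendar also excludes NYSE/Nasdaq holidays.

```python
import numpy as np

# Forward-looking scoring horizons, in business days (BD), per alarm state.
SCORING_HORIZON_BD = {
    "volatility_spike": 5,
    "stress_advisory": 7,
    "systemic_stress": 10,
    "turning": 20,
}

def scoring_window_end(alert_date: str, state: str) -> np.datetime64:
    """Last business day on which an event can still count as a hit for this episode."""
    return np.busday_offset(alert_date, SCORING_HORIZON_BD[state], roll="forward")

print(scoring_window_end("2020-02-24", "systemic_stress"))  # 10 BD after the alert date
```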

Methodology (Public Summary)

Compliance

For institutional research only. Historical, backtested results (2012–2024 unless noted). Not investment advice. See our Disclaimers.

What we backtest

  • MSD evaluates five production market-state alarms: Systemic Stress, Volatility Spike, Turning, Stress (Advisory), and a meta Calm state.
  • Performance summaries are maintained in a versioned internal validation archive (single source of truth). A full validation packet with current figures and provenance is available on request via the validation packet form.

Data and period

  • Historical period: multi‑year history (e.g., 2012–2024) as reflected in the validation files.
  • Inputs: the canonical PDI time series and public market data (SPX and VIX).
  • Time base: business‑day (BD) calendar; timestamps validated and stored in ISO 8601.
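
As a small example of these time‑base conventions (ISO 8601 timestamps and a business‑day index), the snippet below uses standard‑library and pandas calls; the specific dates and the Mon–Fri calendar are illustrative, and a production calendar would also supply exchange holidays.

```python
from datetime import datetime
import pandas as pd

# ISO 8601 timestamps parse unambiguously, including the UTC offset.
alert_ts = datetime.fromisoformat("2020-02-24T08:30:00-05:00")
assert alert_ts.tzinfo is not None  # reject timezone-naive timestamps

# Business-day index for the validation period.  bdate_range is Mon-Fri only;
# a production calendar would also exclude NYSE/Nasdaq holidays.
bd_index = pd.bdate_range("2012-01-03", "2024-12-31")
print(len(bd_index), bd_index[0].date(), bd_index[-1].date())
```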

What we do / What we don’t do

We do

  • Classify market states with fixed, rules‑based definitions
  • Evaluate alerts against independent market events (SPX/VIX)
  • Use business‑day horizons and deterministic episode logic
  • Publish definitions and validation methods publicly

We don’t

  • Provide price targets or trade recommendations
  • Tune rules on evaluation windows (no in‑sample optimization)
  • Score alerts with look‑ahead or calendar‑day shortcuts
  • Claim endorsement by data sources or agencies

How an alarm is evaluated (at a glance)

  1. Fixed rules: Each alarm is specified by an immutable definition (no in‑sample tuning in the backtest).
  2. Daily evaluation: Conditions are checked per business day against historical inputs.
  3. Episodes: Consecutive alert days are collapsed into alert “episodes” using business‑day cooldowns and windows.
  4. Market events: Independent market events (e.g., VIX and SPX patterns) are identified and deduplicated into event episodes.
  5. Scoring: An alert episode is a “hit” if a qualifying event occurs within its forward‑looking scoring horizon (in business days); alert episodes with no qualifying event are false positives. Event episodes that are never preceded by a qualifying alert count as misses.
  6. Metrics: Precision, recall, and F1 are computed from episodes; figures reported on the website come from these validation files.
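
The steps above can be sketched end to end in a few dozen lines. Everything in the snippet (function names, parameters, and the episode layout) is an illustrative reconstruction of the logic described here, not the production code.

```python
from typing import List, Tuple
import numpy as np

def collapse_episodes(flag_dates: List[str], cooldown_bd: int) -> List[Tuple[str, str]]:
    """Collapse flagged days into episodes: two flagged days belong to the same
    episode when they are separated by at most `cooldown_bd` business days.
    Returns (start_date, end_date) pairs in ISO 8601."""
    if not flag_dates:
        return []
    flag_dates = sorted(flag_dates)
    episodes = []
    start = prev = flag_dates[0]
    for d in flag_dates[1:]:
        if np.busday_count(prev, d) > cooldown_bd:
            episodes.append((start, prev))
            start = d
        prev = d
    episodes.append((start, prev))
    return episodes

def score_alert_episodes(alert_episodes, event_episodes, horizon_bd: int):
    """Forward-only scoring: an alert episode is a hit if any event episode
    starts within `horizon_bd` business days after the alert episode start."""
    hits = false_positives = 0
    matched_events = set()
    for a_start, _ in alert_episodes:
        window_end = np.busday_offset(a_start, horizon_bd, roll="forward")
        matches = [i for i, (e_start, _) in enumerate(event_episodes)
                   if np.datetime64(a_start) <= np.datetime64(e_start) <= window_end]
        if matches:
            hits += 1
            matched_events.update(matches)
        else:
            false_positives += 1
    misses = len(event_episodes) - len(matched_events)  # event episodes never alerted
    return hits, false_positives, misses

def precision_recall_f1(hits: int, false_positives: int, misses: int):
    precision = hits / (hits + false_positives) if hits + false_positives else 0.0
    recall = hits / (hits + misses) if hits + misses else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

In a per‑alarm run, the alarm's trigger days and the deduplicated event days are each collapsed into episodes, then scored against that alarm's horizon from the Event Horizons table above.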

Design principles that reduce overfitting risk

  • No in‑sample optimization: Backtests run the published rule logic as‑is.
  • Forward‑only scoring: Hits require future events within a predefined BD horizon; no look‑ahead.
  • Deterministic time handling: All windows/horizons/cooldowns are business‑day based.
  • Separation of concerns: Historical performance validation files (manifests) serve as the presentation source of truth (SoT) and are not used by the evaluation logic.
  • Snapshot parity: Internal regression checks ensure outputs remain stable across versions.
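
As one possible shape for the snapshot‑parity check mentioned above, the sketch below compares a current run's episode table against a stored, versioned snapshot; the file paths and column name are hypothetical.

```python
import pandas as pd

def assert_snapshot_parity(current_csv: str, snapshot_csv: str) -> None:
    """Fail loudly if the current run's episode table drifts from the stored snapshot."""
    current = pd.read_csv(current_csv).sort_values("episode_start").reset_index(drop=True)
    snapshot = pd.read_csv(snapshot_csv).sort_values("episode_start").reset_index(drop=True)
    pd.testing.assert_frame_equal(current, snapshot, check_exact=True)

# Example (hypothetical paths):
# assert_snapshot_parity("runs/current/alert_episodes.csv", "snapshots/v1/alert_episodes.csv")
```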

Concrete safeguards

  • Independent event labeling and deduplication with fixed event cooldown (10 BD); a minimal sketch follows this list.
  • Episode collapsing to prevent double‑counting during prolonged spells.
  • Forward‑only scoring horizons (no look‑ahead leakage).
  • Business‑day determinism with ISO‑8601 timestamps.
  • Rule immutability per run (no parameter tuning inside the backtest).
  • Presentation SoT isolation and regression snapshot parity checks.
  • Reproducible artifacts: episode‑level CSVs (alerts, events, hits) for independent verification.
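
A minimal sketch of the first safeguard (event deduplication with the fixed 10 BD cooldown), assuming raw event trigger dates have already been labeled; the helper name and the Mon–Fri calendar are illustrative.

```python
import numpy as np

EVENT_COOLDOWN_BD = 10  # fixed event cooldown, in business days

def dedupe_events(event_dates):
    """Keep the first trigger of each cluster; drop triggers that fall within
    the cooldown window of the previously kept event."""
    kept = []
    for d in sorted(event_dates):
        if not kept or np.busday_count(kept[-1], d) > EVENT_COOLDOWN_BD:
            kept.append(d)
    return kept

print(dedupe_events(["2020-02-25", "2020-02-27", "2020-03-16"]))
# -> ['2020-02-25', '2020-03-16']  (Feb 27 falls inside the 10 BD cooldown)
```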

Data sources

We use public market data (SPX, VIX) and public environmental datasets (e.g., NOAA, NASA, USGS). References to third‑party data providers are for sourcing only and do not imply endorsement.

Validation protocols (walk‑forward and robustness)

  • Held‑out era evaluation on excluded historical eras; no tuning on the evaluation period.
  • Walk‑forward validation: expanding‑window approach with an initial 2‑year training window and a 1‑year test window, rolled forward every 90 days (see the sketch after this list). Era splits (SC24/SC25) are validated separately.
  • Leave‑one‑era‑out to assess robustness by omitting eras and testing excluded windows.
  • Era stratification to surface instability and confirm consistency across market conditions.
  • Cross‑domain sanity checks where applicable.
  • Optional inactive‑day baselines using identical episode windows are available for context in the internal validation materials.
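
The walk‑forward scheme described above can be sketched as follows; the exact window generation in the production harness may differ, and the calendar anchoring here is an assumption.

```python
import pandas as pd

def walk_forward_splits(start="2012-01-03", end="2024-12-31",
                        train_years=2, test_years=1, roll_days=90):
    """Yield (train_start, train_end, test_start, test_end) tuples for an
    expanding-window walk-forward: training always starts at `start` and grows;
    the 1-year test window rolls forward every `roll_days` calendar days."""
    start_ts, end_ts = pd.Timestamp(start), pd.Timestamp(end)
    train_end = start_ts + pd.DateOffset(years=train_years)
    while train_end + pd.DateOffset(years=test_years) <= end_ts:
        yield (start_ts, train_end,
               train_end, train_end + pd.DateOffset(years=test_years))
        train_end += pd.Timedelta(days=roll_days)

for split in list(walk_forward_splits())[:3]:
    print([d.date() for d in split])
```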

For complete overfitting controls and offline reproduction procedures, we provide an auditor‑friendly validation packet on request via the validation packet form.

Reproducibility

  • Public figures are sourced from versioned internal validation records; website statistics are derived directly from this single source of truth.
  • Validation Packet — an auditor‑friendly package to reproduce metrics without environment access. Request via the form and we will deliver the packet by email; certain sensitive artifacts may require an NDA.

Compliance framing

All analytics are informational research or risk‑classification only and must not be construed as investment advice, solicitation, or performance guarantees.

FAQ

Are the results real‑time or backtested?

Figures shown are historical/backtested for 2012–2024 unless otherwise stated. Live operation began in 2025.

Can I see episode‑level data?

We provide episode‑level CSV exports and versioned validation summaries on request via the validation packet form; certain sensitive artifacts may require an NDA.

Is this investment advice?

No. This is an informational classification tool. See Disclaimers for details.

Alternatively, email sales@mindforge.tech with questions or validation packet requests.

Informational research only - not investment advice. Past performance does not guarantee future results.

Research use only. Not investment advice. Past performance ≠ future results. Mindforge is not a registered investment adviser. Full terms · Methods
