Methodology
How ForecastMind collects data, computes calibration metrics, and matches markets across venues. This page is intended for researchers and journalists who want to understand and cite our data.
1. Data Collection
ForecastMind pulls live market data from multiple venues (Polymarket every 5 minutes; the other venues on demand during divergence and consensus checks) and stores snapshots in a DuckDB database. The following venues are currently tracked:
| Venue | API | Auth | Polling |
|---|---|---|---|
| Polymarket | Gamma API + CLOB | None | 5 min |
| Kalshi | Elections API | None | On divergence check |
| Manifold | v0 REST API | None | On divergence check |
| PredictIt | Market data API | None | On divergence check |
| Metaculus | api2 | Token (free) | On consensus check |
Each snapshot stores: market_id, yes_price, no_price, volume_24h, liquidity, spread, best_bid, best_ask, and a Unix timestamp. Data is retained indefinitely.
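The snapshot record can be modeled as a simple typed row. The sketch below uses the field names listed above; the class name and field types are our assumptions, not ForecastMind's actual schema:

```python
import time
from dataclasses import dataclass

@dataclass
class Snapshot:
    # Field names follow the snapshot schema above; types are assumed.
    market_id: str
    yes_price: float
    no_price: float
    volume_24h: float
    liquidity: float
    spread: float
    best_bid: float
    best_ask: float
    timestamp: int  # Unix seconds

snap = Snapshot("example-market-slug", 0.62, 0.38, 15000.0, 42000.0,
                0.01, 0.615, 0.625, int(time.time()))
```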
Market resolutions are recorded when Polymarket marks a market as resolved via their Gamma API. The final yes_price snapshot before the resolution timestamp is used as the market's last price for calibration.
2. Calibration Metrics
Calibration measures how well a prediction market's probabilities correspond to actual outcome frequencies. A perfectly calibrated market would resolve YES exactly 70% of the time for markets priced at 70%.
Brier Score
BS = (1/N) × Σ (predicted_probability − actual_outcome)²
Where actual_outcome = 1.0 for YES, 0.0 for NO. Range: 0 (perfect) to 1 (perfectly wrong). A constant forecast of 0.5 yields BS = 0.25, the benchmark for an uninformative predictor on binary outcomes.
Mean Absolute Error (MAE)
MAE = (1/N) × Σ |predicted_probability − actual_outcome|
Simpler than Brier Score; less sensitive to extreme errors. Used for cross-venue comparisons where only one venue's price is available.
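Both metrics are straightforward to compute. A minimal sketch (function names are ours, not ForecastMind's):

```python
def brier_score(preds, outcomes):
    # Mean squared difference between forecast probability and outcome (1.0 or 0.0).
    return sum((p - o) ** 2 for p, o in zip(preds, outcomes)) / len(preds)

def mae(preds, outcomes):
    # Mean absolute difference; less sensitive to extreme errors than Brier.
    return sum(abs(p - o) for p, o in zip(preds, outcomes)) / len(preds)

preds = [0.9, 0.7, 0.2]
outcomes = [1.0, 1.0, 0.0]
print(round(brier_score(preds, outcomes), 4))  # 0.0467
print(round(mae(preds, outcomes), 4))          # 0.2
```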
Reliability diagrams (also called calibration curves) group markets into 10 probability buckets (0–10%, 10–20%, etc.) and plot the actual YES resolution rate against the bucket midpoint. A perfectly calibrated market would produce a diagonal line from (0, 0) to (1, 1).
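The bucketing described above can be sketched as follows (helper name is illustrative; equal-width buckets with a probability of exactly 1.0 assigned to the top bucket):

```python
def calibration_curve(preds, outcomes, n_buckets=10):
    # Group forecasts into equal-width probability buckets and compute
    # the actual YES rate per bucket, paired with the bucket midpoint.
    buckets = [[] for _ in range(n_buckets)]
    for p, o in zip(preds, outcomes):
        i = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 falls in the top bucket
        buckets[i].append(o)
    return [((i + 0.5) / n_buckets, sum(b) / len(b))
            for i, b in enumerate(buckets) if b]

points = calibration_curve([0.05, 0.08, 0.72, 0.78], [0.0, 0.0, 1.0, 1.0])
print(points)  # [(0.05, 0.0), (0.75, 1.0)]
```

A perfectly calibrated set of markets produces points that lie on the diagonal, i.e. the YES rate in each bucket equals the bucket midpoint.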
Category filters apply to all metrics: Brier scores, MAE, and calibration curves can all be filtered by the Polymarket category tag (politics, crypto, sports, etc.).
Cross-venue comparisons require that the same real-world event was tracked on both venues AND subsequently resolved. Only markets matched via divergence detection (see Venue Matching) are included. The last recorded price on each venue before the resolution timestamp is used.
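Selecting each venue's last pre-resolution price reduces to a filter-and-max over its snapshots. A sketch under the assumption that snapshots are (unix_timestamp, yes_price) pairs:

```python
def last_price_before(snapshots, resolution_ts):
    # snapshots: list of (unix_timestamp, yes_price) pairs for one venue.
    # Returns the yes_price of the latest snapshot strictly before resolution.
    prior = [(ts, p) for ts, p in snapshots if ts < resolution_ts]
    if not prior:
        return None  # no pre-resolution snapshot: excluded from calibration
    return max(prior)[1]

snaps = [(100, 0.40), (200, 0.55), (300, 0.90)]
print(last_price_before(snaps, 250))  # 0.55
```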
3. Venue Matching (Entity Resolution)
Matching the same event across venues is the hardest part of cross-venue analysis. ForecastMind uses a two-stage hybrid approach:
- Jaccard similarity on tokenized question text. Stop words, common prediction market phrasing ("will", "by", "in"), and magnitude tokens (bare numbers, bps, percentages) are stripped. Similarity is computed as |A ∩ B| / |A ∪ B|.
- Embedding similarity (tiebreaker for borderline cases). For pairs scoring 0.15–0.55 on Jaccard, a cosine similarity check using all-MiniLM-L6-v2 (384-dimensional sentence embeddings) is applied. Pairs must score ≥ 0.76 cosine to be accepted by the embedding path.
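The Jaccard stage can be sketched in a few lines. The stop-word and magnitude-token lists below are illustrative subsets, and the embedding tiebreak (sentence embeddings plus the ≥ 0.76 cosine gate) is omitted for brevity:

```python
import re

# Illustrative subset of stripped stop words / common prediction market phrasing.
STOP = {"will", "by", "in", "the", "a", "an", "of", "to", "on", "at", "be"}

def tokens(question):
    # Lowercase, split on non-alphanumerics, drop stop words and
    # magnitude tokens (bare numbers, percentages, "bps").
    words = re.findall(r"[a-z0-9%]+", question.lower())
    return {w for w in words if w not in STOP and not re.fullmatch(r"\d+%?|bps", w)}

def jaccard(q1, q2):
    a, b = tokens(q1), tokens(q2)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

s = jaccard("Will the Fed cut rates by 50 bps in March?",
            "Fed to cut rates 50 bps at March meeting?")
print(round(s, 2))  # 0.8
```

Pairs scoring above the relevant threshold in the table below are accepted directly; borderline pairs fall through to the embedding check.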
| Use case | Min score | Notes |
|---|---|---|
| Divergence alerts | 0.50 | Raised 2026-03-12 after false-match audit |
| Cross-venue signals | 0.55 | Per-market panel |
| Consensus (Metaculus) | 0.57 | Established standard |
| Canonical entity layer | 0.57 | Persistent cross-venue links |
Matched pairs are stored as canonical entities with stable UUIDs in the canonical_events table. Each entity accumulates venue aliases over time as new matches are confirmed.
4. Coverage & Limitations
Resolution coverage: ForecastMind records resolutions as they are reported by Polymarket's Gamma API. Markets that resolve without a price snapshot in the database (e.g. markets not seen during polling intervals) will have no calibration entry. Coverage is typically >95% for markets active within 7 days of resolution.
Cross-venue coverage: Limited to events that triggered a divergence alert (i.e. where the price gap exceeded the threshold at any point). Events that never diverged are not in the cross-venue comparison set. This introduces selection bias — divergence events may be systematically harder to predict.
Kalshi access: Only api.elections.kalshi.com is accessible from ForecastMind's servers. This limits Kalshi coverage to long-dated political markets (US/UK elections, leadership races). Kalshi sports and economics markets are not tracked.
Start date: Continuous data collection began in early 2026. Markets that resolved before the collection start date are not in the calibration database.
5. Citation Format
For academic papers, reports, or journalism citing ForecastMind data:
APA
ForecastMind. (2026). ForecastMind prediction market calibration database. Retrieved [date], from https://forecastmind.org/calibration
For a specific market
ForecastMind. (2026). [Market question]. Retrieved [date], from https://forecastmind.org/markets/[slug]
For a monthly report
ForecastMind. (2026). State of prediction market accuracy — [Month Year]. Retrieved [date], from https://forecastmind.org/reports/[YYYY]/[MM]
If you use ForecastMind data in published research, we'd appreciate a brief note at research@forecastmind.org. We track citations to improve data quality for the research community.