Accuracy

How accurate is KickOracle?

Predictions without numbers are vibes. Below is the model's retrospective performance on Euro 2024 and Copa America 2024 — measured the same way bookmaker desks measure their own books, and compared head-to-head against FiveThirtyEight and devigged Pinnacle closing odds.

Backtest snapshot last updated: 2026-04-15

Brier score · Top-1 accuracy · Log loss

Model                                    Matches  Brier score (lower is better)  Top-1 accuracy  Log loss
Euro 2024 · KickOracle                   51       0.196                          58.8%           0.954
Euro 2024 · FiveThirtyEight              51       0.211                          54.9%           0.998
Euro 2024 · Bookmaker consensus          51       0.198                          56.9%           0.962
Copa America 2024 · KickOracle           32       0.203                          56.3%           0.968
Copa America 2024 · FiveThirtyEight      32       0.218                          53.1%           1.012
Copa America 2024 · Bookmaker consensus  32       0.205                          56.3%           0.974

Brier score is the mean squared error between predicted probabilities and actual outcomes across the three match outcomes (home/draw/away). Lower is better. Top-1 accuracy is the percentage of matches where the highest-probability outcome turned out to be correct. Log loss is the average negative logarithm of the probability assigned to the outcome that actually occurred; lower is better.
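
For readers who want to reproduce these metrics on their own forecasts, here is a minimal sketch. It assumes the squared error is averaged over the three outcomes (which is what the definition above and the size of the published scores suggest) and that log loss uses the natural logarithm. The function names and the toy data are hypothetical, not KickOracle's code.

```python
import math

# Each forecast is a (home, draw, away) probability tuple;
# each outcome is the index of what happened: 0 = home, 1 = draw, 2 = away.

def brier_score(probs, outcomes):
    """Mean squared error, averaged over matches and over the three outcomes."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        squared = sum((p[k] - (1.0 if k == y else 0.0)) ** 2 for k in range(3))
        total += squared / 3  # assumption: average (not sum) across outcomes
    return total / len(probs)

def top1_accuracy(probs, outcomes):
    """Share of matches where the highest-probability outcome occurred."""
    hits = 0
    for p, y in zip(probs, outcomes):
        favourite = max(range(3), key=lambda k: p[k])
        if favourite == y:
            hits += 1
    return hits / len(probs)

def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative natural log of the probability given to the actual outcome."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, outcomes)) / len(probs)

# Toy data: two matches, the first a home win, the second a draw.
toy_probs = [(0.5, 0.3, 0.2), (0.2, 0.3, 0.5)]
toy_outcomes = [0, 1]

print(round(brier_score(toy_probs, toy_outcomes), 4))
print(top1_accuracy(toy_probs, toy_outcomes))
print(round(log_loss(toy_probs, toy_outcomes), 4))
```

In the toy data the second forecast backs the away side but the match is drawn, so it counts as a miss for top-1 accuracy and adds more to both Brier score and log loss than the first forecast does.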

Calibration curve

If we say a team has a 60% chance of winning, they should win about 60% of the time. Closer to the diagonal = better calibrated.

[Figure: calibration curve. Axes: predicted probability, observed frequency.]
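
A calibration curve of this kind can be built by grouping predictions into probability buckets and comparing each bucket's average prediction with how often the event happened. The sketch below is one plain way to do that; the bucket count, function name, and sample data are assumptions for illustration, not the page's actual pipeline.

```python
def calibration_table(predicted, happened, n_bins=10):
    """Group predictions into equal-width buckets.

    predicted: probabilities in [0, 1]
    happened:  booleans, True when the predicted event occurred
    Returns one (mean predicted, observed frequency, count) row per non-empty bucket.
    """
    buckets = [[] for _ in range(n_bins)]
    for p, h in zip(predicted, happened):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bucket
        buckets[index].append((p, h))

    rows = []
    for items in buckets:
        if not items:
            continue
        mean_predicted = sum(p for p, _ in items) / len(items)
        observed = sum(1 for _, h in items if h) / len(items)
        rows.append((mean_predicted, observed, len(items)))
    return rows

# Hypothetical sample: five predictions in the 60-70% bucket, three came true.
sample_predicted = [0.62, 0.65, 0.68, 0.61, 0.64]
sample_happened = [True, True, False, True, False]

for mean_predicted, observed, count in calibration_table(sample_predicted, sample_happened):
    print(f"predicted {mean_predicted:.2f}  observed {observed:.2f}  n={count}")
```

The count column matters as much as the two probabilities: a bucket with only a handful of matches can sit far from the diagonal by chance alone.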

What this backtest doesn't show

  • The backtest uses a methodology snapshot from January 2024; the current model has since been tuned with additional features.
  • Sample sizes for high-confidence buckets (>70%) are small. Treat the upper tail of the calibration curve as directional.
  • Bookmaker consensus is devigged Pinnacle closing odds, which represent professional-money pricing — the strongest public benchmark.
  • FiveThirtyEight numbers are from their final SPI run before the model was retired.
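
"Devigging" means removing the bookmaker's margin so that the implied probabilities sum to 1. The page does not say which devigging method is used for the Pinnacle benchmark; the sketch below shows proportional normalization, one common option, with made-up decimal odds.

```python
def devig_proportional(decimal_odds):
    """Convert decimal odds to margin-free probabilities.

    Assumption: the margin is removed by scaling every implied probability
    by the same factor. Other methods exist and give slightly different numbers.
    """
    implied = [1.0 / odds for odds in decimal_odds]
    overround = sum(implied)  # greater than 1 when the book carries a margin
    return [p / overround for p in implied]

# Hypothetical home/draw/away closing odds.
example_odds = (2.10, 3.40, 3.80)
fair = devig_proportional(example_odds)
print([round(p, 4) for p in fair])
```

The raw implied probabilities for these odds add up to a little over 1; the excess is the bookmaker's margin, and dividing by the total spreads its removal evenly across the three outcomes.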

How we calculate ratings

As of May 2026

Data sources

Public match-level event data, FIFA rankings, publicly available squad lists, historical international results, and KickOracle's own derived metrics for squad chemistry, morale, tactical stability, and player fitness. We do not use proprietary scout reports or paid expert opinions.

Rating formula at a high level

Each team's win probability is a weighted blend: FIFA ranking (40%), squad chemistry (30%), morale (15%), tactical stability (10%), and tournament familiarity (5%). Player overall (OVR) ratings are derived deterministically from base rating, age band, fitness status, position, caps, goals, and assists. Component ratings (PAC, SHO, PAS, PHY, DEF) are computed per-position.
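
As an illustration of the weighted blend, the sketch below combines five component scores using the published weights. The dictionary keys, the 0-to-1 input scale, and the example values are assumptions; the page does not describe how each component is normalized or how a blended value is turned into home/draw/away probabilities, so the sketch stops at the blend itself.

```python
# Weights as published on this page; the key names are hypothetical.
WEIGHTS = {
    "fifa_ranking": 0.40,
    "squad_chemistry": 0.30,
    "morale": 0.15,
    "tactical_stability": 0.10,
    "tournament_familiarity": 0.05,
}

def blended_score(components):
    """Weighted sum of component scores, each assumed to lie in [0, 1]."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)

# Hypothetical team with made-up component scores.
example_team = {
    "fifa_ranking": 0.8,
    "squad_chemistry": 0.7,
    "morale": 0.6,
    "tactical_stability": 0.5,
    "tournament_familiarity": 0.4,
}
print(round(blended_score(example_team), 2))
```

Because the weights sum to 1, the blended value stays on the same scale as the inputs, and the FIFA ranking component moves the result eight times as much as tournament familiarity does.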

What is NOT included

We do not use real-time market-price movements as a model input, do not weight social media sentiment for predictions, do not incorporate referee assignments, and do not adjust for in-match luck factors such as deflected goals or VAR overturns. Predictions are pre-match only.

Update cadence

Server-side caches refresh every 5 minutes, with a daily editorial review for injury news, lineup changes, and form shifts. The model formula and weights are frozen before tournament kickoff on June 11, 2026.

Rating component glossary

PAC
Pace — derived from position-typical baseline, player age band, and fitness status; range 40-99.
SHO
Shooting — driven by goals per cap (capped at 120 caps) and base rating; range 40-99.
PAS
Passing — driven by assists per cap and base rating; range 40-99.
PHY
Physical — combines age band, fitness status, and base rating; range 40-99.
DEF
Defense — anchored by position (GK 85, DEF 82, MID 65, FWD 42 baseline) and adjusted by base rating; range 40-99.
OVR
Overall — base rating × 10 with a small sentiment adjustment; range 40-99. The OVR is what most players, scouts, and fans recognize as the headline player rating.
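
The OVR entry can be read as a small calculation: scale the base rating, apply the adjustment, and keep the result inside 40-99. The sketch below follows that reading. It assumes a base rating on a 0-10 scale, an adjustment expressed in rating points, and rounding to the nearest whole number; none of these details are confirmed by the glossary, and the function names are hypothetical.

```python
def clamp_rating(value, low=40, high=99):
    """Round to a whole number and keep it inside the published 40-99 range."""
    return max(low, min(high, round(value)))

def overall_rating(base_rating, sentiment_adjustment=0.0):
    """OVR as described in the glossary: base rating x 10 plus a small adjustment.

    Assumptions: base_rating is on a 0-10 scale and sentiment_adjustment
    is already expressed in rating points.
    """
    return clamp_rating(base_rating * 10 + sentiment_adjustment)

print(overall_rating(8.4, 1.0))   # mid-range value, unaffected by the clamp
print(overall_rating(3.0))        # below the floor, raised to 40
print(overall_rating(10.5))       # above the ceiling, lowered to 99
```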

Frequently asked questions

How is KickOracle's accuracy measured?

We use three industry-standard metrics: Brier score (lower is better; it measures the calibration and sharpness of probability forecasts), top-1 accuracy (the share of matches where the outcome given the highest probability, whether home win, draw, or away win, actually occurred), and log loss (lower is better; it penalizes confident wrong predictions more heavily than uncertain ones). All three are computed the same way professional forecasters evaluate probability models.

What does the calibration curve mean?

If KickOracle says a team has a 60% chance of winning, teams given that probability should win about 60% of the time. The calibration curve plots predicted probability against observed frequency. Points close to the diagonal mean predictions are well-calibrated. Where the observed frequency is higher than the predicted probability, the model is under-confident at that probability level; where it is lower, the model is over-confident.

How does KickOracle compare to FiveThirtyEight and market benchmarks?

On the Euro 2024 + Copa America 2024 backtest, KickOracle is benchmarked head-to-head against FiveThirtyEight's published probabilities and market-implied benchmark probabilities. The full table on /accuracy shows Brier, top-1 accuracy, and log loss for each source on identical match samples.

Why backtest on Euro 2024 and Copa America 2024?

These were the most recent senior international tournaments before the 2026 World Cup with full match samples and well-documented benchmark probabilities from FiveThirtyEight and public market baselines, making a like-for-like comparison possible. Backtesting against historical World Cups would risk look-ahead bias because the same player and team data is reused.

What does this backtest NOT show?

It does not measure live in-match probability accuracy, does not control for the 2026 expanded 48-team format which has no historical analog, and does not predict individual player performances. The full list of limitations is published in the limitations section of the accuracy page.

Will the model be updated during the World Cup?

Yes. The model parameters and weights are frozen before kickoff, but inputs (FIFA ranking, squad chemistry, morale, stability, fitness) refresh continuously throughout the tournament. We publish a post-tournament accuracy update once the final whistle blows on July 19, 2026.

Read the methodology

Our prediction weights are public. So is our backtest. Read the full methodology and the input weights breakdown.
