Build a World Cup Model: Monte Carlo Fair Value for the Outright

France trades at 22¢ to lift the trophy. That number is not handed down from a panel of pundits: somebody built a model that spat out a probability, the market argued with it, and 22% is where the argument settled. If you want to know whether 22¢ is cheap, expensive, or fair, you need your own number to argue back. The fastest, most honest way to get one is a World Cup Monte Carlo model: feed it team strengths, simulate the bracket ten thousand times, and read the championship share straight off the results.

With the 2026 tournament kicking off June 11 in Mexico City — 48 teams, 104 matches, the July 19 final at MetLife — every outright contract on Kalshi and Polymarket is a bet that someone's simulation is wrong. This is how to build the simulation that tells you when to take the other side.

Why simulate at all instead of just reading odds

You could skip the work and devig the consensus. That is a fine prior, and most days the market is right. But a tournament outright is a path-dependent object: a team's title probability is not just "how good are they," it is "how good are they, given this group, this likely Round-of-16 opponent, and this half of the bracket."

A single power rating cannot express that. A simulation can. Run the whole bracket thousands of times and the share of runs a team wins is its model probability — draw difficulty, upset variance, and depth of field all baked in automatically.

Step 1: turn team strength into a rating number

Every model starts with a number per team. You have three credible sources, and the sharp move is to blend them rather than marry one.

Elo and SPI-style ratings

Elo is the workhorse. Each team carries a rating around 1500–2100; the gap between two ratings maps to a win probability through a logistic curve. It updates after every match — win against a stronger side, gain points; lose to a minnow, bleed them. World Football Elo and FiveThirtyEight-lineage SPI ratings are public starting points.
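The update step can be sketched in a few lines. This is a minimal illustration, not the formula any particular published rating system uses: the function names and the fixed K-factor of 20 are assumptions, and real systems typically vary K by match importance and margin.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Logistic Elo expectation for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both ratings after one match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for an A loss.
    k is an assumed update speed, not a value taken from a published system.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# A 1700-rated side beats a 2000-rated side: the underdog gains, the favorite bleeds.
underdog, favorite = update_elo(1700.0, 2000.0, score_a=1.0)
print(round(underdog, 1), round(favorite, 1))
```

The update is zero-sum: whatever one side gains, the other loses, and the size of the swing grows with how surprising the result was.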

Market-implied strength

Reverse-engineer strength from the no-vig outright board. If the devigged market has France at ~18% and Mexico at ~2%, that is a strength ordering the whole market agreed on. Anchoring to it keeps you from drifting into pure bias.
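One simple way to devig an outright board is proportional normalization: divide every raw implied probability by the board's total so the field sums to 100%. The sketch below assumes that method (sharper books may warrant something fancier), and the raw prices are hypothetical numbers chosen so the output lands on the ~18% and ~2% figures above.

```python
def devig(raw_probs: dict[str, float]) -> dict[str, float]:
    """Proportionally rescale raw implied probabilities so they sum to 1."""
    total = sum(raw_probs.values())
    return {team: p / total for team, p in raw_probs.items()}


# Hypothetical raw board with a 10% overround (the prices sum to 1.10).
raw_board = {"France": 0.198, "Mexico": 0.022, "Rest of field": 0.880}

for team, prob in devig(raw_board).items():
    print(f"{team}: {prob:.1%}")
```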

Your own xG-based ratings

If you have the appetite, build ratings from underlying numbers — expected goals for and against, adjusted for opponent — using a source like FBref. This is where genuine edges hide, because xG-based strength can disagree with reputation and recent results.

Step 2: convert a rating gap into a match probability

This is the engine. For two teams with ratings A and B, the win probability for A in a single match is the logistic function:

P(A beats B) = 1 / (1 + 10^((B − A) / 400))

That /400 scaling is the Elo convention: a 400-point gap means the favorite wins about 91% of the time, a 200-point gap about 76%, and equal ratings give a clean 50/50. Plug France (≈2060) against Croatia (≈1900): a 160-point edge puts France around 72% to advance from that tie.
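The formula is a one-liner in code. This sketch (the function name is an assumption) reproduces the reference points quoted above for gaps of 0, 160, 200, and 400 points.

```python
def win_probability(rating_a: float, rating_b: float) -> float:
    """P(A beats B) on the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Gaps from the text: equal ratings, France vs Croatia (160), 200, and 400.
for gap in (0, 160, 200, 400):
    print(f"{gap:>3}-point gap: {win_probability(1900 + gap, 1900):.1%}")
```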

Knockout football has draws, so a real simulator either resolves ties with a penalty coin-flip or models goals directly (a Poisson draw whose mean scales with the rating gap) and only flips a coin when the score is level. The embedded simulator below does exactly that — it draws goals each match and breaks deadlocks probabilistically, which is why the title shares it produces are realistic rather than chalky.
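A goals-based match can be sketched like this. It is not the embedded simulator's code: the 1.3 baseline goals, the 800-point scale that maps a rating gap to scoring rates, and the fair-coin shootout are all uncalibrated assumptions chosen for illustration.

```python
import math
import random


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson-distributed goal count (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def play_knockout(rating_a, rating_b, rng, base_goals=1.3, scale=800.0) -> bool:
    """Return True if team A advances from a single knockout tie."""
    # Assumed mapping: the stronger side's scoring rate rises with the gap,
    # the weaker side's falls by the same factor.
    mean_a = base_goals * 10 ** ((rating_a - rating_b) / scale)
    mean_b = base_goals * 10 ** ((rating_b - rating_a) / scale)
    goals_a, goals_b = poisson(rng, mean_a), poisson(rng, mean_b)
    if goals_a != goals_b:
        return goals_a > goals_b
    return rng.random() < 0.5  # level score: treat the shootout as a fair coin


rng = random.Random(42)
trials = 20_000
advances = sum(play_knockout(2060, 1900, rng) for _ in range(trials))
print(f"Stronger side advances in {advances / trials:.1%} of simulated ties")
```

Because these parameters are not fitted to anything, the advance rate will not match the logistic curve exactly. In a real model you would tune `base_goals` and `scale` against historical scorelines.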

Step 3: simulate the bracket

Here is the actual machine. It carries a rating for each contender, plays the full knockout bracket match by match using the logistic engine above, records who lifts the trophy, and repeats ten thousand times. Open the Ratings panel to nudge any team up or down (that is you injecting your own read), then hit Run and watch the champion probabilities converge.

[Interactive Monte Carlo simulator: runs 10,000 tournaments and reports each team's probability of winning. Contenders shown: ARG, FRA, BRA, ESP, ENG, POR, NED, GER, BEL, ITA, CRO, URU.]

Two things to notice as it runs. First, the win percentages stabilize after a few thousand simulations — early noise smooths out, which is the whole point of Monte Carlo: you are estimating a probability by brute-force counting. Second, even a clear favorite rarely clears ~20%. That is not a bug. A 48-team field with single-elimination variance is designed to spread probability, and any model claiming a team is 35% to win the whole thing should make you suspicious, not confident.
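The core loop is short enough to sketch. This is a simplified stand-in, not the embedded simulator: it plays an eight-team single-elimination bracket with no group stage, decides each tie straight from the logistic curve, and uses hypothetical ratings (only the France and Croatia values echo the figures quoted earlier).

```python
import random
from collections import Counter


def win_probability(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_bracket(teams, ratings, rng):
    """Play one single-elimination bracket; teams are listed in bracket order."""
    alive = list(teams)
    while len(alive) > 1:
        next_round = []
        for i in range(0, len(alive), 2):
            a, b = alive[i], alive[i + 1]
            p = win_probability(ratings[a], ratings[b])
            next_round.append(a if rng.random() < p else b)
        alive = next_round
    return alive[0]


def title_shares(teams, ratings, n_sims=10_000, seed=0):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    wins = Counter(simulate_bracket(teams, ratings, rng) for _ in range(n_sims))
    return {team: wins[team] / n_sims for team in teams}


# Hypothetical ratings for illustration; the list order defines the bracket.
RATINGS = {"FRA": 2060, "CRO": 1900, "ARG": 2050, "URU": 1920,
           "ESP": 2070, "POR": 1980, "BRA": 2040, "ENG": 2010}
TEAMS = list(RATINGS)

shares = title_shares(TEAMS, RATINGS)
for team, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {share:.1%}")
```

With only eight teams the favorites will model far higher than they would in a 48-team field. Extending the sketch to the real format means adding group-stage simulation and the full knockout draw.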

Top favorite ceiling: ≈18%. Even the strongest team rarely models above ~20% to win a 48-team bracket.

How many simulations is enough

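The standard error on a probability estimate is √(p(1 − p)/N), so it shrinks with 1/√N. At 10,000 runs, a team that truly wins 18% of the time carries a standard error of about 0.4 points, so the estimate lands within roughly a point of 18% almost every time, which is tight enough to trade favorites. For a 2% longshot the standard error at 10,000 runs is about 0.14 points: small in absolute terms, but around 7% of the probability itself. That is why you want more runs (50k+) to pin the tail down. Rare events need more samples to estimate cleanly.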
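The arithmetic is easy to check. This sketch uses the binomial standard error, sqrt(p(1 − p)/N), and prints both the absolute error in points and the error relative to the probability itself. The function name is an assumption.

```python
import math


def standard_error(p: float, n_sims: int) -> float:
    """Standard error of a simulated probability after n_sims independent runs."""
    return math.sqrt(p * (1.0 - p) / n_sims)


for p, n_sims in [(0.18, 10_000), (0.02, 10_000), (0.02, 50_000)]:
    se = standard_error(p, n_sims)
    print(f"p={p:.0%}, N={n_sims:,}: SE={se * 100:.2f} pts, relative={se / p:.1%}")
```

The relative column is the one that matters for longshots: an error that looks tiny in points can still be a large fraction of a 2% probability.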

Step 4: compare your model to the market and find the edge

Now the payoff. Take your simulated probabilities and lay them next to live Kalshi and Polymarket prices. Remember the contract mechanic: a price in cents equals the market's implied probability for a $1-resolving contract, so a 22¢ France contract = a 22% market read. Your edge on any outcome is simply model probability minus price.

Below is an illustrative board built from the early-June outright market against a sample model run. The fair column is your simulator's output; the venue columns are live-ish prices.

Market board: prices across venues

Outcome	Kalshi	Polymarket	Pinnacle (no-vig)	Fair (model)	Edge (pts)
France	22¢	21¢	20¢	19%	-1.0
Spain	18¢	19¢	17¢	17%	0.0
England	14¢	13¢	14¢	12%	-1.0
Brazil	12¢	13¢	13¢	14%	+2.0
Argentina	10¢	11¢	11¢	13%	+3.0
Morocco	4¢	4¢	3¢	6%	+3.0

Edge = fair % − best (lowest) price ¢. Positive = value.

Illustrative snapshots vs a sample model run — verify live prices before trading.

Read it like a desk analyst. France is priced above model on every venue — the market is paying up for reputation and a soft-looking group; that is a fade or a pass, not a buy. Argentina and Brazil model higher than they trade — the South American bloc is where your simulation disagrees with the board, so that is where you shop. Morocco at 4¢ against a 6% model is the kind of cheap convex longshot a tournament sim is built to surface.
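The board's edge column can be reproduced in a few lines. The numbers below are the illustrative snapshot from the table. The data layout and function name are assumptions, and "best price" is taken to mean the lowest price across the three columns, which is what matches the published edges.

```python
# outcome: (Kalshi ¢, Polymarket ¢, Pinnacle no-vig ¢, model fair %)
BOARD = {
    "France":    (22, 21, 20, 19),
    "Spain":     (18, 19, 17, 17),
    "England":   (14, 13, 14, 12),
    "Brazil":    (12, 13, 13, 14),
    "Argentina": (10, 11, 11, 13),
    "Morocco":   (4, 4, 3, 6),
}


def edge_points(row) -> int:
    """Model fair % minus the cheapest available price, in points."""
    *prices, fair = row
    return fair - min(prices)


# Rank outcomes from most underpriced to most overpriced.
for outcome, row in sorted(BOARD.items(), key=lambda kv: -edge_points(kv[1])):
    print(f"{outcome:<10} edge {edge_points(row):+d} pts")
```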

The same comparison as bars makes the disagreements pop:

Model vs market: where the edges are (market price · your model)

France: 21% · 19%
Spain: 18% · 17%
Brazil: 13% · 14%
Argentina: 11% · 13%
Morocco: 4% · 6%

Turning an edge into a trade

A gap on a board is a signal, not a position. Before you fire, run the EV on the specific contract at the specific price. Take Argentina at 11¢, the Polymarket price on the board rather than the 10¢ best price: your model says 13%, the market wants 11¢. Plug it in.

Expected value: is this contract +EV?

Inputs
Your fair probability (your true read on Argentina to lift the trophy): 13%
Market price (what you pay per $1 contract): 11¢
Stake (dollars you put at risk): $100, implied by 909 contracts at 11¢

Result: positive expected value. The market prices Argentina to lift the trophy at 11% but you have it at 13%, a 2.0-point edge.

Market implied: 11.0%
Your edge: +2.0 pts
EV / contract: +$0.020
Expected ROI: +18.2%
Contracts: 909
Max payout: $909
EV on stake: +$18.18
Break-even prob: 11.0%

EV is only as good as your probability. Garbage in, garbage out: devig the market and pressure-test your model.

A 2-point edge at an 11¢ price is real but thin, and thin edges demand discipline: tight sizing, and confidence that your rating for Argentina is genuinely better than the consensus rather than just different. The single most important habit after placing the trade is logging the price you got versus where the market closes. That is the only metric that proves your model is actually beating the market over time, and it deserves its own playbook in closing line value for the World Cup.
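The EV arithmetic for the Argentina contract can be reproduced directly. A $1-resolving contract bought at price c with true probability p has expected value p × (1 − c) − (1 − p) × c = p − c per contract. The sketch below assumes a $100 stake, whole contracts only, and no fees. The function name is an assumption.

```python
def ev_summary(fair_prob: float, price: float, stake: float) -> dict:
    """Expected-value breakdown for buying $1-resolving contracts at `price`."""
    contracts = int(stake / price)        # whole contracts the stake buys
    ev_per_contract = fair_prob - price   # p*(1 - c) - (1 - p)*c simplifies to p - c
    return {
        "contracts": contracts,
        "max_payout": contracts * 1.0,    # each winning contract resolves to $1
        "ev_per_contract": ev_per_contract,
        "expected_roi": ev_per_contract / price,
        "ev_on_stake": contracts * ev_per_contract,
        "break_even_prob": price,
    }


summary = ev_summary(fair_prob=0.13, price=0.11, stake=100.0)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Fees and slippage are ignored here. On a 2-point edge either one can eat a meaningful share of the expected return, so rerun the numbers at your actual fill price.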

How to actually trade your model output

A few rules that separate a model that makes money from one that just generates numbers.

Trade the disagreements, not the agreements. If your sim and the no-vig market both say France 19%, there is no trade. Edge lives only where you diverge — and you should be able to say why in one sentence.
Size by edge, not by conviction. A 6-point model edge on a longshot warrants more than a 1-point edge on a favorite. Convert edge to a Kelly fraction and trade a fraction of full Kelly to survive model error.
Re-run after every result. Group-stage matches are information. Update ratings, re-simulate, and re-price your outrights as the bracket fills in. A static model goes stale by the second matchday.
Respect liquidity. A beautiful edge on a longshot you cannot fill at the modeled price is a paper edge. Check depth before you celebrate.
Treat your prior with humility. When your model wildly disagrees with a deep, liquid market, the base rate is that the market is right. Demand a specific, defensible reason for every big divergence.
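The sizing rule can be made concrete. For a $1-resolving contract bought at price c with model probability p, the full Kelly fraction of bankroll works out to (p − c) / (1 − c). The sketch below applies that and then scales it down. The quarter-Kelly multiplier and the $10,000 bankroll are illustrative assumptions, not recommendations.

```python
def kelly_fraction(fair_prob: float, price: float) -> float:
    """Full Kelly fraction of bankroll for a binary contract; 0 when there is no edge."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return max(0.0, (fair_prob - price) / (1.0 - price))


def stake_size(bankroll: float, fair_prob: float, price: float,
               kelly_multiplier: float = 0.25) -> float:
    """Dollars to risk at a fraction of full Kelly (multiplier is an assumption)."""
    return bankroll * kelly_multiplier * kelly_fraction(fair_prob, price)


# Argentina at 11¢ against a 13% model, then France at 22¢ against a 19% model.
print(f"Argentina full Kelly: {kelly_fraction(0.13, 0.11):.2%}")
print(f"Argentina quarter-Kelly stake on $10,000: ${stake_size(10_000, 0.13, 0.11):.2f}")
print(f"France full Kelly: {kelly_fraction(0.19, 0.22):.2%}")
```

Trading a fraction of Kelly is the hedge against model error: if your 13% is really 12%, full Kelly overbets badly, while a quarter of it stays survivable.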

The market's 22¢ on France is just somebody else's simulation. Build your own, run it ten thousand times, and you stop guessing whether a price is fair — you measure it.

“A price is a forecast in disguise. The only way to know if it's wrong is to forecast it yourself.”

Frequently asked

What is a Monte Carlo model for World Cup betting?

It is a simulation that assigns each team a strength rating, plays out the entire tournament bracket thousands of times using those ratings to decide each match, and counts how often each team wins. The share of simulations a team wins is its fair championship probability, which you compare to market prices to find edge.

How many simulations do I need to run?

Around 10,000 runs gives stable estimates for favorites — the standard error scales with one over the square root of the number of runs, so error shrinks slowly. For low-probability longshots, push toward 50,000-plus so the rare-event tail is estimated cleanly.

Where do team strength ratings come from?

Three main sources: public Elo or SPI ratings, strength reverse-engineered from no-vig market odds, and your own ratings built from expected-goals data. Blending all three is more robust than relying on any single source, and you should sanity-check that your top teams roughly match the market's top teams.

How do I convert a rating gap into a match win probability?

Use the logistic formula P = 1 / (1 + 10^((B − A)/400)), the standard Elo curve. A 400-point gap implies about 91% for the favorite, 200 points about 76%, and equal ratings a 50/50. Knockout draws are resolved with a probabilistic coin-flip or by modeling goals directly.

How do I find an edge against Kalshi or Polymarket?

A contract price in cents equals the market's implied probability for a $1 contract. Subtract the price from your model's probability to get your edge. Buy where your model is meaningfully higher than the price, fade or pass where it is lower, and always confirm the edge survives an EV check at the actual fill price.

Should I trust my model over the market when they disagree?

Usually only a little. Deep, liquid markets are right more often than any home-built model, so treat large disagreements with suspicion and demand a specific reason — an unpriced injury, a soft path, a tactical mismatch. The smart play is to trade the divergences you can justify and size them at a fraction of full Kelly.
