Wordle Solver

An information-theoretic Wordle solver using Shannon entropy — built in Berlin, deployed in Montreal.

Information Theory · NLP · Game AI

Wordle Solver
via Shannon Entropy

A near-optimal, entropy-driven solver for the French and English Wordle variants, written in Python, with an interactive web app to test it in-browser.


Try the solver — zero install

Play French Wordle or English Wordle. The solver tells you the highest-entropy guess at every step. Stop struggling with five-letter words.


I built my first Wordle solver while I was an exchange student in Berlin, with plenty of free time and just some basic Python skills. This was the pre-LLM era — so everything was done the hard (and fun) way.

At the time, I was studying mechanical engineering, and I continued in that direction for a while. But later on, I followed my interest in computation and language, and joined a master’s in computer science (NLP) at Polytechnique Montréal.

Now, things look a bit different. The web app you see below was built in under an hour using LLMs. Same idea — just a much faster way to bring it to life.

Try it before reading the math. Play a round, let the solver suggest its next guess, and watch how each turn strips away possibilities. The theory will make much more sense once you've felt it work.


How it works

Each guess is chosen to maximize expected information gain — formally, the Shannon entropy of the feedback distribution over remaining candidate words.

\[ H(X) = - \sum_{i} P(x_i)\,\log_2 P(x_i) \] // expected uncertainty across all remaining words
\[ g^* = \arg\max_{g} \mathbb{E}[I(g)] = \arg\max_{g} \sum_{r} P(r \mid g)\,\log_2\!\left(\frac{|A|}{|A_r|}\right) \] // pick the guess that partitions the answer space most evenly
01 — feedback

Each letter gets a ternary signal: green (right place), yellow (wrong place), gray (absent). Up to \(3^5 = 243\) possible patterns per guess.
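The feedback computation is subtler than it looks because of repeated letters: greens must claim their copies first, and each yellow consumes exactly one leftover copy. Here is a minimal sketch of that two-pass rule (function name and the `'g'`/`'y'`/`'.'` encoding are illustrative, not the project's actual API):

```python
def feedback(guess: str, answer: str) -> str:
    """Return a Wordle pattern string: 'g' green, 'y' yellow, '.' gray."""
    pattern = ['.'] * len(guess)
    remaining = list(answer)
    # Pass 1: greens claim their letters so yellows can't double-count them.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = 'g'
            remaining[i] = None
    # Pass 2: each yellow consumes one leftover copy of its letter.
    for i, g in enumerate(guess):
        if pattern[i] == '.' and g in remaining:
            pattern[i] = 'y'
            remaining[remaining.index(g)] = None
    return ''.join(pattern)

print(feedback("crane", "caper"))  # → "gyy.y"
print(feedback("geese", "those"))  # → "...gg"  (only one 'e' left to match)
```

Note how `geese` against `those` yields no yellow for the extra `e`s: the single unmatched `e` in the answer is already consumed by the green in the last position.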

02 — partition

Words are grouped by which feedback pattern they would produce. The solver picks the guess that makes those groups as equally sized as possible — maximizing information.
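The partition-and-score step can be sketched as follows. A guess that splits the candidates into many small, even groups has high entropy; a guess that lumps most of them into one pattern has low entropy. This is a self-contained illustration (the compact `feedback` helper is repeated here for runnability; names are illustrative):

```python
from collections import Counter
from math import log2

def feedback(guess, answer):
    # Compact two-pass Wordle pattern: 'g' green, 'y' yellow, '.' gray.
    pat, rem = ['.'] * len(guess), list(answer)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pat[i], rem[i] = 'g', None
    for i, g in enumerate(guess):
        if pat[i] == '.' and g in rem:
            pat[i], rem[rem.index(g)] = 'y', None
    return ''.join(pat)

def entropy(guess, candidates):
    """Expected bits of information from playing `guess`: H of the
    feedback-pattern distribution over the remaining candidates."""
    counts = Counter(feedback(guess, a) for a in candidates)
    n = len(candidates)
    return -sum(c / n * log2(c / n) for c in counts.values())

words = ["crane", "slate", "crate", "caper", "trace"]
# "crane" sends each of these five words to a distinct pattern,
# so its entropy hits the maximum log2(5) ≈ 2.32 bits.
best = max(words, key=lambda g: entropy(g, words))
```

A perfectly even split into `k` groups yields `log2(k)` bits, which is why the solver favours guesses whose partitions are as balanced as possible.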

03 — heuristics

For 7–9 letter words, brute force is too slow. Positional letter frequency prunes the candidate set while losing less than 2% of information gain.