After playing some Wordle over Christmas with my family, I decided I would ruin the game for everyone by trying to find the optimum solution using code.
Don't worry, you can read this without finding out the words: they're hidden by spoiler blocks below...
Let's start by looking for the best starting word.
First, we need to know what words we can use to guess and what the possible answers are. It turns out that the programmer, Josh Wardle, left the two word lists in the code for the program itself, and they are different. This means you can guess words that will never be the answer, which is great programming: a person is allowed to guess an obscure real word, but an obscure word will never be the answer.
Then we need to have a good think about what we mean by 'best starting word'. Options:
1) The word that on average leaves the fewest potential answers.
2) The word that in the worst-case scenario leaves the fewest potential answers.
3) The funniest word. This is FARTS.
Let's compare a couple of words to get a better idea. For every possible answer, I calculated how many potential answers would be left after guessing FUZZY and ASIDE.
You can see that in the worst-case scenario, ASIDE will still eliminate all words with an A, S, I, D or E in, leaving 223 words, and on average leaves far fewer words than FUZZY.
This tallies with expectations: ASIDE's letters are far more common than FUZZY's, so on average it leaves fewer potential answers, and even in the worst-case scenario (where the answer doesn't have an A, S, I, D or E) it still eliminates 2092 words.
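For anyone who wants to try the comparison themselves, here is a minimal sketch of the per-answer calculation. It is not the code used for this post: the function names `feedback` and `remaining` are made up for illustration, and the tiny word list in the usage note is a stand-in for the real answer list. It assumes the usual Wordle colouring rules, including the way repeated letters are handled.

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Colour pattern for a guess: G = green, Y = yellow, - = grey."""
    pattern = ["-"] * len(guess)
    unmatched = Counter()  # answer letters not already matched as green

    # First pass: mark greens and collect the answer letters left over.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1

    # Second pass: a non-green letter is yellow only while the answer
    # still has an unused copy of it (this handles repeated letters).
    for i, g in enumerate(guess):
        if pattern[i] != "G" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1

    return "".join(pattern)


def remaining(guess: str, answer: str, candidates: list[str]) -> list[str]:
    """Candidates still possible after seeing the colours for this guess."""
    seen = feedback(guess, answer)
    return [word for word in candidates if feedback(guess, word) == seen]
```

With an illustrative list such as `["FUZZY", "ASIDE", "LUCKY", "POUND"]`, calling `remaining("ASIDE", "FUZZY", words)` keeps only the words that would have produced the same all-grey pattern. Running it for every possible answer gives the distribution compared above.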
Doing this for every word is a triple-nested loop: for every possible guess, work out what information you would get for every possible answer, then count how many of the possible answers that information eliminates. I coded it inefficiently in Python and ran it in a Google Colab notebook. It took approximately a day to run.
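As a rough sketch of what that search looks like (again, not the notebook code itself, and every function name here is hypothetical), the version below scores a guess by both criteria: the average and the worst-case number of answers left. `score_naive` follows the loop structure described above. `score_buckets` is a swapped-in shortcut that groups answers by the colour pattern they produce, which should give the same numbers with one fewer level of looping.

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Colour pattern for a guess: G = green, Y = yellow, - = grey."""
    pattern = ["-"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] != "G" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)


def score_naive(guess: str, answers: list[str]) -> tuple[float, int]:
    """(average, worst case) answers left, using the slow nested loops."""
    counts = []
    for answer in answers:                      # loop over true answers
        seen = feedback(guess, answer)
        left = sum(1 for word in answers        # loop over what survives
                   if feedback(guess, word) == seen)
        counts.append(left)
    return sum(counts) / len(counts), max(counts)


def score_buckets(guess: str, answers: list[str]) -> tuple[float, int]:
    """Same scores, computed by grouping answers by colour pattern."""
    buckets = Counter(feedback(guess, answer) for answer in answers)
    # An answer in a bucket of size n leaves exactly n candidates.
    total = sum(n * n for n in buckets.values())
    return total / len(answers), max(buckets.values())


def best_by_average(guesses: list[str], answers: list[str]) -> str:
    """Outer loop over guesses: lowest average number of answers left."""
    return min(guesses, key=lambda g: score_buckets(g, answers)[0])


def best_by_worst_case(guesses: list[str], answers: list[str]) -> str:
    """Outer loop over guesses: smallest worst-case number of answers left."""
    return min(guesses, key=lambda g: score_buckets(g, answers)[1])
```

The bucket trick works because every answer that produces the same pattern for a guess ends up with the same set of survivors, so there is no need to recount them for each answer individually.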
1) The word that on average leaves the fewest potential answers is
3) The funniest word is still FARTS.
It's worth pointing out that this isn't entirely intuitive. We're used to thinking of T as one of the most common letters, and even though that remains true in this dataset, it doesn't differentiate between words as much as I does. Other letters that we think of as being common, like D, are not as common in five-letter words.
Neither of these words is optimising the right thing. Ideally we would:
1) Have on average the fewest possible turns.
2) Guarantee that we will only use a certain number of turns.
This is true. But running an exhaustive approach on this would require a four-way loop, and for that you'd need silly computing power and a better programmer than me. BUT, I have been working on a strategy that guarantees a solution in 6 goes. Using that, I've worked out some excellent tips for turns 2 and 3.