The Wordle strategy of starting with "Siren, Octal, Dumpy", which I saw mentioned on a YouTube channel I follow, just nerd-sniped me hard. The TL;DR is that I now think "Crape, Doily, Shunt" are three better opening guesses, especially for Dordle. Read on to find out why.
Since you have to find two words using one set of guesses in Dordle, using the first few guesses to find most of the letters in both words is even more important than in Wordle. So after hearing about starting with "Siren, Octal, Dumpy" I tried it out, and it very much increased my Dordle win rate. But that strategy was based on the frequency of letters in English as a whole, and that got me thinking: wouldn't checking how common letters are in the list of allowed answers be much more relevant? That's not the same list as the list of five-letter words you're allowed to *guess*, so it's not just the same thing as limiting the dictionary to five-letter words. The answer list is shorter because the game would be frustrating to play if the answer were too often a word players had never heard of.
So if I assume that I always want to use my first three guesses to find as many letters as possible (unless the first two guesses have already found all five letters), then optimizing the first word on its own to have the most common letters isn't needed. I still optimize primarily on finding as many letters as possible, which means finding three words that together use the 15 most common letters. Among the triples that do that, I then want the one that gives the highest average number of green squares. So I set about writing a quick and dirty JavaScript script.
The first thing I did was count the number of times each letter appeared in the 2315 words allowed to be the randomly picked answers. I counted both for that letter in any position, and for that letter in each of the five positions. I found that if you're going to use just two words to guess, then "Siren, Octal" is not the best way to go for this dictionary, even if we ignore finding green letters. The 10 most common letters in allowed answers are ACEILNORST. The word "Orate" contains all of the 5 most common letters, but that leaves CILNS as letters 6-10, and those don't anagram to a word. Hang on, if "orate" is made up of the five most common letters, why is there no S in it? Aren't plural forms of nouns allowed, so wouldn't S shoot up to near the top? Well, no. The game allows plural forms for guesses, but not for the answers! So for example, the word "Books" can never be the answer in Dordle. In fact, only 36 out of 2315 allowed answers end in an S! (This made me think my code was buggy, but the count tool in my text editor confirmed this.)
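A minimal sketch of this kind of count, not the original script. The function names and the four-word `sampleAnswers` list are made up for illustration (the real input would be the 2315 allowed answers), and it assumes "any position" means counting each word once per letter it contains, which the text above does not spell out.

```javascript
// Hypothetical stand-in list; the real input would be the 2315 allowed answers.
const sampleAnswers = ["shunt", "slate", "crane", "toast"];

// Count, for each letter, how many words contain it (assumption: a word counts
// once per letter even if the letter repeats) and how often it sits at each
// of the five positions.
function countLetters(words) {
  const anyPosition = {};
  const byPosition = [{}, {}, {}, {}, {}];
  for (const word of words) {
    const w = word.toLowerCase();
    for (const letter of new Set(w)) {
      anyPosition[letter] = (anyPosition[letter] || 0) + 1;
    }
    for (let i = 0; i < w.length; i++) {
      byPosition[i][w[i]] = (byPosition[i][w[i]] || 0) + 1;
    }
  }
  return { anyPosition, byPosition };
}

// Return the n most common letters as a string; ties are broken alphabetically.
function topLetters(counts, n) {
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n)
    .map(([letter]) => letter)
    .join("");
}

const letterCounts = countLetters(sampleAnswers);
console.log(topLetters(letterCounts.anyPosition, 3));
console.log(letterCounts.byPosition[4]); // letters seen in the last position
```

Run against the full answer list, `byPosition[4]["s"]` would be the number of answers ending in S, which is the figure that can be cross-checked in a text editor.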
But since I was looking for three starting words to guess, what's more relevant for that search is that the fifteen most common letters in possible answers are: ACDEHILNOPRSTUY. Wordsmith's Internet Anagram Server (https://new.wordsmith.org/anagram/) gave me a list of 3223 triples of five-letter words that use all of these letters. Yes, it's possible that Wordsmith uses a slightly different dictionary than the list of allowed guesses, but I decided not to worry about it.
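As a sanity check on any triple, here is a small sketch that tests whether three words use exactly a given set of letters. The function name `usesExactly` is hypothetical and this is not part of the anagram server or the original script.

```javascript
// True if the words in `triple`, taken together, use each letter in `letters`
// exactly once and no other letters.
function usesExactly(triple, letters) {
  const used = triple.join("").toLowerCase().split("").sort().join("");
  const wanted = letters.toLowerCase().split("").sort().join("");
  return used === wanted;
}

const commonFifteen = "ACDEHILNOPRSTUY";
console.log(usesExactly(["crape", "doily", "shunt"], commonFifteen));
console.log(usesExactly(["siren", "octal", "dumpy"], commonFifteen));
```

The second call is false because "Siren, Octal, Dumpy" spends a letter on M and never tests H.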
I then ran all those triples through my script and counted how many green squares each triple would result in when run against every possible answer. The winning triple was "Crape, Doily, Shunt".
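The scoring step could look roughly like the sketch below. It is an illustration under assumptions, not the original script: the function names are invented, words are assumed to be lowercase and five letters long, and the score is simply the sum of green squares over all three guesses and all answers.

```javascript
// Number of positions where the guess has the same letter as the answer.
function greenCount(guess, answer) {
  let greens = 0;
  for (let i = 0; i < answer.length; i++) {
    if (guess[i] === answer[i]) greens++;
  }
  return greens;
}

// Total green squares a triple of guesses produces across every answer.
function totalGreens(triple, answers) {
  let total = 0;
  for (const answer of answers) {
    for (const guess of triple) total += greenCount(guess, answer);
  }
  return total;
}

// Pick the triple with the highest total; the first one wins on ties.
function bestTriple(triples, answers) {
  let best = null;
  let bestScore = -1;
  for (const triple of triples) {
    const score = totalGreens(triple, answers);
    if (score > bestScore) {
      best = triple;
      bestScore = score;
    }
  }
  return { triple: best, score: bestScore, average: bestScore / answers.length };
}

// Hypothetical two-word answer list, just to show the call shape.
console.log(bestTriple([["crape", "doily", "shunt"]], ["crane", "shunt"]));
```

Because the three words in a triple share no letters, at most one of them can be green in any given position, so the sum never counts the same square twice.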
I then checked the number of green letters given by each of those words individually, and found that they should be guessed in the order "Crape, Doily, Shunt", which happens to be the order I first got them in. But then I realized that since I'll use three words to search for letters *unless* the first two already found all five letters of the answer, I should optimize the order for that instead, so I counted which words give the most yellow or green squares. Slightly annoyingly, the best order was still "Crape, Doily, Shunt".
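A rough sketch of that ordering step, again with invented names and a toy answer list. It assumes each guess has no repeated letters (true for these three words), so a letter shows up yellow or green exactly when the answer contains it; guesses with repeated letters would need the game's real coloring rules.

```javascript
// Number of distinct letters in the guess that appear anywhere in the answer,
// i.e. squares that would be yellow or green for a guess without repeats.
function hitCount(guess, answer) {
  let hits = 0;
  for (const letter of new Set(guess)) {
    if (answer.includes(letter)) hits++;
  }
  return hits;
}

// Sort words so the one finding the most letters across all answers is first.
function orderByHits(words, answers) {
  const totals = words.map((word) => ({
    word,
    hits: answers.reduce((sum, answer) => sum + hitCount(word, answer), 0),
  }));
  return totals.sort((a, b) => b.hits - a.hits);
}

// Hypothetical two-word answer list, just to show the call shape.
console.log(orderByHits(["doily", "shunt", "crape"], ["crane", "slate"]));
```

Swapping `hitCount` for a green-only count gives the first ordering described above.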
It's also worth noting that while Crape, Doily, and Shunt are all legal guesses, only Shunt can ever be the correct answer.
Amusingly, I just noticed that Crape, Doily, Shunt is an opening that beats Absurdle, the adversarial Wordle variant, in just 5 guesses, even though it's in no way optimized to be good against it.
Anyone have any feedback? Would you have tried to optimize for something else instead, or as well? Or do you think I missed something obvious?