I think you misunderstand. I'm not saying you should base your distribution on how often these words appear; I'm saying use that list as the dictionary you run your tile improvement method against.
So instead of "The new tile rarities are based on an analysis of the in-game dictionary, weighted based on source word length with a bias curving towards 7 letter words."
It would be "The new tile rarities are based on an analysis of the most used words dictionary, weighted based on source word length with a bias curving towards 7 letter words."
Edit: if you do this, delete any one- or two-letter words in that dictionary first. There is a lot of slang and bullshit in there that should be removed before running an analysis.
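Here's roughly what I mean, as a quick Python sketch. I don't know how your actual analysis is implemented, so everything here is made up for illustration: the function names, the linear `length_weight` curve peaking at 7 letters, and the `min_len` cutoff are all my assumptions, not your method.

```python
from collections import Counter


def length_weight(n, peak=7):
    # Hypothetical bias curve: full weight at `peak` letters,
    # falling off linearly to zero on either side.
    return max(0.0, 1.0 - abs(n - peak) / peak)


def letter_weights(words, min_len=3):
    # Returns each letter's share of the total length-weighted letter count.
    counts = Counter()
    for word in words:
        word = word.strip().lower()
        # Drop one- and two-letter entries and anything non-alphabetic.
        if len(word) < min_len or not word.isalpha():
            continue
        w = length_weight(len(word))
        if w == 0:
            continue
        for ch in word:
            counts[ch] += w
    total = sum(counts.values())
    return {ch: c / total for ch, c in counts.items()} if total else {}
```

You'd feed it the most-used-words list instead of the in-game dictionary, then compare the resulting shares against your current tile rarities.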
Also, just checking that you 100% have the rights to use the dictionary you're using, because the hint found a unique word that isn't in the Oxford or Scrabble dictionaries (I googled it because I'd never heard it before): ooecia. Cool word, tonnes of vowels, but not in just about any dictionary used for games. (Some dictionaries include unique words of their own to test whether people are using the dictionary without proper authorization.)