Decoding
Decoding secret messages can be really hard! I had a friend who sent me a secret message, and you would be surprised at how many different results fit the available letters. Some were funny, some were downright frightening and had my heart racing, but in the end I think I found the correct answer.
When decoding secret messages, the manual approach is incredibly time-consuming. I found it much easier to write a small program to help. As a Python programmer, I found it pretty easy to slap something together.
I decided to approach the problem by treating all of the available letters as a multiset. A multiset is much like a set, but it can store and match against multiple copies of any individual letter. The drawback of this approach is that it is very brute force. There is certainly a more efficient option, such as some sort of tree or fuzzy matching, but I decided that simple and effective was the best route to go.
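To make the multiset idea concrete, here is a minimal sketch. It uses `collections.Counter` from the standard library as a stand-in for the `multiset` package used in the script further down, and the helper name `fits` is made up for illustration.

```python
from collections import Counter

def fits(word, letters):
    """Return True if `letters` holds every letter of `word`, counting repeats."""
    need = Counter(word)
    return all(letters[ch] >= n for ch, n in need.items())

letters = Counter("LETTERS")    # L:1, E:2, T:2, R:1, S:1

print(fits("TREE", letters))    # True: both E's are available
print(fits("TEETER", letters))  # False: needs three E's, only two exist

# A plain set cannot tell the difference, because it forgets the counts
print(set("TEETER") <= set("LETTERS"))  # True

# Removing a matched word leaves the letters that still need a home
remaining = letters - Counter("TREE")
print(sorted(remaining.elements()))     # ['L', 'S', 'T']
```

The counting is the whole point: a plain set would happily accept TEETER even though the bag only has two E's.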
The first problem is that you need a wordlist to match against. I found one: google-10000-english-usa. It ended up being generally effective.
Solving turned out to be a pretty iterative process. I was running in a Jupyter Notebook, which meant I didn't have to reload my wordlist each time and could just update my suspected message words between runs. I printed the best match so far every 1000 iterations, even when it was incomplete, as those partial results often contained decent words for me to use.
# Open my wordlist
with open('google-10000-english-usa.txt', 'r') as fh:
    wordlist = fh.readlines()
wordlist = [w.rstrip().upper() for w in wordlist]
#%%
from multiset import Multiset
import random
# The letters in your secret scrambled message
bag = list('LETTERS')
# Add words you think are in the message here
words = [
]
wordlist_depth = 5000
result_limit = 5
dupe_limit = 5
# Take the letters of the words we already trust out of the bag
for w in words:
    for l in w:
        bag.remove(l)
print("All Letters:", ''.join(bag))
print("Available Letters:", ''.join(sorted(set(bag))))
# We want to limit the 2 letter words or they can dominate our results
safewords = ['BE', 'AM', 'AS', 'OR',
'BY', 'TO', 'OF', 'IS',
'IT', 'AT', 'WE', 'MY',
'ME', 'DO', 'NO', 'SO',
'GO']
iteration = 0
dupe = 0
best = None
results = []
while True:
    iteration += 1
    if iteration % 1000 == 0:
        print(f"Iteration {iteration} Best {best}")
        best = None
    letters = Multiset(bag)
    my_words = [w for w in wordlist[:wordlist_depth] if len(w) > 2 or w in safewords]
    # Random shuffle the wordlist
    random.shuffle(my_words)
    # Greedily take every word that still fits in the remaining letters
    found = []
    for w in my_words:
        word = Multiset(w)
        if word.issubset(letters):
            found.append(w)
            letters = letters.difference(word)
    # Keep the attempt that leaves the fewest letters unused
    if best is None or len(letters) < best_left:
        best = found
        best_left = len(letters)
    result = ' '.join(sorted(found))
    # A full solution uses up every letter in the bag
    if len(letters) == 0:
        if result in results:
            dupe += 1
        else:
            print(result)
            results.append(result)
            dupe = 0
    if len(results) >= result_limit:
        print("Result Limit Reached")
        break
    if dupe > dupe_limit:
        print("Dupe Limit Hit")
        break
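If you want to try the idea without downloading the wordlist or installing anything, here is a compact sketch of the same shuffle-and-grab loop. It is not the script above: it swaps in `collections.Counter` for the `multiset` package, and the function name `unscramble`, the tiny wordlist, and the scrambled letters are all made up for illustration.

```python
import random
from collections import Counter

def unscramble(scrambled, wordlist, tries=200, seed=0):
    """Collect word combinations that use every letter of `scrambled` exactly once."""
    rng = random.Random(seed)  # seeded so that repeated runs behave the same
    solutions = set()
    for _ in range(tries):
        letters = Counter(scrambled)
        candidates = list(wordlist)
        rng.shuffle(candidates)
        found = []
        # Greedily take each word that still fits in the remaining letters
        for w in candidates:
            need = Counter(w)
            if all(letters[ch] >= n for ch, n in need.items()):
                found.append(w)
                letters -= need
        # Only keep attempts that leave no letters behind
        if sum(letters.values()) == 0:
            solutions.add(' '.join(sorted(found)))
    return solutions

# Hypothetical toy data: the letters of "EAT MORE PIE" in scrambled order
toy_words = ['EAT', 'TEA', 'ATE', 'MORE', 'ROME', 'PIE', 'TIME', 'PEAR']
solutions = unscramble('PIEEMORTAE', toy_words)
for s in sorted(solutions):
    print(s)
```

Even on this toy input more than one phrase fits the letters, which is exactly why the real run needed a human in the loop to pick the sensible one.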
That’s it! Message unscrambling 101. It only took me six years of school to make it this easy. Good thing I had the vision to see that through. The father once told me that if someone relies on someone other than themselves for their accomplishments, they aren’t really theirs. I think he was right.