- How (and why) to program, part 1
- How (and why) to program, part 2
It’s National Computer Science Education Week! That must mean it’s time for part 2 of my How (And Why) To Program series. Today I will discuss a tricky but powerful concept in computer science: recursion.
Briefly, recursion means accomplishing a task by performing it in terms of smaller versions of the same task. For example, each morning I execute my “drive to work” routine, which is really my “drive from point A to point B” routine, where point A is home and point B is work. To do that, I first do “drive from point A to point B” where point A is home and point B is the Golden Gate Bridge (which is about halfway to work for me), followed by “drive from point A to point B” where point A is now the Golden Gate Bridge and point B is work. Each of those steps, of course, can be decomposed into smaller “drive from point A to point B” tasks.
One classic example for illustrating recursion in computer code is the Fibonacci sequence — the mathematical sequence in which each number is the sum of the two before it. You might already see the weakness in that definition: what can the first and second numbers in the sequence be, if they don’t have two numbers before them that can be added together? This is a key feature of recursive functions: at some point they reduce the problem into parts so small that they reach the “base case,” where the recursive rule breaks down. It happens that the base case of the Fibonacci sequence says the first two numbers are 0 and 1. From there, the recursive rule takes over to give the numbers that follow: 1, 2, 3, 5, 8, 13, 21, 34, and so on.
Let’s look at a “function,” which is the computer programming equivalent of a recipe: you give it some inputs, and it gives you an output, the result of processing the inputs in specific ways. Our function is called fibonacci and it takes one input, or “argument”: a number, which we’ll call n. The result of fibonacci(n) will be the nth number in the Fibonacci sequence, where the first two numbers — fibonacci(0) and fibonacci(1) (recall that in programming, lists and sequences of things are almost always numbered beginning at zero) — are 0 and 1.
As before, code samples are presented in the Python programming language, though the same concepts we’re discussing apply to most other programming languages too.
def fibonacci(n): if n == 0 or n == 1: return n else: return fibonacci(n-1) + fibonacci(n-2)
We start with “def fibonacci(n),” which simply means “define a function named fibonacci taking one argument called n.” The body of the function follows. First it checks for the base case: does the caller (whoever is invoking this function) want one of the first two Fibonacci numbers? If so, the function simply “returns” (or hands back to the caller) the value of n, since by coincidence the value of fibonacci(n) is n when n is 0 or 1.
If it’s not the base case, the function returns a different value: the sum of invoking fibonacci first on n-1 and then on n-2. Those recursive calls give the two prior Fibonacci numbers. For instance, if we invoke fibonacci(9), then n is 9 and fibonacci(n-1) is fibonacci(8), which is 21; and fibonacci(n-2) is fibonacci(7), which is 13. Adding those together gives 34, which is the correct result for fibonacci(9).
Enough about the Fibonacci sequence. It’s a contrived example and, though it explains recursion pretty well, it doesn’t demonstrate the real-world applicability of the technique. (It also happens that, for reasons I won’t go into here, recursion is a terribly inefficient way to compute Fibonacci numbers compared to other possibilities like iteration.)
A few days ago, my Mensa Puzzle-a-Day Calendar presented this riddle:
The letters of a certain three-letter word can be added in sequential order (though not necessarily with all three letters together in the same place) to each of the letter strings below to form common, uncapitalized English words. You don’t need to rearrange any of the letters below. Simply add the three needed letters in sequential order. What is the three-letter word, and what are the nine new words formed?
1. alp 2. wl 3. marit 4. ealus 5. urneman 6. cke 7. disintedl 8. traectr 9. epard
(To illustrate the puzzle: the letters of what three-letter word can be inserted in both “hoyde” and “ckear” to produce common English words? The answer is “new,” to produce “honeydew” and “neckwear.”)
Staring at the puzzle for a while, I was unable to solve it. So I sat down and wrote a program to solve it for me. How’s that for real-world applicability?
Once again I relied on the file /usr/share/dict/words (or sometimes /usr/dict/words, or /usr/lib/dict/words) that is a standard feature of some operating systems; it’s simply a list of many common English words (and many uncommon ones, plus some frankly questionable ones), one per line. Reading that file, I produced two sets of words: one set of all the words, and one set of all three-letter words. Here’s how that looks:
three_letter_words = set() all_words = set() wordlist = open('/usr/share/dict/words') for word in wordlist: word = word[:-1] all_words.add(word) if len(word) == 3: three_letter_words.add(word) wordlist.close()
(Very similar code is explained in detail in part 1 of this series.)
With those two word sets in hand, and the nine letter-strings from the puzzle, this was my strategy: try all possible ways of inserting the letters of all the three-letter words in each of the letter-strings. For any three-letter word, if none of its combinations with a given letter-string produces a valid word, remove the three-letter word from further consideration. In other words, beginning with all possible three-letter words, we whittle them away as they become disqualified. In the end, the only three-letter words left should be ones that combine, one way or another, with all of the nine letter-strings to produce valid words.
So, for example, the three-letter words “see” and “era” both can be added to the letter-string “alp” to produce valid words (“asleep” and “earlap”). But the three-letter word “new” can’t be, so after running through all the three-letter words on the letter-string “alp,” “see” and “era” will still be in the set three_letter_words, but “new” won’t be.
Here’s how that strategy looks:
for string in ("alp", "wl", "marit", "ealus", "urneman", "cke", "disintedl", "traectr", "epard"):
This starts a loop that will run nine times, once for each letter-string, giving each letter-string the name “string” on its turn through the body of the loop.
three_letter_words_to_discard = list()
This creates an empty list called three_letter_words_to_discard. It’s empty now but as we progress we will fill it with words to remove from the three_letter_words set.
(If you’re wondering why I sometimes use lists for collections of things, and sometimes use sets, gold star! The answer is that they are two different kinds of data structure, each one good at some things and bad at others. A set is very fast at telling you whether a certain item is in it or not; a list is slow at that. On the other hand, a list keeps things in the same order in which you added them; a set doesn’t do that at all.)
for three_letter_word in three_letter_words:
This starts a nested loop. It’ll run through the complete list of three_letter_words each of the nine times that the outer loop runs.
combinations = combine(three_letter_word, string)
Here we presume there’s a function called combine that takes the current three-letter word and the current letter string, and produces the complete list of ways that the letters of three_letter_word can be interspersed with the letters of string. For example, combine(“abc”, “def”) should produce the list [“abcdef”, “abdcef”, “abdecf”, “abdefc”, “adbcef”, “adbecf”, “adbefc”, “adebcf”, “adebfc”, “adefbc”, “dabcef”, “dabecf”, “dabefc”, “daebcf”, “daebfc”, “daefbc”, “deabcf”, “deabfc”, “deafbc”, “defabc”]. That’s where recursion is going to come into play. We’ll get to writing the combine function in a moment.
good_combinations = list() for combination in combinations: if combination in all_words: good_combinations.append(combination)
With the list of combinations in hand, we now look through them to see which of them are valid words, if any. We set good_combinations to be a new empty list where we’ll accumulate the valid words we find. We loop through the combinations, testing each one to see if it’s a member of the set all_words. If one is, we add it to the list good_combinations.
if good_combinations: print three_letter_word, "+", string, "=", good_combinations else: three_letter_words_to_discard.append(three_letter_word)
After the “for combination in combinations” loop, we check to see whether good_combinations has anything in it. (“If good_combinations” is true if the list has something in it, and false otherwise.) If it does, we print out the current three-letter word, the current letter-string, and the list of valid words they make. If it doesn’t, then three_letter_word goes into our list of three-letter words to discard.
for word in three_letter_words_to_discard: three_letter_words.remove(word)
After the “for three_letter_word in three_letter_words” loop, this small loop does the discarding of disqualified three-letter words.
Why not simply discard those words from three_letter_words in the preceding loop, as we run across them? Why save them up to remove them later? The answer is that when you’re looping through the contents of a data structure, it’s a bad idea to add to or remove from the data structure. The loop can get confused and lose its place in the structure. It may end up running twice with the same list member, or skip a member entirely. It’s safe to make changes to the membership of the data structure only after the loop finishes.
Finally, after the outermost loop has finished, it’s time to see which three-letter words remain in our set:
print three_letter_words
And that’s all! All except the tricky part: the combine function. Here is how it starts:
def combine(string1, string2):
It takes two strings. We’ll give them generic names, string1 and string2, so as not to assume that either one is a three-letter word. As you’ll see, often neither one is.
Now, how to approach writing a recursive function? It’s usually a safe bet to start with the base case, the conditions under which combine isn’t recursive. The recursive step will involve passing shorter and shorter strings to combine, so the base case is when one or both of the strings is empty. Obviously if either string is empty, the result should be the other string — or more precisely, the list containing the other string as its one member (since we’ve already stipulated that the result of combine is a list of strings). In other words, combine(“”, “def”) should produce the list [“def”] — which after all is the result of interspersing the letters of “” among the letters of “def” — and combine(“abc”, “”) should produce [“abc”].
So here’s the body of combine so far. It’s just the base case:
if len(string1) == 0: return [string2] elif len(string2) == 0: return [string1]
(Recall that “elif” is Python’s abbreviation for “else if.”)
Now for the case where string1 and string2 are both non-empty; the recursive case. The key to writing the recursive step of a function like this is figuring out (a) how to make the problem the same but smaller, and then (b) what to do with the result of computing the smaller solution.
One way to make the problem smaller is to lop off the first letter of string1. So if combine were originally invoked with the strings “abc” and “def,” the recursive call would invoke it with “bc” and “def.” Presuming combine works correctly — which is the counterintuitive assumption you must always make about the recursive step in a function like this — we’ll get back the list [“bcdef”, “bdcef”, “bdecf”, “bdefc”, “dbcef”, “dbecf”, “dbefc”, “debcf”, “debfc”, “defbc”]. None of those belongs in the result list of combine(“abc”, “def”); but if we now restore to the beginning of each of those strings the same letter we lopped off, we get [“abcdef”, “abdcef”, “abdecf”, “abdefc”, “adbcef”, “adbecf”, “adbefc”, “adebcf”, “adebfc”, “adefbc”]. This is halfway to the complete answer: it’s all the strings in the result list that begin with the first letter of string1. We only need to add all the strings in the result list that begin with the first letter of string2, and we’re done. We do this by treating string2 the same way we just treated string1: we lop off its first letter in another recursive call to combine, then paste it back on to each string in the result. Continuing the example, this means calling combine(“abc”, “ef”), which produces [“abcef”, “abecf”, “abefc”, “aebcf”, “aebfc”, “aefbc”, “eabcf”, “eabfc”, “eafbc”, “efabc”]. Sticking the “d” back onto the beginning of each of those strings gives [“dabcef”, “dabecf”, “dabefc”, “daebcf”, “daebfc”, “daefbc”, “deabcf”, “deabfc”, “deafbc”, “defabc”], and adding this list to the list from the first recursive call gives the complete solution.
In Python, the first letter of string is denoted string[0]. The rest of string, without its first letter, is denoted string[1:]. So here’s the complete version of combine, with the (double) recursive step added in.
def combine(string1, string2): if len(string1) == 0: return [string2] elif len(string2) == 0: return [string1] else: recursive_result1 = combine(string1[1:], string2) recursive_result2 = combine(string1, string2[1:]) result = [] for string in recursive_result1: result.append(string1[0] + string) for string in recursive_result2: result.append(string2[0] + string) return result
This is the crazy magic of recursion: at each step, you simply assume the next-smaller step is going to work and give you the result you need. All you have to get right is the base case and the way to process the recursive result, and — well, look:
hol + alp = ['alphol'] has + alp = ['alphas'] sae + alp = ['salpae'] her + alp = ['halper'] see + alp = ['asleep'] eta + alp = ['aletap'] era + alp = ['earlap'] soe + alp = ['aslope'] yin + alp = ['alypin'] pus + alp = ['palpus'] een + alp = ['alpeen'] kas + alp = ['kalpas'] ecu + alp = ['alecup'] ist + alp = ['alpist'] doh + alp = ['adolph'] pal + alp = ['palpal'] cul + alp = ['calpul'] ped + alp = ['palped'] Moe + alp = ['Malope'] clo + alp = ['callop', 'callop'] gos + alp = ['galops'] tid + alp = ['talpid'] yum + alp = ['alypum'] pon + alp = ['palpon'] hin + alp = ['alphin'] joy + alp = ['jalopy'] hol + wl = ['wholl', 'wholl'] sae + wl = ['swale'] soe + wl = ['sowel', 'sowle'] joy + wl = ['jowly'] joy + marit = ['majority'] joy + ealus = ['jealousy'] joy + urneman = ['journeyman'] joy + cke = ['jockey'] joy + disintedl = ['disjointedly'] joy + traectr = ['trajectory'] joy + epard = ['jeopardy'] set(['joy'])
I got it in ~30 seconds by inspection ๐ #7. disintedl was the key.