More-or-less aimless surfing brought me to an old blog post about math in movies, which mentions Pi and Contact. This mostly coincidental juxtaposition reminded me of the conclusion of the novel behind the latter.
When I first read the book, a zillion years ago, I thought the ending was pretty clever. Then I realized that it was incredibly dumb. It later occurred to me that it might actually be intentionally dumb, and therefore incredibly clever, because Carl Sagan really ought to have known better, but… I don’t know. I think he either genuinely goofed or assumed (probably correctly, in most cases) that his readers wouldn’t notice.
A short aside: if you haven’t read the novel or seen the film, you haven’t really missed anything. The novel is basically a not-bad-but-not-brilliant ripoff of Stanisław Lem’s Głos Pana and Solaris. The film is… well, a film. It’s OK, I guess, and stars several excellent actors, and is reasonably but not entirely true to the novel.
If you haven’t read the novel but intend to, I should warn you that the rest of this post is a HONKIN’ HUGE SPOILER.
Here goes: at the end of the film, the perceived failure of the project is more or less covered up and Ellie returns to her former job as head of the SETI program. In the novel, however, she is disgraced and (from my recollection—remember, it’s been years since I read it) ends up as a glorified tour guide. She somehow manages to wrangle sufficient computer time to search for something that was hinted at earlier in the novel: a proof of a Universal Creator, embedded somewhere in the digits of π. And guess what… The computer discovers that if you print the digits of π with a specific number of digits per line, after billions and billions of digits you come across a pattern of zeroes that forms a circle on the page. There is a God. QED.
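Here’s the problem: π is transcendental, and its digits are widely believed (though not proven) to behave like a random sequence, which is the property that actually matters here. If that is right, then if you search long enough, no matter what you’re looking for, you’ll eventually find it.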
Monkeys and typewriters, Ellie. Monkeys and typewriters.
But the point of monkeys and typewriters is that they will not, in fact, write Shakespeare.
Suppose the circle on the page is rather simple: it is formed of 10 lines, each containing 10 digits (ones or zeros). So it is the simplest 10×10 rasterisation of a circle. How long a printout would you need to see that specific sequence of digits? About 10^100 digits. That’s a lot more than will fit in “billions and billions” of pages: you couldn’t fit those pages in the known universe. And my reading is that it was a much bigger, and therefore more unlikely, image than that — say, 100 lines, each with 100 digits: it would occur once in 10^10000 tries.
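To put rough numbers on this, here is a minimal sketch. It rests on a loud assumption: the digits of π are treated as independent, uniformly random draws, which nobody has proven. The names `chance_upper_bound` and `DIGITS_SEARCHED` are made up for illustration, and 10^18 digits is just one generous reading of “billions and billions”.

```python
from fractions import Fraction

# Assumption: one generous reading of "billions and billions" of digits.
DIGITS_SEARCHED = 10**9 * 10**9  # 10^18


def chance_upper_bound(positions: int, base: int, length: int) -> Fraction:
    """Union bound on the chance that one specific digit string of the given
    length starts at any of `positions` places, assuming every digit is an
    independent uniform draw (an assumption, not a proven fact about pi)."""
    return min(Fraction(1), Fraction(positions, base**length))


small = chance_upper_bound(DIGITS_SEARCHED, 10, 10 * 10)    # 10x10 picture
large = chance_upper_bound(DIGITS_SEARCHED, 10, 100 * 100)  # 100x100 picture

print(small == Fraction(1, 10**82))    # True
print(large == Fraction(1, 10**9982))  # True
```

Exact fractions are used instead of floats because a value like 10^-9982 would simply underflow to zero in floating point.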
Observing such a pattern after mere billions and billions of pages would prove… what, though? That there is a God? Pi is what it is, whether there is a God or not… It would just suggest a monumental inadequacy in our understanding of mathematics. Or maybe it would prove that God was manipulating the computer in mysterious ways.
On average, yes. But Ellie is not looking for a circle in a random representation of Pi; she is looking for a circle in a specific representation (base 11, so many digits per line etc.) which she was told to use by someone else who already knew about the circle. It’s like looking for hidden messages in the Bible; someone tells you that if you pick every nth letter from every mth chapter and pass it through a mathematical formula they’ve devised and squint a little bit, you end up with a prophecy that says Sarah Palin is the Second Coming of Christ. See also “Nothing up my sleeve” numbers.
Did you know that my birth date occurs in Pi at almost regular intervals of ~1,000,000 digits?
Do you think it means something?
Well, finding the circle in base 11 makes it slightly less likely than finding it in base 10. Of course the probability is increased by the fact that the base, size of circle, choice of digits in the circle, even the choice of geometric figure (circle vs square vs triangle…) are all unspecified. But even if you give a maximum possible latitude to all of those, you can’t compete with a number like 10^100 or 10^10000. Your birthdate is a trivial pattern in comparison: of course any 6-digit string will occur about once in every million positions, but a pattern gets exponentially less probable the longer it gets.
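The exponential fall-off is easy to watch on real digits. The sketch below is only an illustration: it computes the first 10,000 decimals of π with Machin's formula (a standard method, chosen here for brevity) and counts how often runs of 1s of growing length appear. The helper names `pi_decimals` and `count_overlapping` are hypothetical, and the "expected" figure assumes the digits behave like uniform random draws.

```python
def pi_decimals(n: int) -> str:
    """First n decimal places of pi via Machin's formula,
    pi = 16*arctan(1/5) - 4*arctan(1/239), in pure integer arithmetic."""
    scale = 10 ** (n + 10)  # 10 guard digits absorb truncation error

    def arctan_inv(x: int) -> int:
        # scale * arctan(1/x) from the series 1/x - 1/(3x^3) + 1/(5x^5) - ...
        power = scale // x  # scale / x^k, starting at k = 1
        total = power
        k, sign = 1, -1
        while power:
            power //= x * x
            k += 2
            total += sign * (power // k)
            sign = -sign
        return total

    pi_scaled = 16 * arctan_inv(5) - 4 * arctan_inv(239)
    return str(pi_scaled)[1 : n + 1]  # drop the leading "3"


def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    last_start = len(text) - len(pattern) + 1
    return sum(text.startswith(pattern, i) for i in range(last_start))


digits = pi_decimals(10_000)
for pattern in ("1", "11", "111", "1111"):
    expected = len(digits) / 10 ** len(pattern)
    found = count_overlapping(digits, pattern)
    print(f"{pattern!r}: found {found}, expected about {expected:g}")
```

Each extra digit in the pattern divides the expected count by ten, which is why a six-digit birthdate is unremarkable and a hundred-digit picture is not.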
Of course if you print out pi you will see patterns — perhaps faces (the human mind is good at that), perhaps imperfect circles, and so on. But finding a perfect, flawless (within limits of rasterisation), geometrical figure, in a normal page-sized printout that contains hundreds or thousands of digits, is really vanishingly improbable.
It’s like people who think they understand physics saying there is a nonzero probability of a mouse escaping a cage because of Heisenberg’s uncertainty principle. Mathematically the probability of that is nonzero, but for practical purposes it’s zero. If you find that the mouse escaped, it certainly squeezed through the bars.
You seem to be suffering from the misconception that if the probability of a particular outcome is 1/p, then it will not occur until the pth experiment, when in fact, it is just as likely to occur on the very first attempt.
The probability that any given digit of Pi in base 10 is a 1 is 1/10, yet the second digit of Pi is 1—and so is the fourth!
The probability that the first five digits of Pi in base b are 1.2345 is 1/b^5, yet I am sure that I could find a value of b for which they are. If I gave you a specific value for b, and you looked at Pi in base b and saw that it started with 1.2345, would you exclaim “that’s amazing!”, or would you question why I chose that particular value of b?
By the way, your number (10^100 or 10^10000) is way too high, since all you need is for the digits that make up the circle to be 0. The correct number is closer to 10^20 for a 10×10 square in base 10, and 10^200 for a 100×100 square in base 10. Of course, if you’re looking for, say, a circle of 1s in a field of 0s, that’s something else entirely.
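How many digits have to be 0 depends on how the circle is rasterised, so the exponent is only a ballpark. As a rough check, the sketch below counts the cells on a one-cell-thick outline using one assumed rule (a cell counts if its centre lies within the outermost ring of the grid). The function name `outline_cells` and the rule itself are illustrative choices, not anything taken from the novel.

```python
def outline_cells(n: int) -> int:
    """Count cells of an n-by-n grid lying on a one-cell-thick circle outline.

    Assumed rule (one of many possible rasterisations): a cell is on the
    outline when its centre lies strictly between radius n/2 - 1 and n/2
    of the grid centre. Coordinates are doubled to stay in integers."""
    count = 0
    for row in range(n):
        for col in range(n):
            dx = 2 * col + 1 - n  # twice the horizontal offset of the cell centre
            dy = 2 * row + 1 - n  # twice the vertical offset of the cell centre
            if (n - 2) ** 2 < dx * dx + dy * dy < n * n:
                count += 1
    return count


for n in (10, 100):
    cells = outline_cells(n)
    print(f"{n}x{n}: {cells} outline cells out of {n * n}, about 1 in 10^{cells}")
```

Under this rule the 10×10 grid has 28 outline cells, somewhat more than 20 but far fewer than the full 100, so the order of magnitude of the estimate above holds up.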
“You seem to be suffering from the misconception that if the probability of a particular outcome is 1/p, then it will not occur until the pth experiment, when in fact, it is just as likely to occur on the very first attempt.”
I’m saying no such thing. I’m saying it will occur on average once in p times. If you look into pi, one digit in 10 will be 1, one bigram in 100 will be 11, and so on, on average.
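Both statements can be true at once, and a short sketch shows how. It assumes independent trials that each succeed with the same probability `q`; the function names `first_hit_on` and `hit_within` are made up for this example.

```python
import math


def first_hit_on(k: int, q: float) -> float:
    """Chance that the first success lands exactly on trial k (k >= 1)."""
    return (1 - q) ** (k - 1) * q


def hit_within(n: int, q: float) -> float:
    """Chance of at least one success in n independent trials: 1 - (1 - q)^n.
    Written with log1p/expm1 so that a tiny q does not round away to zero."""
    return -math.expm1(n * math.log1p(-q))


q = 1 / 10
print(first_hit_on(1, q), first_hit_on(10, q))  # trial 1 is the likeliest single spot
print(hit_within(10, q))                        # about 0.65
print(hit_within(10**18, 1e-200))               # about 1e-182
```

The first attempt is indeed as good as any other, yet the chance of having seen a 1-in-10^200 event anywhere in the first 10^18 attempts is still negligible.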
“The probability that the first five digits of Pi in base b are 1.2345 is 1/b^5, yet I am sure that I could find a value of b for which they are.”
I can guarantee you can’t. In any base greater than 3, the first digit must be three. In any base less than or equal to 3 (that is, base 2 or base 3), there will be two digits before the radix point, and the digits 3, 4 and 5 will not occur anywhere.
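This is easy to check by writing π out in a few bases. The sketch below is a minimal illustration: `PI_APPROX` is a 20-decimal rational approximation, which is plenty for the first few digits, and `digits_in_base` is a hypothetical helper name.

```python
from fractions import Fraction

# Assumption: 20 decimals of pi are enough to pin down the leading digits.
PI_APPROX = Fraction("3.14159265358979323846")


def digits_in_base(x: Fraction, base: int, frac_digits: int = 4):
    """Return (integer digits, fractional digits) of a positive x in `base`."""
    whole = int(x)  # floor, since x is positive
    frac = x - whole
    int_digits = []
    while whole:
        whole, digit = divmod(whole, base)
        int_digits.append(digit)
    int_digits = int_digits[::-1] or [0]
    frac_out = []
    for _ in range(frac_digits):
        frac *= base
        digit = int(frac)  # next digit after the radix point
        frac_out.append(digit)
        frac -= digit
    return int_digits, frac_out


for base in (2, 3, 4, 10, 11, 16):
    print(base, digits_in_base(PI_APPROX, base))
```

In base 2 and base 3 the integer part takes two digits, and from base 4 upward it is the single digit 3, so no base makes the expansion start with 1.2345.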
“The correct number is closer to 10^20 for a 10×10 square in base 10, and 10^200 for a 100×100 square in base 10.”
OK, I misunderstood the picture. One in 10^200 is still zero, no matter how many pre-factors you put in to account for the choice of base, number of digits per line, choice of digit in pattern, etc. One in 10^20 may perhaps be borderline, but my impression is the book had a much larger pattern than that.
You can’t fight exponential falls.