The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think (2006)

Chapter: 39 Stacking Words Like Oranges and Tomatoes (Computer Science)

Previous Chapter: 38 Nature’s Fundamental Formula (Botany)
Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.

39
Stacking Words Like Oranges and Tomatoes (Computer Science)

In August 2002 the International Congress of Mathematicians was held in Bejing. This mammoth convention, which takes place every four years and to which thousands of mathematicians come from all corners of the world, is a good opportunity for bestowing prizes on the worthiest among the many worthy. Since 1982 a prize called the Nevanlinna Prize (named after the Finnish mathematician Rolf Nevanlinna) has been awarded for outstanding work in the field of theoretical computer science. In 2002 it was bestowed on Madhu Sudan, a professor at the Massachusetts Institute of Technology. The laudation mentioned, among other things, Sudan’s fundamental contributions to the area of error-correcting codes.

Computer users are probably familiar with error-correcting software. Most word-processing programs contain this feature in the form of a spell checker. If, for example, you type “hte,” a wavy red line under the senseless letter combination will indicate that such a word does not exist in the English language. Commonly this is referred to as error detection.

Some word-processing programs go further than that. In the English language, for instance, only one legitimate word exists containing only the three letters t, h, and e. Hence there is no doubt about which word the typist intended to type. Accordingly, an advanced word-processing program will automatically replace the incorrect sequence of characters with “the.” In this case an error correction took place.

It is much simpler to detect errors than to correct them. When transmitting text or copying a file, it suffices to include a check sum to detect errors. For a string of numbers, for example, this quality check could be in

Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.

the form of the sum of the string’s digits. If the “check sum” at the points of transmission and reception do not correspond, one must conclude that at least one error has crept in, and the number sequence needs to be retransmitted.

This repetition of a transmission is time consuming. It certainly would be more efficient to correct the errors at the point of reception automatically. So, instead of having to retype the word “the,” a good program automatically replaces the erroneous character sequence “hte” with the correct one.

How and when is this possible? Surprisingly this problem is closely connected to a totally different problem that Johannes Kepler, an astronomer and mathematician who lived in the first half of the 17th century, investigated at great length. The question he asked was, which is the most efficient way to arrange oranges or tomatoes on a greengrocer’s cart—that is, with the minimum amount of gaps in between them?

How can the problem of stacking tomatoes have anything to do with error correction? Imagine a three-dimensional space with the letters of the alphabet arranged at equal distances, say 1 centimeter, from each other along the three axes. Three-letter words can be visualized as points in this space, the x axis representing the first letter, the y axis the second, and the z axis the third letter of the word. Thus each word is represented by a point within a cube measuring 26 × 26 × 26 centimeters. Points that do not correspond to any word in the English language remain empty. Hence, if a nonsensical character sequence is transmitted, this would correspond to an empty space that an error detection program would automatically flag. An error correction program does more: It automatically seeks out the nearest “legitimate” word.

The more distant from each other the legal spots are, the less doubt exists regarding possible errors and the easier it is to replace a faulty sequence of characters with the correct one. Hence, to avoid ambiguities, it is important to ensure that no other proper word lies within a certain distance around every legitimate word. On the

Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.

other hand, one wishes to cram as many words as possible into the cube. Hence, the problem that presents itself is as follows: How can as many points as possible be stored in the cube without the points lying closer to each other than a minimal distance? Thus we have arrived at the question that Kepler asked himself: How can oranges or tomatoes be stacked as densely as possible without squashing each other?

For centuries greengrocers were well aware of the fact that their fruits and vegetables are most efficiently packed in piles and layers that are arranged in a honeycomb or hexagonal pattern. It was only in 1998, however, that this conjecture was actually proved rigorously by the American mathematician Thomas Hales. In theory, 17,576 three-letter words can be stored along the lattice points of a cube holding 26 letters (= 263). The answer to Kepler’s question shows, however, that the densest stacking of words, with 1 centimeter between any two of them, would allow about 25,000 words to be stored in the same space. If, on the other hand, one wanted to establish a broader security band, say 2 centimeters, between any two proper words, then the most efficient stacking method allows for only about 3,000 words.

To deal with words consisting of more than three letters, one simply turns to spaces of more than three dimensions. However, the most efficient method of packing has so far not been determined for all higher-dimensional spaces.

Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.
Page 158
Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.
Page 159
Suggested Citation: "39 Stacking Words Like Oranges and Tomatoes (Computer Science)." George G. Szpiro. 2006. The Secret Life of Numbers: 50 Easy Pieces on How Mathematicians Work and Think. Washington, DC: Joseph Henry Press. doi: 10.17226/11543.
Page 160
Next Chapter: 39 Stacking Words Like Oranges and Tomatoes (Computer Science)
Subscribe to Email from the National Academies
Keep up with all of the activities, publications, and events by subscribing to free updates by email.