Hans Christian von Baeyer
Information
The New Language of Science
Weidenfeld & Nicolson 2003



Electric Information

The date and time of the birth of electric information are certain: Friday, 24 May 1844 at nine forty-five in the morning. The place, however, is less so, for the happy event occurred not at a point, but on a line connecting two points - a pair of wires from the Supreme Court Chamber in Washington, DC, to an office in Baltimore, forty-one miles away. In the middle of the courtroom, amid a tangle of wires, Samuel Finley Breese Morse sat hunched over a mysterious brass box, surrounded by a curious crowd of congressmen elbowing each other to get a better look over his shoulder. He was understandably anxious, because a year earlier Congress had awarded him the substantial sum of $30,000 to mount a historic experiment, and if it failed, his bold project would be doomed. Urgently he fiddled with the complicated screws and levers needed to generate a series of electrical dots and dashes. In the style of the time the message he composed, which had been selected earlier by the young daughter of a friend, was the Biblical exclamation: 'What hath God wrought!' A few minutes elapsed while a colleague in Baltimore decoded the signal, and then, to the astonishment of the crowd and the relief of the inventor, the same message, faithfully copied and re-encoded, arrived back in Washington.

Three days later the New York Daily Tribune, presumably informed by mounted messenger, triumphantly described the event, and declared that the miracle of the annihilation of space had been accomplished. The support of the US Congress, and the publicity resulting from the spectacular demonstration, touched off the furious growth of telegraphy. Private enterprise rushed in to capitalize on what the public purse had subsidized. Of course, Morse had not been the only inventor to try his hand at applying new discoveries in electricity and magnetism to the problem of communication.

A number of rivals in France, Germany and England had been hard on his heels, and in some respects even ahead of him, but eventually his system, which he defended vigorously in countless patent suits and priority battles on both sides of the Atlantic, had triumphed over theirs. 'It would not be long', Morse prophesied, 'ere the whole surface of this country would be channelled for those nerves which are to diffuse, with the speed of thought, a knowledge of all that is occurring throughout the land, making, in fact, one neighbourhood of the whole country.' (A hundred years later Marshall McLuhan, the prophet of the information age, would echo the obvious metaphor: 'Today, after more than a century of electric technology, we have extended our central nervous system itself in a global embrace, abolishing both space and time as far as our planet is concerned.') In fact, Morse underestimated the range of his invention. Within twenty years telegraph cables crossed not only the continent, but the Atlantic Ocean as well to create the very first world-wide web. Information was its staple, the Morse code its language.

At the time of the demonstration in Washington, Morse, a Massachusetts Yankee born near Boston and educated at Yale, was fifty-three years old. The telegraph, though years in gestation, was by no means his first accomplishment. In fact, Morse's career had started out in a very different direction. Until just seven years earlier he had been a respected portrait painter, with over three hundred works to his name, of which several now hang in the principal museums of America; but in 1837, at the peak of his powers, he had suddenly stopped painting. For the rest of his life he devoted himself to the brilliant idea that had first come to him during the enforced leisure of an ocean voyage from Europe back to America. The reasons for his abrupt change of heart are revealing. Its proximate cause was bitter disappointment at being passed over for a commission to paint a mural for the rotunda of the Capitol in Washington; but this setback, by itself, would not have been enough to deter a single-minded artist. Morse was, in fact, driven by other urges besides devotion to art. For one thing, he hungered for fame and fortune. For another, he was consumed by an abiding zeal to preserve the nation's culture from the levelling force of the prevailing populism. At heart he was a teacher and a reformer, and he was prepared to use any available means in the pursuit of his ideological goals.

The two paintings to which he had devoted the greatest effort were The House of Representatives (1822) and The Gallery of the Louvre (1831-33), two vast, wall-sized canvases. The subject of the first is the machinery of government and those who operated it; over a hundred personalities, painted from life, are shown assembled for a lamp-lit evening session of the House. In the second, Morse himself takes centre-stage, tutoring a young girl as she sketches, against a backdrop of the greatest European art then exhibited in the Louvre, including da Vinci's Mona Lisa and Rembrandt's Head of an Old Man. The motivation for both works is obvious: they were not meant to hang in homes or museums, nor to delight the elite; rather, their purpose was to instruct the public. Supported by detailed explanatory notes for the viewer, they carry educational messages - the first about democracy in action, the second about high culture.

Unfortunately for Morse, both paintings were roundly ignored by audiences in New York and Boston. Whatever their artistic merit might be, in their own time they were failures. Chagrined and bitter at being rejected first by the public and then by Congress, Morse renounced art and, after a short and unsuccessful campaign for public office (on an anti-abolitionist, anti-Catholic and rabidly xenophobic platform) turned to telegraphy. In a way, he thereby remained true to his original purpose. For, in the hurly-burly of mid-nineteenth-century America, its development suggested a new and thoroughly modern way of conveying information, of communicating and instructing - something that painting and sculpture had been doing since prehistoric times.

Before one could think of reaching the public, however, or of harnessing the lightning speed of electricity and the power of magnetism, one had to start with a system, and the simplest possible proved to be a wire and a linear sequence of dots and dashes - zeroes and ones. Thus, one of the first problems Morse tackled on his way to designing a telegraphic system was the question of coding: how do you translate verbal information into electrical signals, which in turn make visible or audible marks for the recipient? His initial solution to this fundamental problem of information technology turned out, in retrospect, to be far more efficient than his ultimate code, but too clumsy in actual practice. Since the telegraph grew out of the science of electricity and magnetism, it was natural to turn first to numbers, the alphabet of science. Numbers are easy to record as clicks of a counter or scratches of a pen. These, in turn, could be generated by electrical currents activating electromagnets. (One of the principal differences between Morse's design and those of his European competitors lay in its effective use of electromagnets, which had been perfected by the American physicist Joseph Henry.) Accordingly, Morse's first code was numerical.

In 1837 he assembled the numbers he assigned to specific words and phrases into a special dictionary. Telegrams would consist of nothing but lists of numbers, which the recipient would decode by reference to the dictionary - a system that is clearly capable of an impressive level of data compression. If, for example, the sentence: 'The eastbound train will arrive three hours late,' is assigned the numeral 3, then only three clicks (or two bits, since 3 is represented by the binary code 11) are required to transmit that eight-word, fifty-symbol message. It would be a hundred years before the theoretical efficiency of Morse's original system would be recognized. In his time, however, the human effort involved in looking up numbers in a big book and transcribing them manually proved to be an insurmountable bottleneck.
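The dictionary scheme can be sketched in a few lines of Python. This is only an illustration: the codebook entries and the function names are invented here, apart from the number 3 for the late-train sentence, which follows the example above.

```python
# Hypothetical codebook: entries 1 and 2 are invented for illustration;
# only the assignment of 3 to the late-train sentence comes from the text.
CODEBOOK = {
    1: "The westbound train is on time.",
    2: "The eastbound train is on time.",
    3: "The eastbound train will arrive three hours late.",
}

def encode(phrase: str) -> int:
    """Sender's side: find the number assigned to a whole phrase."""
    for number, text in CODEBOOK.items():
        if text == phrase:
            return number
    raise KeyError(f"phrase not in codebook: {phrase!r}")

def decode(number: int) -> str:
    """Receiver's side: look the number up in the same dictionary."""
    return CODEBOOK[number]

message = "The eastbound train will arrive three hours late."
number = encode(message)
binary = format(number, "b")   # 3 is written as '11' in binary

print(number, binary, len(binary))  # 3 11 2
print(decode(number) == message)    # True
```

The compression is real, but so is the cost the passage describes: both ends must hold the same book, and every lookup is a search through it.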

Undeterred, Morse pushed on to invent the alphabetic code that bears his name. While infinitely more flexible than the numerical code, its messages are much longer. This trade-off led inevitably to the international style called 'telegraphese', in which a message might read: 'Eastbound delay 3 h stop.' But even this compact sentence, transmitted letter by letter, is relatively long and costly, and needed to be kept as brief as possible. Time, after all, is money. So Morse confronted the problem that is at the root of modern information theory: what is the most efficient way to choose symbols, composed of dots, dashes and spaces, for the letters of the alphabet? The ingenious way in which he answered this question, and the way it was tackled a century later, illustrate in microcosm the difference between engineering and science, between practice and theory. Then, it was a triumph of Yankee resourcefulness; today it bears witness to the power of mathematics.

The principle is obvious: the most efficient code assigns short symbols to the common letters, and long symbols to the rare ones. But what is common, and what is rare? What is the order of the frequency with which letters appear in English? Cryptographers need this information, as Edgar Allan Poe demonstrated in his popular story 'The Gold Bug' a year before Morse's experiment in Washington. One way to gather such statistics is to select a text, and simply count the number of times each letter appears, but while this method works well for the three or four most frequent letters, it becomes successively less reliable for the uncommon ones, such as Q, X and Z, unless the reference text is extremely long. Morse's pragmatic solution, which he hit upon five years before the appearance of Poe's story, was quicker: he walked into a newspaper office and counted the number of letters in each compartment of the printer's type box. Presumably decades of experience had reduced its contents to an efficient compromise between supply and demand. Since he found more Es than any other letter, E is represented by a single dot, followed by T which merits a dash. X, Y and Z, on the other hand, whose compartments in the type box were relatively empty, drew four symbols each.
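The counting method, and its weakness on short texts, can be tried out directly. The sketch below is an illustration under stated assumptions: the function name is invented, the sample is the famous first message, and the handful of symbols shown are taken from the modern International Morse code, which may differ in detail from the code Morse himself used.

```python
from collections import Counter

# A few International Morse symbols (the modern code, assumed here for
# illustration; Morse's own code of the 1840s differed in some letters).
MORSE = {"E": ".", "T": "-", "Q": "--.-", "X": "-..-", "Y": "-.--", "Z": "--.."}

def letter_frequencies(text):
    """Return relative letter frequencies, most common letter first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    return {letter: n / len(letters) for letter, n in counts.most_common()}

sample = "What hath God wrought"
freqs = letter_frequencies(sample)
most_common = next(iter(freqs))

# In this tiny sample H comes out on top and E never appears at all,
# which is why a short reference text gives unreliable statistics.
print(most_common, round(freqs[most_common], 2))

# Short symbols for the common letters, long ones for the rare.
for letter, symbol in MORSE.items():
    print(letter, symbol, len(symbol))
```

An eighteen-letter sample ranks H first; only a very long text, or a printer's type box, settles on E.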

The theory that would vindicate Morse's rough-and-ready method a century later was devised by the American mathematician Claude Elwood Shannon. Shannon, who was born in Michigan, earned his PhD at MIT and worked at the AT&T Bell Telephone Laboratories in New Jersey for fifteen years before returning to MIT to teach. He died in February 2001 at the age of eighty-four, laden with honours and revered as the legendary founding father of the cyber age.

In personality, Shannon was the polar opposite of Samuel Morse. He was self-effacing, carefree, retiring and cerebral where Morse was self-important, dour, combative and practical. Both men overflowed with energy and determination, but because Shannon tasted success early in life he never acquired the bitterness Morse was able to shed only at an advanced age when he had finally captured the fame that had eluded him for so long. Their approaches to science differed in the way painting (Morse's art) differs from music (Shannon's): while painting tends to be holistic and synthetic, and is said to be right-brained, music is more analytical, focused on details, and supposedly left-brained. A painting presents itself to the eye in one go, whereas a composition is heard in a linear fashion, one note at a time. Thus Morse and Shannon embody complementary approaches to a technical problem.

The quality that distinguished Shannon was playfulness. Like many scientists he was fascinated by games, puzzles and tricks, but unlike most people, once he got a hold of one, he would persist until he had mastered it and discovered its mathematical essence. To borrow a phrase from James Clerk Maxwell, who shared the same passion, Shannon never stopped until he understood 'the "go" of it'. A reporter who visited him at home after retirement found him surrounded by marvels like the mind-reading machine mentioned in the previous chapter, innumerable normal and strange musical instruments, a collection of chess-playing computers, a petrol-powered pogo stick, a two-seated unicycle, a hundred-bladed penknife, and a computer called THROBAC that calculated in Roman numerals (which was similar in principle to the electrical Marchant desk calculator that introduced me to the art of computing half a century ago). Shannon's most enduring hobby was juggling, which he practised as a young man at Bell Labs while gleefully riding his single-seater unicycle through the quiet, night-time halls. He built an illusion called the 'No-Drop Juggling Diorama', and wrote a learned treatise on the 'Scientific Aspects of Juggling', complete with poetic epigraphs, historical references reaching back to 2040 BC, schematic diagrams, mathematical theorems, and the design of a diagnostic instrument called a 'jugglometer'. Unlike Morse, who was driven by the search for recognition, Shannon maintained: 'I've always pursued my interests without much regard to financial value or value to the world. I've spent lots of time on totally useless things.'

In science, though, as biologist Francois Jacob remarked, seemingly insignificant puzzles can lead to deep insights. This certainly proved to be the case with Shannon's investigation of the efficiency of communications channels. Its success was largely due to the care with which he defined and delimited the problem. Figure 1 of his monograph sets the stage: five boxes in a row are connected by arrows from left to right, and labelled, in succession, 'Information Source', 'Transmitter', 'Channel', 'Receiver' and 'Destination'. (There is also a sixth box off to the side, ominously marked 'Noise' and connected to the Channel, but I'll come back to that later.) This simple picture inspired my own questions in chapter 1: 'What mediates between the atom and the brain? What agency originates in the atom, or, for that matter, anywhere in the material world, and ends up shaping our understanding of it?' My answer, 'information', also figures in Shannon's diagram.

The opening of the second paragraph of his seminal 1948 paper 'A Mathematical Theory of Communication' - a work variously likened to the Magna Carta, Newton's laws of motion and the explosion of a bomb - is crucial, and worth recalling:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem.

By blithely ignoring the meaning of information Shannon succeeded in constructing a complete mathematical theory.

Circumscribed though it is, the theory nevertheless describes a large gamut of phenomena. The information can be written, typed, spoken, sung, played on an instrument, painted, photographed or televised; the channel can be a pair of wires, a light beam, a band of radio frequencies, or any other device for relaying messages; the source and the destination can be people or machines. In order to measure in a consistent way the amount of information flowing through the channel, Shannon, like Morse, had to begin with the problem of coding, and in picking zeroes and ones, those atoms of information, he ended up with a far simpler system. For each choice between the two he chose the word 'bit', and, in order to generalize the concept, he immediately adjusted the original definition so he could use the bit as the unit of information under all possible circumstances.

The key to Shannon's adapted definition of information is a property of the logarithm (to the base 2) that we discovered at the end of chapter 10: The log of the number of messages that can be sent is equal to the length of the string. This simple rule prompted Shannon to define the information content of any set of messages as the log of the number of possible messages, and to call the unit of information the 'bit'. This definition agrees with the original bit-counting definition, because that's what it was designed to reproduce, but it works even in more complicated circumstances. Imagine, for example, substituting dice for coins. You can communicate with a friend by using the numbers on the faces of a die. There are exactly six different messages that can be conveyed with each die. How much information does each throw convey? According to Shannon's definition, the answer is log 6, which comes out at about 2.585 bits. (The result makes sense, because 6 is halfway between 4 and 8, and its log correspondingly falls between 2 and 3. The specific number can be verified by punching a calculator: 2^2.585 = 6.000.) One throw of a die, therefore, is more informative than the toss of two coins, but less than a toss of three - an intuitively reasonable result. In this fashion the definition of the bit is broadened to accommodate fractional values.
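The arithmetic of the die can be checked in a few lines of Python instead of on a calculator. The function name is invented for this sketch; the definition it implements, the base-2 log of the number of equally likely messages, is the one given above.

```python
import math

def information_bits(num_messages: int) -> float:
    """Information content, in bits, of one choice among equally likely messages."""
    return math.log2(num_messages)

print(information_bits(2))            # one coin toss: 1.0 bit
print(information_bits(4))            # two coin tosses: 2.0 bits
print(round(information_bits(6), 3))  # one throw of a die: 2.585 bits
print(information_bits(8))            # three coin tosses: 3.0 bits

# The check from the text, run in reverse: 2 raised to 2.585 is about 6.
print(round(2 ** 2.585, 3))           # 6.0
```

The die lands between two coins and three, as the passage argues it should.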

The resemblance of Shannon's definition of information (the log of the number of possible messages) to Boltzmann's formula for entropy (the log of the number of ways of rearranging an atomic system) is not accidental. As Boltzmann himself suggested long before Shannon was born, entropy measures the missing information about the system - the information one could possibly have, but doesn't. Much ink has been spilled over the significance, or lack of significance, of the connection between information and entropy, but in the end Boltzmann's intuition was reliable: information and entropy are different ways of expressing the same idea.

Shannon's simple information measure sports one last wrinkle. Recall that a bit is the amount of information in the throw of an honest coin, or in the choice between two equally probable outcomes. For the case that the coin is weighted, or that the two choices of a question in the game of Twenty Questions are not equally probable, Shannon devised a formula for information content that involves not only the logarithm, but probabilities as well. Compare, for example, two players trying to guess a town in the US. The first one successively divides the country into two equal portions. With each answer, whether it is yes or no, she narrows down the possible region by half. As we saw, with twenty questions she can divide the country into more than a million patches - with absolute certainty. The second player decides to follow the logic of geography instead. So he begins by determining the state in which the town is located. The answers to his questions are not equally probable, but will be no in forty-nine cases out of fifty. Assuming for simplicity that each state has fifty counties, and that the towns are uniformly spread over the country, the player must use up 49 x 49 = 2401 questions just to be able to identify the county with certainty. Shannon's improved formula takes the unequal probabilities of the answers into account, and predicts that the second player requires many more questions than the first in order to elicit the same information. Accordingly, it assigns a smaller information content to each of the second player's questions.
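The passage does not spell the improved formula out. In its standard form it is the sum of -p log p over the possible answers, with the log taken to base 2, and that is what the sketch below assumes. The function name is invented for illustration.

```python
import math

def entropy_bits(probabilities):
    """Average information per answer, in bits: the sum of -p * log2(p)."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# First player: every question splits the possibilities in half.
even_question = entropy_bits([0.5, 0.5])

# Second player: "Is it in this state?" is answered yes one time in fifty.
state_question = entropy_bits([1 / 50, 49 / 50])

print(even_question)             # 1.0 bit per question
print(round(state_question, 3))  # about 0.141 bits per question

# With equal probabilities the formula falls back to the plain logarithm.
print(round(entropy_bits([1 / 6] * 6), 3))  # 2.585, the die again
```

On this reckoning each of the second player's questions is worth roughly a seventh of a bit, which is why he needs so many more of them.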

Today, the final, modified recipe for measuring the amount of information, called 'Shannon information', serves as the cornerstone of the vigorous science of information theory. As an industrial tool, this theory's chief aim is to help design machinery for transmitting large volumes of information, cheaply, accurately and quickly over the various channels that have been invented since Morse's primitive wires. It shows, for example, that 'block-coding' (assigning a short word or number to longer phrases) is far more efficient than 'letter-coding' (assigning a different symbol to each letter). Ordinary English requires on average about 28 bits per word. But suppose you insist on restricting yourself to much shorter strings of no more than 14 bits instead. If each string were assigned to a specific word, the system could handle 16,384 (which equals 2^14) different words or phrases - plenty for most circumstances. It appears, in short, that Morse's first idea, to transmit messages by means of a numerical code linked to a dictionary of words, was perfectly sound.
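The capacity of a fixed-length block code is easy to verify. The short sketch below uses invented function names and simply takes the figures of 14 and 28 bits from the passage above.

```python
def codewords(bits: int) -> int:
    """Number of distinct strings of exactly `bits` binary digits."""
    return 2 ** bits

def bits_needed(vocabulary_size: int) -> int:
    """Shortest fixed string length that gives every entry its own codeword."""
    return (vocabulary_size - 1).bit_length()

print(codewords(14))       # 16384 words or phrases
print(bits_needed(16384))  # 14
print(bits_needed(16385))  # 15: one extra entry costs a whole extra bit
print(14 / 28)             # 0.5: half the 28 bits per word quoted above
```

Integer arithmetic is used for `bits_needed` so that the result stays exact for any vocabulary size.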

More practically, especially since the Morse code is still in use today, one might ask whether its dots and dashes are efficiently assigned. In order to answer this question, it is necessary to take into account the frequencies of letters in English, i.e. the probabilities of 'E' and 'T' occurring, versus 'X' and 'Z'. Shannon's formula does that, in the same way that it accounts for the different probabilities of throwing heads and tails with a biased penny. Before its invention, the problem could not have been solved with mathematical rigour, because a measure of information was missing. Using Shannon's definition, together with the theorems he proved, it has been shown that reshuffling of dots and dashes could improve the Morse code by no more than 15 per cent. Morse's rough-and-ready scheme turned out better than it had a right to be.

Shannon's theory has put Morse's intuitive inventions on a scientific basis. Its real power, however, comes out of the sixth box in his scheme - the one marked 'Noise'. Noise introduces errors, uncertainty, losses and inefficiency into any system. In dealing with robust problems of civil engineering, or the design of big machines, noise is often neglected for the sake of simplifying the problem. In information theory, on the other hand, the noise is often at least as strong as the signal itself, or even stronger, and cannot be ignored. In fact, an information theory that leaves out the issue of noise turns out to have no content.




