John von Neumann – The Man from the Future


Henrik Warne

Posted on July 7, 2024


Before I read The Man from the Future by Ananyo Bhattacharya, I only knew about John von Neumann in two contexts: that computers use the von Neumann architecture, and that he appeared in a story about a mathematical problem I remember from many years ago. After reading it, I understand what a genius he was, and how much of science in the 20th century he influenced. He deserves to be better known than I think he is, and this is a great book to learn about him.

The von Neumann architecture means instructions and data are both stored in the same kind of memory, and instructions are fetched from memory and executed in order. This is taken for granted now, but this way of organizing computers was not a given when computers were invented.

The story of the mathematical problem I remember is this:

Two trains are 60 kilometers apart, traveling towards each other on the same track. Each train is moving at a constant speed of 30 kilometers per hour. A fly starts at the front of one train and flies back and forth between the two trains at a constant speed of 60 kilometers per hour. The problem is to determine the total distance the fly travels before the trains collide and the fly is crushed between them.

One way to solve it is by summing the infinite series representing the fly’s back-and-forth trips. However, there is an easier way to solve it. Since each train is moving towards the other at 30 kilometers per hour, the combined speed at which the distance between them is closing is 60 kilometers per hour. Given that they are 60 kilometers apart, they will collide in 1 hour. Because the fly is flying at 60 kilometers per hour, and the trains will collide in 1 hour, the fly will travel a total distance of 60 kilometers.
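Here is a small Python sketch (my own, not from the book) that computes the answer both ways – by summing the fly’s individual trips until the trains meet, and by the shortcut:

```python
def fly_distance_by_series(gap=60.0, train_speed=30.0, fly_speed=60.0, legs=50):
    """Sum the lengths of the fly's individual back-and-forth trips."""
    total = 0.0
    for _ in range(legs):
        # Time until the fly meets the oncoming train: the gap between them
        # closes at (fly_speed + train_speed).
        t = gap / (fly_speed + train_speed)
        total += fly_speed * t
        # Meanwhile both trains kept moving, so the gap between them shrank.
        gap -= 2 * train_speed * t
        if gap <= 0:
            break
    return total

def fly_distance_by_shortcut(gap=60.0, train_speed=30.0, fly_speed=60.0):
    """The trains meet after gap / (2 * train_speed) hours; the fly flies the whole time."""
    return fly_speed * gap / (2 * train_speed)

print(fly_distance_by_series())    # converges to ~60.0
print(fly_distance_by_shortcut())  # 60.0
```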

When this problem was posed to von Neumann, it didn’t take long for him to come up with the correct answer. The person who posed the problem was impressed by von Neumann’s quick response: “Ah, you came up with the short-cut for calculating the answer, instead of summing the series”. And von Neumann responded: “No, I summed the series in my head”.

I read this story a long time ago. It is probably apocryphal, but I like the math problem, and I’ve always remembered the story.

Early Years

János Neumann was born in Budapest in 1903 into a Jewish family. In 1913, his father Max was awarded a hereditary title from the Austrian emperor Franz Joseph I. This is the origin of the von in the name. When János moved to the United States, he anglicized his name, and became John von Neumann.

His extraordinary mind for mathematics was apparent very early on, and he was only 17 when he published his first mathematical paper (on zeros of Chebyshev polynomials). In 1919, there was a communist coup in Hungary. The communist reign lasted only 133 days, but von Neumann would remember that chaotic period, and he remained a life-long anti-communist.

Mathematical Upheaval

At the beginning of the 20th century, there was a foundational crisis in mathematics. The roots of it came from discovering a flaw in Euclid’s Elements, the standard textbook on geometry for centuries. In it, there were five axioms believed to be self-evident. By building on those axioms in logical steps, more advanced results (like Pythagoras’ theorem) could be proved. This axiomatic method was the cornerstone of mathematics.

In the 1830s, mathematicians showed that Euclid’s fifth postulate, the parallel postulate, does not have to be true: perfectly consistent geometries exist in which it fails. The postulate states that if two lines are drawn that intersect a third line so that the sum of the interior angles (call them a and b) is less than 180 degrees (two right angles), then the lines must intersect at some point. On the other hand, if a and b do sum to 180 degrees, then they never meet (and are thus parallel).

The parallel postulate is true on flat surfaces, i.e. in Euclidean geometry, but it is not true in hyperbolic geometry, where the surface can curve, like a saddle. In the 1850s, Bernhard Riemann introduced spaces with any number of dimensions, hyperspace.

By the end of the 19th century, many other theorems and proofs from Euclid’s geometry were being questioned. David Hilbert set out to rebuild geometry from scratch in a much more systematic and rigorous way. In 1899 he published his book The Foundations of Geometry. At that time, some scientists were of the opinion that some questions could never be answered. Hilbert was of the opposite opinion – “we must know, and we will know”. Having successfully tackled geometry, he wanted to do the same for all of mathematics: make sure it rested on a solid base of irrefutable axioms and theorems.

However, this project ran into problems almost immediately. In 1901, the British philosopher Bertrand Russell discovered a paradox in set theory. Some sets are straightforward, for example the set of all possible cheesecakes. This set does not contain itself, because the set is not itself a cheesecake. Let’s call these kinds of sets normal. But consider the complement – the set of everything that isn’t a cheesecake. That set is a member of itself (a set is not a cheesecake either). Let’s call sets that are members of themselves abnormal. So far, so good.

Now let’s form the set of all normal sets, and call it R. If R is normal, it should be contained in R, and thus be abnormal (because it would contain itself). On the other hand, if R is abnormal (i.e. it is a set that contains itself), it would not be contained in the set of all normal sets (itself), and therefore be normal. This is Russell’s paradox.

This has the same structure as The Liar’s Paradox: “This sentence is false”. If the sentence is true, it is false, but if it is false, it is true. In both cases, the problem stems from the self-referential part of the paradox.

Russell’s paradox threatened Hilbert’s project of putting mathematics on more rigorous grounds. “If mathematical thinking is defective, where are we to find truth and certitude?” he asked. Von Neumann’s solution to the problem came out in a paper in 1925. In it, he lists, on one page, all the axioms needed to build up set theory. To avoid Russell’s paradox, he distinguishes between sets and classes. A class is defined as a collection of sets that share a property. There is no “set of all sets that are not members of themselves”, but there is a “class of all sets that are not members of themselves”. This class is not a member of itself, because it is not a set (it’s a class).

This development was to Hilbert’s liking. In 1928, he challenged mathematicians to prove that mathematics is complete, consistent and decidable. Complete meant that all mathematical theorems can be proved from a finite set of axioms. In other words, given some fixed set of axioms, is there a proof for every true statement? By consistent, Hilbert meant that the axioms would not lead to any contradictions. That is, can only true statements be proved? And by decidable, he meant that there should be a step-by-step procedure (an algorithm) that, in finite time, can be used to show whether a particular mathematical statement is true or false. This last property became known as the Entscheidungsproblem (decision problem in German, since German was the language of science in those days). Within a decade, the answers were in, and they were not what Hilbert had hoped for: mathematics cannot be both complete and consistent, and it is not decidable!

Quantum Mechanics

At the same time, physics was going through its own crisis. In 1900, the German physicist Max Planck proposed that energy might be absorbed or emitted in discrete quantities, quanta. In 1905, Einstein theorized that light might be composed of a stream of particles, the first hint that quantum entities had both wave-like and particle-like properties. The Danish physicist Niels Bohr came up with a model of the atom where electrons could only occupy certain special orbits, and where jumps between orbits corresponded to fixed differences in energy.

To try to describe how atoms behaved, Werner Heisenberg came up with “matrix mechanics” in 1925. He wanted a theory that would explain experimental results. In experiments, scientists would “excite” atoms, for example by vaporizing a sliver of material in a flame, or by passing a current through a gas. As a result, light was produced. There would be characteristic spectral lines (with given frequencies and intensities) for each element. Heisenberg proposed that the difference between the initial and final energy levels of the electrons accounted for the frequencies in the atomic emission lines. The possible transitions between levels could be represented in matrix form (with an infinitely large matrix). Since there could be different paths from the initial to the final levels (for example via intermediary levels), to get the probabilities of all possible transitions he multiplied the contributions of the individual transitions and summed over the intermediate levels.
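That bookkeeping – combining every way of getting from one level to another via the intermediate levels – is exactly what matrix multiplication does. A rough numerical illustration of just that point (my own, with made-up numbers, not real atomic data):

```python
import numpy as np

# Made-up "transition" matrix between three energy levels.
# A[i, j] is the contribution of a one-step transition from level j to level i.
A = np.array([[0.0, 0.3, 0.1],
              [0.3, 0.0, 0.5],
              [0.1, 0.5, 0.0]])

# Combining two steps means summing over every intermediate level k:
# (A @ A)[i, j] = sum_k A[i, k] * A[k, j]
two_step = A @ A

# The same sum written out explicitly, for going from level 0 to level 2:
explicit = sum(A[2, k] * A[k, 0] for k in range(3))
print(two_step[2, 0], explicit)  # identical
```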

At about the same time, Erwin Schrödinger came up with an entirely different way of describing atoms: a wave function, which can be written as an infinite sum (a superposition) of simpler wave functions. Just like Heisenberg’s model, this worked well to describe how atoms behaved experimentally. But how could two wildly different models both describe reality so well? Could they be shown to be the same? This is exactly what von Neumann did, with some help from Hilbert. I won’t pretend to understand all the mathematics, but it relates to operators, eigenfunctions, Hilbert spaces, and square integrable functions. In the end, von Neumann was able to show that the coefficients of the expanded wave function were the elements that appear in the state matrix, in other words: they were fundamentally the same theory!

Bombs

In 1930, von Neumann moved to the United States, after receiving an offer to work at Princeton University (and later at the Institute for Advanced Study). The same year, the Nazis became the second largest party in Germany. When they came to power in 1933, Jews were forced out from all parts of society. University maths and physics departments were particularly hard hit, with some 15 – 18 percent dismissed. Twenty of the ousted researchers were former or future Nobel laureates. Many of the researchers (including virtually all of the founders of quantum mechanics) moved to the United States. Almost instantly, the balance shifted from Germany to the US in terms of quality of scientific output. Before the Nazis, almost all important physics and mathematics results were published in German. After the war, the United States was the dominant force.

Von Neumann soon moved from Princeton University to the Institute for Advanced Study. One of his achievements there was to prove the ergodic theorem, which essentially bridges the gap between the microscopic behavior of individual particles in a system and the macroscopic properties observed in thermodynamics. In 1935, von Neumann became a father, when his daughter Marina was born. In 1936, Turing was visiting Princeton, and von Neumann read his paper “On Computable Numbers”. During the 1930s, von Neumann predicted there would be a war in Europe. In September 1941, before the United States entered the war, von Neumann wrote to his congressman: “The present war against Hitlerism is not a foreign war, since the principles for which it is being fought are common to all civilized mankind, and since even a compromise with Hitler would mean the greatest peril to the future of the United States.”

To calculate artillery projectile trajectories, many variables need to be considered. For example, long range projectiles fly through progressively thinner air as they gain altitude, so they experience less resistance to their motion. This, and a lot of other factors, meant that hundreds of multiplications were needed to calculate a single trajectory. Calculating the shock waves of bombs also required advanced mathematics. Von Neumann had been involved in this research for a while when he became an official consultant for the Army in 1937. He then quickly became more and more involved, as he showed what he could do. One of his contributions was to show that the damage from a bomb is far greater if it is detonated in the air above the target, rather than on the ground, because the shock wave from an airburst reaches a much larger area. The general principle was known before, but he showed that the effect was much larger than previously thought. He also improved the accuracy of the calculations for the optimal altitude of a bomb’s detonation.
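To get a feel for why a single trajectory took hundreds of multiplications, here is a minimal sketch of my own (a toy model, not the Army’s actual ballistics calculations) that steps a projectile forward in time with drag that weakens as the air thins with altitude:

```python
import math

def trajectory_range(v0=800.0, angle_deg=45.0, dt=0.05):
    """Step a projectile forward with altitude-dependent air drag (toy model)."""
    g = 9.81                      # gravity, m/s^2
    k0 = 0.0003                   # made-up drag coefficient at sea level, 1/m
    scale_height = 8500.0         # air density falls off roughly exponentially, m
    x, y = 0.0, 0.0
    vx = v0 * math.cos(math.radians(angle_deg))
    vy = v0 * math.sin(math.radians(angle_deg))
    while y >= 0.0:
        speed = math.hypot(vx, vy)
        k = k0 * math.exp(-y / scale_height)   # thinner air higher up -> less drag
        ax = -k * speed * vx
        ay = -g - k * speed * vy
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
    return x

print(f"Range: {trajectory_range():.0f} m")
```

Each time step is only a handful of multiplications, but a full flight needs thousands of steps – and a firing table needs thousands of trajectories.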

In 1943, Robert Oppenheimer wrote to von Neumann, asking for help with the atom bomb project. At that time, there were two alternative ways of triggering the nuclear explosion: a gun-type design (a “bullet” of fissile material is fired into a target piece to start a nuclear chain reaction), and an implosion design (high explosives around a core are detonated to compress the core and trigger the nuclear reaction). For the implosion device to work, it was important that the core was compressed evenly from all sides. When von Neumann arrived at Los Alamos, the gun-type was the primary option. His main contribution was to come up with a design of wedge-shaped charges around the plutonium core that would ensure the compression happened fast enough. This also meant that less plutonium would be needed for an equivalent yield, compared to the gun-type design, and the focus switched to the implosion design.

Computers

The calculations needed when developing the atom bomb were getting out of hand. Von Neumann knew that automatic calculating machines were being developed, and he became the chief advocate for using them at Los Alamos. He became aware of the ENIAC (Electronic Numerical Integrator and Computer) project after a chance meeting while waiting for a train in 1944. ENIAC was then more than a year away from being completed, but von Neumann immediately saw its usefulness. He was by then the second most famous scientist in the United States (after Einstein), and he had considerable influence in government and military circles. His first contribution was to ensure continued funding for the project.

Not only did he have the necessary connections and influence, and see the need for a computer to perform the calculations; he was also perhaps the one person with the best understanding of the mathematical and logical underpinnings of the modern computer, since he was very familiar with both Gödel’s and Turing’s work.

Gödel

In 1931, Gödel showed that if arithmetic is consistent, then there are true statements in arithmetic that cannot be proved – i.e. arithmetic is incomplete. The proof uses a variation of the liar’s paradox. Consider the statement “This statement is not provable”. Let’s call this Statement A. Suppose Statement A could be proved. Then it would be false (since it asserts that it cannot be proved). That would mean a false statement could be proved (which would make the system inconsistent). On the other hand, if Statement A cannot be proved, then the statement is true (because it says that it is not provable). So then we have a true statement that cannot be proved. This means either arithmetic is inconsistent (which would make it useless, since false statements could be proved), or it is incomplete (i.e. it has true statements that cannot be proved).

So far, the above is just a statement in logic, not arithmetic. But Gödel cleverly expressed the same idea in arithmetic, using what are now called Gödel numbers. He came up with a system of numbering all axioms and theorems in Principia Mathematica. Furthermore, in his system, a given logical operation would always correspond to the same arithmetic operation. For example, assume that the Gödel number of the statement “All swans are white” is 122. The negation of the statement is “Not all swans are white”, and might have a Gödel number of double that, i.e. 244. This property (the Gödel number doubles) would hold for all negations, not just for this specific statement. Furthermore, each Gödel number can be unambiguously decoded back to its original expression.

All logical operations in this system had corresponding arithmetical operations. A proof is a series of logical statements linked together, so with his system Gödel was able to turn proofs into their equivalent arithmetic operations. Any proof can then be checked by simple arithmetic. With this in place, he produced an arithmetic statement that mirrored the phrase “This statement is not provable”, and showed the result above using arithmetic alone. In other words, the language of mathematics could be used to make meta-statements about mathematics.
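A toy version of the encoding idea (my own illustration – Gödel’s actual scheme is more elaborate, and the doubling-for-negation above is just an example): give each symbol a number, then pack a whole formula into a single integer using prime exponents, so it can always be decoded back unambiguously.

```python
def primes(n):
    """First n primes (naive, but fine for short formulas)."""
    ps, cand = [], 2
    while len(ps) < n:
        if all(cand % p for p in ps):
            ps.append(cand)
        cand += 1
    return ps

SYMBOLS = {'0': 1, 'S': 2, '=': 3, '+': 4, '(': 5, ')': 6, '~': 7}
DECODE = {v: k for k, v in SYMBOLS.items()}

def godel_number(formula):
    """Encode a formula as 2^c1 * 3^c2 * 5^c3 * ..., one prime per symbol position."""
    n = 1
    for p, ch in zip(primes(len(formula)), formula):
        n *= p ** SYMBOLS[ch]
    return n

def decode(n):
    """Unique factorization lets us recover the original formula unambiguously."""
    out = []
    for p in primes(64):          # more than enough positions for this demo
        if n == 1:
            break
        exp = 0
        while n % p == 0:
            n //= p
            exp += 1
        out.append(DECODE[exp])
    return ''.join(out)

g = godel_number('S0=S0')         # "the successor of 0 equals the successor of 0"
print(g, decode(g))               # 808500 S0=S0
```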

By making arithmetic talk about arithmetic, Gödel dissolved the distinction between syntax and data. He also showed that numbers can represent logical operations, just like instructions in modern day computers. And the memory addresses of instructions are reminiscent of Gödel numbers.

Turing

Gödel showed in 1931 that Hilbert’s first two questions had negative answers. Five years later, Turing demonstrated that mathematics is not decidable. To do this, he invented an imaginary machine, which we now recognize as a computer, long before computers were invented.

The Turing machine, as it is now called, is very simple. It consists of an infinite tape, with squares that can each contain a symbol, or be empty. The machine has a read/write head that can read a symbol from the tape, erase it, and write a new symbol. The head can also move the tape one square to the left or the right. The machine also has an m-configuration: its current state, together with the instructions for what to do in that state. For example: if reading a 0, erase it and write a 1, then move left. Today, we recognize the m-configuration as its program.
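A minimal simulator of the idea (my own sketch, not Turing’s original notation). The m-configuration is just a lookup table from (state, symbol) to (symbol to write, direction to move, next state):

```python
def run_turing_machine(program, tape, state='start', blank=' ', max_steps=1000):
    """program: dict mapping (state, symbol) -> (write, move, next_state)."""
    cells = dict(enumerate(tape))     # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        symbol = cells.get(head, blank)
        if (state, symbol) not in program:
            break                     # no rule for this situation: halt
        write, move, state = program[(state, symbol)]
        cells[head] = write
        head += 1 if move == 'R' else -1
    return ''.join(cells[i] for i in sorted(cells))

# Toy program: flip every 0 to 1 and every 1 to 0, moving right, halt on blank.
flipper = {
    ('start', '0'): ('1', 'R', 'start'),
    ('start', '1'): ('0', 'R', 'start'),
}
print(run_turing_machine(flipper, '010011'))  # -> 101100
```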

Using this very simple machine, Turing built a set of instruction tables (i.e. subroutines, although he didn’t call them that) to search for and replace a symbol, to erase all symbols of a particular kind, etc. Using these, he showed how to build a “universal computing machine”, itself a Turing machine, which is capable of simulating any other Turing machine. Given the other Turing machine’s m-configuration (encoded as symbols on a tape), and that Turing machine’s input tape, the universal computing machine will output the same result as the other machine.

Armed with this universal Turing machine, Turing shows that the Entscheidungsproblem cannot be solved. To do so, he comes up with the halting problem. Assume that there is a Turing machine that can reliably answer whether another Turing machine will eventually halt on a given input. Call this H. Now create a new Turing machine, H’, that has H inside it. H’ will use H on its input, and if H answers “it will halt”, H’ goes into an infinite loop. On the other hand, if H answers “it will not halt”, H’ halts. Now run H’ with itself as the input. This leads to the logical impossibility of H’ both halting and not halting. Thus H’ cannot exist, which means H cannot exist. In other words, it is not possible to have a general procedure for deciding if a mathematical statement is true or false.
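The argument can be sketched in code. Suppose someone hands us a function halts(program, argument) that always answers correctly – this is the hypothetical H. The whole point of the argument is that no such function can ever actually be written:

```python
def halts(program, argument):
    """Hypothetical oracle H: returns True if program(argument) eventually halts.
    No correct implementation can exist -- that is what the argument shows."""
    raise NotImplementedError

def h_prime(program):
    """This is H': do the opposite of whatever H predicts about program(program)."""
    if halts(program, program):
        while True:        # H says it halts, so loop forever
            pass
    else:
        return             # H says it loops forever, so halt immediately

# Now consider h_prime(h_prime):
# - if halts(h_prime, h_prime) is True, h_prime(h_prime) loops forever: H was wrong
# - if it is False, h_prime(h_prime) halts at once: H was wrong again
# Either way H gives a wrong answer, so no such H can exist.
```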

The von Neumann Architecture

ENIAC became operational at the end of 1945. Instead of being used for artillery firing tables, it was used for solving partial differential equations for the hydrogen bomb project at Los Alamos. ENIAC’s program was fixed: changing it meant physically reconfiguring patch cables (like in old telephone exchanges). But the scientists behind it were already working on a successor. In June 1945, von Neumann wrote the First Draft of a Report on the EDVAC. Whereas ENIAC was not easily reprogrammable, the new architecture described in the report was. This was the first description of the modern design, where program and data are both stored in the same way in memory, which means the machine can easily be reprogrammed.
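A toy illustration of the stored-program idea (my own sketch, nothing like the actual EDVAC design): instructions and data sit side by side in the same memory, and a program counter fetches and executes instructions in order.

```python
def run(memory):
    """Tiny stored-program machine: memory holds both instructions and data."""
    pc = 0                                   # program counter
    while True:
        op, a, b = memory[pc]                # fetch the instruction at pc
        if op == 'HALT':
            return memory
        elif op == 'LOAD':                   # memory[a] = constant b
            memory[a] = b
        elif op == 'ADD':                    # memory[a] = memory[a] + memory[b]
            memory[a] = memory[a] + memory[b]
        pc += 1                              # execute in order

# Addresses 0-3 hold instructions, addresses 4-5 hold data.
memory = [
    ('LOAD', 4, 2),        # put 2 in cell 4
    ('LOAD', 5, 40),       # put 40 in cell 5
    ('ADD', 4, 5),         # cell 4 = 2 + 40
    ('HALT', 0, 0),
    0,                     # data cell 4
    0,                     # data cell 5
]
print(run(memory)[4])      # 42
```

Because the program itself lives in ordinary memory, reprogramming the machine just means writing different values into memory – no rewiring needed.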

Von Neumann was not the only one thinking about organizing a computer this way, but his clear description crystallized the thinking. His familiarity with both Gödel’s and Turing’s work was most likely helpful. The report was widely distributed, and helped keep this type of design in the public domain (open source), as opposed to being patented. This undoubtedly helped speed up the development and adoption of the design.

Other Firsts

Stanislaw Ulam, a friend and colleague of von Neumann, invented the Monte Carlo method of calculating probabilities using repeated simulations. While convalescing in hospital, he began to play solitaire to relieve the boredom. He tried to calculate the probability of playing out a hand to completion, but the calculation quickly became intractable. Then he realized he could get a good estimate simply by keeping track of the number of successful plays in 100 tries. And so Monte Carlo simulations were born. Von Neumann was quick to apply this new technique to simulate chain reactions for bomb calculations at Los Alamos.
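The same idea in a few lines of Python (my own stand-in for the solitaire calculation): estimate the probability that a shuffled deck has no card left in its original position, simply by counting how often that happens over many random tries.

```python
import random

def estimate_no_fixed_point(trials=100_000, n_cards=52):
    """Monte Carlo estimate: shuffle repeatedly and count the successes."""
    deck = list(range(n_cards))
    successes = 0
    for _ in range(trials):
        shuffled = deck[:]
        random.shuffle(shuffled)
        if all(card != pos for pos, card in enumerate(shuffled)):
            successes += 1
    return successes / trials

print(estimate_no_fixed_point())   # ~0.368 (the exact answer tends to 1/e)
```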

To run Monte Carlo simulations, you need a source of randomness. Von Neumann came up with a way of producing a sequence of random numbers that he called “the method of middle-squares”. His method consisted of squaring an 8- or 10-digit number, then taking the middle digits as the next random number. That number in turn would be squared, and its middle digits became the next number, and so on. Of course, this is not truly random, but it is good enough for many purposes. Von Neumann later remarked “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin”.
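Here is a sketch of the method (my own illustration, using decimal digits for readability):

```python
def middle_square(seed, digits=8):
    """Generate pseudo-random numbers by repeatedly squaring and taking the middle digits."""
    n = seed
    while True:
        squared = str(n * n).zfill(2 * digits)   # pad so the "middle" is well defined
        n = int(squared[digits // 2 : digits // 2 + digits])
        yield n

gen = middle_square(12345678)
print([next(gen) for _ in range(5)])   # five pseudo-random 8-digit numbers
```

One known weakness: depending on the seed, the sequence can degenerate quickly (for example collapsing to zero), which is one reason the method is no longer used today.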

Von Neumann and his colleague Herman Goldstine invented flowcharts to describe how the simulation programs worked. Von Neumann’s wife Klári worked as a programmer on the project. In a program from April 1948, Klári used a subroutine to generate random numbers by von Neumann’s method of middle-squares. The invention of the subroutine is generally credited to the computer scientist David Wheeler, but Klári’s code made use of one at least a year earlier. In 1955, von Neumann noted that computing capacity had nearly doubled every year since 1945, and he was of the opinion that this trend would continue. This is noteworthy, since it predates Moore’s law by ten years.

Game Theory

Von Neumann also helped found the field of game theory. What does game theory mean? The game of chess is not a game in the game theory sense. As von Neumann explained, chess is a form of computation: in every situation, there is an optimal move. We may not be able to find what the optimal move is, but it exists. In a game-theoretic game, on the other hand, you must consider that the other player may bluff, and how other players will react to your actions. It all started in 1928, when von Neumann published the paper On the Theory of Parlour Games. In it he proves the minimax theorem. He considers a two-person game where one person’s gain is the other person’s loss – for this he coined the term zero-sum game. The strategy is to minimize a player’s maximum loss, and he proves that in every two-player zero-sum game, each player has a strategy (possibly randomized) that guarantees them the best achievable worst-case outcome.
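To see the minimax idea concretely, here is a toy example of my own (not from the book): for a small zero-sum game with no saddle point, brute-force search over the row player’s mixed strategies for the one that maximizes the worst-case expected payoff.

```python
# Row player's payoffs (the column player receives the negative): a zero-sum
# game with no saddle point, so the optimal strategy must be randomized.
PAYOFF = [[2, -1],
          [-1, 1]]

def worst_case(p):
    """Row player plays row 0 with probability p; the column player then
    picks whichever column is worst for the row player."""
    return min(p * PAYOFF[0][col] + (1 - p) * PAYOFF[1][col] for col in (0, 1))

# Brute-force search over mixed strategies for the maximin (the game's value).
best_p = max((i / 1000 for i in range(1001)), key=worst_case)
print(best_p, worst_case(best_p))   # ~0.4 and ~0.2: mix the rows 40/60, value 1/5
```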

In 1944, von Neumann and Oskar Morgenstern published the book Theory of Games and Economic Behavior, which became a bestseller, despite weighing in at a hefty 1200 pages. At the time of its writing, economists didn’t have a way to account for a person’s preferences. Von Neumann came up with a happiness/utility scale from 0 to 100 that could be used to account for them. The influence of utility theory and the notion of the rational, calculating individual introduced here spread far and wide.

In the book von Neumann also analyzes the game of poker. He simplifies the analysis by assigning each hand a score between 1 and 100, and by reducing the bids to just high or low. With these simplifications he is able to show the importance of bluffing. The purpose of bluffing is not so much to try to win with a bad hand, as it is to encourage the opponent to bet high with middle-range hands when you have a good hand.

The book found applications in many areas, for example how to handle monopolies, and eventually influenced fields such as evolutionary biology. Another application was Cold War planning, for example nuclear strategy, especially at the RAND (Research ANd Development) corporation in Santa Monica, California. RAND was started as a think tank for the US Army Air Forces. The subjects researched at RAND were completely aligned with von Neumann’s three obsessions at the time: computing, game theory and the bomb. One example of the research done at RAND is the duel, for example two aircraft about to shoot at each other. Each one wants to hold off firing until it is close enough to have a good chance of hitting the opponent, but at the same time, it still wants to fire first. This is a good example of a two-player zero-sum game. Another problem worked on at RAND was that of nuclear war. I was surprised to learn that right after World War II, there were many advocates of a pre-emptive nuclear strike against the Soviet Union.

Self-Replication

As if he wasn’t busy enough with all his various projects, von Neumann also took an interest in comparing biological machines to synthetic ones. This resulted in the book Theory of Self-Reproducing Automata. He was thinking about whether a machine could be constructed that could create a copy of itself. To those who doubted it was possible, he replied that organisms in nature reproduce themselves all the time.

Von Neumann used the universal Turing machine as his starting point. It can imitate any other Turing machine if it is given its instructions and input. What would it take for it to instead copy itself? Von Neumann’s answer is that three things are needed. First, it needs a set of instructions on how to build a new machine. Second, the machine needs a construction unit that can build the new machine by following the instructions. Finally, the machine needs a unit that can copy the instructions and insert them into the new machine. It is interesting to note that, five years before the discovery of the structure of DNA in 1953, and before scientists understood cell replication in any detail, von Neumann had identified the key steps needed for an entity to replicate itself.

After a while, von Neumann wondered if three dimensions were necessary, or if self-replication was possible in two dimensions. Inspired by Stanislaw Ulam, he developed what would become known as cellular automata. His self-reproducing automaton lives on an endless two-dimensional grid, consisting of squares, or cells. Each cell can be in one of twenty-nine different states. Further, each cell can only communicate with its four contiguous neighbors, and there are many rules governing how this communication works. Most cells start in a dormant state, but can be brought to “life” by neighboring cells, and can later also be “killed”.

Using this setup, he then builds units that perform the three tasks he identified as necessary for self-reproduction. He constructs a tape consisting of dormant cells (representing 0) and live cells (representing 1). Adding a control unit that can read from and write to tape cells, he is able to recreate a universal Turing machine in two dimensions. He then designs a constructing arm that can snake out to any cell, set it to the right state, and then withdraw. The design took a lot longer than he expected, and other pressing government tasks meant that he was never able to finish it.

After von Neumann’s death in 1957, Arthur Burks, who had worked with him on the ENIAC, completed the automaton by going through von Neumann’s notes. The complete machine fits in an 80×400-cell box, but has an enormous tail, 150,000 squares long, containing the instructions needed to clone itself. Start the clock, and step by step the behemoth starts work, reading and executing each tape instruction to create a copy of itself some distance away. The construction arm grows until it reaches a predefined point in the grid, then it starts depositing row upon row of cells to create its offspring. The first attempt to actually run his design on a computer was in 1994, but at the time it took impossibly long. On a modern laptop, though, it takes minutes to run. The most famous example of a cellular automaton was developed in 1970 – Conway’s Game of Life.
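Conway’s rules are much simpler than von Neumann’s 29-state design: a live cell survives if it has two or three live neighbours, and a dead cell with exactly three live neighbours comes to life. A minimal step function (my own sketch):

```python
from collections import Counter

def step(live_cells):
    """One generation of Conway's Game of Life.
    live_cells is a set of (x, y) coordinates of live cells."""
    # Count, for every cell, how many of its eight neighbours are alive.
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live_cells
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live_cells)}

# A "glider" that travels diagonally across the grid, one cell every four steps.
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
for _ in range(4):
    glider = step(glider)
print(sorted(glider))   # the same shape, shifted by (1, 1)
```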

Another Story

A few weeks after I finished reading the book, I came across this quote. The book has many great stories about von Neumann, but not this one, so I think it fits to include it here as well:

“There was a seminar for advanced students in Zürich that I was teaching and von Neumann was in the class. I came to a certain theorem, and I said it is not proved and it may be difficult. Von Neumann didn’t say anything but after five minutes he raised his hand. When I called on him, he went to the blackboard and proceeded to write down the proof. After that I was afraid of von Neumann.”

George Pólya

Conclusion

Von Neumann is such a fascinating character. He was involved in so many important scientific projects and breakthroughs that it truly boggles my mind. I particularly liked reading about the birth of the modern computer, and his influence on it. I also really enjoyed learning about the background and theoretical underpinnings from Gödel’s and Turing’s work.

The book is very well written, and full of interesting anecdotes. In addition to all the math, science and technology, there is also a good bit of history of the first half of the 20th century. Highly recommended if you are interested in the history of computing and science in general!
