Is digital storage the secret of life?
by John Walker
Memory Is Tough
It’s possible one needs to have been involved in computing for as long as I have (I wrote my first program for an electronic digital computer in 1967, but I built a pressboard Geniac computer before my age broke into double digits, and I did outrageous things with my Minivac 601 relay computer in the early 1960s) in order to fully appreciate how difficult a problem computer memory was over most of the history of computing. In these days of 512 MB DRAM modules, 300 GB hard drives, and 2 GB flash memory cards for digital cameras, it’s easy to forget that until the 1980s, the cost of random access memory dwarfed that of any other computer component, and programmers were consequently required to expend enormous effort and cleverness squeezing programs into extremely limited memory.
Figure: UNIVAC core memory plane from the Fourmilab Museum.
The reason for this is simple. While a CPU can be simplified at the expense of speed, every bit of random access memory requires physically fabricating a discrete object to hold the bit. (I exclude here historical footnotes such as Williams tubes and EBAM [electron beam addressable memory] as they were even more expensive and/or limited in capacity). When each bit was a ferrite core, through which some bleary-eyed human had to string three tiny wires, the reason for the high cost was obvious. (In the mainframe core memory era, I worked for a year on a project which ended up consuming about ten man-years to write a new operating system for a UNIVAC mainframe machine solely to avoid the need to buy another half-megabyte memory module. This made a kind of bizarre economic [if not strategic] sense, since all of the salaries of the implementors added up to far less than the memory would have cost.)
Even a monolithic solid-state dynamic RAM chip requires that you photolithographically fabricate a capacitor cell for every bit, and that means that for a 256 megabit DRAM you need to make a quarter of a billion memory cells, and if one doesn’t work, the chip is junk. It took human beings a great deal of intellectual effort and time to figure out how to do such an extraordinary thing, and we should never forget how recently it became possible.
When the first reasonably cheap medium-scale integrated circuits (for example, four NAND gates or two flip flops on a chip) appeared around 1970, it was obvious to me that it was now possible to build a simple CPU powerful enough to do useful things which would be inexpensive enough for an individual to afford. But memory was the sticking point—the only viable technology at the time was core memory, and enough core to do anything useful at all would cost more than a luxury car. It wasn’t until almost 1980 that the advent of the 16 Kb dynamic RAM made it possible to envision affordable bulk memory. Memory is tough.
Memory vs. Storage
At this point, I’m going to change terminology. Back in the mainframe era, IBM never used the word “memory” for computer data banks. Instead, they always used “storage”. The rationale for this is that it’s misleading to refer to what is nothing more than a bunch of boxes which store and retrieve bytes (computer storage) with the associative, pattern matching, highly parallel function of human memory. Calling the computer’s data bank “memory” attributes to it anthropomorphic powers it doesn’t have. I was always on IBM’s side in this debate, since every time I tried to explain computers to non-techies, it was difficult to get across that the much vaunted “computer memory bank” of science fiction was nothing more than a glorified collection of pigeon holes for bytes.
For the rest of this document, I’ll use the terms “digital storage” or just “storage” to refer to media (DRAM, hard drives, magnetic tape, optical discs, etc.) which store and retrieve digital data, and “memory” only when referring to associative pattern recall in the brains of animals.
What Is Life?
Life is one of those things, like pornography, which is difficult to precisely define, yet everybody knows it when they see it. Almost every handy-dandy definition of life seems to exclude some odd species that everybody agrees is alive, or includes something that obviously isn’t. In The Anthropic Cosmological Principle, John Barrow and Frank Tipler argue (pages 521–522) that under a commonly used definition of life, automobiles must be regarded as living creatures.
Since reading Frank Vertosick’s The Genius Within, I have become quite fond of his definition of life as systems which solve problems, and the attribution of intelligence to any organism (or species, or system) which adapts itself to novel conditions in its environment. This is something which does seem to be a universal property of life, but is absent in non-living matter.
Problem-solving, however, almost inherently requires the ability to, in some sense, learn from experience and record solutions to problems which worked in order to benefit from them in the future. This, in turn, would seem to require some kind of storage, and given the need for such storage to be robust in the face of real-world perturbations, digital storage in preference to analogue storage.
What Is Digital Storage?
Let me make clear the distinction between analogue and digital storage. “Digital” has nothing to do with electronics or computers—it simply means that information is encoded as a collection of discrete values which are robust in the presence of error. “Analogue” storage is an effectively continuous-valued quantity with no threshold value. Let’s consider a cell in a computer dynamic RAM chip. This might be a capacitor which is able to store, say, a million electrons. We can define zero as any number of electrons less than half a million, and one as any number greater than or equal to half a million. Assuming that we more or less discharge the cell when writing a zero and charge it up all the way for a one, when we read it back, we have a margin of error of half a million electrons before we’ll confuse a one with a zero or vice versa. Given that we can’t ever fully charge or discharge a cell, that each cell differs slightly in its capacity, that thermal noise and leakage are always affecting the number of electrons in the cell, and that we read out the number of electrons with sloppy analogue circuitry with its own tolerance and noise problems, only the large threshold between one and zero makes all of the myriad cells in the RAM chip work reliably.
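The threshold argument above can be sketched in a few lines of Python. This is a toy model rather than a description of any real DRAM: the million-electron capacity and half-million threshold come from the text, while the size of the write error and of the leakage (`sloppiness`, `max_loss`) are assumed values chosen only for illustration.

```python
import random

FULL_CHARGE = 1_000_000        # electrons in a fully charged cell (figure from the text)
THRESHOLD = FULL_CHARGE // 2   # at or above this count the cell reads as a one

def write_cell(bit, rng, sloppiness=100_000):
    """Charge or discharge the cell imperfectly (assumed error of up to 10%)."""
    target = FULL_CHARGE if bit else 0
    charge = target + rng.randint(-sloppiness, sloppiness)
    return max(0, min(FULL_CHARGE, charge))

def leak(charge, rng, max_loss=150_000):
    """Model leakage and noise as a random loss of electrons (assumed uniform)."""
    return max(0, charge - rng.randint(0, max_loss))

def read_cell(charge):
    """Sloppy analogue charge in, clean digital bit out."""
    return 1 if charge >= THRESHOLD else 0

rng = random.Random(1967)
bits = [rng.randint(0, 1) for _ in range(10_000)]
read_back = [read_cell(leak(write_cell(b, rng), rng)) for b in bits]
errors = sum(a != b for a, b in zip(bits, read_back))
print("bits stored:", len(bits), "errors on read-back:", errors)
```

In this model the worst-case disturbance is 250,000 electrons, well inside the half-million margin, so every bit reads back correctly however the noise falls.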
As an example of analogue storage, suppose we decided to be really clever and store twenty bits in each cell (a twenty to one gain in storage density) by charging up the cell with zero to a million (well, actually 1,048,576 = 2^20 to be precise) electrons, then reading back the precise value. Obviously, this wouldn’t work, not only due to the presence of noise but because it’s impossible to precisely count electrons. You might try to increase the storage density by, say, using four different charge levels to store two bits, but experience has shown you’re always better off using the smallest possible cell which permits robust distinction of a single bit; in any case the four level scheme, if it worked, would still be digital storage.
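To see how quickly the margin evaporates as more levels are packed into a cell, here is a rough simulation. It assumes evenly spaced charge levels and a uniform read-out noise of plus or minus 5,000 electrons; both are illustrative guesses, and only the 2^20 electron capacity is taken from the text.

```python
import random

FULL_CHARGE = 1_048_576   # 2**20 electrons, as in the text

def store_and_read(value, bits_per_cell, noise, rng):
    """Store `value` as one of 2**bits_per_cell charge levels, then read it back."""
    levels = 2 ** bits_per_cell
    spacing = FULL_CHARGE / (levels - 1)          # electrons between adjacent levels
    charge = value * spacing + rng.uniform(-noise, noise)
    decoded = round(charge / spacing)             # snap to the nearest level
    return max(0, min(levels - 1, decoded))

def error_rate(bits_per_cell, noise, trials=2_000, seed=601):
    """Fraction of random values that come back wrong."""
    rng = random.Random(seed)
    levels = 2 ** bits_per_cell
    wrong = 0
    for _ in range(trials):
        value = rng.randrange(levels)
        if store_and_read(value, bits_per_cell, noise, rng) != value:
            wrong += 1
    return wrong / trials

for bits in (1, 2, 8, 20):
    print(bits, "bits per cell: error rate", error_rate(bits, noise=5_000))
```

With this noise level, one and two bits per cell read back without error, because the gap between levels dwarfs the noise. At twenty bits per cell the levels are about one electron apart and almost every read is wrong.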
Life Requires Digital Storage
Now let’s start to weave together the definition of life as problem solver and the requirement for digital storage. I’ll begin with the following observation, which I think is beyond dispute.
Observation 1: Known Life Requires Digital Storage.
All known forms of life embody one or more forms of digital storage.
This is proved simply by noting that all known forms of life contain a genome encoded in DNA (or RNA for RNA viruses, if you consider them living organisms, which I don’t). DNA (or RNA) is, if you’ll excuse the expression, every bit as much a digital storage medium as a RAM chip, CD-ROM, or hard drive. In fact, in my unfinished oeuvre The Rube Goldberg Variations, I show in detail how arbitrary digital data may be stored in DNA and invent “music plants”, thereby enlisting biology in the “file sharing” wars. I will assume that “digital storage”, however implemented, is “reliable” or “robust”—in other words, that random sources of error don’t corrupt the information such that it becomes useless.
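As a concrete illustration of DNA as a digital medium, the sketch below packs arbitrary bytes into a base sequence at two bits per base. The mapping (A=00, C=01, G=10, T=11) and the function names are assumptions made for this example; it is not the encoding used in The Rube Goldberg Variations, and it ignores biological constraints a real scheme would have to respect.

```python
BASES = "ACGT"   # assumed mapping: A=00, C=01, G=10, T=11

def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

strand = bytes_to_dna(b"Hi")
print(strand)                 # CAGACGGC
print(dna_to_bytes(strand))   # b'Hi'
```

Each base takes one of four discrete values, so the round trip is exact. That exactness is what makes the medium digital rather than analogue.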
Now let’s slip the surly bounds of science and soar into the exhilarating blue sky of speculation. I now conjecture the following:
Conjecture 1: Life Requires Digital Storage.
All forms of life embody one or more forms of digital storage.
Here I’ve simply deleted the “known” to assert that any conceivable life form, in whatever environment, will require some form of reliable digital storage in order to perform the problem-solving which we’re using as the definition of life. This will eternally remain a conjecture, of course, since there’s always the possibility some space probe will happen upon a form of life that gets along without digital storage. But, based on present-day theories of the origin of life (with the possible exception of Freeman Dyson’s dual origin theory in Origins of Life, and to the extent metabolism alone can be considered life) digital storage is a requirement to avoid the “error catastrophe” which would destroy the replication of even the most primitive organisms. The raw error rate in DNA replication, about one mistake per hundred million (10^8) bases (before DNA proofreading and error correction, which reduces the final error rate by two or three additional orders of magnitude), is just about as high as it can be without triggering runaway accretion of errors which would destroy the genome and render daughter cells unviable. It is difficult to envision any form of analogue storage attaining such a low error rate.
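A back-of-the-envelope calculation shows why the copying error rate matters so much. The sketch assumes errors strike each base independently, so the chance of a flawless copy of a genome of N bases at per-base error rate e is (1 - e)^N. The rate of 10^-8 is the figure quoted above; the genome length of 4.6 million bases (roughly that of E. coli) and the two sloppier rates are assumptions added for comparison.

```python
import math

RAW_ERROR_RATE = 1e-8        # per base, the figure quoted in the text
GENOME_LENGTH = 4_600_000    # assumption: roughly the size of an E. coli genome

def expected_errors(rate, length):
    """Mean number of copying mistakes per replication."""
    return rate * length

def p_perfect_copy(rate, length):
    """Probability of an error-free copy, (1 - rate) ** length.

    Computed through log1p so that tiny rates do not lose precision.
    """
    return math.exp(length * math.log1p(-rate))

for rate in (RAW_ERROR_RATE, 1e-6, 1e-3):
    print(f"rate {rate:g}: "
          f"{expected_errors(rate, GENOME_LENGTH):.3g} errors expected, "
          f"P(perfect copy) = {p_perfect_copy(rate, GENOME_LENGTH):.3g}")
```

Under these assumptions a rate of 10^-8 yields a perfect copy about 95 times in 100. At one error per thousand bases, every copy carries thousands of mistakes and a perfect copy essentially never occurs.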
Life Is Difficult
Now we can begin to understand why it’s so difficult for life to initially appear, become complex, and begin to perform higher level computation. It comes back to the early days of computing: “Storage is tough”. The only way to implement reliable digital storage is to build a mechanism which stores each bit of information in a highly robust or redundant manner, and permits replication with a very low error rate.
It took biological evolution almost four billion years to evolve creatures smart enough to invent their own means of reliable digital information storage. Biology itself, in those billions of years, seems to have discovered only three such mechanisms:
- The genome.
- The central nervous system and brain.
- The vertebrate adaptive immune system.
Note that these developed in the order I’ve listed them, and highly-successful organisms exist today at each level of development. Neural netters might ponder the fact that there are a multitude of species with neural networks and ganglia which lack the chemical computer of their own immune system.
Interestingly, each of these mechanisms seems to work in the same way: massively parallel hill-climbing and selection by fitness. They are, in effect, implementations of precisely the same scheme in different hardware, adapted to the life-cycle of the species which possess them. Bacteria and other rapidly reproducing species use changes in their genome to adapt, either by mutation, exchange of genetic material, or recombination. Their data bank is the genome, and that suffices, and suffices very well. Paraphrasing Stephen Jay Gould, “Forget ‘the age of dinosaurs’, ‘the age of mammals’, and all that. It’s the age of bacteria, and it always has been.”
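The phrase “massively parallel hill-climbing and selection by fitness” can be made concrete with a toy model. Everything here is a stand-in chosen for illustration: genomes are short bit strings, fitness is the number of bits matching an arbitrary target, and the population size, mutation rate, and generation count are assumed values. No claim is made that any biological system works with these numbers.

```python
import random

def fitness(genome, target):
    """Number of positions at which the genome matches the target."""
    return sum(g == t for g, t in zip(genome, target))

def mutate(genome, rng, rate=0.02):
    """Copy a genome, flipping each bit with a small probability."""
    return [1 - g if rng.random() < rate else g for g in genome]

def evolve(target, pop_size=50, generations=200, seed=42):
    """Keep the fitter half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in target] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, target), reverse=True)
        history.append(fitness(population[0], target))
        survivors = population[: pop_size // 2]            # selection by fitness
        offspring = [mutate(rng.choice(survivors), rng)    # many climbers in parallel
                     for _ in range(pop_size - len(survivors))]
        population = survivors + offspring
    best = max(population, key=lambda g: fitness(g, target))
    history.append(fitness(best, target))
    return best, history

target = [1, 0] * 16
best, history = evolve(target)
print("best fitness, first and last generation:", history[0], history[-1])
```

Because survivors are carried over unchanged, the best fitness in the population can never fall from one generation to the next. The stored genomes are what let each generation build on the last.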
As organisms get larger and more complicated, their generation time increases and their population size shrinks, so there’s a need to adapt faster than pure genomic evolution permits. A neural net allows individual organisms to learn behaviours (memory) by making connections and adjusting synapse weights. This permits not only evolution adjusting behaviour through changes in the pre-wiring of neural nets, but also adaptation: learning (but not phenotypic transmission) by individuals within a species. Although neurons are analogue chemical systems, long-term memory is stored in a robust, redundant fashion in synapse weights and can be considered to be digital storage. In fact, it is possible to simulate storage and retrieval in a neural network even on the humblest of personal computers.
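One simple way to try this is a Hopfield-style associative network, sketched below in plain Python. The choice of a Hopfield network, the Hebbian training rule, and the two eight-element patterns are all assumptions of this example, not something the text specifies.

```python
def train(patterns):
    """Hebbian learning: strengthen weights between units that agree."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:                       # no self-connections
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, sweeps=5):
    """Update units one at a time until the state settles on a stored pattern."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(weights[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state

A = [1, 1, 1, 1, -1, -1, -1, -1]
B = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([A, B])

damaged = list(A)
damaged[0] = -damaged[0]                         # corrupt one element of A
print(recall(weights, damaged) == A)             # True
```

The stored patterns live entirely in the weight matrix, spread redundantly across many connections, so a damaged cue is pulled back to the nearest stored pattern.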
Finally, as complication increases further, organisms need to be able to defend themselves against challenges from pathogens at a rate much quicker than evolution alone would permit. The pre-programmed immune system of the insects (adjustable only through evolution) gave way to the adaptive immune system in vertebrates, which permits individuals to counter novel attacking organisms they encounter in a single, much longer, lifespan.
All of these require reliable, robust storage, and in each case biological evolution has implemented this in digital form: DNA base sequences, neuron connections and synapse weights, and the B and T cells of the immune system. But remember, “Storage is tough”—even the first of these molecular digital storage systems is of such fantastic complexity that even after decades of intensive research, no convincing explanation as to how life could have originated from lifeless matter is at hand. The very difficulty of the central digital storage mechanisms of life is a strong argument that they are necessary for its existence, since if there were a simpler analogue mechanism that did the job, the odds are that evolution would have stumbled upon it first and that’s what we’d see today.
Classes of Computation
In A New Kind of Science, Stephen Wolfram identifies four classes of results which seem to characterise a wide variety of computations, both those occurring in physical systems and those performed on digital computers. Each is defined in terms of the output which results from continuous iteration of a given algorithm on a specified input, with the output of each iteration serving as the input to the next.
| Class | Behaviour |
|---|---|
| Class 1 | Quickly settles down to a constant, invariant result. |
| Class 2 | Cycles forever among a finite set of results. |
| Class 3 | Produces seemingly random (albeit deterministically generated) results. |
| Class 4 | Produces ever-changing, structured results, in which a change in a given location can have effects arbitrarily distant in the future or far from their origin. |
(These are my own definitions of Wolfram’s classes. In any case, the distinctions aren’t precise, particularly between classes 3 and 4.)
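Wolfram’s one-dimensional “elementary” cellular automata give a compact way to see the four kinds of behaviour. The stepper below is a minimal sketch: it wraps the cells into a ring, and the rules picked to stand for Classes 1 and 2 (rules 32 and 51) are choices made for this example. Rule 30 is the pseudorandom rule cited in Wolfram’s book, and rule 110 is the standard example of Class 4 behaviour.

```python
def step(cells, rule):
    """One update of an elementary cellular automaton on a ring of cells.

    Each cell's new value is the bit of `rule` selected by the three-bit
    number formed from its left neighbour, itself, and its right neighbour.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def run(rule, width=31, steps=15):
    """Iterate `rule` from a single live cell, returning every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows

for rule in (32, 51, 30, 110):
    print(f"rule {rule}")
    for row in run(rule):
        print("".join("#" if c else "." for c in row))
```

Rule 32 dies out after one step, rule 51 alternates between two states, rule 30 fills its triangle with irregular noise, and rule 110 grows persistent, interacting structures.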
Wolfram argues that the vast majority of Class 4 computations, found in nature as well as computing systems, are universal, or able to emulate any other computation, given a properly crafted input. Let’s see how this fits with the observations from life above.
Conjecture 2: Class 4 Computations Require Digital Storage.
Any Wolfram class 4 computation requires digital storage.
Evidence for this conjecture is that every well-defined example (as opposed to assertion of equivalence) of a Class 4 computation does, in fact, use digital storage to transmit the state of a given step in the computation to the next, and requires that transmission to be sufficiently accurate so as not to disrupt the evolution of the computation.
Conjecture 3: Most Class 4 Computations Are Universal.
“…almost all processes which are not obviously simple can be viewed as computations of equivalent sophistication.” (Wolfram, p. 716)
This, which Wolfram asserts as the “Principle of Computational Equivalence”, claims that most of these “complex” computations are universal computations: able, given a suitably prepared input, to emulate any other computation by another universal computing device. (Wolfram deems Classes 3 and 4 complex; I opt for the more conservative definition of only Class 4 as complex.)
Characteristics of Classes
Let’s step back for a moment and look at the computational requirements for Wolfram’s four classes of computation.
- Class 1: This requires only pure computation. To get a fixed output, we need only say, “Set every bit to 1” or “Set every even numbered bit to zero and odd numbered bit to 1”. No storage of previous states is required.
- Class 2: To obtain oscillatory behaviour, feedback is required from earlier states, but storage requirements are limited to the number of states in the repetitive cycle. One could, in fact, replace the evolution rule with a fixed look-up table with one entry for each state in the cycle and simply fill in the state from the current cycle number modulo the number of cycles between repeats.
- Class 3: Apparently random behaviour may require storage, but it needn’t require much. Pseudorandom generators which produce results which pass the most demanding tests of randomness require only very limited amounts of storage: Wolfram’s one-dimensional cellular automaton “Rule 30” (Wolfram, p. 315), for example.
- Class 4: Class 4 behaviour requires that a local change can have effects arbitrarily distant in space and time. Since Class 4 behaviour cannot be compressed in space or time, in the general case the storage required is as large as the entire instantaneous state of the computation, and this storage must be highly reliable, since even a single error may propagate and disrupt the entire computation.
And remember, “Storage is tough”.
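Three of the points in the list above can be checked with a few lines of Python. This is a sketch under stated assumptions: rule 51 stands in for a Class 2 system, the bit stream is read from the centre column of rule 30, and the error experiment runs rule 110 on an all-zero background with one wrong cell, a deliberately simple setting chosen so that the result is easy to follow.

```python
def step(cells, rule):
    """One update of an elementary cellular automaton on a ring of cells."""
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# Class 2: once the cycle is known, a look-up table can replace the rule.
START = [0, 1, 1, 0, 1]
CYCLE = [START, step(START, 51)]     # rule 51 inverts every cell, so period 2

def class2_state(t):
    """State after t steps, read from the table instead of computed."""
    return CYCLE[t % len(CYCLE)]

# Class 3: the centre column of rule 30, used as a stream of bits.
def rule30_bits(count):
    width = 2 * count + 1            # wide enough that the edges never matter
    cells = [0] * width
    cells[count] = 1
    bits = []
    for _ in range(count):
        bits.append(cells[count])
        cells = step(cells, 30)
    return bits

# Class 4: count the cells corrupted by a single wrong bit under rule 110.
def damage(steps, width=41):
    clean = [0] * width
    faulty = list(clean)
    faulty[width // 2] = 1           # the single storage error
    counts = []
    for _ in range(steps):
        counts.append(sum(a != b for a, b in zip(clean, faulty)))
        clean, faulty = step(clean, 110), step(faulty, 110)
    return counts

print(class2_state(7))
print(rule30_bits(5))                # [1, 1, 0, 1, 1]
print(damage(5))                     # [1, 2, 3, 3, 5]
```

The Class 2 table needs only two entries, and the rule 30 generator needs only one row of cells. In the rule 110 run, a single wrong bit spreads to ever more of the state.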
Now if my persuasive prose has kept you nodding in agreement all this way, let me warn you that I’m about to pull the rug out from under the Wolfram Principle of Computational Equivalence. If you’re committed to doing a new kind of science, please hear out my argument that the kind of science we do needs not only to be new but also plausible.
Let’s start with the conflict between Conjectures 2 and 3: I assert in Conjecture 2 that any Class 4 computation requires digital storage, while in Conjecture 3 Wolfram argues that most Class 4 computations are universal. We can now state the following:
Inference 1: All universal computations require digital storage.
This follows trivially from Conjectures 2 and 3.
But now we’re confronted with the following:
Observation 2: Most natural computations do not involve digital storage.
The fluttering of leaves in the wind, the ripples of water in a stream, the boiling of clouds over a mountain range, the intricate N-body ballet of stars orbiting the centre of mass of a galaxy, all have no storage apart from their instantaneous analogue state space: they have no way to store and retrieve information in a robust and reliable digital form. Hence, from Inference 1, these are not universal computations. They may be chaotic and unpredictable, but they can’t be used in any manner to emulate any other computation, and therefore cannot be said to be equivalent to other computations or universal.
Ant bites tungsten.
Why Do Computers So Intrigue Us?
Ever since I first heard of computers, I was fascinated by them. As a little kid, I spent endless hours wiring up circuits on my Geniac and fiddling with its hopelessly unreliable screw head and brass shunt switch wheels. I believe, as much as we try to draw a distinction between the inanimate and the living while trying at the same time to eschew vitalism, we perceive there’s something about computers which is, in some sense, alive. This is certainly what attracted me to computers. You can’t tell an adding machine, “go do these calculations for the next ten hours and tell me the answer”. You can’t even tell your dog, “Go dig a two metre hole here so I can bury this body and bark when you’re done.” But you can tell your computer (assuming you know the incantations required to ask it), “evolve the solar system back and forward for a quarter million years and show me all the transits of planets visible from planets further from the Sun”. And the computer, this captive intelligent spirit, will go and do it, store the results for your perusal, and let you know when they’re ready. Computers are systems which solve problems, and our instinct, evolved over billions of years, tells us that’s what distinguishes living beings from rocks.
What is it about computers which so enthralls us? Well here’s my answer: because they have memory, and hence they’re capable of Universal Class 4 computation, which means they can do anything and the results they produce can always surprise us. With the exception of other humans and a very few kinds of animals, they’re the only creatures we’ve encountered which can do this, and it’s only natural we’re fascinated by them. The fact that we created them, and can, in principle, make them do anything we’re smart enough to describe, only heightens their appeal to us. We are the first species on this planet, perhaps in the universe (but maybe not), to intelligently design progeny which can perform Class 4 computations and produce unexpected, unpredictable results which stimulate our sense of wonder—it’s no surprise we’re fond of them, as our own Creator doubtless is of us.
Summary and Conclusions
When considering computation, in nature as well as on digital computers, there’s a tendency to focus on the computing element—the CPU chip in present-day computers, for example. But the magic in computing is really in the storage (or, as it’s now usually called, memory). It’s the ability to reliably store and retrieve large quantities of information which permits computers to perform computations whose results are unpredictable by any process other than going through all the steps of the computation, and thereby simulate, with more or less fidelity, processes in nature which have this same property of unpredictability.
The most difficult problem in the evolution of electronic digital computers was development of highly reliable inexpensive bulk digital storage. Today’s dynamic RAM chips and hard drives would have seemed the purest fantasy or products of an alien technology to most computing practitioners of the 1960s. This is because storage is tough: a RAM chip may contain more than a quarter billion separate components, all of which have to work perfectly hundreds of millions of times a second over a period of years. The fact that one can buy such components in unlimited quantities for less than the cost of a haircut is the greatest triumph of mass production in human history.
Prior to the development of digital computers in the 20th century, the only systems on Earth which incorporated bulk, reliable digital storage were living organisms. DNA, neural networks and brains, and the adaptive immune system all have the ability to robustly store large quantities of information and retrieve it when needed. But storage is tough—each of these biological systems is enormously more complicated than any existing computer, and it took biology billions of years to evolve its second and third kinds of digital storage. The intertwined complexity of DNA and protein synthesis in even the simplest living cells is such that how it came to be remains one of the central mysteries of biological science, a conundrum so profound that one of the two discoverers of the structure of DNA, Nobel Prize winner Francis Crick, believes the first living cells were placed on Earth by intelligent aliens from elsewhere in the Galaxy. (But then how did the aliens get started?)
Digital storage permits living beings to solve problems; this is the fundamental difference between life and inanimate matter. Species evolve by changes in the genome, more complex animals learn by adjusting connections between neurons, and vertebrates have the ability to detect invading organisms and learn defences against them. Computers, endowed with their own digital storage, are the first objects ever built by humans which have the kind of problem solving ability biological organisms do, and this is why we find computers so fascinating. Their digital storage permits them, in principle, to compute anything which is computable (they are universal), and they can produce results which are unpredictable and unexpected, just as living creatures do. We sense that computers are, if not completely alive, not entirely dead either, and it’s their digital storage which ultimately creates this perception. Without digital storage, you can’t have life. With digital storage, you don’t exactly have a rock any more.
Many systems in nature—waves breaking on a beach, a leaf fluttering to the ground, a stream flowing turbulently around stones—are unpredictable. They are sensitively dependent on initial conditions, and evolve in a chaotic fashion where infinitesimal changes can be amplified into large-scale perturbations. To the extent the initial conditions can be known, computers (or humans, far more tediously) can simulate these natural systems and produce the same kinds of behaviour. The natural systems can be said to be performing computations of great complexity, and it’s the ability of the computer to perform equivalently complex computations which permits it to emulate them. But it is the digital storage in the computer which ultimately enables it to carry out such computations.
Physical systems which lack digital storage cannot, however, regardless of the complexity of their behaviour, perform the kind of arbitrary computations systems with storage can. Only a system with storage can be computationally universal. The only objects in the natural world which possess digital storage are living organisms. Digital storage is, therefore, both the secret of life and the reason computers are perceived, to a limited extent, to be alive.
References
- Barrow, John D., and Frank J. Tipler. The Anthropic Cosmological Principle. Oxford: Oxford University Press, 1988. ISBN 0-19-282147-4.
- Crick, Francis. Life Itself. New York: Simon and Schuster, 1981. ISBN 0-671-25563-0.
- Dyson, Freeman J. Origins of Life. 2nd. ed. Cambridge: Cambridge University Press, 1999. ISBN 0-521-62668-4.
- Vertosick, Frank T., Jr. The Genius Within. New York: Harcourt, 2002. ISBN 0-15-100551-6.
- Watson, James D. et al. Molecular Biology of the Gene, 5th ed. Menlo Park, California: Benjamin/Cummings, 2003. ISBN 0-8053-4635-X.
- Wolfram, Stephen. A New Kind of Science. Champaign, IL: Wolfram Media, 2002. ISBN 1-57955-008-8.
- The full text of this book may now be read online.