English Pages 158 [159] Year 2023
Rafael Ball
Viruses in all Dimensions How an Information Code Controls Viruses, Software and Microorganisms
Rafael Ball ETH Library, ETH Zurich Zürich, Switzerland
ISBN 978-3-658-38825-6 ISBN 978-3-658-38826-3 (eBook) https://doi.org/10.1007/978-3-658-38826-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Fachmedien Wiesbaden GmbH, part of Springer Nature. The registered company address is: Abraham-Lincoln-Str. 46, 65189 Wiesbaden, Germany
Contents

1 Introduction
2 Viruses, Microorganisms and Molecular Genetics
2.1 Origin and Development of Microorganisms
2.2 Basic Concepts of Molecular Genetics
2.2.1 Transcription and Translation
2.2.2 Replication
2.2.3 The Genetic Code: DNA and RNA
2.3 What Is Life? Definitions and Emergence
2.3.1 Definitions of Life
2.3.2 The Emergence of Life
2.4 Are Viruses Living Things? Viruses and the Early Genetics of the RNA World
2.4.1 The Genetics of Viruses
2.4.2 The Relation of Viruses to the Kingdom of Living Beings and Their Classification
2.4.3 Prions
3 Algorithms, Software and Artificial Intelligence
3.1 Software and Hardware
3.2 Algorithms and Computer Programs
3.3 Evolutionary Algorithms
3.4 Artificial Intelligence
4 Computer Viruses, Computer Worms, and the Self-Replication of Programs
4.1 “Brain” and the Computer Viruses
4.2 Computer Worms and Trojan Horses
4.3 On the Analogy Between Biologically Active Viruses and Computer Viruses
5 Information, Genetics, and the Evolution of Life
5.1 What Is Information? Data – Information – Knowledge
5.2 The Coding of Information in Biology and Technology
5.3 The Genetic Code and Information Theory
5.3.1 Epigenetics
5.3.2 Prions
5.4 Information Coding in Technology
5.5 The Meaning of Information with Viruses and Algorithms
6 The Great Continuum: The Convergence of Life and Technology
6.1 The Independence of the Systems
6.2 The Self-Replication of Artificial Intelligence
6.3 The Robots Are Coming: Technical Solutions for Autonomous Systems
7 The Coevolution of Life and Technology: Summary and Outlook
References
Index
List of Figures

Fig. 2.1 Comparison of bacterium and virus (sizes are not to scale). (© C. Munz/AbiBlick.com)
Fig. 2.2 Bacterium, schematic structure. (© C. Munz/AbiBlick.de)
Fig. 2.3 Principle of transcription: DNA is transcribed into RNA. (Source: NHGRI/Wikipedia – public domain)
Fig. 2.4 The assembly of proteins from genetic information (translation). (Source: LadyofHats/Wikipedia – in the public domain)
Fig. 2.5 The genetic code (assignment of amino acids to coding triplets). (Source: Mouagip/Wikipedia – public domain)
Fig. 2.6 Paramecium (also flagellate) as a simple unicellular organism with cell membrane and compartmentalization. (Source: Udaix/Shutterstock (own translation))
Fig. 2.7 Stirred-up, energy-rich primordial soup as the basis for the emergence of the first self-replicating molecules. (Source: Tim Bertelink (2016): “Artist’s impression of the Hadean Eon”, https://commons.wikimedia.org/wiki/File:Hadean.png CC BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/)
Fig. 2.8 Hypercycle according to Manfred Eigen: RNA gives rise to proteins, which in turn cause the production of RNA. (Source: Own illustration according to Manfred Eigen (simplified))
Fig. 2.9 The SARS-CoV-2 (electron micrograph). (Source: Alissa Eckert, MS; Dan Higgins, MAM – public domain)
Fig. 2.10 Schematic structure of a virus, here influenza. (Source: Designua/Shutterstock (own translation))
Fig. 2.11 Prions in “normal” and in “pathogenic” form. (Source: Designua/Shutterstock)
Fig. 3.1 Muhammad ibn Musa al-Chwarizmi. The concept of the algorithm goes back to him. (Source: Likeness on a former Soviet stamp)
Fig. 3.2 Alan Turing (1912–1954), solving the Hilbert decision problem. (© ARCHIVIO GBB/Alamy)
Fig. 4.1 Program data of the first PC virus “Brain”. The developer even puts his address in the code. You can contact him to have the virus removed. (© M. Zabel/Lectorate Freiburg)
Fig. 5.1 Signs – data – information – knowledge. Knowledge pyramid according to Raffael Herrmann. (Reprinted with kind permission. Source: https://derwirtschaftsinformatiker.de)
Fig. 5.2 How epigenetics works. (Reprinted with kind permission from Visionaries of Health. Source: https://visionaere-gesundheit.de/)
1 Introduction
Abstract

Viruses are on everyone’s lips at the beginning of the twenty-first century. Right now, the Corona pandemic is on our minds, but we must not forget other virus-induced diseases such as avian flu or classical influenza. As a result of the Corona pandemic, we have almost become virus specialists and increasingly understand how viruses spread or how vaccines can work. The basis is always a genetic program based on information that is passed on from virus to virus. The way in which information is encoded is the same as the principle that is used throughout living organisms – from bacteria to humans. At the same time, computer viruses are increasingly entering our daily lives. Due to the networking of computer systems worldwide, malicious programs can be transmitted and spread quickly. Here we are interested in the basics of technical information coding and the comparison to viruses and living beings. Ultimately, this also leads to the question of a possible autonomy of malware systems and the unease of being ruled by autonomous robots in the future.

© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_1

Viruses mutate: With the transgression of discursive boundaries, the concept of the virus also changes its shape unexpectedly and often barely perceptibly. (Mayer & Weingart, 2004)
The Corona pandemic has brought viruses back into the conversation. Public discussion of, and interest in, AIDS caused by the HI virus (human immunodeficiency virus) had massively decreased (although still more than two million people die from the HI virus per year), and the diseases caused by other corona viruses such as SARS and MERS were not pandemic but localized and controllable. The new corona virus and the lung and systemic disease it causes, COVID-19, brought renewed interest in viruses, their mode of action, spread and control to the forefront in 2020.

Although 80 million people die per year worldwide from various infectious diseases and 8.2 million people die of cancer (Mölling, 2015, p. 37), the attention around the Corona pandemic is far greater. This pandemic, which started from China at the turn of the year 2019/2020, not only costs many lives and can cause severe secondary diseases, but has also led to worldwide, in some cases significant, restrictions on freedom of movement. The so-called lockdown has massively restricted public and economic life and widely interferes with people’s fundamental rights.

In the direct fight against the virus, the focus is now less on its epidemiological containment and more on knowledge of its genome. Those who know the genetic code of the virus can not only search specifically for therapeutic options for the diseases and develop vaccines, but also gain a better understanding of the mode of action of the specific virus as a whole. Here, we are currently observing a surge in studies and scientific articles about the corona virus and the COVID-19 disease, which are quickly uploaded as preprints to the countless servers and platforms of universities and research institutions, usually without the necessary quality control. In principle, it is right for intensive research to be carried out on corona, but the publications cease to be an asset if their content and the underlying data cannot be relied upon. Already there are voices from the scientific community complaining about an inscrutable jungle of published half-truths and the loss of overview.1 Quite obviously, the publication practice in the Corona crisis confirms the old principle that “more” does not always mean “better”. Thus, psychiatrist and university professor Klaus Lieb sums up the publication volume in the Corona crisis: “Indeed, the crisis is our chance to question the huge floods of data and data publications” (Bartenschlager, 2020, p. 2).

1 Here it becomes evident that patient reflection, classification, penetration, refutation and correction of hypotheses are the essence of science. The value of a publication is lost when it delivers only quick results in the “quick-and-dirty mode”. Then science turns into its opposite: Instead of delivering findings, half-truths or even fake news are produced.

Chinese scientists decoded the complete genome of the SARS-CoV-2 virus after its appearance and made it available online worldwide as early as January 11, 2020 (Irmer & Müller-Jung, 2020, p. 11). It turned out that these viruses behave like most others and are minimalists: They keep their own activities to a minimum and leave the production of necessary proteins to the host cell by instructing it via short RNA sequences. Viruses are thus minimalists both in encoding information and in obtaining the necessary proteins. Therefore, not only their position in the kingdom of living things is debated, but also their status as living things in general.

Questions about how viruses live and work, how they spread and multiply have been discussed daily in the world’s media since the outbreak of the Corona pandemic. Due to the intensive and daily confrontation with the topic of “viruses”, technical terms such as “reproduction rate”, “mutation” or “replication” have long been on everyone’s lips. Particular attention is being paid to the question of how the pandemic can be brought under control and how the virus can be successfully combated. Even in the spring of 2021, the particular modes of action and details of the infection and its effects on processes in the human body are still far from being comprehensively clarified and understood. At the same time, virologists point out that the virus, like all microorganisms, is constantly changing, making it even more difficult to combat via the development of drugs or vaccines. For example, in October 2020, researchers identified a variant of SARS-CoV-2 that has spread throughout Europe since the summer of 2020, starting in Spain, alongside the first form of the virus (Hodcroft et al., 2020).

A look at the smallpox virus can illustrate the extent to which these mutations can occur. In the twentieth century, smallpox still killed half a billion people. Untreated, this infection has a very high lethality. Vaccination has been so successful in limiting it that smallpox is now considered eradicated. The smallpox virus has existed for thousands of years. Initially it could only be traced back about 360 years. It has recently been detected in 1400-year-old skeletons from the Viking Age, showing an enormous genetic difference between the DNA of these ancient variants and the DNA of more recent variants (Schmitt, 2020, p. 8). This exemplifies how quickly viruses can change through mutations compared to the time periods of natural evolution.
Of course, for all their relatedness, not all viruses are the same; there are many differences: “HIV, on the other hand, is insanely variable, which is not true to the same extent for SARS-CoV-1 and 2, although they are also RNA viruses” (Bartenschlager, 2020). But what are the causes of permanent changes in microorganisms such as bacteria, or even in viruses? What are the mechanisms behind them? How do their genetics actually work?

We know that viruses are very early and original systems in terms of evolutionary history, which reproduce with the help of host cells and constantly change their genetic information, and thus also their infectivity. The mechanisms of genetic change, which are adaptations to changing environmental conditions or pure coincidences, are as old as life itself and have been the basis of all phylogenetic developments of living organisms for almost four billion years. The basis for this is the coding of genetic information in molecules of RNA or DNA (ribonucleic acid or deoxyribonucleic acid) and its identical transmission to subsequent generations. This principle is so universal that one ponders whether it could also be mastered by inanimate technical systems, and whether mastering the identical replication of information and passing it on to new individuals is already a constituent feature of life. Who does not think of computer viruses here, which also replicate automatically and identically and whose status we discuss in detail in this book?

We encounter the description of viruses and their mechanisms almost exclusively in biology books or in works of molecular medicine and infectiology. It is usually clear that although the subject of viruses is located and treated in the field of microbiology, viruses are not considered “real” living organisms. Sometimes they are referred to as “part-time living beings” (Schreiber, 2019, p. 247), sometimes as the “transition from the inanimate world to the living” (Mölling, 2015, p. 21). And indeed, viruses lack essential properties that are considered prerequisites for the designation “life”, such as a cell membrane or their own metabolism. Here we attempt clarification and classification, always with a view to the basic question of the coding of information in animate and inanimate systems. For viruses have something in common with the whole living world: the mechanisms of storage and reproduction of genetic information.

Viruses thus occupy a fascinating intermediate position as original systems at the boundary between dead organic macromolecules in prebiotic times and the emergence of the first cells, and thus of “real” living beings. This raises the question of whether there could also be other self-replicating systems, technical and man-made, that resemble infectious viruses not only in their processes of reproduction and permanent change (so-called mutations), but also in their effects. In the technical field, this quickly brings up the subject of “computer viruses”, whose developers as well as discoverers have borrowed not only the name from microorganisms, but also the dimensions of the processes. Computer viruses also lead to the disruption or destruction of a host, the computer system, and they reproduce and spread automatically. Unlike biological viruses, however, their information is stored not as RNA or DNA, but as “program code”. Ultimately, they can only be effectively stopped by interrupting chains of infection. In the case of a computer virus, the computer is taken off the network and all external connections are cut off; in the case of a “real” virus, social contacts between the potential carriers, i.e., people, are cut off in the lockdown.
Not all self-replicating computer programs are computer viruses; there is a whole range of programs and algorithms that not only copy and spread themselves automatically, but – like living creatures in evolution – also change and thus achieve a certain “autonomy”. This applies to computer worms and other types of software. Whether such computer programs have or can have a function and mode of action similar to those of biologically effective viruses is one of the questions addressed in this book. To answer it, we need to examine and compare the basis of the encoding of information in living organisms (microorganisms), in viruses (“part-time living organisms”), in the inanimate organic or inorganic world, and finally in the technical world. Answering this question will eventually help to clarify whether and to what extent there are parallels between the self-replicating systems of still inanimate nature (such as organic molecules), the replication of viruses, the (early) true biological replication systems and their ancestors, and the self-replicating technical computer programs. The question then follows whether and to what extent such self-replicating technical systems can become as dangerous as infectious viruses in triggering pandemics, such as the Corona pandemic in 2020, because “viruses are software – and we are their hardware” (Bartenschlager, 2020).
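The contrast drawn in this chapter between identical replication and gradual change can be made concrete with a toy simulation. The following sketch is not taken from the book: the function names `replicate`, `hamming_distance` and `drift`, the four-letter RNA alphabet, the example sequence and the mutation rate are all illustrative assumptions. It copies a sequence letter by letter and, with a small probability, substitutes a different letter, which is roughly how copying errors accumulate over generations.

```python
import random

ALPHABET = "ACGU"  # RNA bases; an illustrative choice for this sketch


def replicate(sequence, mutation_rate, rng):
    """Copy a sequence; each position mutates with probability mutation_rate."""
    copy = []
    for base in sequence:
        if rng.random() < mutation_rate:
            # A mutation always substitutes a different base.
            copy.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            copy.append(base)
    return "".join(copy)


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def drift(ancestor, generations, mutation_rate, seed=0):
    """Replicate repeatedly and record the distance to the ancestor."""
    rng = random.Random(seed)
    current = ancestor
    distances = []
    for _ in range(generations):
        current = replicate(current, mutation_rate, rng)
        distances.append(hamming_distance(ancestor, current))
    return distances


if __name__ == "__main__":
    ancestor = "AUGGCUUACGGAUCCGUAAG" * 5  # hypothetical 100-base sequence
    print(drift(ancestor, generations=10, mutation_rate=0.01))
```

With a mutation rate of zero every copy is identical to its ancestor. With any positive rate the recorded distance tends to grow from generation to generation, which is the simplest form of the "permanent change" described above.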
2 Viruses, Microorganisms and Molecular Genetics
Abstract

The origin of living organisms is to a large extent a matter of coding and passing on information to future generations. This basic principle of genetics makes it possible to pass on information about connections and established processes to new generations in an identical manner and at the same time to react to environmental changes through mutation and selection. The underlying mechanisms of molecular genetics have existed for almost four billion years and have always followed the same principle. They range from the first primitive forms of life to the emergence of mammals. In this context, the mechanisms of transcription (copying of genetic information from DNA into RNA) and translation (reading of genetic information and assembly of physiological and structural products of the cell) serve as the basic genetic processes of life. At the same time, however, all viruses, which are not living organisms, use precisely this coding pattern, its structures, and processes. The emergence of life is therefore very closely connected with the development of these information-coding processes, if not even identical with them. At the very least, these information-based processes belong to the central criteria for the definition of life. In addition to the structures of the basic molecular genetic law, this chapter discusses questions about the origin of life and the evolution of viruses and provides a classification in the realm of the living.

© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_2
2.1 Origin and Development of Microorganisms

We should not look at the cosmos with the eyes of the rationalist. Lavish wealth has always been the essence of nature. (Wernher von Braun)
Microorganisms are organisms that are microscopic and invisible to the naked eye. This means they must be smaller than around 0.1 mm (or 100 μm). Structures of this size are – depending on the individual – just visible to the naked eye. Thus, microorganisms remained undetected for thousands of years and were only noticed with the development of functional microscopes in the sixteenth century. They are predominantly unicellular organisms, but there are also some microorganisms that are multicellular (yet still tiny), such as yeast fungi or algae. Microorganisms include bacteria, fungi, algae, and the so-called protozoa, which include such well-known organisms as the paramecia or the malaria pathogen plasmodium. Figure 2.1 shows a schematic comparison of the structure of a bacterium and a virus. Viruses are also often understood as microorganisms, but they are not living organisms in the true sense and therefore cannot be classified as classical microorganisms: “[…] viruses cannot even be called living organisms with full justification” (Sourjik, 2011).
Fig. 2.1 Comparison of bacterium and virus (sizes are not to scale). Labeled parts of the bacterium: flagellum, pili, capsule, cell wall, cell membrane, plasmid, ribosomes, nucleoid (chromosome), cell inclusion, substance storage. Labeled parts of the virus: virus shell, capsid, RNA, surface proteins. (© C. Munz/AbiBlick.com)
Their small size alone makes it difficult to classify viruses in the realm of microorganisms: while bacteria with a size of 0.6 to 1.0 μm (in a few species up to 700 μm) can just be seen in a light microscope (no longer with the naked eye), viruses can only be seen with the help of an electron microscope. They are between 10 and 300 nm in size1 and are thus on average only one hundredth to one thousandth of the size of an average bacterium. SARS-CoV-2, for example, belongs to the beta-coronavirus group and is 120 nm in size.

Viruses are the smallest microorganisms; they can exist only as intracellular parasites of macro- and microorganisms. In a strict sense, they cannot even be called living organisms, because they have neither their own metabolism nor their own biosynthetic machinery and are completely dependent on host cells for their reproduction. (Sourjik, 2011)

1 Giga viruses up to 2000 nm.
The first microorganisms were discovered by Louis Pasteur and Robert Koch in the second half of the nineteenth century. The two researchers are considered the founders of microbiology, a research discipline that deals with all microorganisms and, to some extent, viruses. Even with the first usable microscopes in the late sixteenth and early seventeenth centuries, most microorganisms could not yet be seen (Cremer, 2011).

The quantity and diversity of microorganisms are huge compared to the animal and plant kingdoms. They are much older and have existed for billions of years, being the starting point for the emergence of life on earth. Thus, they have by far the largest share in the evolutionary history of living organisms and play the central role not only for the questions about the origin of life, but also for the understanding of the mechanisms of reproduction, multiplication, and coding of genetic information. Representatives of microorganisms are, as mentioned, fungi, algae, bacteria, unicellular eukaryotes (living organisms with a cell nucleus), but also the archaea, formerly called “primordial bacteria”, which are now regarded as a separate, very original group of living organisms. Thus, the kingdom of microorganisms is also divided into three domains: archaea, bacteria (so-called prokaryotes, which lack a cell nucleus) and the eukaryotes (organisms with a cell nucleus, for example algae, fungi, protozoa). Figure 2.2 shows the schematic structure of a bacterium.

Fig. 2.2 Bacterium, schematic structure. Labeled parts: nucleoid (chromosome), ribosomes, substance storage, flagellum, plasmid, capsule, cell wall, cell membrane, pili, cell inclusion. (© C. Munz/AbiBlick.de)

It is now assumed that archaea have existed on earth for about 3.6 billion years (Rauchfuß, 2005, p. 330). They represent the most primitive organisms in the family tree of life. Archaea are anaerobic organisms: They do not require oxygen and use CO2 as the only carbon source for their metabolism. They obtain the energy they need for all life processes from inorganic reactions with hydrogen and sulfur. This type of energy production is called chemolithoautotrophic. Archaea are thermophilic organisms. They tolerate very high temperatures and live at great ocean depths in hot springs. Some species even grow at 110 degrees Celsius. At the same time, they can withstand a wide variety of chemical boundary conditions and, depending on the species, are specialized to survive and grow in strongly basic or acidic environments. Because they have lived on Earth for so long, it is now believed that they even survived the time of the great planetary meteorite impacts and the exchange of material between the planets of our solar system. It is therefore not inconceivable that archaea live today in subsurface hot areas of Mars, for example (Stetter, 2011).

Only recently, another group of microorganisms was found, which made it necessary to recast previous ideas: giga viruses. In 2003, giant viruses 1000 times larger than those previously known were discovered. They were appropriately named giga viruses, perhaps marking the transition from an inanimate system, the virus, to the animate cell. These organisms are therefore of particular importance for research into the origins of life, its conditions and mechanisms of action, and the classification of viruses (Deutsche Welle, 2014).

With giga viruses …the boundary between virus and cell becomes blurred. The transition is a continuum […]. These near bacteria form the transition from viruses to bacteria, from dead to alive. (Mölling, 2015, p. 21)
With this statement we have arrived at the topic of this book: clarifying how (genetic) information is encoded and whether similar or even comparable reproduction mechanisms for encoded information exist in inorganic, dead systems. A further, larger question is whether these mechanisms can also be transferred to technical, man-made computer programs and other IT systems. There are already programs that use mechanisms of action similar to those of original living beings, and which could therefore be placed in one series with the mechanisms for coding and replicating information in the development of life from dead matter. Thus, I extend the continuum postulated by Karin Mölling, which runs from inorganic dead matter via early forms of self-replication (for example in viruses) to early forms of living beings (bacteria), to include the self-replication of computer programs and algorithms. It is both important and fascinating to realize that the very first forms of life, and even their precursors in the form of dead matter and of viruses (whose position and evolution are not yet finally clarified), have almost identical molecular biological mechanisms for encoding and passing on hereditary information. Against this background, we will now take a closer look at the basic concepts of molecular genetics and its mechanisms in the next chapter.
2.2 Basic Concepts of Molecular Genetics

From twenty amino acid letters nature created a language “in the pure state”, a language which – by minor rearrangements of the nucleotide syllables – expresses phages, viruses, bacteria, tyrannosaurs, termites, hummingbirds, forests and nations – if only it has enough time at its disposal. […] Truly, it is worthwhile to learn a language that produces philosophers, while ours produces only philosophies. (Lem, 2013, p. 106)
We learned in the previous chapter that the important questions in classifying the phenomena of animate and inanimate systems concern primarily the replication, reproduction, and duplication of stored (hereditary) information and its translation into proteins. Here there is a whole series of significant findings that help to provide clarity. In this chapter, we will learn about the molecular basis of heredity and genetics and will hear about the two basic genetic laws, the universal validity of which is quite amazing. For it is important to understand which mechanisms are at work here and how living systems (and to some extent already their inanimate or partially animate predecessors) encode, archive, and pass on the basic building blocks of their (hereditary) information as the so-called “genetic code”. This determines whether an organism, its successors, or an entire species or population can propagate successfully over time, and thus through evolution, or at least avoid extinction, or whether a whole species perishes because information transfer is missing or faulty at the level of the individual or of the whole population.
At the heart of the transmission of hereditary information,2 and of the growth and synthesis of various substances from this data, are the central processes of replication, transcription and translation.3

To what extent can one meaningfully speak of “genetic information” in the context of genetics, and to what extent is this merely a misleading metaphor taken from information and communication technology? (Hildt & Kovacs 2009, p. 3)

For a generally understandable presentation of the matter and for the later transfer to the issues of the coding of information in prebiotic, biological and technical systems, these analogies seem not only harmless, but also helpful for understanding, if one remains aware of the respective different dimensions of the “information terms” used. Proteins consist essentially of amino acids. There are about 20 amino acids from which all proteins of living beings are composed. The sequence of genetic words (triplets) in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) defines specific amino acids. There is a striking analogy here with human written language: the “words” of DNA or RNA, the so-called “code words”, determine during the construction of proteins in the ribosomes (those places in the cell where proteins are synthesized) which amino acids are added in sequence, and thus which specific protein is synthesized. Several of these triplets can encode the same amino acid. Therefore, one can unambiguously infer a chain of amino acids from the DNA sequence (or the RNA sequence), but not vice versa. The fact that several triplets stand for one amino acid is called redundancy (or degeneracy) of the genetic code. The number of possible triplets (4 × 4 × 4 = 64) is much higher than the number of amino acids that have to be designated by them. This reveals once again an all-encompassing principle that is valid everywhere in the kingdom of living things: the conversion of DNA into proteins is also called the “central dogma of molecular biology”.

2 It would be more correct to speak of “hereditary data” here, since information, according to its definition, is action-relevant, while data in itself is still context- and action-less. However, due to practicability, I prefer to use “hereditary information”.

3 Of course, it can be debated whether it is appropriate and corresponds to the actual genetic dimension if the “informational” terminology of molecular genetics is in large parts taken from the diction of another discipline.
2.2.1 Transcription and Translation
In transcription (from the Latin for copying or rewriting), the hereditary information, which is stored as DNA in the genes, is copied into other storage and readout forms in an identical-complementary manner. Figure 2.3 shows the principle of this process.
Fig. 2.3 Principle of transcription: DNA is transcribed into RNA. Labels in the figure: antisense strand, RNA polymerase, RNA transcript, sense strand. (Source: NHGRI/Wikipedia – public domain)
The various forms of RNA required for protein synthesis are synthesized, such as messenger RNA (mRNA), which transports the genetic information of the DNA to the site of protein synthesis (ribosomes), or transfer RNA (tRNA), which carries out the assembly of proteins along the coded information of the mRNA. In the synthesis of mRNA, the only thing that matters is the most accurate and correct conversion of the coded information of DNA into the transport form of an RNA without loss of information (what exactly DNA and RNA are, we will see later). This is important because the mRNA can now transport the copied information to another location in the cell, the so-called ribosome, where it is passed on to the second central process, translation. (In all organisms except the prokaryotes, which have no cell nucleus, the stored information resides in the cell nucleus.) This is where, as one might put it, the reading out of the stored and encoded data and the translation into products of metabolism begins. Figure 2.4 illustrates protein synthesis. In this process, RNA is read out and translated into a sequence of specific amino acids that make up proteins. This readout and translation is carried out with the aid of tRNA, a special molecule that matches information units of the mRNA exactly on one side and carries the matching amino acid on the other. This completes the sequence from the copy of the stored hereditary information in the DNA via the mRNA to the readout of the then action-relevant information and the construction of concrete proteins in the ribosome by the tRNA.
Fig. 2.4 The assembly of proteins from genetic information (translation). Labels in the figure: newly synthesized protein, amino acids, tRNA, large subunit, A site, P site, mRNA, small subunit. (Source: LadyofHats/Wikipedia – in the public domain)
One must realize the special importance of these processes: Genetic information, which is encoded by a special sequence of nucleic bases, is copied several times identically and thereby complementarily, so that in the end exactly the protein encoded in the DNA can be created. This process is valid in all living organisms, from bacteria and other single-celled organisms to highly developed mammals, as well as in all plants and fungi. However, the principle also applies to viruses and is possibly even the basis for the emergence of life from inorganic systems that at least master replication.
The question about the age of the genetic code was answered by M. Eigen by the statistical evaluation of comparative RNA sequences […] According to Eigen (1987) the genetic code must have originated about 3.6 billion years ago. (Rauchfuß, 2005, p. 258)
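The two steps described in this section can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `transcribe` and `translate` are hypothetical, the template strand is read from left to right without regard to strand polarity, and the codon table lists only the five codons needed for the example (the complete table has 64 entries).

```python
# Minimal sketch of transcription and translation (illustrative only).
# Assumption: strand polarity and RNA processing are ignored.

# Complementary copying from a DNA template strand into mRNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table (standard genetic code); unlisted codons raise KeyError.
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the usual start signal
    "CUG": "Leu",  # leucine
    "GGC": "Gly",  # glycine
    "AAA": "Lys",  # lysine
    "UAA": None,   # stop codon: no amino acid is added
}


def transcribe(template_strand):
    """Build the mRNA that is complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)


def translate(mrna):
    """Read the mRNA triplet by triplet until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


template = "TACGACCCGTTTATT"
mrna = transcribe(template)
print(mrna)             # AUGCUGGGCAAAUAA
print(translate(mrna))  # ['Met', 'Leu', 'Gly', 'Lys']
```

The sketch separates the pure copying step (transcription) from the step in which the copied data are interpreted (translation), which mirrors the distinction drawn in the text.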
2.2.2 Replication
When organisms grow and cells divide, it is of utmost importance that an exact copy of the genetic information from the initial cell is produced. Again, structural and functional groups on the DNA molecule (the explanation of the details is given below) guarantee the creation of an exact copy.
This process is also universal and is found in viruses, bacteria, and all other living things, including highly evolved ones. Replication is also the first basic requirement for the emergence of life, as it enables the identical transmission of the “blueprint” of systems beyond the single individual. This is a basic prerequisite because it would be almost impossible in terms of resource economy and information theory if individuals had to “invent” themselves repeatedly. Passing on successful “blueprints” as unchanged as possible to new generations, and thus guaranteeing all basic structures and functions based on which further development, differentiation and evolution can then take place, is the most powerful legacy that life can give us. Also, “the complex processes within cells and in the emergence of life suggest a functional carrier within the cell, namely a blueprint in which information is stored” (Küppers, 1986, p. 39).
2.2.3 The Genetic Code: DNA and RNA
The hereditary information of living organisms is localized in genes. Genes are functional units formed by a certain number and sequence of DNA building blocks, from which the gene products, i.e., certain proteins, are then read and synthesized (by way of transcription and translation). In the cell, the DNA is not present as a long thread or as a bare double helix, but is wound around a scaffold of proteins; this complex of DNA and protein is called chromatin. The structural unit of genetic information is the chromosome; the functional unit is the respective gene, which contains the information (data) for a specific protein and which results from the sequence of its nucleotides. Already here the structural and functional dimension of genetics becomes visible, as we will learn and discuss below in connection with the molecular structure of DNA and RNA:
Genes or DNA are regarded as material carriers or storage media of a “knowledge” that is of crucial importance for the life processes in an organism. By using this knowledge to produce proteins during protein biosynthesis, among other processes, genes make an essential informational contribution to the development and physiology of a living being. (Schmidt, 2009)
The hereditary information of all living organisms is stored in a special chemical macromolecule, DNA, or – for example in most viruses – RNA. DNA and RNA are so-called macromolecules that consist of four different nucleotides. The chemical building blocks of a nucleotide are a sugar (the eponymous deoxyribose for DNA, ribose for RNA), a phosphate residue (oxygen atoms around a central phosphorus atom), and one of four organic nucleic bases. In DNA these are adenine, guanine, cytosine and thymine; in RNA they are adenine, guanine, cytosine, and uracil. DNA is present as a long chain of nucleotides, spirally coiled into a so-called double helix. Due to their chemical properties, only certain nucleotides ever lie opposite each other as base pairs, which are held together by specific hydrogen bonds: adenine pairs with thymine, and guanine with cytosine. RNA is usually present not as a double helix, but as a single strand. But here, too, the specific hydrogen bonds allow only selected base pairings: adenine with uracil, and guanine with cytosine. In DNA and RNA, the respective sugars – deoxyribose and ribose – together with the phosphate residues form the stabilizing backbone. They are the same for all nucleotides. The functional differences result exclusively from the four organic bases: the entire (hereditary) information of an organism is encoded in their sequence.
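The pairing rules can be written down as two small lookup tables. The following Python sketch is an illustration, not a biochemical model: it uses the standard Watson-Crick pairs (adenine with thymine in DNA or with uracil in RNA, guanine with cytosine), the helper names are hypothetical, and the antiparallel orientation of the two strands is deliberately ignored.

```python
# Standard Watson-Crick base pairs; strand orientation is ignored here.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def complementary_strand(strand, pairs=DNA_PAIRS):
    """Return the partner strand dictated by the pairing rules."""
    return "".join(pairs[base] for base in strand)


def is_properly_paired(strand_a, strand_b, pairs=DNA_PAIRS):
    """Check position by position that only permitted base pairs occur."""
    return len(strand_a) == len(strand_b) and all(
        pairs[a] == b for a, b in zip(strand_a, strand_b)
    )


strand = "ATGCCGTA"
partner = complementary_strand(strand)
print(partner)                              # TACGGCAT
print(is_properly_paired(strand, partner))  # True
print(is_properly_paired("ATGC", "TACC"))   # False: C cannot pair with C
print(complementary_strand("AUGC", RNA_PAIRS))  # UACG
```

Because each base has exactly one permitted partner, one strand fully determines the other, which is what makes exact copying possible.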
DNA and RNA not only form the material framework of the genes, but they also represent their informational basis in the coding and sequence of the nucleotides:
The DNA contains the genetic information for the structure of the proteins and thus also – due to the specific effects of the proteins – for the structure, the life phenomena of the organism and for the genetically determined parts of its behavioral control. In these respects, the genetic information shapes the phenotype. (DNA, 1994, p. 348)
It is important to note that DNA and RNA have no physiological function in organisms but serve exclusively to encode (hereditary) information. “The two nucleic acids act exclusively as information carriers, whereas proteins (for example, as enzymes) are responsible only for functions within the cell” (Rauchfuß, 2005, p. 199). The functional, information-bearing differentiation of the DNA takes place via the respective sequence and combination of the base pairs. Three nucleotides always determine an information unit. These triplet codons are thus the code words of the genetic language, written in an alphabet of four letters. In contrast to the digital storage developed by humans on computers with the two states zero and one, nature has been using an information storage system for more than 3.6 billion years that is based on more than two states: namely, the four DNA nucleotides adenine, guanine, cytosine and thymine and their respective sequence in the triplet. In RNA, adenine, guanine, cytosine, and uracil are found. For example, the nucleotide sequence guanine-adenine-cytosine encodes the amino acid aspartic acid, and the triplet cytosine-uracil-guanine encodes the amino acid leucine. As shown in Figure 2.5, it is only ever possible to infer the amino acid unambiguously from the codon, but not a single codon from the amino acid, since for most amino acids the third position in the triplet can be occupied by several different bases. This is called genetic redundancy and allows, within certain limits, errors to be tolerated in the synthesis of proteins. At the same time, the coding of amino acids by triplets drawn from four nucleotides results in 4³ = 64 possible combinations, far more than the 20 amino acids.
Fig. 2.5 The genetic code (assignment of amino acids to coding triplets). (Source: Mouagip/Wikipedia – public domain)
The fact that there is only this single genetic alphabet is called the basic law of molecular biology. “The coding of information in the genetic molecular systems of information storage can be called a molecular language” (Küppers, 1986, p. 174). It could also be called the first basic genetic law, and we will immediately see that there is a second basic genetic law.
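The arithmetic of the code and its redundancy can be checked with a short Python sketch. It builds the standard genetic code from the widely used compact notation in which the bases are ordered U, C, A, G and each amino acid is given by its one-letter symbol (`*` marks a stop signal). The variable names are illustrative assumptions, not taken from the sources cited here.

```python
from collections import defaultdict
from itertools import product
from math import log2

# Standard genetic code in compact form: first base varies slowest,
# third base fastest, each in the order U, C, A, G. "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse direction: one amino acid may be reachable from several codons.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(len(CODON_TABLE))                # 64 codons, i.e. 4 ** 3
print(len(set(AMINO_ACIDS) - {"*"}))   # 20 amino acids
print(CODON_TABLE["CUG"])              # L (leucine): codon -> amino acid is unique
print(codons_for["L"])                 # six codons: amino acid -> codon is not
print(log2(4), log2(64))               # 2.0 bits per nucleotide, 6.0 per triplet
```

Looking up a codon always yields exactly one entry, whereas the reverse lookup for leucine returns six candidates. This is the asymmetry the text describes: a protein sequence can be inferred from the nucleotide sequence, but not the other way round.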
The genetic code is not only variable and flexible, but also extremely stable. It is assumed that the genetic code has evolved and been optimized in the course of evolution, in and with the development of living beings.
Model calculations indicate that the genetic code cannot be a product of chance but has been optimized by selection processes […] Computer simulations show the insensitivity to errors of the contemporaneous genetic code, because it resisted errors (in the model calculations) better than a million other codons. (Rauchfuß, 2005, p. 258)
However, there are also opinions that assume that the genetic code has not undergone any changes: “Biologists currently believe that the genetic code is frozen or no longer subject to evolutionary change” (Gatlin, 1972, p. 117). The biologist Lila Gatlin sees it differently:
I am skeptical of this view […]. The genetic code is a small subroutine of a master program which directs the machinery of life. We have no idea what the language of this master program is like, but we can be sure that it has always evolved, is now evolving and will continue to evolve in the future. (Gatlin, 1972, p. 118)
The problem behind this different assessment is obvious: Since the origin of living beings, the genetic code itself has remained unchanged. Thus, the basis of replication of life seems to be stable over billions of years, while living beings are permanently evolving due to that very flexibility of mutation and selection which is anchored in the “basic genetic law”. If we understand Gatlin correctly, the basic genetic law seems to be only a subunit, a secondary process of life, but not the “principle” itself, the master program of life. Possibly this is precisely the principle of self-replication.
When cells replicate, that is, whenever cells divide and growth or multiplication occurs, the genetic information must be copied exactly so that the same genetic information is present in the new cell. We have already learned about this process above as replication. It involves unwinding the DNA double helix and copying the two strands. This process is based on the fixed pairing of the nucleotides; only in this way can identical new strands of DNA be synthesized and information be passed on one-to-one to new cells and subsequent generations. Equally relevant is the reading, and thus the transcription, of information. Just as copying a book does not yet mean that the copyist reads, understands, and implements the content, replication does not yet result in any real use of the information encoded in the triplets. Only when the book is read, the letters are recognized, and the sentences formed from them are understood have data become action-relevant information. This is exactly what happens in the processes of transcription and translation, the second important molecular genetic process. In translation, the information laid down in the codons is converted into concrete cell building blocks, the proteins. “Translation means the transfer of genetic information in a sequence of amino acids and thus the formation of proteins” (Haeseler & Liebers, 2003, p. 13). This is because the information for the production (synthesis) of functional substances lies in the genes, where it is stored in triplets of three nucleotides each. The proteins produced can perform a wide variety of tasks: They can be catalysts for biochemical processes and sequences, they can be components for building tissue, or they can themselves be starting materials from which various products are manufactured in the cell.
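The copying step can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions: a double helix is represented as a pair of strings, the helper names `complement` and `replicate` are hypothetical, and enzymes, strand orientation and copying errors are left out.

```python
# Simplified sketch of replication: each old strand serves as the template
# for a new partner strand. Enzymes, orientation and errors are ignored.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the strand dictated by the fixed pairing of the nucleotides."""
    return "".join(PAIRS[base] for base in strand)


def replicate(double_helix):
    """Separate the two strands and complete each one to a new double helix."""
    strand_1, strand_2 = double_helix
    daughter_a = (strand_1, complement(strand_1))  # old strand 1 + new partner
    daughter_b = (complement(strand_2), strand_2)  # new partner + old strand 2
    return daughter_a, daughter_b


parent = ("ATGGCA", complement("ATGGCA"))
daughter_a, daughter_b = replicate(parent)
print(parent)                                        # ('ATGGCA', 'TACCGT')
print(daughter_a == parent and daughter_b == parent)  # True
```

Both daughter helices carry the same sequence as the parent, yet nothing in this code "reads" the triplets. In the book analogy used above, this is the copyist at work, not the reader.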
Nucleotides thus have two dimensions, one structural and one functional. In the construction of DNA and RNA and in the formation of a double helix, the structural dimension is important. There, the specific and invariable base pairings adenine-thymine and guanine-cytosine are structurally constituent, thus determining the structure of the molecule. During the copying process of the DNA, for example during cell division or reproduction, the structural dimension is also important for the arrangement of the base pairs so that the genetic information can be copied correctly. Similarly, in the copying process of DNA into an mRNA, a transport form of hereditary information, the structural dimension is relevant, since here, too, it is a question of a correct copy of information onto a “counter strand”, but not yet of the application of the same. The functional dimension of the special system of base pairs, which is identical in all living beings, only comes into play when the information encoded in the triplets is read and translated, in the process of translation. The combination of three nucleotides determines an amino acid. This sequence is relevant in terms of information function because a protein obtains its specific properties from the particular sequence of the amino acids from which it is built. This process of translation, i.e., the conversion of genetic information (encoded in the sequence of bases in the triplets) into material structures, is responsible for the production of proteins, the building blocks for metabolism and the construction of structures and processes. Transcription and translation are closely related and are based on the dimensions of function and structure of DNA and RNA. Transcription is the first step in protein biosynthesis. In this process, the genetic information of the DNA is transferred to a special form of RNA, the mRNA. In eukaryotes (organisms with a cell nucleus), this transfer takes
place in the cell nucleus; in organisms without a cell nucleus (the so-called prokaryotes, for example bacteria), it takes place in the free cytoplasm. This copying is only possible due to the functionally and structurally unique assignments of the base pairs. DNA is thus much more than a molecule for the preferential storage of information and data; it is “a document of evolutionary history” (Zuckerkandl and Pauling, cited in Haeseler & Liebers, 2003, p. 4). In replication as well as in transcription and translation, structural conditions and the functional dimension are congenially linked. This is precisely what makes these basic biological processes so fascinating; indeed, they can be described as the biological dogma par excellence, which applies to all living things, but also to viruses and probably to the various precursors of living things. This is where the question arises as to whether the existence of the second genetic law as a molecular-biological dogma is already a constitutive feature of life (we will discuss this question in detail in Sect. 2.3 “What Is Life? Definitions and Emergence”), since we also observe systems that are inanimate and nevertheless have the ability to replicate molecules identically via structural and functional groups. At the same time, the extremely important question arises as to which form of storage of (hereditary) information could have been the older and thus the more original, DNA or RNA. There are many arguments in favor of RNA as the first storage medium in evolution. It is simpler than DNA and thus easier and faster to synthesize than the complicated double helix of DNA. It can therefore be assumed that DNA – which today functions as an information carrier in almost all living organisms – only developed later as a central information carrier. Many viruses, for example, store their genetic information in the form of “simple”
RNA. RNA, however, is a rather unstable molecule and does not last very long. It decays after a few minutes, whereas DNA (in the laboratory) can be stored for much longer. DNA is thought to have a lifespan of around 100,000 years under good preservative conditions. “RNA is a very unstable molecule that is degraded by water or oxygen” (Haeseler & Liebers 2003, p. 115). Because it is so unstable, RNA is not suitable for permanent information storage. This could be one reason why certain viruses, the retroviruses, quickly have the genetic information of their RNA converted into DNA in host cells and anchor it there. In any case, the question of the primary storage medium for the blueprints of organisms and their predecessor systems is a central informational issue in the origin of life.
2.3 What Is Life? Definitions and Emergence
According to Darwin’s Origin of Species, it is not the most intellectual of the species that survives; it is not the strongest that survives; but the species that survives is the one that is able best to adapt and adjust to the changing environment in which it finds itself. (Megginson, 1963)
2.3.1 Definitions of Life
Whoever deals with the subject of the coding of information and the early forms of the living and its predecessors, and whoever wants to know which mechanisms of action viruses use for the storage of their hereditary information, cannot avoid the question of what life means. We must and may not only ask this question in our context, but we will also try to answer it. Admittedly, not in the most general,
generic philosophical sense (that is not what this book is intended for), but against the background of our hypothesis that viruses and technical computer viruses are in many respects cut from the same cloth. Therefore, when we ask what life is, we will have to adopt a perspective that looks at viruses and computer viruses from the point of view of “information”. At first it may seem strange to treat the topic of “life” and its definition against the background of the similarity of viruses and computer viruses. However, we will see that this is quite justified, since viruses show characteristics of dead matter, while computer viruses show characteristics of “living matter”, such as the ability to reproduce. Obviously, viruses, like computer viruses, are at home in both worlds. In the technical literature, there is no consensus as to whether viruses are living organisms or represent another form of organized life, whether they mark the transition between dead matter and “real” life or represent a special form of evolution of microorganisms, and perhaps have even chosen a special path and are thus in parts more highly developed (and younger) than “normal” microorganisms.
In general, we define life as a set of properties that must all (or at least predominantly) be present for us to call an organism or structure “alive”. Sven P. Thoms, for example, speaks of the “eight pillars of life: compartmentalization, energy metabolism, catalyst, regulation, growth, program, reproduction, adaptation” (Thoms, 2005, p. 7). The paramecium in Fig. 2.6 is an example of a simple single-celled organism.
Fig. 2.6 Paramecium (a ciliate) as a simple unicellular organism with cell membrane and compartmentalization. Labels in the figure: anterior, posterior, membrane, trichocysts, cilia, anterior and posterior contractile vacuole, cytoplasm, small nucleus (micronucleus), large nucleus (macronucleus), food vacuoles, formation of a food vacuole, mouth funnel, cell mouth, buccal cavity, cell anus. (Source: Udaix/Shutterstock (own translation))
Compartmentalization means the division of a structure or cell into different functional areas, as we know it from unicellular microorganisms, or into structurally and spatially segregated functional areas (cell organelles), as can be seen in the cells of more highly developed organisms, the eukaryotes. Here, the various functions of the cell take place in different cell areas (compartments), which are structurally and functionally separated from each other by membranes. The cell nucleus, for example, contains the
hereditary information (stored in DNA). Division and duplication of the genetic information take place in it. Another example is the mitochondria, highly specialized cell organelles with their own genetic material, which are separated from the rest of the cell by a double membrane and are responsible for energy metabolism. In eukaryotes, compartmentalization ensures that enzymes and building materials remain in the cell, while waste products can be transported out of the cell. Membranes provide the boundary to the outside, i.e., the boundary between inside and outside. In unicellular organisms without their own cell organelles and without a cell nucleus, the so-called prokaryotes, which include bacteria and archaea, the various (physiological and genetic) activities take place in the free cytoplasm without a demarcated compartmentalization. Schreiber also defines the minimum requirement of life as a compartment, such as a cell or cell-like delimited structure, within which an information store exists, and metabolism can take place for the exchange of energy and matter (2019). The presence of an energy metabolism is also a characteristic of living systems. This refers to the fact that a cell (or a precellular structure) can independently meet its energy needs and metabolic processes are present and occur that extract energy from organic or inorganic substances. This is used to build molecules and structural proteins or is converted into kinetic energy – for example, in the flagellate, into the movement of the flagella for active locomotion. The presence of catalysts is also a feature of life, according to Thoms (2005). These are special molecules that enable or accelerate (chemical-biological) reactions. In more highly developed organisms, there is a whole range of special catalysts that initiate, support, or accelerate reactive processes. They are also referred to as enzymes. 
They usually cause very specific reactions and have established themselves over billions of years in biological evolution,
alongside the basic genetic law, in a very clear standard configuration across almost all living things and forms as the valid basic equipment of living things. When we speak of regulation as a characteristic of living things, we mean the possibilities to control processes, buildup and degradation procedures, and sequences. This can be done by chemical substances, by structural changes (for example barriers) or – in higher organisms – by neuronal activities, which are themselves chemically conditioned and become chemically active. We would be hard pressed to define as living a structure that develops or grows completely unregulated and randomly (such as stalactites or stalagmites, the dripstones found in caves). The criterion “growth” alone, on the other hand, is not sufficient for defining life; other criteria must be used as well. To stay with the above example, even a stalactite or stalagmite shows growth without being a living thing. At the same time, no biological organism exists that – regardless of whether it has come into being by sexual or asexual reproduction – would manage without growth. A central topic is the criterion of the “program” mentioned by Thoms. It comes quite close to our question because it means the “blueprint” as the basis of regulation and reproduction. For all processes that take place in an organism, there is a schedule. Otherwise, the processes would be random and unregulated, which would be biologically (and in the understanding of life) nonsensical. This program is the genetic blueprint of all living things, which contains the information according to which living things organize themselves and all processes take place. More still: the genetic information is the basis for the fact that living beings can multiply over time, whereby on the one hand the same organism always develops (identical passing on of the hereditary information, thus the “program”), and on
the other hand, through small or large random changes (mutations), an evolution of organisms and of the living is possible. Dead matter and substances have no plan according to which they develop or change: An ice wall grows and melts depending on temperature and humidity, and rocks change through chemical and physical influences, but not according to the rules of a plan or program. Reproduction, again, is a hallmark of life that we must deal with at length when considering viruses and computer viruses. In fact, all biological systems reproduce. But technical systems, such as certain computer programs, can also reproduce and multiply. This brings us to a central question about the nature of viruses. Viruses reproduce, but they cannot do so on their own, only with the help of, and usually with the destruction of, their specific biological host cells. In general, the topic of reproduction is strongly connected with the term replication, a central concept in this book, since the (self-)replication of molecules is the starting point for biological evolution and the emergence of living things from inorganic matter. Chemist Gerald Joyce defines life as “a self-sustaining chemical system that has the capacity for Darwinian evolution” (Schreiber, 2019, p. 7), thus combining the characteristic of reproduction with the aspect of adaptation. “Adaptation” is an evolutionary biological term that is closely related to the two terms reproduction and program. A system can adapt only if its program is not unchanging and rigid but can respond to external influences. Adaptations can refer to the period of the individual life span (especially in higher life forms) or – far more important – to the development of further generations of the organism and thus to the change of a whole species. Thus, organisms (and their entire species) can adapt to environmental changes and remain viable.
According to Dyson, two conditions are necessary for the definition of life: replication and metabolism. According to his theory, life began twice: one system succeeded in metabolism without replication, the other system succeeded in replication without metabolism. Only the fusion of the two systems and thus of both properties could lead to the successful evolution of living things as we know them (after Rauchfuß, 2005). Here, too, we recognize implicit indications of a possible classification of viruses in biological evolution: they do not possess their own metabolism and have their replication carried out via a host cell. Manfred Eigen also combines aspects of the eight pillars for his definition of life:
Life is a system that sustains itself by consuming external energy or food substances through an internal process of component production. It is coupled to the medium via adaptive exchange processes. They outlast the life history of the system. (Rauchfuß, 2005, p. 17)
Thus, for Eigen, self-reproduction, mutation (adaptation), and metabolism are among the characteristics of living things.
2.3.2 The Emergence of Life
The emergence of life from a primordial soup of organic and inorganic material under the conditions of a primordial atmosphere cannot be traced here in all details. There is relevant literature on this subject (Meißner, 2010; Röhrlich, 2012; Schaper, 2004; Thoms, 2005). Figure 2.7 shows a schematic diagram of how the energetic primordial soup might have turned out. For us, the consideration of the origin of life is always done from the point of view of reproduction mechanisms and modes of action of information storage, transmission, and conversion.
Fig. 2.7 Stirred-up, energy-rich primordial soup as the basis for the emergence of the first self-replicating molecules. (Source: Tim Bertelink (2016): “Artist’s impression of the Hadean Eon”, https://commons.wikimedia.org/wiki/File:Hadean.png CC BY-SA 4.0, https://creativecommons.org/licenses/by-sa/4.0/)
Bernd-Olaf Küppers distinguishes three phases in the origin and development of life (Küppers, 1986, pp. 56–57):
• The first phase is a phase of chemical evolution. In a primordial soup the first macromolecules are formed, which can reproduce themselves in the course of time and thus form the basis for self-replicating systems. Thus, although life has not yet emerged, the basis for living systems has been created.
• In phase two, the emergence of life occurs. It is characterized by functional couplings of nucleic acids and proteins, optimized self-replicating systems, and living cells. “This is because the instruction for the construction of a living system requires genetic information, which in turn is tied to the defined sequence of macromolecules” (Küppers, 1986, p. 56). Once bioenergetic mechanisms have been established by autonomous systems, the thermodynamic foundations for the beginning of the archiving of information have been laid.
• Then, in phase three, biological evolution begins with the further development and optimization of all living things, from primitive single-celled organisms to highly developed organisms.
For the questions of the coding of information, phase one is particularly exciting, because here one of the most fundamental steps for the long-term possible and sustainable development of life takes place. Macromolecules that self-replicate already give us clues to later biological replication mechanisms. However, these replications still take place “haphazardly”: such macromolecules do not yet pass on information about their blueprint, about their composition or their functions to subsequent “generations” of macromolecules. This first requires “heritable” and “inherited” information, which contains the blueprint for the formation of further macromolecules. We have already noted in the previous chapter that the complete “reinvention” of each biological individual without a blueprint is impossible in terms of resources and information, especially when organisms have reached a certain level of complexity. The search for the origin of life will have to be based primarily on the fundamental question of how self-replicating systems, whether inanimate or living, give themselves a blueprint to optimize their own identical reproduction and produce identical “successor generations.” “In the period prior to this, a transition from pure physico-chemical to information-driven organic molecule formation must have occurred.” (Schreiber, 2019, p. 6). There are various models of what this primordial world or primordial soup might have looked like. One hypothesis
is the so-called “iron-sulfur world”, another is the so-called “RNA world”. In the iron-sulfur world, anaerobic minerals dominate, which serve as the basis for energy production. For the RNA world, it is assumed that RNA pieces that reproduced themselves were the beginning of life and all later organisms. Both theories have something going for them. However, neither can explain everything conclusively. Probably, a combination of the two models as well as other theories of the prebiotic period comes closest to reality (Rauchfuß, 2005, pp. 99–100). In fact, proteins were initially the focus of research into the origin of life, and it was assumed that they were the first building blocks of life. This is how Shapiro postulates it in his conception of the origin of life (Shapiro, 2009). According to his theory, life began as a metabolic network of reactions. A wide variety of monomers, i.e., low-molecular, unbranched molecules, were involved in them, while the mechanisms of replication – and thus the informational basis – are supposed to have formed only at a later stage of evolution (Rauchfuß, 2005, p. 105). Later, it became clear that it is not the proteins that are the primary functional carriers, but the information carriers DNA and RNA. In their theory of the hypercycle, Manfred Eigen and Ruthild Winkler assume that there are feedback cycles in which RNA gives rise to proteins, which in turn cause the formation of RNA (Eigen et al., 1991). Figure 2.8 illustrates this principle. In this cycle, not only are more and more new RNA and protein molecules formed, but the feedback mechanism also means that small errors in reproduction give rise to ever new forms of RNA and of proteins. These are the first signs of an evolution that is not yet biological but chemical. Accordingly, this is the basis for the emergence and further development of life.
Fig. 2.8 Hypercycle according to Manfred Eigen: RNA gives rise to proteins, which in turn cause the production of RNA. (Source: Own illustration according to Manfred Eigen (simplified))
It is particularly interesting here that the category “information” and its material realization come first: RNA induces the formation of proteins, which then produce RNA again as an information carrier. At this early prebiotic stage, the formation of proteins initially served only to reproduce the information carried by the RNA. Real, functional proteins, as required in the complex physiological processes of a cell, did not yet exist. “Information for its own sake” – this is how one could put it pointedly – is the beginning of life.
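The feedback principle of the hypercycle can be illustrated with a small toy simulation. The following Python sketch is our own simplification and not Eigen's mathematical model: the function names, the error rate, the number of copies per cycle, and the limit on the pool size are all hypothetical assumptions. Each pass through the loop stands for one turn of the cycle, in which an RNA gives rise to a "protein" that in turn copies the RNA, occasionally with a copying error.

```python
import random

ALPHABET = "ACGU"  # the four RNA bases


def copy_with_errors(rna, error_rate, rng):
    """Copy an RNA string; each base is miscopied with probability error_rate."""
    copied = []
    for base in rna:
        if rng.random() < error_rate:
            # A copying error: replace the base with one of the other three.
            copied.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def hypercycle(seed_rna, cycles, copies_per_cycle=2, error_rate=0.01,
               max_pool=200, seed=0):
    """Toy feedback cycle: RNA -> protein -> RNA, repeated `cycles` times.

    The protein step is not modeled in detail; it is represented only by
    the act of copying. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    pool = [seed_rna]
    for _ in range(cycles):
        new_pool = []
        for rna in pool:
            for _ in range(copies_per_cycle):
                new_pool.append(copy_with_errors(rna, error_rate, rng))
        # Limited resources: keep at most max_pool molecules, chosen at random.
        if len(new_pool) > max_pool:
            new_pool = rng.sample(new_pool, max_pool)
        pool = new_pool
    return pool


if __name__ == "__main__":
    molecules = hypercycle("AUGGCUAGCUAGGCUAAUCG", cycles=8, error_rate=0.02)
    print(len(molecules), "molecules,", len(set(molecules)), "distinct sequences")
```

With an error rate of zero the pool consists only of identical copies of the starting sequence; with a small error rate, new variants accumulate from cycle to cycle. This is all the sketch is meant to show: replication plus copying errors already produces variation, without any biological function being involved.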
The hypercycle is based […] on the assumption that the emergence of life is equivalent to the emergence of information-bearing polymers such as DNA or RNA, that is, that information has priority over function. (Thoms, 2005, p. 50)
This primacy of information is still valid today in all living things that evolution has produced during almost four billion years. A look at viruses reveals the same concentration on the pure replication of information structures. Viruses have practically no functional groups and consist essentially of information. On the one hand, this suggests that they arose very early, in connection with the emergence of chemical-biological evolution. On the other hand, because viral activity depends so completely on a biological host, one could assume that the system “virus” emerged only after the biological differentiation and development of living beings had taken place. Thus, in a way, viruses are at the interface, they are either dead or alive or both. However, I do not see a point there, a singularity, but a continuum, a gradual flowing transition from individual biomolecules to the cell. (Mölling, 2015, p. 26)
One of our basic concerns in this book is to clarify whether the existence of pure replication processes for identical copying of blueprint information is already a central criterion for life. As we have seen, there are apparently already macromolecules that can self-replicate, such as pieces of RNA, whose “read-out and read-off” then allows proteins to be synthesized, which in turn produce pieces of RNA (hypercycle). If there are then changes in the information content of the RNA due to random errors, we already see elements
of evolution with self-replication on the one hand and mutations on the other, and thus many important criteria for early precursors of life. If we think this system further, we quickly arrive at self-replicating computer programs that can also “reprogram” themselves by chance. The basic question, what we want to admit as criteria for life and where we want to draw a (definitional or factual) border, is decided at the two extreme poles: the early phase of the emergence of life on the one hand and its technical “overcoming” by computer programs on the other hand. Viruses, at any rate, we will always have to locate in these border areas. On an even higher level of aggregation in the discussion about the origin of life, the structure of evolution and the classification of viruses (which we can only touch upon here), there is the question of meaning. Science, however, can neither create sense nor explain sense. This applies also and especially to the subject of life. Max Weber explains it like this: Who today still believes that findings of astronomy or biology or physics or chemistry can teach us something about the meaning of the world, or even something about the way in which one could trace such a meaning – if it exists? (Weber, 1919, p. 518)
The question of meaning leads to a particular difficulty especially in the case of lower organisms or even prebiotic systems or viruses. Darwin answered the question of meaning by defining reproduction as an act to spread one’s genes as widely as possible as the “meaning of life”. Here we still must distinguish between the focus on life itself, on one’s own (biological) species, on a special population or even on the individual. Here, too, we encounter major classification problems, especially in the case of lower organisms, but above all in the case of prebiotic systems and viruses.
2.4 Are Viruses Living Things? Viruses and the Early Genetics of the RNA World

Must in the nature view
Always respect one like everything;
Nothing is inside, nothing is outside:
For what is inside, that is outside.
So seize without delay
Sacred public mystery.
(Epirrhema, Goethe)
Those who speak of viruses in 2021 think first and foremost of the Corona pandemic, which was triggered by SARS-CoV-2. In Fig. 2.9 we see an electron micrograph of the
Fig. 2.9 The SARS-CoV-2 (electron micrograph). (Source: Alissa Eckert, MS; Dan Higgins, MAM – public domain)
virus. SARS-CoV-2 is the talk of the town, and both professional and unprofessional virologists focus their work almost exclusively on its study. This is due not only to the occasionally fatal course taken by the disease caused by SARS-CoV-2 (COVID-19), but also to the large amounts of additional funding being invested in SARS-CoV-2 research. This easily gives the public the impression that viruses are always a threat to humanity, while a virus-free world is the most desirable thing. In fact, many serious, sometimes fatal, diseases are transmitted by viruses. Everyone knows the flu viruses (influenza), dengue fever caused by viruses, or the Ebola virus and the severe illness it causes. More than half a billion people are believed to have died from the smallpox virus alone in the twentieth century (Schmitt, 2020). However, viruses are also fascinating systems. They are probably as old as the oldest microorganisms and have evolved in parallel and along with living things ever since. There is a plethora of viruses (10³³); their number is one hundred times greater than the number of existing bacteria (10³¹) and greater than the number of stars in the entire universe, which is estimated at 10²⁵. So, viruses are omnipresent in terms of quantity alone. However, there are far fewer “species” [4]: only about 3000, while there are about 1.8 million potential host species among extant organisms. [Footnote 4: The term “species” is, strictly speaking, incorrect here because it applies only to living beings; here it denotes different virus forms.] Only a vanishingly small proportion of viruses are infectious in the sense of causing disease in humans. The vast majority are neutral in relation to humans and thus have no effect in either a positive or negative sense. And the good news is that a large proportion of viruses specialize in bacteria (so-called bacteriophages) and thus help to keep the
mass spread of bacteria in check. The ecological system of supply, demand and equilibrium also and especially applies in the field of microorganisms. Without realizing it, humanity benefits more from viruses than it is threatened by them: “Viruses and humans have entered into a predominantly peaceful coexistence” (Mölling, 2015, p. 13). Viruses are thus old acquaintances and companions in the evolution of life since its emergence some 3.6 billion years ago. Viruses are tiny; their classification in the size relationships of microorganisms is explained in detail in Sect. 2.1. Viruses are multiform. They exist with shells, without shells, with membrane and without membrane, in rod form or as an icosahedron (a solid with 20 equilateral triangular faces). We cannot present the systematics of viruses in detail in this book, but only provide a brief overview: Their classification is based on different criteria. Viruses can be classified according to their respective hosts, for example so-called bacteriophages (viruses that only infect bacteria), animal viruses, insect viruses, plant viruses or viruses that only (or also) infect humans. However, they can also be classified according to where they replicate in the respective host cell, for example in the cytoplasm of a cell or in the cell nucleus. Viruses have a much simpler structure than true microorganisms, such as bacteria. Figure 2.10 shows the schematic structure of a virus. A virus essentially consists of its genetic material, which is present as a simple RNA or DNA thread and is surrounded by a protein envelope, the capsid. Some viruses also have a double lipid layer as an envelope, as does the coronavirus. Viruses do not have their own proteins, cytoplasm, or metabolism. They are not even able to duplicate their own genetic material. To do so, they need a host, or more precisely a host cell, into which the virus introduces its (usually short) genetic information, the viral
[Figure: Structure of the influenza virus. Labels: neuraminidase, hemagglutinin, protein shell (capsid), nucleotide (RNA), viral envelope (lipid layer), ion channel]
Fig. 2.10 Schematic structure of a virus, here influenza. (Source: Designua/Shutterstock (own translation))
RNA. This RNA is then incorporated into the host cell’s genome and is read and translated by the mechanisms of transcription and translation. In the process, new, identical viral RNA is produced, and the few proteins that the virus needs to build its envelope are formed. While still in the host cell, the individual parts (the viral RNA and the protein and/or lipid envelope) assemble to form the new virus, which then either leaves the host cell by the pinching off of a vesicle (exocytosis) or bursts the host cell, thus releasing the new viruses (lysis). From a medical point of view, viruses thus belong to the obligate intracellular parasites. Viruses are specialized. They can only infect certain organisms and even only certain cells or cell types of an organism. The reason for this is their specific receptors, the sites where the virus attaches to cells to inject its genetic
information into the cell. During infection, viruses attach to the host cell membrane (adsorption), penetrate the host cell (penetration), and release their nucleic acids (RNA or DNA) into the host cell, where they are transported to those sites where genetic information is read, such as the nucleus in eukaryotes or free DNA in bacteria. After replication of the viral genetic information, transcription and translation give rise to the corresponding viral proteins. When enough viral protein has been synthesized, the proteins and the viral genetic information (RNA or DNA) assemble in such a way that a new virus is formed. This process is also known as “self-assembly”: it happens practically by itself, without the intervention of auxiliary structures or mechanisms. This process is not yet fully understood. With the release and often the destruction of the cell during lysis, the reproduction process of a virus is finished. Thousands of new viruses are created from the injection of one viral RNA.
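The reproduction cycle just described can be summarized as a sequence of stages, and the closing remark about thousands of new viruses can be made concrete with a little arithmetic. The following Python sketch is only an illustration: the stage names follow the text, but the burst size of 1000 and the assumption that every released virus infects a fresh host cell are hypothetical simplifications of ours, not measured values.

```python
# Stages of viral reproduction in the order given in the text.
INFECTION_STAGES = (
    "adsorption",
    "penetration",
    "release of nucleic acid",
    "replication",
    "transcription and translation",
    "self-assembly",
    "release (exocytosis or lysis)",
)


def virions_after(rounds, burst_size=1000):
    """Number of viruses after a given number of infection rounds.

    Hypothetical assumptions: one virus starts the chain, every infected
    cell releases `burst_size` new viruses, and each of them finds a
    fresh host cell.
    """
    if rounds < 0 or burst_size < 0:
        raise ValueError("rounds and burst_size must be non-negative")
    return burst_size ** rounds


if __name__ == "__main__":
    for stage_number, stage in enumerate(INFECTION_STAGES, start=1):
        print(stage_number, stage)
    for rounds in range(4):
        print(rounds, "round(s):", virions_after(rounds), "viruses")
```

Under these idealized assumptions, three rounds of infection already turn a single virus into a billion. In reality, the supply of suitable host cells and the host's defenses limit this growth.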
2.4.1 The Genetics of Viruses

In this section, we will take a closer look at the genetics of viruses. This is of particular importance because the genetics of viruses not only shows their position in relation to the realm of living beings, but also allows us to discuss whether the basic genetic rules and principles apply only to living beings or can also be effective in the realm of inanimate particles, such as viruses. To speak of genetic principles and processes in non-living systems is perhaps at first counterintuitive. We usually locate genetics and its underlying molecular mechanisms and principles clearly in the realm of living things. But this is clearly not the case with viruses, as we have already learned about the mechanisms of infection and replication of
viruses in their respective host cells. Not only are the same terms used, such as nucleic acids, RNA, DNA, but the same mechanisms of replication also exist. In fact, viruses have very similar, and in some cases even identical, structures and procedures for encoding genetic information as living organisms. Here, too, the terminological peculiarity should be pointed out, since the term “hereditary information” indicates that information passes from one generation to the next. In our understanding, however, this is in principle limited to living beings – we do not speak of the “hereditary information” of a computer or a software program either. What should concern us more than the linguistic analogies, however, are the exciting questions of how close viruses are to living organisms, how they encode and pass on their information, and how they evolve and optimize themselves in the sense of evolution. We can let this problem culminate in the question of the extent to which computer programs can also pass on information about themselves, replicate, reproduce and optimize themselves. For this we need to know what information coding looks like in the technical-constructive world (Chap. 3). These issues are linked to the question of whether the known mechanisms of information storage and transmission, as they have existed in living beings for 3.6 billion years, are a criterion for “life” or not. Viruses have a genetic makeup similar to that of living organisms. The genetic information is stored in nucleic acids, which occur in the form of RNA or DNA and exist as a single strand or as a double strand. However, they are not replicated in the virus itself, nor are the (few) proteins encoded in them synthesized in the virus. Rather, the virus releases its genetic information into the host cell, where it is incorporated into the genome of the host cell, replicated,
and read. Only in the host cell and with the help of the host’s own synthesis mechanisms are the viral proteins, especially the viral envelope, built up and then assembled into a new virus in the self-assembly process while still in the cytoplasm of the host cell. The genetic “text” of a medium-sized virus is not particularly extensive, however, because it does not need to contain information on enzymes and other proteins required, for example, for the assembly of cell structures, or “instructions” for the processes of replication and translation – the host cell does this for the virus. The text – we use this term here for the genetic code – consists of around 30,000–40,000 “letters” (nucleotides), which are read in groups of three, the triplets; the genome of SARS-CoV-2 comprises around 30,000 nucleotides with information for 29 proteins, the function of some of which is still unknown. The human genome has also been decoded. The paper reporting this, published in the journal Nature in 2001, is considered by many to be the most important of our century: “Initial sequencing and analysis of the human genome” (Lander et al., 2001). To mark the occasion, the Frankfurter Allgemeine Zeitung printed several pages with the letters of the four nucleotides adenine (A), thymine (T), guanine (G), and cytosine (C) and proclaimed the genomic age in succession to the atomic age (FAZ, 2001). However, the results of the decoding of the human genome turned out to be consistently sobering – in terms of the expectation of size and quantity. The human genome has about 3.2 billion nucleotides, and we also find viral genes in it. Most of this, however, is non-coding material; only 3% of the approximately 3 billion base pairs contain information that is read. The vast majority, i.e., 97%, are multiple repeats or segments whose function is unknown or that have no function at all. Obviously, we carry around in every cell of our body a huge amount of genetic material that has “accumulated” during evolution but is no longer
needed and has no function. The fact that there are also viral genes in it already shows that the viral genome has also developed a special relationship to living beings during billions of years of evolutionary history: […] our genes are not unique to humans, but our genome consists of a potpourri of genes from bacteria, archaea, viruses, fungi, and nonsense, garbage – also called junk. (Mölling, 2015, p. 158)
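The orders of magnitude mentioned in this section can be checked with a few lines of arithmetic. The following Python sketch uses only the rounded figures quoted in the text (3.2 billion nucleotides, 3% of which are read, and about 30,000 nucleotides for SARS-CoV-2); the function name and the short example sequence are our own, purely illustrative choices.

```python
def split_into_triplets(sequence):
    """Split a nucleotide string into complete triplets.

    One or two leftover letters at the end are dropped, because they do
    not form a complete triplet.
    """
    usable_length = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable_length, 3)]


# Rounded figures as quoted in the text.
HUMAN_GENOME_NUCLEOTIDES = 3_200_000_000
CODING_PERCENT = 3
SARS_COV_2_NUCLEOTIDES = 30_000

# Nucleotides of the human genome that are actually read (3 %).
human_coding_nucleotides = HUMAN_GENOME_NUCLEOTIDES * CODING_PERCENT // 100

# Upper bound on the number of triplets in the SARS-CoV-2 genome.
sars_cov_2_triplets = SARS_COV_2_NUCLEOTIDES // 3

if __name__ == "__main__":
    print(split_into_triplets("AUGGCUUAAG"))
    print("read nucleotides in the human genome:", human_coding_nucleotides)
    print("triplets in SARS-CoV-2 (at most):", sars_cov_2_triplets)
    print("size ratio human : SARS-CoV-2:",
          HUMAN_GENOME_NUCLEOTIDES // SARS_COV_2_NUCLEOTIDES)
```

With these figures, about 96 million nucleotides of the human genome are read, and the human genome is roughly one hundred thousand times longer than that of SARS-CoV-2.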
About half of the human genome consists of viral DNA, since gene transfers (gene migrations) are part of evolution. We can already note at this point that this statement must be given a special interpretation against the background of information theory and will lead to many questions. We have now learned that viruses encode their blueprint in the form of nucleic acids, but do not have their own means of translation and transcription; for this they use the structures of the host cells. However, they have mastered the “language” of the genetics of living organisms, which they reprogram as parasites to carry out their own reproduction and thus to create new viruses. Terminologically, we are on thin ice here, because parasites are living beings that exploit other living beings for their own benefit. We do not normally refer to inanimate matter as parasites. The reading of the genetic information of viruses is completely outsourced. Viruses have their host cells perform not only translation but even the most primal genetic process: replicating their own genetic information for transmission to subsequent generations. The question of the meaning of the complete outsourcing of the genetic processes (in industry one would speak of an “extremely low vertical range of manufacture”) cannot be asked in this context.
2.4.2 The Relation of Viruses to the Kingdom of Living Beings and Their Classification

Viruses are particles which are generally not considered to be living beings. They could be called part-time living organisms. They infect cells and can reproduce only with their help. It is completely unclear from when they appeared in the world of living beings and how they evolved in parallel with cells. Are they former bacteria whose ability to reproduce was lost and which therefore needed the support of other cells? Or are they RNA and DNA strands that emerged from their host and became self-sufficient? (Schreiber, 2019, p. 247)
These central questions characterize the dilemma in classifying viruses. At the same time, however, they also show the fascination that emanates from the system of viruses because they master or apply mechanisms that are principles of life. This is because the genetic code is the same in all living things and in viruses and is based on the four nucleobases adenine, cytosine, guanine, and thymine (the latter replaced by uracil in RNA). Viruses thus stand at the transition from the first biomolecules to the origin of life. They help to understand how life might have arisen, and at the same time are an integral part of this vast and fascinating process. Viruses could thus be an intermediate stage between the inorganic world and life. In the simplest case, they need a host to replicate their own RNA: viruses do not build it themselves but have a host cell produce it for them. “Viruses can be thought of as successful genes that perfectly embody the primacy of replication” (Thoms, 2005, p. 88). This, in turn, could mean that viruses either represent very primordial forms of a self-replicating, life-like system, or have regressed from more complex life forms.
Here we already come to the central questions about the descent and evolution of viruses: Are viruses precursors of cells or are they the maximum reduction of life to its very last and at the same time very first sense, namely the reproduction (replication) of their genetic information? Or are viruses highly evolved systems that have optimized themselves in parallel with their hosts during co-evolution? Viruses can also be seen as genes that have “sprung” or “escaped” from cells of living organisms, have become independent and, in maximum reduction, replicate only themselves. [Footnote 5: Here lies a commonality with computer viruses, whose “program sense” is initially “only” self-replication.] Whether this maximal reduction from the complexity of a living cell to the simplicity of a dead virus and thus to a single “purpose” would qualify as a successful strategy would have to be discussed. In fact, a wide variety of strategies exist in the evolution of living organisms. They range from building up highest complexity (insects, mammals) to reduction to greatest simplicity (unicellular organisms). Is there an optimum between simplicity and complexity? Examples would be viruses (only RNA pieces) and mammals (highly complex, conscious, partly self-aware systems). There are three basic hypotheses for the emergence of viruses. The first hypothesis assumes that viruses evolved and adapted together with organisms in a kind of co-evolution. The high host specificity of viruses could be evidence for this. Viruses would then have evolved from self-replicating pieces of RNA that let their replication and the construction of an envelope be carried out entirely by the respective host organisms. Viral systems would then be nothing other than dead matter that reproduces itself in living organisms. They would have arisen then at least in their original form
already (long) before living beings, in the primordial soup of the early Earth. Their genetic RNA equipment would then be an indication of their existence in a “pre-DNA world”. Essentially, this hypothesis is based on the idea that locates the emergence of viruses at the transition of biomolecules to the first cell, i.e., the emergence of the first life. Early on, free pieces of RNA were able to self-replicate, it is hypothesized, and virus-like forms and systems then emerged from this. At the same time, the RNA molecules of the early self-replicating systems have evolved permanently, so that viruses both contain an optimized genetic material and represent evolved systems and are by no means simple structures. Viruses and their hosts do not live in a symbiotic relationship from which both partners derive benefits. Nevertheless, viruses likely represent the genetic evolution of their host organisms, as they may have evolved in a kind of co-evolution with and in parallel to their hosts. “Viruses are packages of genes and genetic elements with millions of years of biological experience” (Doerfler, 1996, p. 51). The second hypothesis assumes that viruses are degenerate (or reduced) systems that may have originally evolved from simple living organisms (such as bacteria) and have scaled back their activities by “reducing them to the essentials,” namely, replicating their genetic information. This interpretation implies a combination of freedom and dependence for the status of the virus: on the one hand, viruses “free” themselves from the “ballast” of everything cellular, degenerate and “focus” solely on genetic reproduction. On the other hand, they are maximally dependent on their respective hosts, without which they would not exist. At the same time, this hypothesis implies that viruses are descended from living beings or would even still have the status of a living being (even if maximally degenerated). The basic genetic law with its universal coding rules and
functions would then remain essentially limited to living beings. The third hypothesis also assumes that viruses could have originated from living beings. However, not as degenerated, “independent” parasites, but as spin-offs of free DNA and RNA pieces, i.e., genetic material that has reorganized itself outside the original cell and exists there but is dependent on a host cell for its reproduction and replication. Again, although with a different focus than the first hypothesis, the underlying assumption is that viruses originated in living organisms, or at least originated in a very strong connection with them. However, in hypothesis three, it is “only” functional parts of an original cell (namely parts of the genome) that have become independent and have developed into a system of their own that can replicate only with the help of host organisms. Again, the “purpose” of the virus would be only its original function, necessary for the cell, to replicate the genetic material. Now outside the cell – as a virus and without a purpose for the function of a cell in the organism – this mechanism would, however, be completely devoid of meaning; quite like the first hypothesis: here molecules (just the nucleotides) replicate without any purpose connection with a living being. The difficulties in recapitulating virus evolution also lie in the instability of ribonucleic acid. It is difficult to imagine that viruses and their genes could have evolved safely and stably from RNA, since RNA is an unstable molecule that decays rapidly. Viral RNA survives in a cell for only a few minutes, whereas deoxyribonucleic acid is much more long-lived. It is hard to imagine that in prehistoric times RNA could have survived for long under the special conditions of high UV exposure. “This is because long RNA strands in particular break down into small segments more quickly than assembly occurs” (Schreiber, 2019, p. 56). Therefore, research suspects the development of the
first self-replicating RNA systems, for example, in deep layers of the sea where there is good UV protection. Sites where hot rock lava shoots out of the earth’s interior into the sea (so-called “black smokers”) could have been places for the emergence and development of such self-replicating systems (and early organisms, such as those of the archaea) because of the sulfurous environment, good protection from UV radiation, and elevated temperature. In the initial theories of the origin of life, it was assumed until the 1980s that proteins were the most primordial molecules and thus the precursor structures of living things. In these theories, one also speaks of the “protein world”. However, the insight has prevailed that proteins are already highly complex organic molecules, which are also not easily suitable as a basis for the central “replication question”, which is relevant for the origin of the living and the underlying coding of its information. With RNA as the basic molecule, one had come much closer to the “replication molecule”, especially since RNA fulfills almost all the conditions that must be imposed on a class of substances for the transmission of information. In addition, it fulfills the dual function of information coding and transmission on the one hand and catalytic properties necessary for reading information on the other. “RNA is at the transition from chemistry to biology, software and hardware at the same time” (Mölling, 2015, p. 185). The “iron-sulfur world” hypothesis also seeks an explanation for the formation of the first biomolecules and competes with the protein world and RNA world theories. It assumes that small biomolecules can form faster and easier on the protected surface of minerals than in the free primordial soup because the mineral surfaces facilitated catalysis, resulting in the formation of first biomolecules from inorganic molecules (Weitze, 2008).
However, the theory of the so-called RNA world is considered much more likely to explain the origin and beginning of biomolecules, their self-replication, and the emergence of the first cellular organisms. In the current discussion about the origins of life, the RNA world has gained the far greater importance and publicity compared to the protein world. The model of the RNA world goes back to Nobel Prize winner Walter Gilbert (Rauchfuß, 2005, p. 177). If one looks at the early world of replicating systems and the resulting emergence of living things and makes use of the RNA world hypothesis, one presupposes that RNA was the only coding material. Open questions of the RNA world exist not only regarding the instability of the molecule, but also with regard to the information content of the bases in the nucleotides. The information content of a random base sequence in an RNA is not usable if there is no connection to an information system. This system requires an exact information mapping between RNA, tRNA and associated synthetases. (Schreiber, 2019, p. 128)
In his book, Schreiber therefore develops an alternative explanatory model about the origin of living things, combining ideas from the competing models of the iron-sulfur world, the RNA world, and the protein world. In doing so, the author places an emphasis on hydrothermal chemistry at the bottom of the ocean, i.e., the warm springs: […] the hydrothermal chemistry in the fault zones of the continents provides the starting materials for the formation of nucleotides, which are linked to form RNA strands […]. (Schreiber, 2019, p. 128)
The pH values prevailing there provide the necessary conditions for the formation of an RNA chain. However, a
double-strand formation of the RNA, which then led out of the RNA world and into a DNA world, is prevented by the external conditions on the seafloor. At the same time, however, the RNA is protected there from the destruction of the molecules by UV light, and ionizing radiation from the surrounding rocks leads to mutations, adaptations, and thus to an evolution of the systems (Schreiber, 2019, p. 222). The emergence of life from the inorganic world, and with it the question of how to explain and classify the status of viruses, remains fascinating. Even though there are a wide variety of hypotheses about the origin of viruses as a “dwindling stage” of living things or else as a precursor to them, the fact that viruses possess the basic genetic law, use the canon of nucleotide coding of their genetic information, and can reproduce exclusively in the presence of and using a very specific host always brings them close to living beings without fulfilling a definition of life in its entirety. The existence of so-called viroids could prove to be another link in the lineage of origin of viruses from free RNA molecules. Viroids are viruses without proteins, “because their RNA is ‘non-coding,’ non-sense-bearing, non-programming for amino acids and proteins” (Mölling, 2015, p. 184). Not only are they orders of magnitude smaller than viruses, but they consist solely of one strand of RNA and can be reproduced in the host cell. They thus position themselves between loose RNA and a virus: a “minimal system” consisting of a replicating RNA strand, whose degree of organization is higher than that of loose RNA and lower than that of a virus. Viroids are covalently closed circular RNA molecules that are not complexed with any protein. Their propagation is carried out by cellular polymerases in the nucleus. They represent the smallest replicable nucleic acids […]. (Hof & Dörries, 2017, p. 271)
Furthermore, since we find a large amount of viral genetic material in the chromosomes of most living organisms (including the human genome), we assume that viruses have evolved with and in parallel with organisms over millions of years, with constant evolutionary adaptations and optimizations of their genes. Because viruses reproduce so rapidly, mutations occur quite frequently, which makes controlling viral infections, for example, a challenge.
Even though we cannot discuss “meaning and purpose” in our subject matter, it can be stated that a possible “end in itself” of viruses is their reproduction and further development in the sense of evolution; in this they are not far from the “end in itself” of biological organisms. “Viruses are the inventors of genetic diversity, they deliver innovation, they are the builders of all genetic material” (Mölling, 2015, p. 28).
In recent years, the discovery of so-called giga viruses has provided another perspective on the status of viruses in relation to the kingdom of living things. Giant viruses that almost reach the scale of bacteria are called giga viruses and could mark a transitional stage from viruses to the animate world, while viroids signify the transition from dead matter to viruses.
The discovery of giga viruses throws the classification into trouble: It blurs the boundary between virus and cell. The transition is a continuum. […] These near bacteria form the transition from viruses to bacteria, from dead to alive. (Mölling, 2015, pp. 15–21)
2.4.3 Prions
We have learned a great deal in this chapter about how the storage and encoding of genetic information, its replication, and its further development in the evolutionary
process is an important element in defining life. In doing so, we have traversed the arc from the random replication of inorganic molecules, through the emergence of the first biomolecules and the organization of molecules into self-replicating systems such as viruses, to the emergence of (cellular) life. We assumed throughout that the basis of information coding was always a molecule such as RNA or DNA, and that it was thus anchored in the basic law of molecular genetics, namely coding by the base pairs of the nucleotides. Quite obviously, however, there are also structures and systems that do not replicate themselves but nevertheless pass on information about their structure without using the system of RNA and nucleotide coding. Prions are an example of this.
Prions are pathogenic proteins that can cause well-known diseases such as Creutzfeldt-Jakob disease in humans or BSE (“mad cow disease”) in cattle. They are formed from normal proteins that occur in every organism. However, the prion proteins are folded “incorrectly”. Proteins have a characteristic tertiary structure produced by a specific folding. Because of the incorrect folding, prions not only malfunction but also “infect” other proteins, which then fold incorrectly as well. How this works is still unclear. Fig. 2.11 shows the principle of correct and incorrect folding. This also clearly distinguishes prions from, for example, pathogenic toxins, which also cause damage in and to cells but cannot reproduce or replicate themselves.
These misfolded proteins (prions) are infectious and are transmitted, for example, from animal to animal. They in turn cause the misfolding of proteins in the affected organism. Quite obviously, they transmit information to other proteins and thus virtually reproduce themselves by increasing the population of misfolded proteins.
Fig. 2.11 Prions in “normal” and in “pathogenic” form. PrPC: a normal protein; PrPSc: the pathogenic form of the prion protein. (Source: Designua/Shutterstock)
Prions could also be called viruses without genes (in contrast to viroids, which are genes without proteins). They consist only of proteins, indeed they are a protein; they contain no information-coding nucleotides, and yet they can replicate. It is unclear whether information coding is present here, and if so, in what form. This kind of information effect could most aptly be called “induction”. However, the relevant information is “explicit”, that is, it is already present and genetically read (it is already translated), and therefore directly infectious and effective. Apparently, proteins can serve as “blueprints” and reproduce themselves. Lee was able to show that a peptide consisting of 32 amino acids, i.e., a short protein, can act as a template and autocatalytically support its own synthesis (Rauchfuß, 2005).
3 Algorithms, Software and Artificial Intelligence
Abstract
Computer viruses are malicious programs that usually spread independently over networks and delete content, manipulate programs, and cause other damage on computers and in program-controlled machines. For clarification, this chapter first explains the background and definitions of algorithms and computer programs and places them in the context of computer viruses. For a long time now, there have been malicious programs that do not simply replicate themselves (as living organisms and viruses do) but are able to adapt to new conditions by changing in a manner analogous to natural evolution. Such systems are called evolutionary algorithms and are thus close to autonomous living beings, and not only linguistically. This leads to the topic of artificial intelligence, which presupposes that technical systems develop and adapt in a self-determined and evolutionary way. When these systems exceed the intelligence of humans, the so-called singularity is reached. However, a superintelligence that develops through evolution and becomes autonomous leads directly to autonomous technical systems that could develop and behave like living beings.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_3
3.1 Software and Hardware
Before we go into the specifics of computer viruses and their relationship to other viruses in this chapter, let us attempt a terminological clarification and delineation. To run a computer, we need software. Unlike hardware, which refers to the device as such, software is immaterial and contains information and data that the hardware processes. The term software covers programs and data that are processed in the hardware or contain instructions on how the hardware is to work. Software therefore contains all the information required to make a software-controlled device do something specific. When a video is played on a PC, the video file and the player with its code are the software, while the chipset, screen and speakers are the hardware.
Machines and devices that are not controlled by modifiable software can also perform functions, but only ever the same ones. Today, for example, ovens, cars and washing machines are likewise controlled by built-in, chip-based control units. However, the computer programs used in them are not flexible and cannot be used on other machines. In addition, their functions are precisely predefined and cannot be adapted at will. Such programs are referred to as “embedded software” because they are an integral part of a piece of hardware. Non-embedded software, on the other hand, makes computers and computer-like machines flexible and individually usable for a wide variety of tasks.
A formal definition of software is provided by ISO/IEC standard 24765: “Software is a program or set of programs used to operate a computer” (ISO, 2010).1 In the information-theoretical sense, software and (computer) programs are information for executing an operation on the computer system and its hardware. This information is materially bound to the carrier of the computer program and written in a specific programming language, i.e., a language that the hardware “understands” and that the device can therefore process. Initially only in the smartphone and tablet world, and today also in Windows environments, one often speaks of “apps”, i.e., applications, which are nothing other than application programs. Operating systems, also called system software, on the other hand, are responsible for communication between the hardware and the respective application programs.
1 ISO/IEC 24765:2010: “[Software is] 1. all or part of the programs, procedures, rules, and associated documentation of an information processing system 2. computer programs, procedures, and possibly associated documentation and data pertaining to the operation of a computer system 3. program or set of programs used to run a computer.”
3.2 Algorithms and Computer Programs
The computer age is an age of algorithms. (Stiller, 2015, p. 18)
Anyone who talks about the digital society today or addresses this topic in a lecture hardly needs to explain what the digital society is supposed to be, what digital data is, and how the digital world differs from the analog world and its presumed, or at least often assumed, backwardness and slowness. Each of us has an idea of what constitutes a digital society, what it means, and what impact it can have on the way its members live and interact.
We are often confronted with the term “Big Data”, which means nothing more than an amount of data so large that it cannot be handled manually or with standard analog or digital tools. With Big Data, the quantification of the reality of our lives reaches a new level. There is no truly scientific definition of Big Data, only an approximation:
Originally, this was understood to mean a quantity of information that had become too large for the working memory of a processing computer and required new technologies from developers. (Mayer-Schönberger & Cukier, 2013, p. 39)
A second definition describes Big Data as the concept of bringing together different data formats in order to gain new insights from them (Fasel, 2014). A third reads:
Rather, Big Data is evolving into a movement that encompasses the interplay of modern internet technologies and analysis methods that enable the collection, storage, and analysis of large and expandable data, especially data with different structures. (Mayer-Schönberger & Cukier, 2013, p. 9)
These large amounts of data are evaluated by means of algorithms with the aim of gaining new insights and predicting trends based on statistical probabilities. This brings up a term that we have encountered constantly since the advent of quantification and digitization. It is therefore useful to take a closer look at the term algorithm and to distinguish it from software and from computer programs.
Fig. 3.1 Muhammad ibn Musa al-Chwarizmi. The concept of the algorithm goes back to him. (Source: Likeness on a former Soviet stamp)
The word algorithm comes from the Latin version of an Arabic name. Musa al-Chwarizmi was a Persian scholar who wrote mathematical works at the court of Caliph al-Ma’mun in Baghdad in the ninth century. Figure 3.1 shows a portrait of the mathematician. One of his works was called “Hisab al-jabr wa l-muqabala”, which translates as “Calculation procedure by supplementing and balancing”. The word algebra comes from the Latinization of this title and has historically been associated with solving equations. In Spain, the author’s name appeared about three centuries later in a Latin adaptation of his books. This begins with the words “Dixit Algoritmi …” (“Algoritmi said …”). In the course of time, this became the word algorithm. At that time, mathematicians understood an algorithm to be a mechanically executable calculation procedure. Today, we define an algorithm as a procedure that consists of uniquely defined sequences of operations and leads to the solution of a mathematical problem after a limited number of steps (Stähel & Wienold, 2020). Thus, in modern software development, the term algorithm describes a formal course of action for solving instances of a problem in
a finite number of steps (Täubig, 2010). In contrast to general software, algorithms have some characteristic properties:
• Discreteness: An algorithm consists of a sequence of discrete steps.
• Determinacy: Given the same starting conditions, it always arrives at the same final result. Or formulated differently: Applying the algorithm several times with the same input data must always yield the same output data.
• Uniqueness: After each step, an algorithm can be continued in at most one way. The individual steps of an algorithm and their sequence must be described unambiguously.
• Finiteness: An algorithm must consist of a finite number of solution steps and, after processing this finite number of steps, must reach its conclusion after a finite time (Kuhlen et al., 2004, p. 2).
• Generality: An algorithm describes not only the solution of a specific task, but the solution of a class of problems.
• Effectiveness: An algorithm must be executable in real terms by a machine.
• Efficiency: An algorithm must use as few resources of a machine as possible (Sobe, 2012, pp. 10–11).
In the literature, however, there is a lack of clarity as to whether the term algorithm can be unambiguously defined. One argument against an unambiguous definition is that the semantics of the term is continuously expanding. Just as the concept of number has, in the course of history, come to include different kinds such as natural numbers, whole numbers, and complex numbers, the term algorithm is constantly being extended. Thus, after sequential algorithms, parallel algorithms, distributed algorithms, and real-time algorithms were also developed.
Thus, there is still disagreement among mathematicians as to whether the term is sharp enough to be defined yet (Gurevich, 2011, p. 4). Nevertheless, algorithms are ubiquitous. They represent all sorts of finite processes that the human brain may conceive. The cooking recipe is often cited as an everyday example: Here, a certain set of ingredients (inputs) is processed step by step according to a recipe (algorithm). At the end of the process, the dish (output) is available. At best, one could define the term in general terms:
The term algorithm means a prescription for the solution of a problem, which is suitable for a realization in the form of a program on a computer. (Dershowitz & Gurevich, 2008, p. 12)
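The characteristic properties of algorithms listed earlier can be made concrete with a classic example that does not come from this book: Euclid's algorithm for the greatest common divisor. The following Python sketch is purely illustrative; the function name `gcd_euclid` and the sample inputs are assumptions chosen for this example.

```python
def gcd_euclid(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    if a < 0 or b < 0:
        raise ValueError("inputs must be non-negative")
    # Discreteness: each pass through the loop is one clearly separated step.
    # Uniqueness: after each step there is exactly one possible next step.
    while b != 0:
        # Finiteness: b strictly decreases and never drops below 0,
        # so the loop must end after a finite number of steps.
        a, b = b, a % b
    return a


# Generality: the same procedure solves a whole class of problems
# (any pair of non-negative integers), not one specific task.
# Determinacy: repeated calls with the same input give the same output.
examples = [(48, 18), (17, 5), (0, 7)]
results = [gcd_euclid(a, b) for a, b in examples]
print(results)  # [6, 1, 7]
```

In the terms of the recipe analogy, the two numbers are the ingredients, the loop is the recipe, and the returned divisor is the dish.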
A computer program, as opposed to a pure algorithm (that is, a pure prescription for solving a problem), is an algorithm plus an associated data structure. If you want to know what an algorithm is, you cannot avoid the name of Alan Turing (Fig. 3.2): he is considered the inventor of the modern concept of the algorithm. Born in London in 1912, Turing became interested in numbers and
Fig. 3.2 Alan Turing (1912–1954), solving the Hilbert decision problem. (© ARCHIVIO GBB/Alamy)
puzzles at an early age. From 1931 to 1934, he studied mathematics at King’s College, Cambridge. In 1936 he published his article “On Computable Numbers, with an Application to the Entscheidungsproblem”, which contains the design of the Turing Machine (Turing, 1937). Here he developed the model of a “machine” that would later become one of the fundamental concepts of computer science (Weuffen, 2019). The Turing Machine is the mental model of a machine that can simulate all algorithms using only three operations: reading a symbol, writing a symbol, and moving the read/write head.
The reason for the development of the Turing Machine, which is not a machine in the sense of a device or hardware, was Hilbert’s decision problem (the “Entscheidungsproblem”). In 1928, the mathematician David Hilbert posed the question of whether there could be a procedure that decides, for every sufficiently formalized statement of mathematics, whether it is true or false. Only eight years later, in 1936, the U.S. mathematician Alonzo Church and Alan Turing independently answered this question in the negative; it was thus clear that no such general decision procedure can exist. Turing’s idea was to approach the problem with an ideal machine. It was to be algorithmic, i.e., executable in an automated fashion and capable of producing results without the aid of intelligent insight (Weuffen, 2019).
This laid the foundation for the development of application-oriented algorithms that today draw conclusions from huge amounts of data, create profiles, and recognize patterns. And all this, as envisaged by Alan Turing, is purely mechanically executable and requires no intelligent insight. Thus, we approach again the concern of this book, to find out whether computer programs and algorithms can also replicate themselves as computer viruses and thereby
change and spread autonomously like “real” viruses. And we wonder where the parallels might be for the analogy between viruses and computer viruses.
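How little machinery the Turing Machine described in this section needs can be shown with a small simulator. The following Python sketch is an illustration under stated assumptions, not Turing's original formulation: the three operations are taken to be reading a symbol, writing a symbol, and moving the head; the rule table, the state names (`scan`, `carry`, `halt`), and the example task of adding 1 to a binary number are all chosen for this example.

```python
from typing import Dict, Tuple

# Rule table: (state, symbol read) -> (symbol to write, head move, next state)
Rules = Dict[Tuple[str, str], Tuple[str, int, str]]

BLANK = "_"

# Hypothetical example machine: add 1 to a binary number written on the tape.
INCREMENT: Rules = {
    ("scan", "0"): ("0", +1, "scan"),      # walk right to the end of the number
    ("scan", "1"): ("1", +1, "scan"),
    ("scan", BLANK): (BLANK, -1, "carry"),  # step back onto the last digit
    ("carry", "1"): ("0", -1, "carry"),     # 1 + 1 = 0, carry moves left
    ("carry", "0"): ("1", -1, "halt"),      # absorb the carry and stop
    ("carry", BLANK): ("1", -1, "halt"),    # number grows by one digit
}


def run_turing_machine(rules: Rules, tape_input: str, start: str = "scan",
                       halt: str = "halt", max_steps: int = 10_000) -> str:
    """Run the machine and return the non-blank tape contents."""
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    head, state, steps = 0, start, 0
    while state != halt:
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        symbol = tape.get(head, BLANK)               # 1. read
        write, move, state = rules[(state, symbol)]
        tape[head] = write                           # 2. write
        head += move                                 # 3. move
        steps += 1
    cells = [tape.get(i, BLANK) for i in range(min(tape), max(tape) + 1)]
    return "".join(cells).strip(BLANK)


print(run_turing_machine(INCREMENT, "1011"))  # 1100
print(run_turing_machine(INCREMENT, "111"))   # 1000
```

The machine has no understanding of numbers; it only looks up one table entry per step, which is what "purely mechanically executable and without intelligent insight" means in practice.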
3.3 Evolutionary Algorithms
We have already seen in the introduction of basic molecular terms that terms from other disciplines are used to create analogies and improve comprehensibility. For example, the terms “letter coding” (genetic code), “translation”, and “transcription” for the genetic information of living organisms or viruses have been adopted from literary and linguistic science for better clarity. Computer science uses the same method when describing special algorithms, again to improve comprehensibility. In doing so, one accepts the risk that analogies can lead to wrong associations.
Thus, when algorithms are used not merely as a prescription for processing a sequence of computations but to perform more complex tasks, we often find the term “evolutionary algorithm” or “genetic algorithm”, borrowed from genetics. Algorithms are normally procedures that solve a given problem, and their most important property is therefore correctness. However, there are attempts to use algorithms also for the optimization of complex systems, for example in solution methods based on data systems and pattern recognition. These methods are called evolutionary algorithms or genetic algorithms.
Genetic algorithms in the narrower sense are characterized by the fact that they see the parameter vector as analogous to a set of genes and that they use inheritance mechanisms borrowed from nature (mutation, crossing over). (Weicker, 2015, p. 147)
Already here we see a very direct relation between computer science, with its structures and terminology, and our questions, and we easily arrive at the genetics of viruses, whose characteristics are self-replication and change through mutations. From the idea of natural evolution (animate and inanimate), computer science has derived computational operations, which it calls evolutionary algorithms. They are designed to carry out optimization processes by imitating natural evolutionary processes. Essentially, one relies on the evolution of living matter.
Evolutionary algorithms now combine the computer as a universal computing machine with the general problem-solving potential of natural evolution. Thus, an evolutionary process is artificially simulated in the computer to generate the best possible approximations to an exact solution for an almost arbitrarily selectable optimization problem. (Weicker, 2015, p. 20)
In this process, the basic evolutionary mechanisms of mutation, recombination, and selection are imitated. Self-replicating evolutionary algorithms are already much closer to the general self-replicating systems of living organisms, although compared to the encoding of information in DNA, “the encodings considered in evolutionary algorithms […] are much simpler” (Weicker, 2007, p. 40). Self-adapting evolutionary algorithms are not yet intelligent. They are only capable of adapting themselves and their computational flow (and thus their processes) in response to changing external conditions. Organisms do not need intelligence to optimize themselves during evolution; rather, the evolutionary process of random mutation and adaptation to change enables (or virtually induces) “best selection”. In the same way, evolutionary algorithms can abandon and
adapt their pre-programmed flow in response to changes in underlying conditions. The computer pioneer John von Neumann was already thinking about self-propagating automata in the late 1940s:
The study of self-modifying and self-reproducing algorithms is by no means new. As early as 1949, the American mathematician and computer pioneer John von Neumann (1903–1957) dealt with them. The term computer virus first appeared in 1981 in the work of Adleman and Cohen. From Cohen comes the first systematic account of how these algorithms work, the security aspects involved, and possible legal consequences. (Brecht, 1995, p. 175)
Only when computer programs actively change their own program code due to external influences do we speak of intelligent systems and thus reach the field of artificial intelligence (AI).
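The interplay of mutation, recombination, and selection described in this section can be sketched in a few lines of Python. This is a deliberately minimal toy example, not an implementation taken from Weicker or any other cited source: the "genome" is a list of bits, fitness is simply the number of ones (a standard textbook problem often called OneMax), and all function names and parameter values are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Genome = List[int]


def fitness(genome: Genome) -> int:
    """Toy fitness: the more ones, the fitter the individual."""
    return sum(genome)


def mutate(genome: Genome, rate: float, rng: random.Random) -> Genome:
    """Mutation: flip each bit independently with a small probability."""
    return [1 - g if rng.random() < rate else g for g in genome]


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """Recombination: head of one parent joined to the tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(genome_length: int = 20, population_size: int = 30,
           generations: int = 60, mutation_rate: float = 0.02,
           seed: int = 0) -> Tuple[Genome, List[int]]:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []  # best fitness per generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Selection: only the fitter half survives and reproduces.
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the surviving parents are carried over unchanged, the best fitness in `history` can never decrease from one generation to the next. Nothing in the loop "knows" what a good solution looks like; improvement arises only from random variation followed by selection, which is the point of the analogy drawn in the text.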
3.4 Artificial Intelligence
Man is one of the few animals that use tools. He is the only creature that understands parts of his own thinking so well that he can outsource them. (Stiller, 2015, p. 247)
Participation in natural evolution does not require intelligence; on the contrary, one can even argue that only intelligent creatures can override evolution. This is because natural evolution is part of the stream of life and structurally intrinsic to living things via the processes of information coding and use. Highly adapted viruses, too (whether we understand them as regressed living beings or as systems that developed in co-evolution with their hosts), do not need intelligence for their successful reproduction. If we want to compare
computer programs with viruses, it is even counterproductive to draw an analogy between the artificially intelligent systems of computer science and the natural evolutionary process of organisms and viruses. This is because artificial intelligence is always placed in relation to human intelligence and thus to a system that has already left natural evolution to a large extent. Indeed, as part of computer science, artificial intelligence research is concerned with the automation of intelligence and self-learning machine systems. Whenever AI attempts to mimic the decision-making processes of humans, efforts are made to replicate human intelligence in AI. The beginning of AI research is marked by a 1950 paper by Alan Turing, whom we have already met as the inventor of the modern concept of the algorithm (Mainzer, 2019, p. 10).
Artificial intelligence programs, however, can adapt to new framework conditions in an automated and autonomous way and evolve. This gives rise to autonomous systems that can evolve on their own and thus run away from their creator (namely, the programmer) and “go off the rails” in an unpredictable way, just like evolutionary systems. The basis for artificial intelligence is adaptive algorithms. They are becoming ever more powerful as computer capacities continue to grow exponentially, enabling performance parameters that were unthinkable 20 years ago. The following definition will serve as an approximation of the term artificial intelligence:
A system is called intelligent if it can solve problems independently and efficiently. The degree of intelligence depends on the degree of autonomy, the degree of complexity of the problem, and the degree of efficiency of the problem-solving procedure. (Mainzer, 2019, p. 3)
In contrast to artificial intelligence, natural intelligence has emerged through mutation and selection, i.e., evolution. Artificial intelligence, however, is created by programs and algorithms. To be able to operate autonomous systems, it is an obvious step to develop evolutionary, genetic algorithms and programs here as well, which imitate and reproduce the natural mechanisms of evolution, mutation, and selection. Indeed, when artificial intelligence evolves itself, i.e., keeps optimizing itself and thus keeps improving its ability to learn to solve problems that could not be solved in its previous state, “an exponential, i.e., explosive, increase of intelligence occurs – the result of this process is often called technical singularity” (Küppers, 1986, p. 63). Singularity refers to the point of development at which machine intelligence exceeds human intelligence. This is also referred to as superintelligence. Some futurologists assume that there will be an algorithm that, as described above, permanently optimizes itself further and thus produces an artificial intelligence that exceeds that of a human being. This would then be the moment of the so-called technical or technological singularity (Grunwald, 2019).
For our question, it now becomes interesting: Should an algorithm reach the singularity, it could permanently reproduce itself from that point on. After all, this was exactly the idea of its programmer: the algorithm should solve problems on its own by evolving and growing along with its tasks, thereby detaching itself from the original code of its programmer. In a sense, an artificial intelligence grows with its tasks and thus achieves a status of autonomy that comes close to our considerations about the parallels between viruses and computer viruses (and artificial intelligence, respectively). For if AI constantly analyzes and improves its code, permanently develops itself further and uses resources for this
purpose, it will eventually get out of control and may become autonomous. The goal and purpose of an algorithm that is then “out of control” is still to solve the original problem for which the programmer had programmed it. Much as reproduction by replication of its genome serves a virus as its sole “purpose”, the algorithm optimizes itself along the challenge of the problem and evolves. It does so even, and especially, when the problem changes. Thus, AI follows a mechanism comparable to the evolution of viruses: co-evolution with the hosts in one case, co-evolution with the problem to be solved in the other.
The question of to what extent artificial intelligence, when it is no longer exclusively an aid and instrument of humans, should be entitled to its own moral consideration opens a wide field in the philosophy of robotics (Bendel, 2019) and AI systems, which we cannot discuss here. However, it is addressed in detail in Bernd-Olaf Küppers’ book (1986).
Artificial intelligence can therefore be a precursor or even a condition for the development of self-replicating technical systems. Artificial intelligence must be able to learn from experience and make contextual decisions. “This whole process demands knowledge about the world, logical thinking, and abilities to learn and adapt” (Bartneck et al., 2019, p. 4). This is again closer to the structures of human intelligence than to the evolutionary processes of viruses. Therefore, in the next chapter, we will look at what so-called computer viruses are and what they have in common with the other, biological viruses.
4 Computer Viruses, Computer Worms, and the Self-Replication of Programs
Abstract
Computer viruses are almost as old as computers themselves. The first technical paper about computer viruses appeared in 1984 and described a self-replicating computer program that reproduces and spreads automatically. The direct comparison and analogy-building between technical (computer) viruses and biologically effective viruses tempt us, via intentional or unintentional linguistic analogies, to see or even construct similarities in content, function, and structure between the two systems. In fact, however, we do find a startling parallelism between the two systems. When comparing biologically active viruses and computer viruses, the notion of self-replication is of great importance. For here, as there, (uninhibited) replication and spread is the cause of the infection and damage that computer viruses and biologically active viruses cause. Computer viruses are also able to change: mutations arise. The mutated computer viruses are always “new” for anti-virus programs and thus difficult to combat. The processes, and the fight against an infection, are very similar for biologically active viruses. The similarity between the two systems is also striking when it comes to the mechanism of virus defense: in one case, anti-virus software that must learn anew again and again, and in the other, vaccines that must be adapted again and again to mutants of the original virus.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_4
4.1 “Brain” and the Computer Viruses
Exactly at the time of the description of the human immunodeficiency syndrome pathogen, the virus entered the age of information technology infectivity: in 1983, computer scientist Fred Cohen at the University of Southern California explored the possibility conditions of self-reproducing, parasitic computer programs, long after the cyberpunk culture had developed such scenarios. (Borck, 2004, p. 55)
The term virus is also used in technology. The term “computer virus” was coined quite early on. It was used in analogy to biologically effective viruses for programs that replicate themselves and are transferred from one computer to the next via networks or data carriers. The first technical paper about computer viruses, “Computer Viruses”, was published in 1984 by Fred Cohen. Cohen is an internationally leading US researcher in the field of Internet security.
Brain, the first PC virus, infected the first 5.25-inch floppy disks in 1986. As Securelist reports, this was the work of two brothers, Basit and Amjad Farooq Alvi, who ran a computer store in Pakistan. Tired of customers making illegal copies of their software, they developed Brain, which replaced the boot sector of floppy disks with a virus. The virus, which was also the first virus to work undetected, contained a hidden copyright message but did not destroy any data. (Kaspersky Labs Ltd., 2021)
A self-replicating program had already been developed experimentally before that:
On November 3, 1983, the first virus was conceived of as an experiment to be presented at a weekly seminar on computer security. The concept was first introduced in this seminar by the author, and the name “virus” was thought of by Len Adleman. After 8 hours of expert work on a heavily loaded VAX 11/750 system running Unix, the first virus was completed and ready for demonstration. (Cohen, 1984)
Figure 4.1 shows a section of the program’s boot sector with an indication of how dangerous the virus is. The official virus catalog, published since the mid-1980s, listed a modest
Fig. 4.1 Program data of the first PC virus “Brain”. The developer even puts his address in the code. You can contact him to have the virus removed. (© M. Zabel/Lectorate Freiburg)
35 computer viruses in 1987 (Paul, 1989, p. 503). For 2015, it was estimated that about 350,000 new computer viruses and worms were being added daily (World, 2015). The way a computer virus works is quickly explained:
The virus infects program files by writing itself in front of the beginning of the program, thereby increasing the size of the program file. It is then activated again the next time this program is called up. It can also infect a program file multiple times. Its size increases by the size of the virus with each infection. This is very noticeable behavior for a virus, so that an infection is easy to recognize. (Brecht, 1995, p. 177)
Or by another definition:
Computer viruses always attack other programs in the system during their multiplication and copy their own program code into these. Primary goal thereby is the spreading within a system. Each manipulated program is itself a carrier of the virus. It is often referred to as a host program. Only if a host program is started, the virus contained in it becomes active. (Paul, 1989, p. 56)
Relevant to our topic are the two key concepts of self-replication and (unnoticed) spread and infection. The latter borrows another term from the world of viruses to describe the effect of self-replicating programs and computer viruses. Computer viruses are software programs that self-replicate, spread and multiply by creating copies of themselves. Here, too, the analogy is striking: biological viruses also spread through copies of themselves, but they cannot produce these copies on their own; instead, they depend on their hosts to do so. Computer viruses likewise depend on the hosts they infect, such as other computers and their programs. This distinguishes them from computer worms, which can make a copy of their program themselves.
4 Computer Viruses, Computer Worms…
One of the many (but mostly very similar) definitions helps us here: We define a computer virus as a program that can infect other programs by modifying them to include a version of itself, which may or may not have evolved further. (Schmundt, 2004)
This brings another characteristic of viruses into the computer world: modification through mutation. Computer viruses can indeed change; mutations arise. To anti-virus programs, the mutated computer viruses are always “new” and thus difficult to combat. The processes are very similar for biological viruses: The host’s immune system (naturally only in the case of higher organisms; bacteria, for example, have no immune system) fights the invading virus. If the virus changes through mutation, the antibodies are no longer (or only to a limited extent) able to recognize and fight it. Thus, the race between mutation and defense is very similar in technology and in living beings. Ideally, the defense would always be one step ahead of the infection, but that order is impossible, because the defense mechanisms are only ever induced by the new, changed viruses. Hilmar Schmundt concludes that John von Neumann’s idea of self-replicating automata has already become self-evident today. The anti-virus industry now works in a constant race with the hackers who introduce new viruses into the network: In experimental scenarios, special antigen programs attract viruses, memorize their structure, and send the antigen thus obtained to their neighbors fully automatically with the aid of self-copying codes. With the help of these “good viruses”, the focus of infection is precisely and locally combated. (Schmundt, 2004)
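Why a defense that learns only from known samples always lags one step behind can be shown with a small sketch. It assumes the simplest possible scanner, one that recognizes a sample only by its exact SHA-256 hash; real anti-virus products use far richer techniques, and the byte strings and function names below are hypothetical.

```python
import hashlib


def signature(code: bytes) -> str:
    """Exact-match fingerprint of a sample (assumed detection method)."""
    return hashlib.sha256(code).hexdigest()


known = b"PAYLOAD-v1"                      # harmless placeholder bytes
KNOWN_SIGNATURES = {signature(known)}      # what the scanner has "seen" so far


def is_known(sample: bytes) -> bool:
    """Return True if the scanner already holds a signature for the sample."""
    return signature(sample) in KNOWN_SIGNATURES


mutated = known.replace(b"v1", b"v2")      # a single small change ("mutation")

# The original is recognized; the mutated variant is not.
print(is_known(known), is_known(mutated))

# Only after the variant has appeared can its signature be learned.
KNOWN_SIGNATURES.add(signature(mutated))
print(is_known(mutated))
```

The order of events in the sketch mirrors the argument in the text: the signature for the variant can be added only after the variant exists, so the next change is again unknown.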
Another definition of computer viruses focuses on the damage they can do: Computer viruses are self-reproducing executable programs that are transmitted unnoticed from computer to computer (for example, through e-mails, downloads from the internet, exchange of data carriers), take root there and usually cause damage to the infected computer, such as changing or deleting data files. (Abts & Mülder, 2017, p. 604)
4.2 Computer Worms and Trojan Horses
Computer malware can be divided into three groups: the computer viruses already discussed, which infect files and programs and require host programs to do so; computer worms (or “worms” for short), which, unlike viruses, do not require host programs and reproduce themselves via their program code; and Trojan horses (or “Trojans” for short), which use a useful program to introduce computer viruses into the host system. Thus, Trojan horses are a means of transport that computer viruses use to enter the host; self-replication does not occur. They are also not actual programs, but “[…] merely a string of commands.” (Paul, 1989, p. 642). Computer worms, on the other hand, spread independently on the network and, unlike viruses, do not require host programs. Worms are programs that spread independently in networks. Trojan horses are also harmful programs, but they do not have a reproduction mechanism. They often spy out data or passwords on infected systems or, unnoticed, open a “back door” to the network for the attacker. The newer damaging programs are mixtures of virus, worm, and Trojan. (Abts & Mülder, 2017, p. 606)
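The three groups described above differ in only two properties: whether the program reproduces itself, and whether it needs a host program to do so. A minimal sketch of that decision, with a hypothetical function name and deliberately ignoring the hybrid forms mentioned in the quotation, might look like this:

```python
def classify(self_replicating: bool, needs_host_program: bool) -> str:
    """Map the two properties used in the text to one of the three groups.

    Simplifying assumption: hybrid malware (mixtures of virus, worm and
    Trojan) is not modeled.
    """
    if not self_replicating:
        # No reproduction mechanism: a Trojan horse in the sense of the text.
        return "Trojan horse"
    # Self-replicating programs differ by whether they need a host program.
    return "virus" if needs_host_program else "worm"


print(classify(self_replicating=True, needs_host_program=True))
print(classify(self_replicating=True, needs_host_program=False))
print(classify(self_replicating=False, needs_host_program=True))
```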
Werner Brecht states in his “Theoretical Computer Science”: Computer worms (or worms for short) are independent programs with the ability to reproduce. What makes them special is their independence. They do not reproduce by infecting other programs, but by duplicating themselves. (Brecht, 1995, p. 177)
In doing so, they produce a more or less exact copy of themselves. The goal is to create a functional version of themselves and get it running on another computer (Paul, 1989). The term was introduced as early as the 1980s. In this context, worms were initially utility programs and work tools that spread automatically in networks and performed the positive tasks assigned to them, without human intervention and thus saving a great deal of work. It may come as a surprise, but computer worms have been talked about longer than computer viruses. Two years before Fred Cohen wrote his previously cited essay on computer viruses, a program was developed at Xerox’s Palo Alto Research Center that independently installed programs on computers and did “positive” work. The intention was to save work as the number of computers in the Research Center increased. Today we would speak of a central “roll-out” of software. As these programs “crawled” across the networks from computer to computer, they were summarily called “worms” (Paul, 1989). This system quickly fell into oblivion, and worms were only rediscovered or described again years later in malicious applications, such as the “Clausthal Christmas tree” (“Clausthaler Weihnachtsbaum”) from 1987 (Brunnstein, 1994). This program ran through the networks, multiplied, and thus spread like an avalanche. Even though it did not carry out any malicious manipulations, the “Clausthal Christmas tree” temporarily paralyzed many systems due to its avalanche-like spread as a computer worm (Eilers, 2010).
However, it happens again and again that programs become corrupted, change, and thereby do things for which they were not intended and which cannot be predicted. In this way, they have evolved from useful working programs into independently acting malicious programs that do harmful things and change continuously. Here, too, the proximity and analogy to biological mutation, that is, the mutation of biologically active viruses, are obvious. Although computer worms do not require a host to replicate, they nevertheless behave in a very similar way through random changes (mutations) and are likewise difficult to combat (Heuveline, 2015).
4.3 On the Analogy Between Biologically Active Viruses and Computer Viruses
What makes the virus so fascinating? Why has it been appearing as a term, concept, and metaphor for quite some time, not only in immunology papers and computer manuals […]? Are there any commonalities at all between the bioscientific, cybernetic, artistic, and pop-cultural virology of our time? (Mayer & Weingart, 2004, p. 7)
Direct comparison and analogy-building between computer viruses and biologically active viruses all too easily seduces us, via intentional or unintentional linguistic analogies, to see or construct similarities in content, function, and structure between the two systems. We have become familiar with this phenomenon in the description of molecular genetics, where terms from linguistics were adopted for a better understanding of the processes and structures. There we spoke of translation and transcription or of letter codes and alphabets.
Describing the relationship between computer viruses and biologically active viruses only through linguistic analogies does not do justice to the matter; actual similarities are found rather in the primary functions of both systems and in their effects on their environment. It would of course be nonsensical to try to force the two systems into a relationship that cannot exist because of their very different histories of origin. Rather, we will have to consider the mode of action and the consequences of viral activities if we want to name and discuss the similarities. Above all, we will also come across intriguing similarities in the question of the future development of natural and technical self-replicating systems. Before we analyze the commonalities and similarities, it is useful to first name the fundamental differences between the two systems. One essential difference, for example, is the fact that computer viruses were initially “created” by humans, i.e., programmed. Computer viruses did not emerge eo ipso, nor did they develop and change in a natural environment through evolution. Biologically active viruses, on the other hand, did not suddenly appear in a “spontaneous generation”, nor were they actively created by biological organisms. According to different theories (see Sect. 2.4), they either developed from (in)organic material in the course of evolution parallel to their hosts, or they are degenerated former components of living organisms that have established themselves outside the organism as an autonomous system and developed further. Biologically active viruses exist in the natural environment of our planet without technical aids. They depend only on their specific host organisms for reproduction. In doing so, they manage in very different ways to survive longer or shorter periods outside the host. Since viruses do not metabolize and therefore do not
require energy, water, or oxygen, they can theoretically survive outside a host for any length of time (the word “survive” suggests itself, but of course is not appropriate, since viruses are not living organisms). At the same time, however, viruses are destroyed by UV light or chemical substances; their RNA in particular is unstable. In contrast, computer viruses depend on the existence of computers as the basis of their functioning. Without a functioning computer (i.e., hardware with running software), computer viruses do not exist either; they are materialized on the system “computer”. We have already talked about the emergence or creation of computer viruses. Nevertheless, the parallelism of both systems is often amazing. Biologically active viruses and computer viruses are both non-living systems that can replicate themselves and adapt to variable conditions through changes (mutations). They thus achieve an autonomy that otherwise only occurs in living organisms. Both systems attack hosts and damage them. The possible protection and defense mechanisms against computer viruses and biologically active viruses also have great similarities. The only way to prevent infection by a virus is to keep a physical distance from the virus and prevent it from entering the host. In the case of biologically active viruses – for humans, for example – this can only be done reliably by social distancing, i.e., keeping a sufficient distance from potential virus carriers, or by means of a physical “firewall” (virus-proof protective clothing). In the case of computer viruses, the computer must be removed from the data network (a kind of “social distancing” of technical systems) and the virus must be prevented from reaching the machine via “direct contact” (i.e., via data carriers). Analysis shows that the only systems with potential for protection from a viral attack are systems with limited transitivity and limited sharing, systems with no sharing, and systems without general interpretation of information. (Cohen, 1984)
Because they adapt through evolutionary mutations, viruses are difficult to “fight”. Biologically active viruses are fought by the host organisms and their cells with the specifically available means of immune defense. Due to the permanent and sometimes very rapid changes in viruses, there is a constant race between the intruder (virus) and the immune response, or the host’s ability to defend itself. On computer systems, too, malware can constantly change and adapt, for example with the aid of evolutionary algorithms, and is thus just as difficult to detect and combat. Of course, the defense attempts here are not carried out by an autonomous biological organism with its immune response, but by anti-virus software created by programmers. By now, however, there are also anti-virus programs that independently adapt to the ever-changing properties of computer viruses and thus, as technical systems, achieve a degree of autonomy that corresponds to that of variable and evolutionary malware. Here, too, the parallelism between the modes of action of biologically active viruses and computer viruses is evident. What do computer viruses and human viruses have in common? Even though the comparison may be a bit lame, there are parallels: In both cases, we rely on entities to provide information and guidance, and to create and enforce regulations that protect us from harm. (Kasparov, 2020)
The intriguing relationship between deliberately created technical viruses and the biologically active viruses produced by evolution is an important topic that will occupy us
further in this book (Chap. 6). At this point, however, it should be pointed out that in the twenty-first century it is not only conceivable but already a reality that computer viruses “create” or “evolve” themselves. This means that programs exist that have developed from other programs or have been “written” by them without any human intervention. Thus, the technical form of viruses also achieves an autonomy that is quite close to that of naturally evolved viruses. Admittedly, the basis on which each form of virus develops is different. But just as the natural conditions on earth were sufficient for the emergence of biologically active viruses, a functioning computer environment, ideally integrated into a network, is sufficient for the emergence of (autonomous) computer viruses. When these viruses then evolve through mutation, they achieve an evolutionary autonomy like that of natural systems. Thus, a co-evolution between malware and the software systems on computers is conceivable, which is comparable to the co-evolution of biologically active viruses with their host organisms. In this context, it is of great interest what the similarities in function and mode of action between technically generated systems and naturally evolved systems are based on. Computer scientist Klaus Mainzer postulates that it is not the origin of the building blocks (such as natural elements of the earth or the software environment of a computer) that is decisive, but their arrangement and structure: Important in considering automated self-replicating systems: quite obviously, it is not the particular building blocks used (molecules, animate/inanimate nature, technology), but the organizational structure that contains a complete description of itself and uses this information to create new copies (clones). (Mainzer, 2019, p. 87)
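Mainzer's point, that a structure containing a complete description of itself can use that description to build copies, can be illustrated with a harmless quine-style sketch. This is one possible illustration under stated assumptions, not Mainzer's or von Neumann's own construction; the names `template`, `source` and `replicate` are hypothetical.

```python
# The "blueprint": the text of a two-line program with a slot for itself.
template = 'template = {!r}\nsource = template.format(template)\n'

# Construction step: inserting the blueprint into its own slot yields the
# complete program text, which in turn contains the blueprint.
source = template.format(template)


def replicate(program_text: str) -> str:
    """Run a program text in a fresh namespace and return the copy it builds."""
    namespace: dict = {}
    exec(program_text, namespace)
    return namespace["source"]


child = replicate(source)
grandchild = replicate(child)

# Every generation is an exact copy of the original description.
print(child == source, grandchild == source)
```

Nothing in the sketch depends on what the "building blocks" are; the copy arises only because the text carries a description of itself and a rule for using it, which is the organizational structure the quotation refers to.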
We thus come to question the informational foundations of self-reproducing systems. Or as Vincent Heuveline puts it, “Is information organized and stored in a silicon chip less alive than that of the DNA double helix in our genes?” (Heuveline, 2016). But this is exactly our topic in the following chapter.
5 Information, Genetics, and the Evolution of Life
Abstract Information, data, and knowledge are quite different concepts. We need to distinguish them in order to understand the informational basis of inheritance and transmission. Data is context-free and meaningless; it does not induce actions. Information emerges from data only through contextualization and the assignment of meaning. It then acts to control behavior and induce actions. Information does not answer questions about cause, effect, and consequences; that only happens at the level of knowledge, which is tied to humans (or, in future, to intelligent machines). Quite similar mechanisms can be discerned for the coding of information in biology and technology. The genetic code is a fascinating basic mechanism for encoding information in living things and viruses. It is as old as life itself and applies universally to living things and viruses. It evolved along with the emergence of life in the primordial soup from the first organic molecules. In technology, information is coded based on information theory. The coding in the computer is carried out exclusively by the binary signs zero and one. Just as self-replicating systems for information storage and transmission evolved during the emergence of life, self-replicating computer systems (or robots) can emerge via evolutionary algorithms, which, analogous to the first unicellular living beings, could find their way to autonomy.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_5
5.1 What Is Information? Data – Information – Knowledge
The physical world surrounding us can be described with three basic quantities: mass, energy, and information (Rauchfuß, 2005, p. 255). In this book, we have focused on the topic of “information”, and now we want to look at what information as a central basic quantity of physics is all about. Until now we have used the two terms “data” and “information” without having introduced a clear definition to distinguish them. In information science, which occupies a kind of hybrid position between the natural sciences and the humanities, we distinguish three levels of the concept of information: data – information – knowledge. In fact, data are unstructured, isolated, and context-independent signs. They cause little or no behavioral control in humans. Or formulated differently: Data is information that is available without context and without meaning. This is the reason why huge amounts of data do not frighten, scare, or paralyze us. It is only the processing of the data and the creation of a specific context that turns it into information. This is because information is structured, anchored in contexts, and context dependent. Thus, it has “meaning” and can cause behavioral control.
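The step from data to information can be illustrated with two bare bytes that acquire a meaning only once a context of interpretation is supplied. The sketch is an illustration, not a formal definition from information science, and the function names are hypothetical.

```python
raw = bytes([72, 105])  # data: two byte values without any context


def as_text(data: bytes) -> str:
    """Context 1: the bytes are interpreted as ASCII characters."""
    return data.decode("ascii")


def as_number(data: bytes) -> int:
    """Context 2: the same bytes are interpreted as one big-endian integer."""
    return int.from_bytes(data, byteorder="big")


# The same data yields different information depending on the context.
print(as_text(raw))
print(as_number(raw))
```

The bytes themselves never change; only the context decides whether they read as a greeting or as a number.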
The concept of information assumes that certain facts are considered to be the content of information and that a medium conveys this information; in humans, for example, this can be natural language, in computers it can be the coding in the program, and in biological processes of living beings it can be chemical processes, physiological chains and genetic codes. Finally, there needs to be a hitherto uninformed receiver of information, because in the “transmitter – receiver” model, information is coded by the transmitter, transmitted by means of a medium, and decoded by the receiver. All living beings and many technical systems, such as computer programs, permanently produce new data. There is a fundamental difference between data, information, and knowledge. We have already learned that data is unstructured, consisting of context-independent signs that do not influence our (human) actions. Nor does it evoke biological or other systemic responses, such as in viruses. Data, as raw signs without a given context, is neither comprehensible to us nor does it motivate us to do anything. But what suddenly turns data into information? How do we get from data to the second level of the concept of information? Information is context-dependent, it is structured, and it is anchored in contexts. Thus it has “meaning”. Carl Friedrich von Weizsäcker once put it this way: “What is information? Information is only what is understood” (cited in Penzlin, 2016, p. 300). Only with the help of contextualization does data become information, which can then also lead to behavioral control. Information is then “[…] a new, behavior-determining knowledge about an event, a fact, or a state of affairs in reality. Information is elimination of uncertainty” (Merkl, 2015, p. 557). Or: “We call information the meaning of signals, which can be represented in different ways (for example, sensory, electronic, symbolic)” (Mainzer, 2016, p. 8). Figure 5.1 shows the knowledge pyramid according to Raffael Herrmann.
Fig. 5.1 Signs – data – information – knowledge. Knowledge pyramid according to Raffael Herrmann, from top to bottom: knowledge (pragmatics/networking: information is linked to experience and thus results in knowledge); information (semantics: statements/data are assigned a meaning); data (syntax: individual characters are arranged to form a statement by means of syntax); characters. (© 2012 Raffael Herrmann; www.derwirtschaftsinformatiker.de. Reprinted with kind permission. Source: https://derwirtschaftsinformatiker.de)
Information is a very special good. It is immaterial and its content can only be assessed once it has been received and decoded, i.e., read. That is why you cannot buy information “on trial” or get a test delivery. Informational products or goods are always purchased as a “pig in a poke”, since the value and significance of information can only be assessed when it is fully available. Arnold Picot (1988) has summarized some essential properties of information:
• Information is an intangible good, which is not consumed even if it is used several times.
• Information is neither a free good nor a common good; the cost of information is determinable and depends on its production, acquisition, use and transmission.
• The value of information depends on how it is used. It can be changed by adding, selecting, aggregating, concretizing, or omitting.
• Information expands during its use and has a basically unlimited capacity for expansion.
• Information can be condensed and concentrated.
In the concept of information, the following three dimensions are distinguished:
• Syntactic dimension: addresses the coding, the mutual arrangement of coding symbols, the relationship of the signs among themselves and their transmission as well as the evaluability of information elements.
• Semantic dimension: refers to the content of information and its meaning.
• Pragmatic dimension: emphasizes the practical value of information for the receiver and the behavioral control inherent in it.
Thus, if we want answers to questions of cause, effect, and consequences, we must leave the level of information and move to the third level of the concept of information, the level of knowledge. This level is bound to human beings; knowledge exists only with and in them. Imagine, purely as a thought experiment, that the earth were no longer inhabited by people. Would knowledge still exist then? Or would there be only data and information, coded and stored in organisms and viruses, on computers or in other storage media? Knowledge does not exist in other biological organisms, nor in other autonomous systems, such as viruses. Whether it can be built up in technical systems in the future is still controversial. On the way from data via information, we reach a new quality with knowledge. Only at the level of knowledge are understandable, comprehensible and, in the sense of causality, explainable and provable insights created.
Knowledge, insight, and wisdom are therefore completely different categories from data and information. They are created by the intensive, productive and above all human-driven interpretation and processing of data and information. The creation of knowledge is a cognitive process in the human brain; therefore, knowledge is also tied to humans and is the basis for cognition and wisdom. This step is particularly elaborate and difficult, because human intellectual input is required in the process. Human judgment is always involved in gaining knowledge. Therefore, this process produces only a comparatively small output. This is an important difference, because the production of data and information can happen independently of humans, as can its processing, storage, and archiving. The discussion of knowledge (and wisdom), however, takes us away from the question of what information is and how it is encoded. In the next subchapter, therefore, we look at how information is encoded in living things, viruses, and technical systems. We cannot devote ourselves at this point to the exciting questions about the nature and processes of gaining knowledge and about the definition of wisdom.
5.2 The Coding of Information in Biology and Technology
There is, after all, nothing smaller and more “compactly” stored than information that lies passively dormant in genes, reduced to a nucleotide “construction” consisting of particles of almost atomic size, containing “sensors” ready to transform this information expansively into building procedures, be it in some germ, a seed, an egg. (Lem, 2013, p. 98)
In the previous section we learned what information is and how it differs from data and knowledge. For our consideration, it is important to find out what similarities and differences there are in the coding of information in living organisms, autonomous systems such as viruses, and technical (computer) systems. We now want to explore the question of how this particularly powerful information is stored and retrieved in the respective systems. Information can only be passed on over a longer period and act as a basis for the reproduction of systems (whether it is inherited in living organisms, used in viruses for reproduction, or used in technical systems for the autonomous reproduction of, for example, an algorithm) if it is translated into a form that is understandable and reproducible for the respective system and encoded in it in a standardized way. In any case, what is essential here is that only those autocatalytic systems can evolve by natural selection that can also store and “inherit” information in a stable form. Thus, for the origin of life, those replicators are of particular importance which are capable of information transfer (genetic replicators). (Bülle, 2000)
There is a whole range of different coding systems with the help of which information can be stored and retrieved in a formalized way. The alphabets of many human languages with a fixed number of characters (information science would call it a controlled vocabulary) represent such a code form; so does the binary code of the digital world in the form of zeros and ones. The genetic code, too, shows the attributes of standardization and formalization; we have already encountered it in the context of the basic law of molecular genetics.
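How a standardized code makes information storable and retrievable can be sketched for the binary case. The example assumes UTF-8 as the agreed convention between sender and receiver; the function names `encode` and `decode` are illustrative.

```python
def encode(message: str) -> str:
    """Sender side: turn text into a string of zeros and ones (8 bits per byte)."""
    return "".join(f"{byte:08b}" for byte in message.encode("utf-8"))


def decode(bits: str) -> str:
    """Receiver side: rebuild the bytes from 8-bit groups and read them as text."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


bits = encode("Virus")
print(bits)
print(decode(bits))
```

Decoding succeeds only because both sides apply the same fixed convention, which is the standardization and formalization the paragraph above attributes to alphabets, binary code and the genetic code alike.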
Viruses use almost the same elements and retrieval procedures for the fixation of their blueprint information as living beings. They use their genetic code for this purpose. Only prions (infectious proteins) encode and transport their information epigenetically; moreover, their informational mode of action (induction of structural changes) is completely different. To the dimensions of information already explained, namely syntax, semantics, and pragmatics, we must now add the dimension of encoding information. For the storage and transmission of information, coding in special signs or symbols is necessary, since information does not exist “in itself” and can only be stored, transmitted and reproduced by means of a medium. Communication means exchange of information: The sender encodes information, and the receiver decodes it again. This makes information readable and understandable, and it can develop relevance for action on the part of the receiver. This transmission or storage can have a temporal or a spatial dimension. The following applies: “It is not the medium that is the information, but what the medium ‘transports’” (Wikipedia, 2021a). This process is familiar to us from human communication, and there is a whole range of theories and concepts on how the relationship between sender and receiver is shaped in human language, which messages (information) are exchanged in which form, and which processes take place during encoding and decoding (e.g., Haaß, 1997; Khabyuk, 2018; Röhner & Schütz, 2020). By way of example, we refer here only to Bühler’s well-known communication model from the 1930s (Bühler, 1934), which represents the relationship between sender, receiver, and mediated content as a sign and communication model. The processes of information exchange between people by means of human language are still easy to understand, since
communication and information exchange are a central constituent of human existence and are very close to us. Therefore, the parallel use of terms in biological information processing and human communication (for example, information, translation, transcription, code) is beneficial and helps in understanding the processes (Junker & Paul, 2009). If we think one step further, the matter becomes more complicated, but still comprehensible: The inheritance of information from one generation of living beings to the next is also an information process that follows the same basic principles as human communication by means of language. Here, as there, information is first encoded, then transmitted and/or stored, and then “read out”, i.e., decoded and converted into action-relevant information. If we go one step further in complexity, we realize that the reading of genetic information in the cell and the application of this information in the formation of substances and structures also represents just such an information coding and decoding process. When genetic information is transported from the cell nucleus to the ribosomes and “read out” there, i.e., decoded, so that proteins and other functional and structural elements can be formed from the building blocks of life, the amino acids, then these too are fundamental processes of information coding, transmission and decoding. It is fascinating that the genetic code is a system of information coding that has functioned almost identically for billions of years and serves the most diverse levels of information exchange and transmission, from communication within cells to the inheritance of characteristics across generations. Studies have shown that the genetic code is very stable against errors. Presumably, the formation of the code was completed very early, since it is identical in almost all organisms. To date, it is not clear how the genetic code evolved in RNA. (Weicker, 2015, p. 12)
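The “reading out” of genetic information at the ribosome described above can be sketched as a table lookup. The sketch uses only a small excerpt of the standard genetic code and ignores reading frames, start-codon search and all biochemical detail; the function name `translate` and the example sequence are illustrative assumptions.

```python
# Excerpt of the standard genetic code (mRNA codon -> amino acid).
# Only a few of the 64 codons are listed; this is a deliberate simplification.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe", "GGC": "Gly", "GCU": "Ala",
    "AAA": "Lys", "UGG": "Trp", "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(mrna: str) -> list[str]:
    """Decode an mRNA string three letters at a time until a stop codon.

    Codons missing from the excerpt above raise a KeyError.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


print(translate("AUGUUUGGCAAAUAA"))
```

As in human communication, the message is meaningful only because sender and receiver share the same code table; the near-universality of that table across organisms is what the quotation from Weicker refers to.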
Whether the genetic code itself has also become the subject of evolutionary change is a matter of controversy in the scientific community (cf. Sect. 2.2). It has already been suggested in earlier chapters that molecular biology uses the terminology and concepts of information science to explain processes and sequences. This is even more understandable when it now becomes clear that molecular biological processes also represent information handling processes at their core, which can be meaningfully represented not only terminologically but also in terms of content using concepts from information theory. Within just a decade, [the] representations of molecular processes as processes of information storage and transmission, of (trans)writing and translation, advanced to become the key concepts of molecular biology discourse. They provided a rhetorical repertoire that today […] seems to have gotten rid of its metaphorical origins: “genetic code,” “information,” and even the rhetoric of “genetic writing” are now a self-evident part of bioscientific terminology. (Brandt, 2004, p. 8)
We can even go a step further and postulate that this is not just a matter of terminology: all processes surrounding the origin and development of life are information-driven processes. Küppers poses this question explicitly, albeit in a different turn of phrase: “The question of the origin of life therefore turns out to be synonymous with the question of the origin of biological information” (Küppers, 1986, p. 18). This raises the aspect of information and its coding to a central, if not the central issue in the debate about the origin and development of life, but also of other
5 Information, Genetics, and the Evolution of Life
self-replicating systems, such as viruses. The question of the cause of coding is interesting in this context: “Who codes in human communication? Humans as the cause and originator. In biology, it is natural selection” (Smith, 2000).
5.3 The Genetic Code and Information Theory
Today, anyone who wants to encode a text or other information selects a code; the content is encoded, then sent (or stored), and then decoded again by the receiver. In this case, information has been “actively” encoded, and the code has been deliberately selected and applied. Information science and molecular biology developed to some extent in parallel in the 1940s and 1950s. With the discovery of the genetic code, some saw God’s construction plan revealed. As late as 2000, on the occasion of the decoding of the human genome, Bill Clinton declared: “Now we learn the language with which God created life” (Lossau, 2011). This view assumes an active coding, in this case, on a religious premise, a coding by God. In the evolution of living things, and in the analysis of the self-replication of viruses, the legitimate question now arises whether and to what extent codes and information-coding systems can evolve by themselves. In the field of computer science, we will see that so-called self-encoding systems can occur, especially in systems such as those discussed in artificial intelligence and autonomous algorithms. In human communication, humans are both the active encoders and the originators of the code; in biology, beyond human language, it is natural selection that has given rise to the coding system of information transmission (Smith, 2000). Weigmann, for example, criticizes the use of
R. Ball
information science terms in molecular biology because they suggest an intention or a meaning behind the statement. This, however, is not present in evolution, which is why she refers to DNA as a “text without context” (Weigmann, 2004, p. 116). This is understandable, unless one interprets natural selection as meaning per se; then a contextualization can indeed be discovered. In the emergence of living things, whose existence is essentially information, we still must discuss these issues, while in the emergence of life the basic physicochemical question is quickly found: it is a matter of finding out how chemical storage of information can occur or did occur. For the genetic code is nothing else: a storage of information in the form of organic or inorganic substances or, more concretely, molecules. In RNA or DNA, it is the nucleic acids, built from nucleotides, that form the basis for the storage of information. Küppers postulates that no explanation is needed for the meaningfulness of the self-reproduction and self-replication of nucleic acids; it lies in their chemical structure and thus exists independently of whether there is an underlying blueprint as information (1986, p. 168). He thereby opens a way to the self-replication of “dead” matter, whose “sense” is not biologically given but results from the structures of the molecules on the one hand and the prevailing physical conditions (for example temperature, air pressure, radiation intensity) on the other. The step from self-replicating, non-living molecules to the development of the living then follows almost by itself: through the formation of separating membranes, and thus through internal compartmentalization and the separation into an external and an internal world, the emergence and development of the cell as the basic unit of life is predetermined. The cell then makes use of the self-replicative mechanisms of molecules. An early chemical basis for the coding of genetic information is RNA, a strand of nucleotides that contains not only the information necessary for the synthesis of the substances and structures required in a cell (or a precursor structure), but also the information required for replication and for subsequent generations. Thus, RNA stands at the transition from chemistry to biology; it is “software and hardware at the same time” (Mölling, 2015, p. 185), fulfilling a dual function: as a data carrier, it is hardware, while its nucleotide sequence is the encoded information (software). However, the chemical structure alone does not result in an “information system”. Such a system is needed to determine the assignment of the bases and their respective specific meaning, because:

… information content of a random base sequence in an RNA is not usable as long as there is no connection to an information system. This system requires an exact information mapping between RNA, tRNA, and associated synthetases. (Schreiber, 2019, p. 128)
Thus, the information of genes always has a syntactic and a semantic dimension, which is established by its classification into an information system. The mere existence of RNA or DNA does not by itself make any practical sense in terms of information, even if DNA can be regarded as the “information center” in cellular events and serves not only as an information store, but also as a template for copying information and for producing further information carriers (mRNA) (Küppers, 1986). Several (informational) steps are required for the synthesis of a functional protein. The basis is the nucleotide sequence on the RNA or DNA, which encodes the sequence of amino acids. So that these sequences can be read correctly, there are special nucleotide sequences which, as start and stop sequences, determine the reading direction, but also the activation or deactivation of sequence sections. After that, the spatial arrangement of the amino acids (secondary structure) must be correct before folding in space (tertiary structure) makes the protein functional. In addition to the coding of the genetic information by the nucleotide sequence (in codons), the reading frame is the second informational basis of the genetic information. The basic structure of RNA and DNA and their sequences can be read in both directions and is therefore ambiguous. Special start and stop sequences are what allow meaningful decoding in the first place (Merkl, 2015). It gets a little more complicated, however, because even the sequence of the nucleotides together with their start and stop sequences is not sufficient for the final, functional construction of the proteins. In addition to the sequential order of their amino acids, proteins have a secondary structure, the spatial arrangement of the amino acids. Proteins are only biologically functional, however, if their special folding in space, the so-called tertiary structure, is also realized. This incorporates all existing physical interactions. Only in the correct tertiary structure are the proteins biologically effective. It is therefore not possible to derive a protein unambiguously from its sequence. If one wants to compare different protein products, for example, the analysis of the sequences alone is not sufficient.

When a tertiary structure is reduced to sequence, a wealth of information is lost. Therefore, a comparison of sequences will have to be less informative than a comparison of 3D structures. (Merkl, 2015, p. 31)
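The role of codons, start and stop signals, and the reading frame can be made tangible with a minimal Python sketch. It is not taken from the source: the function names `translate_frame` and `translate_from_start` and the example string are hypothetical, the table is the standard genetic code written in the usual U, C, A, G order, and real gene expression involves far more than this string lookup (folding, as discussed above, is not modeled at all).

```python
from itertools import product

# Standard genetic code, one-letter amino acids in U, C, A, G order; "*" = stop.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_frame(mrna, frame=0):
    """Read every complete triplet starting at offset `frame` (0, 1 or 2)."""
    return "".join(
        CODON_TABLE[mrna[i:i + 3]] for i in range(frame, len(mrna) - 2, 3)
    )


def translate_from_start(mrna):
    """Read from the first AUG (start codon) up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start signal: nothing is decoded
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":  # stop codon ends the reading
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    message = "CCAUGGCUAAAUGACC"  # hypothetical example sequence
    print(translate_from_start(message))   # MAK
    print(translate_frame(message, 0))     # PWLND
    print(translate_frame(message, 1))     # HG*MT
```

The same nucleotide string yields three different "texts" depending on where reading begins, which is the sense in which the reading frame carries information beyond the sequence itself.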
Reading the genes and deducing the coded contents is therefore different from “backward inference” from the
phenotype to the genotype. Thus, the reading direction of information is always from genotype to phenotype. From the phenotype, i.e., the complete organism, it is not possible to infer the underlying genetic information. Küppers describes this as so-called genetic determinism: Genes and DNA, respectively, are seen as material carriers or storage media of a “knowledge” that is of crucial importance for the life processes in an organism. By using this knowledge, among other things, during protein biosynthesis to produce proteins, genes make an essential informational contribution to the development and physiology of a living being. (Hildt & Kovacs, 2009, p. 21)
The fact that information can also flow in the other direction, from the phenotype or from the environment and experience of the individual into the genome, was for a long time neither considered nor thought possible. The term epigenetics is used to describe this process. However, it has sometimes been used in an esoteric way, so we need to look carefully at what epigenetics means and whether it is pure wishful thinking or even “lazy magic”. We will go into more detail on this topic in a moment. In addition to the genetic information base fixed in the DNA sequence, further “additional information” is needed so that the genes give rise to exactly what constitutes the phenotype. This “additional information” is not fixed in the DNA or RNA but acts along the path by which the primary information of the genome is converted into the phenotype. Genes carry information about the adult form of living beings; in a sense, they contain not only the blueprint but also the potentials of the developing individual. Nevertheless, the individual form of an adult living being is always unique, even at low levels of organization where the genetic blueprints are identical. The complexity of the biological information chain is nevertheless substantially greater than it may appear at first sight. For there are not only genetic information systems based on the coding of DNA and RNA, but also neuronal information systems that encode and transmit information. Only in living beings of higher complexity, however, can neuronal information systems such as nervous systems and brains exceed the information systems of the genetic stage. This is possible only from the developmental stage of reptiles onward, but not in bacteria, viruses and other lower organisms and systems.

In the evolution of life, the ability to process information is by no means limited to genetic information systems. Of paramount importance, information processing became associated with nervous systems and brains in more highly evolved organisms, and eventually with communication and information systems in populations and societies of organismic organisms. (Mainzer, 2016, p. 119)
The interplay of the various information systems beyond the genetic basis cannot be addressed here. We will have to limit ourselves to the genetic information coding system for the interesting question of transferring the idea of information coding and self-replication to technical systems.
5.3.1 Epigenetics
Knowledge manifests itself in biological systems in three different forms: as genetic information, as the experience of the single individual or the respective group of organisms, and, in humans, in the form of language, writing, and records (Junker & Paul, 2009, p. 129). In molecular genetics, the direction of information processing is always from genotype to phenotype. Küppers (1986) calls this genetic determinism. This not only means that we cannot unambiguously trace information from the phenotype back to the genotype, but also implies that the experiences of an individual or a collective cannot have a direct influence on the genetic information base. That natural selection has genetic consequences precisely because framework conditions change is the central statement of evolutionary biology. It never does so in the sense of a direct causal relationship, however, but always only through random gene changes and the subsequent “trial-and-error” system of selection. It is true that this process, strangely complicated and outrageously slow from the point of view of efficiency, allows not only adaptation to changing conditions but also developments that are not purposeful and therefore serve the (even playful) diversity of nature. From the rational point of view of humans, however, one would rather expect a process that makes direct, individual experiences usable as stored genetic information for future generations, beyond phenomena such as language, learning and culture, which are essentially limited to humans and a few higher animals. Epigenetics seems to describe a process that enables exactly this: information transmission from phenotype to genotype. This means that the experiences an individual has during its existence are coded as information that both influences gene activity, and thus the concrete biosynthesis of the current individual, and can be passed on to further generations. Special mention should be made here of the many research findings on traumatic experiences, which are reflected in subsequent generations in a wide variety of physical and psychological disorders (Leuzinger-Bohleber & Fischmann, 2014). Increasingly, however, there is research showing that other basic conditions of the individual, such as nutrition or environment, also find their way into the genetic basis of the individual and thus into the information chain of generations. The exact processes are not yet clear:

However, unlike the mechanisms of transmission of innate traits, which are relatively well understood and fall within the realm of genetics, we know less about how acquired traits are passed on. (Mansuy et al., 2020, p. 12)
The central proposition of epigenetics is the control of gene activities beyond the mere decoding of genetic information as explored and described by genetics. It is thus apparent that, in addition to the sequence of nucleotides, there are further controlling factors that accompany the path from genotype to phenotype and vice versa in terms of information. We have already learned about the switching on and off of genes by so-called start and stop sequences. With epigenetic information, there is another level that influences the decoding process, which is also called gene expression. Epigenetic information is encoded in the chromosomes, but not by the classical sequence of nucleotides, the DNA sequence, but by the open or closed chromatin structure of the chromosomes, i.e., the protein scaffold of the double helix, which determines the reading or non-reading of the gene sequences. The state of chromatin structure can be controlled by three different mechanisms: (1) by methylation, which is the addition of a methyl group to DNA nucleotides, (2) by altering the histone structure of chromatin, and (3) by noncoding, inactive RNA strands that cluster around the genome and prevent reading (Mansuy et al., 2020, p. 81). “Intuitively, it seems obvious that there is a data set that is added to the information provided by genes:
Fig. 5.2 How epigenetics works. Recoverable figure labels: promoter (section of the DNA that regulates the expression of a gene; on/off switch for genes); silenced gene section; readable gene section; DNA methylation (methyl groups block the promoter and switch off the gene); gene products (expression refers to the formation of a gene product encoded by a gene, especially proteins or RNA molecules). (Reprinted with kind permission from Visionaries of Health. Source: https://visionaere-gesundheit.de/)
[…] This added data set is epigenetics” (Mansuy et al., 2020, p. 26). Figure 5.2 shows how epigenetics works. Thus, epigenetic information is definitely chemically encoded. Unlike DNA-encoded information, it is causally induced by specific environmental conditions; it can be reversed or inherited over many generations. However, the influence of epigenetics does not only refer to the possible inheritance of information outside the DNA sequence to future generations, but also has concrete effects on the respective active life of an individual, namely by how well or poorly genetic information is implemented in all physiological processes of the organism. With epigenetics, another central informational part of heredity becomes manifest. If the research results are confirmed and if epigenetics and the epigenome establish themselves as a further relevant element of the genetic information system, the previous idea of the development of species by mutation and selection will experience a tremendous addition, almost an upheaval. The dictum of Darwin and all research results of geneticists see the
only process for the formation of biological diversity and the survival of species in the random, non-purposeful mutation of the genetic makeup of living beings and in the subsequent selection of the fittest. Mutations are random and the selection that follows is “merciless”, for only a “reality check” shows which mutation is the most suitable for the current conditions of an individual and its species. Causality and correlation play no role in the evolution of species at the level of genetic mutations. At best, the experiences and living conditions of a single individual play a role within its own life span. The concept of epigenetics, on the other hand, allows for a causal, informational connection between the general conditions to which an individual is exposed, the implementation of its genetic makeup, and the (possible) inheritance of these altered characteristics by subsequent generations. This adds a dimension to the genetic notion of mutation and selection that is also informationally and qualitatively different from the mechanisms of genetics. If one thinks the causal connection between the influence of environmental conditions, the epigenetic effects on the individual, and the transmission to the next generations consistently through to the end, the autonomy of man, the concept of his freedom and consequently the anthropological difference are at stake. If the freedom of man is reduced to his epigenetic factors, it is to be feared that Homo autonomus has become Homo determinatus. But if we view the causal dependence of our physique and our psyche on epigenetic properties and environmental determinants in a positive light, then the chance to turn our genes on and off gives us a (certain) possibility to influence our genetic makeup (which in turn determines us). Or, as the neuroepigeneticist Isabelle Mansuy puts it in the title of her book, “We can control our genes” (Mansuy et al., 2020). The idea that we would have complete freedom of choice in doing so, as Mansuy suggests in her book (“What body and mind would you like to form? Which heritage would you like to pass on?”, p. 141), unnecessarily places epigenetics in the vicinity of esoteric mumbo-jumbo. In addition to the informational dimensions of heredity, there are further informational layers in the formation and subsequent folding of proteins into their secondary and tertiary structures that describe the process of encoding hereditary information and thereby have an informational character in the process chain from genotype to phenotype. As with epigenetics, the informational basis of these processes is not yet understood in detail. Learning-induced evolution also exists beyond formal epigenetics. The so-called “Baldwin effect” shows that, in addition to purely genetically dominated evolution, other phenotypically induced evolutionary mechanisms also exist, although it can quite obviously only apply to higher organisms:

Individual development [has] an indirect influence on evolution and the new genotypes that emerge in the process. The essential basis of the Baldwin effect is a shared environment in which both evolution and learning occur. Thus, phenotypes that have changed because of learning then also influence the shared environment and thus the progression of evolution. This can result in selection advantages or disadvantages for individual genotypes in the population. Likewise, genotypes that provide a better basis for learning certain traits may prevail more easily in the population than other individuals. Learning is an integral part of the environment and thus an essential part of a species’ adaptation to the environment. According to the theory of the Baldwin effect, learned behavior can thus become instinctive behavior over long periods of time, which is then inherited quasi-directly. (Weicker, 2015, p. 34)
5.3.2 Prions
As we have seen above, there are different dimensions of information coding in living, dead and technical systems. Information coding without the alphabet of DNA or RNA, i.e., without nucleotide sequences, is also possible and occurs in living systems. Whether such epigenetic phenomena also occur in the gene expression of viruses is still unknown. Viruses, after all, are little more than a genome (usually made of RNA) with a protein coat. Their replication occurs after the RNA is introduced into a host cell, in whose nucleus the viral RNA is then read and amplified. If we imagine the RNA of a virus removed, all that remains is a protein envelope. Can we think of such protein molecules as systems that also replicate and thereby become infectious in the host? Indeed, such systems exist. They are called prions, and we have already introduced them in Sect. 2.4.3 to distinguish them from viruses. They are proteins with a special secondary structure, i.e., a folding, which distinguishes them from non-infectious proteins. In the host organism, they in turn induce non-infectious prion proteins to refold into infectious ones; in a sense, they replicate themselves through this induction. This replication occurs entirely without the basic genetic principle of encoding information in the nucleotide sequence of RNA or DNA. “Inheritance, since it occurs without RNA/DNA, can thus be described as epigenetic” (Modrow et al., 2010, p. 664). The term “prions” has existed in the scientific literature since 1982 (Hörnlimann, 2001, p. 10). It refers to proteins that occur in nerve cells, i.e., preferentially in the spinal cord and brain of higher animals. In their normal form, these are functional proteins that are formed in the organism and, for example, protect the cell from radicals.
However, when these prion proteins, which are useful and completely harmless in their normal form, change their structure and folding, they not only become non-functional, but are also deposited in the brain, causing cellular and neuronal disorders that lead to death. Thus, the pathogenic form of the protein no longer fulfills its function and is instead deposited as a functionless “junk protein” in the brain and nervous system, where it triggers cell death. Known prion diseases of the central nervous system include the very rare Creutzfeldt-Jakob disease in humans, scrapie in sheep, and BSE (“bovine spongiform encephalopathy”) in cattle, also known as “mad cow disease”, which is triggered by the ingestion of feed containing infectious sheep meal. The non-infectious prion proteins are abbreviated PrPC (for prion protein cellular), and their secondary structure consists mainly of alpha helices. The refolded, pathogenic prions (PrPSc, for prion protein scrapie), on the other hand, do not have a predominantly alpha-helical structure but are folded as beta-sheet structures. When these beta-sheet prions encounter normal prion proteins, they induce the conversion of the alpha-helix proteins into the beta-sheet structure, with the known consequences. For these discoveries, made as early as 1982, Stanley Prusiner was awarded the Nobel Prize 15 years later (Prusiner, 1982). In contrast to viral, animal, and bacterial pathogens, prions provoke neither immune reactions nor inflammation or other defensive effects in the infected organism (Riesner, 2001, p. 80). The mode of action, “transmission, and propagation pathways of prions are only beginning to be understood” (Kayser et al., 2014, p. 546), and their classification as phenomena of the animate or the inanimate world is not yet definitively possible. If one wants to establish a ranking from dead to living matter, or from dead systems to animate systems (which is strictly speaking inadmissible but can serve as a thought model for understanding our question), prions rank on the lowest level of self-replicating systems, directly after lifeless matter. Above them come the viruses, and above those the living organisms. The information-theoretical classification of replication and inheritance mechanisms is particularly difficult in the case of prions. Prions do not use the genetic alphabet for information coding, in contrast to what we have come to know in the case of viruses. Inheritance is epigenetic in the sense that a particular type of information base is responsible for triggering and propagating the infectious chain (inducing the structural transformation of normal prions), and “inheritance” is based on the transfer of misfolding information from infectious to normal prions. Thus, the infectious prion with its particular folding structure serves as a pattern for inducing the conversion of normal prions. Infectious prions, replicating without a genetic basis, are thus a “template for themselves” (Smith, 2000, p. 184). Prions therefore do not replicate by division like living cells, but by induced modification of neighboring molecules. Their reproduction follows a mechanism that is more like the growth of crystallization nuclei. The question is therefore which form of information coding is at work here if RNA and DNA do not occur as information carriers. For if prions are not only infectious but also multiply, then they are more than a cell poison, which also causes cell damage but neither replicates itself (as living organisms do), nor allows itself to be replicated by hosts (as viruses do), nor causes other molecules to take on its own form (as prions do). Prions replicate by inducing the change of normal proteins into infectious ones, i.e., the transformation of helical proteins into beta-sheet proteins. How they manage to do this is still largely unclear.
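Replication by induced refolding, without any nucleotide sequence, can be pictured with a deliberately crude toy model in Python. Everything in it is an assumption made for illustration: the function name `simulate_conversion`, the encounter-based conversion rule and the numbers are hypothetical and say nothing about the real, still poorly understood biochemistry. The sketch only shows the logic of a structure acting as a “template for itself”.

```python
def simulate_conversion(normal, misfolded, rate, steps):
    """Toy model: misfolded proteins convert normal ones they encounter.

    normal, misfolded: amounts of the two folding forms (arbitrary units)
    rate: assumed share of encounters leading to refolding (0 to 1)
    Returns the (normal, misfolded) amounts after every step.
    """
    total = normal + misfolded
    history = [(normal, misfolded)]
    for _ in range(steps):
        # A conversion needs both forms to meet, hence the product term.
        converted = rate * normal * misfolded / total
        normal -= converted
        misfolded += converted
        history.append((normal, misfolded))
    return history


if __name__ == "__main__":
    for step, (n, m) in enumerate(simulate_conversion(999.0, 1.0, 0.5, 40)):
        if step % 5 == 0:
            print(f"step {step:2d}: normal={n:8.2f} misfolded={m:8.2f}")
```

No new molecules are produced in this model; the total stays constant and only the folding state spreads, slowly at first and then rapidly, much like growth from a crystallization nucleus. Without a misfolded seed, nothing happens at all.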
Holding the phenomenon of prions up against the genetic dogma could make information induction appear as an alternative form of information transmission in the informational evolution of the living. After all, the mechanism of the basic law of molecular genetics is a highly complex process that must run millions of times, needs complex build-up and degradation processes, and requires complicated, manifold physicochemical conditions. It seems hardly conceivable that the living can encode its information exclusively in this form. Simpler possibilities are conceivable, less error-prone and with lower energy consumption, which could make the transmission of information over time and across generations possible. It stands to reason that we should also look for and consider possible “life forms” that do not use the classical genetic dogma and yet reproduce successfully. This cannot be more than a rough hypothesis, but it is certainly exciting. Prions give us an impetus to do so.
5.4 Information Coding in Technology
Information coding in technology is based on information theory, which was developed in 1948 by the American mathematician C. E. Shannon (1916–2001). In it, he analyzes the mathematical basis and relationships of information that is transmitted or stored, i.e., transformed in time and space. All information has syntactic aspects (arrangement of information elements, evaluability), semantic aspects (meaning, content) and pragmatic aspects (value of the information for the receiver, relevance for action). On the syntactic level, there are issues of encoding, transmission, and compression. Human speech, for example, or other human utterances are not immediately suitable for machine transmission and storage; they must be reworked, i.e., encoded, so that they can be stored or transported with the help of the “computer” system and its computational processes. Strictly speaking, however, the computer does not encode information, but data. “Data is physically available on a data carrier (for example, paper). Information is the abstract content of data that the receiver gains through interpretation” (Anlauff et al., 2002, p. 18). Data are processed in the computer system and are formulated according to unambiguous rules. Coding is generally defined as a mapping rule that assigns to each character of a defined character set (for example, the alphabet) a unique character from another character set (for example, ASCII code). Coding in computers is done using so-called binary characters, represented by the numbers one and zero (1, 0). The combination of these characters determines the content to be encoded. The variability of the code, which has only the two characters zero and one, is rather low compared to biological coding systems (such as the four nucleotides of the genetic code).

A rule for mapping one set of characters into another set of characters or words is called a code or encoding. An encoding is reversibly unambiguous if from the diversity of the images follows the diversity of the original images and vice versa. Decoding is only possible for a reversibly unique code. (Anlauff et al., 2002, p. 23)
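The definition of a code as a mapping rule, and the requirement that it be reversibly unambiguous, can be sketched in a few lines of Python. The names `is_reversible`, `encode`, `decode` and `bits_per_symbol`, as well as the two example tables, are hypothetical and chosen only for illustration; the sketch additionally assumes that all code words have the same fixed width, which is what lets a concatenated bit string be cut apart again.

```python
import math


def is_reversible(code):
    """Reversibly unambiguous: no two source characters share an image."""
    return len(set(code.values())) == len(code)


def encode(text, code):
    """Apply the mapping rule character by character."""
    return "".join(code[ch] for ch in text)


def decode(bits, code, width):
    """Invert the mapping; only meaningful for a reversible code."""
    inverse = {image: ch for ch, image in code.items()}
    return "".join(inverse[bits[i:i + width]] for i in range(0, len(bits), width))


def bits_per_symbol(alphabet_size):
    """Binary characters needed to distinguish the symbols of an alphabet."""
    return math.log2(alphabet_size)


# Hypothetical 2-bit code for the four RNA bases, and a defective variant.
BASE_CODE = {"A": "00", "C": "01", "G": "10", "U": "11"}
AMBIGUOUS = {"A": "00", "C": "01", "G": "10", "U": "10"}

if __name__ == "__main__":
    bits = encode("GAUC", BASE_CODE)
    print(bits, decode(bits, BASE_CODE, 2))          # 10001101 GAUC
    print(is_reversible(BASE_CODE), is_reversible(AMBIGUOUS))  # True False
    print(bits_per_symbol(2), bits_per_symbol(4))    # 1.0 2.0
```

In the defective table, G and U share the image "10", so the original text can no longer be recovered. The last line quantifies the remark on variability: one binary character distinguishes two states, whereas one nucleotide position distinguishes four and thus corresponds to two binary characters.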
Today, the coding of information in technical systems, such as machines and automata, always takes place on integrated or connected external computers. In this process, information is processed and stored according to clearly defined instructions.
In computers, encoding is the process of translating a particular string of characters (letters, digits, punctuation marks, or symbols) into a special format so that it can be transmitted or stored more efficiently. Decoding is the corresponding counter-process – converting an encoded format back to the original string. (ComputerWeekly.com, 2005)
The coding in the computer itself is carried out exclusively by means of the so-called binary characters already mentioned. These are, for example, the numbers zero and one or the colors black and white in the QR code. This is the basis of the digital world, in which all information is encoded by a sequence of these binary characters. The combination and sequence of these two characters determines the content to be encoded. The binary code, which also goes back to Shannon’s information theory (see above), is based on the distinction between only two states, such as “on” and “off” in the case of switches or precisely the numbers one and zero (binary code), as it forms the basis of digitization. In a computer, binary characters are represented by different physical states of the switching elements, such as high or low voltage {H, L}. The internal character set of a computer is therefore binary. Therefore, to represent the external character set, a combination of binary characters must be assigned internally to each character. (Anlauff et al., 2002, p. 10)
On the basis of this binary code, a set of special codes was developed. All of them are binary codes, but each is shaped to serve particular coding requirements and tasks. Thus, besides the dual code, there are the BCD code (coding of the digits 0 to 9 in four bits), the EBCDIC code (8-bit character coding), the ASCII code (American Standard Code for Information Interchange, with a coding in seven bits), the machine code (instruction set for individual processor types), the excess code (signed numbers in binary code), the Stibitz code (an equivalent to the BCD code, coding of decimal digits from 0 to 9), the Aiken code (comparable to the BCD code, assigns decimal digits to 4 bits), the 1-out-of-n code (codes a decimal number in n bits) or the Gray code (one-step code). The binary code of the computer thus corresponds to the basic genetic law (dogma), according to which nature stores and transfers (genetic) information in viruses and living beings. Special codes are also used for programming computer programs: In programming, the programmer translates the specifications to algorithms into a source code, which is formulated according to the syntax of a particular programming language and which, during further development of a computer program, is translated into further forms of program code – such as intermediate code (for example, bytecode). The final resulting machine code contains the machine instructions that a processor can execute. (Wikipedia, 2021b)
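To make the differences between some of the special codes listed above more tangible, the following minimal Python sketch encodes decimal digits in several of them. The function names are hypothetical and chosen only for illustration. The sketch assumes the common textbook definitions: the Stibitz code is taken as the excess-3 code, the Aiken code as the 2-4-2-1 weighted code, and the bit order of the 1-out-of-n word is an arbitrary convention.

```python
def to_bcd(digit: int) -> str:
    """Plain BCD (8-4-2-1): the value of the digit written in four bits."""
    return format(digit, "04b")


def to_excess3(digit: int) -> str:
    """Stibitz code, assumed here to be excess-3: digit + 3 in four bits."""
    return format(digit + 3, "04b")


def to_aiken(digit: int) -> str:
    """Aiken code (weights 2-4-2-1): digits 5 to 9 are shifted by 6."""
    return format(digit if digit < 5 else digit + 6, "04b")


def to_gray(value: int, width: int = 4) -> str:
    """Reflected Gray code: neighbouring values differ in exactly one bit."""
    return format(value ^ (value >> 1), f"0{width}b")


def to_one_hot(digit: int, n: int = 10) -> str:
    """1-out-of-n code: exactly one of n bits is set (position = digit)."""
    return "".join("1" if i == digit else "0" for i in range(n))


if __name__ == "__main__":
    for d in range(10):
        print(d, to_bcd(d), to_excess3(d), to_aiken(d), to_gray(d), to_one_hot(d))
```

All five functions map the same ten digits to different bit patterns, which illustrates that the choice of code is a convention on top of the same binary alphabet.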
When converting (encoding) textual information for the computer’s computational process, the ASCII code is most commonly used. However, other codes are also used to encode information for the computing process, making it processable, transmittable, and storable (ComputerWeekly.de, 2005).
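The encoding and decoding of text described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the helper names `encode_ascii` and `decode_ascii` are hypothetical, and each character is written as a seven-bit group separated by spaces for readability.

```python
def encode_ascii(text: str) -> str:
    """Translate a string into space-separated 7-bit ASCII groups."""
    codes = text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
    return " ".join(format(code, "07b") for code in codes)


def decode_ascii(bits: str) -> str:
    """Counter-process: turn the 7-bit groups back into the original string."""
    return "".join(chr(int(group, 2)) for group in bits.split())


if __name__ == "__main__":
    encoded = encode_ascii("DNA")
    print(encoded)                 # 1000100 1001110 1000001
    print(decode_ascii(encoded))   # DNA
```

The round trip shows that no information is lost: the bit sequence alone, together with the agreed code table, is enough to recover the text.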
5.5 The Meaning of Information with Viruses and Algorithms
Information is a special good; as an entity, it therefore has a very special character and its own “essence”. Information that stands without context and without meaning is data; as such, it is not effective and is therefore inherently neutral and worthless. Only when information becomes action-determining (for example in humans) or action-inducing, for example when it serves, in the form of a blueprint of organisms, as a template and pattern for the synthesis of products in a cell, does it become meaningful. Information is therefore also defined as “new, behavior-determining knowledge about an event, fact, or circumstance in reality. Information is elimination of uncertainty” (Schneider & Werner, 2007, p. 38). Information thus plays a constituent role in the emergence of life and for the successful operation of all its processes in all their diversity and complexity. It is the basis and blueprint for the reproduction of organisms and their transmission to subsequent generations. While the primary goal in the development of living organisms is initially the identical transmission of the basic structures and blueprints to subsequent generations, the special design of information coding in genetics at the same time allows for change and recombination of information in the genome. Only in this way have development and diversity of species become possible during evolution. Küppers assumes, in accordance with the molecular Darwinian approach, that selection in the sense of Darwin already began at the level of inanimate matter and molecules (and, of course, also at that of viruses) and thus “biological information arose through selective self-organization and evolution of biological macromolecules” (Küppers, 1986, p. 165). In the origin of life, the special conditions of self-replicating molecules are of great importance, since organisms of increasing complexity cannot arise anew in identical form each time. Therefore, the informational foundation and its coding in a powerful information system with a sophisticated explication mechanism are among the most fundamental phenomena of life.
The way information is structured is of considerable importance for the classification of automated, self-reproducing systems. Quite obviously, what matters is not the building blocks used, such as the molecules of animate and inanimate nature or the basic elements in the case of technology, but their organizational structure “that contains a complete description of itself and uses this information to create new copies (clones)” (Mainzer, 2019, p. 87). In the first early living organisms, or their still lifeless preforms, information was coded by RNA. Long RNA chains, however, are quite unstable, and information encoded there is easily lost again after a few duplication steps when the long molecule breaks apart. Rauchfuß calls this the first “information crisis of living organisms” (Rauchfuß, 2005, p. 267), which called for catalysts to ensure more accurate replication. Computer algorithms can also replicate and duplicate themselves. To do so, they use the system functions and replicate themselves identically. This means that a program (or malware) can spread quickly and en masse from computer to computer, for example via a network, and cover individual computers as well as entire systems, companies, or countries with copies of itself. Here, too, the primary maxim applies that the information in the program, or the program itself, serves as a blueprint, template, and basis initially for its own identical reproduction and is copied. Useful self-reproduction of computer software also exists, for example in the independent rolling out of new versions and programs across entire companies and their computer networks. Evolutionary algorithms, on the other hand, use the possibility that self-learning software changes, adapts, and develops itself just as autonomously when basic conditions change. Thus, the basic information of such software reaches an evolutionary status and can develop independently, detached from the original programming. Information and its application thus form the basis both in the emergence and development of organisms and their life processes and in technology, in computer programs and all computer-controlled systems, initially to secure a stable basis with identical copies. However, in advanced forms and usage scenarios, it can evolve as a self-changing system and autonomously adapt to changing conditions. In this sense, we can already speak of information systems of nature and technology in general. (Mainzer, 2016, p. 109)
This is the path to the autonomy of systems.
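The interplay of identical copying, mutation, and selection that underlies evolutionary algorithms can be illustrated with a deliberately small Python sketch. It is not taken from any of the cited works. The bit-string genome, the fitness function (number of ones), and all parameter values are assumptions chosen only to keep the example short.

```python
import random


def fitness(genome):
    """Toy fitness (assumption): the number of 1-bits in the genome."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Copy a genome, flipping each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def evolve(length=20, population_size=30, generations=60, rate=0.05, seed=0):
    """Run mutation and selection; return the best genome and the fitness history."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Selection: the fitter half survives unchanged (identical copies).
        parents = population[: population_size // 2]
        # Variation: the other half is refilled with mutated copies of survivors.
        children = [mutate(rng.choice(parents), rate, rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print("best fitness per generation:", history)
```

Because the surviving half is carried over unchanged, the best fitness in this sketch can never decrease. Improvement comes only from copying errors that happen to be advantageous and are then retained by selection.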
6 The Great Continuum: The Convergence of Life and Technology
Abstract If we want to compare biological and technical systems, self-replication, adaptation, and autonomy are central themes. At the origin of life, free molecules came together and were able to reproduce. Autonomous systems then evolved from this; life emerged. Autonomous software does not arise by itself, it must be programmed. But if the programmer programs self-replication, adaptability and autonomy into a system, software could act independently even beyond a given individual goal and follow only its own principle, just as autonomous agents do. The issue of artificial intelligence plays into this as well. In the replication of simple cells in experiments, an attempt is made to implement life principles in technical systems and to develop self-replicating nano systems. Such structures are called protocells, which are precursors of life, but not yet living beings. The example of a colony of robots that extract all the raw materials they are made of themselves and thus reproduce autonomously is an idea that ties in with self-replicating viruses. Except for attempts at nature-imitating methods of software programming, the question of permanent change, mutation, reorientation, and adaptation, i.e., the issue of implementing evolutionary selection, has so far remained unsolved.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_6
6.1 The Independence of the Systems
Without speculation there is no new observation. (Charles Darwin, 1857)
Free molecules that come together in the early seas of earth evolution and reproduce themselves based on special, random properties do not yet constitute autonomous systems. They are simply free molecules that reproduce. This meaning of “free” is not to be confused with the term “autonomous” as we use it in this context. An autonomous system is a complex structure that maintains, evolves, and, if necessary, reproduces itself. Free molecules capable of simple replication are still far from all this. They simply react based on pattern recognition between structures: Chemists therefore rightly speak of molecular pattern recognition, in which a molecular structure is “recognized” or “selected” as suitable or unsuitable by another molecule. Neither perception nor consciousness nor biological selection mechanisms are assumed. And yet, an information gain […] takes place. (Mainzer, 2016, p. 111)
Autonomous systems need a trigger, an occasion, or even a condition, i.e., a reason why they exist and evolve. It would be too far-fetched to speak of a necessary “sense”, which at best accrues to living beings and, strictly understood, can be granted only to active, self-determined human action. In the context of our search for the trigger of self-organized, autonomous systems, we should better speak of the underlying “principle”. Principle here denotes an ultimate justification of things and of being, which needs no further justification and is thus valid without proof. From the “principle”, being takes its origin. Aristotle establishes this in his Metaphysics (Volkmann-Schluck, 1979). But if free molecules come together and reproduce, an autonomous system can develop from them, characterized by special rules for how replication must take place. We can then replace “sense” by “principle”, and thus obtain a principle-guided system from the random reproduction of molecules. Thus, we are well on the way to the emergence of living beings, whose principle is their reproduction and preservation. Living beings show the immanent principle of preserving themselves and thus continuing to exist, as an individual but also, in the sense of evolution, as a whole species. This principle is encoded as information about the process of replication and has determined the continuity of all living things since the development of the first cell. Technical autonomous systems are also committed to a principle and ordered by rules. The autonomous system of a self-driving car has been determined by man according to principles which he has built into the software. He then releases the system into freedom. There, the system must prove itself even without an online connection. Unlike molecules, which behave autonomously based on special properties or a given biological blueprint, the freedom of the system of a self-driving car results from computer
programs that control or even constitute the autonomous system. The information of the computer program represents the autonomous system in the sense of its self-control, not the controlled elements of the system itself. Autonomous is therefore not the visible self-driving vehicle with its (mechanical) parts, but the informational basis in its computer programs. If we continue this thought, the principles of autonomy will have to be based in the software. However, if the software is merely a complex and thoroughly programmed unit that can only run according to given paths, it is not autonomous, but only a fast program that calculates through millions of options and delivers results according to given algorithms. A software is autonomous only when it determines the “principles” according to which it works. Only then does it free itself from determination by another entity, namely the programming human being. Are autonomous software systems conceivable at all? How can computers give themselves their own “principle”? Do they need free will for this? And: Would this mean that all living beings except humans would have no autonomy? In artificial intelligence, software that has special characteristics and behaves autonomously, for example, is referred to as “software agents”. Such agents are considered to have attributes such as “autonomous” (operates independently), “cognitive” (is capable of learning and learns based on previous decisions), “modal adaptive” (changes own settings and parameters), “active” (performs actions on its own initiative), “reactive” (responds to environmental changes), “robust” (compensates for disturbances), “social” (communicates with other agents) (Wooldridge, 2002).
Michael N. Huhns and Munindar P. Singh, in their introduction to the book Readings in Agents (1998), further expand the notion of autonomy for software agents. Agents, in their view, move between two extreme notions: one of the agents as a simple automaton working through its program, and the other of the agent endowed with consciousness, feelings, perception, and cognitive performances like those of a human. For our considerations on the development of self-replicating technical systems following the autonomy of living beings, agents of the second kind are primarily considered, even if we do not have to apply human attributes such as feeling or consciousness for autonomous software systems. Although this idea comes close to a humanoid robot or software system, we do not want to take a science fiction perspective in our considerations, but rather think about whether and to what extent software programs as agents can achieve an autonomous status that allows them to act in a “self-determined” manner. And this is independent of whether the system comes in a robot-like guise, travels as an autonomous vehicle, or runs as a pure software program that depends on the presence of a basic infrastructure in the form of a powerful computer with sufficient power supply. Huhns and Singh distinguish among agents “absolute, social, executive, and design autonomy” (Huhns & Singh, 1998, p. 3). Adaptive agents are characterized by the ability to adapt to changing and unforeseen conditions. Autonomous agents have an individual goal that they pursue, acting completely independently (Tokoro, 1994). We can operate here again with the notion of the “principle” that is intrinsic to an autonomous agent just as it is to a living being, and which governs the behavior of the agent just as it does that of a living being. Mario Tokoro concedes the principle of survival to an autonomous agent: “It
behaves to survive” (Tokoro, 1994, p. 424). In our thinking, however, this principle has so far been reserved for living organisms; we have conceived of it only together with them or attributed it only to them. Technical systems (software and robots) that follow this principle can at least be interpreted as a bridge between the dead and the living world: We […] cannot state that self-reproducing programs are alive, and thus, they cannot be compared exactly to living organisms. Certain biological structures, however, may still be suitably compared to self-reproducing programs. (Kraus, 2009, p. 64)
In particular, the question of the “principles” that are input into the system, or that it determines for itself, and on the basis of which it acts makes the difference between pure software systems that process tasks (however fast and complex, but never autonomously) and those that, following the development of living beings, behave autonomously and can define themselves by their own principles. Autonomy, reactivity, goal-directedness, and interaction capabilities are the key properties that distinguish an intelligent agent from simpler entities. (Görz et al., 2014, p. 529)
This already answers many of the above questions and we see software systems well on their way to autonomy. These are systems that can adapt to changing conditions, are self-initiating and communicate with their environment. They have all the prerequisites for developing into independent, autonomous systems that lead a kind of self-determined existence very much like living beings. The independence of the systems is only a matter of time.
6.2 The Self-Replication of Artificial Intelligence
Perhaps patient biology will once again endure what the intellect, prone to aberrations, has prepared for it with its offspring – technology? (Jarzebski, 2013, p. 16)
With the assumption made above that autonomous systems must be able to learn and be self-initiated, we quickly get into the subject area of artificial intelligence, which is characterized by the fact that it imitates human intelligence and the structures of human decision-making processes through machine learning. This takes us some distance away from our question about the informational basis of self-replication and autonomy of living and technical systems. Nevertheless, artificial intelligence forms the basis for the development of autonomous software systems, where the focus is precisely not on the imitation of human intelligence, but on the chances of the emergence of a self-“liberating” computer program or algorithm (see also Sect. 3.4). Artificial intelligence “thus analyzes its code and finds some ways to improve it” (Kipper, 2020, p. 69). But surely this means that it is autonomous in the sense of self-determination and therefore can make its own and independent decisions. We do not want to speculate about whether a “superintelligence” can emerge that will outstrip humans as a singularity¹ but want to be content with our more modest question about the possibilities of a
¹ “Transhumanism will transcend the limits of the human organism through technology. Beyond the technological singularity, a superintelligence will guide human development” (Mainzer, 2019, p. 227).
self-replicating and autonomous software.² However, even its potential existence alone would already possess an enormous dimension. In analogy to biologically effective viruses, could there be computer viruses, i.e., software programs, which behave autonomously from a certain stage and only follow their own principle, for example that of self-replication? We also came to artificial intelligence via the topic of agents, since its self-learning systems create the basis for an autonomous agent. There are various definitions of artificial intelligence, some of which focus on human-like references (thinking and acting humanly), while others define rational thinking and acting as its attributes (Russell & Norvig, 2016, p. 2). We can find examples of applications for such autonomous agents very early on, for example in space travel, where computer systems are “on their own” and must act autonomously (Russell & Norvig, 2016, p. 28). Although it is also true there that the programs must always make new decisions by analyzing and considering changing environmental conditions, which a programmer cannot foresee in such detail and therefore has not “programmed in”, the software agent nevertheless acts according to the goal or “principle” entered by the programmer. Thus, if the robot’s goal is to take a soil sample on an alien planet, the autonomous agent may act autonomously based on the conditions currently found (because the goal of taking a soil sample is its “agenda”), but it would never get the “idea” to now take air measurements instead of a soil sample. A more detailed answer to the question of what autonomy means for an agent lies in the knowledge available to the agent as a basis for its behavior.
² “A much debated question is whether the development of general AI will soon be followed by that of a superintelligence – a system whose intelligence is clearly superior to that of humans […] Superintelligence is thus very likely possible” (Kipper, 2020, p. 61).
Thus, at the beginning of an autonomous agent’s work, the programmer must input a base knowledge on which the agent can act. During its activities, however, the agent independently expands its knowledge through a variety of perceptions, thus creating its own basis on which it now acts (Bresch, 1986; Russell & Norvig, 2016, p. 39). Thus, the degree of autonomy increases; the system is not only self-learning, but also self-expanding in its autonomy. For us, this raises the exciting question of the extent to which such an autonomous agent could detach itself from its original “mission” and thus mutate into a “free” agent that gives itself its principle and becomes an autonomous, self-determined system. It would thus cross a qualitative threshold and could develop autonomously up to implementing the life principle of self-preservation. With this alone, the software would enter the realm of self-replicating systems that move autonomously in the world, and it would quickly come into close kinship with biologically effective viruses. Nevertheless, there is a fundamental difference between biological systems and computer programs: In contrast to natural life forms programs are information-, not matter-based. (Kraus, 2009, p. 64)
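The distinction drawn above between a fixed goal and a growing knowledge base can be made concrete with a toy Python sketch of the soil-sample agent. Everything in it is hypothetical: the class name `SamplingAgent`, the terrain labels, and the action names are illustrative assumptions and do not describe any real robotic system.

```python
from dataclasses import dataclass, field


@dataclass
class SamplingAgent:
    """Toy agent: the goal is fixed by the programmer, the knowledge can grow."""
    goal: str = "take_soil_sample"
    # Base knowledge entered by the programmer (hypothetical terrain rules).
    knowledge: dict = field(
        default_factory=lambda: {"sand": "drill", "rock": "avoid"}
    )

    def decide(self, terrain: str) -> str:
        """Choose an action in service of the fixed goal."""
        # Unknown terrain is probed first; no other goal is ever selected.
        return self.knowledge.get(terrain, "probe")

    def learn(self, terrain: str, drill_succeeded: bool) -> None:
        """Extend the knowledge base from the agent's own perception."""
        self.knowledge[terrain] = "drill" if drill_succeeded else "avoid"


if __name__ == "__main__":
    agent = SamplingAgent()
    print(agent.decide("ice"))    # probe: not covered by the base knowledge
    agent.learn("ice", True)
    print(agent.decide("ice"))    # drill: now based on self-acquired knowledge
    print(agent.goal)             # take_soil_sample: unchanged throughout
```

In this sketch the agent's behavior changes as its knowledge grows, but nothing in the code allows it to rewrite `goal`. An agent that gives itself its own principle would need exactly that capability.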
In this context, systems are conceivable that organize themselves exclusively based on computer hardware as software systems and as described above, depend on the existence of correspondingly powerful hardware with sufficient energy supply; in science fiction, these systems usually appear as electronic brains. But: A self-reproducing program is incapable of such a feat, even residing on a system and drawing on said computing system’s memory and energy, it remains dependent on activation through the operating system. (Kraus, 2009, p. 64)
On the other hand, systems are conceivable that autonomously manifest themselves holistically as a technical realization. The idea of autonomous robots is far older than the notion of autonomous, self-replicating (malicious) software in the form of a computer virus. Nevertheless, it seems complicated to imitate systems of natural organisms by computer programs. Nature is already so complex in many areas that it can no longer be adequately represented by software and algorithms. Nature-imitating procedures in software programming, such as simulated annealing, are therefore at best heuristic procedures that only allow approximations, because exhaustively trying out all options is impossible. The method therefore abandons the goal of an optimal solution and also accepts suboptimal solutions as local (intermediate) optima. “As an introduction to these so-called nature-imitating modeling paradigms, we recommend the […] described procedure of ‘simulated annealing’” (Vöcking et al., 2008). For all the expectation of self-replicating technical systems, this also points to the limitations we currently have in imitating natural systems with software.
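A minimal Python sketch may help to show what simulated annealing, mentioned above, does in practice. It follows the standard textbook scheme (random neighbor, acceptance of worse solutions with a temperature-dependent probability, gradual cooling). The cost function `bumpy`, the geometric cooling schedule, and all parameter values are assumptions made purely for illustration.

```python
import math
import random


def bumpy(x: float) -> float:
    """Example cost function (assumption) with several local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def simulated_annealing(cost, start, steps=5000, t_start=5.0, t_end=1e-3,
                        step_size=0.5, seed=0):
    """Heuristic minimization; returns the best point found and its cost."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for i in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (i / steps)
        candidate = current + rng.uniform(-step_size, step_size)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Improvements are always accepted; worse candidates only with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


if __name__ == "__main__":
    x, value = simulated_annealing(bumpy, start=4.0)
    print(f"best x found: {x:.3f}, cost: {value:.3f}")
```

The early acceptance of worse candidates is what allows the search to leave a local optimum. The result is still only an approximation, since nothing guarantees that the global optimum has been visited.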
6.3 The Robots Are Coming: Technical Solutions for Autonomous Systems
There is a lot of work going on and a lot of microscopy, but someone needs to come up with a clever idea. Rudolf Virchow (1821–1902)
There is a whole series of attempts and approaches to imitate a “living” cell and thus a living autonomous system. The replication of simple cells in experiments is one such
approach, which is oriented towards biology. Such structures are also called protocells, which are precursors of life, but not yet living beings. They are laboratory-made “self-replicating nano systems that exhibit many of the properties of living cells” (Miebach, 2016, p. 357), such as an information store, a metabolic system, and an envelope (membrane). Orthogonal biosystems are such modified cellular machines, where alternative basic biochemical substances are used and thus almost the same or at least similar functions and processes as in the original cells are achieved. For example, the 20 amino acids that are part of the genetic dogma and are used in the synthesis of proteins in almost all living systems can be replaced by other building blocks. There are also attempts to replace the natural genetic coding of amino acids in triplets with coding in quadruplets, that is, coding an amino acid by four bases instead of the natural three. Orthogonal bioengineering can also provide a cell with two genetic replication systems, giving it redundancy capability but also making coding more ambiguous. Transforming biochemical cellular systems to an engineering level is another approach to recreating living systems with their attributes of autonomy. Applying and transferring the methods and terminology from systems biology with the topics such as information, feedback or synchronization to the engineering domain is the beginning of the technical construction of systems. This is the first step to technically reproduce cellular structures and processes, including those of replication mechanisms and informational bases. The main unit of our system is an electronically programmable microfluid chip equipped with micro-electrodes and fluid channels and chambers, linked to a reconfigurable electronic chip. (Tangen et al., 2006, p. 49)
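The step from triplet to quadruplet coding discussed above is, in informational terms, a question of how many code words are available. The following short Python sketch simply enumerates them for the four RNA bases. The helper name `code_words` is hypothetical, and the sketch says nothing about how such codes are realized biochemically.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases


def code_words(length: int) -> list:
    """All possible words of the given length over the four bases."""
    return ["".join(word) for word in product(BASES, repeat=length)]


if __name__ == "__main__":
    triplets = code_words(3)
    quadruplets = code_words(4)
    print(len(triplets))     # 4**3 code words
    print(len(quadruplets))  # 4**4 code words
```

With three bases there are 4³ = 64 code words, already more than the 20 amino acids require, which is why the natural code is redundant. With four bases the number rises to 4⁴ = 256, which leaves room for additional, non-natural building blocks.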
On the technical level, there have been experiments with self-replicating systems since the 1970s, including an attempt to develop an assembly line in which robots reproduce themselves (Mainzer, 2019, p. 229). As early as the 1930s, serious scientists such as the mathematician and quantum theorist John von Neumann discussed a self-replicating artificial automaton and thereby anticipated the autonomous computer virus: Von Neumann’s dream of unlimited efficiency also contains the core of the nightmare of the killer virus that drives the silicon brain to madness: The efficiency of electron brains could even be increased if “artificial complex automata” were able to reproduce themselves, von Neumann speculated as early as 1934. (Schmundt, 2004, p. 166)
The example of a colony of robots that extract all the raw materials they are made of themselves and thus reproduce autonomously is a precursor to the idea of self-replicating viruses. However, except for attempts to use nature-imitating methods of software programming, the questions of permanent change, mutation, realignment, and adaptation, i.e., the questions of how to implement evolutionary selection, remain unresolved, so the analogy to biological systems has not yet been achieved (Schreiber, 2019). All these (thought) experiments are based on the idea that in future there may be self-replicating systems beyond viruses and living organisms that are replicated as artificial cells in a technical-inorganic milieu, for example, by various complex circuits systematically interacting and mimicking cell functions (Miebach, 2016, p. 358). The analogy of computer viruses and self-replicating software programs or even autonomous robots naturally provokes the question of whether “information organized and stored in a silicon chip
is less alive than that of the DNA double helix in our genes” (Heuveline, 2016, p. 1). The genes of living beings not only carry information about the basic blueprint in its generic form but also contain information about the adult form of the living being and thus all the potential for an individual shaping of the organism beyond the ideal-typical blueprint of the species. Thus, the individual form of adult living beings is always unique, even at the lowest level of organization. This peculiarity of living things, too, seems difficult to achieve in the technical realization of self-replicating systems.
7 The Coevolution of Life and Technology: Summary and Outlook
Abstract In this book, we have searched for functional and structural parallels between living organisms, biologically active viruses, and technical systems with respect to their replication systems, and we have asked ourselves whether computer systems can succeed in achieving autonomy in the sense of biological systems or biologically active viruses, and in triggering infectious pandemics similar to what viruses are capable of. Hardly anyone doubts that super intelligent technical systems will soon exist; but whether they can and will run “amok” is difficult to predict. Biologically effective viruses use the billion-year-old molecular biological dogma of living beings to reproduce and multiply. They are also fascinating because they represent the radically reduced counter-concept to the hyper-complex development of, for example, vertebrates, their brains, and neuronal control, up to the development of consciousness in humans. The “sense” of replication of viruses lies in the self-preservation and reproduction of the system, the sense of living beings in the preservation of their species. And exactly here early life forms, viruses and technical computer systems and their software meet again: the sense of self-replicating systems obviously is only the reproduction of themselves, the end in itself of such systems is their replication.
© The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023 R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3_7
The system proves: The virus and the virus have entered a new stage of their coevolution. (Schmundt, 2004, p. 178)
At the beginning of this book, we asked whether there are functional and structural parallels between living organisms, biologically effective viruses, and technically generated computer viruses. We have wondered how far the parallels of replication systems extend and whether computer viruses or computer programs can and will succeed in achieving autonomy in the sense of biological systems or biologically active viruses. We have also asked ourselves whether we may or must assume that soon there will be techno-autonomous systems that develop together with and in parallel to living beings through mutation and selection, enter co-evolution with them and are partners or competitors for resources and habitat. In doing so, we have attempted to trace the broad arc from the replicating systems of still inanimate nature, such as organic molecules, to the mechanisms of replication of viruses, to (early) biological replication systems, to self-replicating technical computer programs and evolutionary algorithms, and to answer the question of whether there is a continuity from the inorganic-dead to the living to technically built, self-replicating systems. “Viruses are software – and we are their hardware” (Bartenschlager, 2020). Against this background, we view the pandemics that biologically active viruses have triggered in recent centuries up to the Corona pandemic in 2020 not only through the lens of a parasite-host relationship, but also with the perhaps anxious question of whether self-replicating technical systems can become as dangerous
as infectious viruses in triggering pandemics. This is especially true if they achieve an autonomous status that enables independent further development and optimization through mutation and selection and thus flexible adaptations to new environmental conditions. Deprived of any control by the original programmer, such systems would then act only according to “their” principle and thus autonomously. A “pandemic” of computer systems, algorithms and autonomous robot systems is conceivable. Artificial intelligence and evolutionary algorithms then form the basis for autonomous technical systems that can give themselves their “principles” and thus follow their own agenda. We have looked at the various replication mechanisms and it has become clear that the replication processes of biologically active viruses and those of software are on a comparable informational basis. While algorithms and software programs do not employ organic components such as DNA and RNA for replication, the mechanism of encoding their basic information for system function and replication is similar. It is no longer only science fiction films and books that address the possibility of the singularity, i.e., that technical systems will be superior to the intelligence of humans (and could then act autonomously). That this singularity will come one day in the not-so-distant future is hardly controversial anymore; but how such a super intelligent system will act and whether it can run “amok” is difficult to predict. We have already mentioned possible barriers in the form of energy and powerful software systems above. In this context, it would be important to formulate standards and rules in software development so that “moral” barriers can be built into such systems, as well as a possible “emergency stop button” that stops the system when it wants to do harm. 
For biologically active viruses, as for autonomous computer systems, similar rules apply to hazard prevention: infection and spread must be prevented so that the whole
world is not infected with them. This is initially achieved by isolating and distancing virus carriers from potential hosts and, in the case of IT systems, by removing malware and potential hosts from the network. An out-of-control singularity on a lonely, isolated high-performance computer cannot develop power or spread. Individual protection against biologically active viruses and against malicious software also relies on quite similar mechanisms: living beings develop antibodies as an immune response to the intrusion of viruses. In computer systems, there are attempts to install an “immune response” in the form of learning software that detects invading malware and renders it harmless, while also tracking its changes and staying on the intruder’s trail. This race between immune response and viral invader is another striking parallel between the two systems. We have learned about the basic law of molecular biology, which has been valid as a virtually unchanged dogma for almost four billion years. In the course of evolution, it has changed little and has developed far less than its carrier organisms. This stability over such an enormously long period of time, which is unusual in the realm of the living, is impressive. At the same time, the distribution of this coding system for hereditary information (or hereditary data) is fascinating: it is almost identical in nearly all lower and higher organisms of any form and expression, and its functional mechanisms and principles are almost the same in every living being. Moreover, we find precursor forms of this basic molecular biological law in the areas bordering on living things, for example in organic molecules and their replication. We also see it in the large and successful group of (biologically active) viruses, which are virtually masters in the use of the molecular biological dogma, since they not only use this principle to replicate themselves (or have
themselves replicated) and thus ensure their continued existence, but also very effectively manipulate the replication apparatus of their respective host cells. Viruses are therefore fascinating. This is far less because of their occasionally lethal and destructive effect on their host organisms, as we see in viral pandemics in the animal kingdom and in humans, than because of their place in the kingdom of living things. We have explained various hypotheses about this in this book. There is circumstantial evidence that viruses may be precursors of living things or that they may have arisen secondarily as a reduced form of living things; moreover, their evolutionary history is conceivable as a mixture of these theories. But perhaps viruses are also so fascinating because they represent a low-tech counterpart to the hypercomplex evolution of, say, vertebrates, their brains and neural control, all the way to the evolution of consciousness in humans. They are the model of radical reduction to the essential, to the only relevant question on this planet: that of survival and reproduction. And they manage this with the simplest means, with maximally reduced structural, functional, and physiological ballast. Indeed, one could even say that viruses have thrown off all ballast: they travel with only the light baggage of their slender genome and a thin shell. They have completely outsourced all the elaborate processes and structures for replicating the genome and the envelope to a host, to which they entrust these tasks and which must perform their reproduction for them not only with its own structures but also with its own resources, and which as a “reward” usually suffers not only severe damage but often even death. Viruses are masters of reduction. The most diverse, fascinating, and unusual designs and variants of life and its expressions appear before us. They range from the simplest
single-celled organism, a simple bacterium, to the highly complex organisms of the animal and plant world. In addition, alongside the organic and inorganic matter of this earth and the various forms of energy, we find a set of systems that have developed together with the organisms, are related to them, and form part of the overall environment of life. Viroids, prions, and viruses, for example, are among them. Appreciating all that we can see and explore, and given our current intellectual abilities with their limited insight and foresight, we assume that this system surrounding us represents, or at least should represent, a balanced equilibrium. To seek a meaning behind all these questions is not so much a scientific approach as a human, intellectual one. No explanation is needed for the meaningfulness of the replication of nucleic acids. It lies in their chemical structure and thus exists independently of whether a blueprint is based on information or not. Küppers assumes, in accordance with the molecular Darwinian approach, that selection in Darwin’s sense already began at the level of inanimate matter and molecules, and thus that “biological information has arisen through selective self-organization and evolution of biological macromolecules” (Küppers, 1986, p. 165). The “purpose” of viral replication seems to be the self-preservation and reproduction of the system, and the purpose of living organisms is the preservation of their species. In purely self-replicating systems, the only purpose is their own reproduction; replication is an end in itself. And it is exactly here that early life forms, viruses, and technical computer systems and their software meet again. These results do not provide a clear answer to the question of whether we will be determined in the future by foreign RNA or DNA or by artificial code, just as they do
not provide a clear answer to the question of meaning. However, it seems far from certain that a radically reduced system, be it a virus or autonomous software, will not contest the ancestral place of humans and other living beings on this planet in the future.
References
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Fachmedien Wiesbaden GmbH, part of Springer Nature 2023. R. Ball, Viruses in all Dimensions, https://doi.org/10.1007/978-3-658-38826-3

Abts, D., & Mülder, W. (2017). Grundkurs Wirtschaftsinformatik. Eine kompakte und praxisorientierte Einführung (9., erweiterte und aktualisierte Aufl.). Springer Vieweg.
Anlauff, H., Böttcher, A., & Ruckert, M. (2002). Darstellung von Information. Codierung und Zahlensysteme. In H. Anlauff, A. Böttcher, & M. Ruckert (Hrsg.), Das MMIX-Buch (S. 9–30). Springer. https://doi.org/10.1007/978-3-642-56233-4_2
Bartenschlager, R. (2020, Juli 26). Viren sind Software – und wir ihre Hardware. FAZ. https://www.faz.net/aktuell/wissen/medizin-ernaehrung/corona-pandemie-viren-sind-software-und-wir-ihre-hardware-16876006.html
Bartneck, C., Lütge, C., Wagner, A. R., & Welsh, S. (2019). Ethik in KI und Robotik. Hanser.
Bendel, O. (2019). Roboterphilosophie. In Gabler Wirtschaftslexikon. Springer Gabler. https://wirtschaftslexikon.gabler.de/definition/roboterphilosophie-54555/version-371804
Borck, C. (2004). Vivarium des Wissens. Kleine Ontologie des Schnupfens. In R. Mayer & B. Weingart (Hrsg.), VIRUS. Mutationen einer Metapher (S. 43–61). Transcript. https://doi.org/10.14361/9783839401934-001
Brandt, C. (2004). Metapher und Experiment. Von der Virusforschung zum genetischen Code. Wallstein.
Brecht, W. (1995). Theoretische Informatik. Grundlagen und praktische Anwendungen. Vieweg.
Bresch, C. (1986). Wie das Leben leben lernte. In R. Böhme & K. Meschkowski (Hrsg.), Lust an der Natur. Ein Lesebuch aus Literatur und Wissenschaft (S. 199–205). Piper.
Brunnstein, K. (1994). Beastware (Viren, Würmer, trojanische Pferde). Paradigmen Systemischer Unsicherheit. In J. Eberspächer (Hrsg.), Sichere Daten, sichere Kommunikation/Secure information, secure communication (Bd. 18, S. 44–60). Springer. https://doi.org/10.1007/978-3-642-85103-2_4
Bühler, K. (1934). Sprachtheorie. Die Darstellungsfunktion der Sprache (3. Aufl., ungekürzter Neudr. d. Ausg. Jena, Fischer). Lucius & Lucius.
Bülle, J. (2000). Kinetische Verfolgung selbstreplizierender Systeme durch Messung von Fluoreszenz-Resonanz-Energie-Transfer [Dissertation, Ruhr-Universität Bochum]. http://www-brs.ub.ruhr-uni-bochum.de/netahtml/HSS/Diss/BuelleJan/diss.pdf
Cohen, F. (1984). Computer viruses – Theory and experiments. Introduction and abstract. https://web.eecs.umich.edu/~aprakash/eecs588/handouts/cohen-viruses.html
ComputerWeekly.de. (2005, November). Codierung und Decodierung [Definitionen]. ComputerWeekly.de. www.computerweekly.com/de/definition/Codierung-und-Decodierung
Cremer, C. (2011). Mikroskope und Mikroben. In K. Sonntag (Hrsg.), Viren und andere Mikroben. Heil oder Plage? (S. 100–135). Universitätsverlag Winter.
Dershowitz, N., & Gurevich, Y. (2008). A natural axiomatization of computability and proof of Church’s thesis. Bulletin of Symbolic Logic, 14(3), 299–350. https://doi.org/10.2178/bsl/1231081370
Deutsche Welle. (2014, März 6). Super-Virus aus dem ewigen Eis – Kein Grund zur Panik. Deutsche Welle. https://www.dw.com/de/super-virus-aus-dem-ewigen-eis-kein-grund-zur-panik/a-17479901
DNA. (1994). In R. Sauermost (Hrsg.), Herder-Lexikon der Biologie (Bd. 4). Spektrum Akademischer.
Doerfler, W. (1996). Viren. Krankheitserreger und Trojanisches Pferd. Springer.
Eigen, M., Biebricher, C. K., Gebinoga, M., & Gardiner, W. C. (1991). The hypercycle. Coupling of RNA and protein biosynthesis in the infection cycle of an RNA bacteriophage. Biochemistry, 30(46), 11005–11018. https://doi.org/10.1021/bi00110a001
Eilers, C. (2010). Schadsoftware. Die Entwicklung der Würmer. www.ceilers-news.de/serendipity/75-Schadsoftware-Die-Entwicklung-der-Wuermer.html
Fasel, D. (2014). Big Data – Eine Einführung. HMD Praxis der Wirtschaftsinformatik, 51(4), 386–400. https://doi.org/10.1365/s40702-014-0054-8
FAZ. (2001, Februar 12). Menschliches Erbgut ganz entziffert. FAZ Politik. www.faz.net/aktuell/politik/genforschung-menschliches-erbgut-ganz-entziffert-115130.html
Gatlin, L. L. (1972). Information theory and the living system. Columbia University Press.
Görz, G., Schneeberger, J., & Schmid, U. (2014). Handbuch der künstlichen Intelligenz (5., überarb. und aktualisierte Aufl.). Oldenbourg.
Grunwald, A. (2019). Der unterlegene Mensch. Die Zukunft der Menschheit im Angesicht von Algorithmen, künstlicher Intelligenz und Robotern. riva.
Gurevich, Y. (2011). What is an algorithm? https://www.microsoft.com/en-us/research/publication/what-is-an-algorithm/
Haaß, W.-D. (1997). Kommunikationsmodelle. In Handbuch der Kommunikationsnetze (S. 115–246). Springer. https://doi.org/10.1007/978-3-642-59036-8
Haeseler, A., & Liebers, D. (2003). Molekulare Evolution (Orig.Ausg, Bd. 15365). Fischer Taschenbuch.
Heuveline, V. (2015). Infiziertes Netz. Virtuelle Krankmacher. Ruperto Carola, 6, 101–107. https://doi.org/10.11588/RUCA.2015.6.20895
Heuveline, V. (2016, Juni 10). Virtuelle Krankmacher. Was Computerviren und ihre biologischen Gegenparts gemeinsam haben. Scinexx – das Wissensmagazin. https://www.scinexx.de/service/dossier_print_all.php?dossierID=91180
Hildt, E., & Kovacs, L. (2009). Zur Bedeutung genetischer Information. Eine Einführung. In E. Hildt & L. Kovacs (Hrsg.), Was bedeutet “genetische Information”? (S. 13 ff). de Gruyter.
Hodcroft, E. B., Zuber, M., Nadeau, S., Vaughan, T. G., Crawford, K. H. D., Althaus, C. L., Reichmuth, M. L., Bowen, J. E., Walls, A. C., Corti, D., Bloom, J. D., Veesler, D., Mateo, D., Hernando, A., Comas, I., González Candelas, F., SeqCOVID-SPAIN Consortium, Stadler, T., & Neher, R. A. (2020). Emergence and spread of a SARS-CoV-2 variant through Europe in the summer of 2020 [Preprint]. Epidemiology. https://doi.org/10.1101/2020.10.25.20219063
Hof, H., & Dörries, R. (2017). Medizinische Mikrobiologie (6. Aufl.). Thieme.
Hörnlimann, B. (2001). Historische Einführung. Prionen und Prionenklassen. In B. Hörnlimann, D. Riesner, & H. Kretzschmar (Hrsg.), Prionen und Prionkrankheiten (S. 4–20). de Gruyter.
Huhns, M. N., & Singh, M. P. (Hrsg.). (1998). Readings in agents. Morgan Kaufmann.
Irmer, J., & Müller-Jung, J. (2020). Das verwunschene Virus. FAZ – Natur und Wissenschaft, 105, 11.
ISO. (2010). Systems and software engineering. Vocabulary (Nr. 24765). https://www.iso.org/obp/ui/#iso:std:iso-iec-ieee:24765:ed-1:v1:en
Jarzebski, J. (2013). Der Verstand der Evolution und die Evolution des Verstandes. In S. Lem (Hrsg.), Die Technologiefalle. Essays (S. 16 ff.). Suhrkamp.
Junker, T., & Paul, S. (2009). Der Darwin-Code. Die Evolution erklärt unser Leben. C. H. Beck.
Kasparov, G. (2020, April 20). Was Computerviren und menschliche Viren gemeinsam haben. Gastbeitrag. Focus Online. https://www.focus.de/digital/dldaily/kolumnen/autoritarismus-geht-viral-was-computerviren-und-menschliche-viren-gemeinsam-haben_id_11900140.html
Kaspersky Labs GmbH. (2021, Februar 1). Computerviren – Geschichte und Ausblick. https://www.kaspersky.de/resource-center/threats/computer-viruses-and-malware-facts-and-faqs
Kayser, F. H., Böttger, E. C., Haller, O., Deplazes, P., & Roers, A. (2014). Taschenlehrbuch Medizinische Mikrobiologie (13., vollst. überarb. und erw. Aufl.). Thieme.
Khabyuk, O. (2018). Kommunikationsmodelle. Grundlagen – Anwendungsfelder – Grenzen (H. Peters, Hrsg.). Kohlhammer.
Kipper, J. (2020). Künstliche Intelligenz – Fluch oder Segen? J.-B.Metzlersche Verlagsbuchhandlung.
Kraus, J. (2009). On self-reproducing computer programs. Journal in Computer Virology, 5(1), 9–87. https://doi.org/10.1007/s11416-008-0115-z
Kuhlen, R., Seeger, T., & Strauch, D. (Hrsg.). (2004). Grundlagen der praktischen Information und Dokumentation (5. Aufl.). de Gruyter. https://doi.org/10.1515/9783110964110
Küppers, B.-O. (1986). Der Ursprung biologischer Information. Zur Naturphilosophie der Lebensentstehung. Piper.
Lander, E. S., et al. (2001). Initial sequencing and analysis of the human genome. Nature, 409(6822), 860–921. https://doi.org/10.1038/35057062
Lem, S. (Hrsg.). (2013). Die Technologiefalle. Essays. Suhrkamp.
Leuzinger-Bohleber, M., & Fischmann, T. (2014). Transgenerationelle Weitergabe von Trauma und Depression. Psychoanalytische und epigenetische Überlegungen. In V. Lux & J. T. Richter (Hrsg.), Kulturen der Epigenetik: Vererbt, codiert, übertragen (S. 69–88). de Gruyter. https://doi.org/10.1515/9783110316032.69
Lossau, N. (2011, Februar 3). Was uns die Entschlüsselung des Erbguts gebracht hat. Welt. https://www.welt.de/wissenschaft/article12437202/Was-uns-die-Entschluesselung-des-Erbguts-gebracht-hat.html
Mainzer, K. (2016). Information. Berlin University Press.
Mainzer, K. (2019). Künstliche Intelligenz – Wann übernehmen die Maschinen? (2. Aufl.). Springer. https://doi.org/10.1007/978-3-662-58046-2
Mansuy, I., Gurret, J.-M., & Lefief-Delcourt, A. (2020). Wir können unsere Gene steuern! Die Chancen der Epigenetik für ein gesundes und glückliches Leben. Berlin Verlag.
Mayer, R., & Weingart, B. (2004). Viren zirkulieren. Eine Einleitung. In R. Mayer & B. Weingart (Hrsg.), VIRUS. Mutationen einer Metapher (S. 7–42). Transcript. https://doi.org/10.14361/9783839401934-intro
Mayer-Schönberger, V., & Cukier, K. (2013). Big Data. Die Revolution, die unser Leben verändern wird (2. Aufl.). Redline. http://deposit.d-nb.de/cgi-bin/dokserv?id=4335922&prov=M&dok_var=1&dok_ext=htm
Megginson, L. C. (1963). Lessons from Europe for American Business. Southwestern Social Science Quarterly, 44(1), 3–13.
Meißner, R. (2010). Geschichte der Erde. Von den Anfängen des Planeten bis zur Entstehung des Lebens (3., aktualisierte Aufl., Bd. 2110). C. H. Beck.
Merkl, R. (2015). Bioinformatik. Grundlagen, Algorithmen, Anwendungen (3., vollst. überarb. u. erw. Auflage). Wiley-VCH.
Miebach, H. D. (2016). Wissenschaftliche Transdisziplinarität. Ein philosophischer und ethisch-kritischer Diskurs. Peter Lang.
Modrow, S., Falke, D., Truyen, U., & Schätzl, H. (2010). Molekulare Virologie (3. Aufl.). Spektrum Akademischer. https://doi.org/10.1007/978-3-8274-2241-5
Mölling, K. (2015). Supermacht des Lebens. Reisen in die erstaunliche Welt der Viren. C. H. Beck. https://doi.org/10.17104/97834066
Paul, M. (Hrsg.). (1989). Computergestützter Arbeitsplatz (Bd. 222). Springer.
Penzlin, H. (2016). Das Phänomen Leben. Grundfragen der Theoretischen Biologie (2., aktualisierte und erweiterte Aufl.). Springer Spektrum.
Picot, A. (1988). Die Planung der Unternehmensressource “Information”. In Erfolgsfaktor Information (S. 223–250). Universitätsbibliothek der LMU München. https://epub.ub.uni-muenchen.de/7062/1/7062.pdf
Prusiner, S. (1982). Novel proteinaceous infectious particles cause scrapie. Science, 216(4542), 136–144. https://doi.org/10.1126/science.6801762
Rauchfuß, H. (2005). Chemische Evolution und der Ursprung des Lebens. Springer.
Riesner, D. (2001). Die Scrapie-Isoform des Prion-Proteins PrP(sc) im Vergleich zur zellulären Isoform PrP(c). In B. Hörnlimann, D. Riesner, & H. Kretzschmar (Hrsg.), Prionen und Prionkrankheiten (S. 81–91). de Gruyter.
Röhner, J., & Schütz, A. (2020). Klassische Kommunikationsmodelle. In J. Röhner & A. Schütz (Hrsg.), Psychologie der Kommunikation (S. 27–51). Springer. https://doi.org/10.1007/978-3-662-61338-2_2
Röhrlich, D. (2012). Urmeer. Die Entstehung des Lebens. Mare-Verl.
Russell, S. J., & Norvig, P. (Hrsg.). (2016). Artificial intelligence. A modern approach (3. Aufl.). Pearson.
Schaper, M. (2004). Die Geburt der Erde und die Entstehung des Lebens. Wie aus einem glühenden Plasmahaufen der Blaue Planet wurde (Bd. 1). Gruner und Jahr.
Schmidt, K. (2009). Transgene Tiere und genetische Information. In E. Hildt & L. Kovacs (Hrsg.), Was bedeutet “genetische Information”? (S. 97). de Gruyter. https://doi.org/10.1515/9783110211986.95
Schmitt, P.-P. (2020, Juli 5). Schon die Wikinger litten unter einem Virus. FAZ. https://www.faz.net/aktuell/gesellschaft/gesundheit/pocken-schon-die-wikinger-litten-unter-einem-virus-16874876.html
Schmundt, H. (2004). Der Virus und das Virus. Vom programmierten Leben zum lebenden Programm. In R. Mayer & B. Weingart (Hrsg.), VIRUS. Mutationen einer Metapher (S. 159–182). Transcript. https://doi.org/10.14361/9783839401934-007
Schneider, U., & Werner, D. (Hrsg.). (2007). Taschenbuch der Informatik. Mit 108 Tabellen (6., neu bearb. Aufl.). Fachbuchverlag Leipzig im Carl-Hanser-Verl.
Schreiber, U. C. (2019). Das Geheimnis um die erste Zelle. Dem Ursprung des Lebens auf der Spur. Springer.
Shapiro, J. A. (2009). Revisiting the central dogma in the 21st century. Annals of the New York Academy of Sciences, 1178, 6–28. https://doi.org/10.1111/j.1749-6632.2009.04990.x
Smith, J. M. (2000). The concept of information in biology. Philosophy of Science, 67(2), 177–194. https://doi.org/10.1086/392768
Sobe, P. (2012). Algorithmen. Vorlesungsfolien. https://www2.htw-dresden.de/~sobe/Vo_Info1_Jg12/1_Algorithmen.pdf
Sourjik, V. (2011). Mikroorganismen. Überblick. In K. Sonntag (Hrsg.), Viren und andere Mikroben. Heil oder Plage? (S. 9–25). Universitätsverlag Winter.
Stähel, U., & Wienold, H. (2020). Algorithmus. In Lexikon zur Soziologie. Springer Fachmedien. https://doi.org/10.1007/978-3-658-30834-6
Stetter, K. O. (2011). Hyperthermophile Archaea als Zeugen der Urzeit. In K. Sonntag (Hrsg.), Viren und andere Mikroben. Heil oder Plage? (S. 25–55). Universitätsverlag Winter.
Stiller, S. (2015). Planet der Algorithmen. Ein Reiseführer. Albrecht Knaus.
Tangen, U., Wagler, P. F., Chemnitz, S., Goranovic, G., Maeke, T., & McCaskill, J. S. (2006). An electronically controlled microfluidic approach towards artificial cells. Complexus, 3(1–3), 48–57. https://doi.org/10.1159/000094187
Täubig, H. (2010). Algorithmen und Datenstrukturen. Grundlagen. http://wwwmayr.informatik.tu-muenchen.de/lehre/2010SS/gad/slides/GAD-2010-04-20.pdf
Thoms, S. P. (2005). Ursprung des Lebens (Bd. 16128). Fischer Taschenbuch.
Tokoro, M. (1994). The society of objects. ACM SIGPLAN OOPS Messenger, 5(2), 421–429. https://doi.org/10.1145/260304.260305
Turing, A. M. (1937). On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, s2–42(1), 230–265. https://doi.org/10.1112/plms/s2-42.1.230
Vöcking, B., Alt, H., Dietzfelbinger, M., Reischuk, R., Scheideler, C., Vollmer, H., & Wagner, D. (Hrsg.). (2008). Taschenbuch der Algorithmen. Springer.
Volkmann-Schluck, K.-H. (1979). Die Metaphysik des Aristoteles. Klostermann.
Weber, M. (1919). Wissenschaft als Beruf: Bd. Max Weber, Vortrag 1. Duncker & Humblot.
Weicker, K. (2007). Evolutionäre Algorithmen (2., überarbeitete und erweiterte Aufl.). B.G. Teubner. https://doi.org/10.1007/978-3-8351-9203-4
Weicker, K. (2015). Evolutionäre Algorithmen (3., überarb. u. erw. Aufl.). Springer Vieweg.
Weigmann, K. (2004). The code, the text and the language of God. When explaining science and its implications to the lay public, metaphors come in handy. But their indiscriminant use could also easily backfire. EMBO Reports, 5(2), 116–118. https://doi.org/10.1038/sj.embor.7400069
Weitze, M.-D. (2008, Mai 14). Der Ursprung des Lebens – Ein einmaliges Ereignis? Neue Zürcher Zeitung. www.nzz.ch/der_ursprung_des_lebens__ein_einmaliges_ereignis-1.73280
Welt. (2015, März 27). Jeden Tag 350.000 neue Viren und Würmer. Welt. www.welt.de/wirtschaft/webwelt/article160309780/Jeden-Tag-350-000-neue-Viren-und-Wuermer.html
Weuffen, G. (2019). Die Turingmaschine. “Biber am laufenden Band”. http://www.matheprisma.uni-wuppertal.de/Module/Turing/index.htm
Wikipedia. (2021a). Information. In Wikipedia, die freie Enzyklopädie. https://de.wikipedia.org/w/index.php?title=Information&oldid=209655275
Wikipedia. (2021b). Code. In Wikipedia, die freie Enzyklopädie. https://de.wikipedia.org/w/index.php?title=Code&oldid=210825976
Wooldridge, M. (2002). Intelligent agents: The key concepts. In Multi-agent systems and applications II: 9th ECCAI-ACAI/EASSS 2001, AEMAS 2001, HoloMAS 2001: Selected revised papers (S. 3–43). Springer. http://rave.ohiolink.edu/ebooks/ebc/3540459820
Index
A
Acids, 22, 35, 45, 46, 48, 49, 55, 98, 99, 138
Adaptations, 5, 29, 33, 34, 55, 56, 63, 68, 103, 107, 130, 135
Adenine, 21, 22, 26, 47, 49
Alan Turing, 65, 66, 70
Algorithms, 6, 14, 61–67, 69–72, 93, 97, 114–117, 122, 125, 128, 135
Amino acids, 15–17, 19, 22, 23, 25, 26, 55, 58, 95, 99, 100, 129
Anti-virus programs, 73, 77, 83
Archaea, 12, 13, 31, 48, 53
Artificial intelligence (AI), 69–72, 97, 122, 125–128, 135
Autonomous systems, 36, 70, 71, 81, 91, 93, 120–122, 124, 125, 128–131, 134

B
Bacteria, 10–15, 19, 20, 27, 31, 42, 43, 45, 48, 49, 51, 56, 77, 102, 138
Bacteriophages, 42, 43
Big Data, 62
Binary codes, 93, 113, 114
Biological dogma, 27, 133, 137

C
Catalysts, 25, 29, 31, 116
Causality, 91, 106
Cell nucleus, 12, 18, 26, 27, 30, 31, 43, 95
Chromatin, 20, 104
Chromosomes, 20, 56, 104
Coevolution, 50, 51, 69, 72, 134
Code, 97
Coding of genetic information, 5, 12, 98–99
Codons, 22, 24, 25, 100
Co-evolution, 84
Compartmentalization, 29–31, 98
Computer, 5, 6, 24, 29, 33, 46, 60–62, 65–85, 88, 89, 91, 93, 97, 112–114, 116, 122, 123, 126, 127, 130, 134–136, 139
    programs, 6, 7, 14, 33, 40, 46, 60–67, 69, 70, 74, 89, 114, 117, 121–122, 125, 127, 128, 134
    virus, 6, 69, 74, 76, 77, 128, 130
    worms, 7, 76, 78–80
Corona pandemic, 2, 4, 7, 41, 135
COVID-19, 2, 3, 42
Cytosine, 21, 22, 47, 49

D
Decoding of genetic information, 104
Deoxyribonucleic acid (DNA), 4–6, 16–28, 31, 37, 39, 43, 45, 46, 48, 49, 52, 55, 57, 68, 85, 98–102, 104, 108, 110, 131, 135, 139
DNA sequences, 17, 20, 101, 104, 105
Double helix, 20, 21, 25–27, 85, 104, 131

E
Emergence of life, 12, 19, 20, 34–40, 55, 87, 88, 98, 115
Encoding, 7, 16, 46, 68, 94, 107, 111–114, 135
Encoding information, 3, 87, 94, 108
Encoding of genetic information, 56
Epigenetic information, 104, 105
Epigenetics, 101–108, 110
Eukaryotes, 12, 26, 30, 31, 45
Evolutionary algorithms, 67–69, 83, 88, 116, 134, 135

G
Genes, 17, 20–22, 25, 40, 47–52, 56, 58, 67, 85, 92, 99–101, 103, 104, 106, 108, 131
Genetics, 4, 5, 14–16, 19, 20, 23, 27, 31, 32, 41–58, 67, 68, 93, 96, 97, 101–104, 106, 108, 110, 111, 114, 115, 129
    algorithms, 67, 71
    alphabet, 22, 23, 110
    codes, 2, 15, 17, 19–28, 47, 49, 67, 87, 89, 93–112
    information, 5, 6, 16, 18–20, 22, 25–28, 31, 32, 35, 43–46, 48, 50, 51, 55, 67, 95, 100–103, 105
    laws, 15, 23, 24, 27, 32, 51, 55
Genotypes, 101–104, 107
Giga viruses, 13, 14, 56
Growth, 16, 25, 29, 32, 110
Guanine, 21, 22, 26, 47, 49

H
Hardware, 7, 53, 60, 61, 66, 82, 99, 127, 134
Hereditary information, 14, 16, 17, 20, 21, 26, 28, 31, 32, 46, 107, 136
Human genome, 47, 48, 56, 97
Hypercycle, 37–39

I
Infection, 4, 6, 45, 56, 73, 74, 76, 77, 82, 110, 136
Infectious, 2, 6, 7, 42, 57, 58, 77, 94, 108–110, 135
Influenza, 42, 44
Information coding, 46, 53, 57, 58, 69, 95, 97, 102, 108, 110–115
Information storage, 22, 23, 28, 35, 46, 88, 96, 99
Information theory, 20, 48, 88, 96–111, 113
Iron-sulfur world, 37, 53, 54

K
Knowledge, 2, 21, 72, 87–93, 101, 102, 115, 126, 127

L
Lives, 2, 5, 6, 12, 14, 20–22, 24, 27–40, 43, 46, 49–51, 54, 55, 57, 62, 69, 87, 95–98, 101, 102, 105, 106, 111, 115, 117, 127, 129, 134, 138, 139
Louis Pasteur, 12

M
Machine, 60, 64, 66, 68, 70, 71, 82, 87, 112–114, 125, 129
Macromolecules, 6, 21, 35, 36, 39, 115, 138
Membranes, 6, 30, 31, 43, 45, 98, 129
Messenger RNA (mRNA), 18, 19, 26, 99
Microorganisms, 4–7, 10–13, 29, 42, 43
Molecular, 5, 14, 15, 17, 20, 23, 27, 45, 67, 96–98, 115, 120, 133, 136–138
Molecular genetics, 14–28, 57, 80, 93, 102, 111
Multiplication, 12, 25, 76
Mutation, 4, 6, 24, 33, 34, 40, 55, 56, 67, 68, 71, 73, 77, 80, 82–84, 105, 106, 130, 134, 135

O
Origin of life, 12, 28, 34, 36, 37, 40, 49, 53, 93, 96, 115

P
Phenotype(s), 22, 101–104, 107
Prebiotic, 6, 16, 37, 38, 40
Primordial soup, 34–36, 53, 87
Prions, 56–58, 94, 108–111
Program code, 6, 69, 76, 78, 114
Programming language, 61, 114
Protein biosynthesis, 21, 26, 101
Proteins, 3, 15–23, 25, 26, 31, 35, 37–39, 43–47, 53–55, 57, 58, 94, 95, 99–101, 104, 107–110, 129
Protozoa, 10, 12, 27

R
Regulations, 29, 32, 83
Replicating, 14, 48, 51, 54, 55, 134
Replication, 4, 5, 7, 15, 16, 19–20, 24, 25, 27, 33, 34, 36, 37, 39, 45–47, 49, 50, 52, 53, 56, 57, 72, 73, 99, 108, 110, 116, 120, 121, 128, 129, 133–135, 137–139
Reproduction, 4, 6, 11, 12, 14, 15, 26, 29, 32–38, 40, 45, 48, 50–52, 56, 69, 72, 78, 81, 93, 98, 110, 115, 116, 121, 134, 137–139
Reproductive, 56
Ribonucleic acid (RNA), 5, 6, 16–28, 37–39, 43–46, 49–55, 57, 82, 96, 98–102, 104, 108, 110, 116, 135, 139
Ribosomes, 16, 18, 19, 95
RNA sequences, 3, 17, 19
RNA world, 37, 41–58
Robert Koch, 12

S
Secondary structure, 100, 108, 109
Selection, 24, 68, 71, 93, 97, 98, 103, 105–107, 115, 120, 130, 134, 135, 138
Self-assembly, 45, 47
Self-replicating, 6, 7, 35, 36, 40, 49–51, 53, 57, 68, 72, 77, 81, 84, 88, 97, 98, 110, 115, 123, 126–131, 134, 135, 138
    program, 75, 76
    technical computer programs, 7, 134
Self-replication, 14, 24, 40, 50, 54, 68, 73, 76, 78, 97, 98, 102, 125–128, 130
Self-replicative, 98
Sequence of DNA, 20
Sequence of RNA, 108
Sequences, 16, 19, 21, 22, 25, 26, 32, 35, 54, 63, 64, 67, 96, 99, 100, 104, 108, 113
Severe acute respiratory syndrome (SARS), 2
    SARS-CoV-1, 5
    SARS-CoV-2, 3, 4, 11, 41, 42, 47
Singularity, 39, 71, 125, 135, 136
Start and stop sequences, 100, 104
Structural, 20, 26, 27, 31, 32, 94, 95, 110, 134, 137
Structures, 10, 12, 13, 20, 22, 26, 29, 31, 32, 39, 40, 43–48, 51, 53, 57, 62, 65, 68, 72, 80, 84, 95, 98–100, 104, 109, 110, 115, 116, 120, 124, 125, 129, 137, 138

T
Tertiary structure, 57, 100, 107
Thermophilic organisms, 13
Thymine, 21, 22, 26, 47, 49
Transcriptions, 16–20, 25–27, 44, 45, 48, 67, 80, 95
Transfer RNA (tRNA), 18, 19, 54, 99
Translation, 15–20, 25–27, 30, 44, 45, 47, 48, 67, 80, 95, 96
Triplets, 16, 17, 22, 23, 25, 26, 47, 129
Turing, 65, 66
Turing machine, 66

U
UV radiation, 53

V
von Neumann, John, 69, 77, 130

W
Worms, 76, 78, 79