Using Documents
A Multidisciplinary Approach to Document Theory
Edited by Gerald Hartung, Frederik Schlupkothen and Karl-Heinrich Schmidt
ISBN 978-3-11-078077-2
e-ISBN (PDF) 978-3-11-078088-8
e-ISBN (EPUB) 978-3-11-078094-9
Library of Congress Control Number: 2022938167

Bibliographic information published by the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2022 Walter de Gruyter GmbH, Berlin/Boston
Cover image: Michael Ruml, University of Wuppertal; with snippets and quotes from: Francis James Barraud. His Master’s Voice. Painting, 1898–1899; Tim Berners-Lee. Information Management: A Proposal. CERN, 1989; Nicolas de Larmessin. Johannes Gutenberg. Engraving, 1632–1694; Paul Otlet. Traité de Documentation. Editions Mundaneum, 1934.
Typesetting: Michael Ruml and Frederik Schlupkothen, University of Wuppertal
Printing and binding: CPI books GmbH, Leck
www.degruyter.com
Acknowledgments

The editors would like to thank all the contributors to this volume and all the participants in the associated conference held at the University of Wuppertal in October 2019. The realization of this volume has come a long way as global events have unfolded—the perseverance of all those involved cannot be overestimated. We owe a general debt of gratitude to Stefan Gradmann, without whose work Roger T. Pédauque would hardly be known to German-speaking scholars. We would also like to thank Niels Windfeld Lund for giving Laura Rehberger and Frederik Schlupkothen permission to revise his translation of Pédauque’s “Document: Forme, signe et médium, les re-formulations du numérique.” The linguistic quality of the volume can be attributed to the excellent proofreading and translation work of Alastair Matthews on several contributions. Furthermore, we would like to thank the DFG-funded Research Training Group 2196 “Document—Text—Editing: Conditions and Forms of Transformation and Modeling; A Transdisciplinary Perspective” for their generous funding of both the conference and the volume at hand; without it this undertaking could not have been realized.
Wuppertal, March 2022
Gerald Hartung, Frederik Schlupkothen, and Karl-Heinrich Schmidt
https://doi.org/10.1515/9783110780888-202
Contents

Part I: Introduction
Gerald Hartung, Frederik Schlupkothen, and Karl-Heinrich Schmidt
Preface | 5
Roswitha Skare
Document Theory | 11

Part II: Using Documents
Gerald Hartung
The Document as Cultural Form | 43
Ulrich Johannes Schneider
From Manuscript to Printed Book | 63
Cornelia Bohn
Documenting as a Communicative Form | 83
Frederik Schlupkothen and Karl-Heinrich Schmidt
Legibility and Viewability | 109
John A. Bateman
A Semiotic Perspective on the Ontology of Documents and Multimodal Textuality | 147

Part III: Background
Gerald Hartung and Karl-Heinrich Schmidt
Simmel’s Excursus on Written Communication | 203
Roger T. Pédauque
Document: Form, Sign, and Medium, as Reformulated by Digitization | 225

List of Contributors | 261
Part I: Introduction
Joseph Nicéphore Niépce. View from the Window at Le Gras. Heliograph, 1827. Thought to be the first permanent photograph ever made. Source: https://commons.wikimedia.org/wiki/File:View_from_the_Window_at_Le_Gras,_Joseph_Nic%C3%A9phore_Ni%C3%A9pce.jpg (03.03.2022), public domain.
Gerald Hartung, Frederik Schlupkothen, and Karl-Heinrich Schmidt
Preface
Even though documents play a central role in the practices of human societies—especially those that rely on recordable remote communication—documents as such do not figure centrally in scholarly discourse as an overarching topic. Typically, disciplinary divisions of work as well as formulations of lines of inquiry are dominated by the study of specific content architectures such as text or video,¹ even though these architectures are all primarily brought to social life and put into use through documents. Using Documents therefore pursues an overarching interdisciplinary theoretical discussion that seeks fundamental descriptions of the rich world of documents in our daily lives. To this end, a necessarily selective overview of existing theoretical foundations is given and an equally necessarily selective, but also ambitious, look ahead is undertaken—digital documents in particular will continue to accompany human societies in manifold ways.

In the overview of theoretical foundations, an excursus by the philosopher and sociologist Georg Simmel on written communication and a text by an interdisciplinary French collective of authors, working under the pseudonym Roger T. Pédauque, on the consequences of the digital revolution for the concept of documents are taken as a starting point. Looking ahead, these texts are expanded on in contributions from an interdisciplinary conference of cultural scientists and cultural technologists who gathered at the University of Wuppertal in October 2019 to discuss the topic of document use.

Georg Simmel’s “Written Communication” excursus, which is well known in cultural studies, is a landmark text that is still worth reading after more than a century; it is presented here in a new way with regard to the social processing and distribution of information (Hartung and Schmidt 2022, 203–224 in this volume). Letters are treated by Simmel as a form of protected point-to-point communication that allows double contingency to be handled in a genre-specific way. With its genre orientation, Simmel’s excursus points, surprisingly early, to an approach that is in general also empirically viable for today’s cultural studies as well as for cultural-technical applications, since letters in particular (besides CVs, invoices, etc.) are extensively standardized as part of common document processing; this standardization can in turn be used as the basis for further applications.

1 Also in the media-technological sense of MIME types.

Gerald Hartung, Frederik Schlupkothen, Karl-Heinrich Schmidt, University of Wuppertal
https://doi.org/10.1515/9783110780888-001
Beyond genre issues, a crucial text by the French collective of authors called Roger T. Pédauque is presented in a new English translation (Pédauque 2022, 225–259 in this volume). This pseudonym represents a team of philosophers, sociologists, economists, and computer scientists, among others, who are particularly concerned with the effects of digitization on document concepts. The text identifies three general perspectives on documents:
– The form perspective (vu) with the guiding question: how is a document perceived by (human) senses?
– The content perspective (lu) with the guiding question: how is a document comprehended by a cognitive agent?
– The medium perspective (su) with the guiding question: how is a document used by a communicating collective?

From a (cultural-)technical point of view, the form perspective addresses, in particular, logical document descriptions that will potentially be output in various different layouts. Important foundations are currently provided by standardized languages such as HTML with its associated languages. From the point of view of cultural studies, this always involves perception. The content perspective addresses the exploitation of a document: in cultural-technical terms, this can be supported by standardized content description languages such as those of the RDF family; in terms of cultural studies, the focus here is on the “traditional” interpretation of a document’s content. Finally, the medium perspective addresses the typically context-dependent use of a document. Work on this third perspective has not yet been consolidated in such a way that it can be considered adequately understood alongside the modeling of form and content. Achieving such an understanding is the main focus of this volume.
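For readers approaching these perspectives from the technical side, a small illustration may help. The two snippets below describe one and the same hypothetical document: first as an HTML fragment that fixes a logical form open to various layouts (the vu perspective), then as an RDF description in Turtle syntax that records statements about its content (the lu perspective). The element structure, the URI, and the Dublin Core properties are assumptions chosen for illustration; they are not taken from the contributions discussed here.

```html
<!-- Form perspective (vu): a logical document structure, open to many layouts. -->
<article id="preface">
  <h1>Preface</h1>
  <p>Even though documents play a central role in the practices of human societies …</p>
</article>
```

```turtle
# Content perspective (lu): statements about the same hypothetical document,
# expressed with Dublin Core terms from the RDF family of description languages.
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/preface>
    dcterms:title    "Preface" ;
    dcterms:creator  "Gerald Hartung", "Frederik Schlupkothen", "Karl-Heinrich Schmidt" ;
    dcterms:language "en" .
```

The medium perspective (su), by contrast, is the one for which, as noted above, no comparably consolidated modeling exists yet.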
These classic works are complemented by the introductory contribution of Roswitha Skare. She gives a general overview of the historical development of theoretical reflection on documents and the formulation of document theories (Skare 2022, 11–38 in this volume). The chapter takes as its starting point the Latin documentum and the use of the concept of documents in European state bureaucracy from the seventeenth century onward. The first interest in document theory was a professional one and can be observed at the beginning of the twentieth century. While the notions of document and documentation were well established around 1930, they were replaced by the notion of information after World War II, at least in the Anglophone community. Nevertheless, at the same time as a new kind of document theory was emerging, a general scientific theory that is first and foremost about what documents are and do was developed by critical social scientists. Since the 1990s, there has again been a growing interest in the notion of documents and documentation inside library and information science, together with a growing interest in digital documents.

The three contributions introduced so far describe both the theoretical-historical and the structural framework of this volume. Empirically concrete and general document-theoretical analyses that look ahead are embedded in this framework.

A first approach to documents beyond architecture- and genre-specific questions is provided by the discussion of documents as a cultural form. Gerald Hartung chooses this approach in his contribution (Hartung 2022, 43–61 in this volume), in which he explores the potential of a cultural studies perspective with the triumvirate of Dilthey, Cassirer, and Simmel. Essentially, this makes it possible to formulate questions that result in a distinction of documents from other forms of culture.

The next two contributions examine documents as cultural forms in terms of the cultural-historical evolution from the “print” to the “screen” technological media type.² Ulrich Johannes Schneider starts with the genesis of the document platform of books, which defined the modern era (Schneider 2022, 63–82 in this volume). The ability to be read in a rigidly sequential manner is a first analytical marker of the connection between text and whoever witnesses it. This connection is investigated in Schneider’s contribution and its discussion of the printed book in its infancy. The extract from the Gutenberg Bible in Schneider (2022, 66, figure 1 in this volume), “written in the most pure and most correct way” (Pope Pius II),³ thus illustrates the starting point for all later evolution of “book reading” as a type of use. Through empirical analysis and observation, Schneider provides tools for its analysis, particularly where structure and navigation are concerned. Early printed books give us an insight into additional contextual conditions on the page itself. The sequential flow of text acquires rhythm through readers who turn to their pens or printers who invent paragraphs or blank spaces. Thus, reading a text in a book is like watching a text perform its own representation.

2 This distinction can be found in CSS; cf. https://drafts.csswg.org/mediaqueries-4/#media-types (09.01.2022).
3 “Non vidi Biblias integras sed quinterniones aliquot diversorum librorum mundissime ac correctissime littere, nulla in parte mendaces, quos tua dignatio sine labore et absque berillo legeret” (Davies 1996).
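Footnote 2 refers to the “print” and “screen” media types defined for CSS. A minimal, purely illustrative stylesheet shows how that distinction works in practice: one and the same marked-up document receives a different presentation depending on the output medium, which is also the sense in which a logical document description can be output in varying layouts. The selectors and declarations below are assumptions made for the sake of the example, not material from the chapters.

```css
/* Two output media for the same HTML document (cf. CSS Media Queries, footnote 2). */
@media screen {
  body { font-family: sans-serif; }
  nav  { display: block; }                     /* interactive navigation stays visible */
}
@media print {
  body { font-family: serif; }
  nav  { display: none; }                      /* navigation is dropped on paper */
  a::after { content: " (" attr(href) ")"; }   /* link targets are written out */
}
```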
If Simmel and Schneider demonstrate several aspects of document-based remote communication on paper, by referring to personal letters and books, communication at the beginning of the twenty-first century has become empirically much harder to keep track of—not least in sociology. This is addressed by Cornelia Bohn’s chapter, a central concern of which is to set out the concept of a Darstellung media type, in which the “documental” is a central communicative form (Bohn 2022, 83–108 in this volume). The chapter thereby develops sociologically a perspective that has, in a documentation-theoretical context, been linked to documentation with reference to forms of social life and social subsystems (Lund 2004). Bohn proceeds in four steps. The first part of her chapter describes a “digital complex,” which, however, has not caused a complete social transformation but is embedding itself in the complex structure of society. The extensive existing analyses of this process make it possible to use “form” and “medium” in the Luhmannian sense in the subsequent argumentation. To this end, the second part introduces semantics of documenting that have emerged in organizations and institutional administrations, the modern sciences, publishing and the dissemination of knowledge, the identification of individuals, or modern art. The third step then sets out the theoretical frame, which conceptualizes documenting as a communicative form in a (systems-theoretically understood) Darstellung media type. The fourth part discusses Niklas Luhmann’s card index as an example.

The incorporation of digital documents and their processing into society is evident in a social world pervaded by screens. For visual outputs, Frederik Schlupkothen and Karl-Heinrich Schmidt introduce legibility and viewability as central categories of form that describe the informational use of visual content architectures (Schlupkothen and Schmidt 2022, 109–145 in this volume). The digital use of these content architectures inherits the textual and cinematic handling of the underlying sequences of signs that was historically developed in the dispositifs of typographeum (the world of print) and cinema. The “screen” media type then exploits (primarily through markup languages) the structurability of these sequences as content fragments and their transformation into varying outputs. In this way, a hitherto largely unexplored intersection between media science and media technology is examined.

John A. Bateman concludes the volume on document-theoretical grounds, arguing that a recurring problem with most existing approaches to notions of documents and texts can be located in a certain weakness in the semiotic foundations they exhibit (Bateman 2022, 147–197 in this volume). After reviewing ongoing debates in text encoding and annotation, approaches to materiality, and accounts of information artifacts from the field of ontological engineering, the contribution sets out how new approaches to multimodal semiotics can be employed to construct a framework capable of distinguishing and interrelating the core concepts more effectively. The framework is then briefly applied to the opening problem cases to generate a more finely differentiating view of the distinct kinds of annotation needed for effective text encoding. The result is that Bateman’s material/form discussion and use of discourse semantics—involving the vu and lu dimensions respectively—provide a wide-ranging semiotic foundation for analyzing the use of
documents. In this context, documents fulfill the function of a “mediating environment”: “Thus, to be a document is to play a particular role, in the foundational ontological sense, in social activities involving mediations of information and meaning” (Bateman 2022, 189 in this volume). With the fulfillment of such a role, the su dimension of documents becomes evident here as well.
Bibliography

Bateman, John A. A semiotic perspective on the ontology of documents and multimodal textuality. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 147–197. Berlin: De Gruyter, 2022.
Bohn, Cornelia. Documenting as a communicative form. What makes a document a document—a sociological perspective. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 83–108. Berlin: De Gruyter, 2022.
Davies, Martin. Juan de Carvajal and early printing: The 42-line bible and the Sweynheym and Pannartz Aquinas. In: The Library, volume s6-XVIII, issue 3, pp. 193–215. September 1996. DOI: https://doi.org/10.1093/library/s6-XVIII.3.193.
Hartung, Gerald. The document as cultural form. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 43–61. Berlin: De Gruyter, 2022.
Hartung, Gerald and Karl-Heinrich Schmidt. Simmel’s excursus on written communication. A commentary in eleven steps. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 203–224. Berlin: De Gruyter, 2022.
Lund, Niels W. Documentation in a complementary perspective. In: Rayward, W. Boyd, editor, Aware and Responsible: Papers of the Nordic-International Colloquium on Social and Cultural Awareness and Responsibility in Library, Information and Documentation Studies (SCARLID), pp. 93–102. Lanham, MD: Scarecrow Press, 2004.
Pédauque, Roger T. Document: Form, sign, and medium, as reformulated by digitization. A completely reviewed and revised translation by Laura Rehberger and Frederik Schlupkothen. Laura Rehberger and Frederik Schlupkothen, translators. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 225–259. Berlin: De Gruyter, 2022.
Schlupkothen, Frederik and Karl-Heinrich Schmidt. Legibility and viewability. On the use of strict incrementality in documents. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 109–145. Berlin: De Gruyter, 2022.
Schneider, Ulrich Johannes. From manuscript to printed book. Page design and reading. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 63–82. Berlin: De Gruyter, 2022.
Skare, Roswitha. Document theory. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 11–38. Berlin: De Gruyter, 2022.
Roswitha Skare
Document Theory

Interest in the document approach varied during the twentieth century.¹ Since the late 1980s and early 1990s, however, increased attention has been given to the concept of documents and documentation in disciplines like Library and Information Science (LIS), but also in society in general. Technological developments, especially the creation of the World Wide Web by Tim Berners-Lee in 1989 and the subsequent increased production and use of digital documents, were important in this process.
1 A Norwegian Perspective

In Norway a new act of legal deposit was created in 1990 in response to these technological developments. The act stated that “[a]ll material published in Norway must be legally deposited with the National Library of Norway. This applies regardless of the format of publication, as the law is media-neutral.”² As such, the law required that not only paper-based or printed documents be collected, but also photographs, films, broadcast material, and digital publications—both offline and online. This act, together with the newly established National Library, increased the need for an educational program to deal with these new challenges. At about the same time, a political process establishing a librarianship program outside the only existing one in Oslo, the capital of Norway, was set in motion. The general lack of librarians in northern Norway was an important reason for the decision to establish an educational program for librarians at the University of Tromsø in 1995. The program was named Documentation Studies and started in January the following year.

1 This paper is a revised and extended version of a talk given at the conference “Using Documents: Looking Back and Forward” at the Bergische Universität Wuppertal in October 2019. The first part of the paper is derived from an article with a similar title in the Encyclopedia of Library and Information Science, Third Edition (Lund and Skare 2010).
2 See https://www.nb.no/en/legal-deposit/ (22.12.2021) for details. It is worthwhile noting that digital material (Web pages and digital radio) first appears in the statistics in 2005, e-books first in 2009.

Roswitha Skare, UiT The Arctic University of Norway
https://doi.org/10.1515/9783110780888-002
The choice of the name and content of the program was the result of technological and political developments during the 1980s and 1990s. Whereas comparable degree programs at that time were called Library and Information Science, Niels W. Lund, the first full professor in Documentation Studies, decided to broaden the perspective by using the terms “document” and “documentation.” As pointed out by Lund (2007, 12), “[t]he choice of the name Documentation studies was not based on a paradigmatic critique of Library and Information Science, but on a much more pragmatic and general political interest […].” Lund’s idea of a broad practical and theoretical education for librarians was also in line with international developments during the late 1990s in the work of researchers like Michael Buckland, W. Boyd Rayward, and Ronald E. Day, who helped rejuvenate the concept of documents and documentation by publishing articles such as What Is a “document”? (Buckland 1997) and translating French texts, introducing names like Paul Otlet and Suzanne Briet to an Anglo-American audience. The terms “neo-documentation” or “neo-documentary turn” have been coined to describe this return to the roots of the European documentation movement.³

3 See Hartel (2019), where she identifies a series of turns that have occurred within the field of Library and Information Science. The neo-documentary turn in the 1990s is considered to be a reaction to the cognitive turn’s focus on the mental dimension during the 1980s.
2 Historical Perspectives

2.1 The Concept of a Document

Going back to the term’s Latin origin, Lund (2010, 740) argues that documentum “can be separated into the verb doceo and the suffix mentum.” Based on the etymological roots and the conceptual history of the word “document,” Lund (2010, 743) defines it as “any results of human effort to tell, instruct, demonstrate, teach or produce a play, in short to document, by using some means in some ways.” Based on this definition, a document is not only something that can be held in one’s hand or a piece of written evidence. We can therefore say that the oral document tradition was the primary one until the seventeenth century, with documents oriented toward educational purposes like lectures as important examples. Oral lectures could be documents, and may indeed have been the prototypical document. This oral document tradition oriented toward educational purposes is almost forgotten today, and many would think of the original conception of the document as being a legal one dating back to antiquity. However, the legal conception is more related
to the emergence of European state bureaucracy from the seventeenth century onward. In France it is first found in 1690, in the combination “titres et documents” (Rey et al. 1992–1998, 620). It is defined as “écrit servant de preuve ou de renseignement” (“a written text serving as proof or information”; Rey et al. 1992–1998, 620) or “something written, inscribed, etc. which furnishes evidence or information upon any subject, as a manuscript, title-deed, tombstone, coin, picture, etc.” (Simpson and Weiner 1989, 916).

From the beginning of European modernity and the Enlightenment, a document is first and foremost a written object stating and substantiating transactions, agreements, and decisions made by citizens. It was an essential part of the creation of a public bureaucracy across and independent of local customs based on droit coutumier (customary law, in contrast to droit écrit, written law), laws and rules varying from place to place, being oral or gestural in nature, like handshaking agreements on a marketplace. Second, documents also became a matter of proof, and the question of whether they are authentic or not a crucial one. Third, documents still deliver information, they are pieces of writing that tell you something. These three characteristics can be merged into one central phenomenon of modern society: written, true knowledge.

During the eighteenth century, an essential part of the development of modern bourgeois society, and especially its public sphere, was that the legitimacy of politics, economy, the court, and science became increasingly dependent on one’s ability to document one’s rights and claims. Following that legal tradition, in the late eighteenth century, empirical evidence also became increasingly important for scientists, who had to demonstrate true positive knowledge through controlled experiments or by collecting documents to demonstrate empirical proof as the basis for their arguments. During the nineteenth century, the noun “documentation” became a key word in both administration and science. This made the perfect setting for the first explicit document theory to be articulated as part of what has been called the first documentation movement, led first and foremost by the Belgian lawyer Paul Otlet (1868–1944).
2.2 Document Theory—a Professional Interest

Technical inventions like microfilm, moving images, and sound recording at the end of the nineteenth century contributed to an increased number of documents becoming available for scholars. Scientific associations and international journals were founded, and the need for tools to locate publications and to use collections of data increased. The International Institute of Bibliography was founded by Henri La Fontaine and his younger colleague Paul Otlet in 1895. Its catalogue was organized following an elaborate classification scheme, the Universal Decimal Classification (UDC), for a very practical reason: to provide useful tools for scholars. For Otlet,
the main purpose was “the organisation of documentation on an increasingly comprehensive basis in an increasingly practical way in order to achieve for the intellectual worker the ideal of a ‘machine for exploring time and space’” (Rayward 1990, 86). But in order to realize his ideal and to improve the practical organization of documentation, he had to define what is meant by a “document.” For that purpose, he needed a new science of bibliography:

The Science of Bibliography can be defined as that science whose object of study is all questions common to different kinds of documents: production, physical manufacture, distribution, inventory, statistics, preservation and use of bibliographic documents; that is to say, everything which deals with editing, printing, publishing, bookselling, bibliography, and library economy. The scope of this science extends to all written or illustrated documents which are similar in nature to books: printed or manuscript literary works, books, brochures, journal articles, news reports, published or manuscript archives, maps, plans, charts, schemas, ideograms, diagrams, original or reproductions of drawings, and photographs of real objects. (Rayward 1990, 86)
In 1934 Otlet published Traité de documentation (Otlet 1934),⁴ a work often considered to be the most complete analysis of documentation available. Otlet developed a very broad concept of documents, but at the same time—as also stated by the subtitle Le livre sur le livre, théorie et pratique (the book on the book, theory and practice)—he also retained a bias toward books and printed documents. Otlet always talked about “books and documents” as well as “bibliography and documentation,” thus developing a document theory for libraries, not for society in general. A similar inconsistency can be observed in Otlet’s use of the term documentation. La Fontaine and Otlet and their International Institute of Bibliography in Brussels used the word “with various new meanings”; “moreover [they] seem to have felt that it had an almost magical power as a manifesto of their ideas” (Woledge 1983, 270). They “sometimes contrast[ed] documentation with bibliography, but they also say that bibliography is only part of documentation” (Woledge 1983, 271).
4 A French project called “HyperOtlet” tries to achieve Otlet’s vision of going beyond the book by presenting his Traité in a digital, non-linear edition. The aim of the project is to build an online encyclopedia on documentation, centered around Otlet’s work, which will contain a non-linear, digital Traité at its core, as well as other digitized documents and encyclopedia articles by various contributors. See https://hyperotlet.hypotheses.org/le-projet (22.12.2021).
Other terms like “hyperdocumentation” are only used once in the entire Traité (see Le Deuff and Perret 2019, 1466), but are nevertheless important when it comes to understanding Otlet’s vision. He uses the term when describing the entire development of documentation in stages:

In the first stage, Man sees the Reality of the Universe by his own senses. […]
In the second stage, he reasons Reality and, combining his experience, generalizing it, he makes a new representation of it,
In the third stage, he introduces the Document which records what his senses have perceived and what his thought has constructed.
At the fourth stage, he creates the scientific instrument and Reality then appears to be magnified, detailed, specified […].
In the fifth stage, the Document intervenes again and it is to directly record the perception provided by the instruments. (Otlet 1934, quoted in Le Deuff and Perret 2019, 1465)
As pointed out by Le Deuff and Perret (2019), “[w]e currently live in the fifth stage,” while the sixth stage of Otlet’s vision, where “[v]isual documents and acoustic documents are completed by other documents, with touch, taste, fragrance and more,” is still to come. At the sixth stage, “also the ‘insensitive’, the imperceptible, will become sensitive and perceptible through the tangible intermediary of the instrument-document” (Le Deuff and Perret 2019). The sixth stage, for Otlet the stage of hyperdocumentation, is about “the revelation of instruments of all kinds of sense data, and even of data our senses cannot perceive” (Woledge 1983, 277). The concept of hyperdocumentation can be understood as an expression of Otlet’s belief in the technical possibilities of new media; microphotography in particular was an important part of his vision of knowledge beyond his time.

He also envisioned the Universal Book as a single large hyper-book years before Vannevar Bush dreamed of the memex (Bush 1945) and decades before Ted Nelson coined the term “hypertext” (Nelson 1965). The Universal Book was to be created using what Otlet called the monographic principle. This would involve isolating each “fact” by taking all documents apart and taking cuttings from the original documents to make a card or sheet for each “fact,” and then pasting the individual fact-cards together in a particular order, creating a coherent cosmos out of the chaos of idiosyncratic documents. The “higher and higher levels of representational abstraction” with classification and metadata would contain “the most atomic and abstract representation” (Day 2019, 43), thus the highest truth (see figure 1). While this method of “codification” could be fatal in an analog environment, destroying the original document forever, Otlet envisioned the possibilities of a digital “hyper-book” in which one can cut and paste without destroying the original document. Paul Otlet also imagined a new kind of scholar’s workstation: a machine that would let users search, read, and write their way through a vast mechanical database stored on millions of index cards.
Fig. 1: “Universe, Intelligence, Science, Book” (Otlet 1934, 41).
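Otlet’s monographic principle can be glossed in deliberately anachronistic terms: fact-cards that merely point into untouched source documents, so that a “hyper-book” can be composed and recomposed without destroying the originals. The sketch below is offered only as such an illustration; the class, its fields, and the sample entries are assumptions with no counterpart in Otlet’s own apparatus or in Skare’s argument.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FactCard:
    """One isolated fact, recorded as a pointer into an untouched source document."""
    source_id: str   # identifier of the original document (hypothetical)
    locator: str     # where the passage sits in the source, e.g. a page
    excerpt: str     # the wording of the isolated fact

def compose_hyperbook(cards: list[FactCard], order: list[int]) -> list[FactCard]:
    """Arrange fact-cards in a chosen order; the source documents themselves stay intact."""
    return [cards[i] for i in order]

# Two cards cut from hypothetical source documents (paraphrases, for illustration only):
cards = [
    FactCard("otlet-1934", "p. 41", "The Document records what the senses have perceived."),
    FactCard("briet-1951", "p. 10", "A catalogued antelope in a zoo is a document."),
]

# The same cards can be recombined into different "books" at will:
print(compose_hyperbook(cards, [1, 0]))
```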
Even if Otlet was most interested in books and printed documents, he was open to other kinds of document: any object that can be observed and studied, thus providing us with new knowledge, can be considered a document: not only images and sound recordings, but also “natural objects, artifacts, objects bearing traces of human activity (such as archeological finds), explanatory models, educational games, and works of art” (Otlet 1934, quoted in Buckland 1997, 805).

The Mundaneum, today a non-profit organization,⁵ was created by Otlet and La Fontaine to gather and host all human knowledge and to classify it according to the UDC system. Their aim was not only to facilitate worldwide sharing, thus contributing to peace among nations, but also a complete synthesis of knowledge, for “Otlet considered synthesis to be an evolutionary process of generalization and integration of knowledge, paralleling the increasing specialization and complexity of knowledge” (Van Acker 2012, 386).
5 See http://www.mundaneum.org (22.12.2021).
Fig. 2: Paul Otlet, Cellula Mundaneum, 1936 (© Collection Mundaneum, Mons).
Otlet’s aim of a complete synthesis already appears in the architectural plans for the Mundaneum. Otlet’s drawing of the Cellula Mundaneum (see figure 2) can be seen as an expression of this “notion of ‘synthesis’ that was central to the nineteenth-century theory of positivism, which incorporated to a certain extent this platonic model of encyclopedism” (Van Acker 2012, 384). In addition to positivism and Platonism, occultism can be traced as a third sphere of influence on Otlet’s thought. There were others, like Walter Schürmeyer and Donker Duyvis, considering the theoretical issues regarding documents (see Buckland 1997), but the most important person after Otlet was without doubt Suzanne Briet, the French librarian and documentalist, author of many articles and the manifesto Qu’est-ce que la documentation? (Briet 2006).
Like Otlet, Briet was also mostly concerned with practical documentation and not so much with theoretical questions. At the same time, she was very conscious about what she was trying to accomplish. In order to promote her vision of a new professional field, she also discussed the question of what constitutes a document. She starts by referring to a general definition—“A document is a proof in support of a fact”—and to the official definition of the French Union of Documentation Organizations from 1935: any base “of materially fixed knowledge, and capable of being used for consultation, study, and proof.” Influenced by semioticians like Charles S. Peirce and his theory of three basic kinds of sign (iconic, indexical, and symbolic), she proposes the following definition: “any concrete or symbolic indexical sign [indice], preserved or recorded toward the ends of representing, of reconstituting, or of proving a physical or intellectual phenomenon” (Briet 2006, 10). From this, one might argue that Briet claims that documents are primarily indexical signs, in contrast to symbolic signs:

Is a star a document? Is a pebble rolled by a torrent a document? Is a living animal a document? No. But the photographs and the catalogues of stars, the stones in a museum of mineralogy, and the animals that are catalogued and shown in a zoo, are documents. (Briet 2006, 10)
The major difference between the two kinds of object is that the star and other objects named here are concrete objects unconnected with any specific sign, while the photograph, catalogue, and so on are specifically intended to represent something: stars, a special kind of mineral, or a special animal specimen such as a new kind of antelope, which Briet uses as an example of the relationship between documents and the whole process of documentation. Such objects are initial documents, distinguished from what she calls secondary documents. She describes how new documents are created as derivatives, or secondary documents: the antelope is considered the initial document that is the basis for an entire complex of documents like catalogues, sound recordings, monographs about antelopes, encyclopedia articles about antelopes, and so on. Together these create a new kind of culture for scientists: centers of documentation run by documentalists “performing the craft of documentation,” using a new cultural technique of documentation:

The proper job of documentation agencies is to produce secondary documents, derived from those initial documents that these agencies do not ordinarily create, but which they sometimes preserve. […] We are now at the heart of the documentalist’s profession. These secondary documents are called: translations, analyses, documentary bulletins, files, catalogues, bibliographies, dossiers, photographs, microfilms, selections, documentary summaries, encyclopedias, and finding aids. (Briet 2006, 25–26)
While the ideas of Otlet and Briet were rediscovered during the 1990s, other documentalists are still forgotten. Michael Buckland has continued to disseminate
knowledge about long-forgotten persons important for document theory, like Robert Pagès, a student in Briet’s documentation program. Pagès uses the term “documentology” for the study of documents and documentation. Buckland establishes a relationship between the work of Pagès and Briet’s manifesto. He points to the fact that “sources are mostly absent” from her manifesto: “In particular, no sources are given for the examples given above (star, rock, antelope) or for the distinction between primary and secondary documents” (Buckland 2017, 1). Buckland turns to Robert Pagès’s ideas about documents published in the Review of Documentation and concludes that

Robert Pagès’ thesis of 1947, published as an article in 1948, anticipates and explains Suzanne Briet’s famous example of an antelope as a document and also the distinction between initial and secondary documents. This priority suggests that these ideas originate with him, but does not prove it since he was at the time a student in Briet’s program and he later acknowledged her influence […]. The ideas might have come to him from Briet as his teacher or from one or more other sources. Regardless of its origins, Pagès’ overlooked article is a valuable contribution to document theory. (Buckland 2017, 6)
Both Otlet and Briet played a key role in the international documentation community during the first decades of the twentieth century, leading to the foundation of organizations and journals,⁶ all carrying the term “documentation” in their titles. By that time, the older concept of bibliography had been replaced by the terms “document” and “documentation,” and the concept of documentation can be considered well established. In 1948, Samuel Bradford—president of the International Federation for Information and Documentation (FID)—could therefore speak of “fifty years of documentation.” Nevertheless, an important difference between the Anglo-American and Francophone worlds slowly began to emerge during the years after World War II. The concept of information gained importance and replaced the concept of documentation in stages (see Farkas-Conn 1990); in 1968, the American Documentation Institute decided to change its name to the American Society for Information Science.
6 One of the most important journals in the field, the Journal of Documentation (JoD, still being published today), was founded in 1945. See Woledge (1983, 270) for its first editors’ definition of the word “documentation” and the scope of the journal.
2.3 Document Theory—a General Scientific Interest

After the late 1960s, especially in the Anglophone world, influence shifted from professional document theory to information theory. At the same time, a new kind of critical document theory emerged, one that is not so much interested in making documents about something as in asking what documents are and do. The French philosopher Michel Foucault states the following about documents in the introduction to the Archeology of Knowledge:

[…] ever since a discipline such as history has existed, documents have been used, questioned, and have given rise to questions; scholars have asked not only what these documents meant, but also whether they were telling the truth […] all this critical concern, pointed to one and the same end: the reconstitution, on the basis of what the documents say, and sometimes merely hint at, […] the document was always treated as the language of a voice since reduced to silence, its fragile, but possibly decipherable trace. (Foucault 2002, 6–7)
From this case, the discipline of history, Foucault develops a general document theory, turning the focus away from the assumed content or message of the document to the very material and active role of documents as bricks or parts in the construction of a historical totality. In his later book on the emergence of the prison, Discipline and Punish: The Birth of the Prison (Foucault 1995), he demonstrates how this document theory can be used not only in historical studies but also as a critical analytical tool in relation to modern society in general. This is a fundamental criticism of the belief that a document contains a message in itself, as if a book is a document per se. It is only when a particular material thing such as a printed book becomes part of a constructed totality, such as the literary world, that it becomes a document. Foucault’s concept of the document is not very different from Briet’s concept of a documentary culture. In comparison to Otlet, however, one can argue that Foucault’s criticism of the assumption that documents have an inherent content challenges the belief that it is possible to undertake a hyper-organization of knowledge using a lot of cards with facts independently of their societal context.

In the late 1960s and during the 1970s, scholars other than Foucault also understood documents as material building blocks in the social construction of the world. American sociologists like Harold Garfinkel and Dorothy E. Smith developed theories about documentary practices and methods for studying these practices. These methodologies, ethnomethodology and documentary interpretation, were to a large degree based on the work of the German sociologist Karl Mannheim, who formulated what he called “documentary or evidential meaning” (Wolff 1993, 147). The documentary meaning is the meaning the document reveals “unintentionally” (Wolff 1993, 150), which might be its meaning in a larger social context; or in other words, documentary interpretation deals with the social role of a document, which is not
explicitly expressed in the document but nevertheless demonstrated by its place in the construction of a social world as a whole. Inspired by Garfinkel, Foucault, and others, the French sociologist Bruno Latour made similar studies of how scientific facts were constructed in the lab and demonstrated this on several occasions, notably in his work with Steve Woolgar from 1979. Latour and Woolgar carried out anthropological fieldwork in a scientific lab in California, observing how scientific facts were constructed by making documents of various kinds: articles, monographs, diagrams, photos, and so on. Since then, a large field of science studies has emerged based on the same general document theory, stressing that “[d]ocuments of various kinds come into play, but their connection with human agency varies according to the particular instance of interaction” (Lynch and Woolgar 1990, 133).

John Seely Brown and Paul Duguid’s essay from 1996, The Social Life of Documents (Brown and Duguid 1996), follows this general critical approach in asking about the role of documents in social life, for instance the way in which documents negotiate meaning, how they pass between communities, but also how they are used to control boundaries between communities. They start out by questioning the “widely held notion of the document as some sort of paper transport carrying pre-formed ‘ideas’ or ‘information’ through space and time,” which is based on the “conduit” metaphor and a view of information as being “‘in’ books, files, or databases as if it could just as easily be ‘out’ of them”:

As new technologies take us through major transformations in the way we use documents, it becomes increasingly important to look beyond the conduit image. We need to see the way documents have served not simply to write, but also to underwrite social interactions; not simply to communicate, but also to coordinate social practices. (Brown and Duguid 1996)
The research of Brown and Duguid is based on work by authors like Anselm Strauss, Benedict Anderson, Stanley Fish, and Bruno Latour. Apart from Latour, these are scholars who use the document concept very little, if at all, in their writings. Brown and Duguid translate a number of related theories of social life into a single theory of the role of documents in social life. These related theories are all characterized by an interest in how worlds, communities, and networks of humans and objects are created and constructed through shared documents. Brown and Duguid develop a document theory claiming that it is up to humans to discuss and decide whether something is a document.⁷
7 It is interesting to note that ideas developed in the article The Social Life of Documents (1996) later went into a book by the two authors entitled The Social Life of Information (2000). A new edition was published in 2017 under the same title but with new introductions. This is at least an indication of the competition between different concepts like “document” and “information.”
3 Documents in the Digital Age

3.1 A Renewed Interest in the Materiality of Documents

One of the main arguments for abandoning the documentation approach in the library field in the 1960s was the belief that the computer and thus digital documents would make it possible to go beyond physical barriers, having everything in one place and in one format. It became a matter of formulating the information needs of users, or in other words of observing how users thought, of getting a hold on their cognitive structures in order to match them with data structures in the computer. While many definitions of the document stress the physicality of the document, some might think that the document approach will become outdated now that everything is becoming digital and gathered together into one big database. The reality is that digital documents are no less physical than printed documents, but their type of physicality differs.

Following the historical approach, Michael Buckland has focused on the physical dimension in several articles, notably in Information as Thing (Buckland 1991). Buckland points to the fact that the term “information” itself is “ambiguous and used in different ways” (Buckland 1991, 351). He identifies “three principal uses” (Buckland 1991, 351) of the concept of information, which he illustrates with reference to libraries dealing with different documents in order to inform their users (“information-as-process”). This information might increase the knowledge of the user (“information-as-knowledge”), but the documents handled and operated by the library must be considered the material basis of information (“information-as-thing”). Buckland believes that “‘information-as-thing’, by whatever name, is of especial interest in relation to information systems because ultimately information systems, including ‘expert systems’ and information retrieval systems, can deal directly with information only in this sense” (Buckland 1991, 352). Buckland tells us that he is dealing with the ultimate condition for information, the thingness of, or material conditions for, information. In this context, drawing very much on the contributions of Otlet and Briet, the document is discussed as a possible concept for the “informative thing,” as being the core object for the whole field.
In the classic article “What Is a Document?,” Buckland (1997) reviews a number of definitions of a document.⁸ Most authors, except for the Indian library scientist S. R. Ranganathan, are in favor of a very broad definition of a document, such as the general international definition from 1937: “Any source of information, in material form, capable of being used for reference or study or as an authority” (Buckland 1997, 805).

One difference between the views of the documentalists discussed above and contemporary views is the emphasis that would now be placed on the social construction of meaning, on the viewer’s perception of the significance and evidential character of documents. (Buckland 1997, 807)

8 As pointed out by Furner, Buckland focuses—with the exception of Ranganathan—on pre-1966 definitions of “document” and “never intended to provide comprehensive coverage, even of the time period to which he limited himself” (Furner 2019, 4). Furner provides some post-1966 definitions and concludes “that documents are (or, at least, are typically considered to be) complex objects rather than simple ones” (Furner 2019, 7).
This shifts the focus from the materiality of the document to a focus on the social and perceptual dimensions of a document, moving back to the semiotic tradition of “object-as-sign.” After being told initially that materiality/physicality is the ultimate condition for dealing with information, it is surprising to read that one should not focus so much on the physical form but rather give priority to the social function and how it is perceived by people in different social settings. According to Buckland, this must be seen in the light of digital technology:

any distinctiveness of a document as a physical form is further diminished, and discussion of “What is a digital document?” becomes even more problematic unless we remember the path of reasoning underlying the largely forgotten discussions of Otlet’s objects and Briet’s antelope. (Buckland 1997, 808)
This apparent paradox of where to put the emphasis in document theory can be understood better when we see the work by Buckland not only as a materialistic criticism of the dominating information paradigm in Library and Information Science, but also as part of several efforts to formulate one or more socially and culturally oriented alternatives to that paradigm since the early 1990s. In addition to Buckland, this has been attempted by American scholars like Ronald E. Day and Bernd Frohmann, and by Scandinavian scholars like Birger Hjørland and Vesa Suominen. Drawing on many general theories from philosophy, especially Benjamin, Foucault, Derrida, Deleuze, and Peirce, a major theme for Ronald E. Day in several of his works has been to demonstrate the metaphysical character of the dominating
information paradigm, which perceives information as something abstract, existing in our heads or in the air, and thus ignores the objects, the “information-as-things,” as the material basis for information. In Indexing It All: The Subject in the Age of Documentation, Information, and Data (Day 2014), Day examines “persons and documents as dialectically constructed subjects and objects” (Day 2019, 3) and calls this the modern documentary tradition. In his latest monograph, Documentarity: Evidence, Ontology, and Inscription, he continues his reflections about Suzanne Briet by thinking about the particular antelope “both before and after the process of its capture and being made evidence by means of traditional techniques and institutions” (Day 2019, 3).

Starting in the early 1990s, Bernd Frohmann made a number of critical inquiries into some of the dominant paradigms in order to present the alternative paradigm of a materialist approach to the documentary systems formerly known as information systems. Like Day, Frohmann draws on several larger theories not only to present a criticism of the information paradigm, but also to present a paradigmatic alternative. In a monograph from 2004, Frohmann presents a full-scale paradigmatic alternative to the dominating library and information science paradigm, based on a complex document theory. He defines documents as “different material kinds of temporally and spatially situated bundles of inscriptions embedded in specific kinds of cultural practices” (Frohmann 2004, 137). What interests Frohmann is how documents work in different situations, how they function as stabilizing factors in social communities—such as the role of scientific journals in scientific communities. In this way, he follows the document theory developed by Foucault and other critical philosophers and social scientists mentioned above. By doing so, he wants to show “that powerful theoretical resources exist to continue and further the research inaugurated by neo-documentation” (Frohmann 2007, 37). Frohmann also revisits the question of what a document is (Frohmann 2009), and reflects on whether “we can think productively about documents and documentation without definitions” (Frohmann 2009, 291).

A similar approach has been taken by Scandinavian scholars. The Danish scholar Birger Hjørland recommends that “the object of study [be changed] from mental phenomena of ideas, facts and opinion, to social phenomena of communication, documents and memory institutions” (Hjørland 2000, 39). Hjørland claims that the most important thing is that “the intrinsic natures of these objects are relatively irrelevant” (Hjørland 2000, 39). They only become documents once they are assigned an informative value by a collective or domain, as Hjørland has called the communities involved in deciding whether a thing becomes a document or not. Here we are told that documents are used as a stabilizing means in our society, being “relatively stable forms of practice” (Fjordback Søndergaard et al. 2003, 310),
creating a sociocultural document theory along the lines of the work of Ronald E. Day, Vesa Suominen, and notably Bernd Frohmann. As mentioned above, the advent of digital technologies, and thus of digital documents consisting of bits and bytes, led to a belief that documents’ materiality would become less important. Today, we find that the rise of the new media has actually contributed to a material turn (Roberts 2017) in many disciplines; the “materializations of the text” (Brooks 2003, 679) have become more important, and scholars have started to call for media-specific analyses (Hayles 2004), “[r]ather than stretch the fiction of dematerialization thinner and thinner” (Hayles 2003, 275). Research on reading practices and the impact of the physical form on “reading as an embodied and multi-sensory experience” (Hayler 2016, 16) is part of this renewed interest in materiality.
3.2 Theories of Digital Documents

As pointed out by Allen-Robertson (2017, 1733), digital documents “arise and persist as signals confined within software and hardware assemblages.” Only the “increasingly user-friendly software that express and mimic the typographic conventions of print culture” (Allen-Robertson 2017, 1733) establishes a familiarity between the user and the document, while the digital technologies remain behind enigmatic black boxes. Even in 1994, David Levy challenged the assertion “that we are moving from the fixed world of paper documents to the fluid world of digital documents” (Levy 1994, 24). He argues “that all documents, regardless of medium, are fixed and fluid” (Levy 1994, 24), and that the notion of genre is important because “each genre has a characteristic rhythm of fixity and fluidity” (Levy 1994, 25). The intended use is connected to a document’s lifetime: some documents, like post-it notes, for instance, are only meant to exist for a short time, but can become more stable if the document becomes important for a receiver who decides to take care of it, thus extending its lifetime and fixity.

How to guarantee the fixity and permanence of a document has been an important issue since the rise of digital media. One may ask: what is meant by being the same, and how does one identify the likeness of two documents? This is a theme that has attracted a number of scholars and leads to a number of questions concerning the nature of documents. If one identifies two documents as being alike, one must be able to define the criteria for them being the same.
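To make the question of “being the same” concrete, the following sketch compares two digital files under two different identity criteria: byte-for-byte identity of the stored carrier (as used for preservation checksums) and identity of a normalized textual expression. The functions and the normalization steps are illustrative assumptions, not a proposal taken from the literature discussed here; which criterion should count as “the same document” is precisely the theoretical question at issue.

```python
import hashlib
import unicodedata

def carrier_identity(a: bytes, b: bytes) -> bool:
    """Identity of the stored carrier: every byte must match (compared via checksums)."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()

def expression_identity(a: bytes, b: bytes) -> bool:
    """Identity of the abstract expression: compare the text after normalizing
    differences that do not change the wording (a deliberately crude criterion)."""
    def normalize(raw: bytes) -> str:
        text = unicodedata.normalize("NFC", raw.decode("utf-8"))  # unify Unicode composition
        return " ".join(text.split())                             # collapse whitespace and line breaks
    return normalize(a) == normalize(b)

doc1 = "Qu'est-ce que la documentation?\n".encode("utf-8")
doc2 = "Qu'est-ce   que  la documentation?".encode("utf-8")

print(carrier_identity(doc1, doc2))     # False: two different carriers
print(expression_identity(doc1, doc2))  # True: the same wording under this criterion
```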
In the article Towards Identity Conditions for Digital Documents, Renear and Dubin (2003) deal with exactly this problem, wondering why so little work has been done on it in document theory, since it is crucial for dealing with digital documents.

As a result, not only is this critical concept under-theorized, but progress on a number of important problems—including preservation, conversion, integrity assurance, retrieval, federation, metadata, identifiers—has been hindered. The development of identity conditions for a particular kind of entity is not something separate from, let alone subsequent to, defining that entity, so we cannot begin our development of identity conditions with an explicit definition of what we mean by ‘document’. […] By document […] we refer to the abstract symbolic expression which may be physically instantiated repeatedly and in various media. […] Although now fairly common, this sense of ‘document’ does compete with another well established and closely related use of the term (particularly common in the library, archival, and legal community) to refer to the physical carrier with its instantiated inscription. (Renear and Dubin 2003, 1)
They touch on one of the core issues in document theory—the competing use of the concept of a document to refer either to physical instantiation or abstract expression—and, one could add, on the understanding of the document as a sociocultural construct. This theoretical challenge has been dealt with in one of the largest theoretical projects on document theory, the French multidisciplinary network called “Document and Content: Creating, Indexing, and Browsing,” which included about one hundred researchers from different disciplines and was coordinated by Jean-Michel Salaün. The concept of document is not central to some of the disciplines covered by the network and the researchers only have a partial understanding of what this concept covers. The purpose of the network is therefore to shift the focus in order to make the document an essential subject of research, at least for a time, by pooling the partial contributions of the different researchers. (Pédauque 2003, 2; Pédauque 2022, 227)
It was decided to approach the concept of document from three angles:
1. The document as form (vu) angle discusses approaches "that analyze the document as a material or immaterial object" (Pédauque 2003, 3; Pédauque 2022, 228), while setting its content aside. Researchers investigate the structure of documents, which "can be modeled and which, in a way, independently of the medium, represents the 'reading contract' concluded between the document producer and its potential readers" (Pédauque 2003, 7; Pédauque 2022, 233).
2. The document as sign (lu) angle is about research that primarily perceives the document as meaningful and intentional. Although the form is sometimes considered,⁹ the main interest is in the document's content and its interpretation: "What is important is the content, materialized by the inscription, which conveys the meaning" (Pédauque 2003, 12; Pédauque 2022, 240). The concept of the document is often only considered secondary in research from this angle: "only the text, the content, really matters" (Pédauque 2003, 15; Pédauque 2022, 245).
3. The document as medium (su) angle "raises the question of the document's status in social relations" (Pédauque 2003, 3; Pédauque 2022, 228) and includes all "the approaches that analyze documents as a social phenomenon, a tangible element of communication between human beings" (Pédauque 2003, 17; Pédauque 2022, 247).
Each of these angles focuses on one aspect of documents: the materiality or physical perception of a document (seeing); the interpretation of a document's content, that is, the mental aspects that demand intellectual effort (reading); and finally, its social aspects and what position the document has in society (understanding); in French: vu, lu, su.
9 Pédauque's research is about electronic documents. Even if the arrival of electronic/digital documents has increased scholars' awareness of the document's materiality (see section 3.3), the content is nevertheless still often considered superior by many researchers, at least in the humanities.
3.3 Document Theory—a Complementary Approach
Different schools and disciplines have focused on different aspects of documents and used different approaches to document analysis over the course of time. The claim "that any and every document has a physical angle and a mental angle and a social angle and the related claim that in considering documents none of these three angles can be completely understood without acknowledging the other two" (Buckland 2016, 5) is nevertheless supported by many scholars. The question then is how to deal with all three aspects or angles of documentation analysis. In an article entitled Documentation in a Complementary Perspective, Lund (2004) introduced the notion of complementarity as a central concept in documentation analysis to solve this problem. Inspired by Niels Bohr and his quantum theory, Lund applies the concept of complementarity to the concepts of documentation (vu), information (lu), and communication (su). Lund demonstrates that although all three focus on different aspects, none of them is more important than the others; they complement each other instead. Using the example of a book, Lund makes one thing clear: while social aspects stand in the foreground of the communicative domain, information focuses on content, and with it mental structures. In contrast, Documentation Studies, as well as Library and Information Science in the conventional sense, focuses largely on
the materiality of a book. Lund argues that we can talk about "three complementary, but exclusive features of the description of the book. One is not making a synthesis, but three complementary closures around the book, making a joint completion of the description" (Lund 2004, 96). While Lund initially links Bohr's principle of complementarity with the three concepts of documentation, information, and communication, he goes on to distinguish three complementary ways of looking at a document such as a book: (1) the document as a 100 percent physical phenomenon, (2) as a 100 percent social phenomenon, and (3) as a 100 percent mental phenomenon. Even if Lund admits that it is impossible to find "the ultimate perfect concept capturing the very essence of everything, since all concepts will be biased in some way and in principle only be partial in relation to an assumed totality" (Lund 2010, 744), he also repeats his belief in the complementary processes in later writings: For me, there's value in keeping them [all three complementary processes] separate. Only by analyzing the processes individually can you do them justice and really expose the tensions between them. To study them well, they require different approaches. […] I insist on asynthesis. I believe that trying to synthesize, you lose the details that matter. (Lund et al. 2016, 7)
While it is easy enough to see the complementary principle when dealing with binary phenomena like ways of viewing light (particles and waves), it becomes much more complicated in the humanities, where we are used to hermeneutic methodologies. In Skare (2009), I asked whether it is correct to transfer Bohr's principle to the concepts of documentation, information, and communication, since the boundaries between these fields ensure that the various ways of observation complement one another but also exclude one another. If one were to focus, for example, on social or material aspects in the field of information, then one would be in danger of leaving one's own field. At first glance, the example of books (material, social, and mental aspects) also appears to make sense; one can often observe that established fields of inquiry, such as literary studies, for example, will neglect one or more aspects in favor of another, or even leave them aside entirely. Various traditions within a field, or even schools within the same field, often come to completely different conclusions while at the same time excluding other results and approaches. Dichotomies, such as qualitative and quantitative methods, induction and deduction, or subject and object are examples of such mutually exclusive points of departure. With reference to quantum theory, Bohr stated that observations can never be made simultaneously. For example, one cannot see an electron as a particle and a wave at the same time. Two different experimental situations are necessary, but they cannot be conducted at the same time, only one by one. In referring to Bohr, Lund also accepts Bohr's criterion of exclusivity
and thereby overlooks the fact that Bohr limits the validity of the principle of complementarity to quantum theory. I therefore wondered whether it would not be more accurate to view complementarity as a relationship between parts that form a whole without thereby excluding one another, parts that can never be viewed completely separately from one another. That would also mean that the various approaches would not necessarily exclude one another and that they could be investigated either in parallel to one another or nearly simultaneously, even though synchronous observation is not possible. To clarify this, I used a book as an example and Gérard Genette's concept of the paratext as a source of terminology with which to study material elements and the relationship between text and paratext, and between the material, mental, and social aspects of a document (Skare 2008, 2009).¹⁰ Despite the material turn (Roberts 2017) in the humanities and an increased awareness of the importance of media-specific analyses (Hayles 2004), material aspects have often been considered less important than the content or the meaning of a text. While the content of a text is considered to be the product of creativity and artistry, material aspects are often regarded as craftsmanship or, as Lund puts it, "something inferior, […] a necessary evil for symbolic production" (Lund 2010, 736). This might also be one reason why many scholars focus on examples with eye-catching material aspects in their discussion of why materiality matters.¹¹ While the standardizations within the publishing industry during the second half of the twentieth century might be another reason why many material aspects of the book have been ignored, Genette demonstrates the importance of "elementary" elements like binding, cover design, or choice of paper by using examples from the history of the book. Although Genette's concept has several shortcomings (see Skare 2021)—one might even criticize his choice of the term
10 In his study Paratexts: Thresholds of Interpretation, the French literary scholar Genette introduces the concept of the paratext to his readers. For Genette “the paratext is what enables a text to become a book and to be offered to its readers and, more generally, to the public” (Genette 1997, 1). In doing so, the importance of paratextual elements in transforming the text into a book is highlighted. Most of the paratextual elements explored by Genette are textual elements, but he also mentions non-textual manifestations: iconic (such as illustrations), material (e.g., typography, format, binding, paper quality), and factual (the author’s gender and age, her reputation, awards, and so on) ones. By drawing our attention to these non-textual elements, Genette also includes material, social, and economic aspects in his analyses. 11 In Writing Machines (Hayles 2002), Hayles explores—according to the publisher—“works that focus on the very inscription technologies that produce them, examining three writing machines in depth” (https://mitpress.mit.edu/books/writing-machines, 22.12.2021). The works chosen are both printed (Mark Z. Danielewski’s House of Leaves and Tom Phillips’s artist’s book A Humument) and digital (Talan Memmott’s Lexia to Perplexia).
"paratext"—the concept nevertheless provides us with an awareness of elements of a document that are "very much a contributing, and at times constitutive, part" (Gray 2015, 231) of it. The concept thereby also helps us to recognize the importance of different versions instantiated in the same or in different media, accessed by the user on different platforms and devices. By using Genette's concept, we begin with material aspects but are also able to investigate the connections between material and social aspects and the importance of them for our interpretation. It remains a challenge, however, to treat all forms of documentation on equal terms. For a literary scholar, the material aspects of a printed book are familiar or recognizable, while other media such as films, computer games, or digital documents are more difficult to understand for a person without any technical background. Also, most of these documents are very complex ones, and the need to choose some elements to focus on is obvious unless we are part of a larger interdisciplinary team. Nonetheless, this should not prevent us from putting the accuracy of the principle of complementarity to the test in numerous and varied document analyses. The word "complementarity" means both mutual exclusivity and completeness of description (see Plotnitsky 1994, 5); for a scholarly analysis of a document, in my opinion, only the second sense is adequate. As pointed out by Hansson, there "have been few major critiques and attempts to provide alternative analyses" (Hansson 2017, 11) of Lund's understanding of the concept of complementarity. Hansson himself discusses Tai Chi in terms of complementarity and distinguishes two types: "one intrinsic, defined by elements within the initial document itself, and one extrinsic, defined by, and depending on, elements from outside of the initial document" (Hansson 2017, 14). Hansson suggests moving toward embodied documentation so as to extend his argument to "other forms of symbolic corporeal action, such as dance or pantomime" (Hansson 2017, 17), and refers to the work of Olsson and Lloyd (2016). Hansson's discussion of Tai Chi can be considered as part of the embodied turn in Library and Information Science (see Hartel 2019), as can the work by Gorichanaz and Latham (2016) on the necessity of an agency-based phenomenological and holistic analysis of documents: in order to understand documents, we must examine them from diverse perspectives. This has long been implicitly recognized; for instance, Lund's (2004) framework of the document and the ensuing discourse has shown that considering a document from multiple perspectives (physical, mental and social) can lead to better understanding of that document. This points to the need for renewed consideration of these aspects (and possibly others) as a way to further documental understanding. (Gorichanaz and Latham 2016, 1117)
Gorichanaz and Latham seek to articulate a phenomenological framework for document analysis (see figure 3 below) in which the key concepts include four different forms of information: intrinsic and abtrinsic information (Lund's material aspects), adtrinsic information (Lund's mental aspects), and extrinsic information (Lund's social aspects).
Fig. 3: Phenomenological structure of documental becoming (Gorichanaz and Latham 2016, 1118). The figure relates Object, Document, Meaning, and Person through four kinds of information: intrinsic information (physical properties, e.g., color), abtrinsic information (physiological properties, e.g., emotional state), extrinsic information (attributed properties, e.g., provenance), and adtrinsic information (associative properties, e.g., memories).
In a conversation on document conceptualization with Niels W. Lund, Gorichanaz and Latham elaborate further on this: we believe that analysis and synthesis should both be used in tandem. The four informations are analytical tools meant to 'take apart' a document experience. Once the whole is looked at in each of these four ways, we bring them back together to address the document experience. (Lund et al. 2016, 7–8)
Gorichanaz and Latham suggest their framework "as a device to help document scholars and practitioners communicate about documents" (Gorichanaz and Latham 2016, 1129), and they invite other scholars and practitioners to use and criticize the model (Gorichanaz and Latham 2016, 1130). Only a wide range of different analyses of both "traditional" and digital documents will prove the usefulness of this concept (like any other). Figure 4 below summarizes how different researchers use different terms and approaches to talk about the complementarity of documents.
• physical angle (Buckland) • material aspects/documentation (Lund) • vu/seeing (Pédauque) • intrinsic and abtrinsic information (Gorichanaz and Latham)
• mental angle (Buckland) • mental aspects/information (Lund) • lu/reading (Pédauque) • adtrinsic information (Gorichanaz and Latham)
• social angle (Buckland) • social aspects/communication (Lund) • su/understanding (Pédauque) • extrinsic information (Gorichanaz and Latham)
Fig. 4: Different terms used to describe the complementarity of documents.
3.4 A New Functional Approach to Documents
The Document Academy (DOCAM) was established in 2003 as a consequence of the renewed interest in the concept of documents and documentation since the 1990s, and has since become a meeting place for researchers interested in issues relating to and applications of the concept, with annual conferences and open-access proceedings.¹² The Document Academy explores issues relating to and applications of documentation and documents across academia, the arts, business, and society at large. Discussion focuses on the diversity of documents and how to study the wide range of problems, related to processes of documentation and the resulting documents, that are posed by concepts of authorship, identity, and intellectual property, as well as by document retrieval, annotation, principles of preserving digital documents, multimedia documents, and the politics of documentation. While the question of what a document is and, consequently, what it is not was much discussed during the early years, questions of what documents do and how they do it have become
12 See http://documentacademy.org for more information about the organization and https://ideaexchange.uakron.edu/docam/ (22.12.2021) for the proceedings. See also Lund and Buckland (2008, 163–164) on the establishment and early years of the Document Academy.
more important since then, and concepts of self-documentation, documentality, auto-documentality, and hyper-documentation have been discussed in recent years. A "thing's documentary agency, power or force" is called documentality by Frohmann (2012, 173). He is "interested in traces as happenings, in how something gets written, or, in the widest sense, inscribed" (Frohmann 2012, 174). Ronald E. Day includes in this approach auto-documentality, which relates to the agency of documents and explores it as "self-evidentiality in terms of human, non-human, and, more broadly speaking, natural entities" (Day 2018, 1). The concept of documentality is also used by Maurizio Ferraris in his theory of social ontology. According to Ferraris, documentality is the form that social objects take through "inscription with institutional value." He suggests "that we should regard as a document any inscription with institutional value" (Ferraris 2013, 249), and distinguishes between strong (inscription of an act that would not exist without the inscription) and weak (banknotes no longer current, expired passports, but also fingerprints left unintentionally) documents (Ferraris 2013, 267), and further between documents in a narrow (biometrical data and photographs) and in a broad (sound recordings, films, and videos, as well as DNA) sense (Ferraris 2013, 250–251). According to Ferraris (2013, 266), a document theory has to consider at least seven entities:
1. The different types of document, from informal notes to formal and solemn documents;
2. Their various physical realizations;
3. The different operations that can be performed by documents […];
4. The various acts that can be executed thanks to documents […];
5. The diverse ways […] in which those acts can be realized;
6. The institutional systems to which documents belong and their role in them;
7. The provenance of documents (the difference between original, copy and fake).
As pointed out by Perret (2019, 1), "[d]ocumentality is not to be confused with documentarity," even though the terms are almost identical. Ronald E. Day frames documentarity as a "philosophy of evidence" (Day 2019, 1) built upon the history of inscription. Day has "coined the term […] to distinguish" his approach "from Ferraris's notion of documentality" (Day 2019, 8–9). He argues that "documentarity has both static (classification) and revelatory (algorithmic) forms for making evident, blocking out competing psychologies of time and experience" (Day 2019, 6). Day's philosophical approach follows a timeline where he examines "a broad range of different types, genres, and modes of inscription by which beings become evident and are taken as evidence" (Day 2019, 151). Day starts out at the end of the nineteenth century with two different types of documentarity: Otlet's
positivism (with knowledge as a collection of facts about the world) and knowledge as experience (developed in field anthropology and early ethnology in France). He later returns to Briet and discusses French literary realism and the avant-garde's criticism of representation in the twentieth and twenty-first centuries, and ends with post-documentation technologies that "tend to function in real-time and as infrastructural technologies" (Day 2019, 8).
4 Conclusion
The renewed interest in the concept of documents and documentation since the 1980s and 1990s has led to a wide range of research with contributions from various fields. The annual DOCAM conferences and their proceedings, but also other conferences about historical, empirical, and theoretical perspectives in Library and Information Science (like CoLIS) and journals like the Journal of Documentation, have been platforms for discussion about this and other, competing concepts. One sometimes gets the impression that trends and fashions are important when terms and concepts are discussed, for instance when educational programs decide on new names in order to attract students. I would like to argue that our choice of concepts and terms should be decided by what is most suitable for answering our research question, not by what concept or term is most trendy at a given moment. That a term has a long history is not in itself reason enough for choosing it, even if we encounter such an argument from time to time.¹³ We have to put the usefulness of the concept chosen in any given case to the test in numerous and varied analyses, demonstrating that it opens up new insights not provided in the same way by other concepts.
13 See, for instance, for the notion of "text," Greetham (1999, 26): "Some years ago, when looking for a suitable title for a new interdisciplinary journal of textual scholarship, I came up with the term text. It had some obvious virtues: it was short and easily remembered, it had roughly the same form and meaning in several European languages, and it had a long scholarly history."
Bibliography
Allen-Robertson, James. Critically assessing digital documents: materiality and the interpretative role of software. Information, Communication & Society, 21(11):1732–1746, 2017. DOI: https://doi.org/10.1080/1369118X.2017.1351575.
Briet, Suzanne. What is Documentation? English Translation of the Classic French Text. Ronald E. Day, Laurent Martinet, and Hermina G. B. Anghelescu, translators. Lanham, MD: Scarecrow Press, 2006.
Brooks, Douglas A. Gadamer and the mechanics of culture. Poetics Today, 24(4):673–694, 2003. DOI: https://doi.org/10.1215/03335372-24-4-673.
Brown, John Seely and Paul Duguid. The social life of documents. First Monday, 1(1), 1996. DOI: https://doi.org/10.5210/fm.v1i1.466.
Buckland, Michael K. Information as thing. Journal of the American Society for Information Science, 42(5):351–360, 1991. DOI: https://doi.org/10.1002/(SICI)1097-4571(199106)42:53.0.CO;2-3.
Buckland, Michael K. What is a "document"? Journal of the American Society for Information Science, 48(9):804–809, 1997. DOI: https://doi.org/10.1002/(SICI)1097-4571(199709)48:93.0.CO;2-V.
Buckland, Michael K. The physical, mental and social dimensions of documents. Proceedings from the Document Academy, 3(1):4, 2016. DOI: https://doi.org/10.35492/docam/3/1/4.
Buckland, Michael K. Before the antelope: Robert Pagès on documents. Proceedings from the Document Academy, 4(2):6, 2017. DOI: https://doi.org/10.35492/docam/4/2/6.
Bush, Vannevar. As we may think. The Atlantic Monthly, July 1945. URL: https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/, (14.02.2022).
Day, Ronald E. Indexing It All: The Subject in the Age of Documentation, Information, and Data. Cambridge, MA: MIT Press, 2014.
Day, Ronald E. Auto-documentality as rights and powers. Proceedings from the Document Academy, 5(2):3, 2018. DOI: https://doi.org/10.35492/docam/5/2/3.
Day, Ronald E. Documentarity: Evidence, Ontology, and Inscription. Cambridge, MA: MIT Press, 2019.
Farkas-Conn, Irene Sekely. From Documentation to Information Science: The Beginnings and Early Development of the American Documentation Institute-American Society for Information Science. Santa Barbara, CA: Greenwood Press, 1990.
Ferraris, Maurizio. Documentality: Why It Is Necessary to Leave Traces. Richard Davies, translator. New York: Fordham University Press, 2013.
Fjordback Søndergaard, Trine, Jack Andersen, and Birger Hjørland. Documents and the communication of scientific and scholarly information: Revising and updating the UNISIST model. Journal of Documentation, 59(3):278–320, 2003. DOI: https://doi.org/10.1108/00220410310472509.
Foucault, Michel. Discipline and Punish: The Birth of the Prison. Alan Sheridan, translator. New York: Vintage Books, 1995.
Foucault, Michel. Archaeology of Knowledge. 2nd edition, London: Routledge, 2002. DOI: https://doi.org/10.4324/9780203604168.
Frohmann, Bernd. Deflating Information: From Science Studies to Documentation. Toronto: University of Toronto Press, 2004.
Frohmann, Bernd. Multiplicity, materiality, and autonomous agency of documentation. In: Skare, Roswitha, Niels W. Lund, and Andreas Vårheim, editors, A Document (Re)turn, pp. 27–39. Frankfurt am Main: Peter Lang, 2007. URL: https://www.academia.edu/14048919/Documentation_materiality_and_autonomous_agency_of_documentation, (14.02.2022).
Frohmann, Bernd. Revisiting 'what is a document?'. Journal of Documentation, 65(2):291–303, 2009. DOI: https://doi.org/10.1108/00220410910937624.
Frohmann, Bernd. The documentality of Mme Briet's antelope. In: Packer, Jeremy and Stephen B. Crofts Wiley, editors, Communication Matters: Materialist Approaches to Media, Mobility and Networks, pp. 173–182. Milton Park: Routledge, 2012.
Furner, Jonathan. The ontology of documents, revisited. Proceedings from the Document Academy, 6(1):1, 2019. DOI: https://doi.org/10.35492/docam/6/1/1.
Genette, Gérard. Paratexts. Thresholds of Interpretation. Cambridge: Cambridge University Press, 1997.
Gorichanaz, Tim and Kiersten F. Latham. Document phenomenology: a framework for holistic analysis. Journal of Documentation, 72(6):1114–1133, 2016. DOI: https://doi.org/10.1108/JD-01-2016-0007.
Gray, Jonathan. Afterword: Studying media with and without paratexts. In: Geraghty, Lincoln, editor, Popular Media Cultures: Fans, Audiences and Paratexts, pp. 230–237. London: Palgrave Macmillan Limited, 2015.
Greetham, David C. Theories of the Text. Oxford: Oxford University Press, 1999.
Hansson, Joacim. Representativity and complementarity in Tai Chi as embodied documentation. Proceedings from the Document Academy, 4(2):11, 2017. DOI: https://doi.org/10.35492/docam/4/2/11.
Hartel, Jenna. Turn, turn, turn. In: Proceedings of the Tenth International Conference on Conceptions of Library and Information Science, Ljubljana, Slovenia, June 16–19, 2019, number 1901 in CoLIS, Information Research, 2019. URL: http://informationr.net/ir/24-4/colis/colis1901.html, (14.02.2022).
Hayler, Matt. Matter matters: The effects of materiality and the move from page to screen. In: Griffin, Gabriele and Matt Hayler, editors, Research Methods for Reading Digital Data in the Digital Humanities, pp. 14–35. Edinburgh: Edinburgh University Press, 2016.
Hayles, Katherine N. Writing Machines. Cambridge, MA: MIT Press, 2002.
Hayles, Katherine N. Translating media: Why we should rethink textuality. The Yale Journal of Criticism, 16(2):263–290, 2003.
Hayles, Katherine N. Print is flat, code is deep: The importance of media-specific analysis. Poetics Today, 25(1):67–90, 2004.
Hjørland, Birger. Documents, memory institutions and information science. Journal of Documentation, 56(1):27–41, 2000.
Le Deuff, Olivier and Arthur Perret. Hyperdocumentation: origin and evolution of a concept. Journal of Documentation, 76(6):1463–1474, 2019. DOI: https://doi.org/10.1108/JD-03-2019-0053.
Levy, David M. Fixed or fluid? Document stability and new media. In: ECHT '94: Proceedings of the 1994 ACM European Conference on Hypermedia Technology, pp. 24–31, 1994. DOI: https://doi.org/10.1145/192757.192760.
Lund, Niels W. Documentation in a complementary perspective. In: Rayward, W. Boyd, editor, Aware and Responsible: Papers of the Nordic-International Colloquium on Social and Cultural Awareness and Responsibility in Library, Information and Documentation Studies (SCARLID), pp. 93–102. Lanham, MD: Scarecrow Press, 2004.
Lund, Niels W. Building a discipline, creating a profession: An essay on the childhood of 'Dokvit'. In: Skare, Roswitha, Niels W. Lund, and Andreas Vårheim, editors, A Document (Re)turn, pp. 11–26. Frankfurt am Main: Peter Lang, 2007.
Lund, Niels W. Document, text and medium: Concepts, theories and disciplines. Journal of Documentation, 66(5):734–749, 2010. DOI: https://doi.org/10.1108/00220411011066817.
Lund, Niels W. and Michael Buckland. Document, documentation, and the document academy. Archival Science, 8:161–164, 2008.
Lund, Niels W. and Roswitha Skare. Document theory. In: Bates, Marcia J. and Mary Niles Maack, editors, Encyclopedia of Library and Information Sciences, volume 1, pp. 1632–1639. 3rd edition, New York: Taylor and Francis, 2010.
Lund, Niels W., Tim Gorichanaz, and Kiersten F. Latham. A discussion on document conceptualization. Proceedings from the Document Academy, 3(2):1, 2016. DOI: https://doi.org/10.35492/docam/3/2/1.
Lynch, Michael E. and Steven Woolgar, editors. Representation in Scientific Practice. Cambridge, MA: MIT Press, 1990.
Nelson, Theodor Holm. A file structure for the complex, the changing, and the indeterminate. In: Proceedings of the 1965 20th National Conference, pp. 84–100. Association for Computing Machinery, 1965. DOI: https://doi.org/10.1145/800197.806036. URL: https://dl.acm.org/doi/pdf/10.1145/800197.806036, (14.02.2022).
Olsson, Michael and Annemaree Lloyd. Being in place: embodied information practices. In: Proceedings of the Ninth International Conference on Conceptions of Library and Information Science, Uppsala, Sweden, June 27–29, 2016, number 1601 in CoLIS, Information Research, 2017. URL: http://InformationR.net/ir/22-1/colis/colis1601.html, (14.02.2022).
Otlet, Paul. Traité de Documentation. Brussels: Editions Mundaneum, 1934.
Pédauque, Roger T. Document: Form, sign and medium, as reformulated for electronic documents. 2003. URL: https://archivesic.ccsd.cnrs.fr/sic_00000594/document, (14.02.2022).
Pédauque, Roger T. Document: Form, sign, and medium, as reformulated by digitization. A completely reviewed and revised translation by Laura Rehberger and Frederik Schlupkothen. Laura Rehberger and Frederik Schlupkothen, translators. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 225–259. Berlin: De Gruyter, 2022.
Perret, Arthur. Writing documentarity. Proceedings from the Document Academy, 6(1):9, 2019. DOI: https://doi.org/10.35492/docam/6/1/10.
Plotnitsky, Arkady. Complementarity. Anti-Epistemology after Bohr and Derrida. Durham, NC: Duke University Press, 1994.
Rayward, W. Boyd. International Organisation and Dissemination of Knowledge. Selected Essays of Paul Otlet. Amsterdam: Elsevier, 1990.
Renear, Allen and David Dubin. Towards identity conditions for digital documents. In: Proceedings of the 2003 Dublin Core Conference, 2003. URL: https://dcpapers.dublincore.org/pubs/article/view/746/742, (14.02.2022).
Rey, Alain, Marianne Tomi, Tristan Horé, and Chantal Tanet. Dictionnaire Historique De La Langue Française. Paris: Dictionnaires Le Robert, 1992–1998.
Roberts, Jennifer L. Things. Material turn, transnational turn. American Art, 31(2):64–69, 2017.
Simpson, John A. and Edmund Weiner. The Oxford English Dictionary. 2nd edition, Oxford: Clarendon Press, 1989.
Skare, Roswitha. Christa Wolfs "Was bleibt": Kontext – Paratext – Text. Münster: LIT Verlag, 2008.
Skare, Roswitha. Complementarity: A concept for document analysis? Journal of Documentation, 65(5):834–840, 2009. DOI: https://doi.org/10.1108/00220410910983137.
Skare, Roswitha. The paratext of digital documents. Journal of Documentation, 77(2):449–460, 2021. DOI: https://doi.org/10.1108/JD-06-2020-0106.
Van Acker, Wouter. Architectural metaphors of knowledge: The Mundaneum designs of Maurice Heymans, Paul Otlet, and Le Corbusier. Library Trends, 61(2):371–396, 2012. DOI: https://doi.org/10.1353/lib.2012.0036.
Woledge, Geoffrey. Historical studies in documentation: 'bibliography' and 'documentation'. Journal of Documentation, 39(4):266–279, 1983.
Wolff, Kurt H., editor. From Karl Mannheim. Oxford: Oxford University Press, 1993.
Part II: Using Documents
Beginning of the book of Genesis in the Gutenberg Bible, p. 5r. Print, 1454. Earliest book printed using movable type in Europe. Source: https://commons.wikimedia.org/wiki/File:Gutenberg-Bibel_Bd1_005_r_Genesis.jpg (03.03.2022), public domain.
Gerald Hartung
The Document as Cultural Form
The late nineteenth and early twentieth centuries brought with them a concern for how we can describe ordered knowledge effectively. For the modern world, "effectively" means no longer being able to turn to assumptions that can hardly be verified scientifically. In his Positive Philosophy (French original 1830–1842), Auguste Comte distinguished between theological and metaphysical stages in explaining the world, and set them in turn apart from a positive stage in which people seek to channel the use of reason and to order observable realities on the basis of patterns of resemblance and causality. Above all, this means connecting so-called facts of nature and of social life. Such connections involve lines and nodes that ideally yield a coherent documentation of our factual knowledge.¹ With Comte, we have the ingredients for a theory of ordering knowledge that relies on observation and a shared use of reason to discover regularities and laws, and to explain individual facts in nature and culture by connecting them to these general rules and laws. This is the theoretical background to the scientific optimism that also underlies Paul Otlet's landmark study Traité de Documentation: Le Livre sur le Livre; Théorie et pratique (Otlet 2015; see also Skare 2022). Otlet envisages that the world of facts, from individual elements to the overall ensemble, can be fused together by drawing connections and demonstrating correlations. The Mundaneum, which was founded by Paul Otlet and Henri La Fontaine in Brussels in 1898 as the Institut International de Bibliographie, was intended to cover the whole world of knowledge with all-encompassing documentation (see Levie 2006).²
1 "In the […] positive state, the mind has given over the vain search after Absolute notions, the origin and destination of the universe, and the causes of phenomena, and applies itself to the study of their laws,—that is, their invariable relations of succession and resemblance. Reasoning and observation, duly combined, are the means of this knowledge. What is now understood when we speak of an explanation of facts is simply the establishment of a connection between single phenomena and some general facts, the number of which continually diminishes with the progress of science." (Comte 1853, 2)
2 I thank Karl-Heinrich Schmidt and Frederik Schlupkothen for pointing out Otlet's work and the line of research to which it gave rise in Pédauque and Salaün concerning document theory and the architecture of information and knowledge. I was not familiar with these contexts until recently, and am therefore astonished by how close and distant these theories are to and from my own field of research, the philosophy of culture as a theory of objective spirit. In the present chapter, I advance the thesis that the two theoretical contexts should be placed in a productive relationship with each other.
Gerald Hartung, University of Wuppertal
https://doi.org/10.1515/9783110780888-003
I intend to proceed in three steps in the discussion that follows. I begin by bringing a philosophical perspective to bear on the definitions of the document—as part of both a theory of documents and an architecture of information circulation—in Otlet and his successors. I then go on to identify the equivalent elements in a philosophical theory of culture (the theory of objective spirit in Wilhelm Dilthey, Ernst Cassirer, and Georg Simmel) that allow bridges to be built with document theory. Finally, I present some theses on how the two theoretical contexts can be drawn together.
1 Theories of Documents and Their Use
Paul Otlet provides a general definition of the document: "The most general definition of a book or a document that can be given is this: a carrier made of a certain material and with certain dimensions, possibly [in the case of a book] folded or rolled in a certain way, marked with signs that represent specific intellectual data."³ This definition, general as it is, is usable insofar as Otlet shows that documents have a particular materiality and extension (i.e., they belong among the extended, spatial things, res extensae) and serve as a medium for preserving signs that in turn refer to intellectual processes (i.e., the temporal sequentiality of res cogitantes). We are, therefore, still in a world of Cartesian dualism here. For Otlet, the document par excellence is the book because the preservation and dissemination of information/knowledge can be illustrated paradigmatically with reference to it. Books are instruments of culture, of education, and more. In this respect, books also have a social role. But there are a great variety of documents of many different kinds, some of which are meant for publication and others of which are, as Otlet enigmatically adds, not meant to be.⁴ It is crucial for my purposes that Otlet sets out the basic structures of a document. It is composed of intellectual (e.g., ideas), material (e.g., paper, paper format), and graphical (e.g., signs such as text or illustrations) elements. A document can
3 "La définition la plus générale qu'on puisse donner du Livre et du Document est celle-ci: un support d'une certaine matière et dimension, éventuellement d'un certain pliage ou enroulement sur lequel sont portés des signes représentatifs de certaines données intellectuelles." (Otlet 2015, 43)
4 "qui […] ne sont pas destinées à l'être." (Otlet 2015, 43)
be considered in terms either of its content (contenu) or its form (contenant). The strength of Otlet’s treatment lies in the fact that he provides a comprehensive account of the history of documents and then points out general connections (types, forms, categories). As a result, a world made up of general document structures, a bibliographical (if seen in terms of the paradigm of the book) cosmos, takes shape before the reader’s eyes. Otlet describes not only the production conditions but also the reception conditions for documents (or books) (Otlet 2015, 315 ff.).⁵ He concludes with a synthesis of the different kinds of document in an “organisation mondial” in which local, regional, national, and international “organismes” of information are considered in their porous interrelatedness (Otlet 2015, 420). The theory of documents thus becomes a description of a practice of freely circulating information. Recent theories of the document are indebted to Otlet in a number of ways (which I cannot trace individually here), but understandably move away from his primary focus on the book as a model. When Roger Pédauque speaks of the omnipresence of documents in our lifeworld, for instance, he also has in mind the transition from the written to the audiovisual medium and from paper to digital format (Pédauque 2006, 29; see also Pédauque 2022, 226). The book becomes one among many carrier media for distributing and preserving information and knowledge. A relatively recent account of the scholarly discussion is presented in the study Vu, Lu, Su by Salaün (2012), and it is adopted as the foundation for the ideas that are developed in what follows. In contrast to Otlet, Salaün explicitly positions the document in a social space: “The document becomes established as a preferred vehicle for the circulation of knowledge, the adoption of [particular] techniques, and certainly also for the stability of organizations and the development of commerce.”⁶ In addition, the document is explicitly separated from the book paradigm, thereby allowing its omnipresence in our lifeworld, private and public, to be grasped. Salaün identifies three dimensions of the document, which he presents with varying degrees of precision. Vu, as the form of the document, is described in detail. Only aspects of form are involved here, not features of content. Content, as the textual structure of the document, is treated in the lu dimension. This dimension involves an intellectual process: comprehension. The practice involved here is not perception (as in vu) but a complex process such as the reading of a text. Drawing on Stanislas Dehaene,
5 There is an instructive cultural history of reading here, for example. 6 “Le document s’affirme comme un véhicule privilégié pour la circulation du savoir, l’affirmation des techniques et certainement aussi la stabilité des organisations et le développement du commerce.” (Salaün 2012, 40)
Salaün writes that "reading is the first 'prosthesis of the spirit.'"⁷ The third dimension of document use, su, is the dimension of circulation, and with it the social function of the document. The treatment of this dimension is remarkably brief in comparison to the other parts of the study. Whereas the aspects of perception and comprehension are well developed in terms of argument, the third dimension is not fully explored. This is the case, for example, with the far-reaching thesis that in the process of reading (or absorbing information), the person reading not only opens themselves to a social space but is, by using a document, subjected to a transformation.⁸ In this context, it becomes clear that Salaün's study involves setting out a research program rather than presenting the results of one. This applies above all to the various angles from which a definition of the document is attempted. Starting from Pédauque, a document is initially presented as a contract between people whose qualities work together on the levels of readability/perception, intelligibility/processing, and social action/integration. Taking the idea of a contract as a guide, Salaün then defines a document as a trace of understanding in the material (vu), intellectual (lu), and social (su) dimensions,⁹ and further ascribes to it the qualities of reproducibility, formability, and being processed. Forms and types of document can develop in conjunction with various contracts—e.g., bills, laws, identity cards, photographs, business cards, archeological finds, and so on. The essence of Salaün's position is that documents are a means of rediscovering our past, which by necessity has to be reconstructed for the sake of understanding our present and orienting our future.¹⁰ In summary: A document is not just any old object. A document is the object of a social practice. That social practice is fundamentally bound up with the level of perception, in a more complex form with the level of comprehension, and most complexly of all with the social level. This model can on the one hand be understood as a structural model of the composition of our social world (in the sense of a bottom-up model), but at the same time it must also be recognized that the most basic processes of perception are dependent on a contractual foundation. I always already recognize a document in its difference from a mere natural object, and I at
7 “la lecture est la première ‘prothèse de l’esprit.’” (Salaün 2012, 54; see also Dehaene 2007) 8 “Pour poursuivre l’exemple de la lecture de ce livre, vous-même, comme lecteur, êtes transformé par l’information qui a été mise en mémoire sur les pages imprimées.” (Salaün 2012, 55) 9 “un document est une trace permettant d’interpréter un événement passé à partir d’un contrat de lecture. Nous retrouvons bien les trois dimensions, matérielle avec la trace (vu), intellectuelle avec l’interprétation (lu), mémorielle avec l’événement passé (su), ainsi que la nécessaire construction sociale avec le contrat.” (Salaün 2012, 59) 10 “Le document est une façon de retrouver notre passé et, nécessairement, de la reconstruire en fonction de notre présent pour orienter notre futur.” (Salaün 2012, 59)
once recognize that document as a particular type, for instance as a bill, identity card, or letter. The higher degrees of complexity—we can also call them genres—are always already at play (in the sense of a top-down model) in acts of perception and understanding. This idea can be elaborated to form a thesis. By speaking of an implied contract, document theory asserts that we always already know something is a document when we perceive it. The boundary between natural objects and documents as cultural objects is—at least as a rule, let us say cautiously—recognized intuitively. Uncertain and exceptional cases, such as archeological finds where it is not clear whether nature or a human hand formed an object, need more detailed consideration. But as a rule, we recognize documents as such, and thus document theory engages neither with the nature–culture difference nor, against this background, with the special questions that are raised by the genesis and validity of documents as a cultural form. The philosophy of culture offers a way of doing just that.
2 On a Cultural Philosophy of Documents
There are several possible ways to bring the perspective of the philosophy of culture to bear on lines of thought from document theory. First, I pursue the implications of the fact that knowledge is always already inscribed implicitly in acts of perception. It is interesting here to draw connections with nineteenth- and twentieth-century theories of perception in the fields of phenomenology and philosophical psychology. Ernst Cassirer points the way in this respect when he speaks of the symbolic conciseness (symbolische Prägnanz) of acts of perception (see Schwemmer 1997, 237–240). Second, I take up the challenge posed by the fact that understanding is both a basic process (making recourse to intuition/Anschauung and perception/Wahrnehmung) and a complex process (preforming knowledge). Here, I explore a connection with the philosophical hermeneutics associated above all with the name of Dilthey and his school in the twentieth century (see Steinbach 2020). Third, I raise the question of how general knowledge, in conjunction with sociocultural factors that vary from one case to another, influences individual courses of action—how, that is to say, its formal aspects and type can determine the character of a given object as a document. These three dimensions correspond to the vu–lu–su structure in the work of Salaün and the context of his research group. It is clear that there are various possibilities for drawing correspondences between document theory and the philosophical tradition. In all of them, though, there is no getting away from a problem
that presents interdisciplinary work with a significant challenge: the need to align and translate the different specialist vocabularies involved. In the present context, I concentrate on the third aspect of Salaün's structure (su), and build a bridge from there to the philosophy of culture. My thesis is as follows: the analysis of documents as a cultural form can only gain by engaging with ideas from the philosophy of culture. The theory of objective spirit (or cultural or symbolic forms, as the philosophy of culture was variously known in the nineteenth and twentieth centuries) yielded notable insights into the genesis and validity, and the use, of cultural forms. These insights may still be of interest today if we can make it possible to relate them to the vocabulary of document theory.
2.1 Documents and Dilthey's Concept of the Reality of Human Spirit
This brings us to the heart of the matter. What do we mean when we speak of the background to a cultural world (in the sense of su)? The technical language of the nineteenth century refers to "objective spirit." This means a general principle of form that can be found in all cultural objects. At stake is the question of how general knowledge, in conjunction with sociocultural factors that vary from one case to another, influences individual courses of action and determines, in particular, the character of a given informational object as a document. An extended quotation from the philosopher Wilhelm Dilthey will provide some initial pointers with which to address this question: This great outer reality of human spirit always surrounds us. It is an actualization of human spirit in the world of the senses—from a fleeting expression to the rule of a constitution or code of law that lasts for centuries. Each single manifestation of life re-presents something common or shared in the realm of objective spirit. Every word or sentence, every gesture or polite formula, every work of art or political deed is intelligible because a commonality connects those expressing themselves in them and those trying to understand them. The individual always experiences, thinks, and acts in a sphere of commonality, and only in such a sphere does he understand. Everything that has been understood carries, as it were, the mark of familiarity derived from such common features. We live in this atmosphere; it surrounds us constantly; we are immersed in it. We are at home everywhere in this historical and understood world; we understand the sense and meaning of it all; we ourselves are woven into this common sphere. (Dilthey 2002, 168–169)
We can see immediately that this is not a definition of the cultural world as objective spirit. The problem paraphrased here involves the condition of possibility—not, it is true, defined, but nonetheless describable for any phenomenon—for perceiving, experiencing, thinking, treating a shared sphere (subjectively) in every act of perception, every case of understanding, and every element of knowledge, and (objectively) in every sentence, every gesture, every institution, and so on. To help make his ideas clear, Dilthey draws on the philological thesis of the distinction between nature and history, alluding to August Boeckh's formula of Erkenntnis des Erkannten¹¹ and its adoption in Droysen's Historik (see Rodi 2003). The idea is that we classify an object as historical and non-natural if we see it as produced, formed, as something already understood and recognized by others. Boeckh and his successors apply this formula to historical artifacts—analogously to Otlet's positivistic approach to documents—but Dilthey is prepared to extend its scope. Individual events such as words, gestures, historical deeds, but also general concepts such as family, state, nation, are elements of objective spirit, as also are writing, works of art, cultural artifacts, books—documents in the narrower sense. Dilthey does not distinguish between material and immaterial formations on a fundamental level. The reasons for this are complex and relate to the idealistic philosophy that he inherited (see Beiser 2014; Hartung 2006, 382–396). To put it briefly, he is interested in the causal chains from basic physical-psychological experience to abstract communication patterns on which a categorization of our cultural world rests. He accepts that there may be productive nexuses (Wirkungszusammenhänge) in a cultural system (e.g., law or art) that become detached from their emergence in the mental process of experiencing and understanding; but that is not where his interest lies. Psychological hermeneutics as pursued by Dilthey is concerned with the cultural world as a unitary productive nexus insofar as experiencing and understanding give rise to values and purposes that shape our actions (social interaction, but also the use of cultural objects). The perspective is teleological: everything—both material and immaterial goods—is slotted into chains of values and purpose and acquires sense and meaning only as a result of this. From the perspective of document theory, one way of engaging with this is to use Dilthey to set the ambitious understanding of documents as a contract between people (Salaün) in the context of hermeneutic psychology. This also means asking what we always already know by a document in order to be able to classify it as a document, and what values and purposes always already need to be implied in the use of documents for us to speak of a contract. An extended theory of documents would benefit greatly from this, for the analysis would extend into the realm of Weltanschauung and thereby take into account the determining contextual factors of psychology, social phenomenology, praxeology, institutions—cultural
11 For this concept, see Güthenke (2020, 115).
hermeneutics in the widest sense. Covering both individual and general sociocultural artifacts would elaborate Salaün's su as a complex formation. There is also a flip side to the observation that Dilthey is not concerned with differentiating between material and immaterial elements of the cultural world (objective spirit). The reason for this is simply his disinterest in the materiality of cultural goods, i.e., their character as forms. He shares this disinterest with other exponents of an idealistic philosophy of culture, which include neo-Kantians of various schools, as well as those working on a theory of the spiritual world in his wake (see Hartung 2018, 10–26).¹²
2.2 Cassirer's Logic of Cultural Forms
A theory of culture that goes beyond the perspective of cultural hermeneutics credits language with a constituting function for the cultural world. In the work of the philosopher of culture Ernst Cassirer, the su is linguistic in nature. It is above all in the medium of language that a shared world of meaning is opened up. It is a general principle that the act of self-contemplation is medial, above all linguistic in nature.¹³ In addition, the cultural world of humans is constituted through medial communication. In the process of this communication, active, creative expression condenses as meanings that present a complex picture in line with the various modes of symbolic figuration. The constancy that is achieved in the meaning of mythical images, linguistic signs, or religious ideas is, as Cassirer emphasizes, of a fundamentally different character than the constancy of natural forms. It does not correspond to the "property-constancy" and "law-constancy" that dominate the physical world, but proves to be a meaning-constancy with its own regularities: "We live in the words of language, the figurations of poetry and the plastic arts, the
12 As an aside, it is worth noting that where our theme is concerned, there is even less to be learned from the proponents of a materialist or naturalistic theory of culture, because the ambivalences of bottom-up and top-down movements for the constitution of cultural phenomena—which include documents—are not considered there at all. Compared to this reductionism, Dilthey's disinterest in the materiality of cultural forms is, for all its one-sidedness, an advantage, for it does not exclude what it does not consider. By highlighting the relative independence of cultural forms from the social context in which they emerge, he opens the way for a study of the intrinsic logic—with respect to genesis and validity—of cultural forms, e.g., linguistic forms, mathematical rules, and moral or juridical norms. On this, see Hartung (2017, 155–171).
13 "Language is therefore by no means merely an alienation from ourselves; rather, like art and each of the other 'symbolic forms,' it is a pathway to ourselves. It is productive in the sense that through it our consciousness of the I and our self-consciousness are first constituted." (Cassirer 2000, 54)
forms of music, the formations of religious ideas and beliefs. And it is only in them that we ‘know’ each other” (Cassirer 2000, 74–75). Analyses in the cultural sciences need to draw out the totality of the forms in which human life unfolds. Even though these forms tend to be infinitely varied, the anthropological situation nonetheless leads to a unified structural context, “for it is ultimately the ‘same’ human being that meets us again and again in a thousand manifestations and in a thousand masks in the development of culture” (Cassirer 2000, 76). This basic assumption should not, however, be misinterpreted along naturalistic lines. Cassirer certainly agrees with the historian Hippolyte Taine, for whom all culture is the work of humans. But he strongly disagrees that a cultural epoch, a work of art, or a society can be fully grasped as the sum of its empirically observable phenomena of expression. In his view, a cultural object is fundamentally different from a natural object because it contains several layers of meaning. Thus, the description of meadows and fields by geographers and botanists is fundamentally different from the representation of those same meadows and fields in landscape painting. Where the natural sciences analyze empirical findings, the cultural sciences “interpret symbols […] in order to make the life from which they originally emerged visible again” (Cassirer 2000, 86). It becomes possible, that is to say, to grasp the document world of the su in all its complexity precisely because we direct our attention to the sociocultural factors that shape us, our interests, preferences, inclinations, and strategies for constituting documents, and to the use of documents in social contexts. The point is that a document is not simply a document but becomes one in a particular way, through the meaning we assign to it and the use we make of it. In Cassirer’s view, the struggle between the perspectives of the natural and cultural sciences entered a new phase in the nineteenth century. According to the thesis of substance-concept and function-concept, the foundation of basic concepts was dismantled in late-nineteenth century mathematical physics and those concepts (e.g., that of causality) defined as purely relational terms. If this idea is transferred to the field of theoretical biology, it means that developments in the organic world that can be described in causal terms are merely considered a tendency in organic events, and not associated with the power of inner forces (immanent teleology) or treated as the expression of a purposiveness in the development of forms (entelechy). This has important implications for research in the cultural sciences. The analysis of cultural phenomena can henceforth not be satisfied with causally explaining the becoming of culture, but has to consider its
meaning and function in an “analysis of form” of language, religion, and art.¹⁴ Where the limits of causal explanation are reached, because its discourse is unable to reach the origin of cultural phenomena (or documents), it reveals their character as basic phenomena. This—in an ontological sense—negative finding is acknowledged, and all that can be done is to analyze their functions and determine their functional unity. In contrast to classical ontology, which seeks to follow the principle of sufficient reason, Cassirer argues for a stance that is epistemologically skeptical.¹⁵ This relative skepticism leads Cassirer to treat the question of the terminus a quo of symbolically functioning accomplishments as unanswerable. The question of the “being” of language, art, and religion amounts to a petitio principii—the only possible answer is assumed in the premises of the question in the first place. The answer is basically what the question was based on to begin with. This dilemma also presents itself when the model of evolutionary biology is transferred to the anthropological field. In Cassirer’s view, all efforts to explain cultural phenomena by making recourse to human “nature” have failed because they fail to register the specific change in function (e.g., between animal noises and specifically human language). A causal explanation tears down the borders between nature and culture, treats the phenomena of expression in nature as not yet culturally purposed, or cultural phenomena as still conditioned by nature, and thereby fails to account properly for the functional attainment of either animal noises or human words. For a critical philosophy of culture, there is no way to avoid facing up to the fact that individual cultural phenomena cannot be sufficiently explained in causal terms. Realizing that we “can attain knowledge of the ‘essence’ of man only in that” we “catch sight of him in culture, and in the mirror of his culture” (Cassirer
14 “Generally considered it [the task] consists in determining the ‘what’ of each individual form of culture, the ‘essence’ of language, religion, and art. What ‘is’ and what does each of them mean, and what function do they fulfill? And how are language, myth, art, and religion related to one another? What distinguishes them and what joins them to each other? Here we arrive at a ‘theory’ of culture that in the end must seek its conclusion in a ‘philosophy of symbolic forms’—even if this conclusion appears as an ‘infinitely distant point’ that we can approach only asymptotically.” (Cassirer 2000, 97) 15 “Skepticism is not only the disavowal or the destruction of knowledge. Philosophy itself can serve as proof of this. One needs only to think of its most significant and fruitful periods in order to realize what an important and indispensable role not-knowing has played in them […]. All genuine skepticism is relative skepticism. It declares certain questions as unsolvable in order to draw our attention all the more effectively to the sphere of solvable questions and to get a surer grip on them. […] Once this has been made clear, to concede that the question as to the origin [Entstehung] of the symbolic function is not solvable by scientific means in no way appears to be a mere agnosticism, an intellectual sacrifice that we are painfully forced to make.” (Cassirer 2000, 100–101)
2000, 102), that is to say, is more than just a negative conclusion. It is actually a liberation, because it means that any inclination to look behind the mirror of human culture can be abandoned. These ideas, too, may be of huge significance for document theory, for it now becomes clear why comprehending a cultural form in its being is not at stake. Instead, what matters is use—and any object can become a cultural form through its use. It is also a matter of functionality. There is no cultural form that does not have a functionality for us humans. The material side (which does not, however, interest Cassirer) and the possibility of being used make it possible to distinguish a cultural form, and thus also documents, from other objects in this world. Use implies the provision of form for an appropriate practice. In this sense, the world is full of objects that are meaningless for us and have no practical purpose. But as soon as we credit an object with meaning and use, we are dealing with a cultural form; and as soon as we consider its use for informational purposes, we are dealing with a document. Seen in this way, it is reasonable to say that a cultural form, and thus the document, is a basic phenomenon. Whatever it might be beyond our world of meaning and our dealings with it is of no meaning and of no consequence. But within our world of meaning and use, all cultural forms are part of the basic and complex structures out of which sociocultural knowledge is built (su).
2.3 Documents and Knowledge of a Thousand Interesting and Significant Things in Simmel
Not until Georg Simmel do we find another way of analyzing the logic of cultural forms, and considering documents in particular, beyond the plain alternatives of idealism and materialism/naturalism.¹⁶ I feel that Simmel’s position lends itself particularly well to drawing links with the questions addressed by document theory. In what follows, I analyze a number of passages from various works by Simmel. These include the Philosophie des Geldes (1900), his great Soziologie (1908), and his main philosophical work, Hauptprobleme der Philosophie (1910). I identify five points that strike me as important for drawing connections with document theory. First, Simmel analyzes the cultural process by focusing on an objectification of personal aspects of life into a supra-individual form. In an early phase of his work, he, like Dilthey and Cassirer, does not distinguish between material and immaterial concepts:
16 On Simmel, see recently Hartung et al. (2021).
Such is the civilizing influence of culture that more and more contents of life become objectified in supra-individual forms: books, art, ideal concepts such as fatherland, general culture, the manifestation of life in conceptual and aesthetic images, the knowledge of a thousand interesting and significant things—all this may be enjoyed without any one depriving the other. (Simmel 2011, 313–314)
It should be underlined that Simmel is concerned on the one hand with the use of concepts, on the other hand with their supra-individual character. This means that he is confronting the questions that surround Salaün’s su. His reference to a “knowledge of a thousand interesting and significant things” points to the sociocultural shaping of our perception, our understanding, and our way of dealing—not least—with documents in particular. Second, Simmel describes the transition from nature to culture as a process of cultivation (or refinement). In its material aspect, culture is nothing more than cultivated nature. And yet, cultivation depends on impulses that emanate from us humans. And so it is only by analogy that we can describe impersonal things (e.g., documents in Otlet’s sense) as “cultivated”: The material products of culture—furniture and cultivated plants, works of art and machinery, tools and books—in which natural material is developed into forms which could never have been realized by their own energies, are products of our own desires and emotions, the result of ideas that utilize the available possibilities of objects. It is exactly the same with regard to the culture that shapes people’s relationships to one another and to themselves: language, morals, religion and law. (Simmel 2011, 484)
In Simmel’s view, material bodies are defined as “the manifestation or embodiment of the […] growth of our energies” (Simmel 2011, 452). After this general discussion, which takes us back, as it were, to the Cartesian world—albeit from the opposite side to Otlet, that of the subject—Simmel takes a crucial step further in describing modern culture, as we shall see in what follows. Third, Simmel turns to the problem posed by the fact that in the fields of both intellectual and material cultural products, the level of cultivation of such products can be decoupled from that of their representatives (Träger): individuals and collectives. “Things […] tools, means of transport, the products of science, technology and art”—and thus also documents in the broadest sense—“are extremely refined” (Simmel 2011, 453), whereas we very much lag behind. But how can this be so, given that the whole culture of things is simply a culture of humans, whose energy it makes manifest? How is understanding possible at all if there is a hiatus here between the subjective experience and objectively possible use of documents? What is the idea that books, for instance, are far more clever than us supposed to mean?
The answer to this question can be found in Simmel’s conception of culture (objective spirit). It is true that he too uses this enigmatic concept to refer to a culture of things as a culture of humans. He shows that with the objectification of the spirit in, for instance, products of technology and science, a form is attained that permits a preservation of mental labor and makes possible a passing on of what has been achieved from one generation to another. This seems not unimportant to Simmel, for with the possibility of objectifying the spirit in words and works, organizations and traditions, and of passing it on, the difference of the human from other forms of life is marked: only we humans are granted a whole world. This, however, comes at a cost whose significance can hardly be overstated, for in particular modern man is surrounded by nothing but impersonal objects […]. Cultural objects increasingly evolve into an interconnected closed world that has increasingly fewer points at which the subjective soul can interpose its will and feelings. And this trend is supported by a certain autonomous mobility on the part of the objects. […] Both material and intellectual objects today move independently, without personal representatives or transport. Objects and people have become separated from one another. Thought, work effort and skill, through their growing embodiment in objective forms, books and commodities, are able to move independently; recent progress in the means of transportation is only the realization or expression of this. By their independent, impersonal mobility, objects complete the final stage of their separation from people. (Simmel 2011, 465)
Let us look in detail at what Simmel claims here. For a general definition of cultural objects, it is the case that there must be a personal impulse, for all cultural objects are first and foremost figurations of an intellectual energy. But this is only one side of the coin, for Simmel also allows the possibility that material objects (or documents) and intellectual objects (or ideas/ideals with which we live our lives) can be viewed as having an “autonomous mobility” without requiring a “personal representative.” Documents such as books, for example, are not merely mobile due to the energy of the civilized people (Kulturmenschen) that create and consume them: they are also distinguished by a mobility that we can call “impersonal.” For Simmel, this autonomous mobility is associated not only with the content of a material concept (the book as document) or an immaterial concept (notion, idea, ideal, etc.), but also with its form and its use. Fourth, it is not always clear in Simmel’s virtuoso thought process whether he is systematically thinking through the issues he raises. The famous excursus on written communication in his great Soziologie of 1908 does not pick up the ideas outlined here (Wolff 1950, 352–355). This is, however, in part related to the context, for Simmel is concerned there with the sociological phenomenon of letters as it pertains to a discussion of public and secret written communication in societies.
His central thesis is that writing creates publicity, specifically in the medium of culture and society (objective spirit). In his view, it is a basic principle that intellectual contents (such as, e.g., natural laws and moral imperatives, concepts, and artistic creations) have a timeless validity that is independent of their historical realization. A Euclidean rule does not become true or more true by being applied. A moral stipulation (e.g., the Ten Commandments or the Universal Declaration of Human Rights) pertains independently of whether, when, and by whom it is carried over into an action and realized. And so it is too for the truth of an idea: writing gives it an objective form, thereby intensifying its supra-temporal character, and furthermore making its accessibility to a subjective consciousness possible—“but its significance and validity are fixed, and thus do not depend on the presence or absence of these psychological realizations” (Wolff 1950, 352). The letter as a sociological phenomenon is an excellent example with which to address the phenomenon of material and intellectual cultural forms, including documents: the realization of intellectual energy, including its claim to supra-temporal validity, converges here with the objectification of a subjective impulse, the transition of a momentary, radically temporal mood into the permanent form of writing. The intersection, the letter, the written document, contains this ambivalence. Taking a sideswipe at Dilthey, who is not mentioned by name here, Simmel criticizes the idea of a naive homogeneity of cultural processes that make possible “our (apparently so simple) mutual ‘understanding.’”¹⁷ Let us summarize: cultural forms or objective formations, in this particular case documents, stand (if we place the material aspect in the foreground) in a first contradiction of emerging from intellectual-personal energy but being able to acquire autonomous mobility, and in a further contradiction of giving expression to the objectification of a subjective impulse and at the same time being an intellectual and material form with a claim to supra-temporal validity. Fifth, Simmel gives indications in his Hauptprobleme der Philosophie (ch. 2: “On Being and Becoming”)¹⁸ that he is prepared to abandon the idealistic constraints of his earlier analysis of cultural forms. Here, he considers cultural forms in their structural ambivalence: they are the product of a subjective fashioning will, yet have “a distinct objective, mental existence beyond the individual minds
17 “It is one of the intellectual achievements of written communication that it isolates one of the elements of this naïve homogeneity, and thus makes visible the number of fundamentally heterogeneous factors which constitute our (apparently so simple) mutual ‘understanding.’” (Wolff 1950, 355) 18 “Vom Sein und vom Werden.”
[…] that originally produced them or that subsequently reproduce them.”¹⁹ This independence applies to legal requirements, moral stipulations, linguistic forms, artistic products, and much more. What these forms have in common, and what simultaneously separates them, is the fact they are bound to external forms through which they become manifest and palpable. The usual argument here points out that the validity of cultural forms is bound neither to their social and historical genesis nor to their material shape. Simmel, however, now inquires further and approaches the document character of specific cultural forms: The spirit with which a printed book, say, is suffused is indisputably in it, for it can be gleaned from it. But in what guise can it be in it? It is the mind of the author, the content of his psychological processes, that the book contains. But the author is dead, so it cannot be his mind as a psychological process. And so it must be the reader, with his mental dynamism, that turns the lines and squiggles on the page into mental substance. But this is conditioned by the existence of the book, and in a fundamentally different and more direct way [my emphasis] than it is conditioned by the fact, say, that the reproducing subject breathes and has learned to read. The content that the reader puts together as a living process in himself is contained in objective form in the book; the reader “extracts” [my emphasis] it. But even if he did not extract it, the book would not lose this content as a result; and its truth or mistakenness, its nobility or its baseness, is obviously not dependent on how frequently or rarely, with how much or how little understanding, the sense of the book is reproduced in subjective minds.²⁰
Two points matter for Simmel here. On the one hand, he emphasizes (thereby restating his earlier position) that the immaterial, intellectual, and transpersonal character of a cultural form is not determined by its use. On the other hand, however, he indicates that the materiality and use of a cultural form do condition its intellectual content insofar as its social and historical consequences cannot be 19 “ein eigentümliches objektives, geistiges Dasein jenseits der einzelnen Geister […], die sie ursprünglich produziert haben oder die sie nachträglich reproduzieren.” (Kramme and Rammstedt 1996, 67) 20 “Der Geist, der etwa in einem gedruckten Buch investiert ist, ist zweifellos in ihm, da er aus ihm herausgewonnen werden kann. Aber in welcher Art kann er darin sein? Es ist der Geist des Verfassers, der Inhalt seiner psychischen Prozesse, den das Buch enthält. Allein der Verfasser ist tot, sein Geist als psychischer Prozeß kann es also nicht sein. So ist es also der Leser, dessen seelische Dynamik aus den Strichen und Kringeln auf dem Papier Geist macht. Allein dies ist durch die Existenz des Buches bedingt und zwar in einer prinzipiell andern und unmittelbareren Weise [my emphasis], als es etwa dadurch bedingt ist, daß dieses reproduzierende Subjekt atmet und lesen gelernt hat. Der Inhalt, den der Leser in sich als lebendigen Prozeß bildet, ist in dem Buch in objektiver Form enthalten, der Leser ‘entnimmt’ [my emphasis] ihn. Wenn er ihn aber auch nicht entnähme, so würde das Buch darum dieses Inhaltes nicht verlustig gehen und seine Wahrheit oder Irrigkeit, sein Adel oder seine Gemeinheit ist ersichtlich gar nicht davon abhängig, wie oft oder selten, wie verständnisvoll oder nicht verstehend der Sinn des Buches in subjektiven Geistern wiedererzeugt wird.” (Kramme and Rammstedt 1996, 67–68)
58 | Gerald Hartung understood independently of its use: use, and thus, in the case of the document, the situation in which the content is extracted, also matter. With this, Simmel gives us an indication of the complexity of the su, the structure with whose analysis we are concerned: This is the form of existence of all such contents, religious or legal, scientific or in some way traditional, ethical or artistic. They appear historically and are arbitrarily reproduced historically, but between these two psychological realizations they possess an existence in a different form and thereby demonstrate that even within these forms of subjective reality, they exist as something that cannot be reduced to them, as something meaningful in itself—as spirit, no doubt, that has nothing whatsoever to do in substance with its points of reference in the world of the senses, but objective spirit whose actual meaning stands pristinely above its subjective life in one consciousness or another. This category, which allows the supramaterial to be dissolved in the material, and the supra-subjective in the subjective, marks the entire historical development of humanity; this objective spirit allows the work of humanity to ensure its results survive beyond all individual people and individual reproductions.²¹
We recognize aspects treated earlier once more here: on the one hand, the objectification of the subjective; on the other hand, the autonomous mobility of objectified formations, together with a claim to supra-temporal validity. Simmel’s achievement lies in connecting the genesis and validity of cultural forms as objective formations. The ambivalence that is structurally inscribed into these formations becomes the object of his analyses. The last quotation introduces another aspect, which we can characterize as a recasting of the problem. Simmel hints, at least, that the material existence of the book, in which the supra-material (of the spirit) is dissolved—Otlet, for his part, speaks of the book as a document par excellence because the preservation and dissemination of information/knowledge can be illustrated paradigmatically with reference to it—could be a field of study in its own right. The objective formations could be examined in terms of how they make it possible to dissolve the supra-material in the material and the supra-subjective in 21 “Diese Existenzform haben all jene Inhalte, religiöse oder rechtliche, wissenschaftliche oder irgendwie traditionelle, ethische oder künstlerische. Sie tauchen historisch auf und werden historisch beliebig reproduziert, aber zwischen diesen beiden psychischen Verwirklichungen besitzen sie eine Existenz in andrer Form und erweisen damit, daß sie auch innerhalb jener subjektiven Realitätsformen als etwas mit diesen nicht Erschöpftes, als etwas für sich Bedeutsames subsistieren—zweifellos als Geist, der mit seinen sinnlichen Anhaltspunkten sachlich nicht das Geringste zu tun hat, aber objektiver Geist, dessen sachliche Bedeutung unberührt über seiner subjektiven Lebendigkeit in diesem oder jenem Bewusstsein steht. Diese Kategorie, die es gestattet, Übermaterielles in Materiellem und Übersubjektives in Subjektivem aufzuheben, bestimmt die ganze historische Entwicklung der Menschheit, dieser objektive Geist läßt die Arbeit der Menschheit ihre Ergebnisse über alle Einzelpersonen und Einzelreproduktionen hinaus bewahren.” (Kramme and Rammstedt 1996, 68)
the subjective. The interesting point is the storage function and character of a document as a permanent form. This is as far as Simmel got: alongside the immaterial aspect, the material aspect of a cultural form/an objective formation/a document is credited with significance for an understanding of the cultural development of humanity. He did not advance beyond this, but by pointing out this third way he prepared the ground for a bridge between document theory and the philosophy of culture.
3 Some Closing Remarks
I would like to conclude by highlighting the following points.
1. The sets of cultural forms and (types of) documents are not coextensive, if nothing else because documents are explicitly conceived as material things in Otlet (2015), whereas objective formations are conceived as intellectual-material entities in Dilthey and Simmel. What does this mean for a theory of documents?
2. The boundary between documents and objective formations is, in my view, fuzzy and in need of closer study. It is significant that both Pédauque (2006) and Dilthey/Simmel bind documents/objective formations to human action (production/intention and consumption/contemplation). Beyond that, the differences are important. Here again, the question is: how can a more precise division be drawn in such a way that it is productive for further research?
3. In Salaün (2012), documents are conceived in terms of their materiality, but the three dimensions imply—in Simmel’s terms—the supra-material (vu), supra-subjective (lu), and supra-temporal (su). The question arises as to how a material formation can become a storage medium for immaterial and supra-temporal phenomena, and how a transsubjective dimension might influence the use of documents in a genre-specific manner. Even “extractability” (Simmel) and “mechanical reproducibility” (Benjamin 1980, 504–505) may be more closely related than simply juxtaposing the two concepts suggests.
Both document theory and cultural theory have good reasons to continue engaging with each other. These discussions will enrich reflection on the material turn in the human and cultural sciences.
Bibliography
Beiser, Frederick C. The Genesis of Neo-Kantianism—1796–1880. Oxford: Oxford University Press, 2014.
Benjamin, Walter. Das Kunstwerk im Zeitalter seiner technischen Reproduzierbarkeit. 2. Fassung. In: Tiedemann, Rolf and Hermann Schweppenhäuser, editors, Gesammelte Schriften, volume I.2, pp. 471–508. Frankfurt am Main: Suhrkamp, 1980.
Cassirer, Ernst. The Logic of the Cultural Sciences. Steve G. Lofts, translator. New Haven, London: Yale University Press, 2000.
Comte, Auguste. The Positive Philosophy. Harriet Martineau, translator. volume 1. London: John Chapman, 1853.
Dehaene, Stanislas. Les neurones de la lecture. Paris: Odile Jacob, 2007.
Dilthey, Wilhelm. The formation of the historical world in the human sciences. Rudolf A. Makkreel and John Scanlon, translators. In: Selected Works, volume 3. Princeton, Oxford: Princeton University Press, 2002.
Güthenke, Constanze. Feeling and Classical Philology: Knowing Antiquity in German Scholarship, 1770–1920. Cambridge: Cambridge University Press, 2020. DOI: https://doi.org/10.1017/9781316219331.
Hartung, Gerald. Noch eine Erbschaft Hegels: Der geistesgeschichtliche Kontext der Kulturphilosophie. In: Philosophisches Jahrbuch der Görres-Gesellschaft, volume 113, pp. 382–396. Freiburg, Munich: Karl Alber Verlag, 2006.
Hartung, Gerald. Friedrich Albert Lange et l’histoire critique du matérialisme. In: L’Allemagne et la querelle du matérialisme: Une crise oubliée?, pp. 155–171. Paris: Classiques Garnier, 2017. DOI: https://doi.org/10.15122/isbn.978-2-406-07049-8.
Hartung, Gerald. Kulturphilosophie als Bildungsphilosophie – Wilhelm Windelband als Philosoph der modernen Kultur. In: Lessing, Hans-Ulrich, Markus Tiedemann, and Joachim Siebert, editors, Kultur der philosophischen Bildung: Volker Steenblock zum 60. Geburtstag, pp. 10–26. Hanover: Siebert Verlag, 2018.
Hartung, Gerald, Jörn Bohr, Heike Koenig, and Tim-Florian Steinbach, editors. Simmel Handbuch: Leben – Werk – Wirkung. Stuttgart, Weimar: Verlag J. B. Metzler, 2021.
Kramme, Rüdiger and Otthein Rammstedt, editors. Georg Simmel Gesamtausgabe Bd. 14: Hauptprobleme der Philosophie. Philosophische Kultur. Frankfurt am Main: Suhrkamp, 1996.
Levie, Françoise. L’homme qui voulait classer le monde: Paul Otlet et le Mundaneum. Brussels: Les Impressions Nouvelles, 2006.
Otlet, Paul. Le livre sur le livre: Traité de documentation. Fac-similé de l’édition originale de 1934. Brussels: Les Impressions Nouvelles, 2015.
Pédauque, Roger T. Le Document à la lumière du numérique. Caen: C&F éditions, 2006.
Pédauque, Roger T. Document: Form, sign, and medium, as reformulated by digitization. A completely reviewed and revised translation by Laura Rehberger and Frederik Schlupkothen. Laura Rehberger and Frederik Schlupkothen, translators. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 225–259. Berlin: De Gruyter, 2022.
Rodi, Frithjof. Das strukturierte Ganze: Studien zum Werk von Wilhelm Dilthey. Weilerswist: Velbrück Wissenschaft, 2003.
Salaün, Jean-Michel. Vu, Lu, Su: Les architectes de l’information face à l’oligopole du Web. Paris: La Découverte, 2012.
Schwemmer, Oswald. Ernst Cassirer. Ein Philosoph der europäischen Moderne. Berlin: Akademie Verlag, 1997.
Simmel, Georg. The Philosophy of Money. From a first draft by Kaethe Mengelberg. Tom Bottomore and David Frisby, translators. 3rd edition, London, New York: Routledge, 2011.
Skare, Roswitha. Document Theory. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 11–38. Berlin: De Gruyter, 2022.
Steinbach, Tim-Florian. Gelebte Geschichte, narrative Identität: Zur Hermeneutik zwischen Rhetorik und Poetik bei Hans Blumenberg und Paul Ricœur. Freiburg i. Br.: Verlag Karl Alber, 2020.
Wolff, Kurt H., editor. The Sociology of Georg Simmel. With an introduction by Kurt H. Wolff, translator. Glencoe, IL: The Free Press, 1950.
Ulrich Johannes Schneider
From Manuscript to Printed Book
Page Design and Reading
1 Books as Documents for Readers
Printed books are documents, at least in a cultural sense, when read and dealt with as witnesses of knowledge, information, and news. Today, treating printed books as documents means recognizing them as items that can be stored away, much like other documents. Library catalogues would be their retrieval instrument. Compared to born-digital documents, books may appear obsolete and clumsy, simply because of how they are stored and catalogued. On closer inspection, however, they are indeed documents of a complicated nature. The following remarks do not question the status of books as documents or deal with the difference between digital and printed document formats. The question is rather how books became documents when printing started. What we observe is that it took some time to shape books into usable documents. While books in manuscript culture were few and precious items of learning, instruction, and devotion, in the print era they represented the whole spectrum of human activities, documented them, and did so in an increasingly ordered fashion. One element of turning books into reliable documents fit for use in human communication is the page number. The slow emergence of page numbers is my topic here. I am interested in why it took almost a century for page numbers to be printed as something taken for granted in every book. I make this feature part of the story of how printed books became usable documents, even though they were meant to be, in the early stages of printing culture, little more than better manuscripts. Books had been cultural artifacts ever since the invention of the codex in late antiquity; they became a different kind of cultural artifact when printers started to care about the legibility of texts: Books were designed as usable, i.e., legible, documents for many.
With the advent of the printing press in the mid-fifteenth century, the making of books changed in many ways. The most notable sign of the new mode of production is the appearance of the title page, which had not existed before. However, this Ulrich Johannes Schneider, Leipzig University
https://doi.org/10.1515/9783110780888-004
64 | Ulrich Johannes Schneider change came slowly. For a long time, the information we consider typical for the opening section of a book—author, title, place and date of print, printer’s name—was given at the end of the text, not at the beginning. Keeping in line with the traditional layout of manuscripts, the early printed texts started to reveal their identity at the very end. In the year 1500, we find both ways, tradition and innovation, coexisting alongside each other, but not for long. The end-of-text colophon disappeared from printed books early on in the sixteenth century, and the title page remained for good (Rautenberg 2008). With printing, the book also started to change internally, since every printed page automatically had duplicates in other books. Shaping the text on the page— any text on any page—also took some time, as in the case of the title page, but it involved much more than moving information from one place (the end of the text) to another (the beginning). There were many features that shifted when the average printed page took on new characteristics that defined the character of the book as a document. What we can observe is this: the form of printed pages responded more and more to the needs of readers. Typography was the first thing to be affected: abbreviations disappeared and letter fonts became more homogeneous. In addition to shedding older features, with time the printed page adopted new layout features. Eventually, its design would look distinctively different from most manuscript pages. Printed books evolved into cultural objects, since from an early date they related to widespread reading practices, rather than considering only experts in monasteries and erudite circles. Eventually, the printed page turned the book from an arbitrary text-container into a document that was recognizable and distinct. With printing, books became conventional and uniform; they displayed features that, ultimately, no book would lack—most notable among them, page numbers. They did not exist in the fifteenth century. They only emerge in the sixteenth century, never to go away. With page numbers, a book was useful for and beyond reading, since readers were able to point to a certain page and cite its number, so that other readers would be able to quote or check it in their own copy. We still use this technique today: referencing a quotation is easily done by naming the year of the edition and the number of a page. Even in the digital world, PDF freezes pages, thus enabling exact references for places where text is located, allowing it to be pointed to in order to indicate a lie or a truth, something remarkable or something forgettable. The printed page number has become, by virtue of its ubiquity, part of our culture of reading, writing, and debating. It is what makes a book a cultural document throughout: something that can be referred to in a precise manner. Telling the difference between manuscript layout and typographical layout is easy when comparing the Bible text in Gutenberg’s Latin Bible and in Luther’s
version printed almost seventy years later. A lot happened to accommodate the reader, to make reading a smooth and yet structured process. In the later version we see paragraphs, printed initials, blank lines, notes and text in the margins. At the bottom of the page, a catchword previews the beginning of the next page. It is a printed page not to be confused with a manuscript. There is much better legibility.¹ Yet there are still no page numbers.
2 Numbering Pages
When numbering pages was introduced in the sixteenth century, some problems arose that we still face today. First, page numbers identify the pages of a book and in doing so identify the overall number of pages of the book. However, the printed number at the top or bottom of the page only identifies meaningful text; it belongs to the text rather than to the book. If we look more closely, there are almost no cases where the page number is to be found on every page. This is what librarians may find confusing: Where library cataloguing rules for pagination have generally gone astray is in paying too much attention to the physical details of pagination when the interest of the entry is in the extent of the work. Why should a work be labelled as ‘283 p.’ merely because the last page number is ‘283,’ when the last page of printed matter is actually the 284th page? If the concern is with the extent of the work, that extent might as well be recorded as accurately as possible, especially since the presence or absence of a printed page number is a detail of typographic design irrelevant in this context. (Tanselle 1977, 52)
Even today, page numbers qualify an area within a book; they do not quantify the extent of the book as a whole. When Digital Humanists undertake layout analysis, page numbers count as one of many features that the description of a page has to take into account. Before being able to harvest text, in the Digital Humanities one is confronted with the fact that text is framed and formatted through its design. So the page number, although it has an arbitrary relationship to what is on a page, is important because it gives the page an identity: it specifies the page and the text on it
1 Ryder (1979, 16): a “page in a book must not look like a collection of letters. At least it should look like a collection of words, at best a collection of phrases. To achieve this, not only must the right letter-design for the language and text and format be chosen, but it must also be used in a way which allows the easy flow, and therefore recognition, of phrases. That is the basis of designing a page for the printer to print for the reader to read.” Ryder then concentrates largely on typographical questions.
Fig. 1: Biblia, Mainz: Johannes Gutenberg, Peter Schöffer, c. 1455 (LUL: Ed.vet.perg.1:1).
Fig. 2: Martin Luther: Das Newe Testament Deutzsch, Wittenberg: Melchior Lotter (LUL: Libri.Sep.A.1502).
as a historical artifact. It is an important piece of metadata for any philological comparison: In a digital age, philologists need to treat our editions as components of larger, well-defined corpora rather than as the raw material for printed page layouts. Many may take this as obvious but few have pursued the implications of this general idea. The addition of punctuation, the use of upper case to mark proper names, specialized glossaries, the addition of name and place indices, and even translations prefigure major classes of machine-actionable annotation. Interpretations of morphological and syntax analyses, lexical entries, word senses, co-reference, named entities are only a subset of the features we may choose to include as new practices of editing emerge. (Crane 2018)
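To make the role of the page number as machine-actionable metadata concrete, here is a minimal sketch in Python (a modern illustration only, not drawn from Crane or from any particular Digital Humanities toolchain; the field names and sample values are invented) of how harvested page text might be recorded together with its printed page number so that a passage can be cited by edition and page:

```python
# Hypothetical sketch: recording a harvested page together with its printed
# page number so that a passage can be cited by edition and page.
# Field names and sample values are invented for illustration only.
from dataclasses import dataclass

@dataclass
class Page:
    edition: str   # short identifier of the edition being described
    year: int      # year of the edition, used in short references
    number: int    # the printed page number (metadata, not part of the body text)
    text: str      # the harvested body text of this page

def cite(page: Page) -> str:
    """Build a conventional short reference: edition, year, and page number."""
    return f"{page.edition} {page.year}, p. {page.number}"

if __name__ == "__main__":
    p = Page(edition="Example Edition", year=1531, number=283, text="...")
    print(cite(p))  # -> "Example Edition 1531, p. 283"
```

The point of the sketch is simply that the number travels with the text as a piece of metadata rather than being part of the body text itself.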
When text is produced by printing, the very conventionality of printing culture produces a certain way of seeing and reading text. When Luther’s printer separated paragraphs and introduced blank lines, and used printed initials and marginal notes (figure 2), he most likely wanted to help readers—to prevent them from drowning in the sea of letters continuously and endlessly spilled onto the page, limited on all four sides, yet densely packed together (figure 1).² In contrast, a more 2 Leipzig University Library (LUL).
elegant design, a more structured layout of the printed text makes reading it aloud and memorizing it a lot easier. The early printers produced books in the sense that they transformed every manuscript into print by adding layout features. Printers effectively translated manuscripts into text that could be offered to readers outside of expert cultures.
3 Page Layout after Gutenberg
The layout inventions of the first European printers in the period up to the first quarter of the sixteenth century are generally put down to individual efforts and achievements. As the map of European printing workshops around 1500 makes clear, there were several hundred printers keeping up a steady production.³ The output was quite varied, and books differed much in size and design. As Alan Bartram has convincingly shown, early printing had its flaws (Bartram 2001). On the printed page until c. 1530, we can easily detect many typographical accidents and unconvincing layout solutions.⁴ Looking back from today, pages printed by the middle of the sixteenth century were no longer densely filled and visibly attempted a more structured presentation of text. However, the transformation was not only slow, as in the case of the title page; it was also complex. Even with regard to the new European reading culture, books did not change suddenly. In order to understand the overall transformation in page design between the time of Gutenberg and the time of Luther, and in order to understand the emergence of page numbers, we have to study a variety of features that tell us about the design of a printed page and the problems associated with it.
3.1 Lines
Studying early printed books allows us to make two observations: most texts have no paragraphs; some have rubrication marks. Both features are linked, most likely because printers were slow to create paragraphs and rubrication helped. Rubricators embellished manuscripts by adding red or blue strokes by hand in
3 Cf. online ‘The atlas of early printing’: http://atlas.lib.uiowa.edu/ (31.10.2021). 4 Most of the examples in this chapter come from a catalog (224 pages) produced in connection with a 2016 exhibition in Lyons and Leipzig (Schneider 2016a).
Fig. 3: Avicenna: Canon medicinae, Venice: Pierre Maufer, Nicolaus de Contugo, 1483 (LUL: Med.arab.17).
Fig. 4: Biblia, Lyons: Antoine du Ry, 1528 (LUL: Biblia.451).
order to highlight the beginning of a line, a sentence, or some word that might be important to the reader. Any fifteenth-century library collection will show that rubrication was still common. The evidence we have of not-so-perfect rubrication marks in printed books tells us a lot about how popular this technique was, even if the sheer number of identical copies could not possibly all be rubricated, and indeed were not all rubricated, as again we can tell from library collections. Rubrication means intervening on the page by hand, making the text look better and more accessible to the reader’s eye. So long as the printed version of a text scarcely differs from its manuscript version, this technique makes a good deal of sense. We can observe that the text in figure 3 bears many hallmarks of manuscripts like abbreviations and ligatures, as well as rubrication. The reasons why rubrication did not survive the first half of the sixteenth century include some that have to do with costs. There is little sense in bringing down the cost of books by printing the text, only to pay the extra cost of rubrication (Harris 2020). The easiest way to keep the effect of the rubrication marking and avoid additional cost was to invent an extra character—the rubrication mark—and print it. In figure 4 we can see a common form of this new typeset character that— being black—has a different, yet similar effect from the colored ink-stroke added by hand. Like a manuscript, the text still uses abbreviations: horizontal strokes above n or m, a special sign for the word “and.” Yet it is advanced in the sense that its sentences are kept apart by the full point and an additional character. There is a further option open to printers when working with lines while keeping the text in flow and making room for as many letters as possible: adding
Fig. 5: Alkinoos: Disciplinarum Platonis epitoma, Nürnberg: Anton Koberger, 1472 (LUL: Ed.vet.1472,1).
Fig. 6: Giovanni Pico della Mirandola: Heptaplus de septiformi sex dierum Geneseos enarratione, Florence: Bartolomeo di Libri, c. 1490 (LUL: Ed.vet.s.a.m.80).
blank space. The text in figure 5 features full points between sentences, treating them like words, with equal distance from the last letter of the word before and the first letter of the word after. The text very much resembles a manuscript, with abbreviations and ligatures. In addition, the full points are slightly elevated. The other example (figure 6) is very different in that the full point is positioned on the line, not higher, and that it is attached to the last letter of the previous sentence, followed (mostly) by a blank space. There are other punctuation marks like the colon (not as part of an abbreviation) and parentheses. It is still unclear where blank spaces come from. Were they invented by Aldus Manutius in Venice (Smith 2003)? It may seem a niche question, yet it is an important one: when most printers started out, they imitated the work of the scribes, producing text, and text alone, leaving the page open for intervening artists adding rubrication marks and other decorative elements like initials. With the later insertion of blank space in the line (sometimes called typographical white, French blanc aldin), printers started to work not only on the text but also on its appearance to readers. When printers created entire blank lines, they went even further in giving more structure to the page. To sum up a complex story, we can safely say that the modern printed page came into being in the sixteenth century when printers not only worked with ink, i.e., with black ink only, but also with the ink-free spaces on the page (Kwakkel 2018, 47–53).
3.2 Manuscript Culture Continues
Besides rubrication, there is another manuscript tradition continued by printers: initials. Initials are the opening letters of opening words at the beginning of books, chapters, or paragraphs. Initials consist of ornaments around that first letter, sometimes elaborated into images. In manuscript culture, there is no limit to the art
Fig. 7: Missale Lugdunense, Lyons: Jean Neumeister, 1487 (fol. 131 of BmL: Res.inc.589).
Fig. 8: Missale Lugdunense, Lyons: Jean Neumeister, 1487 (fol. 131 of BmL: Res.inc.288).
Fig. 9: Missale Lugdunense, Lyons: Jean Neumeister, 1487 (fol. 131 of BmL: Res.inc.407).
of initials, given money, artistic ambition, and craft. In printed books, including initials required reserving space of more than one line at the beginning of the text in question where an artist would later add a little piece of artwork. Among the treasures of the Bibliothèque municipale de Lyon, France’s second-largest library when it comes to early printed books, there is a printed book in three editions, printed at the same time by the same printer. The difference between them is most obviously betrayed by their initials (figures 7 to 9).⁵ Initials were part of the art of bookmaking; they made the price of a book go up on the market. The way initials were executed was part of a value system, not part of a layout structure. The three books in figures 7 to 9 on the current page have the same printed text, yet are quite differently executed. One of the paper editions has only a rudimentary initial (figure 7), the other a more elaborate one (figure 8). Even more finesse was applied in the edition printed on parchment (figure 9). The evidence preserved in old and large collections like those in Lyons and Leipzig is overwhelming: European printers produced a great number of books where space was reserved for initials drawn by hand. The dependency of printers on manuscript culture when it comes to the design of pages with text is quite obvious. Yet there is a difference between rubrication and initials: the colored strokes were added by superimposing them on the body text, so missing out on rubrication did not alter the printed text in any way. Initials however, if not executed, are still “visible” on the page. The Leipzig printer Kachelofen provided initials (or had them made) of a relatively simple quality, adding them to a page where paragraphs and headings already exist (figure 10 on the facing page). Had the two initials not been added, two letters would have been missing, and two spaces would have been empty. In 5 Bibliothèque municipale de Lyon (BmL).
Fig. 10: on the left—Bernhard von Eger: Dyalogus virginis Mariae misericordiarum, Leipzig: Konrad Kachelofen, 1493 (LUL: Inc.Civ.Lips.133). Fig. 11: on the right—Desiderius Erasmus: De ratione studii, Strasbourg: Johann Herwagen, 1524 (LUL: Päd.8225).
contrast, the Strasbourg printer of Erasmus printed the letter s where an initial was not inserted, making the page appear incomplete to every reader who comes across it (figure 11). The remedy, of course, was to print the whole initial, and the Venetian printer Aldus Manutius did this expertly early on (figures 12 to 13 on the following page). The size of the printed initial—which effectively was a woodcut—could vary quite considerably. With the printed initial, the danger of leaving empty space on the page was neutralized, and the art of the manuscript continued in a sense, albeit without the use of color and without having the initial extend beyond the body text. Printed initials were embedded artistic elements, yet they also signaled that a particular paragraph was important and deserved highlighting. Looking at early specimens of printed books is often rewarding because of the many beautifully designed pages, and always instructive because some problems
Fig. 12: Aristotle: Problemata, Venice: Aldus Manutius, 1497 (LUL: Ald.5).
Fig. 13: Bible [Greek], Venice: Aldus Manutius, Andreas Socerus, 1518 (LUL: Ald.38).
with the layout are obvious. Missing initials resulted in empty space in prominent positions. It did not help much that in the middle of the free space, printers produced in small the letter they wanted to be drawn. Printed pages with these provisional letters looked worse than any other page. This conflict is evident in many library collections. Printers leaving space for initials to be executed by hand continued a manuscript tradition that for many reasons, but mostly because of the sheer number of copies, could not really survive. Printed initials were presumably less expensive than hand-made ones, but they still cost extra. Aldus in Venice probably had the money, while Denis Roce in Paris clearly lacked it. His Boccaccio on the Greek gods is short of some printed initials (figure 14 on the facing page). Obviously, he had planned for too many. In contrast, another Paris printer made good use of the fact that the printed initial would stay within the body text, leaving the margins free for additional woodcuts, here in a Euclid edition (figure 15 on the next page). The initial did not lose its appeal quickly or entirely.⁶ Even in the sixteenth century, the changing book market demanded less artistry, and books with simpler designs were more easily sold. It is also likely that the initial eventually disappeared because a different way of creating paragraphs was adopted. We know of no specific moment in time when printers decided on how to design pages without using initials; we only have the books as silent witnesses of what printers actually did. It is by observation that we see them experimenting with blank space within lines and with white space between them. Once this proved successful, and paragraphs
6 On the long story of “visual typography,” see Proot (2021).
Fig. 14: on the left— Giovanni Boccaccio: Genealogia deorum gentilium, Paris: Denis Roce, 1511 (LUL: Philos.50p/2). Fig. 15: on the right— Euclid: Elementa Geometrica, Paris: Henri Estienne, c. 1516 (LUL: Geogr.gr.38).
were created, the aesthetic of the printed page departed for good from that of manuscript culture.
3.3 Typographical White
It seems today that shaping text into paragraphs is a good way of structuring the flow of words by interrupting it just a little. We feel it makes sense when the early printers started to create paragraphs by ending the text in the last line with blank space, or by indenting the first line (figure 16 on the following page). A bolder step in creating paragraphs was the insertion of blank lines and headings (figure 17 on the next page). The example here shows three distinctive ways of separating paragraphs: the blank line, the heading, and the initial (not filled in). The blank line after the heading instead of before it was quite common in early printed books. Texts with clearly separated paragraphs helped readers’ spatial memory; the text on the page could be more easily remembered. As a sequence of legible items, paragraphs helped structure readers’ attention. Yet there are many variations. Printers created paragraphs as a matter of course in books listing regulations or where successive topics were not related. Textbooks lent themselves well to paragraphs for consecutive instructions. Books with historical narratives, where
Fig. 16: Girolamo Manfredi: Centiloquium de medicis et infirmis, Bologna: Bazalerius de Bazaleri, 1489 (LUL: Allg.Path.432).
Fig. 17: Lucius Annaeus Florus: Epitoma Gestorum Romanorum, Venice: Giovanni Rosso, c. 1487 (LUL: Hist.lat.4b/2).
manuscript sources traditionally displayed text without any interruption, were a different matter. The printed history of Rome in figure 17 has paragraphs and headings. Headings rarely existed in manuscript culture, which did not usually provide tables of contents either, at least not by the original scribes. When an index was added by later users, the text became searchable. In a way, printed headings fulfilled the same function: they helped navigate the text by interspersing index entries into the text flow. However, the business of adding headings to the body text has to be explored in more detail. Clearly, printers made it their business to structure the text visibly. This was less costly in the sense that no rubricator or initial artist had to be paid. Studying the way early printed books are produced and pages are designed shows that printers must have cared about what readers would appreciate. Readers and printers shared the same disposition to work toward effective use of books, i.e., reading as an intellectual process.
Many books from the sixteenth century show headings serving as a simple means of separating paragraphs; the latter were created automatically once a heading was inserted. Indeed, a paragraph was mainly just that: something produced by a separation, by interrupting the flow of text. Hence, headings were not infrequently put at the bottom of the page, i.e., at the bottom of the body text. Pages were seen as containers through which text flows like in a canal. Pauses or breaks were effected through paragraphs, or by headings, which could happen anywhere on the page. A palpable sense of sequence also applies to books with integrated woodcuts—of saints, of women, of plants, of animals, and so on—where putting images and
texts in a regular order matters more than other concerns about their placement. Almost everywhere the image precedes the text, and this consecutive order can even be observed in cases where the text had to be placed on the following verso page, i.e., separated from its image in the process of reading.⁷ We see here that in early printing, a page was filled, not built. We can also see that, with rare exceptions, the double-page spread was of no concern as a design element in printed books before 1530. We have to assess page design in retrospect on the basis of what printers did, since no traces of plans or principles for the layout of books with visual elements have survived. Hartmann Schedel alone has provided us with sketches made prior to printing, since he planned his Cosmographia very carefully, setting up a dialogue between explanatory texts and corresponding woodcuts on the facing pages (cf. Reske 2000, 50–59; Rücker 1988, 90–111).
4 Putting Numbers on Pages
To sum up, inserting blank spaces was the key design innovation of the early European printers. Besides reducing the use of abbreviations and ligatures, and besides creating new fonts, adopting italics and letters in bold, and so on, the page that we know from the middle of the sixteenth century onward differs unmistakably from the page we see a hundred years before in its handling of blank space within the body text. However, reworking the text by inserting blank spaces is only part of what we might call the printers’ revolution in text design around 1500. While printers worked on the body text, they also started to rework the areas around it. In the sixteenth century, printed pages became not only visibly distinctive, but also individualized and organized by the addition of metadata to them. For navigational purposes, these metadata were of great practical importance. They eventually also included the page number. There are two different kinds of information given outside of the body text, for instance at the bottom: catchwords (figure 18 on the next page) and signatures (figure 19 on the following page), which could appear in any combination (figure 20 on the next page). A catchword anticipates the first word of the next page. It is helpful for readers as well as binders. A catchword links one page to the next, every recto to its verso, and is thus of general usefulness. Identification by signature serves a more specialized purpose. Each sheet of paper printed on constitutes a set that, by way of folding, leads to a certain number of pages, depending on
Fig. 18: Flavio Biondo: De roma triumphante, Basel: Johannes Froben, 1531 (LUL: Alt.Gesch.116).
Fig. 19: Josse Clicthove: De Regis Officio opusculum, Paris: Henri Estienne, 1519 (LUL: Syst.Theol.537/4).
Fig. 20: Albrecht Dürer: Geometria, Paris: Christian Wechel, 1532 (LUL: Math.71).
how many times it is folded. Binders would know what sheets to include in a quire by identifying sheets by signatures, i.e., combinations of letters and numbers. Printing catchwords and signatures were two ways of making sure that the complex production of a book was not messed up by the binders, who had to produce several identical copies of a book. Managing the blank space around the body text helps to define the page as something sequential and also as something different from what comes before and after. The page functions not only as a container of text, but becomes something recognizable and identifiable in its own right (cf. Janssen 2004a). By addressing the areas outside of the body text, printers became artist-producers of what was to become the page as we know it. In the sixteenth century, the printed page thus acquired paratextual elements by means of which it was coded. In contrast to manuscripts, where marginal notes were the rule, printers tended instead to invest meta-information in the lower and upper margins. In early printed books, marginal notes to the left and right of the body text continued the manuscript tradition of accommodating something relevant to the text. What was noted above and below the body text also, in a different sense, concerned the overall structure of the book. Very important for the history of book printing is the fact that printers did not stick to the manuscript culture of producing letters alone; they also produced legible text by including meta-information not stemming from the author.
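As a purely modern illustration of the arithmetic behind these formats and signatures (not a reconstruction of any historical workshop practice; the format names follow standard bibliographic usage, and the labelling scheme is simplified), the following Python sketch computes how many leaves and pages one printed sheet yields for a given number of folds and generates signature labels for the leaves of a quire:

```python
# Illustrative only: the arithmetic of folding sheets into leaves and pages,
# and a simple way of labelling the leaves of a quire ("signatures").
# Historical signing practices varied; this is not a reconstruction of any workshop.

def leaves_per_sheet(folds: int) -> int:
    """One sheet folded n times yields 2**n leaves (each leaf has two pages)."""
    return 2 ** folds

def pages_per_sheet(folds: int) -> int:
    """Each leaf carries a recto and a verso, hence twice as many pages as leaves."""
    return 2 * leaves_per_sheet(folds)

# Common format names and their number of folds (standard bibliographic usage).
FORMATS = {"folio": 1, "quarto": 2, "octavo": 3}

def quire_signatures(quire_letter: str, folds: int) -> list[str]:
    """Label the leaves of one quire, e.g. A1, A2, A3, A4 for a quarto sheet."""
    return [f"{quire_letter}{i}" for i in range(1, leaves_per_sheet(folds) + 1)]

if __name__ == "__main__":
    for name, folds in FORMATS.items():
        print(name, leaves_per_sheet(folds), "leaves,", pages_per_sheet(folds), "pages per sheet")
    print(quire_signatures("A", 2))  # -> ['A1', 'A2', 'A3', 'A4']
```

Historically, signing practices varied considerably (often only the first half of the leaves in a quire was actually signed), so the labels here are merely indicative.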
As we have seen with catchwords and signatures, information printed below the body text addressed the physical structure of the book, its material composition, and the sequential order of its text. In contrast, information printed above the body text provided different kinds of structured information, pertaining either to current content (i.e., a running head) or to the relative position of the page within the book (i.e., a folio or page number). Directly over the column, readers were told where exactly they were in the text: what part, chapter, or subdivision thereof (in
Fig. 21: Gabriel Biel: Circa quattuor libros Sententiarum, Lyons: Jean Clein, 1514 (LUL: St. Nicolai.1027:2).
Fig. 22: Prudentius: Liber Peristephanon, Basel: Andreas Cratander, 1527 (LUL: Scr.eccl.1920).
figure 21: book 1, distinction 48, question 1). Page numbers told readers where they were in the book (figure 22). The upper-right position for the page number (mostly arabic numerals) occupied the position formerly held by the folio number (mostly roman numerals). The upper-left position for the page number was a new addition, since no information had been given in this position previously. The back side (verso) of any leaf was usually not given any number (i.e., a folio number), which was only printed on the recto side. Page numbers mechanically counted the pages much like folio numbers, yet they discontinued the recto/verso difference.⁸ When page numbers took over, left-hand pages were given even numbers and right-hand pages odd ones. Strictly speaking, metadata like folio or page numbers are not required for creating a book as a collection of pages (signatures and catchwords would suffice), nor are they necessary for reading a text (running heads could be enough). However, for referencing text and for comparing versions and editions, metadata like the page number proved to be essential. Neither the book as a material artifact nor the text as an intellectual artifact requires abstract tagging with consecutive numbers. However, page numbers define the book as a cultural artifact or document (i.e., a distinct edition). Hence, numbered pages have their functionality in meaningful conversations about texts and books, identifying the most visibly obvious element in a book: the page. Page numbers are helpful for readers of narratives (histories and stories) because they facilitate interrupted reading, they enable criticism and censorship of text, they facilitate commentary and emendation. The culture of reading and discussion is facilitated by the simple page number once this number is
8 On the technical requirements for printing page numbers and body text at the same time, cf. Janssen (2004b).
found everywhere. Page numbers were introduced relatively late, yet they became an essential part of our text culture, and still are today. Mechanically unalterable, the page number gives every reader a sense of the arbitrary nature of the book in hand: the size of fonts and the spacing mean, for a start, that identical texts can take up different numbers of pages. A culture of copying texts by hand—such as the Arab world knew well into the nineteenth century—is less prone to change the book itself, because it is all about the text. Even if the early printers invented the page number only to serve the interest of readers, they thereby turned the book into a cultural artifact, a document. Henceforth, the form of the text and the form of the book could vary independently of each other; they have remained in continuous dialogue ever since.
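The arithmetic behind these conventions can be sketched schematically. The following minimal example (an illustrative sketch in Python added for clarification, not drawn from the printed sources discussed here; the function names and the simple A, B, C signature series are hypothetical) shows how the number of folds determines the pages obtained from one sheet, how page numbers encode the recto/verso distinction, and how a signature series might label the sheets of a book for the binder.

```python
# Illustrative sketch only: the arithmetic behind common early modern formats.
# Folding one printed sheet n times yields 2**n leaves; each leaf has a recto
# (front) and a verso (back) page, i.e., twice as many pages as leaves.

FORMATS = {"folio": 1, "quarto": 2, "octavo": 3}  # assumed folds per sheet


def pages_per_sheet(folds: int) -> int:
    """Number of pages produced by one sheet folded the given number of times."""
    leaves = 2 ** folds
    return 2 * leaves


def side_of_page(page_number: int) -> str:
    """Right-hand pages carry odd numbers (recto), left-hand pages even (verso)."""
    return "recto" if page_number % 2 == 1 else "verso"


def signature_labels(number_of_sheets: int):
    """Hypothetical signature series (A, B, C, ...) identifying the sheets so
    that binders can gather and order the quires of a copy correctly."""
    return [chr(ord("A") + i) for i in range(number_of_sheets)]


if __name__ == "__main__":
    print(pages_per_sheet(FORMATS["quarto"]))  # 8 pages from one quarto sheet
    print(side_of_page(77))                    # 'recto': odd numbers sit on the right
    print(signature_labels(4))                 # ['A', 'B', 'C', 'D']
```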
5 Early Printers Speak to Their Readers
Page design and text layout in the first seven decades of typeset printing did not just happen; they were produced by design and as a business. Printers had a vision of what was best for readers. Hence the many changes in typography, the early use and later disappearance of rubrication marks and initials, the introduction of paragraphs through the aid of blank space and blank lines.⁹ There is a world of difference between the Gutenberg Bible and the version by Luther printed almost seventy years later (figures 1 and 2 on page 66). In the sixteenth century, the reading public was already much wider, more widespread, and more diverse than before. Engaging with books was no longer the preserve of an expert culture. When metadata were added to the book on the title page and, no less importantly, to every page—with catchwords and signatures at the bottom, with running heads and folio or page numbers at the top—texts became a commodity that was easy to reference. It was then that early modern European society started to work with texts, to learn and acquire knowledge via printed words, and to exchange opinions and theories about almost everything. While early printers in this way served the culture of reading and discussing texts, they would normally not participate actively in it—with one exception: when they confessed their mistakes to readers. The printing of errata, i.e., a list of misprinted words, at the end of a book is a very curious fact. Printers made themselves part of the book and took over the role of speaking directly to their readers. From an early date, errata were printed as part of the colophon. They were part of a communication by the printers—a kind
9 For the similarly long story of punctuation marks, see the study by Parkes (1992). On the influence of printing, cf. Parkes (1992, 87 ff., 215 ff.).
Fig. 23: Leonardo Bruni: Epistolae familiares, Leipzig: Jakob Thanner, 1499 (LUL: Inc.civ.lips.228).
Fig. 24: Ramon Llull: De clericorum, Paris: Guy Marchant, 1499 (BmL: SJ.Inc.b.098,2).
of confession really—who placed them at the very end of the book, where they included not only the mark of their print workshop but also the title of the work, its author, and the place and date of printing. It was in addition to these elements that they quite often listed the mistakes they had made in printing the text (figures 23 to 24). Detailing their mistakes, the errata section was the only place in any book where the early printers ever talked to their public (except in cases where printers were also editors). What is more, listing errors forced the printers to make specific references to their “own” book. By inserting the errata section they created a need to reference text as precisely as possible. As it turns out, the references made were quite varied and tell us again about the long and slow process of including metadata in a book. Printers would want to replace one falsely printed expression with another, saying in 1487, for example, “Moretur pro moreretur,” without telling the reader where exactly to replace the expression “he would die” with “he died” (figure 25 on the following page). The book in question has no folio or page numbers. Another example: printers said in 1499, at the end of a book containing several theater pieces by Plautus, that in a particular comedy the word “vires” should read “vices”
Fig. 25: Quintus Serenus, Giovanni Sulpizio da Veroli: Carmen Medicinale, Rome: Eucharius Silber, c. 1487 (LUL: Ed.vet.s.a.m.144).
Fig. 26: Plautus: Comoediae, Venice: Simone Bevilacqua, 1499 (LUL: Poet.lat.3b).
Fig. 27: Jakob Wimpfeling, Philipp Fürstenberg, Crato Hofmann von Udenheim: Adolescentia Wympfelingij, Strasbourg: Martin Flach, 1500 (LUL: Ed.vet.1500,4).
(figure 26). There is no indication of where the mistake occurred. When folio numbers were included, the errata section made use of them, as shown in a book printed in 1500 (figure 27). A fine example of the use of even more exact page numbers can be found in a history of Sicily (figure 28 on the next page). Printers knew that they worked for readers, and through the errata section they spoke to them, revealing their concerns—however indirectly—in explicit discourse.
On rare occasions, books in public holdings show us that readers acted upon the printers’ indication of a misprint. This is proof of real communication between a printer and readers, a gift to historians given that evidence of this kind, in the form of explicit statements by contemporaries, is generally scarce. Leipzig University Library has one book showing a correction made by a reader in a biography printed in 1540. The errata note at the very end says “Pagina 25, vers. 23, renovata lege revocata,” and an unknown reader has indeed corrected the word by hand on page 25 (figure 29 on the facing page).
Fig. 28: Hugo Falcandus, Matthieu de Longuejoue, Martin Gervais de Tournai, Roberto di San Giovanni: Historia Hvgonis Falcandi Sicvli De Rebus Gestis in Siciliae Regno, Paris: Mathurinum Dupuys, 1550 (LUL: Hist.ital.511).
Fig. 29: Loys Le Roy: G. Budaei Viri Clariss. Vita, Paris: Roigny, 1540 (LUL: Vit.251xf).
Bibliography
Bartram, Alan. Five hundred years of book design. London: British Library, 2001.
Crane, Gregory. Give us editors! Re-inventing the edition and re-thinking the humanities. In: Moyn, Wuk, editor, Derived copy of Online Humanities Scholarship: The Shape of Things to Come. OpenStax-CNX, February 2018. URL: https://cnx.org/contents/[email protected]: XfgqFrtg@2/Give-us-editors-Re-inventing-the-edition-and-re-thinking-the-humanities, (31.10.2022).
Harris, Neil. Costs we don’t think about. An unusual copy of Franciscus de Platea, opus restitutionum (1474), and a few other items. In: Dondi, Christina, editor, Printing R-Evolution and Society 1450–1500. Fifty Years that Changed Europe, pp. 511–539. Venice: Edizioni Ca’ Foscari, 2020. DOI: http://doi.org/10.30687/978-88-6969-332-8.
Janssen, Frans A. Layout as means of identification. In: Janssen, Frans A., editor, Technique and Design in the History of Printing, pp. 101–111. Houten: HES & De Graaf, 2004a.
Janssen, Frans A. A footnote on headlines. In: Janssen, Frans A., editor, Technique and Design in the History of Printing, pp. 129–132. Houten: HES & De Graaf, 2004b.
Kwakkel, Erik. Books Before Print. Leeds: ARC Humanities Press, 2018.
Parkes, Malcolm Beckwith. Pause and Effect. An Introduction to the History of Punctuation in the West. London: Routledge, 1992. Reprint New York 2016.
Proot, Goran. The transformation of the typical page in the handpress era in the southern Netherlands, 1473–c. 1800. In: Chang, Ku-ming (Kevin), Anthony Grafton, and Glenn W. Most, editors, Impagination – Layout and Materiality of Writing and Publication, chapter 8, pp. 237–272. Berlin: De Gruyter, 2021. DOI: https://doi.org/10.1515/9783110698756-009.
Rautenberg, Ursula. Die Entstehung und Entwicklung des Buchtitelblatts in der Inkunabelzeit in Deutschland, den Niederlanden und Venedig. Quantitative und qualitative Studien. In: Estermann, Monika and Ursula Rautenberg, editors, Archiv für Geschichte des Buchwesens, volume 62, pp. 1–105. Munich: K. G. Saur Verlag, 2008.
Reske, Christoph. Die Produktion der Schedelschen Weltchronik in Nürnberg. Wiesbaden: Harrassowitz, 2000.
Ryder, John. The Case for Legibility. London: The Moretus Press, The Bodley Head Ltd, 1979.
Rücker, Elisabeth. Hartmann Schedels Weltchronik. Das größte Buchunternehmen der Dürerzeit. Munich: Prestel, 1988.
Schneider, Ulrich Johannes, editor. Textkünste. Buchrevolution um 1500. Darmstadt: Philipp von Zabern Verlag, 2016a. There is also a French version (Schneider 2016b).
Schneider, Ulrich Johannes, editor. Les arts du texte. La révolution du livre autour du 1500. Lyon: Bibliothèque municipale de Lyon, 2016b.
Smith, Margaret M. “Le blanc aldin” and the paragraph mark. In: Jones, William Jervis, editor, ‘Vir ingenio mirandus’. Studies Presented to John L. Flood, volume 2, pp. 537–557. Göppingen: Kümmerle, 2003.
Tanselle, G. Thomas. Descriptive bibliography and library cataloguing. In: Bowers, Fredson, editor, Studies in Bibliography, volume 30, pp. 1–56. Charlottesville: University Press of Virginia, 1977.
Cornelia Bohn
Documenting as a Communicative Form
What Makes a Document a Document—a Sociological Perspective
1 Introduction: The Digital Complex
It is a much-discussed question whether we can speak of a digitalization of society analogously to the literalization of sections of world society that has been institutionally driven since 1800. Empirical indications would include, first, the digital media competence of the general population, which is accompanied by the expectation that this cultural technique can be assumed in potential addressees and institutional processes and has become a matter of routine. Digital literacy presents itself in degrees, just as literality did in the nineteenth century. Second, analogously to the “pasteurization of France” described by Latour (1993), an increasing “colonization” of the social world by screens, digitized material, algorithms, and digital expertise may be significant. The concomitant protest movements—which, it should be stressed, reject not the technological devices but their unreflected use—are also part of this picture. Even beyond such evidence that can be empirically registered, however, it is plausible to assume that the digital complex has become indispensable to contemporary society, that it will persist, and that it had already become established before we found a name for it. “We do not see digitalization,” as Nassehi puts it; “instead, central fields of society are already seeing digitally. Digitality is one of the crucial ways society has of referring to itself.”¹ And he uses the structure of society itself to make this conclusion plausible, identifying a structural similarity between digital technology—the data structure, not the data quantity—and binary-coded social subsystems. The far-reaching operative parallel lies in the fact that both combine a simple basic coding with a practically unlimited variety of forms of possible connection. It is precisely these combinations of simplicity and a diversity of forms that—so Nassehi argues—manifest themselves in the proliferation of choices that is characteristic of modernity. In contrast to Simmel, for
1 “Wir sehen nicht Digitalisierung, sondern zentrale Bereiche der Gesellschaft sehen bereits digital. Digitalität ist einer der entscheidenden Selbstbezüge der Gesellschaft” (Nassehi 2019, 29); the formulation is modeled on Cassirer’s observation that we see objectively rather than seeing objects (Nassehi 2019, 88).
Cornelia Bohn, University of Lucerne
https://doi.org/10.1515/9783110780888-005
84 | Cornelia Bohn instance, who grounds this in the medium of money and the differentiation of society into social circles, the systems-theoretical explanation of increasing freedoms and proliferation of choices takes as its starting point the increasing complexity of society, in response to which structural reconfiguration and structure-building take place in such a way that freedom and contingency and the resultant diversity of forms become integrated into the structure. From the perspective of social theory it can therefore be assumed that the digitalization of society is not taking place as a general transformation of semantics and forms of societal differentiation of the kind that occurred synchronically with printing, the monetization of the economy, the positivization of law, and the self-referentiality of modern art and of modern individuality. Instead, it is embedding itself in the factually differentiated structure of society, adding, however, a new form of communication to the existing ones, which it thereby also changes. Metamorphoses of the social do not—it will be assumed in what follows— present themselves as the displacement of one form by another, even if the introduction of new media is often accompanied by a disruptive or disparaging rhetoric. Media do not substitute one another; instead, while seeming to replace one another, media and forms of social communication change by selectively reaching back and looking ahead, by opening up new choices and potentialities. Thus, the introduction of writing into communication also changed spoken language; this is reflected in the concept of konzeptionelle Schriftlichkeit (“conceptual literacy”; Raible 2001). It also, through written documentation, multiplied the potential connective and combinatorial possibilities for actualized meaning in social communication and interaction, and has created, with print culture and screen culture, infrastructures in the form of libraries, archives, files, registries, databases, metadata, platforms, or clouds. Modalization of the understanding of reality can be seen as the crucial modification of social semantics that accompanies the textualized Darstellung of reality.² This entails a narrowing of what is considered necessary and impossible in communication and a widening of what is treated as contingent reality. Again and again, categorial possibilities of one medium are realized by another: thus, for instance, in visual Darstellung, instantaneousness and contingency were first realized in nineteenth-century painting, even though it was only with the visual medium of photography that they became technologically possible.³
2 As will become apparent, the concept of Darstellung used in this chapter cannot be translated with a single word in English. The German noun is thus retained; the verb darstellen is rendered as “to display,” the adjective darstellbar as “displayable.” 3 On the textualization of society, see Bohn (2013), Luhmann (2012, esp. 150 ff.); on the relationship of painting and photography, see Imdahl (1996).
An increasing normalization of remote synchronization in the subsystems of society is, I assume in what follows, part of the structural conditions that both evoke and are evoked by digitalization. The semantic conditions for this include— according to my hypothesis—an erosion of linear semantics of time, as is articulated in the modern sciences and modern art; the operative social practices include a differentiated written, numerical, visual, and audiovisual self-documentation of society. The digital complex is generally treated in three dimensions in the literature. First, the technicity of computing machines (computation) that process complex sets of data on the basis of binary number codes. The interplay between the digital data form and the volume of data and the speed of operation, weighted in different ways and frequently drawn together in the opaque expression “big data,” is emphasized. The manifold ways of and unlimited possibilities for connecting and recombining data once they have been generated, leaving behind permanent, unerasable data traces and making possible unpredictable reuse and second and further uses, are noted; or the actualizing rediscovery and linking of data by relevance-generating, algorithmically programmed search engines is thematized. Second, an operative-communicative practice that can take place both in the mode of simultaneity when spatial separation pertains and in the mode of temporally displaced operation. This enables, as a matter of routine, remote synchronization and the ability to reach absent addressees, both temporally displaced and in real time, as well as real-time monitoring and the reintegration of documents into courses of action in situ. Third, a semantic dimension, inscribed in the medium, that presents itself as a space of the possible for forms of knowledge, action, experience, and communication. This involves communication routines, epistemological challenges of category-formation when observing society, and phenomena of apophenia as a mode of meaning-production in the sense of a simultaneity of perception that leads to the discovery of patterns in chance data; methods for observing and describing these forms are also included here. Studies of the information-theoretical, social, and cultural significance of digital media are undertaken with a varying focus on all three of these levels— technological, communicative, medial—without overlooking the fact that these dimensions, although they are given separate analytic treatment depending on the scientific/disciplinary question addressed in any given case, stand in a mutually constitutive relationship to one another.⁴
4 On the formation of sociological categories in the light of digital data traces, see Cardon (2015). On information-theoretical distinctions between database, datastream, timeline, and media analytics, see Manovich (2012, 2018). On data infrastructures, see Kitchin (2014). On data traces, see Latour (2007). On real-time visual monitoring, see Ammon (2016). For critical engagement with the practical domain’s naive thesis regarding the end of theory, see Kelly (2008); Boellstorff (2013); Crawford et al. (2014); and many others. On causality and correlation, with an apologetic slant, see Cowls and Schroeder (2015). On apophenia, see Stäheli (2021, 71 ff.), who interprets it in communication-theoretical terms—with a term borrowed from psychology—as Desaktualisierungsschwäche.
The pluridisciplinary research group that has given itself the name Roger T. Pédauque—which symbolizes distributive authorship rather than standing for a real person—takes a specific phenomenon as its starting point and asks how documents change in this always ongoing process of digitalization (Pédauque 2006, and Pédauque 2022 in the present volume). These inquiries do not pursue an empirical survey; instead, they are concerned with outlining a conceptual framework that makes it possible to register correspondences and differences in the technological, communicative, and medial nature of documents and their use in traditional (D1) and digital (D2) media. The analysis concentrates on formats and the social use of documents. The model proposed starts from the three aspects described above in order to describe the unity of documents: form, sign, and medium are distinguished as three levels of analysis, to which the French terms vu, lu, and su are assigned. They refer to the technological-material form of documents and the practice of perception (vu), to their semiotic nature and the practice of reading signs as understanding (lu), and to the media status of documents as knowledge-imparting elements (su) in social communication. The model focuses on the reception perspective and is shaped both by analytical tools of linguistic theory (syntax, semantics, pragmatics) and by ideas from semiotics. At the same time, the authors stress that they are not just pinning down changes in the “objects” but are also concerned with the equally rapid “evolution” of epistemologies and semantics. In what follows, I would like to take up the terms “medium” and “form,” but to use and modify them in the sense of Luhmann’s event-based social theory in order to propose a sociological description of the documental that makes it possible to cover morphogeneses. To make clear the sociological extent of the phenomenon, I will first introduce semantics of documenting that have developed in such very different fields as organizations and institutional administrations, the modern sciences, publishing and the dissemination of knowledge, the identification of individuals, or modern art. In the next step, which constitutes the main part of the chapter, a theoretical frame is set out that conceives documenting as a communicative form in the medium of Darstellung. To this end, two theoretical perspectives are developed and related to each other. First, some premises of event-based social theories are outlined.
On this basis—and this is the second theoretical component—a new media type is added to media and communication theory: the media of Darstellung, which, however, can only take shape through the establishment of forms; this also includes documenting as a communicative form. This is illustrated using forms of documenting that are described prominently in the sociological literature as examples.
2 Semantics and Practices of Documenting
Reconstructions of the historical semantics and social practice of documenting agree that documenting became established as a pertinent form of communication in the course of the textualization of society and the rise of print culture. Practices of documenting are bound up with the emergence of modern organizations and European state bureaucracy, and equally with the development of modern scholarly publishing (see Lund 2009; Lund and Skare 2010; Blair et al. 2021, 413 ff.). This relationship is described prominently in Max Weber’s analysis of bureaucratic rule. Rulings and agreements that had previously been made orally became the preserve of written records and registries, among other things to ensure their continuity and enforceability. File-based administration was a constitutive component of the bureaucratic rationality of modern rulership. Organizational sociology speaks of the formal structures of organizations; their smooth functioning—Weber argued—put an end to subjective arbitrariness and intransparency, and was a result of the textualization of society. “Administrative acts, decisions, and rules are formulated and recorded in writing, even in cases where oral discussion is the rule or is even mandatory” (Weber 1978, 219).⁵ Even interactions or telephone calls only became part of administrative reality in the form of written records. Files were meant to ensure the tangible documentation and explainability of states of affairs without breaks or interruptions. To cope with the constantly growing mountains of files that resulted, classifications were introduced to regulate the destruction of
5 And he continues: “This applies at least to preliminary discussions and proposals, to final decisions, and to all sorts of orders and rules.” This ideal of rationality has long since been contested in organizational sociology. Important analyses of how organizations handle information can be found in Feldman and March (1981), who use the concept of signaling; Schwarzkopf (2020) emphasizes the significance of ignorance where the proliferation of information in organizations is concerned, and attributes this to digitalization. On digital infrastructure in organizations, using the example of integrative forms of complex digital information/process-management systems (OneSolution as a transinstitutional cloud solution) and their flexible recombinability, for instance with forms of electronic file, see Büchner (2018).
88 | Cornelia Bohn “dispensable” files. The fact that, in the form of registries, new files were created about files that had been discarded is—as Cornelia Vismann puts it—part of the “paradox of […] writing as the basic rule of bureaucracy.”⁶ If we follow the analysis of the Transactions of the Royal Society that Bazerman presented for the period from 1665 to 1800 (see Bazerman 1988, esp. chs. 3 and 5), the experiments covered in the Transactions were initially performed in public before the membership gathered at the society’s regular meetings. The truth of the experiments was guaranteed by the fact that they could be witnessed in public. The written reports assumed, as it were, the memory of the witnesses who had seen everything. Once the presence of the public sphere ceased to be the norm, the Darstellung had to be so clear that it allowed those who were not there to get a picture of the course of the experiments. It was perfectly possible for “to publish” and “to document” to be used synonymously around the turn from the eighteenth to the nineteenth century. Writing down scientific experiments or political-administrative decisions was meant to ensure that they could be followed by people who were not present or when used later. The practice and self-definition of documenting had become enriched with semantic aspects of knowledge-transmission, of truthfulness, of the ability of states of affairs and events to be confirmed and digested (administrative decisions, scientific experiments, payments or property transactions, identifying features of individuals), thus allowing them to be fixed factually, to be made accessible in social and temporal terms—to people who were not present and at later points in time respectively. The illusion of validity, continuity, completeness, persistence, and retrospectiveness was tied to the written form. At the same time, however, the written mode also saw the establishment of an awareness that writing, images, and other visual, numerical, or acoustic forms of Darstellung should be considered genuine modes of communication and action with a reality-generating potential.⁷ 6 Thus the media-theoretically informed analysis of Vismann (2008, 126 f.): “It appears that nothing on file can ever really disappear. It leaves a trace, be it only in the shape of a registered gap.” 7 This ambivalence is apparent in Simmel (1950, 352); in his excursus on written communication, he credited “the written form” with enacting a cultural objectification, as well as with having an objective determinateness and ambiguity. Unlike transactions in front of witnesses, he argued, written documentation socially encompasses a potentially unlimited publicity; as an intellectual objectification, has a “timeless validity.” Sombart’s (1987, 120) double-entry bookkeeping is a further example of the reality-generating potential of a form of documentation. He credited—as is well known—double-entry bookkeeping with a catalytic significance for the development of the monetization and calculability of the modern economy. His central argument in our context is that by means of its tabular, multicolumn form it makes the division of incomings and outgoings (loi digraphique: débit/crédit) transparent and for the first time allows the profit of a business
to be truly calculated because it makes it possible “to follow the unbroken flow of capital in an enterprise. […] It is obvious how much calculability was bound to be furthered by double-entry bookkeeping. It knows of no economic events that are not in the books: quod non est in libris, non est in mundo. And only something that can be expressed with a monetary sum can enter the books. Monetary sums, though, are displayed [my emphasis] only in figures, so every economic event has to correspond to a figure, and so to be economically active is to calculate” (“den lückenlosen Kreislauf des Kapitals in einer Unternehmung zu verfolgen. […] Wie sehr die Rechenhaftigkeit durch die doppelte Buchhaltung gefördert werden mußte, liegt auf der Hand: diese kennt keine wirtschaftlichen Vorgänge, die nicht in den Büchern stehen: quod non est in libris, non est in mundo; in die Bücher kommen kann aber nur etwas, das durch einen Geldbetrag ausgedrückt werden kann. Geldbeträge aber werden nur in Ziffern dargestellt [my emphasis], also muss jeder wirtschaftliche Vorgang einer Ziffer entsprechen, also heißt Wirtschaften Rechnen”). See below on the concept of Darstellung. For a brief outline of double-entry bookkeeping and the concept of accounting in terms of historical semantics, see also Blair et al. (2021, 287 ff.).
The restriction of the semantics of documenting to written material was only temporary. Diderot and d’Alembert’s encyclopedia, as a lexicon-like form of documenting knowledge, already consisted equally of image plates and texts. The modern European territorial states of the nineteenth century transitioned to identifying their citizens with identity papers. Whereas the wanted posters (Steckbriefe) or lists of thieves and vagrants of early modernity contained only descriptions of wanted individuals (Signalements) and purely numerical registrations, the development of photography in the nineteenth century made a crucial contribution to the modern passport system. At the beginning of the twentieth century, identity documents became a global standard; at its close they were transferred worldwide to digitalized biometric processes. It was no longer foreign status, illness, or deviance that was meant to be documented, as in the origins of this document type, but the uniqueness and distinctiveness, grounded in the modern semantics of individuality, of the individual as a means of identifying and addressing them. At the same time—through the belonging to nations, for instance, that was also recorded in these documents—population categories and social entities were constructed, such as the concept of a constitutive people that is integral to modern statehood.⁸ Mass writing around 1800 brought with it the problem that it was increasingly difficult to keep track of the sheer number of publications that resulted from new institutions such as reviewing or canon-formation. The flood of publications also brought with it new professions and accompanying fields of knowledge that can, from a modern perspective, be seen as the beginnings of the information sciences (thus, e.g., Rayward 1997; see also Dommann 2008). They were concerned no longer with collecting and preserving bodies of knowledge as completely as
8 On the identification and registration of individuals and the accompanying documentation, see Bertillon (1893); Galton (1965); Blauert and Wiebel (2001); Bohn (2006); Caplan and Torpey (2018).
90 | Cornelia Bohn possible but with analyzing and displaying them by assembling metadata in the form of activities that are now synonymous with documenting: systematizing, classifying, indexing, cataloging, decontextualizing, breaking down, recombining, making visible internal reference structures—all as forms of making knowledge accessible. In this way, library cataloging systems with their internal reference structures were organized, local as they were, more and more on the basis of international standards; with the digitalization of the early twenty-first century, they are being replaced by search engines with translocal and transtemporal modes of access and reference structures. In parallel to this infrastructure, the human and social sciences developed a non-linear research method that Abbott calls “library research.”⁹ At the end of the nineteenth century, the semantics of documenting was extended to every form of materially fixed knowledge that lent itself to consultation or study, or served as proof of identity or credibility; the institutionally grounded modern use of patents is also part of this (see Buckland 1997, 805).¹⁰ The development of new procedures for recording and imaging, such as photography, film, sound recordings, microfilm, and photocopying, all the way to digital forms of Darstellung, led to a plurality and hybridization of forms of documentation, which went hand-in-hand with a diffusion of the semantics of documentation into further social subsystems.
9 Abbott (2011, 43): “By library research I mean those academic disciplines that take as their data material which is recorded and deposited. Throughout the period here investigated, that deposit took place in libraries or archival repositories. In practice, the library research disciplines include the research branches of the humanities and a substantial portion of the social sciences: study of the various languages and literatures, philosophy, musicology, art history, classics, and history, as well as extensive parts of linguistics, anthropology, sociology, and political science.” He also notes changes that accompany digitalization, for instance in digital indexing, which is increasingly organized by words (dominant on the Internet and suitable for search engines), instead of by terms and concepts (dominant in print culture) that an informed reader and user can handle. On active and passive indexes, see Abbott (2014, 37 ff.). 10 For modern patents, see Hemmungs Wirtén (2019, 579), who in this context identifies a documentation movement in France at the beginning of the twentieth century: “The ‘documentation movement’ refers to the networks and alliances that resulted from the establishment of Otlet and La Fontaine’s Institut International de Bibliographie (IIB) in 1895 and that reached its pinnacle at the 1937 Congrès Mondial de la documentation Universelle (CDMU)” (see also La Fontaine and Otlet 1895). On the use of patents in the German Empire, see Seckelmann (2008). In parallel to patents for technological knowledge and scientific discoveries (which are time-limited), copyright emerges for documented intellectual property in art and scholarship (the use of which needs to be acknowledged no matter how much later it occurs). For a media-historically informed history of copyright as a legal norm, see Dommann (2014).
Modern art initially distinguished documenting from fiction. With the emergence of the documentary film in the 1930s, documenting itself developed into an artistic genre that differed explicitly from simple reporting because of its aesthetic principles of composition (see Wöhrer 2015a,b). In contemporary art—by which I mean not art that is currently being produced but a distinct artistic position that has emerged since the second half of the twentieth century—art documentation is itself advancing into an innovative artistic form. This involves art documentation as artistic practice, as curatorial practice, and the self-description of art, which in a constant process of transgression changes what art and artworks are. In the artistic positions of contemporary art, documenting becomes at once the object of art and a form of artistic Darstellung—in Husserl’s (1966) terms, noesis and noema. These artistic positions realize digitalization by using forms of digital Darstellung, reflecting on categorial possibilities of the digital and acting operatively and thematically in the global space. The Forensic Architecture artistic position operates at the intersection of art, forensic investigations, and activism. Synthetic images, interactive platforms, and digital simulations are used here to document precisely dated world events, to comment on official accounts and correct them, present them in a uniform format, and exhibit them in a constantly updated digital archive. Other positions integrate GPS visualizations generated for the artwork and other forms of geotagging into the artistic Darstellung. The use of software developed specifically for a particular art project underlines further the inherent selectivity of the documentary form of Darstellung that is reflected in contemporary art (see Weizman 2017).¹¹ These project-oriented artistic positions have in common the fact that documentation itself becomes an exhibit. Curatorial practice experiments with these artistic positions in digital exhibition formats. Their innovative potential lies, alongside new interactive exhibits, in global access to local exhibitions and the accompanying proliferation of interpretive points of contact, and thus in the dissolution of (Western) canonizations; and also in the self-archiving of digital exhibitions. It is only the curation of online archives, however, that endows them with a documentary character; this also includes archive stories of worldwide “visitors,” and the documenting and encoding of them by means of metadata.¹²
11 Forensic Architecture: https://forensic-architecture.org/; Esther Polak and Ieva Auzina: the Milk Project: https://milkproject.net/en/, Nomadic Milk: https://www.ediblegeography.com/nomadic-milk/; MIT Senseable City Lab: TrashTrack: https://senseable.mit.edu/trashtrack/ (31.05.2021). I am grateful to Inge Hinterwaldner for the latter two references.
12 As an example: online talk “Wissensort Museum?” Kunstmuseum Basel https://kunstmuseumbasel.ch/de/programm/themen/wissensort-museum (12.03.2021). Labels such as “documentary turn” or “neo-documentary turn” refer to artistic practice, curatorial practice, and the self-descriptions of art.
In the manner of a disruptive movement within art, the dominance of the very concept of the artwork as fini in the sense of the material unity of a perceivable, haptic “object” is at stake in these and other positions of contemporary art. The place of the closed, completed “work” is taken by an extended concept of the artwork as a sequence of events that includes conceiving, designing, producing, experiencing, and documenting, and that also documents the work’s own open-endedness. This takes place in a mode of Darstellung that reflects on the producing, reality-generating dimension of Darstellung just as much as on the undisplayability inherent to it (see Bohn 2021; Kester 2011).
3 Documenting as a Communicative Form in the Medium of Darstellung
I assume in what follows that, independently of the document type, the communicative form of documenting responds to the same general problem: that of making knowledge and socially relevant states of affairs of the most varied kinds socially displayable and, in their recorded forms, socially visible and reusable beyond their immediate spatiotemporal context; this includes their ability to be found and to enter into new connections in new horizons of reference and chains of action. The Darstellung of many different events can be involved, such as scholarly insights, medical findings, computed tomography, X-rays, administrative decisions, architectural designs, the documentation of artistic events or payments, and much more. There is always, I argue, a productive element to all these forms of Darstellung. Digital documenting allows such producing and making accessible uno actu in real time and in quick succession, with the consequence of a new gradation of material stability and instability in the sense of Katherine Hayles.¹³ It thereby acquires the operative advantages of remote synchronization in the guise of accessibility that is not dependent on a particular location and the possibility for the documentation to be simultaneously reintroduced into the current situation. Real-time documentation also reveals a principle of documenting that calls into question the illusion, bound up with print culture, of the persistence, retrospective
13 With Hayles (2003), I strongly dispute the print = stable/digital = unstable opposition. She assumes different levels of materiality/immateriality in the transposition relationships between print media and digital media. She proposes the concept of the “technotext” for digital text genesis, which in practice generally involves a combination of digital and analog.
quality, and referentiality of documenting,¹⁴ correcting it with inference, simultaneity, and self-reference. Documenting is then not the retrospective recording of an act that has already taken place; instead, the documentation of a financial transaction is itself the payment that generates further possible payments; the publication of a new insight is itself the insight for the scholarly communication that follows—it can be criticized, developed further, or revised. Notes, sketches, or models in architectural, artistic, or scientific design and Darstellung processes are at the same time documents insofar as they have become a point of reference for a design, artwork, or insight in the present of the communicative operation in any given case. As Herbert Simon puts it for the sciences of the artificial: “At each stage in the design process, the partial design reflected in these documents serves as a major stimulus for suggesting to the designer what he should attend to next” (Simon 1996, 92).¹⁵ My remarks are based on a non-essentialist epistemology of event-based social theories that is neither concerned with the content or nature of documents, nor seeks demarcation criteria in order to define them with, say, the text/work/ document distinction or in terms of a combination whereby text + annotation = document.¹⁶ It is not Dinghaftigkeit and properties, or having textual form and possible additions or deviations, that are my points of reference; instead, I use the theoretical resources of event- and difference-based social theories to approach the way in which documenting functions communicatively. The systems-theoretical variant of difference-based forms of social theory used here takes as its starting point actions and instances of communication as eventful temporalized basic elements of the social.¹⁷ This means, first, that these elements owe their identity and meaning not to substantive content or the intention of acting parties, but rather to their factual
14 Ethnomethodological criticism of documenting in writing as a purely surface phenomenon makes much of this illusion, but it rests on the implausible assumption that writing is—as Garfinkel holds—an epiphenomenon relative to situated interaction and that it is only interaction that is event-based; see Garfinkel (1967). 15 He continues: “This direction to new sub goals permits in turn new information to be extracted from memory and reference sources and another step to be taken toward the development of the design.” See also Latour’s (1988) “inscriptions.” On the design process, see also Hinterwaldner (2017); Ammon and Hinterwaldner (2017); from an interactional perspective Mondada (2012). 16 This and similar proposals can be found in the literature; see Hayles (2003); Skare (2009); Lund (2010). The non-essentialist assumption also applies to Briet’s (1951, 3) much-discussed antelope. Briet’s solution is to think in terms of being cataloged; only the cataloged antelope functions as a document, unlike the wild version. 17 Luhmann (1995), Social Systems, can be read as a reference work for this paradigm shift in systems theory.
94 | Cornelia Bohn linking and temporal positioning in a subsequent series of communicative events. It is, then, only connectedness and linkage with other events that give actions and artifacts their meaning, which can be different depending on the subsequent events that actualize it in any given case. In the theory of language as difference, this recognition that the genesis of meaning is inherently temporal is conceived as the logical primacy of the inferential connection between linguistic sign and linguistic sign over referentiality. Saussure speaks of momentary values, valeurs momentanées, and emphasizes that the value of a linguistic sign in its identity thus results “from the forms which from one moment to the next surround it” (de Saussure 2006, 164).¹⁸ A second implication of event-based forms of social theory is that the same message, the same communicative artifact, acquires different meanings depending on the series of communicative events in which it is actualized. A piece of art documentation can, as an exhibit, be an element of artistic communication in the art-experience of the observer. As a loan from a collector, it is also an object, but not a document, of insurance agreements that are in turn documented in a contract. A third fundamental insight of event-based forms of social theory lies in the fact that everything that happens, happens in the present in any given case, and happens simultaneously—that communication, even in textualized, printed, or digitized form, remains an event bound to a point in time. This counterintuitive insight is relevant to documentary forms and their illusionary connection to what is “past.”¹⁹ Documenting, in the sense employed here, thus means a connecting operation that involves administrative processes, financial transactions, advances in knowledge, medical diagnostics, architectural design, and artworks and art-experience. As the semantic reconstructions have shown, however, the mere recording of an event is not itself a documentary act. Documenting is instead a highly selective form of communication that also imparts its own selection criteria and is therefore selfreflexive. The documentary character of accounts or artifacts is thus determined
18 Saussure’s theory of language as difference set in motion—as is well known—the deontologization of linguistic identities. On the relationship between difference, identity, and synchronism in Saussure, and the new interpretation of Saussure’s theory of language on the basis of recently edited Nachlass finds, see Jäger (2003), who emphasizes that Saussure conceived the concept of reference against the background of a theory of the inferential linking of sign to sign; see also Jäger (2020). The assumption of a logical primacy of inference over reference can also be found in inferential semantics, albeit as the foundation of a normative pragmatics in Brandom (2000, chs. 1–2). 19 This idea is already present in Mead (1929); see also below on Foucault’s archaeological model.
not by their nature but rather by the links into which they enter in communication. But how can this communicative and connectionist notion of documenting be further grasped and situated theoretically? I would like to elucidate the concept by adding a new media type to Luhmann’s social-theoretically founded theory of media and communication. That theory rests on two parts, the theory of communication and the medium/form theory, that are positioned in parallel and complement each other. Their parallel presence in his later writings turned out to be highly productive; the openness in recombining theory-elements and questions allows numerous new phenomena to be addressed.²⁰ In contrast to information-theoretical models of communication that conceive of media as noise-free transmission channels, at stake here is not the communicative imparting of meaning but its communicative production. This also means that medial morphogeneses change the nature of communication itself; this applies equally to its cultural significance and operative enactments.²¹ Both theoretical components, the theory of communication and the medium/form theorem, contribute to the analysis of social morphogeneses. Communication theory in the sense employed here can reconstruct and analyze how solutions to problems give rise to the genesis of new problems. Its dynamism as a theory stems from the fact that communicative routines do not always function normally: it is concerned with the problems to which the establishment of new media provides solutions. The analytical potential of medium/form theory is different. Its explanatory power as a theory lies in showing how an infinite variety of forms can be generated from a limited number of medial substrates, a variety that, once materialized, constrains not the variety of the forms but the possibilities 20 This becomes clear in Luhmann’s texts on art in particular; Luhmann (2000, 2008). The theoretical elements were introduced to his theoretical architecture at very different times. He developed communication theory as an independent theoretical element in the 1960s. The 1980s saw him introduce his media/form theory as a selective combination and reformulation based on the traditional metaphysical matter/form conceptual framework and Heider’s (2005) perceptiontheory theorem of “thing and medium” (“Ding und Medium”), which Luhmann transforms into the distinction-theoretical figure of medium/form and extends for the analysis of media of communication. Somewhat overemphasizing his distance from the matter/form tradition, Luhmann determines that the medium/form distinction can permit self-reference on both sides, whereas this was traditionally reserved for form (Luhmann 2008, 124). Communication and media theory is a central pillar in the overall structure of Luhmann’s social theory, which has numerous links with the theory of society and with the underlying systems theory. This context cannot be covered here but is apparent in Luhmann’s individual monographs (Luhmann 1995, ch. 4; Luhmann 1990; Luhmann 2000, ch. 3; Luhmann 2012, ch. 2). I likewise pass over a reconstruction of the extensive discussion of Luhmann’s communication and media theory; on this, see Mü ller (2012). 21 For operative, structural, and cultural consequences of writing, see Bohn (2013); Luhmann (2012, 150 ff., and passim).
96 | Cornelia Bohn for combining the shared elements or medial substrates. The medium/form distinction thus makes it possible to show how, once media have become established, structured complexity develops in them simultaneously with, and only through, the emergence of forms, and that is it through forms that media are able to become visible and operative in the first place. Communication theory takes the problem of bridging the divergence between alter and ego as its starting point and has to date identified three defining problems that should be recalled briefly. (1) The consciousnesses of alter and ego are not mutually transparent; media of understanding—Luhmann places language here— respond to this. As media of distance and dissemination, writing, print, and, later, telecommunications make (2) the problem of reaching absent addressees clear by solving it. These solutions lead (3) to a new problem in the increasing likelihood of rejection through the “no” of language and the institutionalization of the capacity for criticism with writing and print, in other words an increased risk that communication will not pay off in the sense of being adopted as the premise for further action. Symbolically generalized media of communication such as money, power, love, and truth respond to this. The method of analyzing medial metamorphoses by identifying social problems that lie behind solutions that have been found always also conveys the contingency of those solutions, the awareness that the solution not only generates new problems but could also itself have been different. I propose to add the media of Darstellung as a new type to the media that have already been identified; their specific way of bridging the divergence between alter and ego consists of making socially apparent through Darstellung and making those Darstellungen accessible. “Making apparent” is to be understood here not in terms of the theory of perception, as a visual act of sense perception, but in terms of communication and Darstellung theory, in a sense to be elaborated in what follows. Whereas the media development that has been outlined is oriented on the ability of language to negate and its binary coding of yes/no, acceptance/rejection, which reappear as binary codes in the media of power, money, and truth (to pay/not to pay, true/not true), media of Darstellung include linguistic, written, iconicvisual, diagrammatic, computed, or embodied states of affairs, or sound, and their messages. They thereby correct the linguistic bias of the theory in favor of a multitude of forms of Darstellung and their communicative employment in the genesis of social realities (see Bohn 2017, esp. ch. 1); this multimodal variety is also apparent in the documental form. Media of Darstellung are indispensable for diachronic media developments because they allow the construction of the by no means monomedial semantic spaces in which symbolically generalized media such as money, power, love, truth, art can develop. The concept of media of Darstellung allows two at first glance heterogeneous aspects of the semantics of Darstellung to be drawn together. First, a dimension
of accountability, prominently stressed in the ethnomethodological literature— making something displayable, reportable, and thus observable, followable, and ratifiable for others—foregrounds the social dimension. Accounts in this sense unfold in a self-reflexive form; this means that they exist insofar as they are returned to, and are thus part of a constant process of renewal; as Garfinkel puts it, “such practices consist of an endless, ongoing, contingent accomplishment.”²² A second dimension presents itself in the historical semantics of concepts of Darstellung and their metamorphosis in modernity; the material dimension is foregrounded here. Classical semantics of linguistic, narrative, or visual Darstellung shared the assumption that the transparency of the Darstellung provided the true representation; this is reflected in the ideal of imitative invention in the mimetic program of traditional art. In the theoretical discourse of modern representation from the Renaissance to Impressionism, starting with Alberti, the painting was treated as an open window on the world (Marin 1994, 385).²³ Classical theories of language—still represented in the Port-Royal Logic in the seventeenth century—assumed a direct transposition of ideas into language, and of spoken language into writing, as an equivalence without loss or gain. These are concepts that were eroded away in stages with the increasing reflection on the capacity for narrative, visual, or epistemic Darstellung itself. With modern art and the modern sciences, a knowledge of the most innate properties of Darstellung itself became established; the concept of the referential ideality of Darstellung gave way to a concept of Darstellung as an authentic action that was considered an act of producing. Whereas classical poetics were based on the difference between action and Darstellung, theories of literature at the end of the eighteenth century frustrated this distinction by declaring literary Darstellung itself an authentic action. Instead of just having action as its object, it is itself action, as has been shown paradig-
22 Garfinkel (1967, 1): "When I speak of accountable […] I mean observable-and-reportable, i.e. available to members as situated practices of looking-and-telling." More recent research has refined and extended the concept. Neyland and Coopmans (2014, 2) coined the concept of visual accountability and showed how in organizations visual documents are drawn into the distribution of accountability relationships by self-reflexive gestures of inquiry: "The import of working towards a repertoire for studying visual evidence in conjunction with the distribution of accountabilities […] lies in recognizing how forms of visual evidence such as photographs, video records, medical images and visualizations produced with scientific instrumentation ‘enter into accountability relationships’ just like output charts, budget reports and other numerical accounts." 23 There is an extensive discussion here of the classical concept of Darstellung in painting and the stages in the gradual erosion of this semantics.
with reference to theories of literature; there, Darstellung functions as "irreducible action" and not as the enactment of an invention that precedes it.²⁴ Geometry and the development of the modern experimental natural sciences also stimulated further reflection on the concept of Darstellung. In Kant, we find the conception of Darstellung as a self-reflexive process that is held to be beyond Vorstellung, to be neither representation nor presentation. The modern experimental-empirical natural sciences ultimately became a provocation and point of reference for the formation of philosophical concepts and semantic reflection on the insight-generating potential of Darstellung.²⁵ The semantics of Darstellung were fashionable around 1800 and became the distinguishing feature par excellence of modern art, whose self-description no longer circles around rules and programs of Darstellung but rather declares the new program of making the conditions and means of Darstellung itself the object of Darstellung.²⁶ Finally—the way having been prepared by Kant—the boundary experience of Darstellung itself becomes its own object in romanticism, and the "paradoxical task of achieving the impossible Darstellung" is reflected in Schlegel as a constant "struggle to display the undisplayable."²⁷ What, then, does it mean if we describe the document as a form in the medium of Darstellung thus characterized? After all, it is only the possibility space in which documenting as a communicative form can be an option at all that is described with the medium of Darstellung. It is therefore worthwhile recalling again some crucial insights related to the medium/form distinction. First, medium and form are, in the distinction-theoretical sense used here, to be grasped only as difference, each only in its relationship to the other. They
24 Menninghaus (1994, 211) showed this for Klopstock (quotation: "irreduzible Handlung"); see also Klopstock (1989, 157 ff., 166 ff.). For Herder's and Klopstock's concepts of Darstellung, see Mülder-Bach (1998, 76, 179 ff.). 25 On geometrical Darstellung in Kant, see Schubbach (2017a); on the significance of modern experimental chemistry for the formation of philosophical concepts in Hegel, see Schubbach (2017b). On the Darstellung of chemical experiments in scientific textbooks, see Knight (2016, 144): "A classic work here was Faraday's Chemical Manipulation (1827); […] this did not instruct students in the principles of chemistry and the properties of things, but only in how to perform experiments." 26 Thus a characterization of modern art in Danto (1997, 28)—which serves as a typical example. 27 "paradoxe Aufgabe, die unmögliche Darstellung dennoch zu leisten"; "Kampf das Undarstellbare darzustellen"; Schlegel (1981, 241) quoted following Menninghaus (1994, 217). From a contemporary perspective, this semantics can be interpreted as a—largely ignored, of course—anticipation of the linguistic "to say something is to do something" of speech act theory (see Austin 2003), and of the discussions of performativity beyond linguistic utterances in the last third of the twentieth century, including the performativity paradigm in the sociology of knowledge (see in summary Gertenbach 2020).
are synchronously present and therefore always have to be actualized simultaneously. No medium is unformed. Media are possibility spaces for the development of highly varied forms; at the same time, no form can implant itself in any medium regardless of what it is. Acoustic sound cannot "choose" light as a medium, financial transactions cannot "choose" truth, and linguistic messages need words and not molecules. Words and pitches are forms in the acoustic medium. At the same time, words, like written characters, are forms in the medium of language; but written characters can also be observed as forms in the visual medium. Every form—it could be said—makes use of an already present stock of forms as a medium. This implies that in operative use, forms can become media for the development of further forms: letters for words, words for sentences, sentences for texts. The medium/form distinction, as Luhmann puts it, "spares us the search for ‘ultimate elements,’ which nuclear metaphysics [sic] à la Heisenberg tells us do not exist anyway" (Luhmann 2012, 116).²⁸ Second, media are intangible; only as forms can they become operative, recognizable, and experienceable. The possibilities of a medium, however, do not become exhausted in the always ongoing process of form development; instead, they are regenerated by being used. As potentiality for the development of forms, media do indeed remain intangible—yet also (temporally) more stable than forms. As forms in a medium, concrete forms become visible, able to assert themselves, and yet operatively more unstable and liable to break down. But because the medium is realized precisely in the constitution of forms, the stability of media rests on the instability of forms (see Luhmann 2000, 129 and passim). They take shape through the bringing together and coalescence of elements as events that are available to the medium relatively independently of one another.²⁹ Drawings, photographs, forms, notes, and texts as elements of the medium of Darstellung can coalesce into documental forms. Through selective consultation, library holdings
28 Heisenberg's "objective indeterminacy" leads, as Bachelard (1988, 122) puts it, to awareness of an unavoidable interference between object and method. This can also be applied to the relationship between medium and form, which mutually determine each other and cannot be observed independently of each other. The essentialistic tradition also assumed difference to be inescapable. "But this is the relationship between matter and form, because form gives existence to matter; therefore, matter cannot exist without a form" (Aquinas 2007, 239, who considers the essence of composite substances and determines that it "is not only the form, but it comprises both form and matter"). 29 Following Heider (2005), the unconvincing expressions "loose" and "strict coupling" have become established in the literature. Traditionally, concretum was used to refer to being brought together and coalescing as concrete forms. "Only form enters empirical consciousness. What we think of as material is form" ("Nur die Form kommt ins empirische Bewusstseyn. Was wir für Materie halten ist Form"; thus a formulation of Schlegel 1964, 37 f.).
or archival documents (Abbott's "library research") can in turn become media for research and its documentation in scholarly publications, and thereby become concrete and visible in a new form; this takes place in the present of any given case and at the same time creates such a present. Foucault's archaeology of knowledge formulates this awareness of the actualizing recombination of documental forms in the present of any given case in a different descriptive language and thereby realizes—without referring to them as such—premises of event-based social theories described above. Whereas the classical historical method treats the document as a decodable trace of an event that has gone forever, the archaeological method understands the document as an element in a positive architecture of knowledge. It no longer attempts to interpret a document: it organizes it, decomposes it, divides it up, arranges it on levels or in series, defines units and elements, and describes relationships (Foucault 1981, 14; see Gehring 2004, 63 ff.). Reformulated in terms of medium/form theory, this means a growing intensification of the capacity for the elements of scholarly research to be split up, which accompanies a differentiated potential for form development in the sense of an increasing capacity for recombination. The archaeological method of knowledge is without question one of the protagonists in the epistemological restructuring of the social sciences that makes it possible to think archives as an open processuality in which actualization becomes a genuinely renewing activity. We are dealing here with a turn away from the history of thought as a place of uninterrupted continuities and toward a privileging of the discontinuities, of the proliferation of breaks, series, deviations, autonomies, and differentiated dependencies. Thinking in differences has taken the place of a confirmatory form of the identical. Once this dynamic of identifying and linking elements that display into forms, the constant splitting up and recombinatorial reuse of those forms, has been set in motion, media—and this is the third insight related to the medium/form distinction that I wish to recall—tolerate only particular linkings. Normal language uses grammar as a means of linking words, in contrast to poetry; monetization of the economy also means that money can now only be displayed and become operative in figures and quantifying arithmetical units. Media are, it is true, not unformed—only the materia prima, as tradition has it, is pure potentiality—but they are not inherently predetermined in their structured complexity. Structure-building and its constant renewal takes place only through continuous and recursive formation; it is thus in formation that the dynamizing role of the medium/form distinction lies. Only in the social use of the media of Darstellung with historically contingent and recursively connected formations—this includes specific accomplishments of these forms and means of their production—does a space of meaning simultaneously open up for the construction of social
semantics in the sense of reflexive expectations. This interplay between communicative forms and specific expectability in the medium of Darstellung ultimately becomes the condition of possibility for the genesis of specific knowledge that can, as an orienting expectability, be considered part of the semantic apparatus of a given society.³⁰
4 Luhmann's Card Index as an Example from the Social Sciences
The card index as a communicative form of documenting in the medium of Darstellung will serve as a concluding example from the social sciences with which the plausibility of some of the insights developed above can be reinforced. The cultural technique of card indexing emerged in the wake of print culture as a means of moving away from its linear and hierarchical ordering principles and preparing the way for what, as an electronic ordering system, we have by now long considered part of the digital complex. The architecture of card indexes reveals significant ways in which documents are used in the genesis of knowledge. By now, as recent studies of media history and the history of knowledge show, electronic card indexes have enabled speed-optimized, automatic access and combination in electronically prepared data pools (see Krajewski 2012, 292).³¹ Card indexes are an example of the reaching back and looking ahead that was mentioned earlier as a mode of the social morphogenesis of forms of communication and their devices. Although they historically emerged with print culture, card indexes are prepared by hand and at the same time anticipate ordering principles of digital culture. Card indexes are a response to the flood of data, reading, and information, but are not simply externalized memories or memory aids; instead, they serve as writing tools. They host and channel selected building-blocks of information, fruits of reading, ideas, which they carefully portion out, divide up, and sort as an intermediate stage in the production of follow-ups to reading in the guise
30 I outlined this as semantics of documenting in section 2. Luhmann (2000, 106) speaks of "chains of dependencies" that "point to the […] evolutionary achievements" of formation. On the cultural significance of media/form theory, see Hahn (2004). 31 He points to the invention of hypertext in 1937/1945 and places the card index in a media-historical continuum with digital systems.
of new publications (Krajewski 2011, 2012, 2013).³² They embody the insight of event-based social theories that, as discussed above, it is only connectedness, linkage, and temporal positioning in a subsequent series of communicative events that imbue a singular communication artifact—the individual card-index entry—with meaning (which could also turn out to be different): they make it part of a knowledge architecture. The important architectural principles of a card index include the mobility of its units—the cards function as documents here—and a sophisticated set of rules and reference system that allow links that are potential, not preformed, to be drawn between the documents in reuse. Luhmann based his work with the card index on the idea of a partnership in order to enable surprise, contingency, and change in his communication with it. For the card index's "inner life" and capacity for internal development, Luhmann reported on the basis of his experience, a systematic arrangement, a content-based system—such as that of a book's structure or a classification system—is to be avoided, as this leads to intractable problems of categorization. "Independence" and "intrinsic complexity" in the construction of the self-referential document machine are further ensured by the "capacity for arbitrary internal branching," "possibilities of linking," card numbers, and a register (Luhmann 1981, 224).³³ The knowledge architecture of this card index stringently rejects sequentialization or classifications in favor of fixed positions that—and this is crucial—do without hierarchy and privileging. Access is equally possible from any position within the card index; it grows by becoming more and more dense internally. We need, Luhmann writes, "to give up the idea in preparing a card index that there should be privileged places or slips that have a special quality of guaranteeing knowledge. Every note is only an element that receives its quality only from the network of links and back-links within the system. A note that is not connected to this network will get lost in the card file and will be forgotten by it. Its rediscovery depends on accidents and on the vagary that this rediscovery means something at the time it is found" (Luhmann 1981, 225). Card indexes act as a documentary form in the medium of Darstellung. In themselves building up a structured complexity through their use, they become media for the documentary form of scholarly
32 See also Blair (2010, 62 ff.), who points to the transitive element in reading notes in knowledge genesis in the Renaissance: compendiums and textbooks were produced using the notes of the "masters." The meaningful selectivity of learned reading was always emphasized. 33 Translations from this text are drawn, in some cases modified slightly, from http://luhmann.surge.sh/communicating-with-slip-boxes (31.03.2022). For a close analysis of the internal construction of the card index, see Schmidt (2012, 2017, 2018). On artificial communication with card indexes, which is linked to the production of contingency, see Esposito (2017, 255 ff.).
publications; this in turn is at the same time a form in the communication medium of truth. Luhmann’s Bielefeld card index has since been converted completely into digital formats—the cards can be accessed in digitized form. The ongoing research project hopes to make it possible to display digitally the structural complexity of the card index by making the potential references between the elements and cluster formations visible; the paper version did not have this possibility. If in the use of the card index, the cards with their potential references served as documents in the genesis of knowledge, the card index has now itself become a document for further research (Luhmann 1951–1996).
Bibliography
Abbott, Andrew. Library research infrastructure for humanistic and social scientific scholarship in the twentieth century. In: Camic, Charles, Neil Gross, and Michèle Lamont, editors, Social Knowledge in the Making, pp. 43–88. Chicago, London: The University of Chicago Press, 2011.
Abbott, Andrew. Digital Paper: A Manual for Research and Writing with Library and Internet Materials. Chicago, London: The University of Chicago Press, 2014.
Ammon, Sabine. Generative und instrumentelle Bildlichkeit. In: Friedrich, Kathrin, Moritz Queisner, and Anna Roethe, editors, Image Guidance. Bedingungen bildgeführter Operation, pp. 9–19. Berlin, Boston: Walter de Gruyter, 2016.
Ammon, Sabine and Inge Hinterwaldner, editors. Bildlichkeit im Zeitalter der Modellierung. Operative Artefakte in Entwurfsprozessen der Architektur und des Ingenieurwesens. Paderborn: Fink Verlag, 2017.
Aquinas, Thomas. On being and essence. In: Klima, Gyula, editor, Medieval Philosophy: Essential Readings with Commentary, pp. 227–249. Malden: Blackwell, [1255], 2007.
Austin, John L. How to Do Things with Words. Cambridge: Cambridge University Press, [1972], 2003.
Bachelard, Gaston. Die Bildung des wissenschaftlichen Geistes. Frankfurt am Main: Suhrkamp, 1988. (French original: Bachelard, Gaston. La formation de l'esprit scientifique. Paris: Vrin, 1938).
Bazerman, Charles. Shaping Written Knowledge. The Genre and Activity of the Experimental Article in Science. Madison: University of Wisconsin Press, 1988.
Bertillon, Alphonse. Identification Anthropométrique: Instructions Signalétiques. Melun: Imprimerie administrative, 1893.
Blair, Ann, Paul Duguid, Anja-Silvia Goeing, and Anthony Grafton, editors. Information: A Historical Companion. Princeton, Oxford: Princeton University Press, 2021.
Blair, Ann M. Too Much to Know: Managing Scholarly Information before the Modern Age. New Haven: Yale University Press, 2010.
Blauert, Andreas and Eva Wiebel. Gauner- und Diebslisten. Registrieren, Identifizieren und Fahnden im 18. Jahrhundert. Studien zu Policey und Policeywissenschaft. Frankfurt am Main: Vittorio Klostermann, 2001.
Boellstorff, Tom. Making big data, in theory. First Monday, 18(10), 2013. DOI: https://doi.org/10.5210/fm.v18i10.4869.
Bohn, Cornelia. Passregime: Vom Geleitbrief zur Identifikation der Person. In: Bohn, Cornelia, editor, Inklusion, Exklusion und die Person, pp. 71–95. Konstanz: Universitätsverlag Konstanz, 2006.
Bohn, Cornelia. Schriftlichkeit und Gesellschaft. Opladen: Westdeutscher Verlag, [1999], 2013.
Bohn, Cornelia. Autonomien in Zusammenhängen. Formenkombinatorik und die Verzeitlichung des Bildlichen. Paderborn: Fink, 2017.
Bohn, Cornelia. Contemporary art and event-based social theory. Theory, Culture & Society, October 2021. DOI: https://doi.org/10.1177/02632764211042085.
Brandom, Robert B. Articulating Reasons. An Introduction to Inferentialism. Cambridge, MA: Harvard University Press, 2000.
Briet, Suzanne. Qu'est-ce que la documentation? Industrielles et techniques, collection de documentologie. Paris: Éditions documentaires, 1951. URL: http://martinetl.free.fr/suzannebriet/questcequeladocumentation/#ref0, (09.05.2021).
Buckland, Michael K. What is a "document"? Journal of the American Society for Information Science, 48(9):804–809, 1997. DOI: https://doi.org/10.1002/(SICI)1097-4571(199709)48:9<804::AID-ASI5>3.0.CO;2-V.
Büchner, Stefanie. Digitale Infrastrukturen — Spezifik, Relationalität und die Paradoxien von Wandel und Kontrolle. Arbeits- und Industriesoziologische Studien (AIS), 11(2):279–293, October 2018. DOI: https://doi.org/10.21241/ssoar.64878.
Caplan, Jane and John Torpey, editors. Documenting Individual Identity. Princeton: Princeton University Press, 2018.
Cardon, Dominique. À quoi rêvent les algorithmes. Paris: Seuil, 2015.
Cowls, Josh and Ralph Schroeder. Causation, correlation, and big data in social science research. Policy & Internet, 7(4):447–472, 2015. DOI: https://doi.org/10.1002/poi3.100.
Crawford, Kate, Mary L. Gray, and Kate Miltner. Critiquing big data: Politics, ethics, epistemology. International Journal of Communication, 8:1663–1672, 2014.
Danto, Arthur Coleman. After the End of Art: Contemporary Art and the Pale of History. Princeton: Princeton University Press, 1997.
de Saussure, Ferdinand. Writings in General Linguistics. Oxford: Oxford University Press, 2006.
Dommann, Monika. Dokumentieren: die Arbeit am institutionellen Gedächtnis in Wissenschaft, Wirtschaft und Verwaltung 1895–1945. In: Heyen, Erk-Volkmar, editor, Jahrbuch für europäische Verwaltungsgeschichte, volume 20, pp. 277–299. Baden-Baden: Nomos, 2008.
Dommann, Monika. Autoren und Apparate. Die Geschichte des Copyrights im Medienwandel. Frankfurt am Main: S. Fischer, 2014.
Esposito, Elena. Artificial communication? The production of contingency by algorithms. Zeitschrift für Soziologie, 46(4):249–256, 2017.
Feldman, Martha S. and James G. March. Information in organizations as signal and symbol. Administrative Science Quarterly, 26(2):171–186, 1981. DOI: https://doi.org/10.2307/2392467.
Foucault, Michel. Die Archäologie des Wissens. Frankfurt am Main: Suhrkamp, [1969], 1981. (Orig.: Foucault, Michel. L'archéologie du savoir. Paris: Gallimard, 1969).
Galton, Francis. Finger Prints. New York: Da Capo Press, [1892], 1965.
Garfinkel, Harold. Studies in Ethnomethodology. Englewood Cliffs: Prentice-Hall, 1967.
Gehring, Petra. Foucault. Die Philosophie im Archiv. Frankfurt am Main, New York: Campus Verlag, 2004.
Gertenbach, Lars. Von performativen Äußerungen zum Performative Turn. Performativitätstheorien zwischen Sprach- und Medienparadigma. Berliner Journal für Soziologie, 30:231–258, 2020. DOI: https://doi.org/10.1007/s11609-020-00422-6.
Hahn, Alois. Ist Kultur ein Medium? In: Burkart, Günter and Gunter Runkel, editors, Luhmann und die Kulturtheorie, pp. 40–58. Frankfurt am Main: Suhrkamp, 2004.
Hayles, N. Katherine. Translating media: Why we should rethink textuality. The Yale Journal of Criticism, 16(2):263–290, 2003.
Heider, Fritz. Ding und Medium. Baecker, Dirk, editor. Berlin: Kulturverlag Kadmos, [1926], 2005.
Hemmungs Wirtén, Eva. How patents became documents, or dreaming of technoscientific order, 1895–1937. Journal of Documentation, 75(3):577–592, 2019. DOI: https://doi.org/10.1108/JD-11-2018-0193.
Hinterwaldner, Inge. Künstlerisches und architektonisches Entwerfen – "not a time for dreaming"? In: Hinterwaldner, Inge and Sabine Ammon, editors, Bildlichkeit im Zeitalter der Modellierung. Operative Artefakte in Entwurfsprozessen der Architektur und des Ingenieurwesens, pp. 315–345. Paderborn: Fink Verlag, 2017.
Husserl, Edmund. Zur Phänomenologie des inneren Zeitbewußtseins (1893–1917). volume X of HUA. The Hague: Martinus Nijhoff, 1966.
Imdahl, Max. Die Momentfotografie und "Le Comte Lepic" von Edgar Degas. In: Imdahl, Max, editor, Zur Kunst der Moderne, volume 1 of Gesammelte Schriften, pp. 181–194. Frankfurt am Main: Suhrkamp, 1996.
Jäger, Ludwig, editor. Ferdinand de Saussure. Wissenschaft der Sprache. Neue Texte aus dem Nachlass. Frankfurt am Main: Suhrkamp, 2003.
Jäger, Ludwig. Die Welt der Zeichen. In: ZEICH(N)EN SETZEN, volume 11, pp. 303–327. Bielefeld: transcript Verlag, 2020.
Kelly, Kevin. On Chris Anderson's the end of theory. 2008. URL: https://web.archive.org/web/20200814090232/http://edge.org/discourse/the_end_of_theory.html, (14.08.2020).
Kester, Grant. The One and the Many. Contemporary Collaborative Art in a Global Context. Durham: Duke University Press, 2011.
Kitchin, Rob. Big data, new epistemologies and paradigm shifts. Big Data and Society, 1(1):1–12, April–June 2014. DOI: https://doi.org/10.1177%2F2053951714528481.
Klopstock, Friedrich Gottlieb. Gedanken über die Natur der Poesie. Dichtungstheoretische Schriften. Winfried Menninghaus, editor. Frankfurt am Main: Insel Verlag, 1989.
Knight, David. Illustrating chemistry. In: Baigrie, Brian S., editor, Picturing Knowledge: Historical and Philosophical Problems Concerning the Use of Art in Science, Toronto Studies in Philosophy, chapter 4, pp. 135–163. Toronto: University of Toronto Press, [1996], 2016. DOI: https://doi.org/10.3138/9781442678477-006.
Krajewski, Markus. Paper Machines: About Cards and Catalogs, 1548–1929. Cambridge: MIT Press, 2011.
Krajewski, Markus. Kommunikation mit Papiermaschinen: Über Niklas Luhmanns Zettelkasten. In: von Herrmann, Hans-Christian and Wladimir Velminski, editors, Maschinentheorien/Theoriemaschinen, pp. 283–305. Frankfurt am Main: Peter Lang, 2012.
Krajewski, Markus. Paper as passion: Niklas Luhmann and his card index. In: Gitelman, Lisa, editor, "Raw Data" Is an Oxymoron, pp. 103–112. Cambridge: MIT Press, 2013. (Orig.: Krajewski 2012).
La Fontaine, Henri and Paul Otlet. Sur la création d'un répertoire bibliographique universel. In: Conférence Bibliographique Internationale, Documents, Brussels, September 1895. Imp. Larcier.
Latour, Bruno. Visualization and cognition: Drawing things together. In: Lynch, Michael and Steven Woolgar, editors, Representation in Scientific Practice, pp. 19–68. Cambridge, MA: MIT Press, 1988.
Latour, Bruno. The Pasteurization of France. Cambridge, MA: Harvard University Press, 1993.
Latour, Bruno. Beware, your imagination leaves digital traces. 2007. URL: http://www.bruno-latour.fr/sites/default/files/P-129-THES-GB.pdf, (03.03.2022).
Luhmann, Niklas. Zettelkasten. Digital Collections. Bielefeld University. 1951–1996. URL: http://ds.ub.uni-bielefeld.de/viewer/collections/zettelkasten/, (30.09.2021).
Luhmann, Niklas. Kommunikation mit Zettelkästen. Ein Erfahrungsbericht. In: Baier, Horst, Hans Mathias Kepplinger, and Kurt Reumann, editors, Öffentliche Meinung und sozialer Wandel: Für Elisabeth Noelle-Neumann, pp. 222–228. Opladen: Westdeutscher Verlag, 1981.
Luhmann, Niklas. The medium of art. In: Luhmann, Niklas, editor, Essays on Self-Reference, pp. 215–226. New York: Columbia University Press, 1990.
Luhmann, Niklas. Social Systems. Stanford: Stanford University Press, 1995. (German original: Luhmann, Niklas. Soziale Systeme. Grundriß einer allgemeinen Theorie. Frankfurt am Main: Suhrkamp, 1984).
Luhmann, Niklas. Art as a Social System. Stanford: Stanford University Press, [1995], 2000. (German original: Luhmann, Niklas. Die Kunst der Gesellschaft. Frankfurt am Main: Suhrkamp, 1995).
Luhmann, Niklas. Schriften zu Kunst und Literatur. Niels Werber, editor. Frankfurt am Main: Suhrkamp, 2008.
Luhmann, Niklas. Theory of Society. volume 1. Stanford: Stanford University Press, 2012. (German original: Luhmann, Niklas. Die Gesellschaft der Gesellschaft, volume 1. Frankfurt am Main: Suhrkamp, 1997).
Lund, Niels W. Document theory. Annual Review of Information Science and Technology, 43(1):1–55, 2009.
Lund, Niels W. Document, text and medium: Concepts, theories and disciplines. Journal of Documentation, 66(5):734–749, 2010. DOI: https://doi.org/10.1108/00220411011066817.
Lund, Niels W. and Roswitha Skare. Document theory. In: Bates, Marcia J. and Mary Niles Maack, editors, Encyclopedia of Library and Information Sciences, volume 1, pp. 1632–1639. 3rd edition, New York: Taylor and Francis, 2010.
Manovich, Lev. Data stream, database, timeline: The forms of social media. 2012. URL: http://lab.softwarestudies.com/2012/10/data-stream-database-timeline-new.html, (09.02.2021).
Manovich, Lev. Media Analytics & Gegenwartskultur. In: Engemann, Christoph and Andreas Sudmann, editors, Machine Learning – Medien, Infrastrukturen und Technologien der Künstlichen Intelligenz, pp. 269–288. Bielefeld: transcript Verlag, 2018. DOI: https://doi.org/10.1515/9783839435304-012.
Marin, Louis. Die klassische Darstellung. In: Nibbrig, Christiaan L. Hart, editor, Was heißt 'Darstellen'?, pp. 375–397. Frankfurt am Main: Suhrkamp, 1994.
Mead, George H. The nature of the past. In: Coss, John, editor, Essays in Honor of John Dewey, pp. 235–242. New York: Henry Holt and Company, 1929.
Menninghaus, Winfried. 'Darstellung.' Friedrich Gottlieb Klopstocks Eröffnung eines neuen Paradigmas. In: Nibbrig, Christiaan L. Hart, editor, Was heißt 'Darstellen'?, pp. 205–226. Frankfurt am Main: Suhrkamp, 1994.
Mondada, Lorenza. Video analysis and the temporality of inscriptions within social interaction: the case of architects at work. Qualitative Research, 12(3):304–333, 2012.
Müller, Julian. Systemtheorie als Medientheorie. In: Jahraus, Oliver and Armin Nassehi et al., editors, Luhmann-Handbuch. Leben – Werk – Wirkung, pp. 57–61. Stuttgart: Metzler, 2012.
Mülder-Bach, Inka. Im Zeichen Pygmalions: das Modell der Statue und die Entdeckung der "Darstellung" im 18. Jahrhundert. Munich: Fink, 1998.
Nassehi, Armin. Muster: Theorie der digitalen Gesellschaft. Munich: C. H. Beck, 2019.
Neyland, Daniel and Catelijne Coopmans. Visual accountability. The Sociological Review, 62(1):1–23, 2014. DOI: https://doi.org/10.1111/1467-954X.12110.
Pédauque, Roger T. Le Document à la lumière du numérique. Caen: C&F éditions, 2006.
Pédauque, Roger T. Document: Form, sign, and medium, as reformulated by digitization. A completely reviewed and revised translation by Laura Rehberger and Frederik Schlupkothen. Laura Rehberger and Frederik Schlupkothen, translators. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 225–259. Berlin: De Gruyter, 2022.
Raible, Wolfgang. Literacy and orality. International Encyclopedia of the Social & Behavioral Sciences, 13:8967–8971, 2001. DOI: https://doi.org/10.1016/B0-08-043076-7/03036-9.
Rayward, W. Boyd. The origins of information science and the international institute of bibliography/international federation for information and documentation (FID). Journal of the American Society for Information Science, 48(4):289–300, 1997.
Schlegel, Friedrich. Philosophische Vorlesungen 1800–1807. In: Behler, Ernst and Jean-Jacques Anstett, editors, Kritische Friedrich-Schlegel-Ausgabe, volume 12. Munich, Paderborn, Vienna, Zurich: Schöningh/Thomas-Verlag, 1964.
Schlegel, Friedrich. Fragmente zur Poesie und Literatur. In: Behler, Ernst and Hans Eichner, editors, Kritische Friedrich-Schlegel-Ausgabe, volume 16. Munich, Paderborn, Vienna, Zurich: Schöningh/Thomas-Verlag, 1981.
Schmidt, Johannes F. K. Luhmanns Zettelkasten und seine Publikationen. In: Jahraus, Oliver and Armin Nassehi et al., editors, Luhmann-Handbuch, pp. 7–13. Stuttgart: Metzler, 2012.
Schmidt, Johannes F. K. Niklas Luhmann's card index: Thinking tool, communication partner, publication machine. In: Cevolini, Alberto, editor, Forgetting Machines. Knowledge Management Evolution in Early Modern Europe, pp. 289–311. Leiden: Brill, 2017.
Schmidt, Johannes F. K. Niklas Luhmann's card index: The fabrication of serendipity. Sociologica, 12(1):50–63, 2018. DOI: https://doi.org/10.6092/issn.1971-8853/8350. (Abridged and revised version see Schmidt 2017).
Schubbach, Arno. Kants Konzeption der geometrischen Darstellung. Zum mathematischen Gebrauch der Anschauung. Kant-Studien, 108(1), 2017a. DOI: https://doi.org/10.1515/kant-2017-0002.
Schubbach, Arno. Der "Begriff der Sache." Kants und Hegels Konzeptionen der Darstellung zwischen Philosophie, geometrischer Konstruktion und chemischem Experiment. In: Hegel-Studien, volume 51, pp. 121–162. Hamburg: Felix Meiner Verlag, 2017b.
Schwarzkopf, Stefan. Sacred excess: Organizational ignorance in an age of toxic data. Organization Studies, 41(2):197–217, 2020. DOI: https://doi.org/10.1177/0170840618815527.
Seckelmann, Margit. Zur Verwaltung technischen Wissens: das kaiserliche Patentamt zwischen Bürokratisierung und Netzwerkbildung (1877–1914).
In: Jahrbuch für europäische Verwaltungsgeschichte, volume 20, pp. 7–32. Baden-Baden: Nomos, 2008.
Simmel, Georg. Excursus on written communication. In: Wolff, Kurt H., editor, The Sociology of Georg Simmel, pp. 352–355. Glencoe, IL: The Free Press, 1950. (German original: Simmel 1992).
Simmel, Georg. Exkurs über den schriftlichen Verkehr. In: Rammstedt, Otthein, editor, Soziologie. Untersuchungen über die Formen der Vergesellschaftung, volume 11 of Georg Simmel Gesamtausgabe, pp. 429–433. Frankfurt am Main: Suhrkamp, [1908], 1992.
Simon, Herbert Alexander. The Sciences of the Artificial. 3rd edition, Cambridge, MA: MIT Press, [1969], 1996.
Skare, Roswitha. Complementarity: A concept for document analysis? Journal of Documentation, 65(5):834–840, 2009. DOI: https://doi.org/10.1108/00220410910983137.
Sombart, Werner. Der moderne Kapitalismus. volume II: Das europäische Wirtschaftsleben im Zeitalter des Frühkapitalismus. 2nd edition, Munich, Leipzig: dtv, [1916], 1987.
Stäheli, Urs. Soziologie der Entnetzung. Berlin: Suhrkamp, 2021.
Vismann, Cornelia. Akten. Medientechnik und Recht. 2nd edition, Frankfurt am Main: Fischer, 2001. (Abridged Eng. translation: Vismann 2008).
Vismann, Cornelia. Files: Law and Media Technology. Stanford, CA: Stanford University Press, 2008. (Shortened translation of German original: Vismann 2001).
Weber, Max. Economy and Society: An Outline of Interpretive Sociology. Berkeley, CA: University of California Press, 1978. (German original: Weber 1985).
Weber, Max. Wirtschaft und Gesellschaft. 5th edition, Tübingen: Mohr, [1925], 1985.
Weizman, Eyal. Forensic Architecture. Violence at the threshold of detectability. New York: Zone Books, 2017. URL: https://forensic-architecture.org, (31.05.2021).
Wöhrer, Renate, editor. Wie Bilder Dokumente wurden. Zur Genealogie dokumentarischer Darstellungspraktiken. Berlin: Kulturverlag Kadmos, 2015a.
Wöhrer, Renate. "More than mere records." Sozialdokumentarische Bildpraktiken an der Schnittstelle von Kunst und sozialpolitischer Kampagne. In: Wöhrer, Renate, editor, Wie Bilder Dokumente wurden. Zur Genealogie dokumentarischer Darstellungspraktiken, pp. 315–335. Berlin: Kadmos, 2015b.
Frederik Schlupkothen and Karl-Heinrich Schmidt
Legibility and Viewability
On the Use of Strict Incrementality in Documents
1 Introduction: Text and Video, Book and Film
Over the past few decades, the concept of text has been applied to very different kinds of media, including ones that, on the face of it, appear not to look like text at all. Films, for instance, have also been treated as texts. A prominent example of this can be found in Branigan (1992, e.g., 87–88), who presents the following definition in his classic Narrative Comprehension and Film, which has seen numerous reprints over the decades and been translated several times:
I will define a "text" as a certain collection of descriptions of an artifact where the artifact must be one that materializes a symbol system, and the descriptions that are offered of it must be sanctioned by a society. Thus a "text" is more than the material of an artifact and more than the symbols materialized; a text is always subject to change according to a social consensus about the nature of the symbols that have been materialized. (Branigan 1992, 87)
A footnote to the first sentence in this quotation adds: "My definition of a text is meant to rule out, for example, such objects as trees and tables, as well as a book being used to patch a hole in the roof." (Branigan 1992, 246) Drawing on Branigan, and at the same time moving beyond him, the distinction between text as a content architecture and books as a form of presentation will be picked up in what follows, where we draw an analogous distinction between video as a content architecture and films as a form of presentation (with "front matter" and "back matter" comparable to that of a book in the guise of opening and closing credits).¹ This is based on the assumption that the word "text" does not generally refer to a form of presentation. For example, "Please send me your text" is understood differently from "Please send me your book": the former expression leaves the form
1 If the content model for books envisaged in the DocBook document description language, for instance, is transferred to films by replacing the word "book" with "film," a perfectly plausible content model for films results, right down to the navigational elements. Cf. https://tdg.docbook.org/ (11.01.2022).
Frederik Schlupkothen, Karl-Heinrich Schmidt, University of Wuppertal
https://doi.org/10.1515/9783110780888-006
of presentation open. Analogously to this, "Please send me the video recording" is normally treated differently from "Please send me the film" or "Shall we watch a film tonight?" In order to avoid relying solely on natural-language intuition, the following pages operate primarily with the text and video content architectures.² The associated forms of presentation—books and films respectively ("Did you read the first edition of the book?" and "Did you see the film in the director's cut version?")—do figure in what follows. However, they are treated primarily in terms of media history, as (materially realized or realizable) forms of presentation belonging to clearly different production and reception cultures with corresponding dispositifs. Moving-image data can be realized in books (e.g., flip books, to take an example from media history), and texts can be integrated into films (with the potential to have a significant effect, even outside the opening and closing credits, in the age of silent film in particular; see below)—but it is only with media convergence as manifested, for instance, on screens through the World Wide Web that it becomes possible to employ them with equal weight.³ This is a significant motivation for treating them together in the present chapter. In what follows, features relevant to document processing and modeling are identified, and characterized with respect to their similarities and differences,
2 These two content architectures were originally specified in the Multipurpose Internet Mail Extensions (MIME) standard as categories for different media types. See https://www.iana.org/assignments/media-types/media-types.xhtml (25.10.2020). This chapter does not go beyond a concept of text that is compatible with the assignment of MIME types. The upshot of this is that only something that has the text MIME type as its preferred architecture can be a text. It makes no difference if a text appears somewhere else, for instance in a painting or in the intertitles of a silent film (see below), because those texts can be translated into the text MIME type and can then be treated as such—in an automated manner too, given suitable methods (character recognition). Thus, a normal film beyond the opening and closing credits "is" a priori not a text here. The same goes for extended concepts of text such as those suggested by Posner: "If something is an artifact and has not only a function (a standard purpose) but also a (coded) meaning in a culture, we call it a 'text of that culture'" ("Wenn etwas ein Artefakt ist und in einer Kultur nicht nur eine Funktion (einen Standardzweck), sondern auch eine (kodierte) Bedeutung hat, so nennen wir es 'Text dieser Kultur'" [Posner 1992, 21]). Posner gives the example of a sequence of noises made by high-heeled shoes. In the context of our chapter, this might result in an audio document if recorded; but that does not make it a text unless provisions for translation into a text architecture are available for use (e.g., in the case of footsteps tapping out Morse code). Metaphorical usage, of course, is not affected by this. 3 Consider, for instance, the case of linking to parts of a video with Uniform Resource Identifiers: "In order to make video a first-class object on the World Wide Web, one should be able to identify temporal and spatial regions"; https://www.w3.org/2008/01/media-fragments-wg.html#scope (12.09.2020); our emphasis.
for sequences of letters (e.g., that can be assigned to a "string" data type) and sequences of frames (as can be found in classic film). With regard to the differences, a fundamental distinction is made, drawing on Goodman's symbol theory, between those semiotic representations that can be treated as tokens of a type and those that cannot.⁴ This fundamental distinction, supplemented by a concept of reading direction, is then applied to sequences of tokens or of images. Both can present the viewer with a characteristic called incrementality, which is introduced with a definition further below and essentially means that in a sequence of tokens or images, a viewing can take place "bit by bit" and even, in the strict case, without overlaps. This in turn leads to a distinction between strictly incremental legibility for a sequence of tokens and strictly incremental viewability for a sequence of images. With these last two properties, we can account for every document part that is typically compatible with the text or video content architectures (the latter here being confined to its visual component), and thus for a large part of document-based cultural heritage and communication to date. In addition, with the rise of on-screen display such content architectures are increasingly appearing together and as parts of individual documents. The legibility of a textual product and the viewability of an image-sequence product are illustrated here with reference to specific media products, but formulated for visual output in general terms. To this end, a conceptual apparatus that is common to both, rather than being text-specific or film-specific, is used. This apparatus is in large part supplied here by (digital) document processing and theory; they also allow fundamental characteristics of the logical structure of the relevant document parts and their transformation into outputs to be set out—characteristics that can be recognized by a viewer with the necessary knowledge.
2 Structure
The chapter begins with an introduction to "(Structured) Documents of the Text Type" (section 3). On this foundation, section 4 formulates some first conditions for the extraction of information from documents under the heading "Document Processing and Information Appraisal." Section 5 introduces strictly incremental legibility, a condition of information extraction from texts that is fundamental to the subsequent argumentation, and applies it to documents whose content can
4 In the present chapter, "type" is used in this sense to refer to a set of tokens (see also Stetter 2005, 78). The word "type" is also found as a compound element in "MIME type." There is also the word "type" in its everyday sense.
be represented in what have been called text-flows (see Bateman 2008, 175). This section elaborates on the use of type–token relationships, which is crucial for the reading of text-flows. Section 6 then allows the specification of multiple directions for particular readings, before section 7 builds on this by showing that even in the case of type–token relationships that can be recognized (by the user), the reading direction can be left completely open if strict incrementality is dispensed with; this is generally correlated with what is known as the monolithic status of a segment. The methods introduced thus far are then used in section 8 to introduce strictly incrementally viewable (sub)documents of the kind manifested in (silent) films as dynamic image-flows analogous to text-flows (employing the same concepts in the process). Their structurability is characterized in section 9. Section 10 uses a concluding example to consider structured documents with subdocuments that can be built out of shots and even larger segments. Section 11 contains a summary of what has been achieved. Finally, section 12 embarks on an essayistic retrospect that considers the dispositifs of cinema and typographeum (the world of print). Practical considerations mean that the argumentation in the following pages is subject to a number of limitations. The conceptual resources employed here could also be used to treat sequences of static images (as in comics) that consist of what are known as image architectures. For reasons of space, that cannot be done here. Individual document parts of the image type are addressed only in the context of a contrastive juxtaposition of document parts of the text and video types, and otherwise just serve illustrative purposes. In addition, document (parts) of the audio type are passed over completely. Finally, cases in which, analogously to letters, images can be classified as tokens of types are not considered here.⁵
3 (Structured) Documents of the Text Type
In conceptual terms, we start by treating a document as an informational object for human viewers⁶ that can be exchanged and used as a unit.⁷ Documents can be structured or unstructured. Unstructured documents do not provide any viewer-independent specifications for the identification of subdocuments (this is the
5 We feel this would belong better in a context where such aspects are treated together with audio documents. Thus, it is only symbol schemata in Goodman’s (1968, 130–141) sense that are considered in what follows. 6 “Viewer” in this section means a human being; the concept of a viewer will later be extended to cover artificial systems as well, e.g., optical character recognition (OCR) systems. 7 Portions of this and the following section are based on Schlupkothen and Schmidt (2020).
case with many photographs); structured documents do exactly that (e.g., a text document with a title that is recognizable as such, or lines in a line-based text; see further below in this section). In our conceptual framework for structured documents, we use the basic architecture model for document processing in ISO/IEC 8613-2 (1993, 4). This standard makes an initial distinction between two perspectives on the form of a document in terms of which its content portions can be organized:
– logical view
– layout view
The logical view describes the structural organization of content portions in (potentially recursive) part–whole relationships. Thus, for example, a book consists of several chapters, each of which can be divided into sections in which further sections or paragraphs can figure as logical units. The layout view describes the organization of a document's layout elements for the purposes of presenting the content portions in an output medium. A book, for instance, is divided into pages across which chapter headings and paragraphs can be distributed. The content portions are involved in both perspectives and can employ various content architectures—for output on a flat surface, this often means plain text, images, or moving images whose data formats can be identified, for instance, by their own MIME types. The logical structure of any given document is modeled with a tree structure. ISO/IEC 8613-2 (1993) defines various types of node in this context:
– document logical root
– composite logical objects
– basic logical objects
The document logical root is the logical object that is the ancestor of all the other logical objects, and it can contain any number and combination of composite and basic logical objects. A composite logical object is the child of a composite logical object or the document logical root. It in turn can contain any number (apart from zero) and combination of composite or basic logical objects. A basic logical object is a terminal node in the tree structure that can host content portions and does not itself contain any further logical objects. The structural depth of the logical view of a document is simply the number of levels between the document logical root and the basic logical objects. Figure 1 on page 115 shows, on the left, a possible logical document structure that will be associated with real-world documents in what follows. A tree structure can be generated analogously for layouts (see figure 1 on page 115, middle). Layout structure and logical structure are independent from
Listing 1: Encoding of a logical document structure in TEI.
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    ...
  </teiHeader>
  <text>
    <body>
      <lg>
        <l>Ιησοῦς</l>
        <l>Χριστός</l>
        <l>Θεοῦ</l>
        <l>Υἱός</l>
        <l>Σωτήρ</l>
      </lg>
    </body>
  </text>
</TEI>
each other and can therefore diverge. But both, as illustrated in figure 1 on the facing page, share the same content portions, which are divided between the basic objects of the layout structure in a layout process. As our first example for discussion, we take the following five-line text, which lies behind the Christian ichthus:
Ιησοῦς
Χριστός
Θεοῦ
Υἱός
Σωτήρ
In the first instance, this yields a TEI document (see Burnard and Bauman 2015), given in truncated form in listing 1, that meets the structure outlined in figure 1 on the facing page. The document logical root (<TEI>) here is followed first by two composite logical objects (<teiHeader> and <text>). The element <text> contains further elements: the composite logical objects <body> and <lg>. <lg> is followed by five basic logical objects (<l> in each case). Thus, in macronavigational terms (i.e., in the sense of identifying parts of the document's tree structure), five lines below <lg> can be singled out.
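To make the tree vocabulary used above concrete, the following minimal Python sketch (our own illustration; the class and helper names are not taken from the TEI Guidelines or from ISO/IEC 8613-2) models the logical view of listing 1 and computes its structural depth, here taken as the length of the longest path from the document logical root down to a basic logical object.

# Illustrative model of a logical structure as a tree of composite and
# basic logical objects; basic objects host the content portions.
class LogicalObject:
    def __init__(self, name, children=None, content=None):
        self.name = name
        self.children = children or []   # empty for basic logical objects
        self.content = content           # content portion of a basic object

def structural_depth(node):
    # Length of the longest path from this node down to a basic logical object.
    if not node.children:
        return 0
    return 1 + max(structural_depth(child) for child in node.children)

lines = ["Ιησοῦς", "Χριστός", "Θεοῦ", "Υἱός", "Σωτήρ"]
root = LogicalObject("TEI", children=[
    LogicalObject("teiHeader"),                  # header subtree omitted here
    LogicalObject("text", children=[
        LogicalObject("body", children=[
            LogicalObject("lg", children=[
                LogicalObject("l", content=line) for line in lines
            ]),
        ]),
    ]),
])

print(structural_depth(root))  # 4: TEI -> text -> body -> lg -> l

Under these assumptions the sketch prints 4, the length of the path from the root element via the composite objects down to the five basic objects that carry the content portions.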
Fig. 1: A possible layout process and visual rendition for our five-line text. [Diagram: logical structure → layout process → layout structure → imaging process → visual rendition; legend: document logical/layout root, composite logical/layout object, basic logical/layout object, content portion.]
The content portions assigned to the basic logical objects describe text nodes (in this case: five) that consist of a sequence of permitted textual signs (letters, whitespace, as well as numbers, various special characters, etc.). Text nodes here are always of the text MIME type and are written in a content architecture that is defined at least by a character set⁸ but can also be subject to further constraints (see below). In micronavigational terms (i.e., in the sense of reading within the content portions), the content portions can be read as Greek text by a suitably qualified viewer one letter after another and line by line, for instance, or vertically as an acrostic.
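To illustrate these two micronavigational readings, the following minimal Python sketch (our own; the variable names are illustrative and not part of any standard mentioned here) extracts both the line-by-line reading and the vertical acrostic from the five content portions:

# The five content portions hosted by the basic logical objects (text MIME type).
lines = ["Ιησοῦς", "Χριστός", "Θεοῦ", "Υἱός", "Σωτήρ"]

# Horizontal reading: one letter after another, line by line.
horizontal = " ".join(lines)

# Vertical reading: the acrostic formed by the initial letters of the lines.
acrostic = "".join(line[0] for line in lines)

print(horizontal)  # Ιησοῦς Χριστός Θεοῦ Υἱός Σωτήρ
print(acrostic)    # ΙΧΘΥΣ, the acrostic behind the Christian ichthus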
4 Document Processing and Information Appraisal
In addition to the description of documents, an architecture model for document processing, from the conversion of a document's logical structure into a layout structure through to the final output, is needed. A general architecture model of this kind can also be found in ISO/IEC 8613-2 (1993, 12–15). Here, the logical structure of a document is the result of an editing process that encompasses the creation and adaptation of the document with regard both to its logical structure and the establishment of the content nodes. The layout process transforms the logical structure and its content nodes into geometrically describable objects that
8 In the XML world, an individual character is ultimately an indivisible textual unit, in accordance with the ISO/IEC 10646 (2020) specification (also known as Unicode, or, in ISO nomenclature, the Universal Character Set, UCS).
can in turn, as the result of a subsequent imaging process, be visually rendered in a material medium such as, for example, paper or a monitor (see below). Taking listing 1 on page 114 above again, figure 1 on the preceding page shows a possible way to convert the TEI-structure subtree containing the five-line text into a line-based layout structure by means of a layout process. Logical components can be subject to certain stipulations in the layout. For instance, in figure 1 on the previous page the five lines are placed in a shared layout container that serves to ensure the left-alignment of the lines that is important for the acrostic. It is true that this architecture model was essentially conceived with a view to the processing of textual documents, but it nonetheless works more generally as well. Just as pure text documents consist only of textual data, pure video documents, for instance, consist only of video data. The content portions of video documents typically describe moving-image data that may yield a shot (a succession of frames that belong together for a viewer). Such a document is also subject to a layout structure and a logical structure. The layout process in this case typically defines the sequencing of the content portions into an output stream for visual rendition with a player. The logical elements themselves often belong together as part of larger structures or segments such as, for example, a sequence consisting of several shots, or a succession of shot–reverse shot alternations representing a dialogue (see also section 10). To introduce a filmic example—The Girl and Her Trust, which we use as a prototypical case in this chapter—we take (again) five content portions: the first five spatiotemporal shots (S3, S4, S5, S6, and S7), which follow an introductory title card and first intertitle (see section 10).⁹ These five shots are part of two longer sequences that alternate with each other in the layout structure: S3, S5, and S7 belong to the first sequence of the film (in Grace's room), S4 and S6 to a second sequence (in Jack's room). Figure 2 on the facing page shows five stills from these sequences, alternating as they do in the film: Grace (with an unknown admirer in the still from S3) and her admirer Jack (together with his unknown counterpart
9 Unlike the acrostic, for reasons of space the content of the film cannot be presented in full here. The following summary will have to suffice. Grace and Jack work at a railroad station with a telegraph office. A train arrives at the station with $2,000; the money is put under lock and key. Two tramps also arrived on the train; they work out where the money is stored. After Jack has left the station to take care of something somewhere else, the bandits force their way into the station. They are partially successful: they get hold of the money, but Grace, after locking herself in the office, manages to telegraph for help. The message results in an engine being dispatched from another station to help her. Jack, returning from his errand, encounters the engine and takes command of the rescue. The bandits are dealt with, and Jack finally gets the kiss from Grace that he has been wanting since the beginning of the film.
Fig. 2: Stills from the first five shots in time and space of The Girl and Her Trust. [Five stills, labeled S3, S4, S5, S6, and S7.]
in the still from S4 and with Grace in the still from S5) work at a railroad station with a telegraph office; Jack is overcome by the urge to kiss Grace (still from S5) and is sent back to his office at once (still from S6); Grace remains behind, unsure of herself (still from S7).¹⁰ Independently of the use of a particular document language, the layout process has to ensure the alternation of the shots described here. The visual rendition of each frame for a viewer takes place on a two-dimensional surface, classically the cinema projection screen, lasting only as long as the frame is projected. A first general possibility for assigning the content portions of the logical structure to the basic layout objects of the layout structure in a layout process can be formulated thus:
Definition 1. A (sub)document has a document-order layout if the associated set of content portions can be assigned to a set of basic layout objects in such a way that those objects are arranged (spatially and/or temporally) in the logical order.
For electronic documents in the world of XML, the default positioning of layout objects in CSS is sufficient to meet this requirement.¹¹ Figure 1 on page 115 shows a document-order layout for the five content portions of the logical objects (lines) of the five-line text in listing 1 on page 114; this can be seen from the dotted arrows that map out the ordering. In the filmic example in figure 3 on the following page,¹² by contrast, the alternating positioning of S3, S5, and S7 on the one hand and of S4 and S6 on the other means that the document order is not preserved; again, this can be seen from the dotted arrows that map
10 The fact that the stills in figure 2 belong to two different sequences is apparent from the fact that the first, third, and fifth stills (S3 , S5 , S7 ; part of the first sequence) and the second and fourth stills (S4 , S6 ; part of the second sequence) each have similar backgrounds. 11 This default is found as “normal flow” in, for example, Bos et al. (2011, sec. 9.4). 12 The node numbers in the diagram describe the position of the content nodes in the document tree, and are intended to make it easier to follow the rearrangement of the content nodes in the various processes involved. These numbers should not be confused with the numbers of the shots (S3 –S7 ).
out the ordering. The local order within both sequences, though, is preserved.
Fig. 3: Alternative, filmic layout process and visual rendition (the diagram shows how the content portions of the logical structure are rearranged by the layout process into layout objects and passed, via the imaging process, to the visual rendition; legend: document logical/layout root, composite logical/layout object, basic logical/layout object, content portion).
A weaker formulation can therefore be made:
Definition 2. A (sub)document has a basic-order layout if the associated set of content portions is assigned to a set of basic layout objects in such a way that the order of all the basic logical objects beneath their respective composite logical objects is preserved.
This means that content portions can—as in the empirically important case of alternation in our film—be rearranged without, for instance, the sequential order specified by the logical structure being lost. There is a further stipulation in the case of a human viewer of, for example, a text or video if he is to be presented with the content in layout order by the imaging process:
Definition 3. The layout of a (sub)document is rendered visually preserving the order for a viewer if the associated set of content portions in a visual rendition can be read or viewed in the spatial and/or temporal order of the layout.
With the totally ordered sequence of rectangles defined by the layout objects,¹³ Figure 1 on page 115 depicts an order-preserving visual rendition of the five content portions of the logical objects (lines) in the five-line text from listing 1 on page 114. In the case of the five shots in time and space from The Girl and Her Trust shown in figure 2 on the previous page, a visual rendition that preserves the order is produced by playing back the relevant frames in the layout sequence with a player.
13 Often called blocks or areas in layout languages; thus, for example, in Bos et al. (2011, sec. 8) or Berglund (2006, sec. 4).
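To make definitions 1 and 2 more tangible, the following Python sketch (purely illustrative, not part of the chapter's formal apparatus) checks the two layout properties for a small example; the numbering of the content portions and their grouping into two sequences are assumptions modeled on the film example, not data taken from figure 3.

# Illustrative sketch of definitions 1 and 2; node numbers and their
# grouping into composite logical objects are assumed for the example.

def is_document_order_layout(layout, logical_order):
    """Definition 1 (sketch): the layout arranges the content portions
    in exactly the logical (document) order."""
    return list(layout) == list(logical_order)

def is_basic_order_layout(layout, composites):
    """Definition 2 (sketch): beneath every composite logical object,
    the order of its basic logical objects is preserved in the layout."""
    position = {node: i for i, node in enumerate(layout)}
    return all(
        position[a] < position[b]
        for members in composites
        for a, b in zip(members, members[1:])
    )

logical_order = [1, 2, 3, 4, 5]     # five content portions in document order
composites = [[1, 2, 3], [4, 5]]    # two sequences, e.g. S3/S5/S7 and S4/S6
film_layout = [1, 4, 2, 5, 3]       # alternating montage of the two sequences

print(is_document_order_layout(film_layout, logical_order))   # False
print(is_basic_order_layout(film_layout, composites))         # True

A document-order layout in the sense of definition 1 is then simply the special case in which the whole document counts as a single composite object.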
In and of themselves, even when they occur together, a layout that employs at least basic order and an order-preserving visual rendition do not ensure that a viewer will be able to view the content portions of a document properly. Even overlaps are enough to prevent that from happening. The following definition addresses this:
Definition 4. A (sub)document is appraisable for a viewer in an information-extraction situation 𝑠 if there exists in 𝑠 an order-preserving visual rendition on the basis of which the associated content portions can be ascertained completely and correctly by the viewer.
On the recipient's side, the information-extraction situation is a partial equivalent of the utterance situation (Devlin 1990, 218–220) in face-to-face (oral) communication. Appraisability serves the extraction of information and thus means that the viewer (which can also be an OCR system in the case of text documents, for example) in an (information-extraction) situation can perceive a (sub)document correctly (as specified by a correctness criterion; see below) and in full in a visual rendition.¹⁴ Appraisal does not have to take place following the layout order—indeed, that can even be undesirable: a layout-order appraisal of the visual rendition of a polyphonic musical document (e.g., a score), for example, might lead to the loss of information about synchronization between the main voice and accompaniment.
In the case of a document such as that in listing 1 on page 114, a viewer will often want to be able to appraise (in particular, to read in the usual way) the content portions like a string of beads, as it were. That is often exactly what the author intends as well. The extent to which a document may—potentially in accordance with the author's intentions—be able to frustrate such a desire on the part of the reader cannot be determined a priori and typically needs to be considered on the basis of each individual case.¹⁵ If a form that can be appraised all in one go, so to speak, is indeed desired, the fulfillment of a further requirement is always implied:
Definition 5. A (sub)document is incrementally appraisable for a viewer in an information-extraction situation 𝑠 if there exists for a layout an order-preserving visual rendition on the basis of which the associated content portions can be
14 For “normal” text documents, the requirements of completeness and correctness determine an expected decoding (e.g., when walking dictation is assessed at school). On these criteria in general, see Schmidt (1992, 79–81). 15 On this in the case of plain-text documents, see also the discussion in Martens (1989, 1–25, esp. 23–25).
ascertained completely and correctly by the viewer in a single appraisal process in 𝑠.
No distinction is made here in terms of whether such appraisals that end up covering everything one piece at a time (potentially with overlaps) do so by proceeding from start to finish or by starting from various different points. Especially for the visual rendition of text documents (and in contrast to the viewing of static images; see below), it should further be noted with regard to the microscopic structure of the appraisal process that it can be sufficient for individual textual signs in the text nodes in a visual rendition to be traversed once in the context of a reading, and that in this sense strict incrementality can pertain in a reading process.¹⁶
Definition 6. A (sub)document is strictly incrementally appraisable for a viewer in an information-extraction situation 𝑠 if there exists for a layout an order-preserving visual rendition on the basis of which the associated content portions can be ascertained completely and correctly by the viewer in 𝑠 in a non-overlapping appraisal process.
Human viewers do not always interpret in a strictly incremental manner—even for visual renditions of text, "it is by no means the case that the eye acquaints itself with every letter, or even every word."¹⁷ The appraisal processes treated here are not related to the actual sign perception of a (human) viewer; instead, they provide distinctive features that can now be used to characterize text-flows with a view to defining them.
5 Specific Readings with Prescribed Type and Direction
The appraisals discussed in section 4 are often bound to a specific identification of signs on the part of a viewer. For appraisal to be successful in this sense, the viewer needs to be "prepared" for a given (sub)document.¹⁸ Modeling the viewer
16 For a summary of an algebraic analysis of the interplay of object properties that takes a book and (reading) event properties as an example, see Krifka (2003, esp. 4). This analysis also shows that reading requires special treatment in event analysis with regard to its incrementality; we intend to put that into practice here. 17 “Die Augen tasten keineswegs jeden Buchstaben, nicht einmal jedes Wort ab” (Groß 1990, 236). 18 Parts of sections 5–11 are published in German in Schmidt et al. (forthcoming).
thus involves the competence of the entity that parts a text, for example, from its anchoring in a carrier material and is, in Kondrup's terms, responsible for the transition from a material text (Materialtext) to a text that is subjectively arrived at by a particular viewer (Realtext).¹⁹ In other words, appraisability is linked to assumptions about the viewers of a given document. In this sense, the following are assumed on the part of the viewer if an appraisal of a given document is to fulfill the correctness requirement:
– a set of types
– to which, in the appraisal process, tokens are assigned, optionally in a specified reading direction,²⁰
– in an 𝑛-dimensional display space (𝑛 > 0) and in one or more parts (such as lines).²¹
Let us take the flow 𝐹text,t1 = "Hello World" displayed on a monitor as an example. It can be read correctly by a viewer 𝑉 from left to right ("ltr") in one line ("l"); a three-part reading base Rbase for the viewer is given by:
Rbase(𝑉, 𝐹text,t1) = ({d, e, l, o, r, H, W} ∪ 𝑆, ltr, (l)). (1)
The non-empty set 𝑆 (for “space”) is a stock of types for whitespace. If we further assume for the reading base in (1) that a viewer can also employ the Latin-1 character set (ISO/IEC 8859-1 1998) and is prepared for the fact that a text can consist of more than one line (“l+”)—is able, that is to say, to handle line breaks—then
19 Working out the Realtext for a given Materialtext can be seen as an inverse problem; Kondrup also writes that “the more linear and ordered a text is, the easier it is for it to be parted from its material anchoring in one document and transferred to another—even into a different medium (by being reading aloud, for instance). Vice versa, spatiality and topography increase in significance the more a text has the status of a draft—the more underlining, emendations, alternative versions there are” (“ein Text sich desto leichter aus seiner materiellen Verankerung in einem Dokument lösen lässt und auf ein anderes—auch in ein anderes Medium (etwa beim Vorlesen)—transferiert werden kann, je linearer und geordneter er ist. Vice versa kommt der Spatialität oder Topographie umso mehr Bedeutung zu, je größer der Entwurfcharakter eines Textes ist: je mehr Unterstreichungen, Hinzufügungen, alternative Varianten vorhanden sind” (Kondrup 2013, 9). 20 For a specification for horizontal and vertical reading models, see in particular Davis et al. (2021) and Lunde and Ishii (2021). 21 This basic modeling of the viewer on the basis of type–token relationships does not include a (potentially introspective) component to cover the question of whether the knowledge for “better” readings is lacking. It also remains open how and under what conditions means of navigation can be integrated into a reading base.
Rbase(𝑉, 𝐹text,t1) = (Latin-1, ltr, (l+)) (2)
is also a sufficient reading base for 𝑉. Alternatively, units such as (arbitrarily defined) words can often meet the needs of a reading base for viewers, as in
Rbase(𝑉, 𝐹text,t1) = ({Hello, World} ∪ 𝑆, ltr, (l)) (3)
or—with the additional specific assumption that at least one line must be present but two lines can also appear ("l, l?")—in
Rbase(𝑉, 𝐹text,t1) = ({Hello, World} ∪ 𝑆, ltr, (l, l?)). (4)
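For illustration only, a three-part reading base of the kind given in (1)–(4) can be modeled as a simple data structure together with a completeness-and-correctness check; the Python names used below (ReadingBase, readable) are invented here and carry no systematic weight.

# Minimal sketch of a three-part reading base (types, direction, parts);
# the names and the check are illustrative assumptions, not the chapter's
# formal apparatus.
from dataclasses import dataclass

@dataclass(frozen=True)
class ReadingBase:
    types: frozenset     # stock of types: characters, words, ...
    direction: str       # e.g. "ltr", "rtl", "ttb"; may be left out (see section 7)
    parts: tuple         # e.g. ("l",) for one line, ("l+",) for one or more lines

S = frozenset({" "})     # a stock of whitespace types, as in the text

def readable(tokens, base):
    """Complete and correct reading in the sketch's sense: every token of
    the flow can be assigned to a type of the reading base."""
    return all(token in base.types for token in tokens)

# Reading base (1): character types for "Hello World", left to right, one line
rb1 = ReadingBase(frozenset("delorHW") | S, "ltr", ("l",))
# Reading base (3): word types instead of character types
rb3 = ReadingBase(frozenset({"Hello", "World"}) | S, "ltr", ("l",))

print(readable(list("Hello World"), rb1))       # True: character-level reading
print(readable(["Hello", " ", "World"], rb3))   # True: word-level reading
print(readable(list("Hello World!"), rb1))      # False: "!" has no type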
This transition from an "alphabet" to a "dictionary" exploits the double articulation of natural languages. With the alphabet alone, it is already possible for definitions of reading bases to make use of the fundamental schema of alphabetic writing, against which on the lowest level of articulation, the correctness or deviance of any articulation in alphabetic writing can be checked. On this foundation, orthographic registers make possible higher-level articulation schemata such as dictionaries or grammars. These can in turn be set alongside a text in alphabetic writing as a means of checking it; and indeed, they are applied as such in literacy practice to any text that has a degree of publicity that involves authority to any extent.²²
It is now possible to introduce the concept of legibility (which rests on strictly incremental appraisability and the use of a reading base) and, building on that, the concept of text-flows.
22 “fundamentale Artikulationsschema der Alphabetschrift, gegen das auf der untersten Artikulationsebene Korrektheit oder Abweichung jeder alphabetschriftlichen Artikulation geprüft werden kann. Auf dieser Basis werden mittels der orthographischen Register höherrangige Artikulationsschemata wie Wörterbücher oder Grammatiken möglich. Die können ihrerseits als Kontrollinstanzen einem alphabetschriftlichen Text entgegengestellt werden, und sie werden so auch in der literalen Praxis auch an jeden Text eines halbwegs verbindlichen Öffentlichkeitsgrades herangetragen” (Stetter 2005, 11). Stetter identifies four orthographical registers: the sequence of letters, whether compounds and other expressions are written as separate words or not, capitalization, and punctuation (see Stetter 1997, 52). The quotation continues immediately with: “The clarity of orthographic norms is the telos of a medium that is built on a basic digital schema and whose use is determined by normative reference to largely digitalized schemata such as alphabet-writing dictionaries and grammars” (“Das Telos eines Mediums, das auf einem digitalen Grundschema aufbaut und dessen Gebrauch durch den normativen Bezug auf weitgehend digitalisierte Schemata wie alphabetschriftliche Wörterbücher und ebensolche Grammatiken bestimmt ist, liegt in der Eindeutigkeit von Schreibweisen”).
Definition 7. A (sub)document is strictly incrementally legible for a viewer with his reading base in an information-extraction situation 𝑠 if there exists for a layout an order-preserving visual rendition on the basis of which the associated content portions can be appraised strictly incrementally by the viewer in 𝑠 as specified by the reading base.
Definition 8. A (sub)document can be represented in a normal text-flow for a viewer in an information-extraction situation 𝑠 if a layout in document order is possible in 𝑠 and the associated non-empty text nodes, of which at least one contains two signs or two contain at least one sign each, are strictly incrementally legible for the viewer in an order-preserving visual rendition.
We shall refer informally to the actual visual rendition of a document that can be represented in a normal text-flow for a viewer as a (normal) text-flow for that viewer. If the set of viewers is not relevant, we simply speak of a (normal) text-flow.²³ Representation in a normal text-flow generates, as the result of an imaging process, a legible text on a material medium (paper or monitor) as a material text for the assumed viewer.²⁴ Frequently, there will be many material texts—varying, for instance, with changes to line breaks due to changes to the display window—for a given normal text-flow, of which potentially only a small number will actually be realized.
The above definition is a strict one in the following respect: If, say, a document has two identical text nodes and they are swapped in the course of layout, it is not represented in a normal text-flow. The underlying order in the logical structure cannot be changed.²⁵ The sequence in the logical structure is thus invariant; in the case of pure text documents, a viewer can often deduce it on the basis of the normal text-flow.
It is not just texts in the everyday sense of the word that are covered by the definitions we have presented so far. Figure 4 on the next page shows a boustrophedon (i.e., the reading direction alternates, turning like a plowing ox) text-flow; with its specific reading direction, it visualizes Cantor's first diagonal argument.
23 Alternatively, it would be possible at this point to define a normal visual flow in general terms and to derive from that a normal text-flow specifically. For reasons of space, that is not done here.
24 Following Shillingsburg, we treat the material text here as a "union of linguistic text and document: a sign sequence held in a medium of display" (Shillingsburg 1997, 101).
25 This in turn corresponds to the "normal flow" standard position schema in Bos et al. (2011, sec. 9.4).
Fig. 4: Cantor's first diagonal argument (an array of fractional representations traversed boustrophedon-fashion, following the arrows; each newly read fraction carries its natural-number counter in parentheses, and entries whose value has already been read are marked with a dot).
The reading order of the numbers amounts to the mathematical assertion that the set of rational numbers, ℚ, and the set of natural numbers, ℕ, are equinumerous: it is clear that all the natural numbers (in parentheses) are used up as one moves, ox-like, between the fractions, and that no other numbers than the natural numbers are needed for the counting that accompanies this movement. There is thus a bijective mapping between the set of rational numbers and the set of natural numbers. In the visual rendition, the text-flow helps one to read oneself into what is going on, as it were, by following the arrows that are an aid to understanding the process. The strictly incremental reading of the text-flow is also significant here in terms of content, for a rational number from ℚ that has already been read cannot be counted again for an equivalent fractional representation of the same rational number. The reading base for the text-flow outlined in figure 4²⁶ has to contain the set of all fractional representations:
Rbase(𝑉, 𝐹text,Cantor) = ({𝑎/𝑏 | 𝑎 ∈ ℤ ∧ 𝑏 ∈ ℤ\{0}} ∪ 𝑆 ∪ {(, ), ⋅, →}, boustrophedonCantor, (l)). (5)
The inclusion of the fractional representations in the reading base also makes clear once again that the “meaning” of the signs to be read is not what is at stake in reading bases. Including ℚ instead of the fractional representations would be erroneous.
26 Image source: https://de.wikipedia.org/wiki/Cantors_erstes_Diagonalargument (25.10.2020).
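The boustrophedon reading direction of figure 4 can also be imitated programmatically. The following sketch is restricted, unlike reading base (5), to the positive fractions actually shown in the figure; it walks the anti-diagonals of the array, reverses direction on every other one, and counts only fractional representations whose value has not been read before, which reproduces the parenthesized counters (10) at 5/1 and (11) at 1/5.

# Sketch of the boustrophedon reading order of figure 4, restricted to the
# positive fractions shown there; already-read values are skipped, like the
# dotted entries in the figure.
from fractions import Fraction

def cantor_walk(n_terms):
    seen, count, s = set(), 0, 2           # s = numerator + denominator
    while count < n_terms:
        pairs = [(a, s - a) for a in range(1, s)]
        if s % 2 == 0:                     # reverse direction on every other diagonal
            pairs.reverse()
        for a, b in pairs:
            value = Fraction(a, b)
            if value not in seen:
                seen.add(value)
                count += 1
                yield count, f"{a}/{b}"
                if count == n_terms:
                    return
        s += 1

print(list(cantor_walk(11)))
# [(1, '1/1'), (2, '1/2'), (3, '2/1'), (4, '3/1'), (5, '1/3'), (6, '1/4'),
#  (7, '2/3'), (8, '3/2'), (9, '4/1'), (10, '5/1'), (11, '1/5')]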
Fig. 5: An acrostic; the marked lower-case letters are the only signs that can be identified by a viewer in the case of the constrained reading bases (8) and (9).
6 Specific Readings in Multiple Directions
In contrast to the previous examples, a greater freedom in the choice of reading directions may be appropriate for the viewer. This is the case, for example, with the acrostic introduced in section 3 and presented in figure 5.²⁷ In the first instance, the Greek alphabet (e.g., following ISO/IEC 8859-7 (2003)) with the "ltr" reading direction and specified line breaks is sufficient for a viewer here. First, let the following two possible reading bases for the horizontal text-flow of the first five lines ("l[1..5]") of 𝐹text,t2 be identified:
Rbase(𝑉, 𝐹text,t2) = (ISO/IEC 8859-7, ltr, (l[1..5])), (6)
Rbase(𝑉, 𝐹text,t2) = ({Ἰησοῦς, Χριστός, Θεοῦ, Υἱός, Σωτήρ}, ltr, (l[1..5])). (7)
The two bases differ in that in one of them, a (Modern Greek) character set is specified again, whereas in the other, part of a Greek lexicon, with diacritics, is specified. In the case of the second base, a complete and correct reading is achieved only if the viewer is able to ignore the diacritic marks in the reading base when applying it. In other words, an additional ability, which may require an explanatory comment (see below), is assumed on the part of the viewer.
Let us now take 𝐹text,t3 as the horizontal text-flow of the first three lines and cover both the "normal" reading direction and the alternative of being read from right to left ("rtl") and from bottom to top ("l[3..1]"):
Rbase(𝑉, 𝐹text,t3) = ({ι, χ, θ, υ, ς}, ltr, (l[1..3])), (8)
Rbase(𝑉, 𝐹text,t3) = ({ι, χ, θ, υ, ς}, rtl, (l[3..1])). (9)
27 Based on https://de.wikipedia.org/wiki/Datei:Ixtus.gif (13.10.2020).
A complete reading of the whole document is not possible with these last two bases. However, it is also a consequence of the reduction to three lines that a reading of 𝐹text,t3 with (8) and a reading of 𝐹text,t3 with (9) both lead to the same result, as can be seen from the blue marking in figure 5 on the preceding page.²⁸ In figure 5 on the previous page, switching to the upper-case letters {Ι, Χ, Θ, Υ, Σ} would permit a specific search for the acronym "ΙΧΘΥΣ," for example with:
Rbase(𝑉, 𝐹text,t4) = ({Ι, Χ, Θ, Υ, Σ}, ltr, (l[1..5])). (10)
The same reading outcome, albeit with the additional possibility of the acronym being identified as an acrostic in the first vertical column of letters, c1, would result if the reading base were to be confined to the upper-case letters {Ι, Χ, Θ, Υ, Σ} with a vertical reading direction ("ttb": top-to-bottom) by column ("c": column):²⁹
Rbase(𝑉, 𝐹text,t4) = ({Ι, Χ, Θ, Υ, Σ}, ttb, (c[1..7])). (11)
If, finally, we treat 𝐷ichthus, the complete document presented in figure 5 on the preceding page, as a pure text document (see section 4) and in the case of this document credit the viewer with knowledge of Greek letters in general and with the ability, with reading base (11), to read the specified upper-case letters vertically, the following possible reading base for the whole document results:
Rbase(𝑉, 𝐷ichthus) = (ISO/IEC 8859-7, ltr, (l[1..5])) ⊕ ({Ι, Χ, Θ, Υ, Σ}, ttb, (c[1..7])). (12)
For convenience, let the ⊕ sign here stand simply for an appropriate combined use of the summands, without going into further detail; let it further not be permissible to replace the five upper-case letters of the second component with “ISO/IEC 8859-7,” even though they are included in this character set (otherwise, the search request would be underspecified within this character set in the modeling of the viewer). Overall, (12) is a minimal representation, as a modelable reading base, of what can be expected in a person or device searching for this acrostic (and its permutations) in Greek texts. Remarks such as those in the previous paragraph can aid the understanding of reading bases. Provision should therefore be made for the optional inclusion of 28 From this point, it becomes possible to use such constraints (and other additional ones) to, for instance, start searching for palindromes of various types in a given document. Even the choice of a reading base affects how “meaningful” the palindromes to be found in such a process can be. 29 The “ttb” reading direction has been introduced only for the purposes of this example and is not related to discussions of writing modes such as that in Etema and Ishii (2019).
comments that explain the stipulations that a reading base contains. A comment 𝐶 on the full reading base for 𝐷ichthus might read: "This reading base is for illustrative purposes. It consists of two components with Modern Greek character sets, so as to avoid having to go into an Ancient Greek character set." Sometimes, comments on individual components may be worthwhile, for instance a comment 𝐶′ on the second summand: "The word ΙΧΘΥΣ in a unit set {ΙΧΘΥΣ} provides a minimal reading base with which to search for the acrostic; with {Ι, Χ, Θ, Υ, Σ} all selections from and all sequences of 'Ι,' 'Χ,' 'Θ,' 'Υ,' 'Σ' are permitted." The result:
Rbase(𝑉, 𝐷ichthus, 𝐶) = (ISO/IEC 8859-7, ltr, (l[1..5])) ⊕ ({Ι, Χ, Θ, Υ, Σ}, ttb, (c[1..7]), 𝐶′) (13)
This reading base represents part of the viewer knowledge assumed for a particular use, explained in part by the comments, of the document 𝐷ichthus . Recognizing the importance of reading bases in general is crucial to modeling an extraction of information if the modeling explicitly or implicitly describes, and ascribes to viewers, models for ascertaining the form of a document. That is always the case when criteria for determining form are provided by reception abilities in the reading culture addressed by a document. If an individual text-flow is replaced with sets of text-flows, reading bases can be specified or generated for entire text corpora on the basis of character sets, word lists, and so on, in order once again to set out complete and correct readings. In particular, extracts from word lists in a reading base can be used in indexes. Given that the producers of a document (including, e.g., writers and printers) are often also viewers of the document they make, the modeling of the viewer’s role in such cases also encompasses the producer’s role. A reading base Rbase can thus also be a writing base Wbase . Both, however, irrespective of what they may and may not have in common, always represent assumptions on the part of a modeler.
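As a toy counterpart to reading bases (11) and (13), the following sketch reads the first column of the acrostic from top to bottom with the five upper-case types; the five line-initial words are assumed from figure 5, and stripping the diacritics stands in for the additional ability credited to the viewer above.

# Illustrative column-wise ("ttb") reading with the types {Ι, Χ, Θ, Υ, Σ};
# the lines are assumed from figure 5 (only their initial words are used).
import unicodedata

lines = ["Ἰησοῦς", "Χριστός", "Θεοῦ", "Υἱός", "Σωτήρ"]
types_ttb = {"Ι", "Χ", "Θ", "Υ", "Σ"}

def base_letter(ch):
    """Strip diacritics (breathings, accents) and upper-case the result,
    standing in for the viewer's assumed ability to ignore diacritics."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).upper()

def read_column(lines, col, types):
    """Read column `col` top to bottom; only tokens that can be assigned to
    one of the given types are classified."""
    tokens = [base_letter(line[col]) for line in lines if col < len(line)]
    return "".join(t for t in tokens if t in types)

print(read_column(lines, 0, types_ttb))    # ΙΧΘΥΣ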
7 Specific Readings without a Specified Direction
In the case of the illustration of Cantor's diagonal argument in figure 4 on page 124 and the 𝐷ichthus document in figure 5 on page 125, specification of the reading direction is important for the reading, and thus potentially also the understanding, that is sought. This, however, is in many situations (including everyday ones) not the case, as will now be considered.
Fig. 6: German prohibition sign with accompanying text (basic sign: prohibition; symbol: car; supplementary sign below: "Sonn- u. Feiertage").
Use of the type–token relationship in reading bases to ensure completeness and correctness is not confined to character sets and word lists, but can also be
applied straightforwardly to non-linguistic components, as the following example shows. Figure 6³⁰ not only illustrates such non-linguistic components; if we set aside the supplementary sign, it also shows that strictly incremental type–token configuration of a reading, in which after each classified sign an unclassified one has to be read, does not have to result in a reading with a prespecified order. This allows reading directions to be left open in principle and can potentially place the sequence of the elements to be classified by the viewer at his discretion. It is for this reason that reading bases in general do not have to contain three parts: specification of the reading direction can be left out.
Let us illustrate this by taking a reading base—again with two components—for the traffic sign above, so as to fulfill the following functions: in the first component, the supplementary sign needs a suitable sign set of permitted characters (let this be Latin-1 again), which as usual are to be read in a strictly incremental manner; in the second component, let the main sign be read on the basis of the signs in the Straßenverkehrsordnung (the German traffic rules), without a direction being specified. The two can then be read together using the following reading base without any further stipulations being included (the ⊕ sign is thus commutative here):³¹
Rbase(𝑉, 𝐷noAutomobiles) = (Latin-1, ltr, (l+)) ⊕ (regulation signs following Anlage 2 zu § 41 Absatz 1 StVO, (c)). (14)
30 Based on http://www.fb10.uni-bremen.de/khwagner/grundkurs1/images/vzeichen10.gif (13.10.2020). 31 The Verwaltungsvorschrift zur Straßenverkehrsordnung (VwV-StVO) states that a maximum of three traffic signs can be attached to a single post; these signs can be viewed at https://de.wikipedia.org/wiki/Bildtafel_der_Verkehrszeichen_in_der_Bundesrepublik_Deutsch land_seit_2017 (13.10.2020).
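A two-component reading base along the lines of (14) can likewise be sketched; the stock of regulation signs is reduced here to a single invented placeholder entry rather than the actual StVO inventory, and the commutativity of the combination simply means that neither component's tokens have to be read before the other's.

# Sketch of a two-component reading base as in (14): a directed textual
# component for the supplementary sign and an undirected component for the
# regulation sign; the sign inventory is a placeholder, not the StVO list.
LATIN_1 = {chr(i) for i in range(256)}                  # stand-in for ISO/IEC 8859-1
REGULATION_SIGNS = {"prohibition sign for motor cars"}  # hypothetical single entry

def readable_with_components(tokens, components):
    """Every token must be classifiable by at least one component; since the
    combination is commutative, the components may be consulted in any order."""
    return all(any(token in types for types in components) for token in tokens)

document_tokens = list("Sonn- u. Feiertage") + ["prohibition sign for motor cars"]
print(readable_with_components(document_tokens, [LATIN_1, REGULATION_SIGNS]))  # True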
In document processing, the delineation of a segment without specifying a reading direction for viewers is often what lies behind the treatment of that segment as monolithic in the sense that fragmentation of it should be avoided as far as possible in all visual renditions.³² The visual rendition of images, as well as videos, is typically subject to this requirement—but tables or parts of tables are also often candidates for monolithic treatment, for instance if there is a desire to keep them together on a single page in a paginated output medium. In flat-surface outputs, individual images and videos are often treated identically in the intended display area. Correspondingly, individual images, for example in the guise of key frames, are often envisaged as a substitute for the integration of video segments (which it may not be possible to find or play back) into markup languages. In film editing, the lost silent classic London after Midnight offers an example of a film reconstruction made with the help of production stills (Schmidlin 2002). In general, however, exploitation of the monolithic status shared by objects of the image MIME type and objects of the video MIME type has already reached its limits in such undertakings. “Behind” the display area, videos can be fragmented along the time axis, meaning that the display area for the frames—but not the sequence of images shown in it—can be monolithic. It is then easily possible to leave viewing/reading directions unspecified and to dispense with strict incrementality for the viewer when it comes to the viewing of the frames. Viewing the sequence of images, on the other hand, is often subject to strict incrementality for that same viewer. Equally, an order can be imposed on the static “exegesis” of images (as in comics), such that the sequence of images that results likewise constitutes what is known as an image-flow; image-flows are always to be appraised in a strictly incremental manner.
8 Strictly Incremental Viewability in Dynamic Image-Flows
In what follows, "image-flow" refers to the flat-surface visual rendition of a (sub)document that consists only of content architectures of the image or video type. There is the potential for such a (sub)document to be visually rendered for a viewer in one monolithic segment or several monolithic segments using a suitable
32 The description of visually rendered parts of a document as monolithic follows the terminology in the CSS3 fragmentation module (see Atanassov and Etemad 2018).
Fig. 7: Four zoetrope strips and a zoetrope.
playback device with a moving-image effect—as with the four strips, each with four monolithic frames, on the left-hand side of figure 7.³³ In this historical example, it is only the level of visual rendition (static display on a flat surface or the use of a zoetrope) that determines whether the images in the sequence are used by the viewer all at once (as a static image-flow) or with a moving-image effect (as a dynamic image-flow).³⁴ In the dynamic case, the image-flow in the example meets the eye of the viewer directly. In both the historical viewing situation and a corresponding output on a computer screen,³⁵ a strictly incrementally viewable (sub)document is present. Accordingly, it can be stated:
Definition 9. A (sub)document with content portions of only the image and video MIME types can be represented in a normal dynamic image-flow for a viewer if it can be given a layout in basic order and the associated content portions, of which there is at least one with at least two frames or at least two with at least one frame each, can be viewed, strictly incrementally in the order specified by the content portions, by the viewer in a complete and monolithic order-preserving visual rendition of the frames.
The "normalness" included in the definition adds to the other requirements a placeholder for the inclusion of further conditions (e.g., further requirements of
33 The zoetrope strips on the left are from http://de.wikipedia.org/wiki/Einzelbild_%28Film% 29. The image on the right is from http://www.zeno.org/Brockhaus-1911/A/Zootrop?hl=zootrop, Zenodot Verlagsgesellschaft mbH, public domain (both 20.09.2020). 34 The terminology, including the distinction between static and dynamic image sequences, is based on Bateman (2013, esp. 67). An analysis of the moving image from the perspective of film theory that also uses the zoetrope example employed here can be found in chapter 5 of Bateman and Schmidt (2011). 35 There is an animated zoetrope at https://andrew.wang-hoyer.com/experiments/zoetrope/; see also http://www.youtube.com/watch?v=z--YZq68fmA (both 20.09.2020).
human perception),³⁶ which is not undertaken here. For a human viewer, even the viewing of a normal dynamic image-flow can deviate from normal human perception of one’s surroundings. Thus, for instance, in the case of the “cinematic image,” the viewer is presented with an “equally sharp surface at a constant distance”: “This does inhibit binocular convergence at various levels of depth (which means that distances cannot be calculated by the brain as they are in reality), but it also permits scanning by eye-leaps (saccades) on the part of the viewer and thus the ‘effet de fenêtre.’”³⁷ The minimal case in the definition, where only two frames are to be visually rendered, is realized in the thaumatrope. The thaumatrope actually depends on more than just a normal dynamic image-flow—but it is easy to define a thaumatropic image-flow by requiring, in addition to a normal image-flow, that an afterimage effect is ensured on the part of the viewer and that the two thaumatropic frames go together. In addition, there is a basic prohibition on changing either the layout sequence of the content portions or any sequence that may be anticipated in them for the visual rendition of the frames. Thus, taking an animated GIF as an example, no alterations can be made to the predetermined sequence of frames. An animated GIF is also an example of the possibility of dynamic image-flows outside the video MIME type. Finally, the completeness requirement excludes the possibility of a fast-forwarding effect (through the playback of only selected frames), for instance, in the representation of a normal dynamic image-flow. Normal text-flows with their strictly incremental legibility (see definition 7 on page 122) differ from normal dynamic image-flows with their strictly incremental viewability because it is not generally possible to assume for a set of viewers a correctness criterion for the viewing of frames analogous to a type–token relationship.³⁸ In those cases where such a criterion is present—in the visual rendition of text in films, for instance—strictly incremental legibility may also be present; it is subordinated to the order of the frames (in which further effects such as transitions can be employed) specified by the content portions.
36 This concerns the faculty of movement perception, for instance; see again Bateman and Schmidt (2011, 132). 37 “Kinobildes”; “gleichmäßig scharfe Fläche in konstanter Entfernung”; “Diese verhindert zwar die binokulare Konvergenz in verschiedenen Tiefenebenen (was dazu führt, dass Entfernungen nicht wie in der Wirklichkeit im Gehirn errechnet werden können), erlaubt aber ein Abtasten durch Blicksprünge (Sakkaden) der Zuschauerin und damit den ‘effet de fenêtre’” (Kirsten 2007, 152). 38 Correctness criteria for the appraisal of individual images can be found in chapters 7 and 8 of Schmidt (1999).
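As a sketch of the requirements bundled in definition 9, the following check accepts a playback only if it shows every frame of every content portion (completeness, so no fast-forwarding), keeps the frames of each portion in their internal order, and keeps the portions in their layout sequence; the frame identifiers are invented for the example.

# Illustrative check of the order and completeness requirements of
# definition 9 for a dynamic image-flow; frame names are invented.

def is_normal_dynamic_image_flow(playback, content_portions):
    """playback: frames in the order in which they are shown;
    content_portions: frame lists of the portions in layout (basic) order."""
    expected = [frame for portion in content_portions for frame in portion]
    return list(playback) == expected

strips = [["g1", "g2", "g3", "g4"],    # e.g. one zoetrope strip of four frames
          ["g5", "g6", "g7", "g8"]]    # and a second strip placed after it

print(is_normal_dynamic_image_flow(
    ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"], strips))  # True
print(is_normal_dynamic_image_flow(
    ["g1", "g3", "g5", "g7"], strips))                          # False: fast-forward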
Switches between viewability and legibility need to be identified on a case-by-case basis and with reference to the viewer.³⁹ In this sense, a "gallop alphabet" is conceivable for the example in figure 7 on page 130, to be used analogously to a conventional text-ticker and allowing the phases represented in the four "gallop strips" to be classified as tokens; the result would be that in the four strips (S1, S2, S3, and S4) of a document 𝐷zoetrope, properly arranged one after the other, a gallop would be visually rendered in tokens. This, however, needs to be set out in a suitable reading base for the viewer in order to make the fact that they are related to types—a "fundamental property of symbolic representations"—explicit.⁴⁰ If this does not occur, there are no corresponding types for the viewer, meaning that there is an empty set in the first component of a "reading" base for the dynamic image-flow in the 𝐷zoetrope example above:
Rbase(𝑉, 𝐷zoetrope) = (∅, ltr, (l[1..4])). (15)
Omitting to specify a type for the signs visually rendered in the 4 × 4 arrangement leads here to “viewing” by the user being assumed. In our example, the possibility of a Realtext for 𝐷zoetrope is thereby also dropped. If someone really believes that a Realtext can be identified (as a secret or previously unnoticed message, say), the modeling of this particular part of the reading base for 𝐷zoetrope would need to be modified.
9 Ordering and Structurability in Dynamic Image-Flows
Independently of the type–token relationships of the frames, a dynamic image-flow can have a preferred order; indeed, that order can often be worked out for a given set of images. For the example in figure 7 on page 130, it would be possible to deduce it from the image content, for instance: given sufficient (hippologically informed) engagement with the strips and sufficient viewability of the frames, an earlier mix-up of two of the sixteen images on the four strips, say, could be repaired. But this work would take place on the level of the image sequence. Furthermore, analogously to a four-line poem consisting of only a single sentence, this image sequence can be considered structured in the sense that it is only these four strips, each a complete subcomponent, that can be used in the zoetrope.
Fig. 8: Peter Tscherkassky’s visualization of his artwork Motion Picture.
Even if the individual frames are viewable, such formal work on the sequence level is excluded if the frames do not give any, or sufficient, indications of how they are to be arranged. As an example, figure 8⁴¹ contains an artwork called Motion Picture by Peter Tscherkassky; it was created using a single frame from the early Lumière film La Sortie des Ouvriers de l’Usine Lumière à Lyon. Tscherkassky writes on its genesis:⁴² I marched into the darkroom and mounted fifty 16 mm strips of unexposed film stock onto the wall, vertically covering a surface of 50 × 80 cm in total. Onto this blank cinematic canvas I projected a single frame from Workers Leaving the Lumière Factory (1895) by the brothers Lumière. I processed the exposed filmstrips and subsequently arranged them on a light table to form a 50 × 80 cm duplicate of the original Lumière frame. I then edited the filmstrips together, starting with the first strip on the left, and proceeding to the right.
As in the earlier acrostic with letters to be followed vertically, then, we find here sequences of images in “ttb” order in fifty columns (c1–c50 ). These fifty vertical image sequences exist on the one hand as an artifact with a light box behind them, as depicted in figure 8, such that the set of images can be viewed incrementally without a direction having been specified, like the single frame from the early Lumière film. On the other hand, there is also a version (i.e., output) that envisages strictly incremental viewing: the image sequence produced by placing 41 With kind permission from Peter Tscherkassky. 42 The following quotation is taken from https://www.kunst-der-vermittlung.de/dossiers/ fruehes-kino/bildbeschreibung-motion-picture/ (13.12.2020).
the columns one after the other—let the corresponding document here be called 𝐷MotionPicture—can be played back as a dynamic image-flow (Tscherkassky 1984). A corresponding imaging of 𝐷MotionPicture does not allow the normal human viewer to find a substitute for the ordering information in the light-box representation: the two-dimensional "solution to the puzzle" is lost. There are therefore once again two different viewing possibilities, as in the example of the 𝐷ichthus acrostic; the second of them, however, functions in itself like an arbitrary sequence (and not like an acrostic).
In contrast to 𝐷zoetrope, the frames in 𝐷MotionPicture do not as such place any constraints on the neighboring images to which they are adjoined; it is only additional information in the guise of the Lumière film image that was used that allows an image to be assigned a position in the image sequence. No preferred order follows from the given image set without knowledge of the underlying film image. This is also apparent in the modeling of the reading base for 𝐷MotionPicture. Given the making-of remarks by the artist quoted above, it seems reasonable to dispense with any specification of types.⁴³ A corresponding explanation can be provided by using a commentary 𝐶 to refer to, or include verbatim, the text by Tscherkassky that was quoted: "Tscherkassky writes on the genesis of the film: 'I marched […].'"⁴⁴ With this background knowledge, analogously to 𝐷zoetrope, the following Rbase results for the sequence of frames in 𝐷MotionPicture:
Rbase(𝑉, 𝐷MotionPicture, 𝐶) = (∅, ttb, (c[1..50])). (16)
The third component specifies how many different formal elements are present— here: sequences, to be viewed incrementally, of images. This meets the needs of the viewer, and is thus suitable for a reading base for 𝐷MotionPicture , only if those formal elements can also be identified by the viewer, as was the case with the lines in 𝐷ichthus or 𝐷zoetrope . In 𝐷MotionPicture they cannot, and so this specification of formal elements cannot be retained. Changing it, however, also entails changing the specification of reading direction. Using a temporal axis in a positive direction
43 In the spirit of the argument so far, it is of course conceivable for viewers of this image sequence to decompose the frames into a set of sets of copies. This decomposition could even be used to orient a reading process around type–token relationships if for every image, as for letters, it is possible to specify what "copy type" it belongs to. Analytical decisions come into play again here: is it merely a case of serialization of an image set here, or is there more involved? In the extreme case, the latter might even lead to assuming the possibility of secret messages.
44 This commentary also provides an aid to understanding that might need to be supplemented with the single frame from Workers Leaving the Lumière Factory.
(“t+”), as suggests itself for moving-image data, and with the images from all fifty combined strips arranged correspondingly in a line, the result is: Rbase (𝑉, 𝐷MotionPicture , 𝐶) = (∅, t+, (l)) .
(17)
The triple “(∅, t+, (l))” still formally represents a reading base for a viewer 𝑉 of 𝐷MotionPicture , to which 𝐶 has been added for the purposes of clarity. The fact that this triple models almost nothing is a result of the fact that—again, as with the zoetrope—no assignment to types is anticipated on the frame level. The first component of the triple reflects this. In addition, though, no formal elements are identifiable for 𝑉 on the level of the image sequence, as the third component of the triple reflects. The image sequence in (l) is now no more than a container in which the frames can be permuted arbitrarily. In particular, 𝐷MotionPicture can no longer be considered a structurable document by the viewer above the level of the individual frames.
10 Structurable Documents of the Video Type
The art video in 𝐷MotionPicture is treated here as a document that cannot be structured further. Many video segments, though, also contain analogues to the text-only structures discussed in section 3, such as lines or paragraphs; thus, these analogues in the field of moving images can be analyzed as shots and the macrostructures that group them in turn together (such as, for example, scenes and sequences; see below). These macrostructures are subject to a medium-specific logic and are, indeed, liberally used in the creation of filmic narratives in practice.⁴⁵ In analytical work, this brings us to the structural level of a moving-image data set and its mapping onto a layout—in the case of a "tape-like" output, to mise-en-chaîne. In the context of the film medium, such work is distinct from work on frames in that questions of contiguity can play a significant role.⁴⁶ Fundamentally, the contiguity of shots can be constrained by relations of content that do not have a visual correlate, unlike (still analytical) work within the image sequence of a shot. Such work builds on visual fit at the right point in a (normal) dynamic image-flow,
45 A detailed analysis of narrative structures can be found in Bateman and Schmidt (2011). Descriptive structures are analyzed in Schmidt (2008, 137–189).
46 André Gaudreault distinguishes three groups of codes for treating a filmic document: (1) mise-en-scène codes (what is filmed?), (2) mise-en-cadre codes (how is it filmed?), and (3) mise-en-chaîne codes (how is it presented in the context of the film?). What follows is concerned only with the last two groups. On all this, see Beil et al. (2016).
Fig. 9: A title shot.
Fig. 10: A closing shot.
and it is only in the case of the first or last frame in a shot that the fit can potentially not be constrained by frames before or after it. Text-flows and dynamic image-flows have so far largely been treated separately. In the spirit of example-based argumentation, a document that employs both will now be considered again—the silent film The Girl and Her Trust from the early days of film history, which was introduced earlier. This film by Griffith from 1912 is preserved as a document with (usually) 140 shots,⁴⁷ which fall into four different types. These types are: 1. Title shot. This type is instantiated in the “front matter”; in the layout Griffith chose for The Girl and Her Trust, it is positioned accordingly as the first shot, S1 , of the whole document (see figure 9). 2. Closing shot. This type is instantiated in the “back matter”; in the layout Griffith chose for our example film, it is positioned accordingly at the end of the whole document, in shot S140 (see figure 10). 3. Text insertions in the main part (“body matter”) of the film, intended to be used in terms of types and tokens. In some cases, it may be necessary to distinguish between generic parts and specific parts. The example film contains nine text insertions in shots S2 , S12 , S36 , S43 , S65 , S78 , S94 , S97 , and S106 . The text at the top, together with the line and logo graphic elements, belongs to the generic part of the document (analogously to a slideshow, one could say that this represents the master slide); the rest is specific and has its own text in each case. Shot S2 from the example film is presented in figure 11 on the facing page. 4. Shots in the main part of the film whose content consists of photographic representations of states of affairs in time and space. In the example film, these are all the shots from S1 to S140 not classed under types 1–3 (129 in total). The first five shots of this type, S3 , S4 , S5 , S6 , and S7 , are presented in figure 2
47 There are minor variations in the transmission history; cf. Etling (forthcoming).
Fig. 11: A text insertion (generic parts at the top and bottom, the specific part in between).
Fig. 12: Shots in time and space.
on page 117; the conflict between Grace and Jack that arises in these opening shots is resolved in a happy ending in the last two shots of this type, S138 and S139 (see figure 12).
It is in principle possible that further types exist, but for the example film and the purposes of this chapter, this inventory is sufficient to outline the space of a dispositif in which filmmakers can operate (see below). In theory, the 140 total shots in the example film can be given a layout in 140 ⋅ 139 ⋅ 138 ⋅ … ⋅ 2 ⋅ 1 = 140! (a 242-digit number) different ways. We already have two shots that have a prominent place at the beginning and end, and that identify the other shots between them as belonging to film as a form of presentation. This does at least reduce the set of possible layouts by a factor of 140 ⋅ 139 (almost 20,000). But we are still in the first instance facing 138! (a 237-digit number) possible filmic derivatives of The Girl and Her Trust that can be made by combining the remaining "content" shots.
Let the document 𝐷TheGirl be divisible without overlaps into 𝐷TheGirl.frontmatter with S1 and 𝐷TheGirl.backmatter with S140, and the main part 𝐷TheGirl.body. As already indicated in section 2, architectures of the image type in the true sense are not treated in the present chapter. Thus, the generic part of the type-3 shots, which
all contain the same graphic elements, is not considered here. With this caveat for the type-3 shots, let a distinction be made for the analysis of the subdocument 𝐷TheGirl.body between a 𝐷TheGirl.body.text part for reading and a 𝐷TheGirl.body.video part for viewing. The specific texts in the type-3 shots in the film can be read individually with Latin-1 (analogously to the supplementary sign in the traffic sign in figure 6 on page 128) and together in the given sequence:
Rbase(𝑉, 𝐷TheGirl.body.text) = (Latin-1, t+, (S2, S12, S36, … , S94, S97, S106)). (18)
Type-3 shots can often be slotted in easily with the help of content-based arguments. In addition, as textual insertions they are often (but not always) self-sufficient in the structure of a film and do not form wider structures within it. That is the case in The Girl and Her Trust, meaning that the above modeling is sufficient. Analogously to the previous filmic examples, the type-4 shots are a priori not to be understood in terms of a type–token relationship. In contrast to the two previous examples of moving-image data (the four cyclically usable image sequences for the zoetrope and the fifty filmstrips in 𝐷MotionPicture ), however, there is more to do here when it comes to specifying constraints for structuring the 129 type-4 shots in 𝐷TheGirl.body.video . These remaining shots are part of a very large layout space, the use of which can also be subject to considerable constraints as a result of the logical structuring of the shots (just as in the case of texts). In general, document parts that instantiate type-4 shots in a filmic document can be expected to satisfy a medium-specific syntagmatics with a spatiotemporal semantics. Above the basic logical objects, a logical structure for the type-4 shots can combine several (at least two) shots into ordered representations of spatiotemporal units as narratively basic (see Bateman and Schmidt 2011, 212) scenes and sequences. Scenes here have a spatially and temporally coherent diegesis; sequences also have a spatially coherent diegesis but display at least a temporal gap in their diegesis.⁴⁸ In the tree structure of a filmic document, both function as composite logical objects that can be cut together in the (default) layout without their order being abandoned (see also definition 2 on page 118). For the 129 shots of The Girl and Her Trust that instantiate type 4, ten sequences, not including the final pursuit, that are interleaved in the layout can be identified. This interleaving can be treated mathematically. The partition function P(𝑛) gives the number of possibilities for breaking a given positive integer 𝑛 down into positive integral summands. That is exactly what we are doing when we consider 48 For formal definitions, see Bateman and Schmidt (2011, 295–297).
for a given number of shots how many possibilities there are for dividing them up into narratively basic individual shots, scenes, or sequences. If (on whatever basis: screenplay fragments, identification of prohibited image sequences in censorship certificates, etc.) we know or can estimate the number of type-4 shots, we can narrow down their possible montage combinations (mise-en-chaîne), potentially quite considerably.
Let us work through an example with small numbers to illustrate the combinatorial effect. In principle, a film with ten type-4 shots has 10! = 3,628,800 different layouts. Further, there are P(10) = 42 different partitions, and thus in the narratively basic case 42 different possibilities for dividing the shots into single shots, scenes, and sequences. The partitioning assumed has a considerable influence on the set of possible solutions when it comes to calculating, or at least estimating, the possibility space for the montage. If the whole film consists of only a single scene or sequence, there is only one (default) layout. If of the P(10) = 42 different partitions we take the five two-part partitions (e.g., to present a dialogue in the shots), there are, entirely apart from content-related considerations (!), two equally large scenes or sequences in the case of 5 + 5, and the film has only 10!/(5! ⋅ 5!) = 252 different possible (default) layouts; in the case of 6 + 4, there are only 210; in the case of 7 + 3, only 120; in the case of 8 + 2, only 45 different (default) layouts. The 9 + 1 case means that a free-standing shot is positioned before, between, or after the other nine shots, which are bound into a single scene or sequence; there are then only 10 possible (default) layouts.
Thus, despite the weaker initial logical conditions for the layout (see definition 2 on page 118 in contrast to definition 1 on page 117), work specifically on the mise-en-chaîne of a film can still be subject to very strict constraints. Macroscopic work on the form of document trees that structure sequences of shots is thus not fundamentally any less complex than macroscopic work on the form of document trees that structure plain text. For sequences of shots, as for texts, "higher-level" prescriptions can be used in the generation and analysis of the document parts and their transformation into an output.⁴⁹
49 For texts, see Stetter (2005, 11). An example of a filmic prescription: two scenes that together form a sequence can only be separated in such a way in the layout that no shot in the first scene is positioned after a shot in the second one.
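The combinatorial figures used above can be checked directly. The sketch below computes the partition function P(n) with a standard recurrence and the number of (default) layouts for a two-part partition as a binomial coefficient; both are textbook formulas and not specific to the chapter.

# Recomputing the figures of section 10: P(10) = 42 partitions and the
# layout counts 252, 210, 120, 45, 10 for the five two-part partitions.
from functools import lru_cache
from math import comb, factorial

@lru_cache(maxsize=None)
def partitions(n, max_part=None):
    """Number of ways to write n as a sum of positive integers <= max_part."""
    if max_part is None:
        max_part = n
    if n == 0:
        return 1
    return sum(partitions(n - k, k) for k in range(1, min(n, max_part) + 1))

print(factorial(10))               # 3628800 unconstrained layouts
print(partitions(10))              # 42 partitions of 10
for a, b in [(5, 5), (6, 4), (7, 3), (8, 2), (9, 1)]:
    # the shots of each scene or sequence keep their internal order, so the
    # number of layouts is the number of interleavings: 10! / (a! * b!)
    print(a, b, comb(10, a))       # 252, 210, 120, 45, 10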
11 Summary
This chapter has examined document parts that are to be appraised strictly incrementally by viewers. To this end, token sequences and image sequences were characterized in a way that makes it possible to treat features of those sequences in the same way but elements of the sequences in a manner that accounts for their differences. The sequence features are associated with text-flows on the one hand and image-flows on the other.
For the most part, text-flows visually render content of the text MIME type; the associated producing action is fundamentally guided by type–token relationships, with strictly incremental legibility in informational use as the goal. For the most part, image-flows visually render content of the image and video MIME types; the associated producing action can also be guided by type–token relationships (e.g., in text inserted into films) with strictly incremental legibility as the goal, but it often has to manage without such relationships. In the latter case, production on the level of (individual) images strives for viewability. For a dynamic image-flow that includes such individual images, the producing action is again directed at strictly incremental viewability in a preferred order, if such an order can be assumed. Furthermore, if image sequences can be divided up into segments (e.g., shots), for instance on the basis of spatiotemporal diegesis, such segments can be subject to additional constraints that may place considerable limits on how the document structure is converted into a layout (see section 10), analogously to the use of higher-level specifications such as dictionaries or grammars for texts (see section 5).
12 Further Perspectives: On Dispositifs and the Legacy of Cinema and Typographeum
We have described strict incrementality both for the legibility of "classic" text documents and for the viewability of (moving-)image documents; we have also seen what they have in common when it comes to the conversion of logical structure into visual renditions. Where media history is concerned, this raises the question of whether the same parallelism is apparent in the use of distinct historical document platforms such as book and film, not least given that these are prominent representatives of, respectively, paged and continuous output as distinguished in digital
document processing.⁵⁰ With this question in mind, we conclude by turning—without making any claim to systematic comprehensiveness—to Foucault's idea of the dispositif. As has been said: "All media establish a dispositif with a particular order."⁵¹ If this is true at least of "successful" media, in the context of this chapter we can ask what it means today for the book as a media platform with book-reading as its core use and for the film as a media platform with film-viewing as its core use.
Let us take the well-known dispositif of cinema as a starting point. This "dispositif binds film and spectators, and is involved in the effect of the film. In the case of cinema, the situation can be characterized as follows: stationary spectating subjects are arrayed in a relatively dark room facing a large screen onto which images are projected, originating from a device that is (not visible to them) installed behind their heads."⁵² In building on this reasonable characterization of film-viewing as a core use of film, a caveat in the Foucault-Handbuch should be addressed first: "in media theory, in particular film and television theory," there has been "contamination between Foucault's [dispositif] and Jean-Louis Baudry's […] cinema-related concept of a dispositif,"⁵³ which has essentially meant that the role of the filmmaker who controls the dispositif has been passed over.
The resultant model does, though, retain the basic structure in which "an 'objective' pole of instruments and topoi (a mechanical complex, a spectrum of resources) has its counterpart in a 'subjective' ordering pole (emblematically, a military strategy). The subject pole here thus involves the subjectivity of the person who 'disposes over', i.e., that of the person who controls the dispositif, the strategist, the holder of power." For the cinema, this means the filmmaker. "This subjectivity of the master, though, has been curiously set aside in reconstructions to date in favor of the exclusive subjectivity of those in his thrall, less abstractly: the film spectator in a dark room, a manipulated person, in military terms the soldier. But dispositif-analysis nonetheless needs to distinguish between the 'ordering' subjectivity of those who dispose over others
50 See the terminology set out in CSS, for example: https://drafts.csswg.org/mediaqueries-4/#continuous-media (07.01.2022) 51 “Alle Medien konstituieren eine dispositive Ordnung” (Kirsten 2013). 52 “Dispositiv verbindet Film und Zuschauer und hat Anteil an den Wirkungen des Films. Im Fall des Kinos lässt sich das Arrangement wie folgt charakterisieren: immobile Anordnung der Zuschauersubjekte in einem relativ dunklen Saal vor einer großen Leinwand, auf die Bilder projiziert werden, die von einem Apparat stammen, der (für die Zuschauer unsichtbar) hinter ihren Köpfen installiert ist” (Kirsten 2013). 53 “Medien-, speziell in der Film- und Fernsehtheorie”; “Kontamination zwischen Foucaults und Jean-Louis Baudrys […] kinobezogenem Dispositiv-Begriff” (Link 2014, 238).
and the ‘ordered’ subjectivity of those who are disposed over. This distinction is crucial to Foucault’s version of the concept.”⁵⁴ Where the ideas drawn out in the present chapter are concerned, the preceding pages have addressed this contamination by treating those who are disposed over and those who dispose over them equally in the digital view of documents. For those who dispose over others in particular, the following applies along the lines of the Gaudreault terminology already used earlier: their subjectivity can manifest itself only in the use (1) of the mise-en-scène codes, which do not need to be analyzed at all here given that they generally have no further influence on document contents after shooting; (2) of the mise-en-cadre codes for producing content portions that are to be viewed strictly incrementally and that can, when output today, be selectively rendered visually in various ways for those who are disposed over; and (3) of the mise-en-chaîne codes for representing a tree-like logical structure in an output, whereby up until the time of visual rendition all possible rearrangements can in principle be carried out and any representation of a tree structure in an output (worked through illustratively in section 10 above) can be subject to restrictions. If we turn to the book, for digital document processing we arrive at, among other things, the mise-en-page codes and the underlying logical structure. In digital document processing, the legacy of the typographeum as the “totality of all of the institutions of book printing with movable type”⁵⁵ manifests itself centrally in the use of prespecified or bespoke logical book languages and the associated book-compatible stylesheets that are designed for book reading as a core use.⁵⁶ For this core use, text-reading is but one element and is generally distinct from 54 “Einem ‘objektiven’ instrumentellen Topik-Pol (maschineller Komplex, ‘Klaviatur’) steht ein ‘subjektiver’ Verfügungs-Pol (am prägnantesten eine militärische Strategie) gegenüber. Mit Subjekt-Pol ist dabei also die Subjektivität des ‘Disponierenden’ gemeint, d.h. die des Verfügenden über das Dispositiv, des Strategen, des Mächtigen”; “Diese Subjektivität des Herren ist nun in den bisherigen Rekonstruktionen eigenartigerweise ausgespart geblieben zugunsten der alleinigen Subjektivität des Knechtes, konkret des Filmzuschauers im dunklen Saal, des Manipulierten, militärisch gesprochen des Soldaten. Dennoch muss die Dispositiv-Analyse die Verfügungs-Subjektivität der Disponierenden und den ‘verfügten’ Subjektivitäten der Disponierten unterscheiden. Dieser Unterschied ist für Foucaults Fassung des Begriffs wesentlich” (Link 2014, 238). 55 “Gesamtheit der Einrichtungen des Buchdrucks mit beweglichen Lettern” (Wikipedia 2022). 56 Let us mention just two prominent examples in this non-technical context. Developed over several decades, the DocBook book language mentioned earlier, which provides a very rich descriptive language for the logical structure of books, has been widely adopted in the structural legacy of the typographeum; cf. https://tdg.docbook.org/ (11.01.2022). Even at an early date, there were book-oriented style specifications in the CSS style language; from 2005, it was possible to use them in HTML dialects for the production of printed editions; see https://alistapart.com/article/boom/ (11.01.2022).
viewing images and scrutinizing tables in a book; but there is ample provision for that as well in the document languages for producing books. If we consider media products from the perspective of ensuring the human extraction of information from documents, we realize that there are only a few requirements for the extraction of information where documents are to be output visually and on a flat surface. For the flat-surface case of the visual rendition of a document on a printed page (the print CSS media type), incremental viewability (e.g., for an individual image or a table) and strictly incremental legibility are the typical production goals when it comes to ensuring the extraction of information. The typographeum dispositif is what historically provides the essential basis for the establishment of strictly incremental legibility; the cinema dispositif additionally provides the essential basis for the establishment (including by creative means) of strictly incremental viewability. With this, the tools for reading or viewing use that have developed historically are assembled ready for modern-day screen use (the screen CSS media type) and for the first three MIME types (text, image, and video). (Strict) incrementality turns out to be a central category of the dispositif order for those who are disposed over and those who dispose over them in typographeum and cinema; it defines not only the textual and filmic traditions but also modern-day screen use in legibility or viewability. The screen media type will be the benchmark for all media output in the foreseeable future. Whatever dispositifs it might develop in the future are not the concern of the present study. But in any event, it has no difficulty in handling the (strictly) incremental core—examined in the present chapter—of the ways of dealing with the underlying flows that have been developed in the cinema and typographeum dispositifs; and it is by nature suitable for handling the tree-oriented structuring of those flows into content portions and their transformation into visual renditions.
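The distinction drawn above between paged and continuous output, and between the print and screen CSS media types, can be made concrete with a small stylesheet sketch. The following fragment is purely illustrative and is not taken from this chapter or from the specifications cited; the selectors and values are assumptions chosen for the example. It shows how one and the same logical structure can be given a continuous rendition for scrolling screen use and a paged rendition for print, where fragmentation constraints of the kind discussed above come into play.

```css
/* Illustrative sketch only: one logical document, two output classes. */

/* Continuous output (screen media type): a single unbroken flow. */
@media screen {
  body   { max-width: 40em; margin: 0 auto; }
  figure { break-inside: auto; }   /* no page boundaries to respect */
}

/* Paged output (print media type): the flow is fragmented into pages,
   so constraints on where breaks may fall become relevant. */
@media print {
  @page  { size: A4; margin: 2cm; }
  h1     { break-before: page; }   /* start a section on a fresh page */
  figure { break-inside: avoid; }  /* keep an image and its caption together */
}
```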
Bibliography

Atanassov, Rossen and Elika J. Etemad. CSS fragmentation module level 3. W3C candidate recommendation, W3C, December 2018. URL: https://www.w3.org/TR/2018/CR-css-break-3-20181204/, (13.10.2020).
Bateman, John A. Multimodality and Genre: A Foundation for the Systematic Analysis of Multimodal Documents. London: Palgrave Macmillan, 2008.
Bateman, John A. Multimodal analysis of film within the GeM framework. Ilha Do Desterro, 64(1), 2013. DOI: https://doi.org/10.5007/2175-8026.2013n64p49.
Bateman, John A. and Karl-Heinrich Schmidt. Multimodal Film Analysis: How Films Mean. New York, London: Routledge, 2011.
Beil, Benjamin, Jürgen Kühnel, and Christian Neuhaus. Studienhandbuch Filmanalyse: Ästhetik und Dramaturgie des Spielfilms. pp. 20–22. 2nd edition, Munich: Brill Fink/UTB, April 2016.
Berglund, Anders. Extensible stylesheet language (XSL) version 1.1. W3C recommendation, W3C, December 2006. URL: http://www.w3.org/TR/2006/REC-xsl11-20061205/, (25.10.2020).
Bos, Bert, Tantek Çelik, Ian Hickson, and Håkon Wium Lie. Cascading style sheets level 2 revision 1 (CSS 2.1) specification. W3C recommendation, W3C, June 2011. URL: https://www.w3.org/TR/2011/REC-CSS2-20110607/, (25.10.2020).
Branigan, Edward. Narrative Comprehension and Film. New York: Taylor & Francis, 1992.
Burnard, Lou and Syd Bauman. TEI P5: Guidelines for electronic text encoding and interchange. 2.8.0. 06-apr-2015. Technical report, TEI Consortium, April 2015. URL: https://tei-c.org/Vault/P5/2.8.0/doc/tei-p5-doc/en/html/, (19.01.2022).
Davis, Mark, Aharon Lanin, and Andrew Glass. Unicode bidirectional algorithm. Unicode Technical Reports. Standard Annex #9, version 14.0.0. 2021. URL: https://www.unicode.org/reports/tr9/tr9-44.html, (25.10.2020).
Devlin, Keith J. Logic and Information. Cambridge: Cambridge University Press, 1990.
Etemad, Elika J. and Koji Ishii. CSS writing modes level 4. W3C candidate recommendation, W3C, July 2019. URL: https://www.w3.org/TR/2019/CR-css-writing-modes-4-20190730/, (25.10.2020).
Etling, Fabian. Modellierung kritischer Filmeditionen: Eine Annäherung am Beispiel der Varianzen im überlieferten Material zu D. W. Griffith’s The Girl and Her Trust. In: von Keitz, Ursula, Wolfgang Lukas, and Rüdiger Nutt-Kofoth, editors, Kritische Film- und Literaturedition. Perspektiven einer transdisziplinären Editionswissenschaft. Berlin: De Gruyter, forthcoming.
Goodman, Nelson. Languages of Art: An Approach to a Theory of Symbols. Indianapolis: The Bobbs-Merrill Company, 1968.
Groß, Sabine. Schrift-Bild. Die Zeiten des Augenblicks. In: Tholen, Georg Christoph and Michael O. Scholl, editors, Zeit-Zeichen. Aufschübe und Interferenzen zwischen Endzeit und Echtzeit. Weinheim: VCA Acta humaniora Verlag, 1990.
ISO/IEC 10646. Information technology—universal coded character set (UCS). Technical report, Standard, International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, December 2020.
ISO/IEC 8613-2. Information technology—open document architecture (ODA) and interchange format—document structures. ITU-T recommendation T.412, Standard, International Telecommunication Union (ITU), Helsinki, March 1993.
ISO/IEC 8859-1. Information technology—8-bit single-byte coded graphic character sets—part 1: Latin alphabet no. 1. Technical report, Standard, International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, April 1998.
ISO/IEC 8859-7. Information technology—8-bit single-byte coded graphic character sets—part 7: Latin/Greek alphabet. Technical report, Standard, International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), Geneva, October 2003.
Kirsten, Guido. Claude Baiblé, Dispositivtheoretiker. Montage AV, 16(2):147–156, 2007.
Kirsten, Guido. Dispositiv. In: Wulff, Hans Jürgen, editor, Das Lexikon der Filmbegriffe. Kiel: Universität Kiel Institut für Neuere Deutsche Literatur und Medien, 2013. URL: https://filmlexikon.uni-kiel.de/doku.php/d:dispositiv-7749, (07.01.2022).
Kondrup, Johnny. Text und Werk – zwei Begriffe auf dem Prüfstand. Editio, 27(1):1–14, 2013. DOI: https://doi.org/10.1515/editio-2013-002.
Krifka, Manfred. Wie man in fünfzehn Jahren einige semantische Probleme löst. Ringvorlesung Linguistische Fehlargumentationen. HU Berlin, 2003. URL: http://amor.cms.hu-berlin.de/~h2816i3x/Talks/TimeSpanScope.pdf, (16.06.2020).
Link, Jürgen. Dispositiv. In: Kammler, Clemens, Rolf Parr, Ulrich Johannes Schneider, and Elke Reinhardt-Becker, editors, Foucault-Handbuch: Leben – Werk – Wirkung, pp. 237–242. Stuttgart, Weimar: J. B. Metzler, 2014. DOI: https://doi.org/10.1007/978-3-476-01378-1_27.
Lunde, Ken and Koji Ishii. Unicode vertical text layout. Unicode Technical Reports. Standard Annex #50, Version 14.0.0. 2021. URL: https://www.unicode.org/reports/tr50/tr50-26.html, (25.10.2020).
Martens, Gunter. Was ist ein Text? Ansätze zur Bestimmung eines Leitbegriffs der Textphilologie. Poetica, 21(1–2):1–25, 1989. DOI: https://doi.org/10.30965/25890530-0210102002.
Posner, Roland. Was ist Kultur? Zur semiotischen Explikation anthropologischer Grundbegriffe. In: Landsch, Marlene, Heiko Karnowski, and Ivan Bystrina, editors, Kultur-Evolution: Fallstudien und Synthese, pp. 1–65. Frankfurt am Main: Peter Lang, 1992.
Schlupkothen, Frederik and Karl-Heinrich Schmidt. ‘Commentary’ and ‘explanatory note’ in editorial studies and digital publishing. In: Nantke, Julia and Frederik Schlupkothen, editors, Annotations in Scholarly Editions and Research: Functions, Differentiation, Systematization, pp. 351–371. Berlin, Boston: De Gruyter, 2020. DOI: https://doi.org/10.1515/9783110689112-016.
Schmidlin, Rick. London after midnight. Film, Turner Classic Movies, Atlanta. [1927], 2002.
Schmidt, Karl-Heinrich. Texte und Bilder in maschinellen Modellbildungen. Tübingen: Stauffenburg Verlag, 1992.
Schmidt, Karl-Heinrich. Wissensmedien für kognitive Agenten. Bonn: Infix-Verlag, 1999.
Schmidt, Karl-Heinrich. Zur chronologischen Syntagmatik von Bewegtbilddaten (III): Deskriptive Syntagmen. Kodikas/Code, 31(3–4):217–270, 2008.
Schmidt, Karl-Heinrich, Frederik Schlupkothen, Britta Reppel, and Laura Rehberger. Zur Edition strikt inkrementeller Flows in Dokumenten: Text und (tonloser) Film. In: von Keitz, Ursula, Wolfgang Lukas, and Rüdiger Nutt-Kofoth, editors, Kritische Film- und Literaturedition. Perspektiven einer transdisziplinären Editionswissenschaft. Berlin: De Gruyter, forthcoming.
Shillingsburg, Peter. Resisting Texts: Authority and Submission in Constructions of Meaning. Ann Arbor: University of Michigan Press, 1997.
Stetter, Christian. Schrift und Sprache. Frankfurt am Main: Suhrkamp, 1997.
Stetter, Christian. System und Performanz: Symboltheoretische Grundlagen von Medientheorie und Sprachwissenschaft. Weilerswist: Velbrück Wissenschaft, 2005.
Tscherkassky, Peter. Motion Picture (La Sortie des Ouvriers de l’Usine Lumière à Lyon). DVD 08, 3:23 minutes, black and white, silent. ARGE INDEX, Vienna. 1984.
Wikipedia. Typographeum. Wikipedia, The Free Encyclopedia. 2022. URL: https://de.wikipedia.org/w/index.php?title=Typographeum&oldid=185403408, (09.01.2022).
John A. Bateman
A Semiotic Perspective on the Ontology of Documents and Multimodal Textuality

1 Introduction

As seen running through all the contributions to this collection, the notion of ‘document’ continues to raise deep questions concerning just what the referent of the term might be. Positions range from including material objects (Otlet 1934), to the strictly formal, sometimes even reduced to strings of (abstract) characters (Renear and Wickett 2010), to the heavily sociocultural, where documents, whatever the term is taken to include, are seen as playing a crucial role in the performance and maintenance of ever more complex social activities (e.g., Brown and Duguid 1996; Schamber 1996; Smith 2012, 2014). Problematic relationships between ‘documents’ and a further, rather ill-defined cloud of terms, such as ‘text,’ ‘work,’ and so on, do little to clarify the situation. Many of these by now quite longstanding debates (cf. Otlet 1934, 1990; Briet 1951) receive a new urgency in the context of the movement towards digital artifacts (Buckland 1998; Renear and Dubin 2003). This is of particular importance for areas such as the digital humanities and editorial studies, where the terms occupy core defining positions within those disciplines’ understandings of themselves and their activities; for a broad overview, see, for example, Lund (2009). But the challenges are considerable. Some have even suggested that the terms are so fluid and variable in use that providing definitions is in any case of doubtful value. For ‘text,’ for example, McGann (1991, 16) states unequivocally that “what is textually possible, cannot be theoretically established,” while Huitfeldt (1997) suggests that defining ‘text’ is not the most beneficial way to begin in any case since any definition will depend on what one wants to do. Longstanding debates with respect to ‘documents’ and the utility of definitions circulate as well (cf. Frohmann 2009). Leaving core terms undefined appears inadequate, however. As Sahle (2013) argues in considerable detail in the context of digital editions, tighter understandings of what one is dealing with are essential in order to characterize just what activities are being performed and to evaluate both the methods developed and those methods’ products.
John A. Bateman, University of Bremen https://doi.org/10.1515/9783110780888-007
The present contribution will take up this task of definition from a rather different perspective enabled by recent developments in the theory of multimodal semiotics and multimodality (cf. Bateman et al. 2017). It will be argued that many of the disagreements in long debates in the digital humanities and text encoding traditions may be traced directly to weaknesses in the notions of ‘text’ employed. Many current concerns with the materiality of documents together with those documents’ extensions both into digital contexts and beyond written language can receive rather natural treatments from the broader semiotic foundation offered by multimodality theory. Indeed, the close engagement of multimodality theory with meaning-making and communication beyond the specifics of any particular form of expression (such as written language) offers much for clarifying difficult notions such as ‘text,’ ‘materiality,’ ‘digital materiality,’ ‘works,’ ‘markup,’ ‘editions,’ ‘transcriptions,’ and their interrelationships. In places, this will show similarities and parallels with some of the more sophisticated views of these constructs currently discussed within several disciplines, including editorial studies. Of particular relevance in this respect is the approach to ‘text’ taken up in Sahle’s (2013, 8, 45) construction of a ‘text wheel,’ where six basic facets of ‘text’ are defined, and, of course, the view of documents from three perspectives proposed by the Roger T. Pédauque collective that forms the overarching scaffold for this collection (cf. Pédauque 2006 and relevant chapters of the current volume). Pédauque defines documents in terms of their perceptible ‘form’ (abbreviated as what is seen: vu), their ‘content’ (abbreviated as what is ‘read’: lu), and their function and use in social contexts (abbreviated as what becomes ‘known’: su). The collective also considers in some detail the consequences of digitization and ‘digital documents’ for their categories. In all of these areas, it will be argued that an appeal to a broader multimodal semiotics may support a more systematized anchoring of the necessary relationships between the constructs constituting texts and documents, allowing a critical reappraisal of the general state of discussion. The chapter begins by briefly reviewing certain aspects of the discussion of documents and text as it has unfolded within the digital humanities and editorial studies. In central focus here will be the re-emergence and general commitment to questions of textuality and materiality, and their interrelationships—particularly within digital contexts. This will provide the starting point for a review of some of the more formal models of documents and texts that have been proposed, which will in turn feed into a discussion of approaches attempting to provide even more formal clarity undertaken within the field of formal ontology. The state of this discussion will be summarized and several weaknesses and lines for further development established. Both the more document-oriented and the ontological approaches will be argued to exhibit relatively weak semiotic foundations. The
final parts of the contribution then construct an explicit connection to the state of the discussion within multimodality studies so as to suggest a beneficial cross-fertilization of concerns for a more robust and applicable formal characterization of the core concepts involved. On the basis of this, a multimodally-grounded characterization of the notion of ‘work’ will emerge, offering a complementary view on this central target of editions research.
2 Documents, Texts and Materiality

As a preliminary orientation for what follows, it will be useful to reiterate some important strands of theorizing concerning the nature of text, particularly as employed within the digital humanities. Here we pick out two rather different directions of development. The first, drawing on the development of ever more complex schemes for textual annotation and digital curation, progressively moves the notion of ‘text’ away from materiality—that is, in Pédauque’s terms, a shift in focus from the perceptible (‘vu’) to the content (‘lu’). Materiality has always played a role in theoretical accounts of documents, ranging from considerations of the inclusion of physical objects of quite diverse kinds to the actual physicality of printed materials. The second direction arises partly as a reaction against the first, since precisely the process of transferring information into digital form forces an increased awareness of those aspects of ‘texts’ that are not then readily transferred—an obvious case being all aspects of the materiality of the artifacts considered. In addition, and shared across both of these lines of thought, is the challenging question of interpretation, and the relationships of ‘texts,’ however construed, with their users. These discussions cannot be reproduced in detail here, but the broad brushstrokes of the divisions emerging will be important below in our evaluation of the potential contributions of multimodality theory.
2.1 Digital Renditions of Texts and Text Encoding

Whereas the nature of ‘text’ has been discussed in many disciplines over the years, the debate took on a particular slant with the need to place the concept within a digital context as it is often unclear to what extent the earlier constructs retain their utility. This was also a central concern in the discussions undertaken by the Pédauque group. Making documents and text accessible for computationally-mediated use has gone through several phases of development and there is still a perhaps surprising
number of open questions concerning quite fundamental issues. On the one hand, the long taken-for-granted material nature of texts under study was no longer tenable, while on the other hand, the introduction of digital representations of texts based on annotations of various kinds suggested that perhaps materiality was not so important in any case. Looking at the current state of the discussion can, however, readily give the impression that the various sides in the debate have agreed to differ rather than achieving a reconciled understanding. As will be argued further below, these disagreements can largely be traced to weaknesses in the semiotic underpinnings of the basic constructs required. The move to digitized environments and making texts and documents accessible within such environments led to critical treatments of several positions on ‘annotations,’ i.e., the issue of how information can be provided beyond treatments of texts as bare strings of characters—which was, and sometimes still is, how many researchers and practitioners see what the ‘computational medium’ provides (cf. Huitfeldt 1997). Four broadly chronological phases of annotation are usefully distinguished by Renear (1997). These are identified as orthographic, image-based, format-oriented and content-based. Renear, and much of the text encoding community, have now largely settled on content-based annotation as the option of choice and so this is where we will begin. Content-based annotation (Pédauque’s lu) includes a variety of structuring information largely drawn from traditional notions of text and document structure and genre. Thus text as a sequence of characters might be additionally structured into chapters, sections, paragraphs if a book, or into verses and lines if a poem, and so on. ‘Form’ issues (Pédauque’s vu), such as typography, are generally factored out as non-essential. Although such a division already raises a host of challenges, the approach is well established and underlies the broadest and most widely accepted set of annotation specifications currently, that provided by the Text Encoding Initiative (Vanhoutte 2004; Text Encoding Consortium 2021). ‘Annotation’ is then adopted as one means of attempting to retain information deemed necessary or useful beyond that of a bare ‘character string’ representation of some text or document, but just what such annotation may or should include remains a point of controversy. Indeed, the division between information distinguished by ‘content’ and ‘non-content’ has stood behind much of the ensuing debates shaping the field. First, it is by no means clear in general what must be included as ‘content.’ And second, any distinction of this kind establishes a difference between what is being ‘digitized’ and the resulting digital object simply by virtue of the fact that ‘something’ has been (more or less deliberately) excluded. Discussions of these issues are generally considered to be far more than ‘merely’ technical in nature. Indeed, the early proposals for effective annotation schemes also weighed in on rather more fundamental issues concerning the nature
of ‘text’ as such. Several authors involved with the definition of annotations consequently suggest that models of annotation are themselves also models of text and textuality. While partially understandable because of the goal of capturing in annotations what was to be considered ‘essential’ to any text represented, positions of this kind also raise substantial problems. DeRose et al. (1990), for example, started with the claim that the favored view of annotations as strictly ordered hierarchies of content objects (OHCO) was definitional of text—i.e., a text is an OHCO. This position was soon forced to adapt to allow several concurrent such hierarchies for any single ‘text’ considered because there are typically multiple useful descriptions of texts with claims to being ‘essential’ that are not automatically structurally aligned with the units of other, equally well justified, hierarchies. Consequently: “we now know that the breaking of strict hierarchies is the rule rather than the exception” (Durand et al. 1996, 68). A single ordered hierarchy of content objects was therefore replaced by sets of such hierarchies, each hierarchy corresponding to a view of particular kinds of content-objects, such as genre structures, syntactic structures, prosodic structures, and so on. The content–non-content distinction remains, but widened to consider a greater range of potential structural organizations. Renear (1997) labels the assumption, or claim, that the OHCO-viewpoint was capturing what texts actually are as Platonism: The Platonistic view is that texts simply are hierarchical structures of certain sorts of objects—and specifically, of editorial objects such as chapters, titles, paragraphs, stanzas, and the like. (Renear 1997, 117)
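To make the hierarchical view just described, and the problem of hierarchies that refuse to nest, more concrete, the following fragment sketches a content-based encoding of a closing couplet. The element names loosely follow TEI conventions, but the fragment is illustrative only and is not drawn from the Guidelines or from any of the works cited; the attribute used to join the sentence fragments is an assumption made for the sake of the example.

```xml
<!-- Illustrative sketch only: one ordered hierarchy of content objects,
     poem > stanza > verse line. -->
<lg type="stanza">
  <l>So long as men can breathe or eyes can see,</l>
  <l>So long lives this, and this gives life to thee.</l>
</lg>

<!-- A second, equally well-justified hierarchy (syntactic sentences) does not
     nest inside the first: the couplet is a single sentence spanning both verse
     lines. Because XML elements must nest, that sentence can only be split into
     fragments (or reduced to empty boundary markers), so the strict hierarchy
     is broken. -->
<lg type="stanza">
  <l><seg type="sentence" part="initial">So long as men can breathe or eyes can see,</seg></l>
  <l><seg type="sentence" part="final">So long lives this, and this gives life to thee.</seg></l>
</lg>
```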
This style of definition naturally pre-configures the kind of entities that it can include: i.e., the class of ‘texts’ considered comprises just those where this form-content division appears (relatively) unproblematic. It remains open to what extent that class may turn out to be rather small or even empty, rather than covering everything that we might want to consider as ‘text.’ The move to accepting multiple such hierarchical descriptions, all of which may nevertheless be considered constitutive of ‘text,’ Renear (1997) then characterizes as Platonic pluralism. The origins of such descriptions are generally seen to lie within particular disciplinary perspectives and interests; thus: Pluralism explicitly recognized the critical role that disciplinary methodologies, theories, and analytic practices play in text encoding. (Renear 1997, 122)
Renear nevertheless seeks to maintain the ontological claim of the original approach, arguing for pluralistic realism—that is, the various perspectives taken towards texts are still genuinely properties or features of some entities called texts.
These properties or features are, at least potentially, objective descriptions of those entities of investigation. A broad division enters into the discussion at this point between those who maintain Renear’s commitment to realism and those who favor instead the dissolution of such realist claims. The latter orientation Renear labels ‘anti-realism,’ broadly associated with more constructivist, humanistic, or post-modern views of meaning. Here evaluation criteria are related explicitly to the aims and purposes of setting out an annotation, rather than to capturing what is considered the real ‘essence’ of some object of study. Within such a perspective, ‘texts’ are not independently existing entities in any case and so there are no ‘inherent’ or ‘intrinsic’ properties to discover: the essential question is not about a true representation, but: Whom do we want to serve with our transcriptions? Philosophers? Grammarians? Or graphologists? What is ‘correct’ will depend on the answer to this question. And what we are going to represent, and how, is determined by our research interests […] and not by a text which exists independently and which we are going to depict. (Pichler 1995, 690) Texts are not objectively existing entities which just need to be discovered and presented, but entities which have to be constructed. (Robinson 2009, 45)
A potentially unbounded relativism is avoided here by aligning more with a constructivist pragmaticism in which communities of practice may well come to significant agreements concerning just what constitutes ‘appropriate’ or ‘useful’ descriptions (cf. Huitfeldt 1997). The positions in this debate are summarized in some detail by Biggs and Huitfeldt (1997). Interesting parallels can be drawn here with the state of the discussion concerning ‘transcription’ in linguistics; developments there offer much for conceptualizing the tasks of annotation in an editorial context as well. Originally, rather similarly to early discussions of annotation schemes being ‘correct’ with respect to the object annotated, transcription was intended to ‘fix’ fleeting linguistic events. For example, prior to ready access to recording devices, researchers of conversational interaction would make detailed written transcripts of what was said and how, and those transcriptions would then form the main body of data analyzed. But any ideas that such transcriptions could replace the data at hand were dealt a deathblow in the seminal article by Ochs (1979), where it was shown unequivocally that even linguistic transcription—with its claims of objectivity—necessarily entails theoretical decisions and interpretation. This realization then also surfaces in discussions of annotation and digital editions. Sahle (2013, 331), for example, quotes the following:
A neutral description is, we suggest, merely one in which the set of interpretative distinctions made happens to coincide with the set of such distinctions most people would wish to make most of the time; an analytic or interpretative transcription is one in which the set of distinctions made is peculiar to some specific analytic goals or agenda. (Greenstein and Burnard 1995, 144)
The question as to whether such descriptions are ‘real’ in some sense is then left to the philosophical positions adopted by the individual analytic schemes involved. Moreover, just which position is presented in the discussions as the more convincing often depends on the kinds of activity being undertaken; the ‘preexisting’ text might appear natural in more straightforward cases of digitization, whereas more complex editorial activities, such as those set out in Pichler’s (2021) discussion of the diverse fragments of materials making up Wittgenstein’s Nachlass, are considerably more challenging for the ‘pre-existing text’ view. Doubting the existence of some object to be described when considering text descriptions is naturally more aligned with areas of study where that ‘object’ is not straightforward to find. In editions research and practice, considerable historical and practical work may be involved even to make the assumption that there is some singular ‘text’ to be described in the first place. Within such contexts the simple view that there ‘is’ some object that is receiving descriptions of an ‘objective’ nature comes under considerable pressure. Problematic across the entire debate, however, is the very notion(s) of ‘text’ exchanged. In fact, it is highly doubtful that ‘text’ is being used the same way across the diverse contributions. For some, the ‘content’ component of a text has already been restricted to strings of alphanumeric characters, which fails as soon as there are more visual components, such as the truth-tables and deliberately ambiguous drawings found in Pichler’s Wittgenstein case (cf. Biggs and Huitfeldt 1997, 357). For others, it is evident that a far broader range of information is expected—Pédauque’s ‘form’/vu characterization, for example, is already more in line with the account of perceptible materiality to be approached below from the perspective of multimodal semiotics. Moreover, even for alphanumeric strings, it is not always possible to guarantee that the linguistic content is fixed as various kinds of ambiguity, even intended ambiguity, may be present. Other researchers argue that one of the essential components of text must surely be the ‘meaning’ of the text, which should then receive some kind of annotation according to the content-form division—but since meaning is a matter of interpretation and interpretations can vary dramatically, this would indicate that there could be an unlimited set of potential annotations and so may be taken as a reductio ad absurdum for content-object based annotation at all (cf. Biggs and Huitfeldt 1997).
There is considerable lack of clarity here, perhaps best epitomized in the discussions of whether a text still exists if there is no one left who can understand it. Thus, in answer to the question: “Does a text contain knowledge if there is no one around to read it?” (Biggs and Huitfeldt 1997, 361), it is suggested that there should be two clear outcomes. On the one hand, realists must answer ‘yes’ because ‘text’ is an independently existing entity; while, on the other hand, ‘anti-realists’ should answer ‘no’ since, as Robinson describes it, For, text without human perception is just marks on paper, or sounds in the air. These marks and sounds only become text when we find meaning in them. (Robinson 2009, 45)
Renear’s position, in contrast, remains firmly oriented towards the veridical. Even though, as Huitfeldt makes clear: there are no facts about a text which are objective in the sense of not being interpretational. (Huitfeldt 1997, 237)
Renear argues that this fails to establish any grounds for abandoning realism. Just because the production of descriptions involves interpretation, this does not necessarily entail that those descriptions are ‘not real’ or ‘not objective.’ Renear sees any equation of the presence of interpretation with an alleged impossibility of objectivity as harking back to a quite unwarranted “positivism.” To show this, he presents the following example to make the point that it matters that an annotation characterizes the object of annotation in some fairly rigorous fashion: Suppose a transcription has nothing at all to do with the text but helps the researcher win a prize. In such a case a (false) transcription would serve the researcher’s interests quite well, but no one would claim that it is thereby a reasonable encoding, or one which is to be in any sense commended as a transcription. (Biggs and Huitfeldt 1997, 355)
Although the example is perhaps reasonable at first glance, Robinson (2009, 46) quotes it and declares himself ‘astonished’—for Robinson, judging the annotation negatively by virtue of the motives of the transcriber is strictly off-limits, smacking of a false ‘intentionality fallacy.’ Further, he suggests that if the transcription won the prize then the judges must have had some good reason for choosing it. The transcription is then unlikely to be completely worthless and so obviously has some claim to ‘validity’: “Someone found it useful; therefore it is useful.” Consequently, he concludes, “for us, ‘anti-realism’ is not a contention. It is a simple description of what we do when we encode” (Robinson 2009, 46). But this misses the point of Renear’s example. Renear stipulates that the annotation has “nothing at all to do with” the text, and so can only be considered to be
winning the prize on some other grounds than adequacy as a transcription. The question is more whether it is possible to distinguish between annotations that have nothing to do with a text, and those that do, in some manner that is stronger than a simple community agreement to attribute particular interpretations rather than others. And this is where it again becomes necessary to be far more specific about what constitutes ‘texts.’ There are clearly many interpretations that one can make of something identified as a text (including that identification in the first place) and some of these may well be useful. The realist position, however, needs to make the stronger case that there will be some interpretations whose theoretical status involves a far more intimate connection with the text being annotated. In short: those annotations need to be revealing (to the extent that a theory has developed) of characteristics that are actually properties of the text analyzed. These are properties that will hold necessarily even when there is no one who can interpret them; they are constitutive of the object analyzed. This strong realist position can be given further support by linking it more closely to a semiotic characterization as we will see below. Here, however, we can already note that properties of this kind must have certain quite specific semiotic statuses to function in the way required. First, they need to be indexical (in Peirce’s sense) with respect to their objects. And second, they need to be diagrammatic (again in Peirce’s sense) with respect to their objects. The kinds of difficulties and disagreements circling in the debate described are strong symptoms that this kind of precision is necessary. Indeed, it becomes increasingly evident that we need to find definitions and theoretical placements for the notion of ‘text’ that take us further in two key respects: first, these placements need to be sufficiently robust to move beyond suggestions that texts are simply collections of hierarchically organized content objects; and second, it is equally important to move beyond the idea that the selection of representations is primarily a matter of utility. This can be shown further in, for example, Robinson’s pronouncement on the basis of conclusions drawn from his own earlier practical editorial work that: […] our transcripts are best judged on how useful they will be for others, rather than an attempt to achieve a definitive transcription of these manuscripts. (Robinson 2009, 45)
Semiotically, this position conflates several issues because several very different kinds of utility need to be differentiated. Even though descriptions (including transcriptions) can be used for a variety of purposes, the lesson from linguistic transcription is that an important subclass of such descriptions can be beneficially pursued as revealing, i.e., making visible, properties of the object annotated. If the object changes, then the descriptions change accordingly (indexicality); and structural organizations and relationships exhibited by the object can be made
visible in detail by corresponding transcriptions (diagrammaticity)—although transcriptions are nevertheless (always and necessarily) abstractions (interpretations) with respect to the described object rather than re-renderings. This offers as a by-product a position more inclusive of researchers who critique the idea of hierarchical annotation on the basis that their objects of analysis are simply not hierarchical (e.g., Huitfeldt 1997; Schmidt 2010). A diagrammatic representation abstracts certain properties from its object: the utility of OHCO-based annotations from this perspective, i.e., when considered diagrammatically, is that they constitute statements that such structural organizations may be derived from the object of analysis, not that the object of analysis is ‘identical’ to those descriptions. Moreover, the very diagrammaticity of some transcriptions combines both interpretation and objectivity: a diagram is always an abstraction, and so has to be constructed, which can be considered a minimal notion of interpretation; but limiting the abstraction to diagrammaticity (i.e., exhibiting ‘secondness’ in Peirce’s terminology) anchors the interpretation to the object of analysis in a manner quite removed from ‘subjective’ interpretation. This is stronger than the position suggested by many that interpretations are saved from idiosyncratic variation by communities of agreement (Huitfeldt 1997) because it commits to a particular practice of scientific method as well—generally that already entailed in Peirce’s notion of pragmaticism in science as a social activity. This separates out the process of finding agreement on interpretations on specific issues from that of a general method of finding agreement at all. The notion of ‘objective truth’ rejected by Robinson is then, indeed, revealed by a semiotic consideration to be reminiscent of positivism, just as Renear stated over a decade earlier. Interestingly, however, even though the theoretical constructs and configurations Robinson appeals to provide very little support for the move, Robinson is still in several respects forced in this direction in any case. For example, after strongly arguing for the necessity of an anti-realist position with respect to transcription and annotation, he admits: If one is constrained in transcription only to record those phenomena for which one feels there is a clear use, then one might end up with impoverished transcripts and with editions serving only rather rigidly determined needs. Who is to determine what these needs are? We do not know, and can only guess at, the various uses our transcripts might serve. (Robinson 2009, 47) But there are times—many in fact—when we find ourselves thinking: we do not know who is to use this text, or how they will use it, but there is something here in the text which seems important, and which we will therefore encode. Here, we part company with Pichler, towards something nearer Renear’s realist model. We find ourselves thinking: what is the text saying at this point? Even if we cannot think of a use for this information, or even a transparent way of encoding it, we feel bound to try to encode what it is saying, somehow. (Robinson 2009, 47)
In short, Robinson sees his activity of providing an edition as bound to the ‘text’ and what the text is taken to be saying: “the text does have an independent existence; it is saying something; we are not just ‘constructing’ an artifact for the use of our readers, but we are trying to interpret an utterance” (Robinson 2009, 47). This is simply not compatible with his earlier statements of the centrality of utility. The theoretical configurations available to Robinson are evidently quite insufficient to bear the strain of even his own practice and theory. Finally, in a more recent development around this debate, Pichler (2021) also attempts to reaffirm the anti-realist position by bringing an explicitly more philosophical turn to the ontological question of texts and textuality. He begins by usefully distinguishing properties of annotations, such as assumed conformity to hierarchically organized content units, from the objects of descriptions, for example, ‘texts.’ As noted above, Pichler makes it very clear that one can produce useful descriptions of entities that do not share all the properties of those entities and so demanding equivalence of object and description—as embodied in the ‘text is OHCO’-view—is unwarranted. There is no reason to follow the line of development started by DeRose and colleagues that annotations are necessarily an image of textuality at all. With this false conflation out of the way, Pichler continues by seeking to uncover a stronger view of the nature of ‘texts’ drawing substantially on his experiences with complex projects, such as the Wittgenstein Nachlass. In projects of this kind, it is difficult to maintain any assumption of some text pre-existing the editorial activity because of the numerous and medially diverse fragments needing to be brought together. To develop this view, Pichler employs standard ontological distinctions drawn between events, objects and properties and attempts to position ‘text’ against this backdrop to “find out what sort of entities texts could be on a general level” (Pichler 2021, 17). As he notes, this is what Renear and colleagues were attempting from the outset but now approached ontologically rather than from the standpoint of annotations. Starting from the more concrete and observable, Pichler considers ‘writing’ as a form of basic action that can be viewed in terms of its physical movements. This action has a result, the ‘written.’ Pichler considers such action results to be ‘documents,’ an ontological jump that we will need to come back to and reappraise below as it by-passes much of the long discussion of what a document might be. Documents are then differentiated from ‘texts’ by linking the former to the physical traces alone, without meaning or understanding, and the latter to interpreted or understood traces. This is an important distinction, which, regardless of labeling, we will see more of below. In contrast, ‘text,’ for Pichler, is essentially concerned with meaning. The action of producing writing and the action of producing texts are considered fundamentally and ontologically distinct. Writing, as a physical activity, may be produced by one agent (and potentially also without meaning);
‘texting,’ a term that Pichler introduces to designate the action of producing a ‘text,’ may only be produced by actions of writing and reading, and so is (often) not under the control of a single agent. Consequently: while writing produces a finite and rather stable result (namely documents), texting does not; rather it produces an instable and potentially continuously ongoing, endless and open-ended result. (Pichler 2021, 18)
Written documents are then characterized ontologically as objects, as are the ‘carriers’ of written documents (“papers, trees, stone, pergament etc.”). Documents and carriers are necessarily concrete, material objects, and so positioned at the perceptible vu-pole in Pédauque’s characterization. The nature of ‘text’ within Pichler’s ontological view is naturally more complex. Given the very different identity conditions exhibited by a more material view of writing and any ‘texts’ that result, Pichler comes to the conclusion that ‘text’ might be better conceived as events—in particular, the event of constructing an interpretation, which may, in turn, be an activity shared by many agents and be distributed across time and space. This has considerable merit as a position and has also been argued in a more materialist vein, for example, by Drucker (2011, 18). This allows Pichler to strongly reaffirm the anti-realist position described above in the following terms, where the understanding of material, i.e., Pédauque’s lu, takes center stage: text is something which cannot exist without being sustained by an act of reading with understanding. A text that loses the understanding reader will fall back on pure document level and cease to exist as a text. […] This position at least, I hope, should not be controversial, at least if one agrees with the principle that signs have meaning because they are furnished with meaning by humans, and that reading with understanding is thus meaning structure constituting rather than merely meaning and structure depicting […] (Pichler 2021, 22)
It is significant that Pichler here, at last, draws explicitly on at least a mention of ‘signs,’ albeit in a rather restricted sense. Pichler uses this position to draw out further the necessary distinction between text encoding and texts ‘themselves,’ emphasizing how encoding should be seen as a record of paths of understanding, “a protocol of perception, mapping and interpretation” as he cites from a presentation by Sahle (2015), rather than a depiction of what a text ‘is’ independently from any such process. ‘Text’ is then the understanding of physical traces and is never entirely present at one time (since one is always ‘in’ a process of interpretation rather than observing a finished product), rendering it an activity ontologically. Bearers of texts are then whatever entities support the activity of ‘texting,’ including humans reading or hearing the physical traces. This is certainly a valuable philosophical move,
but still shows an underdeveloped account of the semiotics involved. How can traces support such activity? While more detailed semiotic and ontological models for several of the key constructs are already available and have been applied to conceptualizing editorial practices, editions and text from the perspective of the digital humanities (cf., particularly Sahle 2013; Pierazzo 2015), this must now be taken further. A more robust view of text, textuality and the practices of annotations and constructing editions needs to be capable both of moving beyond relatively language-centered objects of interest and of specifying just what kinds of properties traces will need to have in order to be capable of participating in such complex activities as ‘texting.’ Both steps will be necessary for non-linguistic elements such as diagrams, images, etc. to be covered—forms which, as Sahle (2013, 355) notes, have been severely marginalized by the ‘string of symbols,’ one-dimensional view of representation and annotation that some consider inherent to the ‘medium’ of computers (cf. Huitfeldt 1997). To move beyond this, we will need to draw on more recent results in the theory of multimodal semiotics that grant both digital and non-verbal ‘textual’ phenomena appropriate support.
2.2 Materiality Strikes Back

Alongside the development of ever more complex markup schemes and discussions of the status of their annotations as sketched in the previous subsection, a further parallel track of debate also began early on to question more closely the kinds of information that were being targeted. It was argued that important facets of any works so treated were simply not being addressed, placing the already problematic content-form division under increasing strain. Robinson (2009) points to the fact that, particularly for those working with manuscripts and older documents, even the ‘simple’ recognition of the characters involved in a written work can present challenges that might require interpretation and whose decisions certainly affect all subsequent annotation. Taking this further, an unbounded variety of organizational properties might be found relevant for any document over and beyond any identified sequence of verbal symbols. As mentioned above, decisions such as these have generally been considered to depend on the purpose of the text encoding. McGann (1991) suggests that such material aspects of an object of analysis might then also be captured by employing ‘bibliographic codes’ in addition to the more content-oriented codes constituting standard annotation schemes. This orientation has naturally found substantial resonance among disciplines regularly confronting the physicality of their objects of interest; systems of abstract annotation that treat those objects primarily as sequences of characters with additional
structuring information are commonly judged to fail such concerns. McGann also points to the fact that, for some authors, visual resources are already creatively employed as integral facets of their literary work. Particularly when characterizing works of this kind that have gone through a history of editorial revisions, it is necessary to consider very carefully just which aspects of their bibliographical codes may have been changed since such changes can have profound effects. Thus: This distinction, between a work’s bibliographical and its linguistic codes, is fundamentally important for textual criticism, and hence for critical editing. Without making and implementing the distinction in detailed ways, textual critics cannot fully elucidate—cannot analyze—the lines of materials which descend to them. (McGann 1991, 52)
McGann notes that these observations are by no means new: Not that scholars have been unaware of the existence of these bibliographical codes. We have simply neglected to incorporate our knowledge into our theories of text. (McGann 1991, 78)
Certainly a theory of text that is in any way restricted to the linguistic information, and even to ‘alphanumeric strings,’ is not going to be sufficient for the wealth of more complex documents and works that are considered even in traditional editorial studies, let alone the complexity commonplace in today’s media landscapes. This realization has led to a strong counter-reaction against the de-materialization of texts intrinsic to any factorization along the lines of content and form. The ‘materiality’ of any object being subjected to text encoding is consequently seen increasingly as worthy of discussion and even of inclusion (somehow) in digital contexts. Nevertheless, the notion of ‘materiality’ found in much of this discussion remains diffuse. McGann writes: the focus here is confined to works of imaginative literature (so-called) because such works foreground their materiality at the linguistic and the bibliographical levels alike. […] Were we interested in communication theory, rather than in textuality, such redundancies would be studied as ‘noise,’ and their value for the theory would be a negative one. (McGann 1991, 14)
All ‘non-linguistic’ regularities and redundancies seem then to be termed ‘bibliographic’ codes, which must then include tables of contents, illustrations, typography—in short, any aspects of design over and above the bare linguistic content. Elsewhere McGann suggests that the interpretation of ‘bibliographic’ codes and materiality is generally considered in the text encoding community to work in different ways to ‘semantic codes’ (McGann 2001, 145), but just what this means for analysis and annotation requires considerably more development. In fact, McGann does not yet really engage with materiality at all and one of
the first steps in his discussion even serves to further efface differences between established approaches and those that may be required for addressing materiality: both linguistic and bibliographical texts are symbolic and signifying mechanisms. Each generates meaning, and while the bibliographical text commonly functions in a subordinate relation to the linguistic text, ‘meaning’ in literary works results from the exchanges these two great semiotic mechanisms work with each other […] (McGann 1991, 67)
This position is clearly still very much rooted in a linguistic or literary model as providing the core organizing principle—as is probably justified when the focus is on literature or literary works of a fairly traditional kind. Although it is admitted that there are signifying practices that operate differently to those of language, the kinds of semiotic, i.e., signifying, patterns assumed for the non-linguistic components here remain similar in organization to the linguistic: they are ‘symbolic.’ Moreover, suggesting ‘two great semiotic mechanisms’ lumps a huge range of signifying systems together among the non-linguistic, some of which may indeed be very close to the linguistic—such as ‘tables of contents’ or (certain aspects of) diagrams—while others fall far closer to manipulations of the ‘pure’ materiality of the physical objects involved (for example, the thickness and shape of the paper employed). Such an undifferentiating position is scarcely an appropriate basis for characterizing textuality and the materiality of textuality more generally, as was McGann’s aim. What is required is a far more general starting point, such that, when we return to literary works, any excursions found into the ‘bibliographic’ are already expected and can find their place within a more holistic view. This will be critical for undoing the marginalization of the non-textual in digital humanities encoding efforts remarked on by Sahle (2013). The lack of differentiation of the non-linguistic in McGann’s account consequently creates a monolithic view of considerable semiotic complexity that (quite justifiably) proves difficult for some theorists to accept as being approachable at all. This can be seen well in Eggert’s rejection of McGann’s expansion of interest into the area of ‘bibliographic’ codes, i.e., codes that are anything (apparently) that are not narrowly linguistic. Eggert takes exception to the use of ‘code’ along lines similar to many criticisms of an earlier form of semiotics where a rigid notion of rules relating signifiers and signified—i.e., a ‘code-book’ view—appeared to be assumed: I question McGann’s notion of bibliographic code itself. The term has been taken up by a raft of editorial commentators and theorists but the attraction of it is, I believe, mainly rhetorical. If one is to be strict about the term, then there clearly is no such thing as bibliographic code. Dictionary definitions stress the systematic nature of codes: rigorously collected and arranged, as in legal codes; and the strictly defined substitution of words for other words, as in secret military codes. But the unpredictabilities of the gap between the physical features
of a book and their meaning are poor conditions for the specification of a code. We can talk about the art of page design and book binding. […] But code is going further than the evidence permits. It would require a full-blown semiotics. It seems to me that there can be no specifiable and invariable meaning for any particular mise-en-page. (Eggert 2010, 191–192)
This position is certainly understandable given the open set of possibilities that McGann’s discussion appears to concern—i.e., anything non-linguistic. But assuming both that this must then include all the material properties of any object of investigation prior to closer analysis and that ‘codes’ commit to fixed meanings is unhelpful. Faced with such a morass, Eggert can only conclude: With written or printed pages, then, to profess to be specifying the full material range of their possible significances—to our senses of sight, touch and smell—in order to turn them into a code would involve having to close the gap between the documentary and the textual, the gap between the material stimulus and the meaning for the reader. Given that there is no pre-existing code that can be drawn down for analysis, how is a ‘code’ to be specified? Clearly, it is impossible. (Eggert 2010, 192)
Eggert’s assertion of impossibility is, however, premature. Evaluating the situation in a more revealing fashion, making ‘codes’ (however they come to be defined) accessible for use in description, is perfectly possible, but it does require, as Eggert says, “a full-blown semiotics.” Just how one could be expected to approach complex signifying objects without such a semiotic analysis is unclear—one might just as well suggest performing analysis of the verbal content without any knowledge of language and languages. Once, however, one does commence such analysis, the monolithic block of ‘non-linguistic’ signifying information begins to show considerable internal organization, some of which moves directly in the directions of well specified codes. This is already useful for considerations of annotation without requiring immediately that all non-linguistic properties, i.e., the “full material range,” of some artifact be so treated. There are also further aspects of more sophisticated semiotic views that can be drawn on to support some of Eggert’s own concerns. Whereas Eggert proposes his own model of textuality that attempts to avoid the perceived shortfalls of McGann’s assumption of a code-based view of textual interpretation, those pitfalls are largely the consequence of the looseness of McGann’s semiotic account. When re-expressed with more semiotic precision, the positions resulting are far more naturally aligned. For example, Eggert takes particular exception to the apparent replacement of the human interpreter of texts with systems of codes that could ‘do’ (for example by computational processes) the interpretation instead of the reader. This is the view of semiotic codes that was quite prevalent in the
1960s–1970s, the code-book view mentioned above. Here no real interpretation is possible at all; a decoding ‘agent’ simply has to find instances of signs and look up their meaning—something that indeed might be considered straightforward to perform by computer. Codes in this sense are, however, precisely as Eggert argues, inappropriate in that they do not lead to the right kinds of properties being assumed of the textual objects of investigation they are to characterize. McGann is also quite aware of the importance of this interpretational direction, however. As he summarizes: Three points are especially important [for rethinking textuality]. First, we want to recover a clear sense of the rhetorical character of traditional text. Rhetorical exposes what texts are doing in saying what they say. It is fundamentally ‘algorithmic.’ Secondly, we want to remember that textual rhetoric operates at the material level of the text—that is to say, through its bibliographical codes. Third, […] texts are not self-identical, but their ambiguous character—this is especially clear in poetical texts—functions in very precise and determinate ways. (McGann 2001, 149)
The ‘rhetorical character,’ under-developed here as it is, is where more sophisticated questions of interpretation are placed—although the implications of calling this ‘algorithmic’ are unclear. The inappropriate equation of bibliographical codes and materiality is also evident. There is, therefore, still very much to be clarified and the consequences for annotation and text encoding are unresolved. It is uncontroversial that texts admit of multiple interpretations, and so, as Eggert asks: “How can the text have a stable identity that can be encoded, and yet be always different” (Eggert 2010, 194)? The notion of semiotic mode introduced below as a replacement and extension of ‘code’ addresses this directly. The semiotically outdated premise of fixed form-content mappings as well as an ‘all or nothing’ approach to the coverage of material distinctions are then shown to be not only unnecessary but also misleading. Concerning ‘documents,’ Eggert suggests, similarly to Pichler, that a ‘document’ be seen as the “material basis of text.” Such an entity has “by virtue of its physical or computational nature, a continuing history in relation to its productions and its readings […] The work emerges only as a regulative idea, the container, as it were, of the continuing dialectic” (Eggert 2010, 194). This is far closer to the view that will emerge semiotically below. Eggert also sees annotation and coding as one way of recording the results of such a continuing dialectic. Stand-off annotation in particular offers a way of accumulating new interpretations as they become relevant because each new set of annotations can be made independently of previous annotations (Eggert 2010, 197). Annotation and coding schemes may then function as repositories of shared knowledge, growing gradually as interpretative results are achieved. It is not then annotation as such that is rejected in Eggert’s model,
but rather any claims to being able to set out ‘in advance’ regulatory schemes that “enunciate the complete range of linguistic and bibliographic codes” (Eggert 2010, 197). Consequently: The interpretative files (or ‘tagsets’) created by scholars need to be accessible as a gradually evolving tradition of commentary and scholarship. (Eggert 2010, 198)
This is precisely what is suggested by a more refined semiotic foundation for ‘codes.’ Since we still know rather little about many of the ‘codes’ employed in communication, their formalization is an ongoing scientific endeavor that is not to be settled by committee or standardization bodies. Substantial empirical work is demanded and the state of any formalized description of a semiotic ‘code’ is only the currently developed ‘best approximation.’ This will be clarified substantially further below when the construct of a semiotic mode is introduced in more detail. Finally, a position that is more thoroughly materially oriented is that pursued by Hayles (2003), who similarly sets out a range of arguments in favor of ‘rethinking’ textuality with a more material slant, addressing concerns critically raised by changes in medium. Here materiality receives more explicit attention than was the case with McGann. For archives, she asks whether slight color variations brought about by such changes in medium, particularly to the digital, might not also affect meanings. To theorize this further, Hayles adds a notion of ‘digital code’ to the problems and challenges of the linguistic and bibliographic codes. Differences between media where the work cannot be separated from the substrate (painting) and where that is (arguably) possible (verbal) are discussed in order to point out the dangers of assumptions that physical form can, in general, be factored out of the notion of ‘text’ at all. For Hayles, textuality needs to take stock of ‘physical form’ as well, although how precisely this might operate remains undeveloped. Noting a proliferation of terms around ‘work,’ ‘text,’ ‘version’ and so on, she argues that accounts are still very much anchored in a ‘print-centric’ view that, in addition, continues to marginalize notions of color, typography, layout and so on. These discussions are closely interwoven with decisions made for annotation and markup. Any selection of some markup rather than another is also a decision concerning what is considered important or relevant as part of a ‘text’ and, as discussed above, some researchers have indeed extended this to the status of ‘theory’ (cf. Renear 1997). But theories of ‘text’ of this kind, often implicit and always rather restricted, need to be seen quite negatively: constructing theories of textuality on the basis of annotation principles offers far too thin a theoretical foundation for addressing a semiotic construct of the complexity of ‘text.’ In fact, picking up the argument from Hayles:
Perhaps it is time for a Copernican revolution in our thinking about textuality, a revolution achieved by going back and rethinking fundamental assumptions. (Hayles 2003, 266)
The approach to multimodality introduced below attempts to move us towards precisely such a rethinking of theories of text, one in which the semiotic distinctions necessary are drawn in a more developed and nuanced fashion than has proved possible without semiotic grounding.
3 Document Ontologies

In addition to the more ‘text’ or ‘instance’ oriented approaches illustrated in the previous section, where considerations of the nature of ‘text’ have proceeded from the practical concern of how to mark up texts or documents, there are also substantial efforts that attempt to characterize the notions of ‘document,’ ‘text,’ and ‘work’ more directly at a conceptual level. These efforts have generally been located within the broad areas of knowledge engineering, formal ontology and the semantic web and range from the more practical, where the goal is to provide solutions for appropriately classifying digital entities within corresponding informational infrastructures, to the more theoretical, where the goal is to provide robust formalizations of the particular properties of the entities involved.
3.1 The Functional Requirements for Bibliographic Records

One of the earliest and most broadly applied moves in this direction is the Functional Requirements for Bibliographic Records specification (FRBR: IFLA 2009; Plassard 2013). Here we see a response to the practical requirements of computational modeling and of providing semantically-rich characterizations of digital entities for librarianship and editorial work where there is an urgent need to group particular physical instances of works in appropriate ways. The FRBR may be characterized as a ‘lightweight’ ontology in that the aim is to provide usefully distinguished classes and properties that are not necessarily anchored into more philosophically motivated foundational categories. The original definitions of the FRBR were consequently expressed in terms of ‘entity-relation’ declarations of a fairly straightforward kind. Later developments have also aligned the FRBR entities and relations with foundational ontologies of the kind we will discuss further below, however (e.g., Peroni and Shotton 2012; Bekiari et al. 2015). For present purposes, we will be concerned only with the ‘Group 1’ entities from the FRBR specification as these directly address those entities constitutive of ‘intellectual
endeavors’ such as ‘works’ and their expressions; further FRBR groups present characterizations of meta-data such as authorship and classifications of content areas—these will not be of concern here. The Group 1 categories of the FRBR specification and their interrelationships are shown in figure 1.

Fig. 1: Basic FRBR entities and inter-relations constituting an ‘intellectual endeavor,’ together with some illustrative properties collected from IFLA (2009). [The diagram relates Work (‘is realized in,’ 1:many) to Expression, Expression (‘is embodied in,’ many:many) to Manifestation, and Manifestation (‘is exemplified by,’ 1:many) to Item; Work and Expression constitute the intellectual/artistic ‘content,’ Manifestation and Item its physical recording, each annotated with illustrative attributes such as title, date, form, medium of performance, publication details, typeface, type size, playing speed, provenance, location, and condition.]

The uppermost category is the intellectual or artistic ‘work,’ which may then find expressions in various (more or less symbolic) abstract forms, which in turn receive physical manifestations. The physical manifestations are themselves seen as ‘types,’ which then require ‘exemplification’ in individual tokens, or items. Thus, a ‘work’ may be a novel or a piece of music, its ‘expression’ is then a sequence of words in a language or a specified sequence of musical notes, which then finds ‘manifestation’ in an actually written sequence of words or a musical score, which finally may then be instantiated as individual printed tokens of the words or notes of the score. These divisions were mostly driven by the needs of library classification, where one has a specific physical exemplar and needs to relate this to broader categories. So a particular physical book is then a print copy of an edition, which is taken as an expression of some work. This is also proposed as a solution to digital representations so that a manifestation of an expression may be, for example, a pdf file, an html file, or something printed on paper, etc. Typical illustrations of the need for such distinctions are readily found in the library context where, for example, there may be two copies of an audio compact
disk (CD) containing a particular recording of Beethoven’s Ninth Symphony from a particular orchestra, at a particular time and place, with a particular conductor and so on. In such a context, there are multiple singular entities involved: the physical CDs, the particular performance recorded, as well as the particular piece of music itself. One needs also to be able to state that various CDs might bear the same content and so on. Then, in the editions context, one might also need to characterize various distinct versions of the musical score discovered. Here again one needs to be able to characterize these distinct versions as having something in common as expressions of the ‘same’ work, even though they may be substantially different in other respects. Many standard challenges of the area can then at least be expressed, such as, for example, when characterizing the notion of a particular musical work, its expression as a score sheet, and, on the one hand, various copies of the score sheet and, on the other, various performances of the work. A positive result of the FRBR formalization is therefore that it becomes possible to be very clear about how particular words may be used in many quite different senses. Taking and slightly modifying Tillett’s (2004) illustration of different uses of the term ‘book,’ for example:
– when we ask ‘who wrote that book’ the question is often about the originator of the work recorded in the book rather than any direct physical relationship with the forms occupying the book’s pages—that is, in Pichler’s sense above, someone more concerned with the original ‘texting’ than with the ‘writing’;
– when we ask ‘who translated that book’ the question is more concerning a particular text, an expression of the work that has been ‘derived’ by translation from another expression of the same work;
– when we ask ‘where can I find this book’ in a bookstore, we are generally looking for a physical object, but not any particular singular physical object since any copy of the intended book will suffice, similarly if the book is being reordered;
– and, also in a book store, when a customer says ‘this book is damaged,’ the reference is then really to a particular physical object and the condition of that object and not to a type or edition of a book—nor would this constitute a critique of the work.
These are all important clarifications, showing how quite different cases that are also relevant for practical library and bibliographical work can be productively held apart conceptually. Any further refinement of the area will certainly need to maintain at least these distinctions. Nevertheless, what further ‘ontologization’ of these basic categories and relations reveals is that the distinctions as drawn in FRBR are not yet entirely sufficient. There still appear to be cases where semiotically distinct entities are being grouped together, with potentially deleterious effects
for modeling and our understanding of how these terms are to be defined and distinguished. One problem area is the nature of expressions. Various ‘modes of expression’ are considered within the general FRBR scheme; these include sequences of alphanumeric characters, musical notation, choreographic notations, sound, still and moving images, three-dimensional objects and any combinations of these. Note, however, that this list already covers two rather different kinds of entities: notations and ‘objects’ of performance. Semiotically these have very different properties and so it can be predicted that the ‘expression’ concepts of the classification may exhibit problems when applied. Problems typically become evident when distinct types of entities are grouped together under single categories or when classification is under-constrained so that decisions need to be settled arbitrarily or by convention. A musical performance is, for example, sound; but a musical score for that performance is something very different; both appear nevertheless to be seen as candidates for ‘expressions’ according to examples given in the FRBR description, where differences in form of expressions include both “the expression of a work in the form of musical notation and the expression of the same work in the form of recorded sound” (IFLA 2009, 21). ‘Manifestations’ appear then to be reserved for medial choices rather than material instantiations of a work. The relationships between these are not entirely clear, however, and so open up discussions concerning which classifications might be ‘best’ rather than establishing clear-cut identity criteria. The FRBR also explicitly targets translations, however, where it is generally assumed that a single work may receive expressions in different ‘languages.’ If translation is also seen as ranging over different modes of expression, a kind of ‘semiotic translation’ as Jakobson would have it (cf. Jakobson 1971), then the score–performance divergence may be covered. But the sense in which the relationship between a musical score and a performance of a musical work is analogous to a translation between two languages is perhaps interesting as a conjecture concerning transmediality, but is certainly not immediately self-evident. Under such a view, in translation one language expression would then stand as a guide for performing a language expression in a different language. This may capture some aspects of the process, although the very specific nature of the relationship between musical notation and performance is not covered. Several kinds of work–work, expression–expression, and manifestation–manifestation relationships are described in the FRBR specification, mostly with only informal definitions, however. Several further uncertainties in the ‘work’/‘expression’ distinction are evident when approached ontologically. For example, both ‘work’ and ‘expression’ admit of attributes for the ‘medium of performance’ when dealing with musical works. This is considered necessary because the medium (e.g., which particular configuration
of musical instruments employed) originally intended for a work may differ from those employed for a particular expression (e.g., performance) of the work. The fact that this is possible suggests that ‘work’ as used in the FRBR may not be so straightforwardly separated from some form of expression after all. This will be addressed specifically below when the semiotic role of externalized representations is considered in relation to the development of works across time. But perhaps the most detailed engagement with potential modeling problems with the FRBR to date has been that pursued by Renear and colleagues, beginning from the question as to where precisely an electronic ‘document’ employing the extensible markup language (XML) to organize its content should be classified in the FRBR scheme (Renear et al. 2003; Renear and Dubin 2007). Several interesting issues may be drawn from their analysis, although these are not always entirely in line with their own conclusions because the argumentation appears from an ontological perspective to be multiply problematic. First, they note that an XML-document seems to conform to the FRBR class of an expression; this is indeed eminently plausible as both the structure of an XML-document and its textual content may readily be related to the logical/content structure of a work. Moreover, an XML-document is produced with respect to the XML-specification and so may be said to be expressed in a specific language defined over symbols. Renear and colleagues then, however, spend considerable time considering to what extent an XML-document is also similar to a particular graphical rendition of an equivalent document, which would argue for an XML-document being a manifestation of some expression instead, even though an XML-document itself does not have properties of typeface, font size and so on as these are not regulated by the XML specification. This argument is flawed. Renear et al. (2003) suggest in detail that, although an XML-document appears readily treatable as an FRBR expression: On the other hand, the markup of an XML document can also be understood as functioning exactly like rendering events (font shifts, type size changes, vertical and horizontal whitespace, etc.) to effect, expedite, and disambiguate the recognition (whether by humans or computers) of the underlying textual objects. This would make the XML document seem more like a manifestation […] (Renear et al. 2003, 3)
All of the properties mentioned, font shifts, type size changes, etc., are indeed attributes of manifestations of expressions. But the XML-document does not possess these properties. Instead it contains ‘instructions’ or ‘indications’ that an appropriate manifestation of the expression that the XML-document is providing would exhibit those properties. They continue: XML markup may also perform the same function as these graphic devices. (Renear et al. 2003, 3)
This is semiotically confused. The fact that arbitrary labels within an XML document might be interpreted in ways that allow the (imaginary or actual) construction of a manifestation with particular physical features of rendering does not establish any equivalence with those physical features. As the long and difficult history of attempting to make browsers and markup standards agree in their presentation of documents (i.e., to produce conforming manifestations) makes clear, XML and any other markup are a long way from exhibiting manifestation properties directly. The goal of making them function in this way in a reliable fashion is by no means straightforward to achieve. Renear and Dubin (2007) nevertheless reiterate the confusion further: Generally, such a manifestation will have, as well as the marks indicating linguistic characters, specific features that make the apprehension of the embodied expression reliable and efficient. In familiar printed books, these features are, most importantly, graphic devices such as changes in horizontal and vertical spacing, font shifts, color, and so on. XML markup may also perform the same function as these graphic devices, making the apprehension and cognitive navigation of an expression efficient and unambiguous. (Renear and Dubin 2007, 4–5)
To the extent that the XML markup gives an indication of the logical structure of the work for which it is an expression, this is no different to providing rhetorical and sentential organization that helps, indeed constitutes, the content of what is being expressed. This is what the organization of expressions always does. Renear and Dubin’s conflation of these properties with graphic devices is another clear symptom of the problematic nature of the content-form conflation hardwired into this kind of markup philosophy, as well as a curious effacement of the actual processes of digital implementations. In short, the inclusion of markup in an XML document that is explicitly intended to be interpreted as constraints on possible manifestations does not result in XML documents becoming manifestations. They are still expressions—that is, to take one of Renear and colleagues’ examples, including a
tag in some XML markup does not make that markup magically into a manifestation, just as including space between paragraphs in a manifestation does not magically turn the page into an expression. The two roles are ontologically distinct. Even though the XML document may describe properties that will find their realizations in manifestations, and manifestations may exhibit properties that are so describable, further interpretative steps are required in both directions; the kinds of information at issue are semiotically and ontologically separate. Renear et al. (2003) use this argument concerning the alleged difficulty in assigning XML documents a single place in the FRBR classification to draw far broader conclusions, however. In particular, they apply a basic modeling principle
developed within formal ontology called rigidity (cf. Welty and Andersen 2005) to show that FRBR categories need to be considered as roles rather than independently existing entities. As now well understood in the formal ontological engineering community, mixing ontological types of categories, for example rigid and non-rigid categories, seriously compromises modeling integrity—and, unfortunately, confusing rigid categories with non-rigid categories is a common mistake. A ‘student,’ for example, is not a ‘subtype’ of person but rather a role that a person can play. These are crucial to distinguish because the properties and identity criteria involved are quite different. Renear and Dubin (2007) thus argue that three of the main FRBR entity types are roles, not types as their presentation as a simple ‘ontology’ of types and their use in practical modeling seems to suggest. Renear and Dubin’s (2007) discussion is relatively complex, presenting examples where they consider it plausible that under distinct (social) conditions particular entities may, or may not, receive FRBR classifications as item, manifestation or expression. Their consideration of single XML-documents as potentially being manifestations and expressions is taken as a prime example. The argument is, however, odd when considered semiotically. All of the categories involved in FRBR are semiotic in nature and so they are inherently signs. This means, following Peirce, that they are non-rigid of necessity. Renear and Dubin’s (2007) entire claim is then little more than to reiterate Peirce’s critical semiotic admonition: nothing is a sign unless interpreted as a sign (Peirce 1931–1958, § 2.308)
This perspective can of course hardly be considered new, even in the context of the discussion of documents. Buckland (1998), for example, cites the following perhaps more clearly expressed characterization of the same ideas: […] signs are never natural objects […] The reason is simply that the property of being a sign is not a natural property that can be searched for and found, but a property that is given to objects, be they natural or artificial, through the kind of use that is made of them. Both as objects and as means, signs have to be treated as something invented, and in this sense they are correlated to actions. (Sebeok 1994, 18)
Both the ‘role’ nature and inherent non-rigidity of the FRBR categories then follow necessarily: the only point of discussion could be whether or not FRBR categories are semiotic in nature, which has hardly been placed in question. A more interesting issue is then whether this realization has any consequences. Renear and Dubin suggest that, in practice, one would wish to ignore this complexity because once it has been decided that a particular entity is, for example, an expression in a real case, one no longer needs to bear in mind that perhaps, in some other ‘possible world,’ the classification may have fallen out differently.
But this has nothing to do with the role nature of the categories. Once a person is voted president, then the rights of being president follow. The ontological status of ‘being president’ as a role detracts neither from the seriousness of the consequences of taking that role nor from the importance of the role in the constitution and actual governing of a country. Moreover, it is, as we considered in the previous section, by no means always clear that these classifications are so straightforward to ‘fix’ as Renear and colleagues appear to assume. In fact, it may not be clear without extensive interpretative work that some physical phenomenon is to be considered as a manifestation of an expression at all, and that interpretative work may even turn out subsequently to have been in error. Seeing these attributions as role attributions is thus the only sensible choice since the physical object does not change in any way depending on the attribution. Again, the identity criteria for the physical object and the object-qua-role are quite distinct. Most interesting social categories actually turn out to function more like roles when viewed ontologically since they are inherently non-rigid. This does not mean, however, that they do not participate in complex axiomatizations that specify the properties and inter-relations between roles and their bearers—indeed, these can be of considerable consequence. Many central categories treated in ontologies are entities of this kind and so roles now receive extensive treatments in their own right. That this might not be the case seems to be Renear’s (2006) main concern when he writes of FRBR categories thereby failing to be “fundamental types of things” and so not eligible to build the “backbone” of an ontology or conceptual model. But the observation that the assignation of particular roles is “brought about by contingent social circumstances” is a property to be included and respected and not treated as grounds for demotion. For example, that particular categories, such as expressions and manifestations, are roles in an ontological sense by no means entails that they can apply to any bearers whatsoever. Roles can impose complex conditions on their bearers and this is observable again in a closer consideration of XML documents. XML documents as standardly used simply do not function as manifestations, whereas they do meet the requirements and expectations of bearers of expression roles. In essence, defining and describing what an XML-document is will be to characterize an entity that is already a complex role with substantial relationships with other roles. There is no option of simply taking the ‘XML-document’ concept and considering it free to receive distinct FRBR classifications; to do so changes the nature of the entity being addressed.
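The contrast between rigid types and role attributions that drives this argument can be made concrete with a small illustration. The following is a minimal, hypothetical Python sketch; it is not drawn from FRBR, BFO, or any published implementation, and the class names and the ‘admits’ constraint are invented for the example. It shows only the modeling point at issue: rigid kinds such as a person or a file keep their identity however they are classified, whereas roles such as ‘student’ or an FRBR-style ‘expression’ are attributions over bearers that arise under contingent circumstances and can impose conditions on those bearers.

    from dataclasses import dataclass

    # Rigid kinds: an entity cannot stop instantiating them without ceasing to exist.
    @dataclass
    class Person:
        name: str

    @dataclass
    class DigitalFile:
        path: str
        content: str   # e.g., XML markup encoding the logical structure of a work

    # A role is an attribution over a bearer, acquired or lost under contingent
    # (social, institutional, interpretative) circumstances; the bearer itself
    # does not change when the attribution changes.
    class Role:
        label = "Role"
        def __init__(self, bearer):
            if not self.admits(bearer):
                raise ValueError(f"{bearer!r} cannot bear the role {self.label}")
            self.bearer = bearer
        def admits(self, bearer):
            return True   # subclasses impose conditions on their bearers

    class Student(Role):
        label = "Student"
        def admits(self, bearer):
            return isinstance(bearer, Person)

    class Expression(Role):
        # hypothetical stand-in for FRBR 'expression' treated as a role, not a subtype
        label = "FRBR:Expression"
        def admits(self, bearer):
            # e.g., the bearer must carry symbolically organized content
            return isinstance(bearer, DigitalFile) and bearer.content.strip() != ""

    alice = Person("Alice")
    enrolment = Student(alice)        # Alice plays a role; she is not a subtype of it

    tei_file = DigitalFile("hamlet.xml", "<TEI>...</TEI>")
    reading = Expression(tei_file)    # the file is interpreted as bearing an expression

Dropping either attribution leaves the bearer physically untouched, which is exactly the sense in which the identity criteria of the bearer and of the object-qua-role diverge; at the same time, the ‘admits’ condition illustrates how a role can still constrain what may bear it.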
3.2 Two Foundational Ontologies: BFO and DOLCE

The brush with foundational ontologies and principles of formal ontological engineering visible in Renear and Dubin (2007) does suggest, however, that more might be gained by applying general principles of formal ontology to the questions and challenges raised by the formalization of documents. Within this approach, foundational ontologies address the perceived need to, and scientific challenge of, providing formalized characterizations of necessary distinctions that are then also manifest in all specific domains. In terms of ontological engineering, foundational ontologies are seen as a way of ensuring high quality modeling while also avoiding repetitions of labor. Nevertheless, there is still no one overarching foundational ontology that is accepted across the board in the formal ontology community and the individual foundational ontologies proposed exhibit sometimes quite substantial differences due to the different positions they take with respect to basic philosophical modeling decisions. For example, two of the most established foundational ontologies currently in use, the Basic Formal Ontology (BFO: Arp et al. 2015) and the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE: Borgo and Masolo 2010), differ in that BFO is more oriented towards a realist metaphysics, whereas DOLCE sees itself as a characterization of the sociocognitive world of humans. A collection of articles introducing the currently most prominent foundational ontologies is given in Borgo and Kutz (2022). A methodological approach adopted across many of the foundational ontologies is to consider phenomena that are directly observable (as far as possible) and to derive from these phenomena and their interactions basic positions on what must be necessary ‘behind the scenes’ for those phenomena to exhibit the patterns and regularities they do. Several efforts are now underway to characterize entities overlapping with documents and texts in this context and so this is directly relevant for our current concerns. Motivations for this work differ. Some wish to provide ontologically more well-founded but still practically-oriented classification possibilities similar in some respects to the aims of FRBR but anchored more firmly into ontological categories and structures developed in one of the foundational ontologies. Others (sometimes overlapping with the first group) seek philosophically more sound treatments of the inherently very complex and challenging nature of the entities to be modeled. Much, for example, has been written on the ontology of various kinds of art, ranging from studies of particular forms of art, to the notion of an artwork as such (cf. Margolis 1974; Ingarden 1989; Thomasson 2016). Considerable diversity in opinion remains, with some declaring that the question of, for example, literary works is ontologically ill posed—there being, it is claimed, no (interesting) essential properties held in common by example instances (cf. Howell 2002). These uncertainties naturally feed into attempts to offer more formalized notions
of what is involved in various kinds of artworks or similar objects of cultural value. Within formal ontology such objects are generally seen as falling within the scope of ‘information artifacts,’ which is also where notions such as documents, images, and texts are placed. Although much is to be gained from a close consideration of the relations between texts, documents and works possible within foundational ontologies, a full discussion of the issues arising would go well beyond the scope of the current contribution. A detailed comparative review of the state of the art has recently been offered by Sanfilippo (2021), however, in which he reviews how information artifacts are constructed and modeled within several current foundational ontologies (including BFO and DOLCE mentioned above) as well as in the FRBR. Despite some basic agreements that recur across the different philosophical starting points, there are also differences, or misalignments, both concerning how information artifacts are to be modeled and just what the most important properties involved in their modeling should be. These will now be briefly considered and related to the general goal of the current contribution. The methodological orientation of beginning ontological analysis with observations of physically manifested phenomena naturally aligns foundational ontological accounts of information artifacts with positions explored by Pichler and others concerning the relationship between physical traces and basic actions and their interpretations. Going into rather more detail, however, the physical properties relevant for information artifacts are treated ontologically quite explicitly as ‘patterns’ within some perceptible qualities inhering in a material. A distinctive feature of information patterns is then that they are necessarily generically dependent on their bearers. Generic dependence (contrasting with specific dependence) is a formal relationship commonly axiomatized in foundational ontologies that establishes a one-directional ontological dependency between the entities related. Generic dependence means that some entity is dependent on another, but not on some specific other. A generic dependency holding between bearer and patterns consequently means that if the bearer ceases to exist, then the specific set of patterns at issue is also gone, but those patterns may equally be carried by some other bearer. Thus, patterns on a sheet of paper or a CD can only exist as long as the paper or CD exists, but the ‘same’ patterns may well be found on other bearers as well. Similarly, a pdf-file, for example, is generically dependent on some memory device, but not on any specific memory store. Using the terms provided by BFO as a specific example, therefore, the core concept of information receives the following definition (Smith et al. 2013):
INFORMATION CONTENT ENTITY (ICE) =def an entity which is
1. GENERICALLY DEPENDENT on
2. some MATERIAL ENTITY and which
3. stands in a relation of ABOUTNESS to some ENTITY
Examples suggested include novels, journal articles, graphs, etc. Such entities are not themselves material, but are always dependent on some material as their bearer. Particular kinds of information content entities occur either as objects (‘endurants’/‘continuants’), such as databases or scientific publications, or as processes (‘perdurants’/‘occurrents’), such as acts of thinking, speaking, hearing, writing, and so on through which information content entities are ‘created and communicated.’ Within the information artifact extension of BFO, information content entities also have subclasses, including Textual Entity, for verbally-motivated patterns, and Figure, for graphically-motivated patterns. Consequently, to make an Information Content Entity manifest, actual bearers are needed to ‘concretize’ it. These actual bearers of information are described, again taking BFO as an example, as Information Quality Entities. These are defined by Smith et al. (2013) as follows:

INFORMATION QUALITY ENTITY (IQE) =def a QUALITY that is the concretization of some INFORMATION CONTENT ENTITY.
This definition makes the connection to qualities mentioned above explicit. Concretization refers to the physical manifestations in which the information patterns become evident. Qualities themselves require material in which to exist, which leads to the actual material information bearers. An information artifact is then an artifact with the particular designed or selected function of bearing an information quality entity. Examples offered are a hard drive, a traffic sign, a printed form, a passport, a currency note, an RFID (radio-frequency identification) chip, a SIM (subscriber identification module) card, and so on. From these examples it is clear how the primary target of the BFO approach to information entities to date has been very much one of technical devices and digital environments. However, as Sanfilippo (2021) makes clear, this material view of information artifacts leaves several modeling decisions unresolved because the patterns discussed appear to include both abstract patterns, corresponding to the information that might be shared across distinct manifestations, such as particular novels or their translations, and concrete patterns that involve the actual distribution of ink on pages. This means that the term ‘pattern’ “can then be understood in two senses—as referring either (i) to what is shared or communicated (between original and copy, between sender and receiver), or (ii) to the specific pattern before you when you are reading from your copy of Tolstoy’s novel” (Smith
and Ceusters 2015, 1). Sanfilippo (2021) discusses several problems following from this that we will return to below. In addition, and as can be seen in the definition of ICEs above, in the BFO approach a central role is played by the notion of ‘aboutness’; this is not seen in the other foundational approaches. Smith and Ceusters (2015) suggest that information content entities inherit their ‘aboutness’—the assumed relationship of an information content entity to a portion of the world (which may or may not exist)—from an agent’s ‘direct cognitive representations’: We shall presuppose in what follows that information artifacts do not bear information in and of themselves, but only because cognitive subjects associate representations of certain sorts with the patterns which they manifest. We thus view the aboutness that is manifested by information content entities in accordance with the doctrine of the ‘primacy of the intentional’ (Chisholm 1984), according to which the aboutness of those of our representations formulated in speech or writing (or in their printed or digital counterparts) is to be understood by reference to the cognitive acts with which they are or can in principle be associated. (Smith and Ceusters 2015)
To a certain extent this sidesteps the question of ‘how’ information may refer by passing the work on to other cognitive processes. Aboutness is also assumed to rely on information similar in organization to linguistic representations, in the sense of exhibiting properties such as compositionality and so on. It is not then clear how this should generalize to non-linguistic kinds of information. The more explicitly cognitive orientation of the foundational ontology DOLCE allows quite a different take on these issues. DOLCE not only supports the usual (for foundational ontologies) distinction and ontological dependence between qualities and bearers of those qualities, but also offers a characterization of descriptions as social objects, which are non-physical entities generically depending on agents.¹ Information Objects are then defined as linking a description to both a ‘language’ as some information encoding system and the physical realizations serving as bearers. The kinds of entities that can function as bearers of information are, as was the case with BFO, very broad. Several distinct subtypes of information object are also proposed, such as ‘linguistic objects’ and ‘diagrammatic objects’; no further details are provided for these coding systems but they are assumed to provide motivations (or ‘order’) for the patterns of qualities that carry information. Information objects are in consequence types of sign whereas descriptions offer their contents and, less clearly, the ways in which those contents are organized.

1 The treatment of ‘descriptions’ is actually provided by one of the extensions of DOLCE rather than core DOLCE itself, but this level of detail will not be discussed further here—for more information, see Gangemi and Peroni (2016).
As Sanfilippo (2021, 121) notes, this latter move again introduces some ambiguity—this time between descriptions as particular organizations of information and the coding systems by which those organizations are constructed. Moreover, it is clearly necessary to capture details not only of sign-types, as semiotic principles of organization for particular classes of communicative acts or objects, but also of ‘sign-tokens,’ including properties such as typefaces and physical dimensions or conditions. Lack of differentiation between sign-types and sign-tokens has been fairly widespread in foundational ontology approaches for some time (cf. Bateman 2004). From the perspective of applied ontology, approaches to modeling patterns more generally also become relevant here, particularly when sign-tokens are seen in terms of qualities. As Sanfilippo notes, this view contrasts with approaches that take sign-tokens as objects in their own right. The object-based view deals more straightforwardly with the physical details of sign-tokens such as colors, physical dimensions and so on, although there is already some tension when moving to properties that are more directly semiotic in nature, such as typeface. Whether the relationship between the entities that are characterized as sign-types and those entities that are characterized as sign-tokens is simply a type-token relationship in any case is not immediately self-evident either. More semiotic input is clearly required. Sanfilippo’s (2021, 128) comparison of ontologies in the broad domain of information objects or artifacts allows him to draw the following conclusions. Information entities are generally seen as having a temporal existence but not necessarily a spatial positioning; they are generically dependent on some physical bearers; they may be ‘about’ something, but only in the case of BFO is this seen as a necessary characteristic; and they generally have little or no commitment to describing some semiotic ‘whole,’ which can complicate accounts that attempt to describe complex ‘works’ consisting of parts. Little is said about just what constitutes ‘meanings’ more specifically. Taking these results together, Sanfilippo suggests four distinct positions that can be found in foundational ontologies concerning information entities: information entities may be semi-abstract entities (located in time not space), meanings, ideas, or ‘documentary entities’ that are materially present in the world and which may be grouped together on the basis of judged similarity of form or purpose. The ontologies examined all place themselves slightly differently with respect to these positions, from which Sanfilippo concludes: It emerges from these latter considerations the ambiguous way in which information entities have been treated in applied ontology. One and the same category is indeed used sometimes to capture the distinction between meanings and their encoding signs […], and some other times to refer to the signs themselves […]. This is misleading since the understanding of
information entities as meanings needs to be clearly distinguished from the understanding of information entities as signs. […] It is however challenging to fit these four positions and all kinds of information entities—from literary and musical works to documents, pictures, films, software, etc.—within a single, informative category. (Sanfilippo 2021, 130–131)
This leads Sanfilippo to argue that a more ‘pluralistic’ framework might be required that can span different modeling views depending on the entities addressed. The direction followed here, however, will be rather to attempt further clarification and disambiguation of the notion of information artifacts directly by expanding their semiotic foundations to explore whether a singular unified set of categories might be both possible and preferable after all.
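Since generic dependence carries much of the weight in these accounts, a deliberately simplified sketch may help fix the idea before moving on. The following Python fragment is not taken from BFO, DOLCE, or any of their implementations; the class names, and in particular the identification of a content entity with its character pattern, are assumptions made purely for illustration.

    from dataclasses import dataclass, field
    from typing import List, Set

    @dataclass(frozen=True)
    class InformationPattern:
        # a semi-abstract information content entity, identified here simply by
        # the pattern it consists of
        pattern: str

    @dataclass
    class Bearer:
        # a material bearer (sheet of paper, CD, memory device) whose qualities
        # may concretize information patterns
        kind: str
        concretizes: Set[InformationPattern] = field(default_factory=set)

    def generically_dependent(ice: InformationPattern, bearers: List[Bearer]) -> bool:
        # generic dependence: the pattern persists as long as *some* bearer
        # carries it, but no *specific* bearer is required
        return any(ice in b.concretizes for b in bearers)

    novel = InformationPattern("War and Peace ...")
    paper_copy = Bearer("printed book", {novel})
    pdf_copy = Bearer("pdf file on a memory device", {novel})

    bearers = [paper_copy, pdf_copy]
    assert generically_dependent(novel, bearers)

    bearers.remove(paper_copy)                        # destroying one bearer ...
    assert generically_dependent(novel, bearers)      # ... does not destroy the pattern

    bearers.remove(pdf_copy)                          # with no bearer left, nothing
    assert not generically_dependent(novel, bearers)  # concretizes the pattern any longer

Identifying the content entity with a shared pattern, as this toy model does, is of course precisely the point at which the ambiguity noted by Sanfilippo, between what is shared across copies and the specific pattern on any one copy, re-enters.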
4 The Semiotic Foundation: A Model of ‘Communication’ Appropriate for Documents and Works

A recurring theme across all of the vantage points discussed in this contribution so far has been how a lack of semiotic foundations continues to seriously compromise the arguments made and the positions taken, often creating more puzzles than they solve. The proposal made here is that a re-engagement with some fundamental semiotic principles will help move us forward. In order to provide a more structured connection between the points raised above and the semiotic modeling introduced below, we can first extract certain challenges that have been touched upon but which still resist adequate resolution. These challenges will be picked up subsequently as semiotic mechanisms relevant for their treatment are introduced. The categories provided by the FRBR group 1 specification offer a convenient set of anchors to begin:
– ‘Work’: how can works be characterized semiotically and what is their relationship to texts, documents, and materialities?
– ‘Expression’: many different forms of expression are listed in the FRBR specification, and many are mentioned in discussions from various perspectives, but how is the diversity of those forms of expression to be captured in relation both to the meanings expressed and material manifestations of those meanings?
– ‘Manifestation’: what are manifestations of expressions and how can they be explored empirically without pre-judging the results of interpretative efforts?
– ‘Item’: how can the material specifics of physical objects and performances be related to their conditioning manifestations?
In addition, several further questions building on these are raised:
– Where are annotations to be placed with respect to the categories listed so far?
– How do digital entities, including those involving annotations such as XML documents, fit within the framework?
– How can the content-form division be reconceptualized semiotically so that it follows from the above without imposing arbitrary divisions between what is considered ‘essential’ for annotations or text-encodings and what can be ignored?
Viewed broadly we can see in these challenges a continuum of levels of abstraction at work. These levels need to be appropriately related to one another within any complete framework. In fact, in many ways this is also to take up a task already sketched in Hayles’ re-orientation to materiality. As she writes: The crucial move is to reconceptualize materiality as the interplay between a text’s physical characteristics and its signifying strategies. This definition opens the possibility of considering texts as embodied entities while still maintaining a central focus on interpretation. (Hayles 2004, 72)
This is precisely what will emerge in the semiotic treatment that follows.
4.1 Multimodality: Semiotic Modes and Their Media

In this subsection, the basic model required to address the challenges and questions listed above will be presented without specific argumentation or justification so that we can proceed directly to its application for the task at hand. The distinctions drawn are all necessary in order to offer natural characterizations of what is occurring semiotically. For the theoretical and empirical motivations of the development of the model, see Bateman (2011, 2016); for general background and an overall introduction to this approach to multimodality as a phenomenon and its empirical and theoretical treatment, see Bateman et al. (2017); and for a detailed discussion of the relationship between the framework and semiotics in both the Peircean and Saussuro-Hjelmslevian traditions, see Bateman (2018). It will be argued that most of the properties and distinctions that emerged in the approaches discussed above arise naturally in this characterization, allowing them to find appropriate places and inter-relationships within the framework as a whole. As with the ontological approaches discussed in the previous section and the calls for more materially-oriented accounts in the text-encoding context, the present semiotic model also takes materiality as its starting point. Without a material realization of communicative practices there is nothing to get the semiotic
process of signification underway. Material will be approached from two sides. From the least abstract, most ‘physical’ orientation, signification always occurs against a physical background which imposes limitations and opens up possibilities for signifying practices to develop. This view of materiality is already anchored in human perception and action, however, and so is by no means the materiality of ‘physics.’ What is important are the possibilities that any material provides for leaving ‘traces’ that may serve as clues for interpretation (cf. Bateman 2021b). From the side of signification, particular semiotic practices come to rely on some subset of the overall material properties available. For example, some material, such as a sheet of paper with a pencil, may support traces involving visually accessible patterns of shade and light, regions of color, lines, and so on. However, the particular semiotic practice of writing then only needs to make use of a very limited subset of the overall possibilities for shade and patterns that would be supported. Semiotic practices of this kind are called semiotic modes. Any semiotic mode defines the kinds of material traces that it requires to operate. Those required kinds of traces are termed the canvas of the semiotic mode. The canvas of a semiotic mode is then simply the semiotic mode’s materiality when viewed with respect to the specific forms of traces required by that semiotic mode (Bateman 2021b). The notion of canvas consequently captures the inherent linking of form and material definitional for each and every semiotic mode. From a production or design perspective, the canvas is what sign-makers have to work with, or against, when traces for a particular semiotic mode are to be produced. Considering materiality ‘through the lens’ of its deployment as part of a semiotic mode in this way is beneficial for addressing several further related theoretical and descriptive challenges that commonly hinder empirical research. It is also important when considering semiotic relationships across different materialities and media. Precisely because the canvas of a semiotic mode characterizes just those material distinctions or regularities that are necessary for that specific semiotic mode to operate, one can productively view the canvas as an ‘abstract’ or ‘generalized’ materiality as well (Bateman et al. 2017, 103). This means that, to the extent that any materiality supports at least the material requirements given by the canvas of some specific semiotic mode, that materiality may serve to materialize the technical features (see below) of that semiotic mode. The material is then capable of supporting the traces that the semiotic mode demands. This is the reason that one can read written language regardless of whether it appears on paper, on a screen, carved in stone, or as holes cut out of a piece of paper, etc. Fairly close relations can be drawn here with the various kinds of information entities and their bearers discussed in the previous section in terms of generic dependence and qualities.
The specification of precisely which material patterns are used by a semiotic mode is then given by a second layer of semiotic abstraction. This second layer of a semiotic mode is called the form or ‘technical feature’ semiotic stratum. The task of this stratum is to impose qualitative classes or categorizations on top of the material variation of the material stratum. This is essential in order to identify (or make hypotheses concerning) the material regularities that are significant or relevant for that mode. It is this level of description that turns, for example, patterns of light and shade in a visually-accessed materiality into letters that might be read or lines on a graph. The technical features of a semiotic mode also provide the means for constructing hypotheses concerning what some material pattern might mean. It is important to realize that in the framework presented here, deployment of semiotic modes is the only way to make the move from material to interpretation. In the discussion from Robinson (2009) engaged with above, for example, the problem of recognizing individual letters in a Chaucer manuscript is raised. As Robinson notes, several identical patterns of shade and dark are ‘best’ allocated to different letters (cf. figure 2 on the next page, left-hand side): The three strokes here which constitute the second to last word are near identical to the three strokes which commence the last word (and indeed, they are identical if one takes the somewhat larger seraph on the first stroke of the second last word as purely ornamental). Yet, we declare in our transcript that the first set of three strokes stands for the letters ‘in,’ while the strokes which begin the last word stand for the letter ‘m’: thus, ‘in mariage.’ Why do we transcribe one set of three strokes as ‘in’ then, and another as ‘m’? We do this because this is the only reading which makes sense. (Robinson 2009, 43)
This quite usual task in decoding older sources can be characterized more precisely within the current semiotic model as follows. First, we must ‘decide’ that the particular variety of materiality under study can be beneficially ‘read’ as an application of the semiotic mode of written language. This gives the possible letter forms and constraints on their material realization (the canvas of the mode) that are relevant for engaging with the material. It does not allow a fixed allocation of letter forms to specific patterns, however, as the letter forms are generally also interpreted as realizing particular linguistic expressions. Patterns are only associated with forms in an interpretation when they increase the coherence of the overall description: this is an application of Peirce’s notion of abduction. The relation of forms to material, on the one hand, and to meanings, on the other, is always to be seen in this way regardless of semiotic mode. This is one of the principal mechanisms by which the present semiotic account moves beyond the ‘code-book’ view of semiotics mentioned above, while at the same time not being restricted to linguistic materials.
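The role of coherence in such decisions can be given a rough computational gloss. The sketch below is a deliberately crude approximation (the mini-lexicon stands in for the much richer discourse knowledge actually involved): candidate readings of the ambiguous strokes are ranked by how many recognizable words they yield.

```python
from itertools import product

# Toy illustration of abductive disambiguation: each group of three strokes
# could be read as 'in' or as 'm'; the reading that makes the surrounding
# text cohere (crudely approximated here by lexicon membership) wins.
LEXICON = {"in", "mariage"}   # invented mini-lexicon

def coherence(words):
    """Crude stand-in for discourse coherence: count recognized words."""
    return sum(word in LEXICON for word in words)

# The second stroke group is followed by the unambiguous letters 'ariage'.
readings = [[first, second + "ariage"]
            for first, second in product(["in", "m"], repeat=2)]

best = max(readings, key=coherence)
print(best)   # ['in', 'mariage'] -- the only reading in which both words are known
```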
Fig. 2: Two written forms whose material requires disambiguation by discourse interpretation. On the left is the example Chaucer manuscript discussed by Robinson (2009, 43) with the problematic forms shown underlined; on the right is a well-known illustration of discourse interpretation overruling material form taken from Dennett (1990, 178).
This flexibility of readings given to forms is made more explicit in a third and final layer of semiotic abstraction inherent to semiotic modes. The catalogs of formal distinctions making up the technical features of a semiotic mode are not already themselves contextually ‘meaningful.’ Instead, meaning is mediated by the application of the discourse semantics of the semiotic mode. Interpretative work is thereby postponed, making it possible to maintain a more direct link between the stratum of form and its ‘materialization’ which still avoids over-interpretation of material distinctions (cf. figure 2, right-hand side). The discourse semantics of a semiotic mode therefore performs ‘delayed,’ or indirect, interpretations of technical features by providing defeasible rules that assign contextualized interpretations to the formal distinctions—this is how ‘making sense’ is defined. These interpretations are dynamic and unfold during the process of interpretation. The result of their application is a structure consisting of discourse entities introduced into the discourse and configurations of relations binding those entities together; a more general characterization of the importance of discourse semantics for multimodality can be found in Bateman (2020). Semiotic modes within this framework are therefore three-way layered configurations of semiotic distinctions developed by communities of users in order to achieve some range of communicative or expressive tasks. The three constitutive layers, moving from least abstract (i.e., the closest to observable material distinctions) to the most abstract (i.e., closest to considerations of context, social actions and genre) are: material (and ‘canvas’), form (or technical features), and discourse semantics. These consequently span the vu and lu poles in Pédauque’s model, providing more semiotic detail concerning just what can function as ‘form’ on the one hand, and emphasizing more the dynamic process of ‘reading’ by which one can move from vu to lu in terms of discourse semantics. Methodologically, semiotic modes are seen as ‘current best hypotheses’ when exploring the meanings of particular uses of material regularities. At this broad level of description, all semiotic modes operate with similar mechanisms. Substantial differences then arise due to how interpretative work is distributed over the semiotic strata. Some semiotic modes (e.g., language) make considerable use of the technical details layer, generating structures that may be compositionally linked to discourse interpretations;
others (e.g., pictorial depictions) make considerable use of iconicity and recognizable material patterns that can be linked to discourse interpretations; others still (e.g., diagrams) combine these possibilities. This provides a more nuanced view than, for example, McGann’s binary division into textual and bibliographic codes (cf. McGann 2001, 145). While semiotic modes are then the primary location of semiotic work and acts of interpretation when engaging with materials, they are also anchored within a broader communicative model involving communicative purposes and media—i.e., the social context entailed by the placement of documents at Pédauque’s third pole, su. This standard anchoring of material and semiotic modes with respect to the more abstract categories of media and genres is shown schematically in the diagram in figure 3 below. In many respects, therefore, this can be taken as offering a further refinement of Pédauque’s triangle model combined with a range of methodological consequences relevant for analysis. More specifically, applying the model and the interrelationships it sets up across its distinct categories and layers of abstraction naturally covers situations where a diversity of semiotic possibilities are co-active—thus, for example, typography in the medium of graphic novels may be combined with a range of more pictorial or schematized semiotic modes employed to express emotional attitudes. Similarly, concrete poetry co-deploys written language and visual shapes, while page composition imposes segmentation and connection visually by graphic means. In all cases, the sole point of access is the materiality to be analyzed and connections are made to semiotic modes solely in terms of the range of material variation that each semiotic mode takes responsibility for. It is then often the case that some shared portion of materiality in a given media product is co-organized by several distinct semiotic modes simultaneously: this requires the co-presence of multiple semiotic coding schemes for most actual objects of communication, which in turn relates directly to Renear’s call for pluralistic realism discussed above. This co-organization of material by multiple semiotic modes occurs at a particular ‘location’ within the overall theoretical framework identified as the medium. A medium is defined as a socioculturally institutionalized grouping of semiotic modes for specific communicative purposes (Bateman et al. 2017, 124–125). Media are consequently ‘sites of practice’ where the orchestrated co-deployment of semiotic modes is highly likely to bring combinations and mergers of semiotic modes into play. Since the co-deployment of distinct semiotic modes is inherent to their functioning, those semiotic modes may become successively more intertwined, thereby echoing the suggestion of Winkler (2008, 213) that media be considered ‘biotopes’ for semiosis. Media are then likely to function as melting pots for semiotic activity—for multisemiosis—leading to diachronic developments in practices and semiotic modes as the system is used in particular, but always evolving, cultural configurations.

[Figure 3 diagram: several semiotic modes, each layered into material, regularities in form, and discourse semantics, are grouped within the canvas of the medium (virtual / technologically constructed); the medium realizes genres such as explaining, negotiating, proving, showing, and narrating through multisemiotic artifacts. Example media listed: spoken language, written language, film, graphic novels, illustrated documents, infographics, blended media.]

Fig. 3: The relationships between semiotic modes, media, and genres as defined by Bateman et al. (2017). In this model, a medium is an institutionalized collection of semiotic modes (the ovals running down the left-hand side of the diagram) that are available for realizing genres.
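To fix the terminology in the figure, the layering and grouping can be rendered as a small data structure. The sketch below is purely illustrative; the particular modes, feature names, and relation labels are placeholders rather than claims about any actual medium.

```python
from dataclasses import dataclass, field

@dataclass
class SemioticMode:
    """Three-way layering: canvas (material requirements), technical features
    (formal categories), and discourse semantics (interpretive relations)."""
    name: str
    canvas: set
    technical_features: set
    discourse_relations: set

@dataclass
class Medium:
    """A socioculturally institutionalized grouping of semiotic modes."""
    name: str
    modes: list = field(default_factory=list)

written_language = SemioticMode(
    name="written language",
    canvas={"line", "contrast"},
    technical_features={"letter", "word", "sentence"},
    discourse_relations={"elaboration", "contrast", "narration"},
)
typography = SemioticMode(
    name="typography",
    canvas={"line", "contrast", "size", "weight"},
    technical_features={"typeface", "emphasis", "heading level"},
    discourse_relations={"salience", "grouping"},
)

graphic_novel = Medium("graphic novel", [written_language, typography])
print([mode.name for mode in graphic_novel.modes])
```

Co-deployment within a medium then corresponds to several such mode objects being applied to one and the same stretch of materiality.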
4.2 Applying the Model

The framework can now be related directly to the concrete questions of annotation and text-encoding raised above, but in a manner that does not make prior decisions concerning form and content, nor of any restriction to language-based artifacts or performances. The general idea of text annotation is to enrich ‘texts’ by structures and categories that then allow more focused and abstract engagement with those texts. The already abstract nature of language ‘texts’ often obscures the processes at work here: it is not ‘texts’ themselves that are somehow subjected to enrichment but abstract representations that are already formally encoded. These abstract representations correspond to sequences of characters building words, relations between elements representing structures, formal properties added to entities representing further linguistic information, and so on. They are consequently inherent to any engagement with text as language. In semiotic terms, one has already applied the semiotic modes of written (or spoken) language to pre-structure the materiality considered. For ‘purely’ linguistic readings the abstractions employed here have appeared relatively unproblematic: this is due to the good match between certain formal properties of the semiotic modes of written and (simply transcribed) language and the formal representation of the traditional levels of linguistic description as morphology, syntax, and semantics as forms of annotation. It is this match that lends plausibility to the early claims that text is simply an OHCO. Such abstractions are appropriate, and even necessary, as long as the phenomena of interest are not affected by the abstraction. This simply does not hold when we move to instances of multimodal communication, however. In such cases this apparent independence of form from material is revealed to be no longer self-evident and often false: in many situations we do not yet know just what kinds of distinctions are involved and so it is important to be more cautious and make the selection and operation of distinct semiotic modes explicit. Moreover, since we may well not know the details of the semiotic modes at work, this necessarily involves empirical or hermeneutic investigations and hypotheses concerning how best to characterize any material regularities observed. When characterizing any materiality, then, we need to make hypotheses concerning the distinctions that may be relevant. This is precisely the methodological work taken up by semiotic modes. The primary task of text-encoding is to consider, for any artifacts or performances under study, just what the necessary information concerning those entities is—i.e., just what information is to be abstracted and included within an annotated text for the purposes of further analysis. For language-based artifacts, this decision is generally assumed to correspond to a content-form distinction, but this is just the view that follows from the application of that particular semiotic mode. The content is provided by the discourse semantics (and its embedding in context); the form by the technical features (and links to materiality). Crucially, content-form distinctions (expanded to the three-way layering intrinsic to semiotic modes) may be imposed separately by each semiotic mode that might apply for any specific object of analysis. For example, semiotic modes of page composition, graphical forms, typography, and so on may make their own independent contributions and require quite different (but simultaneous) characterizations of the materiality under study. The different semiotic strata of a semiotic mode correspond directly to the general task of deciding on an annotation, and so provide a strong semiotic motivation for several different kinds of annotations. All of these may be captured formally
in similar ways (typically as ‘stand-off’ annotations), but nevertheless have very different statuses in terms of the stages of empirical analysis they correspond to. Semiotic modes are, in this view, ways of organizing and bundling our research activities and can play out quite concretely in terms of distinct schemes of annotation. In many cases, formulating a coding scheme for data is at the same time making one’s current best hypothesis concerning relevant semiotic modes explicit as well. This consequently offers the formal support necessary for Eggert’s proposal cited above that ‘tagsets’ should act as repositories for evolving traditions of “commentary and scholarship” (Eggert 2010, 198). Moreover, whereas all forms of commentary and interpretation can be included in tagsets, those that are anchored in specifications of semiotic modes go further. To the extent that the regularities in data revealed using the annotation schemes correspond either to patterns in the data or the results of experimental studies employing those patterns, one knows that support for the particular specification of the semiotic mode used has been gained. When mismatches occur, this is then a solid empirical indication that revisions are necessary. By these means we retain contact with the realism advocated by Renear, providing at the same time a broader semiotic foundation for encoding details concerning any forms of materiality—including those areas for which Eggert declares the impossibility of finding ‘codes.’ The exploration of materials via the application of co-occurring semiotic modes, each of which is further anchored into discourse interpretation, provides precisely the interpretative engagement with materiality called for in the quote from Hayles at the beginning of this section. It is semiotic modes that assign meaning and signification to material regularities and making the technical details of those modes explicit is the task of constructing ‘codes.’ In terms of the distinctions discussed in the ontological approaches to information artifacts and entities, we also achieve more clarity. The ‘information coding systems’ are simply the semiotic modes proposed in this model. These systems now receive further internal structuring, however, relating them more clearly to the information-bearing qualities that inhere in physical objects and to the construction of meanings via discourse interpretation. The canvas of a semiotic mode has the task of describing just those pattern-types that the mode is responsible for providing interpretations for; the materiality itself is where physical properties of the ‘object’ are anchored. All descriptions within a semiotic description of this kind are characterized further along the dimension of actualization: ranging from potential to instance. The semiotic mode specification describes the potential; while actual cases of the use of any semiotic mode are instantiations of that potential. This provides more detail for the more informal application of the type-token distinction in the semiotic domain. A ‘text,’ construed quite generally and independently of individual semiotic modes such as language, is then simply a full instantiation of the potential defined by some semiotic mode. This relationship is similar to the type-token relationship or to that holding between annotation schemas and actual annotations conforming to those schemas. This then separates the contributions of expressions and manifestations in the FRBR senses as well. An expression covers the possibilities offered by the technical details semiotic stratum of a semiotic mode. The system underlying any expressions is given by the potential of the semiotic mode in play. Actual expressions are then instantiations of that system down to the layer of technical details. Manifestations are then characterizations of the material realizations of the technical details. These have quite different properties. An item is then one specific physical object bearing the necessary material patterns. These constructs are now all semiotically and ontologically clearly distinguished so that the use of particular terms, such as ‘document,’ can be measured against the identity criteria that the framework provides. Since actualization can apply at all three levels of semiotic abstraction, this also establishes a natural bridge to the annotation of interpretations. As we saw in the first sections of this chapter above, the question of interpretations and their relationships to annotations has been one of the most problematic areas for discussion concerning texts for a considerable time. The usual problem hindering such discussions is a lack of differentiation concerning just which kinds of interpretations are involved. To simply talk of the ‘meaning’ of the text is not semiotically helpful and suggests a far greater variability than is actually the case. Many aspects (i.e., levels of abstraction) of texts are by no means so free. Appropriate solutions must then be able to respect these kinds of differences so as to allow texts to be both stable (relatively) and ‘unstable’ in the sense that different paths of interpretation and explanation can be followed. This maps directly to the levels of semiotic abstraction proposed. Interpretations may be seen as directly anchored in properties of a ‘text,’ generated by diagrammaticity as described above. This applies to the lowest levels of semiotic abstraction. Discourse semantics, however, already moves beyond diagrammaticity as it necessarily involves abductive hypotheses concerning discourse organization (cf. Bateman 2016, 2020). This is a more free kind of interpretation but one that is still anchored into the formal distinctions attributed to a text. More abstract still are readings of the ‘text,’ where what is understood as ‘text’ is now already an instantiation of possible descriptions made at all of the lower levels of abstraction. There is, in principle, no problem in annotating such explorations; all that must be borne in mind is that the theoretical status of such annotations is quite different to that of the more textually-anchored descriptions provided. Realist claims can reasonably be made for the distinctions defined with respect to a semiotic mode, but hardly for the readings exploring more abstract and community-specific contextualizations of any entities under investigation. In this sense, then, and refining and complementing Pichler’s (2021) claims quoted above, a ‘text’ can exist “without being sustained by an act of reading” as long as we restrict attention to the lower semiotic strata—these are properties taken to hold of the entity characterized independently of its reception. For the more abstract semiotic domains, however, annotation must be seen more in terms of the “protocols of […] mapping and interpretation” identified by Sahle (2015). These distinct types of annotation need to be properly distinguished. An overall map of the connections between semiotic layers of analysis and types of annotations relevant for digital representations of cultural artifacts and performances is shown graphically in figure 4.

[Figure 4 diagram: each semiotic stratum is paired with an annotation type: materiality and regularities in material (the ‘canvas’) with Material Annotations (measurements); technical features and regularities in form with Form Annotations (formal distinction coding schemes); discourse semantics and discourse interpretations with Discourse Interpretation Annotations; contextualized readings with Annotations of Readings; and genre, register, and situation with Metadata.]

Fig. 4: The distinct types of annotations generated by the semiotic model. Each semiotic ‘stratum’ corresponds to annotations with quite distinct formal properties, all of which are however necessary for a complete treatment.

On the left-hand side of the figure we can see the usual tri-stratal model of semiotic modes embedded within context and related to register and genre. The lowest levels of abstraction in the model involve materiality and are captured via ‘measurements’ of that materiality: measurements is understood here in a very general sense ranging from physical dimensions to facsimiles. The technical details stratum of a semiotic mode then corresponds to traditional form-oriented coding schemes. Each semiotic mode involved will define its own formal units, however—and no one semiotic mode is considered intrinsically less central or more important than any other. The next stratum of discourse semantics provides categories for annotating both hypotheses concerning
likely discourse interpretations (potential) and particular discourse interpretations that may be argued for the object being annotated (instances). Moving beyond the responsibilities of the semiotic mode, contextualization may include particular readings made on the basis of the discourse interpretations proposed. Such readings are by no means limited to information that is directly available in the materials annotated and include hermeneutic explorations, arguments and interpretative hypotheses/claims in general. Such readings may make use of any of the units defined by the less semiotically abstract annotations. Finally, broad categories concerning the genre or situation of the materials provide information for metadata of traditional kinds. The theoretical framework embodied here is defined independently of whether we are dealing with digital entities or non-digital entities; the reasons for this seen from a semiotic perspective are discussed more in Bateman (2021a). Although it is not the case that a digitally-presented ‘page’ and a physical page are similar entities ontologically, refocusing on the ‘materiality’ that recipients and participants are confronted with allows us to transfer semiotic descriptions and descriptive schemes across cases quite easily. A good approximation to this view is also provided by the notion of ‘interface’ as discussed by Drucker: A book is an interface, so is a newspaper page, a bathroom faucet, a car dashboard, an ATM machine. An interface is not so much a ‘between space’ as it is the mediating environment that makes the experience, ‘a critical zone that constitutes a user experience.’ I don’t access ‘data’ through a web page; I access a web page that is structured so I can perform certain kinds of queries or searches. We know that the structure of an interface is information, not merely a means of access to it. (Drucker 2011, 10; original emphasis)
Crucial here is the notion of ‘mediating environment.’ ‘Documents’ in general can be characterized as entities playing the role of establishing environments with such a functionality. Thus, to be a document is to play a particular role, in the foundational ontological sense, in social activities involving mediations of information and meaning (cf. Smith 2014 and Pédauque’s su). This status as a role explains why it is not possible to determine ‘physical’ properties sufficient for recognition criteria. Nevertheless, whenever the role of being a document is carried by a semiotic entity, such as a text, then the properties of such semiotic entities follow the principles set out in the previous subsection. A digital document is consequently an emergent entity that a particular constellation of technical processes gives rise to for a user to interact with. The ‘content’ of some digital document is made up of just those information patterns that support the application of the semiotic modes intended in order to provide an interpretation of those patterns. Such patterns can naturally be carried by many quite distinct and diverse bearers and may also often be transferable to non-digital bearers—as in when a pdf file is printed out. This is not always possible: a 3D interactive model will not transfer to a sheet of paper because the affordances of paper do not support the semiotic modes employed for the original model. But as long as there is some overlap in the abilities of the materialities employed and the requirements of the employed canvases, then some medial depiction will be possible. An exact match is seldom required. This also aligns with, and extends, some of the discussions made concerning digitally managed documents within the text encoding community. To consider such entities as strings of characters as often done is clearly only the tip of a semiotic iceberg. Strings of characters are particular instantiations of the potential made available by the technical features stratum of some specific semiotic mode. Their significance is only given by their anchoring within the contextualized discourse interpretation offered by that semiotic mode. This provides more background for making sense of the rather contentious conclusions reached in a series of articles by Renear and Wickett discussing the consequences of considering a (digital) document solely as a string of characters (Renear and Wickett 2009, 2010). First, they conclude that documents cannot change, since mathematical objects such as ‘sets’ do not change. All talk of adding or changing documents must then be about something else: It is common to speak as if digital objects change, and yet if digital objects are things like bit strings, sets, tuples, graphs, and such then they cannot change. We’ve presented a resolution: digital objects do not change, what changes are our attitudes, individual or collective towards those objects. (Renear et al. 2008, 3)
This is a good indication of the need to address notions such as ‘documents’ with more ontological and semiotic precision; no simple relationship to strings and markup can be assumed. Thus, although the characterization of at least language-based documents in terms of character strings and structural markup has been enormously useful in securing access to many aspects of text-based information, any formal restriction of ‘document’ to abstract strings also encourages less productive lines of development. In contrast, the view of ‘content’ as (information-bearing) qualities naturally allows change: properties and qualities of objects are changing all the time. The presentation and tracking of particular ‘objects’ whose informational properties can change is simply one of the tasks that any appropriate computational infrastructure needs to support. This is rather more specific than talk of ‘attitudes’ that may or may not be held. An XML document is then a digital entity in this sense. Information patterns are maintained in some technologically implemented ecosystem of ‘document’ and ‘file’ use and the content of those patterns is a reading as an XML document.
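To make the distinction between a string of characters and its reading concrete, and to connect it back to the ‘stand-off’ annotations mentioned earlier, the following minimal sketch may help; the element names, offsets, and the interpretive label are invented for illustration and are not tied to any particular encoding standard.

```python
import xml.etree.ElementTree as ET

# The starting point: nothing but a sequence of characters.
raw = "<letter><salutation>Dear friend</salutation><p>I will come in May.</p></letter>"

# One reading of those characters: as an XML document (form-level structure).
doc = ET.fromstring(raw)
print(doc.tag)               # letter
print(doc.find("p").text)    # I will come in May.

# A stand-off annotation at a different stratum: a discourse-level hypothesis
# anchored to character offsets in the raw string, not embedded in it.
standoff = {
    "target": (47, 66),                      # offsets of the <p> element's text
    "stratum": "discourse semantics",
    "label": "commitment to future action",  # invented interpretive category
}
print(raw[standoff["target"][0]:standoff["target"][1]])   # I will come in May.
```

Change over time then amounts to maintaining such patterns and their readings within an appropriate infrastructure, rather than to mutating an immutable string.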
Although this is a very restricted kind of text document, it is still subject to the stratified semiotic view introduced above. Such documents can be used for many purposes as they are simply a structuring of data whose discourse semantic interpretation allows them to be resolved as referring to quite different entities. Semiotically, therefore, they could also be used for specifying any of the semiotic layers of a semiotic mode, both potential and actual. They are, in this sense, descriptions (expressions) of those semiotic constellations. This appears to be the position that Renear and colleagues were attempting to work towards in their attempt to place XML-documents within the FRBR, although the details remained unclear. Finally, we can turn to the most abstract construct of a ‘work.’ Here the FRBR definition again provides a good starting point, defining it as a “distinct intellectual or artistic creation” (IFLA 1998, 17). Following the documentation of the ‘ontologized’ version of FRBR in Bekiari et al. for more detail: An instance of […] Work begins to exist from the very moment an individual has the initial idea that triggers a creative process in his or her mind. […] Unless a creator leaves one physical sketch for his or her Work, the very existence of that instance of […] Work goes unnoticed, and there is nothing to be cataloged. (Bekiari et al. 2015, 27)
This view meshes well with the semiotic view of ‘texts’ (construed broadly as any instantiation of a collection of semiotic modes within a medium) as externalized supports for further semiotic work. Creators of works of any kind typically produce a broad variety of such supports as part of their intellectual projects. Externalization of this kind is in fact an essential facet of producing complex works. But there is no requirement that those supports constitute what one might consider a single ‘text’—indeed, over time a single ‘work’ might cover a range of texts, partial and perhaps mutually inconsistent, and even employ a variety of media. The general properties of these externalizations are already known: they are simply the consequences of using the semiotic model set out above. This includes all forms of possible externalization, regardless of whether these are linguistic, graphical, or some other combination of expressive forms. All such externalizations are nevertheless ‘texts’ in the extended sense defined here: that is, they are instantiations of semiotic modes anchored in specific materials. The material distinctions that are relevant are then precisely those that are ‘claimed’ by the use of particular semiotic modes. Again: semiotically, there are simply no other options for producing interpretable traces. Grouping such supports together may require considerable editorial effort of its own of course and, in general, there can be no guarantee that the ‘work’ created was actually so intended by its originators. The Nachlass example from Pichler cited above is a good illustration of this.
Semiotically, however, this process of ‘work’-building and its constituent activities are still relatively straightforward. To the extent that the discourse contributions of individual components can be made to cohere (by acts of editorial interpretation), one might work towards a ‘text’ realizing the work, but it is perfectly possible that heterogeneous ‘work fragments’ remain. It is, in the last resort, only the interpretation of the individual fragments employing any semiotic modes necessary (e.g., pictorial representations, diagrams, graphs, logical formulas, truth-tables, etc.) that makes such activity possible. And, of course, this by no means rules out the simplest case often discussed, i.e., where some intellectual effort is encapsulated in a ‘text’ and ratified by its creator(s) as constituting ‘the’ work. That ratification may, however, stretch across all semiotic layers of the modes employed and so cannot, in principle, be restricted to abstract notions such as, for example, the FRBR expression category. Paintings, for example, commonly reach not only into their manifestations but also down to the individual items involved. The usual questions and issues around distinctions such as Goodman’s (1969) autographic and allographic art forms are all relevant here, but have few consequences for the semiotic characterization as this remains strictly more general, covering all of these individual cases.
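Read against the FRBR vocabulary used in this section, the correspondences just discussed can be summarized in a short sketch; the glosses are paraphrases for orientation only, not definitions from FRBR or from the semiotic framework.

```python
# Illustrative summary of how the FRBR categories line up with the semiotic
# strata discussed above; the wording of the glosses is a paraphrase.
FRBR_TO_SEMIOTIC = {
    "work":          "a distinct intellectual or artistic creation, possibly spanning "
                     "heterogeneous externalized 'texts' and media",
    "expression":    "an instantiation of a semiotic mode down to its technical-features stratum",
    "manifestation": "a characterization of the material realization of those technical features",
    "item":          "one specific physical object bearing the required material patterns",
}

for category, gloss in FRBR_TO_SEMIOTIC.items():
    print(f"{category:>13}: {gloss}")
```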
5 Conclusions

The account set out in this contribution has sought to bring together a range of long discussed issues in and around texts and documents within a single overarching semiotic perspective. It has been suggested that it is precisely the lack of such a perspective that has led many proposals to fall short of the often rather complex phenomena they are targeting. Assuming a semiotic foundation of the kind introduced here provides a new web of relations for distinguishing the interrelations and connections necessary in a systematic manner. In addition, we have established several interesting parallels with the tri-focal view of ‘documents’ developed in Pédauque (2006). The layers of abstraction required in the multimodal semiotic model find natural correspondences in their vu–lu–su division. Although some have criticized the Pédauque treatment on the grounds that it is not immediately evident how “a general document theory emerges out of these three perspectives” (Lund 2009, 38), relating their terms to the richly internally structured and inter-related categories of the multimodal semiotic model may well help advance this goal considerably. Such triangulation will become even more relevant when considering the consequences of applying the model in digital contexts. Further bridges may be established here by incorporating more refined notions of Drucker’s (2011) ‘mediating environments’ re-constructed within the multimodal semiotic model as proposed in, for example, Bateman (2021a). Clearly, much remains to be articulated further; but the framework as provided can already serve to clarify questions that need to be asked, while constraining sensibly the kinds of answers that will be most beneficial when engaging with the nature of documents, texts, and other similarly complex semiotic constructs.
Bibliography

Arp, Robert, Barry Smith, and Andrew D. Spear. Building Ontologies with Basic Formal Ontology. Cambridge, MA: MIT Press, 2015.
Bateman, John A. The place of language within a foundational ontology. In: Varzi, Achille C. and Laure Vieu, editors, Formal Ontology in Information Systems: Proceedings of the Third International Conference on Formal Ontology in Information Systems (FOIS-2004), pp. 222–233, Amsterdam: IOS Press, 2004.
Bateman, John A. The decomposability of semiotic modes. In: O’Halloran, Kay L. and Bradley A. Smith, editors, Multimodal Studies: Multiple Approaches and Domains, pp. 17–38. London: Routledge, 2011.
Bateman, John A. Methodological and theoretical issues for the empirical investigation of multimodality. In: Klug, Nina-Maria and Hartmut Stöckl, editors, Handbuch Sprache im multimodalen Kontext, pp. 36–74. Berlin: De Gruyter Mouton, 2016.
Bateman, John A. Peircean semiotics and multimodality: Towards a new synthesis. Multimodal Communication, 7(1), 2018. DOI: https://doi.org/10.1515/mc-2017-0021.
Bateman, John A. The foundational role of discourse semantics beyond language. In: Zappavigna, Michele and Shoshana Dreyfus, editors, Discourses of Hope and Reconciliation. On J. R. Martin’s Contribution to Systemic Functional Linguistics, pp. 39–55. London, New York: Bloomsbury, 2020.
Bateman, John A. What are digital media? Discourse, Context & Media, 41, 2021a. DOI: https://doi.org/10.1016/j.dcm.2021.100502.
Bateman, John A. Dimensions of materiality: Towards an external language of description for empirical multimodality research. In: Pflaeging, Jana, Janina Wildfeuer, and John A. Bateman, editors, Empirical Multimodality Research: Methods, Evaluations, Implications, pp. 35–64. Berlin: De Gruyter, 2021b.
Bateman, John A., Janina Wildfeuer, and Tuomo Hiippala. Multimodality – Foundations, Research and Analysis. A Problem-Oriented Introduction. Berlin: De Gruyter Mouton, 2017.
Bekiari, Chryssoula, Martin Doerr, Patrick Le Boeuf, and Pat Riva. Definition of FRBRoo: A Conceptual Model for Bibliographic Information in Object-Oriented Formalism. The Hague: IFLA, 2015.
Biggs, Michael and Claus Huitfeldt. Philosophy and electronic publishing. Theory and metatheory in the development of text encoding. Monist, 80(3):348–367, 1997.
Borgo, Stefano and Oliver Kutz. FOUST – the foundational stance. Applied Ontology, 17, 2022.
194 | John A. Bateman Borgo, Stefano and Claudio Masolo. Ontological foundations of DOLCE. In: Poli, Roberto, Michael Healey, and Achilles Kameas, editors, Theory and Applications of Ontology: Computer Applications, pp. 279–296. Dordrecht, Heidelberg, London, New York: Springer, 2010. Briet, Suzanne. Qu’est-ce que la documentation? Paris: ÉDIT, 1951. Translated in What is documentation? English translation of the classic French text by Ronald E. Day, Laurent Martinet, and Hermina G. B. Anghelescu. Scarecrow, Lanham, MD, 2006. Brown, John Seely and Paul Duguid. The social life of documents. First Monday, 1(1), 1996. URL: https://firstmonday.org/ojs/index.php/fm/article/view/466/820, (13.11.2021). Buckland, Micheal K. What is a “digital document”? Document Numérique, 2(2):221–230, 1998. Chisholm, Roderick M. The primacy of the intentional. Synthese, 61(1):89–109, 1984. Dennett, Daniel C. The interpretation of texts, people and other artifacts. Philosophy and Phenomenological Research, 50:177–194, 1990. DeRose, Steven J., David G. Durand, Elli Mylonas, and Allen H. Renear. What is text, really? Journal of Computing in Higher Education, 1(2):3–26, 1990. Drucker, Johanna. Humanities approaches to interface theory. Culture Machine, 12:1–20, 2011. Durand, David G., Elli Mylonas, and Steven J. DeRose. What should markup really be? Applying theories of text to the design of markup systems. In: Proceedings of ALLC/ACH Conference, pp. 67–70, Bergen, 1996. Eggert, Paul. Text as algorithm and as process. In: McCarty, Willard, editor, Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions, pp. 183–202. Cambridge: OpenBook Publishers, 2010. Frohmann, Bernd. Revisiting ‘what is a document?’. Journal of Documentation, 65(2):291–303, 2009. Gangemi, Aldo and Silvio Peroni. The information realization patterns. In: Hitzler, Pascal, Aldo Gangemi, Krzysztof Janowicz, Adila Krisnadhi, and Valentina Presutti, editors, Ontology Engineering with Ontology Design Patterns: Foundations and Applications, volume 25, pp. 299–312. Amsterdam: IOS Press, 2016. Goodman, Nelson. Languages of Art. An Approach to a Theory of Symbols. London: Oxford University Press, 1969. Greenstein, Daniel and Lou Burnard. Speaking with one voice – encoding standards and the prospects for an integrated approach to computing in history. Computers and the Humanities, 29(2):137–148, 1995. Hayles, N. Katherine. Translating media: why we should rethink textuality. The Yale Journal of Criticism, 16(2):263–290, 2003. Hayles, N. Katherine. Print is flat, code is deep: The importance of media-specific analysis. Poetics Today, 25(1):67–90, 2004. Howell, Robert. Ontology and the nature of the literary work. The Journal of Aesthetics and Art Criticism, 60(1):67–79, 2002. Huitfeldt, Claus. Multi-dimensional texts in a one-dimensional medium. Computers and the Humanities, 28:235–241, 1997. IFLA. Functional Requirements for Bibliographic Records: Final Report. Munich: K. G. Sauer, 1998. International Federation of Library Associations and Institutions. IFLA. Functional requirements for bibliographic records. Technical report, International Federation of Library Associations and Institutions, February 2009. URL: http://www.ifla. org/VII/s13/frbr/, (13.11.2021).
Ingarden, Roman. Ontology of the Work of Art: The Musical Work, the Picture, the Architectural Work, the Film. Raymond Meyer and John T. Goldwait, translators. Athens: Ohio University Press, 1989.
Jakobson, Roman. Language in relation to other communicative systems. In: Roman Jakobson: Selected Writings, volume II: Word and Language, pp. 570–579. The Hague: Mouton, 1971.
Lund, Niels Windfeld. Document theory. Annual Review of Information Science and Technology, 43(1):1–55, 2009. DOI: https://doi.org/10.1002/aris.2009.1440430116. URL: https://asistdl.onlinelibrary.wiley.com/doi/abs/10.1002/aris.2009.1440430116, (13.11.2021).
Margolis, Joseph. Works of art as physically embodied and culturally emergent entities. The British Journal of Aesthetics, 14(3):187–196, 1974.
McGann, Jerome J. The Textual Condition. Princeton: Princeton University Press, 1991.
McGann, Jerome J. Radiant Textuality: Literature after the World Wide Web. New York, Basingstoke: Palgrave Macmillan, 2001.
Ochs, Elinor. Transcription as theory. In: Ochs, Elinor and Bambi B. Schieffelin, editors, Developmental Pragmatics, pp. 43–72. New York: Academic Press, 1979.
Otlet, Paul. Traité de documentation. Brussels: Editions Mundaneum, 1934.
Otlet, Paul. International Organisation and Dissemination of Knowledge: Selected Essays of Paul Otlet. W. Boyd Rayward, translator. Amsterdam: Elsevier, 1990.
Pédauque, Roger T. Le Document à la lumière du numérique. Caen: C&F éditions, 2006.
Peirce, Charles Sanders. Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press, 1931–1958. Vols. 1–6, 1931–1935, edited by Charles Hartshorne and Paul Weiss; Vols. 7–8, 1958, edited by Arthur W. Burks.
Peroni, Silvio and David Shotton. FaBiO and CiTO: Ontologies for describing bibliographic resources and citations. Web Semantics: Science, Services and Agents on the World Wide Web, 17:33–43, 2012.
Pichler, Alois. Transcriptions, texts, and interpretations. In: Johannessen, Kjell S. and Tore Nordenstam, editors, Culture and Value: Philosophy and the Cultural Sciences. Beiträge des 18. Internationalen Wittgenstein Symposiums. 13.–20. August 1995, pp. 690–695. Kirchberg am Wechsel: Austrian Ludwig Wittgenstein Society, 1995.
Pichler, Alois. Hierarchical or non-hierarchical? A philosophical approach to a debate in text encoding. Digital Humanities Quarterly, 15(1), 2021. URL: http://www.digitalhumanities.org/dhq/vol/15/1/000525/000525.html, (13.11.2021).
Pierazzo, Elena. Digital Scholarly Editing: Theories, Models and Methods. Farnham, Surrey: Ashgate, 2015.
Plassard, Marie-France. Functional Requirements for Bibliographic Records: Final Report. Basel: K. G. Saur, 2013. URL: http://cds.cern.ch/record/2109506, (13.11.2021).
Renear, Allen. Out of praxis: three (meta)theories of textuality. In: Sutherland, Kathryn, editor, Electronic Text: Investigations in Method and Theory, pp. 107–126. Oxford: Clarendon Press, 1997.
Renear, Allen. Is an XML document a FRBR manifestation or a FRBR expression? – both, because FRBR entities are not types, but roles. In: Extreme Markup Languages, Montréal, Québec, August 2006.
Renear, Allen, Christopher Phillippe, Pat Lawton, and David Dubin. An XML document corresponds to which FRBR group 1 entity? In: Extreme Markup Languages, Montréal, Québec, August 2003.
196 | John A. Bateman Renear, Allen H. and David Dubin. Towards identity conditions for digital documents. In: Dublin Core Conference, DCMI International Conference on Dublin Core and Metadata Applications, pp. 181–189, 2003. Renear, Allen H. and David Dubin. Three of the four FRBR group 1 entity types are roles, not types. Proceedings of the American Society for Information Science and Technology, 44(1):1–19, 2007. Renear, Allen H. and Karen M. Wickett. Documents cannot be edited. In: Balisage: The Markup Conference 2009, 2009. DOI: https://doi.org/10.4242/BalisageVol3.Renear01. Renear, Allen H. and Karen M. Wickett. There are no documents. In: Balisage: The Markup Conference 2010, Montréal, Canada, August 2010. DOI: https://doi.org/10.4242/BalisageVol5. Renear01. Renear, Allen H., David Dubin, and Karen M. Wickett. When digital objects change – exactly what changes? Proceedings of the American Society for Information Science and Technology, 45 (1):1–3, 2008. Robinson, Peter. What text really is not, and why editors have to learn to swim. Literary and Linguistic Computing, 24(1):41–52, 2009. Sahle, Patrick. Digitale Editionsformen. Zum Umgang mit der Überlieferung unter den Bedingungen des Medienwandels. Teil 3: Textbegriffe und Recodierung. volume 9 of Schriften des Instituts für Dokumentologie und Editorik. Norderstedt: BoD, 2013. Sahle, Patrick. Traditions of scholarly editing and the media shift. Presentation at International Seminar on Digital Humanities: Scholarly Editing and the Media Shift (Verona, 8.9.2015), 2015. Sanfilippo, Emilio M. Ontologies for information entities: State of the art and open challenges. Applied Ontology, 16:111–135, 2021. Schamber, Linda. What is a document? Rethinking the concept in uneasy times. Journal of the American Society for Information Science, 47(9):669–671, 1996. Schmidt, Desmond Allan. The inadequacy of embedded markup for cultural heritage texts. Literary and Linguistic Computing, 25(3):337–356, 2010. Sebeok, Thomas A., editor. Encyclopedia Dictionary of Semiotics. 2nd edition, Berlin: De Gruyter, 1994. Smith, Barry. How to do things with documents. Rivista di Estetica, 50:179–198, 2012. Smith, Barry. Document acts. In: Konzelmann-Ziv, Anita and Hans Bernhard Schmid, editors, Institutions, Emotions, and Group Agents: Contributions to Social Ontology, pp. 19–31. Dordrecht: Springer, 2014. Smith, Barry and Werner Ceusters. Aboutness: Towards foundations for the information artifact ontology. In: Proceedings of the Sixth International Conference on Biomedical Ontology (ICBO), volume 1515, pp. 1–5. CEUR, 2015. Smith, Barry, Tatiana Malyuta, Ron Rudnicki, William Mandrick, David Salmen, Peter Morosoff, Danielle K. Duff, James Schoening, and Kesny Parent. IAO-Intel: An ontology of information artifacts in the intelligence domain. In: Proceedings of the Eighth International Conference on Semantic Technologies for Intelligence, Defense, and Security (STIDS 2013), volume 1097 of CEUR, pp. 33–40. CEUR, 2013. Text Encoding Consortium. TEI P5: Guidelines for electronic text encoding and interchange. Technical Report 4.3.0, ACH-ALLC-ACL Text Encoding Initiative, 2021. URL: http://www.teic.org/Guidelines/P5/, (23.10.2021).
Thomasson, Amie L. The ontology of literary works. In: Gibson, John and Noël Carroll, editors, The Routledge Companion to Philosophy of Literature, pp. 349–358. London, New York: Routledge, 2016.
Tillett, Barbara B. What is FRBR: A conceptual model for the bibliographic universe. Technical report, Library of Congress. Cataloging distribution service, Washington, DC, 2004. URL: http://www.loc.gov/cds/FRBR.html, (13.11.2021).
Vanhoutte, Edward. An introduction to the TEI and the TEI consortium. Literary and Linguistic Computing, 19(1):9–16, 2004. DOI: http://dx.doi.org/10.1093/llc/19.1.9.
Welty, Chris and William Andersen. Towards OntoClean 2.0: A framework for rigidity. Applied Ontology, 1(1):107–116, 2005.
Winkler, Hartmut. Zeichenmaschinen: oder warum die semiotische Dimension für eine Definition der Medien unerlässlich ist. In: Münker, Stefan and Alexander Roesler, editors, Was ist ein Medium?, pp. 211–222. Frankfurt am Main: Suhrkamp Verlag, 2008.
Part III: Background
Peter Tscherkassky. Motion Picture (La Sortie des Ouvriers de l’Usine Lumière à Lyon). Film, 1984. The film used, Workers Leaving the Lumière Factory in Lyon, directed and produced by Louis Lumière in 1895, is often referred to as the first motion picture ever made. Courtesy of Peter Tscherkassky.
Gerald Hartung and Karl-Heinrich Schmidt
Simmel’s Excursus on Written Communication
A Commentary in Eleven Steps
1 Preliminaries

Simmel’s excursus on written communication is to be found in the fifth chapter of his Soziologie, which bears the heading “The Secret and the Secret Society” (Simmel 1908; Wolff 1950). By written communication, Simmel specifically means remote communication by letter, which—fittingly, given the heading just quoted—seems to boil down to an analysis of secrecy in letters. This special element of communication by letter does indeed play a role in the excursus, but it is not the central secret with which Simmel is concerned here. It is instead the “secret of the other,” in relation to the interpretation of an (epistolary) utterance, that is central for him: when does the recipient of a letter need interpretive skills—and when can he take an epistolary utterance in its “logical sense” (see below) without interpretive effort? Simmel is concerned here with manifestations—in his view unique to letters—of the basic fact that another person can act differently (in communication) than I expect, and that this other person can act differently than I expect precisely insofar as and because he knows what I expect. Or, to put it another way, Simmel’s excursus is concerned with the question of how this double contingency becomes apparent in remote communication in the letter as a genre (Simmel writes “form of communication”). To address this question, Simmel develops a central maxim in comparison to and contrast with oral exchange:

the letter is clearer than speech where the secret of the other is not the issue; but where it is the issue, the letter is more ambiguous.
In what follows, this maxim is treated as an essential insight to be drawn from Simmel’s analysis of communication by letter. That analysis brings together various themes that impinge representatively (and for Simmel, as has been said, uniquely) on the use of a complex genre in remote communication; subsequently, they have also turned out to be fundamental to the analysis of communication in human societies.

Gerald Hartung and Karl-Heinrich Schmidt, University of Wuppertal
https://doi.org/10.1515/9783110780888-008
The development of this central maxim will be analyzed below in the form of a commentary. To this end, the excursus will be presented in its entirety, broken down into sections under thematic headings. These sections—eleven in total—are contextualized by Gerald Hartung (GH) with a sensitivity to the history of philosophy and cultural theory; this is intended to present an informed picture of Simmel in his own time. Karl-Heinrich Schmidt (KHS) undertakes to look ahead from there by tracing, with Simmel as the starting point, a theoretical strand in which Simmel’s ideas and intuitions are drawn into a formalizable conceptual apparatus. Bringing together these two perspectives will reveal the systematizing potential of this early classic in the literature on sociology and the philosophy of culture for a future theory of the use of documents, and thereby also remote communication, in society.
2 Simmel’s Excursus on Written Communication: Text and Commentary

2.1 Introduction

“Some remarks on the sociology of the letter are appropriate here, since the letter, evidently, represents a very peculiar constellation even under the category of secrecy.” (Wolff 1950, 352)

(GH): The theme—“some remarks […] are appropriate”—invokes a fundamental question (one that dominates all of Simmel’s Soziologie): how is society possible? Part of the answer lies in the fact that we think we know something about one another. We have to make assumptions about the other with whom we are dealing in any given case of social interaction in order to want to deal with that other in the first place. Simmel is interested, in the first instance, in the general problem of how society is possible when there is so much we do not know about others and about the conditions that define social coexistence. And he is particularly interested in the media of social coexistence. The letter is a medium of social interaction that can be used to illustrate the tensions between knowing and not-knowing, between an “open” and a “closed” form of communication, and between explicit communication and “secrets.”

(KHS): Against this background, the following modeling categories are used in the elucidation of Simmel’s excursus below. For “closed” point-to-point communication by letter, which is subject to the normative requirement that it cannot be
Fig. 1: Simmel writes a letter to a friend. Image: Michael Ruml.
accessed by third parties, we can begin by distinguishing a production situation 𝑢 and a processing situation 𝑣; the message Φ𝑢 is produced in the former and picked out, or extracted and obtained, in the latter. Simmel always has written form in mind, which means that a textual document arises. It is further assumed that what Simmel himself envisaged is a paper vehicle of transmission that is delivered to a recipient as a complete document after the process of its production. Furthermore, he treats the production situation and processing situation as located in two areas of space and time, neither of which can be observed from the other, which means that we can consider remote communication in the true sense (overcoming spatiotemporal distance) to be involved. This remote communication does not, it is true, exclude in principle the possibility of being read by third parties who might gain access to it; but that is treated as undesirable (“something particularly ignoble”; see below). Simmel does not explicitly say anywhere whether there is a preexisting discourse context between sender and recipient that would require a discourse situation (𝑑 in figure 1) to be modeled for communication by letter—in the form of an epistolary friendship, for instance. He does, though, go on to refer to the possibility of “memories of direct personal contact” being involved (see below), so it must be concluded that a discourse context can (but does not have to) be present. It is further assumed that sender and recipient are, in temporal copresence, embedded in a situation that subsumes the production situation, the processing situation, and the discourse situation. This embedding situation (𝑒 in figure 1) allows use to be made of information that is not supported by the shared discourse situation but is directly relevant in Φ𝑢 .
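One way of rendering these modeling categories in a compact form is sketched below; it is offered purely as an illustration, and the field names and example values are invented rather than taken from Simmel or from the commentary.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Situation:
    """A spatiotemporally located situation: production (u), processing (v),
    discourse (d), or embedding (e)."""
    label: str
    place: str
    time: str

@dataclass
class LetterCommunication:
    """Remote point-to-point communication by letter in Simmel's sense."""
    production: Situation                 # u: where and when the letter is written
    processing: Situation                 # v: where and when the letter is read
    embedding: Situation                  # e: situation subsuming u, v, and (if present) d
    message: str                          # the message produced in u (Phi_u in figure 1)
    discourse: Optional[Situation] = None # d: a prior discourse context may, but need not, exist
    third_party_access_intended: bool = False  # normatively excluded for the letter genre

exchange = LetterCommunication(
    production=Situation("u", "Berlin", "1908-05-01"),
    processing=Situation("v", "Heidelberg", "1908-05-04"),
    embedding=Situation("e", "shared temporal co-presence", "May 1908"),
    message="an epistolary utterance",
    discourse=Situation("d", "an ongoing epistolary friendship", "since 1900"),
)
print(exchange.discourse is not None)   # a discourse context can, but does not have to, be present
```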
2.2 Writing and Publicity

“In the first place, writing is opposed to all secrecy. Prior to its general use, every legal transaction, however simple, had to be concluded before witnesses. The written form replaced this necessity, inasmuch as it involves an unlimited, even if only potential, ‘publicity’: not only the witnesses, but everybody in general, may know of the business concluded.” (Wolff 1950, 352)

(GH): It is noteworthy that Simmel does not refer to the “function” of writing but instead addresses its nature (Wesen in the German). This gives an indication of how he combines different methodological approaches. From a historical perspective, there is an era before writing; from a sociological perspective, the function of writing changes; but from another perspective—let us simply call it that of the philosophy of culture—writing is just one particular way of expressing an interiority (feelings, thoughts, interests) that articulates itself for the purposes of fostering social interaction. Written expression is no different: in the case of human authorship, it entails revealing an inner stirring of drive, desire, and so on. In order to preserve a social interaction, what is expressed by one person must be recognized by another: it must be witnessed. This is the case for oral expression, as the provisions of many legal cultures make clear. The advantage of written expression is that it fixes what is expressed beyond the present of the production situation in any given case. Written documents call on all readers as witnesses: in principle, an unlimited cultural sharing of knowledge thereby becomes possible.

(KHS): Being witnessed can be desired or undesired in the case of written documents. The undesired reading of a letter by third parties generally takes place not for the purposes of inspecting the form of the document but to evaluate its message. It therefore makes sense that in his excursus, Simmel does not approach the theme of writing in terms of specific (textual) signs but instead puts the use of the content of what is written in the foreground. In so doing, Simmel moves away from the letter genre and engages with text more generally as a content architecture defined by specific conditions of production and processing (see also Kramme and Rammstedt 1996, 67 f.). Here, command of a set of signs and of orthographical norms is simply a necessary condition of a witnessing that is made possible by text. Building on this, such witnessing assumes many additional abilities if content is to be extracted accurately, not least where the use of natural languages is involved. This ultimately means that the parties concerned need to be sufficiently embedded in a shared cultural context. Simmel now formulates this specifically.
2.3 Objective Spirit

“Our consciousness has a peculiar form at its disposal, which can only be designated as ‘objective spirit.’ Natural laws and moral imperatives, concepts and artistic creations lie ready, as it were, for everybody able and willing to use them; but, in their timeless validity, they are independent of whether, when, and by whom they are thus used. Truth, as an intellectual phenomenon, is something quite different from its passing, actual object: it remains true, no matter whether or not it is known and acknowledged. The moral and juridical law is valid, whether lived by or not.” (Wolff 1950, 352)

(GH): Here, Simmel addresses one of the basic problems that he is concerned with in all aspects of his oeuvre: the relationship between the historical development and normative validity of cultural phenomena. “Relationship” here means that all cultural phenomena exist only because they have come into being. Nothing has been there forever. At the same time, however, the claim of cultural configurations—such as natural laws and legal and moral precepts, among other things—to validity cannot be derived from this status: if that were the case, the validity of all values, norms, beliefs, and so on would depend on their historical, sociocultural context. That is the position of cultural relativism. Since his initial work on the Philosophy of Money (German original 1900), Simmel made various efforts to understand more precisely why natural laws, concepts, and so on have a supraindividual and supratemporal validity. When Simmel refers to “timeless” validity here, he is speaking from the perspective of a participant in the cultural environment. We cannot conceive of a sociocultural world different from the world we know and are familiar with. Simmel links this supraindividual and supratemporal aspect with the concept of form. “Form” means a dimension that is accessible to all of us in a material and impermanent object. The validity of a law or an artistic creation is independent of whether we know about it, acknowledge it, or live by it. The truth of mathematical laws presents itself in this independence. This is a major conundrum for Simmel.

(KHS): The necessity of a shared cultural embedding for the successful use of cultural configurations is apparent in laws or artistic creations, but also in the “fine power of culture” in everyday life, and there most of all in oral and written communication, in particular in the application of everyday concepts. Simmel argues “upward” from the letter genre, via its assumed written character, to objective spirit. Decades later, Harvey Sacks works “downward” by drawing phenomena of—in his eyes—a “minor” nature (see below) into an empirical analysis of model texts where he demonstrates the necessity of shared concepts. In his groundbreaking
study On the Analyzability of Stories by Children, Sacks does this by examining not laws or works of art but relatively simple linguistic output such as “The baby cried. The mommy picked it up.”
Here too, shared concepts—in this example, of stage-of-life, family, and gender, if not more—guide the extraction of content when the text is read or listened to, allowing it to be understood immediately by readers and listeners. The mother is the female parent of the crying baby, and that baby is picked up because it is crying. This is not stated anywhere—but nonetheless, “our” culturally accreted understanding of how, in families, mothers interact with their babies guides the use of the text by sender and recipient. “The fine power of culture” reveals itself:

It does not, so to speak, merely fill brains in roughly the same way, it fills them so that they are alike in fine detail. The sentences we are considering are after all rather minor, and yet all of you, or many of you, hear just what I said you heard, and many of us are quite unacquainted with each other. (Sacks 1972)
Even when there is a considerable lack of knowledge about another person, this “fine power” makes successful communication and analysis of that communication possible. Sacks shares Simmel’s astonishment—but for very different reasons: for Sacks, it comes from analyzing “minor” phenomena that can easily be grasped empirically (see below). Sacks analyzes specific examples of what is communicated; Simmel is concerned, among other things, with means of communication such as writing and their contextualization in a theory of culture.
2.4 Objective Spirit and Writing “Writing is a symbol, or visible vehicle, of this immeasurably important category. In being written down, the intellectual content receives an objective form, an existence which, in principle, is timeless, a successively and simultaneously unlimited reproducibility in the consciousness of individuals. But its significance and validity are fixed, and thus do not depend on the presence or absence of these psychological realizations. Writing, thus, possesses an objective existence which renounces all guarantees of remaining secret.” (Wolff 1950, 352) (GH): The objectivity of a cultural configuration—like that of a scientific theory, a moral conviction, or an artistic configuration—presents itself in its supraindividuality and experienced timelessness. Whether it is also grounded in them or not need not concern us here. “Writing is a symbol, or visible vehicle,” of truth. A written
document does not add anything to intellectual content in terms of its claim to truthfulness; instead, it gives that content an existence, an objective form. Simmel is interested in the fact that in the document intellectual content provides itself with a second form, a material aspect of its form. The intellectual form acquires a material form. This is how the basic supraindividuality and supratemporality of a law, a value, a concept, and so on enters existence. This is how wider use—in the spatial simultaneity and temporal sequentiality of social and communicative actions—becomes possible. Nonetheless, the experienced timelessness of the validity of objective forms is not dependent on their use. It makes ongoing use possible; but this use does not in itself endow them with a claim to validity. The objective existence of laws, values, and convictions involves them being made evident, being declared in public. Written documents merely have the function of documenting and passing on the existence of intellectual content. (KHS): The use of writing realized in a document leads to a “material text”¹ (Kondrup 2013, 10) as a specific material form. This makes possible—simply by virtue of the fact that, by definition, individual documents can generally be moved (UNESCO 2017)—a material-dependent “simultaneousness and successiveness”² of the texts that are subjectively arrived at by different readers (Realtexte; Kondrup 2013, 10) as a result of what Simmel calls “reproducibility in the consciousness of individuals.” Beyond the individual material text, there is the possibility of generating further material vehicles of transmission by means of copying methods that involve copying sign-by-sign (as in transcribing) or more comprehensively (as in photographing). This makes possible a diversity of processing situations, independent of the original material text, in which a reading contract in Pédauque’s sense is presented (Pédauque 2022). The effectiveness of that contract for the person beholding the text is apparent in the legibility of the material form (as the vu component of the reading contract) in any given case and in the intelligibility of the content (as the lu component of the reading contract). In the conceptual framework of this volume, both are required for a document to be used as intended (the su component of the reading contract)—and for Simmel, such use can equally well fail to transpire, without the “intellectual content” suffering any harm.
1 “Materialtext.” 2 “Nebeneinander wie Nacheinander.”
2.5 The Letter Genre in Secret Communication “The letter, more specifically, is likewise wholly unprotected against anybody’s taking notice of it. It is for this reason, perhaps, that we react to indiscretion concerning letters as to something particularly ignoble—so that, for subtler ways of feeling, it is the very defenselessness of the letter which protects its secrecy. The mixture of these two contrasts—the objective elimination of all warranty of secrecy, and the subjective intensification of this warranty—constitutes the letter as a specific sociological phenomenon. The form of expression by letter is an objectification of its content, which involves, on the one hand, the letter’s being addressed to one particular person and, on the other hand, the correlate of this first fact, namely, the personal and subjective character in which the letter writer (in contrast to the writer of literature) presents himself. It is particularly in this second respect that the letter is a unique form of communication.” (Wolff 1950, 352–353) (GH): Simmel abruptly switches perspective here. Against the background of the assumption that all written documents are in principle meant to explicate their own intellectual content, we can also describe them as “unprotected” against all practices of extracting content—with the caveat that in psychological terms, this description makes sense only if access to the intellectual content of a document is not meant to be open to all participants in the spatial simultaneity and temporal sequentiality of communicative actions. Simmel thereby directs our attention to a class of documents that is inscribed with the difference between public and private: the letter. In his analysis of the letter—as a written document and a sociocultural phenomenon—Simmel manages to deftly elucidate a complex state of affairs. Here again, the question of how society is possible is at stake. Society arises through social interactions. These interactions have two aspects: first, the interrelationships between individuals, i.e., how they relate to one another and how they set themselves apart from one another; second, how individuals relate to the field of the supraindividual and how they set themselves apart from it. Both these relationships are filled with tension, for they involve individuation and self-cultivation at the same time as socialization. They become apparent in the letter as a written document. The point in both cases is that there is a relationship between knowing and not-knowing that spans both the private and the public fields. Simmel hones in on the former here. In a letter, I objectify intellectual content as my evaluations, convictions, and so on for a particular other; and I give myself, as the letter writer, an objective form. My subjectivity is constituted in the expression of my evaluations, interests, and so on; my personality is constituted in how I relate to a reader of my letter.
(KHS): Simmel connects the protection of a letter from intrusion to a specific form of representation and to remote communication between only two participants. This brings together
– “the form of expression by letter,” which here takes place in writing without encryption and thus essentially permits access;
– “being addressed to one particular person”; and
– the “personal and subjective character in which the letter writer (in contrast to the writer of literature) presents himself.”
In the first instance, therefore, the whole excursus is an analysis of a minimal form of social remote communication between two participants—there is only one letter sender, only one letter recipient, and specific protection of the message against access by third parties. This in itself is nothing remarkable. For Simmel, the letter becomes “a unique form of communication” only through the special role of the “personal and subjective character” of the writer in the production situation and of the reader in the processing situation. The uniqueness asserted here is, for now, something we have to take Simmel's word for until we can examine it further (see the section after next).
2.6 Face-to-Face Communication versus Communication by Letter “Individuals in physical proximity give each other more than the mere content of their words. Inasmuch as each of them sees the other, is immersed in the unverbalizable sphere of his mood, feels a thousand nuances in the tone and rhythm of his utterances, the logical or the intended content of his words gains an enrichment and modification for which the letter offers only very poor analogies. And even these, on the whole, grow only from the memories of direct personal contact between the correspondents. It is both the advantage and the disadvantage of the letter that it gives, in principle, only the pure, objective content of our momentary ideational life, while being silent concerning what one is unable, or does not wish, to say. But the characteristic of the letter is that it is, nevertheless, something wholly subjective, momentary, solely-personal (except for cases where it is a treatise in unprinted form)—and, by no means, only when it is a lyrical outburst, but also when it is a perfectly concrete communication.” (Wolff 1950, 353) (GH): Communication by letter is different from a direct communication situation. Simmel means more here than the trivial observation that the letter is a medium of remote communication. The spatial and temporal distance between the letter
writer and the letter reader makes both objective and personal expression possible in the medium of the letter. Communication by letter thus enables a paradox to gain intellectual and material form: the aporia of the objective and the personal that is inherent to all linguistic expression. While it is a “both-and” relationship, analysis of social interactions generally suggests “too much of this and too little of that.” Communication by letter is—from a sociocultural perspective—an interesting medium of interaction because in it a balance between the two aspects can be attained. When we grasp the objective in personal terms and the personal in objective terms, something new can take shape. (KHS): In this excursus, for Simmel, a letter is a handwritten letter, or perhaps also a typed one, but clearly not an artifact printed by a third party. From a present-day perspective, emails and the like can thus be considered as falling within the remit of his analysis of the letter genre as long as they are given form by a human author. The writing needed to produce a letter is “to be grasped from the outset as an intentional action that is conventional to a greater or lesser degree” (Stetter 1997, 289).³ This subsumes a plethora of very different sub-actions: “striking out, rearranging, correcting, adding—all are operations that produce the specific shape of the text through work on individual ‘forms.’ These operations have practically no equivalents in the oral domain, setting aside corrections to slips of the tongue, which are, however, not planned but made with the same ‘impulsive’ spontaneity on which the slip of the tongue itself depends” (Stetter 1997, 296).⁴ The possibilities of written treatment addressed here involve first of all the opposition between oral utterances and written utterances. Simmel immediately moves from this opposition to the opposition between oral utterances and epistolary utterances, which he feels are marked by “something wholly subjective, momentary, solely-personal.” In the (private) letter, Simmel identifies a genre that is characterized by a distinctive combination of private and public. The written form of a letter (as has already been observed) is the source of a potential vulnerability to others becoming involved—which, just as we would expect, does not actually transpire with letters. But written form has a private side as well: in contrast to oral utterances, “writing makes it possible to work with one's thoughts in private”
3 “von Anfang an als intentionales, mehr oder weniger konventionelles Handeln zu begreifen.” 4 “durchstreichen, umstellen, korrigieren, ergänzen, alles Operationen, die in der Arbeit an einzelnen ‘Formen’ die spezifische Gestalt des Textes herstellen. Für diese Operationen gibt es im Bereich des Mündlichen kaum Äquivalente, sieht man von der Korrektur des Versprechers ab, die jedoch nicht kalkuliert vorgenommen wird, sondern mit derselben ‘unbedachten’ Spontaneität, welche Vorbedingung des Versprechers ist.”
(Stetter 1997, 291).⁵ The letter provides a special form for this, one that makes these possibilities available on the sender's side and leads the recipient to impute these same possibilities to the sender (where they figure in interpretation). This sets the letter apart not only from oral remote communication but also, for example, from notes, which can also be directed at precisely one recipient, and covertly at that, but which do not, as a genre, require the recipient to reckon with the possibility of redaction by the sender when interpreting them. This is why traces of working with private thoughts are entirely acceptable in notes but are in part deliberately (made) invisible in letters.
2.7 Cultural Constraints of Communication by Letter “This objectification of the subjective, this stripping of the subjective element of everything pertaining to the matter at issue and to oneself which one does not (as it happens) want to reveal at the moment, is possible only in periods of high culture. It is then that one adequately masters the psychological technique which enables one to give a permanent form to momentary moods and thoughts, and to consider and receive them with the understanding that they are momentary, commensurate with the requirements of the situation. Where an inner production has the character of a ‘work,’ this permanent form is entirely adequate; but, in the letter, there lies a contradiction between the character of its content and that of its form. Only a sovereign objectivity and differentiation can produce, come to terms with, and utilize, this contradiction.” (Wolff 1950, 353) (GH): Simmel adopts a normative concept of culture (in general) and cultivation (individual). In lower cultures, “moods and thoughts” are fixed to the situation and a point in time. In higher cultures, the “momentary” can be given a “permanent form.” Simmel refers to a “psychological technique,” by which he means a special process. Content is not stripped of its subjectivity in order to let its objectivity stand out more clearly. Instead, the subjectivity is itself objectified; that is to say, all aspects of it that are not to be put on display are filtered out. “Moods and thoughts” that are made objective are involved here. The subjectivity does not disappear but comes to the fore as intellectual content. We are dealing with a process of centering and coalescence. Thus life becomes a work. Elsewhere—in his studies on Rembrandt and Goethe, for example—Simmel addresses possibilities for reconciling life—its inner production—and form—its external character as a work.
5 “ermöglicht Schrift einen privaten Umgang mit dem Gedachten.”
What is special about the letter form is that it stands halfway between formless life and a form that petrifies life—in the midst of the content–form contradiction, in other words. Anyone who writes and reads letters is constantly producing this contradiction. Those involved have to be cultivated if they are to be able to deal with this form and its inherent contradiction. It is a mark of particular cultivation if we can go so far as to play with the difference between a subjectivity objectifying itself and an objective form dissolving into something approaching subjectivity. Great letter writers can do exactly that. (KHS): “Momentary moods and thoughts” that can only be “consider[ed] and receive[d] […] with the understanding that they are momentary, commensurate with the requirements of the situation,” can be given permanence in the letter in a way so specific that the letter is, for Simmel, “a unique form of communication” (see above). Under the conditions of communication today—and this brings the significance of the cultural context of Simmel's remarks into our analysis—this uniqueness no longer pertains, however, because what Simmel saw as the specific functionality of the letter can now be separated from the form of communication that carries it. If the textual message of a written letter is replaced by a photographic self-portrait that originates in the moment and at the same time reveals, for instance, a character trait of the portrayed person in a conventionalized manner (e.g., imitation of the Rolling Stones tongue logo), the result is a (sub)genre functionally equivalent to the written letter that Simmel envisaged. Like a personal letter of that kind, the private portrait sent photographically and electronically to only a single addressee is a case of what is now a genre family that serves to give permanence to a part of (and, in the case of the photograph, a moment in) a specific production situation that can, in terms of content, be captured using conventions and incorporated into the message in an addressee-specific manner. The necessary “psychological technique,” which manifests itself in writing competence in the case of the letter, can now be realized graphically as well, by means of image-processing, in the case of the photographic message described here. Simmel was not really in a position to cover these possibilities at the time he wrote his excursus. That is why, in a text-oriented manner, he characterizes the textual letter as a unique form of communication that is intended for only one addressee and that captures the production situation in a specific manner. Today, it is easier to see that the textual form is one way of realizing this functionality, and thus to keep form and use more clearly apart for letters as a genre family. The evolution of media technology as part of cultural technology offers further means of fulfilling the function that Simmel associated with the written letter. The cultural context of the object, like that of its analysis, has widened.
2.8 Determinateness and Ambiguity “This synthesis finds its further analogy in the mixture of determinateness and ambiguity which is characteristic of written expressions and, to the highest extent, of the letter. Determinateness and ambiguity are sociological categories of the first rank in regard to all utterances between man and man; evidently, all of the discussions in this chapter [part] belong in their general area. Yet here the point is not simply the more-or-less, which the one lets the other know about himself; but, rather, the fact that, what he does give, is only more or less clear to its recipient, and that this lack of clarity is as if compensated for by a corresponding plurality of possible interpretations. It is almost certain that there exists no enduring relation between individuals in which the changing proportions of clarity and interpretability of utterances do not play an essential role, although we usually become aware of this role only through its practical results.” (Wolff 1950, 353–354) (GH): It is not always clear what Simmel means when he speaks of analogies. It is unclear whether the similarities he lists are located on the same hierarchical level. So far, we have been dealing with a fundamental opposition between life and form, subjectivity and objectivity—and have identified their syntheses in a particular form, namely the letter form. The opposition between determinateness and ambiguity addressed now, however, is not of an ontological (life–form) or epistemological (subjectivity–objectivity) nature; instead, it marks a sociological category. It seems very likely that Simmel understands these categorial relationships as complementary. One relationship cannot, that is to say, be derived from another or, conversely, traced back to it. It further seems likely that Simmel is weaving the basic idea that we humans live a life of contradictions through his work on cultural theory, aesthetics, theory of religion, sociology, and so on. We are in sociology here, and the thesis is that all social interactions and communicative processes, everything interpersonal, can be characterized as a mix of determinateness and ambiguity in what is expressed. This is a fundamental idea in the language theories of Herder and Humboldt, and those of Simmel's predecessors Karl Wilhelm Ludwig Heyse and Chajim Steinthal. And it goes without saying that this is not a quantitative relationship (such that it might foolishly be thought that someone makes himself 33 percent or 50 percent understood to someone else in spoken or written expression). Instead, we are concerned with the medium in its limited clarity and the choice of interpretations it offers. The clarity of an oral or written utterance and its interpretability stand in relation to each other, but this relation varies. If interpersonal relations are to be made enduring, they need to take this variability and openness into account.
(KHS): “Determinateness and ambiguity are sociological categories of the first rank in regard to all utterances between man and man.” Simmel wrote this at the start of the twentieth century, and it appears in the founding texts of sociology. It was some time, however, before this distinction was elaborated analytically, and ultimately with concepts that can be formalized, for the purposes of treating utterances. A good half-century later, in Harvey Sacks, for instance, we find not just general assumptions but also maxims for the sociological analysis of how people deal with the determinateness and ambiguity of an utterance and of observable social happenings more generally. This can be illustrated with the example text discussed by Sacks that we considered earlier: “The baby cried. The mommy picked it up.” That what someone gives “is […] more or less clear to its recipient,” as Simmel puts it, is largely due to the fact that a hearer/reader of these two successive sentences interprets them using what Sacks formulates as the following hearer's maxim (HM1):

(HM1) If two or more categories are used to categorize two or more members of some population, and those categories can be heard as categories from the same device, then hear them that way.
A “device” here is a structurable set of categories. In the case of the two example sentences, the “family” device leads (with further assumptions) to the categories “baby” and “mommy” being used in such a way that their instantiations in these example sentences are seen as belonging to a (single) family and (as already noted under 2.3 above) the mother is understood as the mother of the baby—even though this is not stated at any point. “The fine power of culture” reduces ambiguity here to such an extent that this one particular interpretation suggests itself to the hearer. Sacks thereby gives an empirical grounding to the primacy of “determinateness and ambiguity” as sociological categories, which Simmel merely asserted, as well as providing explicit analytical methods, as for instance in the HM1 maxim above. This was refined decades later by Devlin and Rosenberg, who replace what “can be heard” with what “is normally heard,” in order “to make very clear the distinction between what our knowledge of language allows us to express, and therefore what ‘can be heard,’ and what our social knowledge requires us to understand—what ‘is normally heard’” (Devlin and Rosenberg 1996, 105). The result:

(HM1′) If two or more categories are used to categorize two or more members of some population, and those categories would normally be heard as categories from the same device, then hear them that way. (Devlin and Rosenberg 1996, 93 f.)
Devlin and Rosenberg embed this emendation in a situation-theoretical analysis that also offers means for the detailed analysis of “utterances between man and man” (see above). In the process, they draw on the work of Barwise and Perry (1983) by relating their insights into the “efficiency” and “indexicality” of natural language to the difference between what “can be heard” and what “is normally heard” on the basis of shared knowledge (cf. Devlin and Rosenberg 1996, 105 f.). And conversely, Simmel's objective spirit acquires a generative component here—in the words of Devlin and Rosenberg:

Shared knowledge is due in large part to the efficiency and indexicality of language. (Devlin and Rosenberg 1996, 54)
Simmel’s general opposition between determinateness and ambiguity is thus given a central role in the analysis of natural language. His opposition has a twofold grounding in the constituents of the textual letter: in the properties of natural language (varying with its textualization, as will be discussed again briefly in what follows), and in the specificity that his central letter maxim, introduced at the start of this chapter, asserts for remote communication by letter (adding a genre-specific complement to Sacks’s general maxims). Simmel goes on to develop both groundings further.
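By way of illustration only, HM1′ can be rendered as a toy procedure of the kind the word “formalizable” points to. The following minimal sketch is not part of Sacks's or Devlin and Rosenberg's own apparatus; the device inventory and all identifiers in it are hypothetical, chosen solely to mirror the “baby”/“mommy” example.

```python
# Illustrative sketch only: a toy rendering of the hearer's maxim HM1'.
# The device inventory below is hypothetical and deliberately minimal.

DEVICES = {
    "family": {"baby", "mommy", "daddy"},
    "stage of life": {"baby", "child", "adult"},
}

def devices_containing(category):
    """Return the names of all devices whose category set contains `category`."""
    return {name for name, cats in DEVICES.items() if category in cats}

def hear_as_same_device(categories):
    """Apply HM1': if the categories used would normally be heard as coming
    from one device, hear them that way and return that device (else None)."""
    candidate_sets = [devices_containing(c) for c in categories]
    shared = set.intersection(*candidate_sets) if candidate_sets else set()
    return sorted(shared)[0] if shared else None

# "The baby cried. The mommy picked it up." uses the categories "baby" and
# "mommy"; their only shared device here is "family", so the maxim licenses
# hearing the mommy as the mother of that baby.
print(hear_as_same_device(["baby", "mommy"]))  # -> family
```

The sketch captures none of the situational knowledge that Devlin and Rosenberg's situation-theoretical treatment addresses; it is meant only to show that the maxim has the shape of an explicit, implementable rule.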
2.9 Communication Where the Participants Are Present versus Written Communication “Superficially, the written utterance appears to be safer in the sense that it seems to be the only one from which ‘no iota can be taken away.’ Yet this prerogative of the written word is only the consequence of a lack of all those accompaniments—sound of voice, tone, gesture, facial expression—which, in the spoken word, are sources of both obfuscation and clarification.” (Wolff 1950, 354) (GH): Simmel’s argumentation here follows naturally from his preceding remarks. What is assumed, though, is the premise—not substantiated here—that clarity is sought in communication situations. Perhaps we are dealing with an anthropological constant: human beings want to be able to create clarity in social interactions. This is the basis for the cultural-theoretical assumption that it is characteristic of higher cultures that society is possible for the people involved, despite a lack of clarity in communication situations becoming more and more apparent. Cultivation, after all, also means seeing a lack of clarity in communication as offering a variety of possible interpretations to choose between—and at first coping with this
state of affairs, before shaping it as well. Participants in communication who make use of the letter form are not spared this contradiction between what one seeks (clarity) and what one gets (interpretations to choose between). Written expression is a deficient form in comparison to oral expression if, without it being desired, the lack of proximity to a communication partner results in a distance across which the interpretations to choose between for written utterances proliferate. (KHS): Simmel here takes up the opposition between written and oral utterances once again (see section 2.6 above)—albeit with a sign-oriented focus on the sequences of written letters from which “no iota can be taken away” and which can, with a suitable material vehicle of transmission, be used in document-based remote communication without any “technical” problems, but at the same time not without the disadvantages mentioned.
2.10 Genre-Specific Freedom and Unfreedom of Communication by Letter “As a matter of fact, however, the recipient does not usually content himself with the purely logical sense of the words which the letter surely transmits much less ambiguously than speech; innumerable times, indeed, the recipient cannot do so, because even to grasp the mere logical sense, more than the logical sense is required. For this reason, the letter is much more than the spoken word the locus of ‘interpretations’ and hence of misunderstandings—despite its clarity, or more correctly, because of it.” (Wolff 1950, 354) (GH): The risks and opportunities of communication are also evident where the letter form is concerned. Written expression, for instance—in contrast to its oral counterpart—goes hand-in-hand with the assumption that it can be comprehended by analyzing the “logical sense.” This, however, is not so. Simmel's argument here is remarkable: someone who wishes to understand the logical sense of a statement in natural language cannot have recourse to the logical sense alone; the recipient and reader of a letter need to know the context if they are to narrow down the interpretations to choose between. (KHS): Simmel switches to the processing situation and thereby moves from the sequence of signs written by the sender to its significance for the recipient. The transition lies in an act—not included in Simmel's analysis—of circulation (here by putting the letter in the mail, for instance) as part of a distinct document act. This is likewise a culturally shaped action “that has to be learned in a very different
sense than the forms of oral communication” (Stetter 1997, 297).⁶ The production process is complete; as a final step, in the case of the written letter, its logical content (Simmel also calls this the “intended content”; see above) is endowed with its own selective document context. This context then has to be used on the part of the interpreting recipient to arrive at an understanding—not without its pitfalls—of an assumed intended content. Here, Simmel—at the beginning of the twentieth century—brings into play the unavoidable indexicality of communication in natural language, which found the following (formalizable) formulation at the end of that century with the help of crucial work by Garfinkel and Sacks (1970), Barwise and Perry (1983), and Suchman (1987): “The significance of an expression in a particular communicative event, however, does not depend on its literal meaning (if such there is). Rather, it lies in its relationship to the circumstances in which it is used—that is, to the features of the ‘utterance situation.’ Relevant features of the utterance situation may be who the speakers and listeners are, the purpose of their interaction, and other aspects of the social context in which linguistic expressions occur. It is in this sense that language is indexical—embedded in the situation in which it is used” (Devlin and Rosenberg 1996, 54). This indexicality figures distinctively in the letter genre, as Simmel now explains with the letter maxim that was mentioned at the start of this chapter. “Corresponding to the cultural level at which a relationship (or period of relationship) based on written communication is possible, the qualitative characteristics of such a relation are, likewise, sharply differentiated: what in human utterances is clear and distinct, is more clear and distinct in the letter than in speech, and what is essentially ambiguous, is more ambiguous. Expressed in terms of the categories of freedom and unfreedom on the part of the recipient of the utterance: his understanding, in regard to its logical core, is less free; but, in regard to its deeper and personal significance, his understanding is freer in the case of the letter than in that of speech. One may say that, whereas speech reveals the secret of the speaker by means of all that surrounds it—which is visible but not audible, and which also includes the imponderables of the speaker himself—the letter conceals this secret. For this reason, the letter is clearer than speech where the secret of the other is not the issue; but where it is the issue, the letter is more ambiguous.” (Wolff 1950, 354–355) (GH): Culture in the general sense has two sides. These can be seen on the one hand in the objectification of subjectivity (cultivation—subjective culture), on the
6 “die man in einem ganz anderen Sinne lernen muss als die Formen mündlicher Kommunikation.”
other in the relative stability of cultural forms (objective culture). For Simmel, both aspects belong together. Cultivation depends on a certain level of objective culture (institutions, forms, practices) if it is to be successful. Objective culture retains its living character, and avoids decaying into a “mere mechanism” (Simmel) or “iron cage” (Weber), only when it makes cultivation possible. Different cultural levels can be identified on the basis of how the interrelationship between subjective and objective culture produces qualitative differentiations. The more refined the differentiations—for instance, in linguistic expression, in fashion, in consumption, or in gender relations—the greater the possibilities that take shape. The letter form is set apart from the form of speech in terms of both clarity and ambiguity. With this observation, Simmel once again addresses a fundamental contradiction whose various manifestations have already been discussed. Here, he condenses this contradiction into an aporia in a specific case. Written expression in the letter form pursues clarity and creates ambiguity. For the recipient, this means that the contradiction between necessity and freedom with which every social interaction is inscribed becomes concrete. Simmel deftly switches between levels of analysis again at this point. The ontological, epistemological, psychological, and sociological dimensions of a fundamental contradictoriness in life are joined by an ethical dimension. Every communication situation involves a play between freedom and unfreedom. Unfreedom results when not a single bit of subjectivity is protected; freedom is made possible when subjectivity is embedded in something held back, in a secret. For the letter form this means that where subjectivity seeks to objectify itself, a maximum degree of clarity can be aimed for; but where something is to be held back, where there is still a secret to be kept, the letter can generate ambiguity. (KHS): The distinction introduced above between “what would normally be heard” and “what can be heard” can now be applied on the recipient's side in the case of the letter genre. A letter amplifies both: “what in human utterances is clear and distinct, is more clear and distinct in the letter than in speech, and what is essentially ambiguous, is more ambiguous.” Taking (oral) speech as a point of reference means that natural language is a condition of possibility for these epistolary amplifications in Simmel: textualization as such just makes remote communication by letter “technically” possible; it is only the textualization of specifically natural language as the language of the message that leads—above all by selective use of the production situation (the interiority and external environment of the writer) in compiling the message—to the amplifications of both clarity and ambiguity that are specific to the letter. The end result is that the natural-language, written “letter is clearer than speech where the secret of the other is not the issue; but
where it is the issue, the letter is more ambiguous.” This can be summarized in a genre-specific letter maxim (LM):

(LM) What in human utterances is clear and distinct, is more clear and distinct in the letter than in speech because what would normally be heard is emphasized, and what is essentially ambiguous, is more ambiguous because the possibilities of what can be heard are extended.
With the ascription of letter passages to the “secret of the other,” we arrive at the all-important formulation quoted at the start of this chapter and again in this section. The remarks that accompany it conclude the excursus.
2.11 The Secret of the Other in Written and Oral Communication “By the ‘secret of the other’ I understand his moods and qualities of being, which cannot be expressed logically, but on which we nevertheless fall back innumerable times, even if only in order to understand the actual significance of quite concrete utterances. In the case of speech, these helps to interpretation are so fused with its conceptual content that both result in a wholly homogeneous understanding. This is, perhaps, the most decisive instance of the general fact that man is quite incapable of distinguishing what he actually sees, hears, and experiences from what his interpretation makes of it through additions, subtractions, and transformations. It is one of the intellectual achievements of written communication that it isolates one of the elements of this naïve homogeneity, and thus makes visible the number of fundamentally heterogeneous factors which constitute our (apparently so simple) mutual ‘understanding.’” (Wolff 1950, 355) (GH): In addition to what is made (or: can be made) explicit, factors that have only an implicit effect are crucial in social interactions, including communicative action. Simmel speaks here of “qualities of being,” which is not meant obscurely but rather covers everything that forms part of the first- and second-person perspectives of the letter writer and letter reader: moods, feelings, positions and attitudes, interests, and much more besides. In the ideal case of mutual understanding, we produce a synthesis of all the implicit and explicit factors. In actual communication, however, this is precisely where the difficulty lies: we are unable to maintain a distinction that, although it can be drawn, will always tend to collapse again. In other words, as participants in communication situations we are unable to adopt a third-person perspective. Simmel gives Kant’s fundamental problem of epistemology a life-philosophical and cultural-theoretical twist: the question is now not simply how synthetic judgments are possible a priori (this interested Kant with respect to mathematics) but the more far-reaching question of how we
handle the fact that we live our lives making judgments that are always already synthetic—drawing together, that is to say, objective and subjective factors in social interaction. One option in dealing with this “problem of life” is gaining distance. Written communication makes possible a greater distance than a face-to-face speech situation. By isolating the various factors at work in a communicative situation, we can understand the interplay between them and recognize how complex and precarious a result of social interaction understanding—as it is naively called in cultural hermeneutics—is. We recognize with Simmel that mutual understanding is not the rule, but rather the exception. (KHS): Examining a particular form of communication—the letter—as if under a microscope, Simmel identifies a “number of fundamentally heterogeneous factors” in this case of document-based remote communication. A message that is (in the case of the letter) protected becomes, with the genre expectation of potentially intense processing on the part of the sender, part of a process of remote communication that is, as a rule, minimal in terms of the number of participants (one sender and one recipient). Interpretation of the message is unavoidable here. To this end, there are various options on the side of the sender for offering guidance for the gleaning of information from the overall document; in the case of a natural-language text, the extraction of information in the course of interpretation on the recipient's side can assume clarity where idiosyncrasies of the sender and the production situation (as the “secret of the sender”) do not play a role, and a greater need for interpretation where the “secret of the sender” is at stake. This is the essence—not just genre-specific, as it might initially seem—of the whole excursus. The formulation of the entire preceding paragraph, however, already points to the complexity of the form of communication under consideration. We can join Simmel in his amazement that this complex form is a genre that is routinely used in human remote communication, one that has been mastered and employed in numerous variants.
3 Conclusion Our analysis of Simmel's excursus leads to the results below. In line with our two perspectives, we distinguish between findings that relate to a theory of modern culture and findings that relate to a theory of document use. At the same time, the intended point of our textual analysis is that these two perspectives converge in many places.
From the perspective of the philosophy of culture, Simmel’s text is particularly interesting because he manages to pack his general theses on modern culture into the analysis of a specific cultural phenomenon—and in this way gives remarkable insights in both respects. He shows, for instance, that the conditions of communication situations include the fact that we have to tolerate a knowing and not-knowing about the other, a clarity and lack of clarity in what happens socially and communicatively. It makes a difference here whether the communication is based on oral or written expression and whether it employs speech or the letter as its form of communicative interaction. In both cases, comparable yet distinct processes can be analyzed behind the genesis and validity of the sociocultural form in question. This goes for its intellectual form and material form, as well as for its various temporalities. In the case of written communication, the question of whether and how intellectual content (information) is recognizable, clarified, communicable is linked to the material form (writing). According to Simmel, the letter form is an excellent example of the emergence of a social difference between private and public, for this difference offers a new possibility for cultivating the tension between knowing and not-knowing with respect to the other (one’s correspondent) in communicative situations. This state of affairs embodies a special condition that makes society possible. Simmel uses the excursus on communication by letter in his Soziologie (1908) to present a representative case in which the tension between individuation and socialization can be demonstrated. The letter form can be seen as a form of communication in which modern subjectivity cultivates itself by making the personal objective and the objective personal in written expression. For Simmel, it is a mark of higher cultures that the social actors are able to cope with, and perhaps even to shape, complex communicative situations that present an irreducible variability of interpretations to choose between. A special point of Simmel’s analysis—which is not, however, spelled out—is that a mutual relationship between content and form can be discerned, whereby the implicit and explicit aspects of written expression are each tied to both aspects—content and form—of what is expressed. This means that only by analyzing the use of a document (here, the letter) is it possible to determine for that document what is content and form, what is implicitly or explicitly the object of social interaction, in any given case. Where a theory of documents and their use is concerned, Simmel’s excursus from the start of the twentieth century already contains many of the themes that went on to shape the analysis of document-based remote communication in the course of that century and down to the present day: the difference between oral and written communication, the informational relevance of the production situation (e.g., what German-language research has called the Schreibszene, or writing scene) and the processing situation, cultural knowledge as the basis for and consequence
of necessarily indexical exchange in natural language, the specificity of forms of communication in remote communication, and the identification of genre-specific rules of use. Simmel covers all these themes in his microscopic sociological work on the letter genre. He treats the letter genre like a device for processing bodies of information. He identifies qualitatively essential components of the device (see above), but he does not provide a differentiated or ultimately formalizable compositional analysis of specific bodies of information. Doing so remains a task for further intellectual work into our own, twenty-first century, a task that has certainly not yet been completed. It is hoped that the present volume will help support the progress necessary to change that.
Bibliography
Barwise, Jon and John Perry. Situations and Attitudes. Cambridge, MA: MIT Press, 1983.
Devlin, Keith J. and Duska Rosenberg. Language at Work: Analyzing Communication Breakdown in the Workplace to Inform Systems Design. Stanford: CSLI Publications, 1996.
Garfinkel, Harold and Harvey Sacks. On formal structures of practical actions. In: McKinney, John C. and Edward A. Tiryakian, editors, Theoretical Sociology: Perspectives and Developments, pp. 338–366. New York: Appleton-Century-Crofts, 1970.
Kondrup, Johnny. Text und Werk – zwei Begriffe auf dem Prüfstand. editio, 27(1):1–14, 2013. DOI: https://doi.org/10.1515/editio-2013-002.
Kramme, Rüdiger and Otthein Rammstedt, editors. Hauptprobleme der Philosophie. Philosophische Kultur. Volume 14 of Georg Simmel Gesamtausgabe. Frankfurt am Main: Suhrkamp Verlag, 1996.
Pédauque, Roger T. Document: Form, sign, and medium, as reformulated by digitization. A completely reviewed and revised translation by Laura Rehberger and Frederik Schlupkothen. Laura Rehberger and Frederik Schlupkothen, translators. In: Hartung, Gerald, Frederik Schlupkothen, and Karl-Heinrich Schmidt, editors, Using Documents. A Multidisciplinary Approach to Document Theory, pp. 225–259. Berlin: De Gruyter, 2022.
Sacks, Harvey. On the analyzability of stories by children. In: Gumperz, John and Dell Hymes, editors, Directions in Sociolinguistics: The Ethnography of Communication. New York: Holt, Rinehart and Winston Inc., 1972.
Simmel, Georg. Soziologie. Untersuchungen über die Formen der Vergesellschaftung. Berlin: Duncker & Humblot, 1908.
Stetter, Christian. Schrift und Sprache. Frankfurt am Main: Suhrkamp, 1997.
Suchman, Lucy. Plans and Situated Actions: The Problem of Human Machine Communication. Cambridge: Cambridge University Press, 1987.
UNESCO. General guidelines, thirteenth meeting of the International Advisory Committee (IAC) of the Memory of the World Programme. Definition 2.6.2. October 2017. URL: https://en.unesco.org/sites/default/files/iac_2017_13th_mow_general_guidelines_withcover_en.pdf.
Wolff, Kurt H., editor. The Sociology of Georg Simmel. With an introduction by Kurt H. Wolff, translator. Glencoe, IL: The Free Press, 1950.
Roger T. Pédauque
Document: Form, Sign, and Medium, as Reformulated by Digitization
A Completely Reviewed and Revised Translation by Laura Rehberger and Frederik Schlupkothen
Abstract
This paper presents group discussions taking place within the multidisciplinary topical network 33 of the CNRS Information and Communication Science and Technology (STIC) Department. It attempts to clarify the concept of document in its transition to electronic form, based on research that privileges form (as a material or immaterial object), sign (as a meaningful object), or medium (as a communication vector). Each of these terms reflects the radical transformations that are taking place. Their superposition stresses the importance of multidisciplinarity for a lucid and complete analysis of the concept of document and its evolution.
Context
Very few scientific papers give a definition of document, and even fewer discuss the definition. The document appears as a direct object of analysis in only a few rare scientific communities: Information Science researchers, building on work concerning documentary techniques that have changed considerably due to electronic data processing; researchers investigating the digitization of documents and indexing-cataloging problems, who have often extended their reflections to electronic document management; and those developing electronic publishing tools. Furthermore, documents are discussed when they are essential tools for a discipline's construction and progress, as in history and particularly in archaeology, in geography (especially with regard to maps), or in law when dealing with legislative texts and articles, regulations, or circulars, but from an instrumental perspective and rarely directly.
https://doi.org/10.1515/9783110780888-009
Many dictionaries, lists of standards, and encyclopedias present definitions that are designations or descriptions rather than the product of profound reflection on the concept of document. From the Latin documentum, giving the word roots in teaching (docere = to teach), to its marginalization by the more recent, more frequent, but hardly more accurate term “information,” the concept appears to be commonly based on two functions: evidence (the jurists' so-called “piece of evidence” or the element of a case file) and information (a representation of the world or a testimony). For instance, contemporary archival science recognizes these two functions by granting the document a “value of evidence” (of activity), which has a somewhat broader meaning than judicial “evidence,” and a “value of information,” corresponding to the sense given above.

A very large number of other research papers use a different vocabulary, sometimes rigorously defined, but often also subject to different interpretations, to designate comparable objects. For instance, computer science researchers, from network analysis to database engineering, text mining, information search, automatic language processing, up to knowledge engineering; or, furthermore, corpus linguists, semiologists, psychologists of learning, sociologists of culture or organization, economists of the media or information, jurists of intellectual property rights, and generally “the humanities” use a variety of terms such as information, data, resource, file, written material, text, image, paper, article, work, book, journal, sheet, page, etc., which of course are not synonymous, each of which has a justification in the particular context of the research in question, but all of which are related (though the relation generally goes unacknowledged) to the concept of document.

Finally, documents are ubiquitous in our daily life (especially in administration and even in science). The concept is therefore intuitive for all of us, without us feeling the need to clarify it. Today, this lack of clarity is a problem. As a matter of fact, digitization fundamentally overthrows the concept of document, without it being possible to clearly measure the impact and consequences, because the concept's contours were never clearly outlined beforehand. This transformation from the widespread paper carrier to the electronic one is obvious when it comes to the material aspect, cognitive treatment, perception, and, furthermore, usage. This reconsideration, although announced by a few pioneers' texts and prepared by the increasingly obvious convergence between writing and audiovisual techniques, is very recent, still chaotic, and undoubtedly irreversible. It is probable that the many researchers investigating these issues from many different angles would have much to gain from an overall view allowing them to see things more clearly.

The contrast between the relative stability that existed heretofore and the speed and depth of the changes now occurring undoubtedly explains the delay
in analysis. There was no need to investigate, except as a historian, an object so commonplace as to be self-evident, and today one has not really had enough time to stand back and assess the situation from a distance. The document was constructed as an object, whose most common material form is a sheet of paper, during a centuries-long process mingling tools, knowledge, and status. After some decades of digitization, we have entered a new stage, some features of which follow directly from the previous period, whereas other features, on the contrary, mark a radical change and perhaps the emergence of a different concept embodying all or part of the social utility that we were calling “document.” The most obvious manifestation of this change is therefore the loss of stability of the document as a material object and its transformation into a process constructed on request, which sometimes undermines the trust placed in it. The question of rupture versus continuity does not arise only for the object. The methods of analysis or the epistemologies are rapidly changing as well.
A Multidisciplinary Approach
We feel that these difficulties can only be resolved through a determinedly multidisciplinary approach. Our opinion was encouraged by the CNRS STIC Department, which initiated a multidisciplinary topical network called “Document and Content: Creating, Indexing, Browsing” (http://rtp-doc.enssib.fr) including about a hundred researchers. The concept of document is not central to some of the disciplines covered by the network, and the researchers only have a partial understanding of what this concept covers. The purpose of the network is therefore to shift this oblique focus in order to make the document an essential subject of research, at least for a time, by combining the contributions of the different researchers. It is not certain whether there is a consensus between disciplines or even within each discipline on the issues under discussion. Our aim is not to harmonize or define a line, a current, or a school of thought, but to clarify and detail the concepts in order to dispel misunderstandings, open up new perspectives, and identify possible disagreements. We are convinced that a dialogue between disciplines cannot be fruitful unless we have succeeded in identifying the essential concepts so that we can discuss them or use them as a basis. This attempt is not without risks. On the one hand, nonsense or simple superficiality is possible. On the other hand, the different bases of the disciplines or currents may be contradictory. In addition to conceptual difficulties, the objective may run into more common obstacles. Each specialty naturally develops its own culture and vocabulary, for both good (rigor) and bad (protection) reasons. The same words sometimes have different meanings in different communities and are
often even unknown to outsiders. In a multidisciplinary text, we are constrained to use a common vocabulary, in all senses of the term, at the risk of distorting. Concretely, this text is the result of collective work within the network. Given the method used for its writing and the many contributions it embodies, we decided not to give any quotations or direct bibliographical references. Doing otherwise would indeed bias the group dynamics by inducing competition between authors or schools of thought. However, a bibliography can be found on the rtp-doc website.¹
Propositions
We will use an analogy with the linguistic distinction between syntax, semantics, and pragmatics to organize our propositions. Without going into a discussion on the validity of this analogy or even the legitimacy of this tripartite division used in linguistics, we can see that it allows a fairly simple classification of current research and its underlying currents. We will distinguish between:
– The document as a form; under this category, we will classify approaches that analyze the document as a material or immaterial object and study its structure in order to improve its analysis, use, and manipulation.
– The document as a sign; for these researchers, the document is primarily perceived as meaningful and intentional; the document is thus indissociable from the subject in its context, which constructs or reconstructs it and gives it meaning; at the same time, it is considered as part of a documentary system or knowledge system.
– The document as a medium; this dimension finally raises the question of the document's status in social relations; the document is a trace, constructed or found, of a communication disengaged from space and time; at the same time, it is an element of identity systems and a vector of power.
The analogy with linguistics remains informal. It could be argued that the first category is more specifically related to morphosyntax and that the second includes both semantics and pragmatics. But the comparison is feeble; all we need is for the analogy to be efficient in our work. Each category should be viewed as a dominant but not exclusive dimension. For instance, researchers who use the “document as a form” approach do not necessarily neglect the other two approaches, but their analysis and reasoning privilege the first approach, and the other two remain complementary or external
1 https://web.archive.org/web/20101027185248/http://rtp-doc.enssib.fr/ (18.01.2022).
constraints. The term “entry” would perhaps be the most appropriate. Each of these entries is indeed a way of approaching the research subject, the document, from which the other dimensions will be found through developments, constraints, obstacles or limits that appear in the primary reasoning. However, each approach probably also tends to over-relativize the others. We will discuss each category using the same scheme: – First, we will identify the main disciplines, know-hows or specialisms that privilege this point of view. The aim is not to discuss their validity or scientificity but to review the diversity of the research representing this orientation, without judging either value or importance. – Then we will suggest an interpretation of the evolution of the points of view regarding the transition from traditional document to electronic document. – We will gradually construct a definition of “document” based on each entry. – We will identify a few outstanding questions in each category, beyond the scope of the current specific researches. In each instance, we will attempt to identify the essentials, without dwelling overmuch on nuances, exceptions, and special cases. The aim is to emphasize what is fundamental, not to be exhaustive. As concerns the definition, one method would consist of a systematic search for cases not corresponding to it and then construct a universal definition. This method does not seem very functional to us. Our aim is not to answer everything, but to construct a generic definition, even though exceptions might be identified that represent either very special cases or intermediate or transitory situations when they do not simply emanate from an incomplete or incorrect analysis. In the conclusion, we propose a synopsis of the three entries to highlight the elements of continuity and rupture with respect to the preceding period.
The Document as Form It may be argued that the term “form” is ambiguous, but we use it for lack of a better one. It should be understood here as “contour” or “figure”; in other words, the document is seen as an object or an inscription on an object, whose boundaries are identified and at the same time as a reference to “formalism,” because this object or inscription obeys rules that constitute it. Here, the document is viewed as an object of communication governed by more or less explicit formatting rules that materialize a reading contract between a producer and a reader. The document is mainly studied from the angle of this
implicit communication protocol, irrespective of its specific textual or non-textual contents.
Specialisms Concerned From the outset, the particular place occupied by writing needs to be emphasized, a technique whose widely shared education has placed the document, since its appearance, in a fundamental social situation. The know-how, be it professional or otherwise, that privileges this point of view is varied and in some cases very ancient, such as calligraphy and typography. Accordingly, this also applies to other forms of representation, such as techniques of music, video, and cinema, as well as library science, focused on document cataloging, classification, and management, and also archival science. Consequently, information specialists who digitize material objects, i.e. image specialists, have strong ties to these first specialisms. They quickly became interested in the internal structure of documents, with automatic pattern recognition systems: first of all, automatic character recognition, followed by handwriting and page and image layout recognition. In their area, they are confronted with problems of formats, exchange, storage, description, addressing, as well as the preservation and processing of large quantities. This concerns automatic document reading or analysis. The researchers attempt to decode the object by explaining/exploiting the underlying communication protocol (the reading contract). Similarly, all those interested in typefaces, page layouts, editorial formats, international standardization in these areas, and text processing, as well as those who construct digital video systems, adapt and renew ancient know-how. Other information specialisms have also chosen this first point of view. The design of electronic document management systems, as the name implies, does indeed proceed on the assumption that the document preexists as an identifiable object, even if it is virtual. Although the starting point is an electronic file, not a concrete object, many of the problems posed in this case stem from the same fundamental questions. Finally, a sudden change of scale occurred with the invention and phenomenal success of the World Wide Web: an intensive research, design, negotiation, standardization, and development activity unfolds around the Web, notably but not exclusively within the W3C consortium. Although these researchers rarely use the word “document,” preferring “resource,” which covers many other objects, many of the questions raised by the designers of the Web in its current version (i.e. before the “Semantic Web”) largely emanate from this first approach, too. The investigation focuses on how to interconnect the resources on a global scale
and therefore to define standards and systems applicable to all machines and to assign an identifiable address to these resources. Among these resources, many have the features of a document as understood in this first dimension: HTML or XML files, images, audio or video recordings, streaming media, etc.
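To make the notion of an identifiable address a little more concrete, the following fragment is a minimal, purely illustrative sketch; the example.org addresses and file names are invented and simply stand in for resources of the kinds just listed:

```html
<!-- Hypothetical HTML fragment: every resource is reached through its own address. -->
<p>
  The report is available as an
  <a href="http://example.org/reports/2003/report.xml">XML file</a>
  and as an
  <a href="http://example.org/reports/2003/report.html">HTML page</a>.
</p>
<!-- An image and an audio recording are identified and retrieved in exactly the same way. -->
<img src="http://example.org/reports/2003/figure1.png" alt="Figure 1" />
<a href="http://example.org/reports/2003/interview.mp3">Audio recording of the interview</a>
```

Whatever their nature, such resources can be handled uniformly because each of them is addressable.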
Evolution A first definition of “document” could be represented by the equation: traditional document = carrier + inscription. Initially, the emphasis is placed on a carrier that can be manipulated (literally), bearing a lead that can be interpreted, depending on its form, by sight, hearing or touch (in the case of Braille), and why not other senses tomorrow, with or without prostheses. This lead represents the content, materialized by an inscription. The predominant (but not exclusive) traditional carrier is paper and the lead is writing, handwritten or printed. The written page, as basic element, can be enriched by layout formatting and paratexts and extended by binding, cross-references, etc., giving the document a major plasticity and complexity. The “codex” (the book with bound pages) is undoubtedly the most sophisticated form of the traditional document. Its quality can be measured by the robustness of its “specificities,” practically unchanged for more than a millennium! At the cost of a major social effort (school, allowing acquisition of the reading protocol), this type of document is directly perceptible, i.e. without any intermediate high tech instruments (except eyeglasses for some), by a greater or lesser share of the population of a given society: those who have learned to read. When, in the course of history, this notion (carrier + inscription) was expanded to other forms of representation, such as recorded music, cinema, and audiovisual broadcasting, the carrier did not keep its faculty of direct appropriation. Even though the representation was more accessible to direct human perception (and thus could be decrypted without a complicated learning process), the reading process became more sophisticated. It is necessary to make use of a machine to listen to a record, project a film (recorded on celluloid) or play a (recorded) video tape. The object is still necessary for reading, but is no longer sufficient. Going further, radio and then television broadcasting made it possible to separate the decryption of the signal from its transmission. Therefore, broadcast audiences listen to or watch “programs” whose transmission eludes them. They do not control their moment of reading, unless they selectively tape the program. In a way, when broadcasting entered the home, it partially dispossessed the audiences of the space-time autonomy they had gained by manipulating objects they had mastered or recorded.
Thus, audiovisual broadcasting facilitated an evolution in the use of carriers, but for us the essential change is the passage of the inscription from an analogue to a digital signal, with all the data processing facilities involved. This has radical consequences for the ensemble of written, pictorial, and audiovisual documents. These changes are reflected in the reading-writing systems and in the documents themselves. Concerning the systems, first of all, an extraordinary give-and-take between writing and audio-visualization is observed. The first integrates a system familiar to the second. It is no longer possible to read without a machine. Although the production of printed matter requires powerful technical equipment, its reading is, as we mentioned, straightforward or almost so. Optical or magnetic disks, recorded tapes, signal processing, and rendering devices as well as network connections are indispensable tools that must be purchased individually to read electronic documents, even to return to the previous state by printing them. Listeners and viewers can, for their part, control the start and stop of streaming media on the Web, a capability which used to be lost in broadcasting, where the only way to modulate the uninterruptible flow of programs was to tape them. The second consequence for the systems is the interlacing of carriers and signals. The concept of carrier becomes more complex and ambiguous. Is the carrier the file, the hardware on which it is stored or the screen’s surface on which it is displayed? As it traverses the network, a document is fragmentarily copied in routers for a short time and may especially be stored in its entirety in caches for varying periods of time. Moreover, the same “carrier” can contain any type of representation, provided it is digital, and even the representations themselves can be combined, provided their formats are compatible: one might “read” tightly interleaved text, image, audio, and animation. Whereas printing privileged the physical carrier because of the technological complexity inherent in any document production activity, electronic publishing has made possible the on-demand production of documents (equally on screen or paper). The carrier has thereby lost its privileged status to the benefit of electronic publishing. On this subject, it can be recalled that one of the major advances of computer-aided publishing was wysiwyg (what you see is what you get), which enabled visualization on screen as on paper. Finally, with the parallel growth of computers and telecommunications, the devices themselves multiply and become autonomous, such as laptops, PDAs, telephones, and various types of integrated tools, searching for the best possible fit with readers’ behavior and their generic and/or specific needs. Regarding the future of documents, it matters that cell phones have spread much faster and more extensively than computers.
The concept of carrier has thus lost its initial clarity. But, in our equation (carrier + inscription), digitization’s consequences on the second term, inscription, are just as radical. Inscription can be considered a type of coding, a familiar operation for computer specialists. They therefore attempted to isolate the logical elements that form this dimension of the document in order to model these elements, automate operations, and rearrange the thereby perfected elements. In this area, a comparison can be made with the concept of program as it is often presented in computer science: program = software + data. A document would simply be a special case of computer program whose software part would represent the “structure” and whose data part would represent the “contents.” The equation would become: electronic document = structure + data. In line with this first entry concerning the form, researchers neglect content and, on the contrary, closely investigate the structure, which, by definition, can be modeled and which, in a way, independently of the carrier, represents the “reading contract” concluded between the document producer and their potential readers. The structure varies enormously according to the type of document. Some documents are practically unstructured, such as certain spontaneous works of art or texts where form and content are indissociable. Others, on the contrary, follow rigid formal rules. The structure also differs according to the types of media. For instance, audiovisual broadcasting introduces a time dimension which is practically absent from written documents. However, the analysis has allowed several levels of structuring to be identified and isolated in the most general case. These levels have been constructed from two research currents, one from analogue to digital and the other from digital to analogue. Before coming back to the concept of structure, it is preferable to understand the logic of their reasoning. The first current’s task is converting traditional documents into electronic form in order for them to benefit from the performance of computers. In other words, a traditional document will have to pass from one equation to the other: carrier + inscription to structure + data. From digitizing the original document onward, the operation precisely attempts to dematerialize it, using an image processing and pattern recognition approach. Furthermore, it is possible to reason simply on the representation of a document, directly reconstructing the visual equivalent of all or part of its representation without using the original carrier. Note that the operation is not socially trivial. It must be executable in both directions, especially for legal reasons. We will come back to this in the third entry. In this first current, one finds image processors whose research attempts to reconstruct the image, i.e. the formal representation of a document, as their name implies. The principle is pattern recognition. To be recognized, a form must first be known. The more the original document is based on generic structures, the easier it is to transpose. The complexity therefore increases when going from typographic
characters to graphics, diagrams, then images, and finally three-dimensional objects. Although the aim is to reproduce a perception that is similar or analogous to that of the original object, the process is nevertheless a new translation which may mask significant elements or on the contrary lead to the discovery or rediscovery of new ones, depending on the technological choices made and the files’ future use. Other researchers directly start with the final equation (electronic document = structure + data); in other words, they go in the opposite direction. They develop algorithms, the essential element at the heart of computer science, to reconstruct the documents, retracing their logical or internal structure step by step to obtain a representation readable on the screen. This second current derives from the common use of text in programming languages with the gradual integration of a concern for form (wysiwyg). This led at first to the development of office automation tools and tools for electronic publishing and was finally confronted with the necessity of being able to exchange documents on a large scale, before it then truly exploded with the Web revolution. The computer scientists argued for the use of layers to isolate the elements of the document structure and process them separately. In this way, they discovered or rediscovered the different logical levels of this structure, the lowest of which is that of the text or analogue signal, for which standardization has been attempted with Unicode, MPEG, etc. Since the transition from electromechanical to digital phototypesetters, the concept of text markup has been added to that of document structure. It gradually established two principles: the tags describe the structure rather than the physical characteristics of the document and they are understandable both by a program and a human interpreter. Without detailing the history of this process, it can be said, from the standpoint presented here, that the Web can be described as an infinity of interlinked documents. Its architecture is based on three pillars: resources identified by a universal addressing scheme (identification), which are represented by a nonexclusive set of schemes (representation) and exchanged according to standard protocols (interaction). This architecture assumes that documents can be accessed from anywhere, using any type of hardware, and according to the specificities of the user groups. The two currents, traditional document recognition and direct construction of electronic documents, are not independent. Starting from different points, they converge to reach the same target. In particular, they allow two basic levels of document structure to be emphasized: the logical structure (the construction of a document as interrelated parts and sub-parts) and the formal representation of the layout, the “styles” in data processing (for instance, the typographical choices for text). As far as we are concerned, the fundamental revolution is perhaps the
gradual uniformization of document formats (in the sense of data processing), because this is what enables the simple processing of these two levels. A document should be readable on any type of computer and decodable by a variety of applications. The trend is towards fragmentation: “proprietary” formats are invading the market and condemning “universal” and “free” formats to remain a luxury of specialists. In addition, “non-universal” formats lead to situations of illegibility: a program cannot read a file, an application cannot open a document, a Web page cannot be displayed correctly on the screen. Furthermore, the format must be able to transcribe the alphabet: the “format” should be suitable for transcribing several languages. Standardization is therefore essential. It is probable that the increasingly widespread success of the XML (extensible markup language) standard and its many particular derivatives marks a new step or even a conclusion of these movements. The XML standard, resulting from the computerization of publishing techniques (SGML—standard generalized markup language) and the sophistication of the first Web tags (HTML—hypertext markup language), integrates structure and content in the same file using a standardized text markup language which allows one to recover and even largely exceed the plasticity and complexity of bound pages we mentioned at the beginning of this section, and a few of their features which had been lost along the way. But by renewing the terms of the old reading contract, whereby the link between perceived representation and logical structure was fixed by the carrier, we also introduce new questions. An XML-approach captures the structure and content from which the form can be derived in different ways. It is not represented intrinsically. It can be said that the form is no longer the essential dimension of the document. But much work is being conducted on the different ways of representing and producing the form of an electronic document, especially for XML documents. Thus, our equation could once again be transformed: electronic document = structure + data would become XML document = structured data + formatting, whose second part (the “style”) is largely variable. In the XML world, the form is defined separately from the data structure, using a style sheet (XSL or CSS). A possible, but not certain, evolution would be that documents “written” in this way join centralized or distributed databases. Then, this file collection would increasingly resemble one or more vast “Lego” sets where building blocks of different sizes, shapes, and forms of use would be arranged in a great variety of configurations. Thus, a last step would be one that is actually happening. A document would only have a specific form at two moments: the one of its conception by an author, who would have to visualize or hear it in order to assure that it corresponds to his/her choices (which is not even necessary if the document is produced automatically), and the one of its reconstruction by a reader. It is very unlikely that the document will always be the same in both cases. Another way
of conceiving this evolution would be to consider that the document is now the database itself, whose different outputs would be only a partial interpretation of its richness. A community of researchers is studying this issue in the context of the Semantic Web, in terms of “personalizable virtual documents.” This evolution raises the problem of the management of one or more documents and their writing, enriching, and rewriting by various participants at different points in time. It is already complicated, for individuals, organizations, and on the scale of the Web, to manage the consecutive versions of a single document. Procedures need to be invented to relate a text to an author (or group of authors) while allowing each author to appropriate, or reappropriate, all or part of the documents produced by other authors or themselves. These procedures are needed in order to limit “noisy” proliferation of the different versions of the same information in the network and identify the nature and origins of these modifications. The aim is the coherent management of all the currently available electronic documents, independently of their format and status and outside any centralized institution. We very clearly perceive the premises of this final step and equally well anticipate the problems it raises. It is much more hazardous to predict the evolution and therefore the consequences, except to say that they will definitely be important and long-lasting.
Definition 1 The observation of this first dimension leads us to formulate a definition of “electronic document”; at this stage it is incomplete but representative of a major movement taking place. This definition must take into account the carrier’s marginalization and the basic role now played, on the contrary, by the articulation between logical structure and styles to redefine the reading contract, understood here as the legibility contract. An electronic document is a data set organized in a stable structure associated with formatting rules allowing a shared legibility between its designer and its readers. This definition is probably too long to be easily memorized. Let us recall the transformation of the equation: traditional document = carrier + inscription to electronic document = structures + data. And we suggest the current evolution whose outcome is still uncertain: electronic document = structures + data transformed to XML document = structured data + formatting, remembering that, stricto sensu, the XML standard does not define the formatting, which is defined by XSL.
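As a minimal sketch of this last equation (the element names, file names, and rules below are invented for illustration only), the structured data can live in an XML file while a separate style sheet (here CSS, one of the two options named above) carries the formatting:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- letter.xml: structured data only; nothing here prescribes how the letter will look. -->
<?xml-stylesheet type="text/css" href="letter.css"?>
<letter>
  <sender>A. Author</sender>
  <date>2003-07-08</date>
  <body>
    <paragraph>Please find enclosed the requested report.</paragraph>
  </body>
</letter>
```

```css
/* letter.css: one possible "style" for the same data; a different sheet would
   produce a different perceptible form without touching letter.xml. */
letter    { display: block; font-family: serif; margin: 2em; }
sender    { display: block; font-weight: bold; }
date      { display: block; font-style: italic; }
paragraph { display: block; margin-top: 1em; }
```

Swapping the style sheet changes the form presented to the reader, while the first term of the equation, the structured data, remains untouched.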
Questions This first approach, by the form, leaves several unanswered questions. Here, we will focus on those concerning the relation between the perceptible world and the digital organization of the new documentary environment. A first series of questions emanates from the display of documents. Even though, “material bibliography” very closely studied all the aspects of the object “book,” the transition to electronic form appears mainly to have focused on the structure based on the logical entry and its processing purposes. Thus, in this first dimension, researchers willingly consider that, since the structure is integrated in the file, any display is therefore possible, so that questions of perception arise due to another set of problems. This conception, taken to the limit, would assume that structure and content are independent, which is, at the very least, debatable. Form has meaning and researchers have long been studying, for instance, the cognitive importance of the possible navigation through hypertext links. However, much work still remains to be done on electronic reading to have a better understanding of the mechanisms of interdependence between the two terms of the equation. It should be noted that this separation destroys the bases of archival diplomatics, one of whose purposes is to authenticate the document’s content by analyzing its form. The result is that authentication (validation) must (will have to) be ensured by other technical (electronic watermark) or organizational (trusted third parties) methods. Could one imagine requirements concerning the form—and among others the authentic form—to be imposed and/or validated by the existence of a “signed” or “watermarked” style sheet? These issues are even more sensitive as a given document can be commonly read on different reading devices. Should we reason as if reading devices had no effect on perception? We just need to compare the screens of a computer, a tablet, a personal digital assistant (PDA), and a cell phone to be convinced of the contrary. This brings us back to the carrier that is read, which we thought to have left behind. These matters are the subject of extensive discussions, in particular within the W3C consortium (device independence). The progress made in the layouts of screen displays, especially based on the work of the Xerox Palo Alto Research Center (Xerox PARC) laboratory, and the improvement of office automation tools, is limited to visual organization, which is, for documents, layout and filing. A few graphics specialists suggest interesting compositions. But these efforts appear to be relatively unconnected to the previous research. Similarly, e-books and the hopes placed in electronic ink have not yet led to very convincing applications in everyday life, even if the possibility of reconstructing an electronic codex is exalting. Certainly those for whom spatial representation is essential, such as geographers and architects, have already inves-
tigated this issue in depth, but they are the exception, not the rule. Once again, it is perhaps the audiovisual that is opening up promising prospects with augmented reality integrating analogue aspects and digital reconstructions. A second series of questions is related to the longevity of electronic documents. These questions are often discussed. On the one hand, they do not differ from the very old problems of archiving and preservation, simply transposed to other techniques. On the other hand, radically new problems have arisen: XML files are theoretically inalterable, provided they are regularly refreshed and preserved under good conditions, since all the information they contain is in digital form. Therefore, some consider that the problems of long-term document preservation will be solved shortly. Conversely, these files are far from representing the form(s) in which the documents are read. Thus, a complete memory of these documents would require preserving all the consecutive reading equipment and systems and making it possible to access them. Here again, much theoretical and practical work remains to be done. Finally, without claiming to be exhaustive, let us emphasize a third series of questions. A traditional document is a manipulable material object. This object fades in digitization, to the extent that the document ultimately becomes a sort of jigsaw puzzle whose pieces are joined together at the request of the reader. However, a reader always accesses a document from a machine, the device on which the document is displayed. Will we witness an extreme version of this idea, whereby a document is nothing more than a modern form of magic drawing board on which significant multimedia items are displayed on request, restricted only by a logic of meaning and specified needs? Or will there be a restructuring of typical documents meeting special requirements or situations, whose possible dynamic will be confined to strictly defined ranges? And might we not assume that the visual stability of paper, the ability to handle it, and the co-existence of pages have an important role to play in cognition? If so, should we not encourage efforts towards an “electronic codex”? What impact will the new reading systems have on our knowledge systems? What about the (individual or collective) author’s legal or simply moral responsibility? Or, more directly related to our entry on form: can the document’s elaboration be separated from its perceptible form and, therefore, is it simply conceivable to envision a formal rupture between the elaboration of the author (who is also the first reader) and the suggestion made to the readers? The success of facsimile formats (PDF) is often analyzed as momentary resistance to change. Is it not rather an indispensable perceptive stability? These questions could be summarized by a single one: by removing the carrier, did we not in fact neglect the form too much?
Document as Sign As for the previous entry, the terms of the title of this part should not be interpreted too academically. The sign has long been the subject of much scientific research. Even if we make use of some of this research here, our purpose is not to discuss the concept. Our aim is simply to group and present research that considers the document primarily as a meaningful object. The entry that interests us here is the processing of the content. If the form is sometimes considered, it is considered as a meaningful element.
Specialisms Concerned This category concerns disciplines that are substantially different from the previous one, some claiming to represent historical progress regarding that previous category, as if going from form to sign meant getting closer to the problem’s core. Thus, as regards professional know-how, we go from library science to documentation, then to information professionals who, rather than managing objects, provide answers to readers’ questions. Or again, electronic document management (EDM) becomes knowledge management (KM) which, beyond a file store management system, allows one to directly identify knowledge useful to an organization. And especially the Web gains an adjective qualifying it as “Semantic Web”, meaning that better use of the capabilities of interconnected machines could allow online file content processing in order to align services more closely to the cognitive demands of Internet users. From an academic standpoint, this category reunites first those who work on text, speech, and image, i.e. linguists or semioticians of all schools, both those analyzing discourse, corpus linguistics, semantics and those constructing automatic language processing tools for translation or automatic information search. Currently, they are coming together with a second category of computer specialists, rather from the field of artificial intelligence, who, starting from the attempt to model the reasoning process, are trying to build tools that are also capable of answering questions by searching in files. At the same time, the concept of information is being replaced by the concept of knowledge, which has the advantage over the first of integrating reasoning. Thus, a new discipline called “knowledge engineering” is emerging. Very rapidly, it appeared that research on information about information (metadata) was useful, and even essential in some cases. From cataloging to indexing, then from thesauruses to ontologies, “metadata” have become an essential tool and subject of research.
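As a small, hedged illustration of such “information about information,” a bibliographic record can be expressed with the widely used Dublin Core element set; the record below is entirely invented and only shows the principle:

```xml
<!-- Hypothetical catalog record; the wrapper element is ours, the dc: elements come from
     the Dublin Core vocabulary (namespace http://purl.org/dc/elements/1.1/). -->
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Invented Report on Document Use</dc:title>
  <dc:creator>A. Author</dc:creator>
  <dc:date>2003</dc:date>
  <dc:subject>document theory</dc:subject>
  <dc:language>en</dc:language>
</record>
```

Such metadata say nothing about the content itself; they place the document in a classification so that it can be retrieved.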
Here, as for the previous entry, the explosion of the Web modified the situation by changing the scale of available resources. Thus, the attempt to construct a Semantic Web, initiated by architects of the traditional Web, was welcomed enthusiastically by the researchers of this dimension.
Evolution According to this dimension, the definition of a traditional document could be symbolized by the equation: document = inscription + meaning. Here, the medium is ancillary, even for the traditional document, providing it preserves the inscription. What is important is the content, materialized by the inscription, which conveys the meaning. The meaning constructs itself according to the document production and distribution context, which influences the interpretation of the content. Three leading ideas appear to us to form the basis for this dimension, in a conventional semantic triangle. The first concerns the documents’ creation, the second their interpretation, and the third the signs of which they are composed. “To think is to classify”; when producing documents, we isolate and order discourses to help us make sense of the world. Producing a document is a way of constructing or translating our social understanding. Thus, the concepts of textual genre and collection are fundamental. Actually, documents are grouped in major categories whose different items are homologous and interrelated. This operation is carried out both upstream (producing documents) and downstream (organizing collections). The classification varies according to the situation and era. The classification can be highly formalized or simply implicit. It can refer to very specific and organized actions (IDs, forms, contracts, etc.) or simple attention, impressions, feelings (media, fiction, etc.). It marks our social representation and our readings of the world. It necessarily requires a system allowing the document to be placed in a set and retrieved from it, a literal or figurative indexing, and therefore concrete or abstract classification systems. The second leading idea is interpretation. What links does the document suggest or establish and how? A document is only meaningful if read or interpreted by a reader. The interpretation largely depends on the context in which it is made. The same document can have different, even opposing, meanings depending on the period and social or individual status of the person interpreting it. In a way, the reader recreates the document each time when isolating and reading it. Here, the reader must be understood in a general sense, including a physical person, a group of people in different spaces and times, and perhaps even a machine.
For the dimension now being examined, the document is considered in a dual relation: relation with the documentary world (classification) and relation with the natural world (interpretation). These relations are established through an “expectation horizon,” a set of familiar signs that constructs the reading contract between reader and document by allowing the reader to decrypt the meaning without difficulty as the reader is automatically placed in the interpretation context. The publishers, by their intervention on the text, its layout, and also their commercial action, are the first artisans of this construction for published documents. Therefore, the concept of “reading contract,” whose importance we stressed in the previous entry on form, takes on additional substance, since it is also necessary for understanding the document. The third leading idea concerns the signs themselves. Any object is potentially a sign and could be a “document.” A discussion which has become a classic, demonstrated for instance that an antelope in a zoo (therefore in a social system of classification) was a document. But a very great majority of documents are constructed from language, mostly written or also spoken. The zoo itself is built around a discourse and the antelope can be said to be “documented.” The same remark can be made about audiovisual documents that are always accompanied by “reading captions” in the form of a very large number of texts from their production up to their exploitation. The written language’s structure, from the letter of the alphabet in Indo-European languages to the discourse, therefore organizes most documents. They are made of discrete pieces, more or less isolable and recombinable, analyzable, subjected to syntax, discourse structuring, and style rules. This use of natural language gives documents a very great plasticity. The information explosion, i.e. the sudden increase in the number of documents, manifest and relentless since the end of the 19th century, led to the invention of what has been called “documentary languages” (bibliographical references, indexes, thesauruses, abstracts, etc.), organized associatively or hierarchically, which are directly derived from the above triad: it was actually possible to construct an artificial or formal language from the documents’ texts (or images, or the objects themselves), enabling their classification in order to retrieve them on request. For a long time, archivists have equally collected metadata on documents and their producers, in the archival description framework, which presupposes the concept of the document’s context as the essential prerequisite for its future exploitation. The construction of such “languages” raises many problems. First of all, it requires standardization, a certain number of common rules agreed upon by the different protagonists. But agreement is not sufficient; incentive must be added. Each person participating in the common effort must have a clear advantage in it; otherwise it is unlikely that the collective construction will be effective. Finally, the languages continually oscillate between the universal and the contingent. This
oscillation is often misunderstood. It is not an issue of conceptual weakness or the inability to choose. On the contrary, it is the documentary movement’s underlying dynamic, based on the triad presented in the introduction to this entry: signs considered in a dialectic between the general that classifies and the particular that refers (interpretation). The insistence, justified or not, of documentalists, that they differ from librarians by the service they provide, information retrieval, also reveals a certain conception of information and its independence from the carrier. Documentalists would devote themselves to analyzing the documents’ contents to directly present users with the answers they expect, instead of just the document(s) that might contain these answers. Documentalists thus participate in the interpretation of the available documents, reconstructing for the reader a document or a documentary file that is adapted to the reader’s needs. “Information science” emerged from this movement. The term “information” remains poorly defined. It is situated somewhere between “data” and “knowledge.” A more correct term would probably be “documentary units.” Information science investigates how the units fit together (a scientific idea is described in a paper, published in a journal’s issue, distributed in a book, gathered in a collection, etc.) and are distributed according to highly regular statistical distributions. Information science attempts to perfect documentary languages and moreover analyze in detail the search for information as it takes place between a user or reader and an access system. The electronic form was initially used by documentalists simply as an efficient tool for classifying the items of the documentary languages in bibliographical databases. But this situation changed rapidly with the computerized processing of natural language, followed by electronic document production and management, the success of the Web, and, finally, the modeling of the reasoning process. Automatic language processing largely exceeds documentary issues. However, conversely, document processing is necessarily affected by the progress and difficulties of language processing tools where full text analysis is involved. Computer scientists and linguists have united their expertise, using statistical and morphosyntactic tools to create either automatic indexing, abstracts or even question-and-answer systems. In their way, they followed a path similar to that of documentalists, using filters and computation to reconstruct, if not a computer language, at least text supposed to represent the document content in a structured format, and thereby enabling automatic processing by machines. The results were initially less promising than had been expected by the promoters. Even the best tools required human intervention and were aids rather than automatic tools. However, for the net surfer, if not for the initiate, their efficiency is spectacular in their Web application as search engines. It is striking to find that search
engines, perhaps because they now deal with very large numbers (of both documents and net surfers), are rediscovering the old questions of library science, already reformulated by information science: bibliometric laws (Zipf’s law), collections (cached copies), indexing and keywords (metadata), quotes (links), loans (hits), etc., obviously widely renewed by computation power, using the contributions of automatic language processing, but often resulting more from empirical fiddling with methods than from a highly rigorous scientific analysis. As in the previous approach, computer scientists have attempted to isolate and model logical elements. But in this case, they worked directly on content. As above, we could represent the transformation by the first equation: document = inscription + meaning becomes, for electronic documents, electronic document = informed text + knowledge. The replacement of inscription by informed text would signify that the text (in its broadest sense, including audiovisual) has been or could be subjected to processing allowing the extraction of the units of information. The replacement of meaning by knowledge would introduce the concept of personalization for a given reader or user. The announced arrival of the Semantic Web can be understood both as a continuation of these results and as at least a methodological breakthrough if not a rupture. Regarding the first interpretation, it can be noted that, for example, the structure of the documents is increasingly formalized (XML) and indexing is stressed (RDF—resource description framework). From this point of view, what is being constructed is a distributed multimedia library on the scale of the network of networks, integrating more efficient search tools. The ambition is also broader. The aim is to progress from a Web which is merely an interlinked set of files to a network fully using the linked machines’ computing power, in particular for semantic text processing. The use of “metadata” that can be modeled and combined is essential for this purpose. Therefore, in their way, the promoters of the Semantic Web construct sorts of documentary languages that they call “ontologies.” The encounter between Semantic Web promoters and knowledge engineering researchers whose objective is to model the reasoning process was then inevitable. The latter have been reflecting since the 1990s on how to give an account of the reasoning contained in documents. In particular, they integrate the issue of document status, modeling of the reasoning and especially of the ontologies. Ontologies have been defined as representations of a domain, accentuating the dissociation (temporary in some cases) between heuristic reasoning and a description of the concepts manipulated by these heuristics. This assumed dissociation was also a way of making it easier to model two types of knowledge, initially considered independent. Ontologies are focused on the essence of a domain (such as medicine or a medical specialty, for instance), on its vocabulary, and, beyond that, on the meaning it conveys. This meaning has two aspects, that understood by human
beings, which is interpretative semantics, and that “understood” by machines, which is the formal semantics of ontology. Ontologies can be seen as richer structures than the thesauruses or lexicons used until now, because they introduce on the one hand a semantic dimension (the conceptual network) and, on the other, in some cases a lexical dimension that improves access to documents. But one of the main assets of ontologies is indeed their formal structure, which will allow them to be used by a computer program, where thesauruses fail. This formal structure is obtained by decontextualizing the concepts included in the ontology. This makes it necessary, for reasons of understandability and maintenance, to link the ontology to the lexical dimension from which it arises, to the texts. Therefore, as for the previous entry, but doubtlessly in a less advanced way, we are perhaps on the threshold of a new phase regarding electronic documents, through the contribution of the Semantic Web. We could represent this step by the transformation of the previous equation: electronic document = informed text + knowledge would become Semantic Web document = informed text + ontologies. However, documents accessible in a form not including metadata are becoming much more numerous than “indexed” documents. Even worse, competition on the Web is leading to opportunistic indexing strategies aimed at purposely misleading the search engines. It is therefore likely that two parallel dynamics will coexist, at least initially. On the one hand, for self-regulated communities that have an interest in developing efficient document searches (experts, business, media, etc.), “specialized languages” will be applied to documents as far upstream as possible in their production, probably in an assisted-manual way. On the other hand, simpler automatic metalanguages, possibly adapted to searches in broad categories, will continue to be perfected for tools widely used by net surfers. The evolution and progress of research in this dimension exhibit a cyclical aspect: old questions have to be raised again for changes of carrier, scale or tool. The controversial construction of a parallel language emerges at each step. The advocates of the previous step thus have the impression that the new arrivals are rediscovering old problems, whereas the latter feel that the realized breakthrough requires a reconsideration of all the problems. It is not really surprising for the construction of such a language to be cyclical. Each change in carrier or scale requires reconstructing its structure. Besides the mass of data to be represented, their multicultural and multilingual aspect must now be taken into account. At the same time, the foundations are not called into question; they simply are (or should be) better known and stronger.
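To give a rough, assumption-laden sketch of this direction, the same kind of description can be recast in RDF (the resource description framework mentioned above), where an ontology term classifies the document and the document itself is referred to by its address; the example.org address and the “TechnicalReport” class belong to a made-up ontology and are not part of any real vocabulary:

```xml
<!-- Hypothetical RDF/XML description; only the rdf: and dc: namespaces are standard. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/reports/2003/report.xml">
    <rdf:type rdf:resource="http://example.org/ontology#TechnicalReport"/>
    <dc:creator>A. Author</dc:creator>
    <dc:subject>document theory</dc:subject>
  </rdf:Description>
</rdf:RDF>
```

The formal class is what a program can exploit, while the literal values remain readable by a human interpreter; both kinds of “understanding” distinguished above are thus served by the same file.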
Definition 2 According to this second dimension, we could present a new definition of “document,” still not claiming that it completely covers the concept. This definition must take into account the ability to process the content for information searches or simply for identifying the document. It emanates from the second part of the reading contract we identified, that of intelligibility: An electronic document is a text whose elements can potentially be analyzed by a knowledge system with a view to its exploitation by a competent reader. Once again, the definition is somewhat laborious. Let us therefore recall the transformations of our equations, more schematic but easier to remember: document = inscription + meaning becomes, for electronic documents, electronic document = informed text + knowledge, which could lead with the Semantic Web to: Semantic Web document = informed text + ontologies.
Questions In coming closer to human communication, the researchers of this entry considerably increased the complexity of the problems to be dealt with. There are still many outstanding issues. As regards languages, for instance, there is the problem of the tools’ use for languages with a structure and written expression that is different from that in Indo-European languages. Furthermore, the boundary between automation and human intellectual work remains unstable. But for our purpose, it should mainly be noted that for the researchers privileging this entry, the document often appears as a secondary concept, and only the text, the content, really matters. Yet, as we saw in the introduction to this approach, the content is only valid regarding a context. Is not the document, exactly, one of this context’s constructions, positioning the information it contains with respect to that contained in other documents and allowing the reader to have an indication of the contents’ value by the status of the document? In other words, can it not be considered that focusing too exclusively on text processing underestimates the semantic value of its inclusion in a given document? New research projects are aimed at giving more importance to the material form. The most immediate is to benefit from structural (and semantic in the future) markup to modulate text analyses, knowledge identification or annotation. A more detailed analysis would concern integration of material formatting elements such as font, case, indenting, lists, etc. Collaborations between document and knowledge modeling specialists will then be necessary.
A series of questions based on the triad given in the introduction of this entry then arises: – Is it possible to process a document’s meaning without relating it closely or loosely to the set to which it refers (collection, category, footnotes, bibliography, etc.)? In other words, beyond laboratory work on closed corpora, how to integrate document analysis as “headend,” generating structures that address the meaning? The whole question, raised anew by electronic documents, lies in the documents’ enrichment by means of interlinkage, in new hypertextualization situations or in construction motivated by collections. – Can information be validated other than by the authenticity of the document containing it? The problem of confidence is a topic currently being investigated by computer scientists (among other topics in knowledge representation) interested in the Semantic Web and by the W3C consortium. They are looking for a technical solution (adding a formal layer, interpretable by software agents) allowing users/readers of a site to increase or decrease the credibility or confidence that can be attributed to the information the document contains. This technical approach to the problem is related to the idea that on the Web, it is the Internet users who validate information and make a site popular or not. The project could be pertinent within a specialized community which has its own conventions, but rapidly reaches its limits on the scale of the Web. The problem is more complex than a “vote” which largely approves a site and grants it credibility. More sociological approaches are required. – When analyzing the content of a document to generate knowledge models for particular uses, the validity and relevance of the document are undermined by other knowledge sources, i.e. the experts in a field or the users. Depending on the goal pursued, the very methods used to extract knowledge from texts assign the same or even higher weight to orally expressed knowledge. The added value brought by the authentication, certification or recognition of the text may therefore be disregarded in certain cases. To what extent can a meaningful element be isolated in a set which has a unity of meaning, i.e. the document as a whole? Does not this unity often have a decisive weight in the meaning of the elements it comprises? How to take into account the global meaning, the semantic unity, of a document if only its parts are captured? The questions posed largely exceed those posed about texts in semantics. The passage from text to document doubtless deserves a more thorough analysis. The answers to these questions undoubtedly differ according to the types of documents to which they are applied. But for the time being, lacking any real progress on typology, we feel they remain largely open.
Document as Medium Let us repeat our precautionary note on vocabulary for a last time: the term “medium” must be understood here in a broad sense. It includes all the approaches that analyze the document as a social phenomenon, a tangible element of communication between human beings. This entry therefore emanates from the analysis of communication, a special instance of communication, whereby the document is understood as the vector of a message between people. We can thus affirm that it is a third dimension of the reading contract, that of sociability.
Specialisms Concerned It should first be noted that the social domain concerned can be separated into two parts: on the one hand, organizations that use documents for their internal regulation and to achieve the objectives they set for themselves, and, on the other, the open societies or communities in which the documents circulate. Undoubtedly, all the above researchers could be included in this category, since they are all interested in a social activity; nevertheless we classify here those whose entry is primarily social before being instrumental. As concerns traditional know-how, the professions already mentioned also fall into this category. However, we will place the emphasis on archivists, whose main mission is to keep a trace of human activity by saving documents as they are produced, and publishers, whose business is to promote the construction and publicity of documents that are of interest to a social group. The disciplines of the humanities and social sciences that focus on exchanges are potentially concerned by this dimension. Therefore, sociologists, economists, jurists, historians, a few psychologists, a fair number of philosophers, and, of course, researchers in communication science, political science, and management science are directly or indirectly interested in documents from their disciplinary approach. Digitization has renewed the interest of many researchers in these disciplines, concerning both the phenomenon as a whole and particular situations. For instance, without it always being assumed, there is a relation between considerations on documents and the new attention to communities of interest, working groups, networking, memory and heritage, intellectual property, etc. But for this entry, the gap between computer scientists and other researchers is larger. Very few specialists in social and human sciences are well versed in computer science. Conversely, computer scientists often have a very limited understanding of social issues. This gap sometimes leads to fascinated enthusiasm or, on the contrary, to radical rejection, on the part both of the social and human sciences and of the supporters of the engineering sciences.
Evolution A document gives information, a materialized sign, a status. It is upheld by a social group that elicits, disseminates, safeguards, and uses it. As we suggested in the introduction to this paper, it is compelling evidence of a state of affairs and harbinger of an event. It is also a discourse whose signature relates it to an author. It is a testimony, even if that was not necessarily its purpose at the moment of conceptualization. It is a filed unit. To be consistent with the previous entries, we propose the third and last definition as an equation: document = inscription + legitimacy. This equation seems to us to represent the social process of document creation. Document status can be acquired under two conditions: to become legitimate, the inscription must exceed private communication (between a few people) and the legitimacy must be more than ephemeral (go beyond the moment of its enunciation) and therefore be recorded, inscribed. These two conditions imply that although any sign can be a document, a particular sign, even satisfying the two dimensions discussed above, is not necessarily a document. For instance, a diary is not a document unless someone takes the initiative of making it public or at least communicating it beyond the circle of relations of its author. Or furthermore, a live radio or television broadcasting is not a document unless someone records it for future social use. This position does not meet with the agreement of all the contributors to this text. For some, the value of a document could indeed preexist its sharing or recording. Document status is not gained once and for all. It is acquired and may be definitely lost in collective oblivion. Or furthermore, it may be regained if someone rediscovers and relegitimizes a document which has disappeared from the collective consciousness but has not been destroyed. However, the equation does not account for the social function of documents. Documents are used to regulate human societies by ensuring communication and the durability of the norm and knowledge necessary for their survival or their continued existence. In a way, it could be said that the reading contract, two of whose dimensions we have identified, corresponding to the two entries above— legibility and understanding—takes on its third dimension here: sociability, i.e. appropriation, whereby the reader marks participation in a human society by
becoming aware of the documents or, conversely, the inscription on an artifact of a representation of the natural world and its inclusion in a collective heritage. A document is not necessarily published. Many documents, for instance because they settle private matters (medical files, transactions between individuals) or because they contain undisclosable confidential information, can only be consulted by a very limited number of people. However, they have a social character in that they are written according to established rules that justify their legitimacy, are used in formal relations, and can be submitted as a reference in case of dysfunction (a dispute). Conversely, publication, broad or limited, constitutes a simple means of legitimization, since once a text has been made public, i.e. available for consultation by a large number of people, it is part of the common heritage. It can no longer easily be amended, and its value is appreciated collectively. The multiplication of documents is therefore connected with the evolution of societies by two dynamics, one external and the other internal, which mutually reinforce one another: first that of the social use of documents and then that of their own internal economy. Political and social organization is based on the production and exchange of documents. Religions and their clerics, states and administrations, productive organizations and trade, civil society, in their different components, their historical evolution, their specific geographies and cultures, their changing functions, have used and still extensively use documents for their internal regulation as well as for the competitive assertion of their identity and position. For instance, let us list the main sources of documentary activity in Western countries, without claiming to be exhaustive:
– In France, the transition from the Ancien Régime to the Republic, then from a police state to a welfare state and finally, today, integration of the state into vaster groupings such as the European Union, or globalization, have all had consequences for the production, role, and number of documents. For comparison, it is sufficient to mention the importance of the document in the parallel history of the administration of China to understand how basic it is, while at the same time specific to each civilization.
– Industrialization, with all the technical, organizational, transactional, and accounting knowledge and standardizations which accompanied it, “produced” a considerable number of documents. It is perhaps the main factor in the documentary explosion mentioned above.
– The progress in science and education considerably increased the number of document producers and consumers, for the internal functioning of science and even more for the popularization of the immense concurrent partial knowledge.
– Exchanges, commercial or other, which exploded with the development of transportation and telecommunications, and the opening of borders, use a considerable number of documents to “flow smoothly” (materialization of transactions, technical notes accompanying products and services, sales information, etc.).
– The development of leisure time, a higher life expectancy, and the increase in “public space” are also essential factors in the development of culture and one of its main vectors: documents.
This dynamic has been examined from specific angles, but the few attempts at a general understanding appear to us to be more speculative than demonstrative, perhaps because they remain isolated works. The second dynamic that allows the establishment of the document as a medium is its internal economy, based, on the one hand, on changes in the technologies from which it is constructed (changes discussed in the two previous entries) and, on the other hand, on the conditions of document creation. These conditions indeed require work whose means of realization have yet to be found. The creation of a document can be analyzed as an ordinary act of communication with, on the one hand, one (or more) senders and, on the other hand, one (or more) receivers. Professions have specialized in one or another moment of the process or one or another area of application. Systems have been constructed and formalized to ensure the compliance of production. Small and large entrepreneurs have set off to meet the challenge, or organizations have taken care of it. These measures have a cost of implementation and maintenance, as well as an inertia. Two main research currents are focused on investigating the economy of this document creation. The first is interested in organizational communication and studies documents primarily as a business process; the second analyzes media communication and investigates the publishing process. Research on organizational communication studies documents immersed in business practices and situations characterized and therefore constrained by rule systems. So it examines documents on several levels: first of all as written material identified within a context, formalized by rules governing writing, dissemination, use, recording of an intention related to an action, and preserving a trace of the social and technical negotiations conducted around it. This leads in particular to examining document production and management processes, activities which are no longer restricted to specialized players alone but which are redistributed throughout the organization. It also considers documents as an element structuring the organization, as a support for coordination. Finally, it considers documents as means used by individual or collective agents in their different strategies. From a methodological standpoint, documents are considered as “observables”, allowing the study of the relations
between players (documents are mediators), regulation modes (they are a management tool), and organizational recompositions (documents are one of the elements that reveal them) at the same time. The initial progress of this analysis concerns only one part of documentary activity and is still far from formalizing the contours of a general document economy. Many uncertainties remain to be dispelled, such as those concerning the relations between documentary systems and organizational systems, the evolution of the different professions of mediation or, again, the economy of libraries or archives. Among archivists in particular, there are many discussions around records management and business re-engineering practices, which are becoming standardized. The current doctrine (which is, however, far from having been assimilated by institutional document producers) demands that the missions (institutional objectives) generate processes (organizational functions), which generate procedures (formalized methods of action), which generate documents (or transactions). As concerns the media, the economy of several sectors is well known because it has been the subject of particular analyses. Let us mention the role of paper publication, peer review, citations, and prepublications in scientific communication. Or again, consider the mainstream media, with the gradation between publishing, an artisanal activity based on the individual sale of objects, with its dialectics between backlist and bestsellers and its distribution networks, and radio broadcasting, a more industrial activity organized to capture the listener’s attention at home, attention which is then sold to interested advertisers. Other topics have been investigated in more detail, taking into account the economic stakes they represent, such as intellectual property rights. In this area, some document properties can be distinguished by the different traditions between the Latin “droit d’auteur” and the Anglo-Saxon “copyright.” The first privileges the author’s attachment to his/her work, whereas the second puts forward the notion of publication, giving intellectual property to the person who takes the initiative. In a way, it could be said that the “droit d’auteur” is a work’s right, whereas the “copyright” is a document’s right. There is one last point to be mentioned concerning document economy that is very important for our subject: the more the existence of a document is known, the more it will be read, and the more it will be read, the more its existence will be known. A resonance phenomenon can develop thanks to the relations between readers and those between documents. It can take on different forms and different names depending on the sectors and specialisms. Marketing professionals and media strategists regularly make use of this phenomenon by constructing public awareness which they then sell on other media. In scientific communication, the impact factor, based on the number of citations in papers, emanates from the same process (and leads to the same excesses). This feature may explain many of the
characteristics of document distribution: bestsellers in publishing, prime time in broadcasting, various modes, concentration and expansion, or the almost perfect regularity of bibliometric laws when large numbers of documents are equally accessible to large numbers of users. Digitization leads to contradictory movements that are not easy to interpret. The first observation could be the disappearance of a large number of documents which, kept in their traditional form, reported on procedures. This disappearance is difficult to measure, because it took place in a disorganized way. The dissemination of computer tools often enough led to the dissociation of functions performed earlier by a single type of document. That is the case for civil registers, which continue to be kept on paper for legal reasons while being available in electronic form for consultation. Increasingly, the replacement, coinciding with the disappearance of middle management, is total: the forms, schedule boards, pilotage, and instructions for use that were commonplace in public and private bureaucracies are replaced by databases and data networks. This movement, dubbed the “computerization of society” a few years ago, risks accelerating even more with the developments stressed in the previous dimensions. But a concurrent increase in written text and document creation is observed in organizations, amplified exponentially by the quality approach. There, documents bear social and organizational standards, turning them into supports for action as well as memories of relations. The fact of storing data and procedures in databases does not efface their prescriptive value, quite the contrary. For instance, intranets grant documents a status (as a reference or as one tool among others) by associating identification and circulation rules with them while modeling and anticipating the possible uses. They amplify the visibility of decisions and activities by making them widely accessible. In this perspective, less than ever before would the document make sense on its own; rather, it would be constituted by the electronic storage of transactions defined in advance. The display of information, ephemeral and necessarily dependent upon evolving technologies, would not in itself constitute a document, but should be validated by certified procedures. Many documents can be similar to the transcription of procedures or one of their stages: thus, such documents could then only be understood in relation to the mode of information translation to which the procedures they transcribe have been subjected. With the addition of the progress made on electronic signatures, many transactions could be completed in the future without the formalisms adopted for printed documents. This mutation considerably increases the possibilities for supervision through the cross-referencing of information. In the social domain, France has acquired, by the “Informatique et liberté” (Information and Freedom) law, a protection that is legal but fragile considering the development of electronic transactions. In
the economic domain, certain analysts have seen the emergence of a new economy, the study of whose hazards far exceeds the scope of this paper. For our subject, let us retain the idea of a radical change in social and economic structures. Just as the industrial age was marked by the interchangeability of parts, the information society would be characterized by the possibility of reusing information. Thus, on the organizational communication side, we might have identified a first transformation of our initial equation, document = inscription + legitimacy, into electronic document = text + procedure. But this last equation does not account for another very important movement taking place on the media side since the advent of the Web. The Web suddenly projected digitization to the scale of society as a whole. To understand the success it has had, measured by its explosive dissemination within populations and depending on the types of activity, it is necessary to return to the spirit underlying its architecture. The organization of the Web is consistent with the guidelines of the designers of the Internet, which was imagined as a many-to-many communication network where each node, large or small, has the same tools and is both producer and consumer. The Web assumes a social understanding, or rather a social communication, similar to a “Republic of Sciences” or the free software movement. In such a society, each person is an actor and responsible to the community for his or her acts. Translating this into our domain, we can say that everyone is capable of reading or writing documents concerning community life and everyone must be eager to publish only documents that enrich the community. The pioneering geniuses of the Web, a combination of the Internet and hypermedia, built a system reflecting themselves, or rather the information community to which they belonged. This idea is very present in many discourses and initiatives in this domain, beginning with those of the W3C consortium. The content-distributing industry (“Industrie du contenant”; the software and telecommunications industry), not without discussion, fights, and compromises, is very attentive to these developments, which strengthen its position because they promote the increase in traffic and processing at the expense of the content-producing industry (“Industrie du contenu”). But not everyone can speak to everyone; that would be a cacophony, and representatives are necessary. Up to now, this difficulty has been solved using one or more filter systems that allow the selection of relevant authors and the configuration of representative and useful documents. Such systems have a cost that can only be diluted in the operation of the community, since the equality of actors has disappeared. Only a few write on behalf of others, and professional mediators organize the publication and access system as a whole. The editorial system is an avatar of this organization, a compromise between private and public interests.
There is therefore an initial misunderstanding, willed or not, between the systems designed by the Internet pioneers and supported by the content distributors, and the ordinary reality of social communication. This misunderstanding is, however, very fertile, since it opens a space of exchange for communities in which communication is restricted by the traditional system. It also gives many institutions, starting with those of public interest, a simple tool for communicating with people. Thus, the Web is a huge bazaar where one can find a multitude of interlinked documents that may be consulted by the reader free of charge. Some believe this organization to be only temporary, illustrating the youth of the medium. This analysis may fail to grasp the rupture that the Web produces. It is also possible that filtering and selection will no longer take place a priori on the Web as in traditional media, but will instead be performed a posteriori using a “percolation” system whereby the most relevant documents would gradually be identified and highlighted by the number of hyperlinks and the operation of search engines. The main thing then would be the Web itself, which, by its continual movement (links are created and destroyed, engines run, pages appear and disappear), would allow the identification of documents. The involvement of a substantial number of net surfers, heretofore excluded from the small closed world of the broadcasting media, and the manifest success of this new medium in practice support this hypothesis, providing a dimension and speed unheard of in the ordinary dynamic of legitimacy through renown. Following a parallel line of reasoning, but from a more literary perspective, several researchers and essayists have seen in the advent of the Web, and especially in hypertext and hypermedia techniques, a disappearance of documents. Thus, the conventional author-work-reader triad, at the origin of the construction of a literary document, could give way to an interactive process in which the links between accessible pages would play a more important role than the text as it was first constructed by the author. Nevertheless, even if interesting experiments in hypertext writing have been and are still being conducted with non-negligible semantic and cognitive consequences, the explosive development of the Web appears to have led on the contrary to an exponential increase in the number of documents put online. The links between pages appear to be becoming gradually structured to create new paratext standards, reinforcing, on the contrary, the documentary aspect of the Web. We could summarize our development on the Web by transforming our initial equation to Web document = publication + access cuing. Publication alone would not guarantee legitimacy; renown, achieved through access cuing, would also be necessary. Consistent with our reasoning, the traditional media have not been able to construct economically viable business models for the Web. Only a few sectors, which already had affinities with networks, found terms of funding: financial
information and scientific information. It is also possible that music, because of peer-to-peer exchanges, is in the process of redefining its distribution and pricing mode. Conversely, digitization has strengthened the “old” media, publishing and broadcasting, by allowing them to make substantial gains in productivity and by promoting synergies and diversification. For instance, using the same database, a newspaper can publish news in print and on the Web, broadcast it on the radio, and distribute it by SMS, audiotel, etc. And each medium can enhance its own areas of excellence (renown through television or radio, interactivity through the Web and telephone, appropriation through publishing), and the resonance mentioned above can lead to unprecedented profitability. These recent changes still need to be evaluated. They also entail high investments whose returns are not immediate, while the future remains uncertain. After a period of hype and trial and error, the Web appears as just another medium whose intrinsic features need to be well understood in order to articulate it with existing media. The announced developments of the Semantic Web point to further changes, in particular in the relations between document and service. But that is still in the future and difficult to cover within the social dimension of this entry.
Definition 3 In this perspective, we identified strong movements, in some cases divergent, often chaotic. At this stage, it is not easy to propose a definition that clearly reflects this third entry. That is why we will give only a very general definition: An electronic document is a trace of social relations reconstructed by computer systems. We recall below the equations we constructed and transformed: document = inscription + legitimacy becomes electronic document = text + procedure and Web document = publication + access cuing. Despite the difficulty in constructing a definition, we must stress the importance of this third dimension identified in the reading contract, i.e. sociability.
Questions The first series of questions concerns the concept of archive, for which the basis is the recording and preservation of documents. The role of an archive is to preserve the memory of a human activity. A new, more active role is emerging for archives with digitization. With open archives, recovery of audiovisual programs
at the source or of television broadcasts, archiving of the Web, etc., many new activities are developing while archive science practices are changing. The records management now being set up in organizations appears to be a prerequisite for satisfactory electronic archiving. Also, there are many questions that remain to be fully answered about a different role to be assumed: hesitation between a testimony of a past action and a record of an ongoing action; confusion between archiving and publication; simple recording or preparation for future use. Even more so, how is a trace to be preserved of the continual movement of pages that link to each other and are constantly renewed? The second series of questions concerns the concept of attention (types of perception and intention), without which a document cannot have a reader. Human attention is limited by the time available and by the reader’s fatigue and technical or intellectual faculties. This problem is well known to broadcasters. As net surfers are necessarily active, there is no way of “hooking” them like broadcast audiences. In other words, the Web combines the freedom of choice in publishing with the accessibility of broadcasting, or extends library services to the entire planet for collection and consultation at home. Thus, bibliometric laws and resonance effects might occur on an unprecedented scale in some sectors: attention is focused very closely on a small number of documents and dispersed across a very large number. These phenomena and their consequences are still poorly studied. Furthermore, the intention of Web promoters is to make sites and documents available to the entire planet on an equal basis. But the penetration of innovations is very unequal. The Web and electronic documents are no exception. Even worse, access to the respective media appears to be among the most inequitably shared goods between countries and between the different populations within each country. The third series of questions concerns the omission of content funding. The principle of the path of least resistance, applied to the accessibility of the Web, means that net surfers prefer to avoid obstacles and barriers to browsing rather than confront them. They will therefore circumvent all direct requests for money. In the same dynamic, a militant movement asserts that the Web should be free and that access to knowledge and culture should be liberated from commercial imperatives. Opportunism and politics are combining to gradually configure the economics of content on the Web as an institutional B2B (business-to-business) market. Is it really certain that this financial structure guarantees the diversity and plurality of online documents in the medium term? Is it even certain that it contains sufficient resources in each sector to keep producing and managing documents? The idea of transferring all existing documents from traditional media to electronic form is inconceivable. However, the explosive development of information technology makes it necessary to envision a very large-scale processing
effort unless we accept a radical amnesia of our documentary culture. In the near future, reasoned choices will be required (what documents should be digitized as a priority?) and tools capable of processing huge volumes of documents will have to be constructed.
Conclusion The three entries discussed highlighted several of the basic themes regarding the document, reinforced or challenged by its electronic version. We now need to consider a summary giving a general view covering the three entries, somewhat like all the colors can be obtained from the three primary colors. Or, more academically, is it possible to envision a document theory from which we could better measure the present and future consequences of electronic documents? First, it is obvious that under each entry, we identified stages in the history of document digitization that we can now compare. Traditional documents consist of a medium, a text, and a legitimacy. The first stage of digitization, where we probably still are, has highlighted the document’s internal structures, the importance of metadata for processing, and the difficulty of validation. The second stage, which has undoubtedly already begun but whose conclusion remains uncertain, and which stresses the XML format as an integrating structure but not a form, is based on ontologies for retrieving and reconstructing texts and emphasizes personalized access. There is a meaning to this general evolution that must be better understood as to its orientation, consequences, and limits. It should then be stressed that the opposition between paper and electronic versions is futile. Almost all current documents have existed in electronic form at one stage of their life and those that have not risk being forgotten. Conversely, numerous electronic documents are printed at some point on a personal or professional printer. Therefore, it is important to better define the concept of document in general, whose electronic form is both revealing and a factor of evolution. Finally, it should be noted that under each entry, we emphasized the idea of the reading contract, expressed as legibility in the first case, understanding in the second, and sociability in the third. It is probable that this contract with three aspects, in all the nuances given, represents the reality of the concept of document. A document may finally be nothing more than a contract between people whose anthropological (legibility—perception), intellectual (understanding—assimilation), and social (sociability—integration) properties may form the basis for part of their humanity, their capability to live together. In this perspective, the electronic form is only one way of multiplying and changing such contracts. But the importance
it has gained, its performance and its speed of dissemination make a thorough, careful analysis even more necessary. We clearly showed, in particular in the series of questions, that none of the entries was independent. It would be futile to attempt to separate them. On the contrary, the concept is only really clarified by superimposing them. But we also noted that each entry was taken up in a multidisciplinary research movement which had its autonomy and whose specialization involves expertise that is too particular to be fully shared. This paper is a call for more in-depth studies to compare each of these approaches and investigate how they intersect.
Translators’ Note On the occasion of the 20th anniversary of the paper published by the French network RTP-doc² under the pseudonym Roger T. Pédauque, which proposes a multi-perspectival model for analyzing and describing documents (form, sign, medium), the English text is here presented in a revised form. This new translation is based on a French text authorized by Roger T. Pédauque (Pédauque 2003b) and translated into English by Niels Windfeld Lund (Pédauque 2003a). This English text was completely reviewed and revised by Laura Rehberger and Frederik Schlupkothen. First, the correct transfer from the original French text was checked and linguistic revisions were made. Subsequently, the text was adapted in favor of readability and comprehensibility. Finally, the text was checked again in general and for technical terms by third parties.³ At every step, it was weighed up whether author fidelity or accessibility of the text was the more pressing concern. All adjustments were recorded in the change history of the word processor used. An annotated version documenting the change process is available upon request from Laura Rehberger and Frederik Schlupkothen. The present reading does not contain any comments and has been adapted to the typographical conventions of this volume. Niels Windfeld Lund made the revision and new translation of this text possible. Stefan Gradmann is to be thanked for the dissemination of the original texts in German-speaking countries. Laura Rehberger and Frederik Schlupkothen would like to thank all those who facilitated this new translation.
2 Réseau thématique pluridisciplinaire “Documents et contenu: création, indexation, navigation” (RTP-doc) du Centre national de la recherche scientifique (CNRS); Interdisciplinary Thematic Network “Documents and Content: creation, indexation, navigation” (RTP-doc) of the French National Centre for Scientific Research (CNRS). 3 Translation agency Proverb oHG, Stuttgart; John A. Bateman, University of Bremen.
Bibliography
Pédauque, Roger T. Document: Form, sign and medium, as reformulated for electronic documents. Niels W. Lund, translator. September 2003a. URL: http://archivesic.ccsd.cnrs.fr/sic_00000594 (24.01.2022).
Pédauque, Roger T. Document: forme, signe et médium, les re-formulations du numérique. July 2003b. URL: https://archivesic.ccsd.cnrs.fr/sic_00000511 (24.01.2022).
List of Contributors
John A. Bateman, University of Bremen, Faculty of Linguistics and Literary Studies, Bibliothekstrasse 1, D-28359 Bremen, Germany. E-mail: [email protected]
Cornelia Bohn, University of Lucerne, Faculty of Humanities and Social Sciences, Department of Sociology, Frohburgstrasse 3, CH-6002 Lucerne, Switzerland. E-mail: [email protected]
Gerald Hartung, University of Wuppertal, School of Humanities, Department of Cultural Philosophy, Gaussstraße 20, D-42119 Wuppertal, Germany. E-mail: [email protected]
Laura Rehberger, University of Wuppertal, Research Training Group 2196 “Document—Text—Editing”, Gaussstraße 20, D-42119 Wuppertal, Germany. E-mail: [email protected]
https://doi.org/10.1515/9783110780888-010
Frederik Schlupkothen, University of Wuppertal, School of Electrical, Information and Media Engineering, Department of Electronic Media, Rainer-Gruenter-Strasse 21, D-42119 Wuppertal, Germany. E-mail: [email protected]
Karl-Heinrich Schmidt, University of Wuppertal, School of Electrical, Information and Media Engineering, Department of Electronic Media, Rainer-Gruenter-Strasse 21, D-42119 Wuppertal, Germany. E-mail: [email protected]
Ulrich Johannes Schneider, Leipzig University, Beethovenstrasse 15, D-04107 Leipzig, Germany. E-mail: [email protected]
Roswitha Skare, UiT The Arctic University of Norway, Faculty of Humanities, Social Sciences and Education, Department of Language and Culture, PO Box 6050 Langnes, N-9037 Tromsø, Norway. E-mail: [email protected]