Philip F. Yuan · Mike Xie · Neil Leach · Jiawei Yao · Xiang Wang (Editors)

Architectural Intelligence: Selected Papers from the 1st International Conference on Computational Design and Robotic Fabrication (CDRF 2019)
Editors

Philip F. Yuan, College of Architecture and Urban Planning, Tongji University, Shanghai, China
Mike Xie, School of Engineering, Civil and Infrastructure Engineering, RMIT University, Melbourne, VIC, Australia
Neil Leach, College of Architecture and Urban Planning, Tongji University, Shanghai, China
Jiawei Yao, College of Architecture and Urban Planning, Tongji University, Shanghai, China
Xiang Wang, College of Architecture and Urban Planning, Tongji University, Shanghai, China
Funded by the National Key R&D Program of China (Grant No. 2016YFC0702104), the Shanghai Science and Technology Committee (Grant No. 17dz1203405), and the College of Architecture and Urban Planning (CAUP), Tongji University, China.
ISBN 978-981-15-6567-0
ISBN 978-981-15-6568-7 (eBook)
https://doi.org/10.1007/978-981-15-6568-7
© Springer Nature Singapore Pte Ltd. 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Committees
Honorary Advisors
• Prof. Dr. Jiaping LIU, Xi'an University of Architecture and Technology, China
• Prof. Dr. Zhiqiang WU, Tongji University, China
• Executive Chief Architect Weiping SHAO, Beijing Institute of Architectural Design, China
• Prof. Dr. Guoqiang LI, Tongji University, China
• Director Bernard STIEGLER, Centre Georges-Pompidou, France
• Prof. Dr. Philippe BLOCK, ETH Zurich, Switzerland
• Prof. Dr. Achim MENGES, University of Stuttgart, Germany
• Prof. Dr. Antoine PICON, Harvard GSD, USA
• Prof. Dr. Patrik SCHUMACHER, Zaha Hadid Architects (ZHA), UK
• Prof. Dr. Yi Min (Mike) XIE, RMIT University, Australia
Organization Committees
• Prof. Dr. Philip F. YUAN, Tongji University, China (Workshop Coordinator)
• Prof. Dr. Neil LEACH, Tongji University, China (Conference Coordinator)
• Prof. Dr. Yi Min (Mike) XIE, RMIT University, Australia (Paper Selection Coordinator)
• Prof. Dr. Guohua JI, Nanjing University, China (Award Coordinator)
Scientific Committees
• Director Bernard STIEGLER, Centre Georges-Pompidou, France
• Senior Associate Shajay BHOOSHAN, Zaha Hadid Architects (ZHA), UK
• Prof. Dr. Philippe BLOCK, ETH Zurich, Switzerland
• Adjunct Assistant Prof. Biayna BOGOSIAN, University of Southern California, USA
• Prof. Dr. Jane BURRY, Swinburne University of Technology, Australia
• Prof. Dr. Mark BURRY, Swinburne University of Technology, Australia
• Associate Prof. Dr. Matias Del CAMPO, University of Michigan, USA
• Associate Prof. Dr. Tengwen CHANG, National Yunlin University of Science and Technology, Taiwan, China
• Assistant Prof. Dr. Benjamin DILLENBURGER, ETH Zurich, Switzerland
• Prof. Dr. Fabio GRAMAZIO, ETH Zurich, Switzerland
• Prof. Dr. Tim HEATH, University of Nottingham, UK
• Associate Prof. Dr. Weixin HUANG, Tsinghua University, China
• Prof. Dr. Guohua JI, Nanjing University, China
• Prof. Dr. Neil LEACH, Tongji University, China
• Associate Prof. Dr. Hyejin LEE, Tongji University, China
• Prof. Dr. Guoqiang LI, Tongji University, China
• Prof. Dr. Linxue LI, Tongji University, China
• Prof. Dr. Jiaping LIU, Xi'an University of Architecture and Technology, China
• Associate Prof. Dr. Wes McGEE, University of Michigan, USA
• Assistant Researcher Dr. Xianchuan MENG, Nanjing University, China
• Prof. Dr. Achim MENGES, University of Stuttgart, Germany
• Prof. Dr. Antoine PICON, Harvard GSD, USA
• Prof. Dr. Patrik SCHUMACHER, Zaha Hadid Architects (ZHA), UK
• Executive Chief Architect Weiping SHAO, Beijing Institute of Architectural Design, China
• Prof. Dr. Xing SHI, Southeast University, China
• Associate Prof. Dr. Chengyu SUN, Tongji University, China
• Prof. Dr. Kostas TERZIDIS, Tongji University, China
• Prof. Dr.-Ing. Oliver TESSMANN, Technische Universität Darmstadt, Germany
• Prof. Dr. Makoto Sei WATANABE, Tokyo City University, Japan
• Dr. Xiang WANG, Tongji University, China
• Prof. Dr. Zhiqiang WU, Tongji University, China
• Prof. Dr. Yi Min (Mike) XIE, RMIT University, Australia
• Prof. Dr. Leiqing XU, Tongji University, China
• Prof. Dr. Weiguo XU, Tsinghua University, China
• PhD Candidate Chao YAN, Tongji University, China
• Associate Researcher Dr. Jiawei YAO, Tongji University, China
• Prof. Dr. Philip F. YUAN, Tongji University, China
• PhD Candidate Liming ZHANG, Tongji University, China
• PhD Candidate Yan ZHANG, Massachusetts Institute of Technology, USA
• Prof. Dr. Yongming ZHANG, Tongji University, China
• PhD Candidate Hao ZHENG, University of Pennsylvania, USA
• Associate Prof. Dr. Zhi ZHUANG, Tongji University, China
Contents
Introduction

Introduction
Neil Leach and Philip F. Yuan

Rethinking Intelligence

What About Humans? Artificial Intelligence in Architecture
Antoine Picon

An Architecture for Cyborg Super-Society
Patrik Schumacher and Xuexin Duan

Do Robots Dream of Digital Buildings?
Neil Leach

A Hybrid Creativity in Architecture: From Brain Plasticity to Augmented Design Intelligence
Chao Yan

Collaborative Intelligence

Toward a Collaborative Robotic Platform: FUROBOT
Ming Lu, Wei Ran Zhu, and Philip F. Yuan

Interdisciplinary AI: A Machine Learning System for Streamlining External Aesthetic and Cultural Influences in Architecture
Güvenç Özel

ArchiGAN: Artificial Intelligence x Architecture
Stanislas Chaillou

Advanced Timber Construction Platform: Multi-Robot System for Timber Structure Design and Prefabrication
Hua Chai, Liming Zhang, and Philip F. Yuan

Iterative Intelligence

Bioinspired Generative Architectural Design: Form-Finding and Advanced Robotic Fabrication Based on Structural Performance
Ding Wen Bao, Xin Yan, Roland Snooks, and Yi Min Xie

A Question of Style
Matias del Campo, Sandra Manninger, and Alexandra Carlson

Day–Night Scene Transfer for Architectural Renderings Based on Color Transfer Approaches
He Wanyu, Nie Guang Yang, Wang Chuyu, and Jackie Yong Leong Shong

Form Finding and Evaluating Through Machine Learning: The Prediction of Personal Design Preference in Polyhedral Structures
Hao Zheng

Comparison of BESO and SIMP to Do Structural Topology Optimization in Discrete Digital Design, and then Combine Them into a Hybrid Method
Gefan Shao

Interactive Intelligence

Emotional Intelligence: Affective Computing in Architecture and Design
Behnaz Farahi

Designing an Architectural Robot: An Actuated Active Transforming Structure Using Face Detection
Ji Shi, Yujie Wang, and Shang Liu

CoFabs: An Interactive Fabrication Process Framework
Teng-Wen Chang, Chi-Fu Hsiao, Chun-Yen Chen, and Hsin-Yi Huang

Materializing Intelligence

3D-Printed Bending-Active Formwork for Shell Structures
Xiang Wang, Kam-Ming Mark Tam, Alexandre Beaudouin-Mackay, Benjamin Hoyle, Molly Mason, Zhe Guo, Weizhe Gao, Ce Li, Weiran Zhu, Zain Karsan, Gene Ting-Chun Kao, Liming Zhang, Hua Chai, Philip F. Yuan, and Philippe Block

A Computational Approach for Knitting 3D Composites Preforms
Yige Liu, Li Li, and Philip F. Yuan
Introduction
Introduction
Neil Leach and Philip F. Yuan
1 Introduction

A new movement is emerging. It is a movement that operates at the interface between advanced digital technologies and the built environment, and that involves some of the most talented and inspiring designers, fabricators, and theorists to have emerged in recent years. We call this new movement "architectural intelligence."

But what exactly do we mean by architectural intelligence? Molly Wright Steenson has already used the term in the title of her book, Architectural Intelligence: How Designers and Architects Created the Digital Landscape.1 In this historical overview, Steenson traces the contributions of four architects, Christopher Alexander, Richard Saul Wurman, Cedric Price, and Nicholas Negroponte, some of the key protagonists who helped to lay the foundations of the discourse of the digital in architecture.

This new movement, however, is not about the past. It is about the future. And it is not limited to a few individuals: it includes all those involved with the built environment. Nor is architectural intelligence limited to the digital; there are many other manifestations of architectural intelligence. Architectural intelligence is a global movement that embraces all forms of intelligence related to the design, fabrication, and use of the built environment. It straddles practice and theory, the material and the immaterial, academia and the profession, and architecture and urbanism. But most importantly of all, it is a movement that straddles both east and west, north and south, bringing together some of the leading voices from across the globe.

1 Molly Wright Steenson [1]. Makoto Sei Watanabe has also referred to "architectural intelligence." After describing a series of investigations into the potential uses of AI, Watanabe coins the term "AI Tect" to describe the role of an architect operating with AI: "Collectively they could be referred to as architectural intelligence (AI). The word AI Tect encompasses two meanings: architectural intelligence and artificial intelligence." Makoto Sei Watanabe, "AI Tect: Can AI Make Designs?" in Neil Leach and Philip Yuan (eds.) [2].

N. Leach · P. F. Yuan, College of Architecture and Urban Planning, Tongji University, Shanghai, China
e-mail: [email protected]
2 Material Intelligence

There are many manifestations of architectural intelligence, and not all of them are digital. Take material intelligence, for example. Materials belong to the realm of the material world, whereas computation itself is immaterial. A sand dune, for example, is an expression of material intelligence. And we must remember that alongside digital computation there is also material computation. The sand dune "computes" its form through the interaction of wind on the particles of sand. The sand dune might look beautiful, but fundamentally its form is the result of material processes.

Indeed, any kind of morphogenesis is an expression of material intelligence. We can see morphogenesis at work in a soap bubble. If we were to dip a tube into liquid soap and blow through that tube, a bubble of soap film would be formed: a perfect sphere. The material "computes" its shape through a calculation based on internal and external pressures, surface tension, and so on. Frei Otto experimented with soap film to produce morphogenetic forms on which to base his designs for structures. Likewise, Antoni Gaudí famously suspended weights from strings to generate a catenary curve, the perfect form in tension that could be inverted to become the perfect structural form in compression.

Material intelligence, however, can be modeled computationally. Examples include structural engineering software such as Ameba, developed by Mike Xie, and RhinoVault, developed by Philippe Block.2 The objective here is to simulate the structural behavior of form using digital tools so as to produce a structurally intelligent solution. We might therefore also include structural intelligence within the broader category of architectural intelligence. Furthermore, architecture could also be transformed by new materials and their embedded intelligence, such as those researched by Lyla Wu's NEUNI Lab, or those applied in robotic printing. These new materials could draw our contemporary design discourse away from the traditional dialectics of tectonic and anti-tectonic, giving architecture a new form of production as well as a new form of intelligence.
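Gaudí's hanging-chain experiment can be stated compactly. The following is a sketch of the standard textbook result, not a formula from this volume: a uniform chain suspended from two points settles into a catenary, the pure-tension equilibrium curve, and mirroring that curve about the horizontal yields a pure-compression arch.

$$
y = a \cosh\frac{x}{a}, \qquad a = \frac{H}{w},
$$

where $H$ is the horizontal component of the tension and $w$ is the weight per unit length of the chain. The inverted profile $y = -a \cosh(x/a)$ carries the same load entirely in compression, which is why Gaudí's hanging models could be photographed, turned upside down, and read directly as structural form.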
2 https://ameba.xieym.com/; https://block.arch.ethz.ch/brg/tools/rhinovault.
3 Swarm Intelligence

Alongside material intelligence there is digital intelligence.3 There are many forms of digital intelligence, such as swarm intelligence, a technique involving multi-agent systems that could be understood as an early form of artificial intelligence. In the book Swarm Intelligence: Architectures of Multi-Agent Systems, swarm intelligence is described as follows: "The complex aerial choreography that unfolds through the motion of a flock of birds exemplifies the emergence of collective behavior. Underlying the coherent elegance and fluidity of the flock is a highly sophisticated form of swarm intelligence premised on the local interaction of individual agents that gives rise to a complex global behavior. The resultant order is not enforced from above, but emerges from the bottom-up interaction of the agents in the swarm. This model of complexity is increasingly being understood as the logic that underlies systems as diverse as flocks of birds, colonies of insects, human social networks and even the operations of a city. This distributed mode of formation has also begun to emerge as the methodological and conceptual basis for an array of generative architectural design strategies."4

Within the field of architecture, swarm intelligence has become an important field of research. Educators such as Alisa Andrasek, Jose Sanchez, Karl Chu, Paul Coates, Cecil Balmond, Roland Snooks, and Ed Keller have explored the potential of using these systems to generate designs using Processing and other multi-agent systems. Swarm intelligence, however, is not only used to produce progressive designs; it is also used to model behaviors. For example, the logic of slime mold, another aspect of swarm intelligence, has been used by The Living in the design of a panel for an aircraft, in collaboration with Autodesk.5 Likewise, in his essay in this volume Patrik Schumacher of Zaha Hadid Architects (ZHA) also explores the use of populations of multi-agent systems to simulate the potential behavior of occupants in buildings, with the eventual aim of informing and modifying the layout based on the simulated behavior of the agents.

Meanwhile, in his book Emergence: The Connected Lives of Ants, Cities and Software, Steven Johnson extends the logic of swarm intelligence to the entire city.6 For Johnson, the city operates as a dynamic, adaptive system, based on interactions with neighbors, informational feedback loops, pattern recognition, and indirect control. As John Holland notes: "Like any emergent system, the city is a pattern in time."7 Moreover, like any other population composed of a large number of smaller discrete elements, such as colonies of ants, flocks of birds, networks of neurons, or even the global economy, it displays a bottom-up collective intelligence that is more sophisticated than the behavior of its parts. In short, the city also operates through a form of "swarm intelligence."

3 Digital Intelligence was the title of an exhibition embracing advanced computational work by students of architecture from all over the globe. Neil Leach and Xu Weiguo (eds.) [3].
4 Neil Leach and Roland Snooks (eds.) [4].
5 Margaret Rhodes [5].
6 Steve Johnson [6].
7 John Holland, quoted in Johnson, p. 27.
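The bottom-up flocking logic described in this section can be made concrete in a few lines of code. The sketch below is a generic boids-style simulation, not code from any of the projects mentioned; the rule weights and neighborhood radius are illustrative assumptions.

```python
import numpy as np

# Boids-style swarm: each agent reacts only to neighbors within radius R,
# yet a coherent global motion emerges from three purely local rules.
N, R, DT = 50, 1.5, 0.1
rng = np.random.default_rng(0)
pos = rng.uniform(0, 10, (N, 2))   # agent positions
vel = rng.normal(0, 1, (N, 2))     # agent velocities

def step(pos, vel):
    new_vel = vel.copy()
    for i in range(N):
        dist = np.linalg.norm(pos - pos[i], axis=1)
        nbr = (dist > 0) & (dist < R)                 # local neighborhood only
        if not nbr.any():
            continue
        cohesion = pos[nbr].mean(axis=0) - pos[i]     # steer toward neighbors' center
        alignment = vel[nbr].mean(axis=0) - vel[i]    # match neighbors' heading
        separation = (pos[i] - pos[nbr]).sum(axis=0)  # avoid crowding
        new_vel[i] += 0.05 * cohesion + 0.05 * alignment + 0.02 * separation
    speed = np.linalg.norm(new_vel, axis=1, keepdims=True)
    new_vel /= np.clip(speed, 1e-9, None)             # keep roughly unit speed
    return pos + DT * new_vel, new_vel

for _ in range(500):                                  # no central controller anywhere
    pos, vel = step(pos, vel)
```

No agent sees the whole flock, and nothing in the loop imposes a global pattern; the order that the designers mentioned above harvest for form generation is exactly this kind of emergent, distributed structure.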
4 Artificial Intelligence

With the development of more advanced forms of artificial intelligence, such as deep learning, we are witnessing something of a revolution in digital design techniques. In particular, Generative Adversarial Networks (GANs), invented by Ian Goodfellow in 2014, have opened up a new chapter in generative design. As Stanislas Chaillou comments: "Goodfellow's research turns upside down the definition of AI, from an analytical tool to a generative agent. By the same token, he brings AI one step closer to architectural concerns: drawing and image production. All in all, from simple networks to GANs, a new generation of tools coupled with increasingly cheaper and accessible computational power is today positioning AI as an affordable and powerful medium."8

Although GANs play a relatively insignificant role within the broader field of artificial intelligence, they have been responsible for opening up an extraordinary new range of possibilities for architectural design. In this volume, Chaillou himself writes about ArchiGAN, a new version of GANs that he has developed for architects. Indeed, such is the popularity of GANs that Wanyu He, Hao Zheng, Daniel Bolojan, Matias del Campo, and Guvenc Ozel also write about them in this volume.

But artificial intelligence is not limited to the design of buildings. New artificial intelligence tools are being developed to monitor environmental conditions, such that we are seeing the increasing proliferation of intelligent buildings. These are buildings that rely on sensors, brains, and actuators to respond to their environmental conditions and to their users, and to optimize their environmental performance. These informational innovations extend to the individual home, where smart devices, such as the sensor-driven, self-learning thermostat NEST, have introduced energy-saving efficiencies.9 Artificial intelligence is also used for controlling temperature and humidity, monitoring sound levels, and checking the quality of water.10 We might therefore include environmental intelligence as a further category of intelligence. This refers not only to the research undertaken by researchers such as Biayna Bogosian, who monitor environmental pollution levels using GIS data, AR, and drones, but also to the whole body of research into the behavior of a building within the environment.11

Artificial-intelligence-based apps and technologies can be found throughout our cities. Tollbooths have been replaced by transponders and artificial-intelligence-based vehicle identification systems. Transportation tickets have been replaced by digital tickets. Artificial-intelligence-based smart parking apps, such as Pixevia, have been developed that allow tracking of empty parking spaces.12 Artificial-intelligence-based traffic apps, such as Wayz, have helped drivers to avoid traffic congestion. Social media apps can now be used to pay for taxis and other transportation services. Likewise, in some airports, facial recognition cameras have already replaced the scanning of boarding passes, while in China it is already possible to pay for goods in a supermarket using facial recognition. Alongside intelligent buildings, then, we need to recognize the emergence of intelligent cities.13

8 Stanislas Chaillou [7].
9 https://nest.com.
10 Anne-Sophie Garrigou [8].
11 https://biaynabogosian.com/.
12 https://www.parking-net.com/parking-news/pixevia/artificial-intelligence-smart-parking.
13 Alibaba has developed its City Brain Lab as part of the logic of intelligent cities. It claims that this now allows ambulances to arrive at their destination 7 min earlier on average, customers to check in and out of hotels in 30 s, and drivers to pass through any pay station in an average time of 2.7 s. Blog Post, "City Brain Now in 23 Cities in Asia," Alibaba Cloud Blog, October 28, 2019, https://www.alibabacloud.com/blog/city-brain-now-in-23-cities-in-asia_595479.
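Since GANs recur throughout this volume, a minimal sketch of the adversarial training loop may be useful. This is a generic PyTorch illustration of Goodfellow's generator-versus-discriminator scheme, not code from any of the chapters; the network sizes, image dimensions, and learning rates are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Generator G maps random noise to images; discriminator D scores real vs. fake.
# Training alternates: D learns to tell the two apart, G learns to fool D.
LATENT, IMG = 64, 28 * 28
G = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(), nn.Linear(256, IMG), nn.Tanh())
D = nn.Sequential(nn.Linear(IMG, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                        # real: (batch, IMG) tensor in [-1, 1]
    batch = real.size(0)
    fake = G(torch.randn(batch, LATENT))
    # Discriminator update: push real images toward label 1, generated toward 0.
    loss_d = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator update: make D score the fakes as real.
    loss_g = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```

The same adversarial pattern underlies the architectural applications discussed in the chapters that follow; what changes is the architecture of G and D and the training data, not the loop itself.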
5 Extended Intelligence

Increasingly nowadays we are hearing references not to straightforward "artificial intelligence" but to "extended intelligence."14 Ultimately, the issue is not about a competition between artificial intelligence and human intelligence, but rather a potential synergy between the two. We should therefore refer to "extended intelligence" (EI), whereby AI operates in tandem with human intelligence. As Joi Ito puts it: "Instead of thinking about machine intelligence in terms of humans vs. machines, we should consider the system that integrates humans and machines—not artificial intelligence, but extended intelligence. Instead of trying to control or design or even understand systems, it is more important to design systems that participate as responsible, aware and robust elements of even more complex systems. And we must question and adapt our own purpose and sensibilities as designers and components of the system for a much more humble approach: Humility over Control. We could call it 'participant design'—design of systems as and by participants—that is more akin to the increase of a flourishing function, where flourishing is a measure of vigor and health rather than scale or power. We can measure the ability for systems to adapt creatively, as well as their resilience and their ability to use resources in an interesting way."15

There is no question that AI will have an enormous impact on architectural design, just as it will on almost every profession and discipline. However, we need to view AI not as an end in itself, but as a potential tool to enhance the capacities of human beings, a prosthetic extension to the human imagination. As such, AI can be understood as augmenting the intelligence of human beings.

14 Anant Jhingran refers to a similar concept, "intelligence augmentation" (IA). He notes, "AI makes machines autonomous and detached from humans; IA, on the other hand, puts humans in control and leverages computing power to amplify our capabilities." Jhingran, Anant [9].
15 Joichi Ito [10].
6 Cyborg Intelligence

Andy Clark and David Chalmers take this even further, and claim that a tool can even become part of our "extended mind."16 Indeed, it could be argued that we are all already cyborgs.17 Forget the romanticized notion of the cyborg, half human, half robot, that we see in the movies. A cyborg is simply a creature that depends on some technological device that serves as a prosthesis to its bodily operations. An ordinary walking stick is a prosthesis. But our increasing reliance on sophisticated digital prostheses, from cell phones, tablets, and laptops to robotic arms, is making us ever more cyborg-like.18 We have absorbed these external technological devices, so that they have become part of who we are. Indeed, human beings are natural-born cyborgs, as Clark has observed.19 For the human brain, as has often been noted, is plastic. It is constantly adapting to new tasks and challenges. And it is precisely our capacity to adapt quickly to new tools that allows us to become so cyborg-like.

Robotic fabrication, then, can be understood within the logic of the cyborg as a form of cyborg intelligence. With industrial robots as the revolutionary construction platform of the digital era, the architectural profession is experiencing a significant paradigm shift from traditional crafts and industrial reproduction to a new human–machine collaboration. Indeed, all robotic fabrication tools are an extension of the operations of the fabricator. As such, we should not draw a sharp distinction between manual craftsmanship and digital fabrication; rather, the whole of robotic fabrication can be understood as a form of cyborg craftsmanship. If, as Alberti claimed, "The workman is a tool in the hand of the architect," so too the robotic arm is a tool in the hand of the architect, a cyborg-like extension to the architectural imagination.20 We could therefore refer to the intelligent use of robotic fabrication technologies as robotic intelligence.21

Meanwhile, the new human–machine collaboration challenges traditional design authorship and questions the concept of creativity within the cycle of architectural design and construction. Through the reciprocal feedback loop between cloud computing and robotic fabrication, the building industry is beginning to exhibit a mode of collective production. The shared knowledge and newfound creativity afforded by the robotic platform encourage a form of collective contribution. Through open-ended mass production and mass customization, the built environment is developing its own distinctive processes of growth and reforming, as the collective creativity of various participants interacts in both the virtual and spatial dimensions to reach a new human-to-human collaboration.

16 Clark, Andy, and David J. Chalmers [11].
17 Amber Case, We Are All Cyborgs Now, TEDWomen, https://www.ted.com/talks/amber_case_we_are_all_cyborgs_now?language=en (2010, accessed 21 September 2019).
18 Good design helps: it is easier to absorb an elegantly designed device than an ugly one. So too does ease of operation. Indeed, it is interesting how intuitive gestures, such as swiping, clicking, cutting, and pasting, have become part of the operations of our devices. For the role of design in assimilation, see Neil Leach, Camouflage, Cambridge, MA: MIT Press, 2006.
19 Clark, Andy [12].
20 L. B. Alberti [13].
21 In some cases, robotic intelligence employs artificial intelligence itself, such as AI Build in London, a company that employs AI to control its 3D printing technologies: https://ai-build.com/.
7 The Future of Artificial Intelligence

In 2002, Bill Gates predicted that the first decade of the third millennium would be known as "The Digital Decade," in that, by the time that it came to an end, the impact of the digital realm would have been so far-reaching that there would scarcely be any facet of human existence which would remain untouched by it.22 Gates's prediction proved largely correct. In 2010, I predicted that we would not be using the term "digital" any more.23 This would not be because we would stop using the digital, but for precisely the opposite reason: almost everyone would be using the digital, and for this very reason the term itself would simply disappear. "Digital design" would simply become "design." To a large extent, this prediction has also proved correct. Just look at the way that we refer to drawings. Initially, when digital drawings first arrived, and when most drawings were hand drawings, we would simply call hand drawings "drawings," and would call computationally drawn drawings "digital drawings." These days, however, the term "drawings" is more or less synonymous with digital drawings, to the point that if we wish to refer to drawings drawn by hand we need to specify them as "hand drawings."24

What, then, will we call artificial intelligence in the future? As artificial intelligence becomes more and more present, and begins to eclipse human intelligence, the logical step would be to reject the distinction between "human intelligence" and "artificial intelligence," and refer instead simply to "intelligence." The irony here is that when John McCarthy originally coined the term "artificial intelligence" at the famous Dartmouth Conference back in 1956, he and others did not really like the term, as the goal was to achieve genuine, rather than artificial, intelligence.25 But it appeared to be their only option, so they stuck with it.

Interestingly, many of the central figures in artificial intelligence are more interested in how the brain works than in developing artificial intelligence itself. In 1946, before the term artificial intelligence had even been invented, Alan Turing confessed: "I am more interested in producing models of the action of the brain, than in the practical applications of computing."26 Geoffrey Hinton, another luminary within the world of artificial intelligence, initially studied physiology: "I wanted to know how the brain worked."27 Likewise, Demis Hassabis, who has a PhD in neuroscience and is CEO of the artificial intelligence company DeepMind, does not refer to artificial intelligence, but simply to "intelligent systems": "Our ambition in DeepMind is to build intelligent systems to help find solutions that can learn to solve any complex problem... Put another way, we want to use it to solve everything else."28 Soon, we can predict, artificial intelligence will simply become intelligence.29

22 Neil Leach [14].
23 "My own prediction is that by 2020 we won't even use the word 'computation' because it will be everywhere." On this see "Interview" in Leach, Xu Weiguo (eds.) [15].
24 Or take the example of the car. In the early days of cars, a carriage drawn by horses would simply be called "a carriage," whereas it was necessary to specify a car, in other words a carriage not drawn by horses, as a "horseless carriage." Now it is automatically assumed that a car is the dominant form of transportation, to the point that if we refer to a carriage drawn by horses we now have to specify "a horse-drawn carriage." By extension, we could imagine that once self-driving cars are introduced, we might likewise reach a threshold moment when the number of self-driving cars exceeds that of cars driven by humans. As such, another phase transition will be reached, whereby self-driving cars will simply be known as "cars" and what we now call "cars" will be known as "cars driven by humans" or perhaps "hand-driven cars."
25 N. J. Nilsson [16].
26 Alan Turing, Letter to William Ashby, 1946. The W. Ross Ashby Archive. https://www.rossashby.info/letters/turing.html.
27 Geoffrey Hinton interviewed by Martin Ford, in Martin Ford [17].
28 Demis Hassabis and Fan Hui [18].
29 Indeed, we might need to introduce a new category, "human intelligence," to refer to the lower-level form of intelligence of human beings.
8 The Future of Architectural Intelligence

What, then, will be the future of architectural intelligence? Architectural intelligence has established an agenda that looks set to influence the future of the discipline for some time to come. It demarcates what is essentially a whole new approach to architectural design and fabrication. In effect, it charts out a new domain in which architectural production is informed increasingly by the introduction of intelligent modes of operating. These intelligent ways of operating are not intended to replace human operations, but rather to augment and supplement them. In fact, just as artificial intelligence was originally meant to form a symbiosis with human beings, so should architectural intelligence as well.

But, looking further into the future, what will happen as architectural intelligence grows in popularity and becomes pervasive, so that every architectural system becomes intelligent? Surely, at that stage, the whole "architectural intelligence" movement will have canceled itself out. It will have become so universal that it will not mean much any more. At that point, surely, architectural intelligence will simply become "architecture." In the future, all architecture will be intelligent.
References

1. Steenson, M. W. (2017). Architectural intelligence: How designers and architects created the digital landscape. Cambridge, MA: MIT Press.
2. Leach, N., & Yuan, P. (Eds.). (2017). Computational design (pp. 68–75). Shanghai: Tongji University Press.
3. Leach, N., & Xu, W. (Eds.). (2013). Design intelligence: Advanced computational research. Beijing: CABP.
4. Leach, N., & Snooks, R. (Eds.). (2017). Swarm intelligence: Architectures of multi-agent systems. Shanghai: Tongji University Press.
5. Rhodes, M. (2015). Airbus' newest design is based on bones and slime mold. Wired. Retrieved December 1, 2015, from https://www.wired.com/2015/12/airbuss-newest-design-is-based-on-slime-mold-and-bones/.
6. Johnson, S. (2002). Emergence: The connected lives of ants, cities and software. New York: Scribner.
7. Chaillou, S. (2019). The advent of architectural AI. Towards Data Science. Retrieved September 17, 2019, from https://towardsdatascience.com/the-advent-of-architectural-ai-706046960140.
8. Garrigou, A.-S. (2018, July 23). So what does a smart city really look like? The Beam.
9. Jhingran, A. (2016). Obsessing about AI is the wrong way to think about the future. Wired. Retrieved January 22, 2016, from https://www.wired.com/2016/01/forget-ai-the-human-friendly-future-of-computing-is-already-here.
10. Ito, J. (2017). Resisting reduction: A manifesto. Journal of Design and Science, MIT Media Lab, November 1, 2017. Retrieved May 1, 2018, from https://jods.mitpress.mit.edu/pub/resisting-reduction.
11. Clark, A., & Chalmers, D. J. (1998). The extended mind. Analysis, 58, 7–19.
12. Clark, A. (2003). Natural born cyborgs: Minds, technologies, and the future of human intelligence. Oxford: OUP.
13. Alberti, L. B. (1988). On the art of building in ten books (J. Rykwert, N. Leach, & R. Tavernor, Trans.) (p. 3). Cambridge, MA: MIT Press.
14. Leach, N. (2003). Introduction. In N. Leach (Ed.), Designing for a digital world (p. 6). London: Wiley.
15. Leach, N., & Xu, W. (Eds.). (2010). Design intelligence: Advanced computational techniques for architecture (p. 8). Beijing: CABP.
16. Nilsson, N. J. (2012). John McCarthy: A biographical memoir. Washington, DC: National Academy of Sciences.
17. Ford, M. (2018). Architects of intelligence: The truth about AI from the people building it (p. 80). Birmingham, UK: Packt.
18. Hassabis, D., & Hui, F. (2019). AlphaGo: Moving beyond the rules. In C. Woods, S. Livingston, & M. Uchida (Eds.), AI: More than human (p. 88). London: Barbican International Enterprises.
Rethinking Intelligence
What About Humans? Artificial Intelligence in Architecture

Antoine Picon
Abstract: Artificial intelligence is about to reshape the architectural discipline. After discussing the relations between artificial intelligence and the broader question of automation in architecture, this article focuses on the future of the interaction between humans and intelligent machines. The way machines will understand architecture may be very different from the reading of humans. Since the Renaissance, the architectural discipline has defined itself as a conversation between different stakeholders: the designer, but also the clients and the artisans in charge of the realization of projects. How can this conversation be adapted to the rise of intelligent machines? Such a question is not only a matter of design effectiveness; it is inseparable from expressive and artistic issues. Just as the fascination of modernist architecture with industrialization was intimately linked to the quest for a new poetics of the discipline, our contemporary interest in artificial intelligence has to do with questions regarding the creative core of the architectural discipline.

Keywords: Automation · Artificial intelligence in architecture · Human–machine interaction

Envisaged from a humanist perspective on architecture, the main problem raised by the introduction of artificial intelligence in the design process has to do with the roles that humans will play in this process a few decades from now. What if artificial intelligence were soon to become a pervasive reality in architecture? What type of agency would humans retain once computers have taken over many tasks that they still carry out today?

Before addressing this question, let me begin with an apparently naïve question. Why is it that we want so much to automate not only fabrication but also large parts of the design process itself? Why is it that we are gradually passing from a discourse on robots to a discourse on artificial intelligence in architecture? After all, one could wonder whether automation will truly happen in our field. Human labor still comes cheap compared to the machinery needed to automate the building industry significantly. This is even truer of the design process: young architects are not that expensive compared to the sophisticated machines that would be needed to replace them. Starting from this question will allow me to unpack a number of dimensions not currently discussed in the field of digital architecture.

This article is based on two lectures on artificial intelligence, design, and architecture given respectively for the College of Design and Innovation and the College of Architecture and Urban Planning of Tongji University in June 2019. I would like to thank Professors Kostas Terzidis and Philip Yuan for their kind invitations.

A. Picon, Harvard Graduate School of Design, Harvard University, Cambridge, USA
e-mail: [email protected]
1 Automation, from Myth to Reality

The truth is that automation is not an entirely rational enterprise. It is all the more unavoidable in that it is rooted in something deeper than rational calculation, something that has to do with the recognition that there are forces in the world that go far beyond the usual factors that shape our ordinary actions and productions. How to be plugged into these forces was a question that the Surrealists had tried to raise through their practice of "écriture automatique," automatic writing.

The reason I mention the Surrealists is not only because of their use of the notion of automatic writing, and more generally automatic creation, as a way to go beyond ordinary artistic production in order to tap into more fundamental forces such as those that move the unconscious. Among their disciples was a young intellectual who was later to become one of the major post-war French sociologists, Pierre Naville. Naville worked on automation in industry. But while doing empirical work on what was happening on workshop floors, he remained sensitive to the lesson that the Surrealists had taught him. He famously declared in an influential essay entitled Vers l'Automatisme Social (Towards Social Automatism), published in the early 1960s: "I am not far from believing, despite the violent rejections that it often provokes, that automatism represents an archetype as ancient and radical as our sense of symmetry or cycle, that it touches something intimate in us, a vibrant chord of our creative power, that it has to do with enchantment, with our will to power, and many other impulses that move us, starting from our unconscious" [1].

For Naville, automation was ultimately about this general pursuit of automatism as an archetype. It had to do with the quest for a spontaneity usually reserved to natural phenomena and beings. It was an attempt to make nature do, by itself, through machines harnessing its power, what humans wanted it to do. It appeared as a Promethean quest that could never be fully satisfied by our technology, however sophisticated, for it possessed a foundational and even mythical character.

Envisaged from this point of view, automation corresponds to something far more fundamental than the ambition to improve efficiency. It has to do with the desire to create something that can rival nature, living beings in particular. Another way to put it is to say that automation has to do with the desire to animate matter, to surround oneself with artificial creatures that seem endowed with something akin to life. There are multiple legends and works of fiction that illustrate the enduring character of this theme: think of Pygmalion, the Golem, or Frankenstein (Fig. 1).

Fig. 1 William Blake, Elohim Creating Adam, circa 1805

From the start, robots were inheritors of this tradition. Like their mythical predecessors, they were situated on the fringe where the distinction between the natural and the artificial becomes blurred. The novel L'Ève Future (The Future Eve), published in the 1880s by the Symbolist French writer Auguste Villiers de l'Isle-Adam, is characteristic of this blurring: the artificial woman featured in the novel gradually awakes to a life very similar to the human one. Similarly, in Fritz Lang's Metropolis the robot seems endowed with a vibrant life that allows her to impersonate a human. The question of the relation between robots and life has of course known a new development with artificial intelligence and deep learning, but our fear of an artificial form of life challenging our alleged superiority as "real" human beings has remained unchanged. In many fictions, beginning with Metropolis, robots seem endowed with almost diabolical powers.

The potentially diabolical dimension of robots offers a convenient transition toward a fundamental aspect of animation: its ambiguous, transgressive, and finally potentially sinful character. For, at least in the Western tradition, it is only God or the gods who can animate. This is what God does when he creates man and woman in Genesis. For humans to animate appears as a transgression that can easily lead to catastrophe. Think of the series of unfortunate events that follow the animation of the Golem or Frankenstein.

Let me now turn to architecture. The discipline has an intimate yet complex relation to the question of animation. On the one hand, it tries to animate matter so that it can address humans, as if it were about to speak. This expressive power was traditionally entrusted to ornament, but ornament was not alone in fulfilling this function: composition also played a role in the capacity of architecture to address humans, to enter into something akin to a dialog with them. On the other hand, animation has to remain incomplete; otherwise it appears transgressive (Fig. 2).1

Fig. 2 Animated Modernism: the La Tourette priory by Le Corbusier, completed in 1961

For a very long time, architecture had to do with animation, but its animation left almost entirely aside the question of automatism and automation. The discipline was in quest of an immobile animation. It was very different in that respect from other domains, from technology in particular with its multiple machines in motion. Now, it is striking to observe how what has happened in the past decade or two can be characterized as a dramatic shift in the relations between architecture and animation. With the rise of digital culture and the extended use of computers and robots in architecture, animation has become inseparable from the question of automation. Following what I said earlier, such a transformation can be interpreted not only as a technological revolution but also as a reactivation of powerful mythical forces. This mythical character probably explains why the relation between automation and architecture is complex and even contradictory. Part of the power of myths lies in their capacity to host contradictory dimensions, just like dreams (Fig. 3).

Fig. 3 Fabio Gramazio and Matthias Kohler, robotic arm in operation, 2008

Automation in architecture is an unfolding process. So far, the most salient phenomenon has been the development of digital fabrication with laser cutters, milling machines, 3D printers, and, of course, robots. With their multiplication, one has observed the rise of interrogations regarding the possibility of entirely automating the construction process, just like what has happened in the automobile industry. At this stage, it is possible to go even further and imagine that we could very well be on the eve of a new wave of automation, the automation of design tasks proper, with the rise of artificial intelligence. This is what I will mostly discuss in the rest of this article.

1 On animation in architecture, see Spyros Papapetros, On the Animation of the Inorganic: Art, Architecture, and the Extension of Life (Chicago: The University of Chicago Press, 2012); Antoine Picon, La Matérialité de l'Architecture (Marseilles: Parenthèses, 2018).
2 Big Data, Machine Learning, and Architectural Conversation

Under what conditions is the development and use of artificial intelligence possible in architecture, and in the design process in particular? Let me begin by evoking some of the basic conditions that will need to be fulfilled in order to have truly intelligent machines.
The first condition is to have gathered sufficiently large quantity of data enabling machines to learn from precedents in architecture, landscape architecture, urban design, and urban planning. A number of designers are currently exploring what this perspective may entail concretely. At Harvard Graduate School of Design, Andrew Witt is among them. His researches on the possible use of big data are emblematic of the issues raised by the development of big data related to buildings, landscapes, and cities.2 In architecture, if machines ingest large enough databases of types, tectonic details, and ornaments, one may reasonably suppose that they will be able to produce designs of their own. What will these design look like? They may combine elements that we don’t usually associate, such as Baroque compositions and modernist details, for instance. Should this scenario become true, two questions would immediately arise. First, will we be able to understand how machines reason? It is not only a matter of being able to picture how algorithms actually function. The elements on which the machine base itself may differ profoundly from ours. Where we see floors, walls, ceilings, and roofs, the machine may choose a different way to organize its reading of buildings. Among humans, there are already differences between the way buildings are understood because of linguistic nuances and technological variations. The French word “poutre” corresponds, for instance, both to the English terms beam and truss, which means that entities that are of the same nature for French builders belong to distinct categories for their English-speaking counterparts. Think of how disconcerting the elements of construction produced by machine deep learning could be from what humans ordinarily recognized as the fundamental components of the building industry. We are already observing significant differences between the way they tend to see cities from a radical perspective than ours. They tend to observe, for instance, clouds of data or rather traces of geolocated events that have become big data after being recorded (Fig. 4). Surely, the distance between our intuitive grasp of objects and phenomena and methods following entirely different premisses predates the invention of the computer. The diffusion of calculus during the eighteenth century was accompanied by a very similar interrogation on the relevance of the new mathematics pioneered by Newton and Leibniz since they were so far away from the geometric reasoning that had dominated since ancient times. What took place upon the course of a few decades was a dramatic realignment of human intuition, which enabled scientists and engineers to build coherent and above all operative mental images of how calculus did actually operate. Will humans be able to produce similar mental images enabling a better understanding of machine procedures of deep learning? Theorist and historian Mario Carpo doubt it [2]. Contrary to him, I personally believe that this be the case for specialized professionals of code, the contemporary equivalent of early-19th scientists and engineers confronted with the Intricacies of Calculus, who will be able to imagine how computers operate. But since design and coding are very probably going to remain 2 See
the project showcased on his website, “Certain Measures,” https://certainmeasures.com/.
What About Humans? Artificial Intelligence in Architecture
21
Fig. 4 “Conversation” between a Buddhist monk and a robot at Longquan Temple, near Beijing
separate activities, except in some specific cases, a new question arises. Should we force machines to work with the same type of elements than those we manipulate day after day in design? Should machines be obliged to combine floors, walls, and ceilings, instead of the elements that deep learning might have led them to identify? This would make machines easier to understand, but it might be the same token deprive them of any genuine creative power. With computers following our ways to understand buildings, we will soon have machines which will be able to produce tolerably good Baroque churches of modernist villas instead of producing hard to imagine combinations of elements of an unknown nature. But is it what we really want? The building industry may not have the same hesitation. After all, it has been feeding for centuries on libraries of established types and patterns. However, this might not be the optimal way to mobilize computation in design. Kostas Terzidis has, for instance, advocated to respect the fundamental “otherness” of computers in order to fully take advantage of their creative potential [3]. Now, the previous discussion revolves around a central issue, that is, the way we will communicate in the future with machines or, to put it in a slightly different light, how to have a true architectural conversation with it. It is worth remembering at this stage that the architectural discipline is actually based on a series of conversations between architects and clients, architects and entrepreneurs, and entrepreneurs and labor, to name only a few. For the funding fathers of Italian Renaissance architecture like Leon Battista Alberti or Daniele Barbaro, this conversational character was the most fundamental aspect of the discipline [4]. From such a perspective, the main challenge raised by artificial intelligence in architecture lies in the difficulty to reach a sufficient level of understanding of machines to be able to associate them in productive manner to the conversation. This will prove extremely difficult if machines are too different from us, but forcing them to mimic us may lead to the symmetric pitfall of making them not interesting enough to deserve consideration.
22
A. Picon
3 The Automation of Design and Fabrication At this stage, it seems plausible to assume that machines will soon be able to do more and more. It seems also plausible to assume than in most case we will force machines to consider elements and to follow rules that are somewhat similar or at least compatible with those that we have adopted in architecture and the building industry. Theorists like Karl Chu may dream of a completely alternative mode of design, but the industry will probably not be as open to this possibility.3 Until recently, one assumed that automation would impact only poorly qualified jobs. This might not be the case. Architecture will be probably among the most severely hit disciplines. The reason for this high degree of vulnerability is that architecture is among the formalized of all the arts. The mechanical part is stronger than in other domains, and hence the traditional position of the discipline on the threshold between art and technology. The very use of orders and proportions by the Vitruvian tradition seems already to be calling for automation, as Mario Carpo has convincingly argued in his essay The Alphabet and the Algorithm [5]. The early-19th project of Durand to codify design procedures could constitute an even clearer forerunner of the reflections on the possibility to use intelligent machines to design. British Architect Cedric Price’s approach to projects like Generator seems also to announce an era of active machine involvement in design. One might be thus tempted to establish a genealogy leading from Durand to Price and from Price to contemporary machine learning. It is almost certain that we are heading to an era of active machine involvement in design. This era had been announced by many proponents of early computerin-architecture culture. It was, for instance, a key belief of the researchers led by Nicholas Negroponte in the Massachusetts Institute of Technology Architecture Machine Group, the ancestor of the Media Lab [6]. Retrospectively, the period during which computers and robots were mere tools will appear as a parenthesis in a long march toward a true cooperation between man and machine. Again, it is not sure that intelligent design machines will reason like humans. This means that as important as their intrinsic power to produce design solutions will be, their capacity to have a conversation with us will prove crucial. Negroponte’s Architecture Machine Group had a clear vision of the necessity to be able to truly enter into a dialog with the computer. This is the reason why it gradually turned to questions of interface, thus giving birth to the Media Lab. Intelligence will not be limited only to design proper. Construction will also be impacted. So far, robots are no more than mechanical slaves. One may imagine that they will acquire some autonomy. We might then see the emergence of disagreements between these various forms of artificial intelligence. Design could very well remain a conversation. But between whom remains undecided. Besides humans and computers, robots might have a role to play in such a conversation, especially if we 3 See,
for instance, Chu, Karl, “Metaphysics of genetic architecture and computation,” in Constructing a New Agenda: Architectural Theory 1993–2009 (New York: Princeton Architectural Press, 2010), 427: 431.
What About Humans? Artificial Intelligence in Architecture
23
try to follow John Ruskin advocacy of the rights of labor to be associated to the creation of any form of collective work. Aren’t they about to become the new labor force of our digital age?4 Before dealing with what could still remain the human responsibility in this transformation, let me say a word about some professional changes that this evolution will accelerate. Some of these changes have already intervened with the diffusion of digital tools. While thousands of books and articles have been devoted to the development of digital architecture, it is surprising to observe how little is currently written on that matter.5 The first striking feature of what has happened in the design profession is the consolidation of practices. Giant firms are no longer the exception. The rise of artificial intelligence in architecture may increase further this phenomenon. Indeed, it requires substantial investment in hardware and software that may not be accessible to small- and middle-sized structures. In these extended practices, there is a risk of rapid obsolescence for designers because of the rapid evolution of a more and more demanding technological environment. Will the future of the profession concern mostly designers in their 20s and 30s, like what has happened in many sport disciplines? Finally, one observes a profound redefinition of authorship with the rise of more collaborative and networked forms of authorships.6 This tendency will be accentuated by the cooperation with machines. What will mean being an author in dense networks comprising not only humans, but also computers and robots? Bruno Latour’s ActorNetwork Theory offers only very partial answers to this interrogation [7]. It leads to the more general question of what could remain specifically human in the future of design that I have sketched. It is not an easy question to answer. One may of course play the humanist and invoke the monopoly of humans on genuine creation. But is this attitude sustainable? We are already seeing texts written by computers, paintings produced by them. These creations are not yet brilliant, but nothing indicates that there are asymptotic limits to what machines can achieve (Fig. 5).
4 What About Humans?

What will remain specifically human in a design process suffused with artificial intelligence? The question is complicated by the fact that, since we invent machines, there is something partly mechanical in the way we think. This mechanical dimension of human thought obsessed the French eighteenth-century philosopher Denis Diderot, the chief editor of the Encyclopédie. It was the source of his interest in the mechanical
Fig. 5 Iris van Herpen with Jolan van der Wiel and Marjan Colletti, Quaquaversal centrepiece, Musée d’Histoire de la Médecine, Paris, 2015. A spectacular example of distributed authorship
arts, to which he devoted a large part of the Encyclopédie. For Diderot, to describe a machine, as he did with a knitting machine, was synonymous with trying to address the question of what is mechanical in the human mind. Indeed, he assumed that we can design machines only insofar as we think like them [8] (Fig. 6).

If we partly resemble machines, what is unique in what we bring to the design process? An immediate answer lies in the fact that we are still endowed with a much more generalist form of intelligence than machines. The path leading to computers able to successfully compete with the all-purpose brain that we possess is still ahead of us. But let us assume for the sake of the argument that we will be confronted with much more powerful and general forms of artificial intelligence in the near future: what then could remain the exclusive competence of humans?

Two factors must be taken into account at this stage. First, we have an embodied intelligence. Our conceptions and experiences of the body have proved eminently variable from one period and civilization to another, but to be human is to have a body through which our brain and the awareness of who we are structure themselves. The findings of contemporary neurobiologists like Antonio Damasio tend to support this hypothesis [9]. The lack of embodiment, in the intimate sense I just referred to, represents a real limitation of robots and artificial intelligence. Second, because we have a body, we have emotions that mix the purely physical and the mental. Memory and most of our cognitive functions blend the physical and the mental in a very similar way. Such an interaction is what produces the human relevance of objects and actions, as well as this series of echoes in each other's minds that we call meaning. It is telling that for a very long time, architecture was interpreted as what happened to buildings when they were able to connect with us through some
Fig. 6 Knitting machine described by Denis Diderot in the Encyclopédie
affinity with our body. The five orders of the Vitruvian tradition found, for instance, their legitimacy in the analogy between their proportions and those of the human body. More generally, the symbolic is very often connected to our body.

I would personally assign to humans the task of deciding what truly matters, of choosing among a range of possible solutions produced by the machine what seems most relevant to humans. In other words, this entails a shift from tactics to strategy, from the how to the why. This shift has already begun to happen with the diffusion of digital tools, and with the rise of parametric design in particular, which may produce an infinity of solutions. In that latter case, the real decision-making has to do with selecting the right questions. Again, the why seems to become more important than the how. From this perspective, architecture becomes a question of choice, something akin to a form of action. Machines produce while humans perform, while they make choices that have to do with how the physical and the symbolic are actually co-emergent. The interaction between the physical and the symbolic is what makes possible dwelling, or inhabiting, in the sense popularized by Martin Heidegger in his seminal text "Building Dwelling Thinking" [10]. Machines can occupy a space, but humans have the relatively unique capacity to inhabit it. Making choices about inhabiting
could very well become a unique prerogative of humans surrounded by artificial intelligence in the future. I am advocating here a kind of return to the phenomenological frame, with a major caveat, however: this return should not be accompanied by the hypothesis of a completely static conception of what constitutes the human. One should not assign a fixed essence to mankind. Even embodiment could one day characterize highly advanced machines. Meanwhile, pressing issues are on the table. What will be, for instance, the future of labor in a highly robotized design world? There again, there is nothing dictated by a fixed nature, but constructs in continuous evolution. This means that the distribution of tasks between humans and machines is to a large extent a construct, a political construct. To be human is to be political, as Aristotle famously remarked [11]. We urgently need to redefine the politics of design, envisaged as the relations between the various actors, human and non-human, that are involved in it.
5 Updating the Myth: The Need for a New Poetics

In architecture, myth and poetics have a particularly strong connection. Technology in architecture has no relevance if it does not appeal to these closely related dimensions. This was the case with industrialization at the apex of modernism. Industrialization was never about rationalization only. It was about connecting to a broader dynamism and about poetics.

It is probably too early to say how this poetic dimension will play out in the development of artificial intelligence in architecture. But let me note one thing: architectural poetics usually plays on the fringes of the dominant productive system. It appeals to nostalgia as well as to various themes that appear as counterpoints to the prevalent rationalization schemes. The dream of a reinvented craftsmanship haunted modernity and industrialization, for instance. The contribution of Jean Prouvé cannot be understood outside this frame. But this is also true of a radical advocate of industrialization like Buckminster Fuller. What moves us in the Dymaxion car is precisely its bricolage dimension, quite different from mainstream industrialization (Fig. 7).

What will be the fringes of an artificial-intelligence-driven world? A possible direction that deserves exploration could be to introduce glitches, to perturb the perfection of machine design. This imperfection has again to do with the couple body/meaning. It was present in Ruskin's reflections on ornament, and later in Le Corbusier's fascination with the imperfections of concrete that made manifest the presence of man. What could imperfection mean in a new machine age? We could also invert the roles and have the machine produce disconcerting glitches and objects. Of course, all this might require a negotiation with machines. And here we are back to the question of conversation.

Speaking of conversation, why not imagine that buildings themselves could eventually have something to say? Why not envisage forms of artificial intelligence whose body could be buildings,
Fig. 7 Construction of Buckminster Fuller's Dymaxion Car prototype in the early 1930s
homes, in particular? Cedric Price’s Generator project suggests something like this. This would constitute the ultimate form of architectural animation (Figs. 8 and 9).
Fig. 8 "Mine the Scrap" project by Tobias Nolte, Andrew Witt, et al., Certain Measures, 2015–2016
Fig. 9 Cedric Price, schema of the Generator project, early 1970s
Ultimately, we might also want to question what will probably be at the core of the role played by humans in an artificial-intelligence-driven design world: inhabiting. Are we going to keep forever the monopoly of this key dimension of architectural experience? Questioning the fringes of established practices and beliefs, reinventing the mythical, might involve a critique of inhabiting as we have known it so far. A world in which we are no longer the only ones to inhabit could become possible in the future. After all, we have never designed only for humans. For instance, we have often drawn and built for horses as well as humans. A seventeenth-century Parisian hôtel was meant to accommodate both. At the turn of the nineteenth and twentieth centuries, large cities were inhabited by tens of thousands of horses, which an American urban historian has interestingly characterized as "living machines" [12] (Fig. 10).

Science fiction novels and movies are already presenting us with multiple images of life with artificial companions with whom we inhabit. Take the movie Ex Machina, for instance. Design is among the key dimensions that make us human. It will probably remain so in this new machine age that I have evoked. To be human might have to do with the fact that we are never sure of what it means exactly. Design crystallizes this incertitude.
Fig. 10 Still from the 2014 British science fiction movie Ex Machina
References

1. Naville, P. (1963). Vers l'automatisme social. Paris: Gallimard.
2. Carpo, M. (2017). The second digital turn: Design beyond intelligence. Cambridge, Massachusetts: The MIT Press.
3. Terzidis, K. (2006). Algorithmic architecture. Boston, London, New York, Oxford: Elsevier, Architectural Press.
4. Choay, F. (1987). The rule and the model: On the theory of architecture and urbanism. Cambridge: The MIT Press.
5. Carpo, M. (2011). The alphabet and the algorithm. Cambridge, Massachusetts: The MIT Press.
6. Steenson, M. W. (2017). Architectural intelligence: How designers and architects created the digital landscape. Cambridge, Massachusetts: The MIT Press.
7. Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. New York: Oxford University Press.
8. Proust, J. (1977). L'article BAS de Diderot. In M. Duchet & M. Jalley (Eds.), Langue et langages de Leibniz à l'Encyclopédie (no. 10/18, pp. 245–271). Paris.
9. Damasio, A. (2010). Self comes to mind: Constructing the conscious brain. New York: Random House.
10. Heidegger, M. (1971). Building, dwelling, thinking (1954). In Poetry, language, thought (A. Hofstadter, Trans.) (pp. 145–161). New York: Harper and Row.
11. Jaulin, A., & Güremen, R. (Eds.). (2017). Aristote, l'animal politique. Paris: Éditions de la Sorbonne.
12. Tarr, J. (2007). The horse in the city: Living machines in the nineteenth century (Animals, history, culture). Baltimore: Johns Hopkins University Press.
An Architecture for Cyborg Super-Society

Patrik Schumacher and Xuexin Duan
Abstract This paper embraces the future-open, anti-humanist sensibility of cyborgism from a societal perspective and locates the origin of the ongoing historical transformation of human identities and ways of life in the technology-induced transformation of societal communication dynamics. The evolution of language, and later of writing systems, is identified as a crucial empowering engine of human productive cooperation and cultural evolution. Equally crucial for collective human self-transformation is the ever-evolving construction of artificial environments. Built environments are as much a human universal as language, and all societal evolution depends on them as frames within which an increasingly complex social order can emerge and evolve. They constitute an indispensable material substrate of societal evolution. These built environments do not only function as physical ordering channels but also operate as information-rich spatio-visual languages, as a form of writing. This insight opens up the project of architectural semiology as the task of radically upgrading the communicative capacity of the built environment via deliberate design efforts that understand the design of built environments primarily as the design of an eloquent text formulated in an expressive architectural language. The paper ends with a critical description of a recent academic design research project illustrating how such a semiological project can be conceived. Extrapolating from this leads the authors to speculate about a potentially far-reaching, new medium of communication and means of societal integration, facilitating a 'cyborg super-society'.

Keywords Superman · Social order · Futurism · Anti-humanism · Parametricism · Tectonism · Extended mind · Parametric semiology
Chinese Translation published in: Architectural Journal (AJ), Special Issue: The Cyborg Futures of Architecture, No. 607, April 2019, Beijing, China.

P. Schumacher (B)
Zaha Hadid Architects, London, UK
e-mail: [email protected]

X. Duan
College of Architecture and Urban Planning, Tongji University, Shanghai, China
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020
P. F. Yuan et al. (eds.), Architectural Intelligence, https://doi.org/10.1007/978-981-15-6568-7_3
Fig. 1 Circular centrifugal interior of ‘Discovery One’ in Stanley Kubrick’s 1968 movie 2001: A Space Odyssey
The digital revolution in architecture which spawned parametricism is but a part of the general computational revolution that is allowing human civilization to progress to a new stage, a stage that is uniquely dynamic and promises unprecedented prosperity. The current breakthroughs in AI and AI-powered robotics imply an acceleration of the technological and social transformations of the human condition that are experienced as both thrilling and threatening. In the optimistic, explorative spirit with which futurism had embraced the technological take-off of the electric age one hundred years ago, parametricism now sees and seizes the positive opportunities for radical innovation of the computational age. The integration of computational optimization tools and robotic fabrication processes into the very DNA of architectural design methods and values is already manifest in parametricism's most advanced stage and sub-style: Tectonism. These innovations not only transform the architectural design and construction process but will, if parametricism/tectonism succeeds, transform all aspects of the discipline's output, i.e. the global built environment and the world of artefacts. These architectural and urban transformations are (supposed to be) in tune with the overall societal transformations, i.e. with the comprehensive technology-induced transformation of work patterns, lifestyles, as well as forms of sociality. Human culture and identity themselves are thereby evolving and differentiating in as yet unpredictable ways, thus suspending our humanist conceptions of a presumably well-understood, stable human nature (Fig. 1).

Our title plays on D. S. Halacy's seminal 1965 book 'Cyborg: Evolution of the Superman'.1 Halacy defined a 'cyborg' as a man who uses machines to increase his power; he considers tools in general, prosthetics, as well as mind expansion via
1 D. S. Halacy [1].
psychic drugs, and he foresees that the human body might have to be medically transformed to survive extraterrestrial conditions. In contrast to these preoccupations, this paper emphasizes the technology-induced transformation of societal communication dynamics and how new artificial built environments have contributed and can further contribute to these progressive, transformative developments. All of us have always already been cyborgs, leading very original, artificial lives within highly artificial man–artefact symbioses. This will involve new, unheard-of social identities. The proliferation of diverse social roles, lifestyles and identities in contemporary society inspired Donna Haraway, in her famous Cyborg Manifesto,2 to conscript the figure of the cyborg as a suggestive metaphor and slogan for her post-humanist brand of feminism embracing the invention of new hybrid, non-essentializing identities. Our paper also takes the cyborg as an anti-essentialist metaphor and manifesto pointing to a radical future openness. However, our manifesto invites architects to contribute to a radical enhancement of human societal processes via built environment innovations. The underlying insight and argument here is that the built environment is an indispensable part of societal communication processes, that the city facilitates the emergence and operation of a collective super-brain, and that it is thus an arena of societal mind expansion. This thesis is in line with concepts of mind, as exemplified among others in Clark and Chalmers's Extended Mind Hypothesis,3 that understand that 'mental' processes are inherently social and, in the case of homo sapiens, increasingly involve external mind extension.
1 Becoming Human and Super-Human via Design

This paper takes the achievements of parametricism and tectonism for granted in order to speculate further about more far-reaching potentials for architecture's contribution to a new level of civilization, a civilization in which diverse forms of man–machine symbiosis might usher in an era that we might anticipate as 'the era of the cyborg'. It is no contradiction that this paper also takes Andy Clark's thesis of humans as 'natural-born cyborgs'4 for granted as one of its premises, because the new era implies a step change in what admittedly has always been an aspect of the human condition: self-transformation. Clark's thesis emphasizes brain plasticity and thus the openness of the human mind and person to artificial body and mind expansions. This implies optimism about the adaptive capacity of the human organism and psyche with respect to radically life-changing transformations (Figs. 2 and 3).

The artificiality of human settlements, artefacts and dress becomes strikingly obvious when we compare the diversity of human visual-material cultures, each with its own rich internal differentiations, with the uniformity of other primate species.

2 Donna Haraway [2].
3 Andy Clark and David Chalmers [3].
4 Andy Clark [4].
Fig. 2 Electronic arm, Artist Stelarc, b. 1946
Fig. 3 High-performance prosthetics: carbon fibre running blades
What sets human groups apart from other primates is what we might at earlier times, in the broadest sense, have called art, and what we should now refer to as design. This includes profane and sacred buildings and places, tools and other artefacts, as well as all the artistry of decorative self-transformation via dress, jewellery and makeup. These practices of 'artistic' self-transformation are a universal phenomenon of all human groups (Figs. 4, 5, 6, 7, and 8). These practices of decoration also include all buildings and artefacts, i.e. in contemporary terms they include all design disciplines. The significance of these practices, then and now, is the visual marking and thereby conspicuous differentiation of social arenas, roles and identities, without which no social order can be built up or maintained. These props and marks allow dominance hierarchies, as well as more complex societal differentiations, to be institutionalized. The social order becomes
Fig. 4 Group of apes
Fig. 5 King of Kuba Kingdom, Congo
independent of the continuous combative physical re-establishment of hierarchy and in this way also becomes scalable. Design is involved in the elaboration and reproduction of these vital semiological systems.

Dress codes, as well as building ornamentation and the decoration of spaces, each form a (very loose) system, because we always contrast and identify when we read a place or space or a person's style. These systems also, to some extent, connect
Fig. 6 Henry VIII
Fig. 7 The 12 Justices of the UK Supreme Court
up to form an overall system. A person's dress style is often loosely connected to the style of his or her dwelling. While in the old days the individual 'designs' evolved gradually by tradition-bound reproduction with variation, today the individual designs are indeed consciously designed. However, the overall semiological system emerged bottom-up rather than by design, and today it continues to evolve without any conscious steering efforts on the collective level of the overall semiological system. This insight invites us to
Fig. 8 UK Supreme Court in session
speculate about the possibility of raising the level of conscious, discursive steering on this level.

As Karl Marx already knew, the driving force of historical transformations is the productivity gains that accompany and indeed demand, for better or for worse, societal self-transformations within a competitive world economy. The inherent rationality of the competitive process as an optimizing evolutionary process implies that change is for the better, at least for most, if not for everybody. Productivity gains are thus not only the driving force but also the ultimately life-enhancing raison d'être of this process. Humanity as a whole enhances its material freedom, i.e. our collective capacity to 'domesticate' the indifferent and often hostile physical universe and to turn it into our serviceable, pleasurable 'home'. Anti-humanism or post-humanism in this context implies the rejection of an all-too-fixed concept of our human nature in favour of an unprejudiced heuristic of future openness to forms of life that defy all prior human stereotypes. What these new forms of life will be like is hardly predictable in any detail, and nobody knows where and when we might hit limits to human adaptability. The evolutionary process of human becoming remains unpredictable in its pathways but can be trusted to deliver positive value, i.e. increasing levels of prosperity and individual freedoms, as defined by the new human subjects, the supermen in Nietzsche's sense, that will have evolved in the process. The thesis here is that this process of becoming 'superman' is only conceivable as embedded in a world-societal becoming, as part of a 'cyborg super-society'. A superman can only exist in a super-society. While there are also unknown psychological limits, whether a new super-society evolves and how it evolves depends on the societal dynamic in relation to the historical prosperity limits and potentials given by the hard conditions of what
Fig. 9 Interactive Robotic Fields, Marcel Ortmans, Ivan Subanovic, Markus Ruuskanen, I Yu, AADRL 1999/00
is possible technologically, and by the softer political and ideological conditions (Fig. 9).

The emphasis here, therefore, will be placed not on speculating about a new human subjectivity or new personality types but instead on new patterns of social interaction and a new social order with new social processes and dynamics in which, very broadly and abstractly anticipated, intelligent responsive environments will join the social process, and individuated parts of it will potentially even join society itself, understood as a communication process. We can imagine intelligent, learning robotic environments being accepted as consociates, i.e. as unique members of society. This emphasis on the societal process and social order rather than on technology or subjectivity is motivated by architecture's and urbanism's unique perspective on, and contribution to, the evolution of civilization, namely via their core competency, the spatial ordering of social processes. It is a crucial fact of human becoming that the expanded social order of all human societies evolved on the basis of artificial built environments that deliver the necessary cross-generationally stable and accruable material substrate of societal evolution. This dependency still holds: societal evolution is still supported by architectural and urban evolution (Fig. 10).

The emphasis on social order in relation to the advance of civilization is also merited independently of our specific disciplinary perspective. The dynamism and transformative agency of human progress is located in the productive and creative
Fig. 10 Frontispiece of Thomas Hobbes’ ‘Leviathan’, by Abraham Bosse, with creative input from Thomas Hobbes, 1651. The social body is depicted together with its architectural embodiment
power of human cooperation via the division of labour, rather than at the level of individuals' capacities. To be sure, these capacities matter too, as given constraints. However, all the radical transformations of the human condition, from the stone-age hunter and gatherer bands to our current single, integrated world society, happened on the basis of virtually unchanged biological organisms. Those individual cognitive enhancements that make a difference between an ancient hunter and a contemporary world citizen are software- rather than hardware-based, i.e. due to culturally transmitted language, knowledge and learning capacities. More powerful than these individually downloaded capacity enhancements is the empowerment through cooperation within a single global mega-network that embodies what we might call 'global swarm intelligence', articulated into various discourses and also including the contributions of 'organisational intelligences', i.e. of all the organized social systems like firms, NGOs or government agencies. World society is constituted and connected up via architectural and urban spaces, transport including aviation, the mass media and the Internet, which allows everybody to communicate with everybody else within seconds. Architecture and design play an indispensable part in facilitating these arenas and interfaces of the global collective intelligence, including all web-based communications through the work of our graphic and web-design colleagues.
Social communication and cooperation are the crucial factors for both the epistemic and productive powers of mankind. Expanded communication implies both expanded markets and expanded discourses, allowing for the division of labour and cooperation on a new level of productivity, inventiveness and rationality. Scale matters. A larger and more integrated society allows for a more intricate, fine-grained division of labour and thus also for increases in per capita income. This insight about the positive prosperity effect of expanded markets, via a thereby intensified division of labour, was already stated explicitly in Adam Smith's 1776 treatise 'Wealth of Nations'.5

Communication is also crucially involved in all advanced forms of cognition, reasoning and knowledge production. Language, a uniquely human evolutionary advance, is itself a social achievement and a crucial prerequisite of all human thinking and reasoning. Languages co-evolved with societal evolution. As Ludwig Wittgenstein demonstrated, there can be no such thing as a private language.6 Languages are like empowering software packages that run on our brain's hardware. As Daniel Dennett poignantly stated, you can't do much thinking with your bare brain.7 Twentieth-century philosophy is defined by the 'linguistic turn', which established the philosophy of language as 'first philosophy', dethroning epistemology and ontology, respectively.
2 Languages and New Modes of Communication

Human language is, in contrast to all animal communication and signalling systems, uniquely empowered due to the combinatorial explosion of expressive power delivered by grammar. Only human languages possess grammar. It is said that it was the unique evolutionary advance of grammar-based language that opened up a new evolutionary take-off, indeed a whole new type of evolution, namely cultural evolution. We are not disputing this, but we must emphasize that verbal language by itself was not sufficient. Indeed, as stated above, artificial built environments were an equally indispensable, constitutive condition and ingredient of any cultural evolution, due to their material endurance, due to their ability to be accrued and modified, as well as due to their capacity to structure and stabilize a social order on ever larger scales. Human languages co-evolved with the spatial order of human settlement structures that distinguish, distribute and connect social activities, roles and status types. The artificial built environments themselves constitute a spatio-visual language, and thus evolved with a double functionality, as physical organization and as spatio-visual codification of the social order. The same double function applies to tools and the world of artefacts in general, especially with respect to clothing. The latter, in particular, as dress code, constitutes a crucial visual language
5 Adam Smith [5].
6 Ludwig Wittgenstein [6].
7 Daniel Dennett [7].
Fig. 11 Painting depicting a social gathering, Five Dynasties period (907–960 AD)
without which no extended social order is or would have been possible. Architecture, artefacts and dress are human invariants. There are no known human life forms without these ingredients, and they still constitute indispensable arenas of societal evolution. The reproduction and indeed continuous innovation of these essential arenas is the responsibility of the design disciplines. Theoretically, there can now be a substitution of this communication function via electronic tags that would reveal information about the tagged person (or space), but this would require our identities to be made explicit. This would be rather different from the intuitive play with affiliations, contrasts, hints and deliberate ambiguities that aesthetic communication affords. Also, there is only so much text-based information that can be fed through via a PDA or similar device. In contrast, the built environment and the world of artefacts and dress constitute a 360-degree interface of communication that addresses all the senses (Figs. 11 and 12).
Fig. 12 Paris Social Scene, French School, eighteenth century
our agenda is speculative, i.e. speculation about the radical transformative potential of the semiological project for the built environment and indeed human civilization. What we are envisioning is the radical expansion and enhancement spatio-visual communication, if not the opening up of a whole new mode of communication. A few hints about the transformative impact of prior expansions of human communication might stimulate our imagination and anticipation here. After the original evolutionary take-off innovation of verbal human language, it was the evolution of writing systems that had hugely empowering consequences for human civilizations. In fact, what we usually describe as the first human ‘civilizations’ in contrast to earlier tribal groups are indeed all, and by necessity, writing-based, literary civilizations. We should also note here that the level of societal advancement we call ‘civilization’ also implies, without exception, the emergence of cities. All civilizations are literary, urban civilizations. Printing marked yet another momentous step, inaugurating what Marshall McLuhan termed the Gutenberg Galaxy.8 The media theorist McLuhan remains a classic reference for all Cyborg-related inquiries. He sees media as epochmaking, but perhaps places too much emphasis on media-induced shifts in human cognition and subjectivity as a mechanism of new media’s societal impact. Prime attention should rather be given to the new, extended forms of social organization 8 Marshall
McLuhan [8].
that become possible. However, his slogan 'the medium is the message'9 remains pertinent. This resonates with Jacques Derrida's reflection on the decisive role of the medium of writing and its irreducible contribution, over and above speech, to philosophy. There is no way that elaborate discourses on morality, politics, law, economy, architecture, etc. could have emerged without writing, on the basis of an oral culture only. Indeed, as the sociologist Niklas Luhmann emphasizes, these modern discourses, which became crucial historical accelerators, quickly became extensive, specialized, critical literatures which could not have come into existence without printing, which massively increased the participation and turnover in these discourses. In fact, discourses in the modern sense could not have emerged on the basis of writing alone, i.e. via hand-copied manuscripts. Printing was prima facie a momentous empowerment of writing rather than a new, sui generis medium of communication. However, the mass dissemination of written texts also triggered a qualitative shift in the texts themselves and transformed the societal role and impact of written communication. Modernity was indeed Gutenberg's galaxy. An elaborate philosophical discourse as exemplified in Immanuel Kant's 1781 Critique of Pure Reason, a systematically ordered, thousand-page treatise that quickly spawned a wave of intense philosophical work known as German Idealism, is unimaginable without printing, both with respect to its conditions of production, based on the availability of an extensive library, and with respect to its wide reception.

Similarly to print, radio and television are dissemination engines, here for speech, face-to-face communication and theatre, rather than an altogether new medium, as speech itself is in relation to facial and gestural expressions, or as writing, and indeed drawing, is over and above speech. The rather remarkable fact that writing and reading are possible at nearly the same speed as speaking and listening testifies to the enormous plasticity of the human brain and to its receptiveness to new forms of software delivery and new operational modi. Musical notation brings home this point too, as do mathematical notations. These also show that writing systems do not have to go through silent, inward speech as alphabetical writing seems to do, at least for beginners. With sufficient training, all forms of reading, like listening to speech, develop from a process of concentrated deciphering to something more akin to effortless perceptual absorption. Finally, braille, the script for the blind, demonstrates that the tactile channel might be utilized, with its own arbitrary tactile code, with proficiency equal to the auditory and visual channels. The density of information transfer seems to be more or less undiminished, once subjects are sufficiently trained up (Figs. 13, 14 and 15).

The underlying brain plasticity and learning capacity are also strikingly evidenced by recent technologies for the effective substitution of vision for the blind. A matrix of differential stimulations, transcoded from the feeds from a head-mounted camera, is being delivered to the tongue, or alternatively to an area of skin of the subject's

9 Marshall McLuhan [9].
Fig. 13 Braille embossed tactile writing system for the blind
Fig. 14 Tactile information embedded in public navigation surfaces
back, substituting for the retina of eye-based vision. An area of about 3 × 3 cm of densely packed nerves on the tongue surface is stimulated by electrodes transcoding light–dark into strong–weak impulses. The subject gets accustomed to how the variations in the stimulation are systematically related to variations in her surroundings, as well as her scanning head movements and locomotion, and constructs a vision-like, intuitive, objectifying new form of perception.

Returning to our natural perceptual channels, we know that auditory perception can also deliver a degree of space perception and spatial orientation. However, as with speech itself, the auditory channel can also be used for language-like information transmission, like systems of sound alarms, or for a kind of encoded writing, as in the Morse code, which can be represented in sequences of short and long tones or dots and dashes. Braille-like dot and dash patterns are also being used within the built
Fig. 15 BrainPort, developed by neuroscientists at Wisconsin-based Wicab, Inc. A pair of sunglasses wired to an electric 'lollipop' helps the visually impaired regain optical sensations via the tongue
environment to communicate useful information to aid the navigation of the blind within public spaces using the tactile channel.
3 Speculation: The Designed Environment as Total Communication

These reflections bring us to the point where we can start to speculate about the gradual but ultimately comprehensive transformation and utilization of the built environment and the world of artefacts, i.e. the totality of the phenomenal world that surrounds us, into a continuous, densely articulated communication that addresses all perceptual channels or senses, i.e. vision, touch, hearing, taste, smell, thermoception, equilibrioception and proprioception, to be designed by the self-organized, bottom-up concert of all design disciplines, including urban design, architecture, as well as landscape, interior, furniture, product and fashion design, and graphic and web design. Ideally, the design of this totality of the human habitat would, in each of its individually designed components, not only follow the style of tectonism but furthermore follow the rules of a unified semiological system. Each individual design, each individually authored urban, architectural or interior design, would be like a permanently broadcast utterance in accordance with the rules of a global, multimodal language. This unified language will not itself be designed at once but would emerge from the gradual proliferation and convergence of individual, project-specific semiological
systems. The language will probably not be required in all parts of the built environment but will first spread in the most advanced, dense, complex, high-performance metropolitan centres of world society. Because these are interconnected, convergence towards a single semiological system or language is likely. An architectural semiology can be expected to be motivated by indexes, i.e. to derive from the forms that are selected to fulfil the physical part of the social function to be accommodated and expressed. While the function of communicating its purpose adds additional motivation and constraints, this implies that the forms that will end up as signifiers of their function will be a rather limited set, constrained to those forms that can indeed physically fulfil those functions, rather than being selected arbitrarily as in verbal languages. This fact, together with the fact that the overall meaning domain is very much restricted, eases the convergence process. This global language will probably still have many regional dialects, and only a few large and complex institutions will probably exploit the full elaborateness of its vocabulary and grammar.

What is surely unimaginable, under current socio-economic and political conditions and also in future, is that there could be a centrally devised and prescribed spatio-visual language, in analogy to the system of traffic signs, for all designs from urban design to product and web design. Such an expectation is unrealistic even for urban design alone, as is evidenced by the forced retreat of urban planning in the face of the dynamism and complexity of the urbanization process and of the social process in general. However, the analogy of the traffic sign system is instructive. It can serve as a simple model for the generalized semiological project we envision. This system makes a complex and dense urban circulation possible, giving a rule-based ordering matrix to structure and facilitate the coordinated movements of multiple distinct sets of participants which the system distinguishes, including pedestrians, bicycles, cars, taxis, trucks, buses and trams. The registers of semiological articulation are mostly graphic and include street markings using various line types, line colours, arrows and hatchings, as well as vertically erected signage plates and traffic lights, placed both above and next to movement spaces. The signs are built up by means of a grammar that allows the composition of complex signs from multiple sign radicals that thereby compose a complex meaning. The domain of meanings is constrained to the concerns of traffic regulation and contains prohibitions, commands, warnings and guiding information facilitating navigation. Most signs are deictic in the sense of referring to the immediate context: STOP means stop here. The design of the signs does not only follow semiological prerogatives but evidently must also satisfy the phenomenological concerns of conspicuity and perceptual tractability (Fig. 16).
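The radical-based composition of such signs can be made concrete in a few lines of code. The following toy model is purely illustrative (the category frames and radicals are invented placeholders, not an actual signage standard):

from typing import Dict

# Toy model of a radical-based sign grammar: a complex sign composes its
# meaning from a category frame plus content radicals, and is deictic,
# i.e. it refers to the here and now.
CATEGORY: Dict[str, str] = {
    "red_circle": "prohibition",
    "blue_circle": "command",
    "red_triangle": "warning",
    "blue_rectangle": "guidance",
}

def compose_sign(frame: str, *radicals: str) -> str:
    """Compose a deictic message from a category frame and content radicals."""
    return f"{CATEGORY[frame]}: {' + '.join(radicals)} (applies here)"

print(compose_sign("red_circle", "parking"))          # prohibition: parking (applies here)
print(compose_sign("blue_rectangle", "pedestrian", "crossing"))

A handful of frames and radicals thus yields a large space of well-formed, context-bound messages, which is the combinatorial point of the analogy.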
All these features (the general function of facilitating the self-ordering of social processes, the concern with phenomenology, the deictic context-relatedness, the distinction of status groups and the aspect of normative, here mostly non-legal, rules of access and behaviour) are also determinative of the generalized semiological system envisioned here, including a constrained domain of the signified. This domain focusses on the immediate concerns of spatial navigation and on designating the social interaction type offered by each respective zone or space, each with its respective role expectations and protocols of interaction. In addition, this language
Fig. 16 Traffic sign system: A grammar-based system that distinguishes prohibitions, warnings, commands and guiding information
informs users about whose space it is, or whom to expect, in the zone or space in question. Thus, there are three primary types of information to be conveyed, which we term locational, functional and social information, respectively. The task of facilitating navigation within the built environment by means of communicating locational information, as well as the basic task of designation, is currently taken care of explicitly only by wayfinding and signage design, as an add-on system of graphic or written signs. The advantages of what we propose here are pragmatic: signage systems are a crude crutch. Users have to first find the sign, and this becomes increasingly difficult in a very complex visual field. Then, if it is text-based, users have to read it. This takes concentration. In contrast, if the totality of the built environment becomes an immersive, 360-degree interface of communication, then much more can be taken in, and a lot of it subliminally.
The differentiation of both interaction types, i.e. types of social situation, and social types, distinguished by status, personality, affiliation, interest, etc., could become rather rich and nuanced, in accordance with contemporary society. Of course, this meaning domain, the scope of what can be expressed in the language of the built environment and artefacts, is much wider than in the case of the traffic sign system. However, compared with verbal language, and even more so compared with writing, the scope is massively, and strategically, restricted to the task of orienting and ordering the participants of the social interaction and communication process by means of broadly premising and framing the specific modes and contents of those communications that can then flow within these designed frames. Architecture itself does not symbolize everything; it does not tell us stories; it must only tell us what to expect and how to act within its bounds (or in its vicinity). We need to ask and answer the question: What does the user need to know about an urban or architectural environment to be enabled to most fully utilize it, and what can an urban or architectural space communicate about itself? The answer is, as indicated above, threefold: we expect a space to communicate its designated function, then who this space belongs to, and finally what we might expect to find beyond our current field of vision. This restriction is necessary and empowering. The failure to delimit the domain of the signified was one of the reasons why the earlier postmodernist semiology could not succeed.

The next preliminary clarification concerns the minimal unit of any meaningful architectural sign or communication. In verbal language, it is the sentence that constitutes the minimal unit of communication. In any prospective architectural language, only a spatially defined territory can function as such a complete sign or minimal unit of meaning. Architectural elements or motifs can only count as incomplete sign radicals that by themselves communicate nothing, but might contribute to the demarcation and characterization of a territory or place. The crossing of a demarcation or threshold implies the entering of a different place and a different (potential) social situation. The spatial distinction implies a social distinction. Only a territory is a full communication, i.e. something that calls for being either accepted or rejected. Boundaries or thresholds mean that crossing over implies entry into a new social situation with new protocols of behaviour and engagement. The same applies to a change in aesthetic atmosphere. Spatial and visual differences should make a social and situational difference, in accordance with Bateson's definition of information as 'differences that make a difference'.10

Each territory is a communication. It communicates an invitation to participate in the framed social situation. To enter the territory implies an acceptance of its spatial communication, and the act of entering thus communicates one's willingness to participate in the respective interaction scenario. Everybody who enters is expected to adopt the behavioural rules implied. That is the point of all signification: the coordination of behaviours facilitating cooperation. The precise characterization of the situation depends on the orchestration of the various semiological registers that come together in the articulated territory: its position in the overall matrix of territories,

10 Gregory Bateson [10].
its spatial shape, its tectonic and material articulation, etc. The articulate territory might thus be designed according to a 'grammar' as a well-formed combination of sign radicals. The build-up of a spatio-visual grammar affords a momentous combinatorial enhancement of architecture's versatility of expression. A small vocabulary might afford a vast number of different communications.

The re-foundation of architectural semiology11 is thus based on three premises or axioms:

1. the domain of the signified is limited to function type, social type and location type;
2. the territory is the minimal unit of signification/communication;
3. architectural semiology must exploit the combinatorial power of grammar.

So far, only the theoretical framework has been elaborated, and a series of academic design studies have been conducted at various universities in recent years, including the Architectural Association School of Architecture in London, the University of Applied Arts in Vienna and Harvard's Graduate School of Design in Cambridge, MA.
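The three axioms suggest a compact formal reading. As a minimal sketch (ours, not drawn from the cited design studies; all type and field names are invented for illustration), the territory as minimal unit of signification can be modelled as a record carrying exactly the three meaning dimensions of axiom 1:

from dataclasses import dataclass
from enum import Enum

class FunctionType(Enum):      # functional information: the interaction type on offer
    WORK = "work"
    MEETING = "meeting"
    CIRCULATION = "circulation"

class SocialType(Enum):        # social information: whose space it is, who to expect
    PRIVATE = "private"
    SEMI_PRIVATE = "semi-private"
    PUBLIC = "public"

@dataclass(frozen=True)
class Territory:
    """Axiom 2: only a spatially defined territory is a complete sign.

    Axiom 1 restricts the signified to exactly three meaning dimensions;
    nothing else can be 'said' by the architectural language.
    """
    function: FunctionType     # functional information
    social: SocialType         # social information
    location: str              # locational information, e.g. position in the matrix of territories

Axiom 3, the grammar, would then be the set of rules governing how such territories, and the sign radicals that articulate them, may combine; the example in the following section illustrates this combinatorics.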
4 Example: The Design of a Spatio-Visual Semiological System

For our academic design research studies, we chose programmes such as a university campus or a corporate campus for tech firms like Google, and innovative work environments in general, as the initial design research arena for the semiological project. These are the high-performance arenas at the most advanced frontier of our civilization, where the new productivity potentials of our post-fordist epoch are explored and pushed forward. Architecture's innovations must be relevant and congenial to these arenas. It is here that the complexity of social processes is most expressed, where communication is most intense and dynamic, where diverse human actions and interactions are integrated in intricate patterns of cooperation rather than just running in parallel. In contrast, retail or residential arenas are much simpler and less integrated, i.e. they accommodate parallel rather than integrated lives and thus, as it were, operate by addition rather than multiplication. The next decision was to privilege interior over exterior spaces. While urban fabrics matter, it is in the interior that the most important and most intricate social interactions reside.

As an example of a semiologically conceived architectural project, we would like to present a recent student project developed by Yihui Wu, Lei Wang and Yanling Xu under the guidance of Patrik Schumacher, Pierandrea Angius and Lei Zheng at the Design Research Laboratory of the Architectural Association School of Architecture in London. The semiological design research project crucially includes the design of a spatio-visual language, i.e. the design of a semiological system or system of signification, with explicit vocabulary and grammar. The particular design project is

11 See: Patrik Schumacher [11], sections 6.5–6.10.
then understood as only one among many possible applications or 'utterances' of the designed language.

All semiological designs must proceed by means of building up two correlated systems of distinctions: the system of signifiers or symbols and the system of signifieds or meanings. This focus on system and distinction is crucial and was first emphasized by the founder of semiology, the Swiss linguist Ferdinand de Saussure. All terms are relative and acquire their meaning only in distinction and relation to the other terms. A language always operates on the basis of a total system of distinctions 'in which all the elements fit together, and in which the value of any one element depends on the simultaneous coexistence of all the others'.12 In the design of a semiological system, the distinctions in the domain of the signifiers are to be correlated with the distinctions in the meaning domain. In the case of architectural semiology, the domain of the signifier is the world of architectural forms, i.e. the spaces and their defining components or properties, and the domain of the signified is the world of varied possible social situations to be accommodated.

The project presented here proposes a work environment for a start-up incubator. In such buildings, very many different social situations must be distinguished, and therefore many different spaces must be differentially characterized. Each space within the total manifold of spaces is defined by a particular combination of aspects. The design of each individual space therefore involves a number of appropriate selections from the articulation options made available by the language via its vocabulary and grammar. The vocabulary is ordered into categories, or substitution classes, or registers of semiological encoding: location, spatial shape, boundary type, colour, material, etc. These registers correspond to meaning dimensions within which the intended aspects of the social situation can be selected: destination versus circulation, business versus socializing, work versus meet, public versus private, allocated versus bookable, etc.

The first distinction we might introduce to explain the designed system is the formal distinction between bound and unbound spaces. To this formal distinction on the signifier side corresponds, on the side of social meaning, the distinction between business spaces and leisure spaces. This formal distinction carries the further functional meaning of destination versus circulation space. Within the bounded business spaces, we introduce the subsidiary distinction of convex versus concave spaces, designed to encode the social distinction between workspaces and meeting spaces (Fig. 17). The posited distinctions, both the distinction bound/unbound and the distinction convex/concave, are rather abstract and therefore allow for a lot of variation in shape and size without infringing on the distinctive meanings preserved across all the possible variations. This is indicated in the small diagrams at the bottom. This abstractness allows for parametric variation, in line with the requirements of parametricism. We might therefore talk about 'parametric semiology'.

12 Ferdinand de Saussure, Course in General Linguistics (French: Cours de linguistique générale), Geneva 1916, p. 113.
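The correlation of the two systems of distinctions can be made concrete as a small decoding table over the registers just listed. The sketch below is our own illustrative formalization (the names are invented), anticipating the boundary and colour distinctions detailed in what follows:

# Two correlated systems of distinctions (signifier -> signified).
SHAPE = {"convex": "meeting", "concave": "working"}       # within bound business spaces
BOUNDARY = {"thick": "private", "dashed": "semi-private", "thin": "public"}
COLOUR = {"grey": "allocated", "white": "bookable"}

def read_space(shape: str, boundary: str, colour: str) -> str:
    """Decode the social situation signified by a bound space's formal properties."""
    return f"{SHAPE[shape]} / {BOUNDARY[boundary]} / {COLOUR[colour]}"

print(read_space("concave", "dashed", "grey"))
# -> working / semi-private / allocated

Reading a space is then a lookup across registers, which is precisely Saussure's point: each term signifies only through its place within the system of distinctions.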
Fig. 17 Dictionary of spatio-visual vocabulary: Two hierarchically ordered distinctions in the register of spatial shape: the spatial distinction bound vs unbound signifies the social distinction business vs leisure. Within the bound spaces the spatial distinction convex vs concave signifies the social distinction meeting vs working
Now we introduce two further distinctions, both of which cut across the previously introduced distinctions and across each other. Both meeting and workspaces might be private, semi-private or public, respectively represented via a thick boundary, a dashed boundary or a thin boundary. Public work/meet areas might be WeWork-style co-working areas. Private areas belong to particular start-up companies. Colour, grey versus white, further encodes allocated versus bookable spaces. The matrix indicates that the choices offered by these distinctions are freely combinable. Of course, it must be checked whether all possible combinations make sense on the meaning side. If not, restrictions on free combination must be introduced. Such restrictions, or their absence, i.e. the regulation of sign-radical combinations, are determined in the grammar of the language (Fig. 18).

Fig. 18 This matrix shows how three distinctions can combine to produce 12 different expressions

We can see here how grammar-based languages operate by multiplication, exploiting the proliferation of expressive power due to the power of combinatorics. The system of three distinctions offers here the expression of 12 different messages. If we were to include the bound/unbound distinction in this game of combination, we would arrive at 24 expressions.

Another aspect of the paradigm of parametric semiology is the readiness to consider that distinctions might be introduced not as strict dichotomies but as a gradient spectrum of options defined by two poles. Formally, this can be achieved by 'inbetweening' or 'morphing' between the two poles of the spectrum. However, this operation only makes sense if we can meaningfully conceive of a corresponding gradient in the domain of social meaning. Within contemporary work dynamics, such a spectrum of situations between a clear-cut meeting situation, on the one hand, and a clear-cut concentrated individual work situation, on the other, could indeed make sense. If we assume a gradation into eight grades and combine these eight choices with the two new distinctions introduced above, we arrive at 96 choices (Fig. 19).
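The multiplicative arithmetic can be checked directly. The following sketch (ours, purely illustrative) enumerates the expression space; the figure of 96 is reached if the bound/unbound distinction is counted alongside the eight shape grades (8 × 3 × 2 × 2 = 96):

from itertools import product

shape = ["convex (meeting)", "concave (working)"]
boundary = ["thick (private)", "dashed (semi-private)", "thin (public)"]
colour = ["grey (allocated)", "white (bookable)"]

# Three distinctions multiply: 2 x 3 x 2 = 12 distinct messages.
print(len(list(product(shape, boundary, colour))))             # 12

# Including the bound/unbound distinction doubles the space: 24.
bound = ["bound (business)", "unbound (leisure)"]
print(len(bound) * len(shape) * len(boundary) * len(colour))   # 24

# Parametric semiology: the convex/concave dichotomy becomes a spectrum
# of eight grades morphing between the two poles.
grades = [i / 7 for i in range(8)]   # 0.0 = fully convex ... 1.0 = fully concave
print(len(grades) * len(boundary) * len(colour) * len(bound))  # 96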
Fig. 19 Parametric semiology: The dichotomy of convex vs concave is transformed into a continuous spectrum of shapes defined between the two poles with a continuum of shapes that are more or less convex or concave
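One way such a continuous spectrum can be parametrized is by morphing a closed boundary curve between a strictly convex pole and a lobed, concave pole. The sketch below is purely illustrative; the lobe count and modulation depth are arbitrary choices, not values from the project:

```python
import numpy as np

def boundary(t, n_lobes=5, depth=0.35, samples=200):
    """Closed boundary curve morphing from convex (t=0) to concave (t=1).

    At t=0 the radius is constant (a circle, strictly convex); as t grows,
    inward-curving lobes appear and the shape reads as concave.
    """
    theta = np.linspace(0.0, 2.0 * np.pi, samples)
    r = 1.0 + t * depth * np.cos(n_lobes * theta)   # radial lobe modulation
    return r * np.cos(theta), r * np.sin(theta)

# Eight grades between the two poles, as discussed in the text
grades = [boundary(t) for t in np.linspace(0.0, 1.0, 8)]
```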
Fig. 20 Grammar of nesting spaces: The vocabulary of concave and convex spaces lends itself to the organization of efficient aggregations. Nesting also indicates that spaces belong together
It is therefore required to design, and thus decide on, the syntactic rules for combining signs and the related semantic rules about how the combination of sign radicals or signs should be read to determine the overall meaning of the combined signs. It can be expected that the downstream combination of the designed vocabulary might create problems, thus constraining and feeding back into the design of the vocabulary. The diagrams here test the possibility of combining work and meeting spaces and show that the convex meeting spaces can and should nestle into the concave niches provided by the working spaces. Adjacent workspaces are further expected to interlock similarly. We can thus posit the grammatical rule of nesting spaces (Fig. 20).

Within contemporary architecture, since deconstructivism, it has been possible and often advantageous to allow territories to overlap. As social complexity and communicative intensity increase, it becomes increasingly problematic to restrict the spatial organization of social processes to neatly separated zones. Rather than jumping from this order of zoning to a disordered condition where everything mixes everywhere, it is advantageous to allow for the determinate allocation and articulation of zones where particular social functions and their respective territories can overlap. The accommodation of this possibility is a recurring feature of parametric semiology. There is no guarantee that any given vocabulary can support the meaningful and coherent articulation of overlap conditions. The vocabulary introduced here does allow for the articulation of overlap while maintaining a coherent system of signification. The overlap of two concave, amoeba-like workspaces might generate either a new concave workspace, i.e. a dedicated collaboration zone where two work groups might collaborate, or a new convex space: a shared meeting space that then conspicuously belongs to both workspaces that overlap there.
Fig. 21 Grammar of overlap conditions: Two concave work zones generate a new concave workspace which belongs to both. Alternatively they might generate and overlap in a shared convex meeting space. A work zone and a meeting zone can only generate a new meeting space. These results make sense and are coherent with the initial definitions
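The overlap grammar summarized in Fig. 21 is small enough to be written out as an explicit rule table. A minimal sketch follows, with names of my own choosing; the reading of two overlapping meeting zones is my extrapolation, not stated in the text:

```python
WORK, MEET = "concave work zone", "convex meeting zone"

def overlap(a, b):
    """Possible readings of the zone where two spaces overlap (after Fig. 21)."""
    if a == WORK and b == WORK:
        # Two work zones: a shared collaboration (work) zone,
        # or a shared convex meeting space belonging to both
        return {WORK, MEET}
    # Any overlap involving a meeting zone can only yield a meeting space
    return {MEET}

assert overlap(WORK, WORK) == {WORK, MEET}
assert overlap(WORK, MEET) == {MEET}
assert overlap(MEET, WORK) == {MEET}
```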
The overlap between a concave work zone and a convex meeting zone generates a new convex meeting space. This can make sense when a meeting space that belongs to a particular work group is at the same time characterized as belonging to a larger meeting zone (Fig. 21).

The next move in building up the complexity of the semiological system tackles the initial distinction between bound and unbound spaces. This distinction is a dichotomy with a clear-cut criterion: a bounded space is enclosed by a boundary, while the unbounded space is the continuous space that flows between these bounded spaces. This criterion still holds in the case of a long, meandering boundary line, as long as it eventually closes the loop. However, this distinction can be made ambiguous and turned into an ordered spectrum distinguishing degrees of boundedness. On one side of the spectrum, the distinction is clear and crisp, as bounded spaces are indeed always fully closed. However, spaces might be defined that are very similar to fully bounded spaces, where the boundary loop nearly closes but leaves a small gap. These gaps might gradually increase, and the distinction between bounded and unbound spaces becomes increasingly blurred. On the meaning side, this can be interpreted as follows: the distinction between business spaces for formal collaboration and spaces for socializing is correspondingly blurred. A field that offers many such in-between situations might be desirable and conducive to informal communication and collaboration (Fig. 22).
Fig. 22 Gradient field condition: The dichotomy of bound vs unbound spaces is gradually dissolved moving from west (left) to east (right). The distinction between work and meet is being maintained as the distinction between business and socializing communication is being increasingly blurred
The project utilizes this possibility offered by the semiological system. In particular, it arranges the spaces in such a way that an east-to-west vector of gradual transformation emerges, whereby the distinction between bound and unbound spaces, very crisp at the western end, gradually dissolves as we move eastwards. This implies that the morphological encoding of the formal/informal business difference is redundantly over-coded by the locational encoding. Such redundancies are a useful option in cases where messages deserve to be reinforced because they might otherwise be overlooked, or where messages are important and thus deserve emphasis via duplication. The model displays a further redundancy: the gradient from the zone with the crisp bound/unbound dichotomy to the blurred condition is once more reinforced by a gradient treatment of the contrastive colour distinction, which gradually disappears as we move into the blurred condition. In turn, the ordered gradient spectrum also offers locational information and thus operates as an effective navigation aid (Figs. 23 and 24).

In the further detailing of the three-dimensional articulation of the scheme, the boundary lines themselves blur, turning from platform edges and walls more and more into relief-like stepping edges as we move into the blurred zone and as the closure of these lines diminishes. Users thus gain multiple local morphological clues that tell them where in the spectrum from formal to informal they are located. As the detailed model shows, the distinction between working and meeting zones can also, to some extent, be maintained in the informal, blurred zones. Finally, the semiological system also encompasses, and makes semiological use of, the furniture as well as the ceiling and lighting design.
Fig. 23 The field three-dimensionally articulated and furnished. The bound–unbound west–east gradient is redundantly over-coded via the gradual dissolution of the light–dark colour contrast
Fig. 24 Detailed three-dimensional articulation of the blurred end of the gradient spectrum. The boundary itself is dissolving while the distinction between working and meeting holds fast
The design succeeds in reusing the correlation of the convex/concave distinction with the meet/work distinction at the furniture scale. The ability to reuse distinctions in this way is elegant and advantageous, and makes the overall language cognitively less taxing. The choice of forms has also been motivated here by considerations of pragmatic social functionality: convex tables and convex spaces make sense as meeting tables and spaces, while, serendipitously, the opposite, concave or convoluted, meandering form is viable for both work tables and work zones (Figs. 25, 26 and 27). This example shows that while signifiers, with respect to their communication function, can in principle be chosen arbitrarily, this is not always possible or advisable in architectural semiology. Here the domain of the signifier is indeed often pre-constrained to the range of forms that are pragmatically viable for the purposes they are to signify.13 What we further witnessed here is that the relative arbitrariness, or degrees of freedom, that exists at the outset when starting to design a semiological system becomes increasingly constrained by what has already been stipulated.14

13 More elaborate reflections on this point can be found in the chapter 'Extra-Semiological Demands on Architecture's Medial Substrate' in: Patrik Schumacher, The Autopoiesis of Architecture, Vol. 2, Chapter 6.8.6.
14 More elaborate reflections on this point can be found in the chapter 'Absolute and Relative Arbitrariness' in: Patrik Schumacher, The Autopoiesis of Architecture, Vol. 2, Chapter 6.9.3.
Fig. 25 The distinction of meet/work via the difference convex/concave is being re-applied on the level and scale of the furniture
Fig. 26 The crisp, formal, dichotomous end of the space
Fig. 27 The blurred, informal end of the space in perspective
We should expect that architectural codes, just like verbal languages, expand their vocabulary by analogical extension rather than by arbitrary invention. This example is meant to illustrate how a relatively complex and nuanced semiological project can be elaborated step-wise on the basis of an integrated, systematic series of simple and intuitive form-function or form-meaning correlations. It serves here as an initial illustrative glimpse into the ambitious, transformative project of a generalized, all-encompassing, increasingly information-rich and nuanced language of the built environment and world of artefacts.

We are living in a communication society where nearly all work has become communication work, where all physical work has been delegated to machines, and where indeed all activities are about communication. During the last 150 years, channels and media of communication have proliferated, starting with print media and telegraphy, moving on to telephone, radio and television, and culminating in the explosive communication revolution delivered by the Internet. All these advances have delivered crucial expansions and intensifications of social communication and have thus crucially enhanced the vital integration of society, affording us new material and social freedoms via their increasing productive power. The full utilization of the total built environment and world of artefacts as a radically enriched, multi-modal, intuitive medium of communication would be a fascinating and ambitious expansion of our discipline's remit, congenial to contemporary communication society. This project could deliver a further enhancement of society's integration and potency. Architecture could thus contribute another unique step towards an advanced, strange, but ultimately liberating, cyborg super-society.
References

1. Halacy, D. S. (1965). Cyborg: Evolution of the superman. New York: Harper and Row Publishers.
2. Haraway, D. (1985). A cyborg manifesto—Science, technology and socialist feminism.
3. Clark, A., & Chalmers, D. (1998). The extended mind. Analysis, 58, 7–19.
4. Clark, A. (2003). Natural-born cyborgs: Minds, technologies, and the future of human intelligence. Oxford University Press.
5. Smith, A. (1776). An inquiry into the nature and causes of the wealth of nations.
6. Wittgenstein, L. (1953). Philosophical investigations.
7. Dennett, D. (2017). From bacteria to Bach and back: The evolution of minds (p. 281). Penguin Books.
8. McLuhan, M. (1962). The Gutenberg galaxy: The making of typographic man.
9. McLuhan, M. (1964). Understanding media: The extensions of man.
10. Bateson, G. (1972). Steps to an ecology of mind (p. 459). Chicago: University of Chicago Press.
11. Schumacher, P. (2012). The autopoiesis of architecture, Vol. 2—A new agenda of architecture. London: Wiley & Sons.
Do Robots Dream of Digital Buildings?

Neil Leach
The movie Blade Runner (1982), directed by Ridley Scott, is based on the novel by Philip K. Dick, Do Androids Dream of Electric Sheep? It depicts a dystopian future world involving "replicants"—bio-engineered robots—manufactured by the Tyrell Corporation to have superhuman abilities so that they can survive in the hostile conditions of off-world colonies. Replicants are therefore potentially dangerous and, as a safety measure, are given a limited life span of four years. In the movie, a group of six replicants return to earth in a bid to extend their lives. Rick Deckard, played by Harrison Ford, is a "blade runner," a kind of policeman/bounty hunter charged with hunting down and "retiring"—killing—these replicants. The problem, however, is that replicants—especially the advanced Nexus-6 model—look almost identical to human beings, and can only be distinguished by using the elaborate "Voight-Kampff" test, designed to check whether their emotional responses and eye reflexes meet the standard of human beings. The movie is thus primarily about the difference between human beings and replicants, such that replicants become a mirror in which to understand what it is to be human. As the late Rutger Hauer, who plays Roy Batty, the leader of the replicants (Fig. 1), comments, "In many ways, Blade Runner wasn't about the replicants. It was about what does it mean to be human."1

Fast forward to 2019—the year in which Blade Runner is set—and it is worth reflecting on how prescient the movie has proved to be. We don't have replicants infiltrating society, but we do have AI personal assistants—Siri, Alexa, and Google Assistant—colonizing our everyday lives, and we do have AI filtering our spam and performing other tasks on our cell phones. We don't have flying cars, but we do have Maglev trains, drones, and self-driving cars. We don't have the Tyrell Corporation, but corporate life is dominated nonetheless by hi-tech companies, such as Google, Amazon, Apple, and Microsoft.

1 Roxborough [1].
Fig. 1 Refik Anadol, hallucinated building image based on a database of designs by ZHA
And, as predicted in Blade Runner, we do talk to our computers, and we do have LED advertising all over our buildings, especially in cities like Shanghai. Clearly Blade Runner has proved to be highly prescient.

Although replicants are not necessarily controlled by AI, they are clearly an artificial life form endowed with some kind of "intelligence." They therefore make a productive vehicle by which to introduce the topic of AI, especially in the context of debates about human intelligence. The title of the novel, Do Androids Dream of Electric Sheep?, can also be extended to raise the interesting question as to whether AI can not only dream, but also design. Of course, "dreaming" is not the same as "designing," in that dreaming—at least according to some commentators—refers to a bottom-up process of unleashing repressed ideas, whereas designing must also entail some measure of top-down control.2 Nonetheless, it provides a useful starting point for a discussion about AI and creativity.

2 Freud [2].
1 Can AI Dream?

It is generally assumed that computers cannot be creative. This is the conclusion, at any rate, reached by the Japanese computational architect Makoto Sei Watanabe, writing about the potential applications of Artificial Intelligence (AI) in the field of design: "Machines are better than people at solving complex problems with many intertwined conditions. In that realm, people are no match for machines. But people are the only ones who can create an image that does not yet exist. Machines do not have dreams."3

The term "dream," however, has been used—albeit metaphorically—in connection with a technique, DeepDream, discovered by Alex Mordvintsev of Google Artists and Machine Intelligence [AMI] while analyzing the operations at work in the process of recognition using artificial neural networks. Mordvintsev found that he was able to generate images by reversing the flow of information in a neural network. Artificial neural networks are often used to recognize and classify images. They are used, for example, to recognize faces on Facebook and to classify images on Instagram, and are often based on a class of deep neural networks known as Convolutional Neural Networks [CNNs]. Generally speaking, a network consists of 10–30 stacked layers of artificial neurons. With DeepDream, however, the neural network operates in the opposite direction. Instead of recognizing an image and assigning it a category, DeepDream starts with a category and proceeds to generate an image. For example, whereas a standard neural network can recognize an image of a bird and categorize it as a "bird," DeepDream is able to start with the category "bird" and generate an image of a bird. Thus, instead of operating "from image to media," DeepDream operates "from media to image."

But how exactly can a neural network typically used for recognizing images also be used to generate—or "synthesize"—images? Importantly, although computational neural networks are trained to discriminate between images, it is essential that they have some understanding of those images in order to distinguish them. This allows them to work in reverse and generate images, instead of merely categorizing them. However, the process of inverting the operation of a neural net produces somewhat "trippy" images that appear vaguely surrealistic, with a multiplicity of objects generated in a variety of poses (Fig. 2). These generated images are referred to as "hallucinations." Technically, however, they cannot strictly be hallucinations as such, in that true hallucinations are phantasms or figments of the imagination, composed of images that do not actually exist.4 These "hallucinations," by contrast, are a form of overly strong perceptual prediction of what is there already, and are generated by "maximizing the activation of the semantic neurons at the end of a recognition network."5
3 Watanabe [3]. 4 Sacks [4]. 5 y Arcas [5].
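In code, this reversal amounts to gradient ascent on the input image rather than gradient descent on the network's weights. A minimal sketch of the core loop in PyTorch follows; torchvision's pretrained VGG16 stands in for a generic recognition network, the layer index, step count, and step size are arbitrary illustrative values, and the loss here amplifies a whole layer's activations rather than a single "semantic neuron":

```python
import torch
from torchvision import models

# A pretrained recognition network, frozen: we optimize the image, not the weights
net = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features.eval()
for p in net.parameters():
    p.requires_grad_(False)

def deep_dream(img, layer_idx=20, steps=30, lr=0.05):
    """Amplify whatever the chosen layer already 'sees' in the image."""
    img = img.clone().requires_grad_(True)
    for _ in range(steps):
        activation = net[:layer_idx](img)        # forward pass up to the layer
        loss = activation.norm()                 # gradient *ascent* target
        loss.backward()
        with torch.no_grad():
            img += lr * img.grad / (img.grad.abs().mean() + 1e-8)
            img.grad.zero_()
    return img.detach()

dream = deep_dream(torch.rand(1, 3, 224, 224))   # start from noise, or a photo
```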
Fig. 2 50 iterations of DeepDream trained to perceive dogs, Martin Thoma (2015)
2 Reality as a Controlled Hallucination

It is perhaps not insignificant that, according to the neuroscientist Anil Seth, perception is a form of perceptual prediction, and is actively generated—"hallucinated"—on the part of the viewer. Thus, perception becomes a form of "controlled hallucination": "If hallucination is a kind of uncontrolled perception, then perception... is also a kind of hallucination, but a controlled hallucination in which the brain's predictions are being reined in by sensory information from the world."6 As Seth notes, instead of passively recognizing an object, the brain actively partakes in trying to make sense of what it is sensing through a form of perceptual prediction: "Instead of perception depending largely on signals coming into the brain from the outside world, it depends as much, if not more, on perceptual predictions flowing in the opposite direction. We don't just passively perceive the world, we actively generate it. The world we experience comes as much, if not more, from the inside out as from the outside in."7 Seth goes on to illustrate this with an algorithm based on Google's DeepDream that simulates the effects of an overly strong perceptual prediction: "When perceptual predictions are too strong, as they are here, the result looks very much like the kinds of hallucinations people might report in altered states, or perhaps even in psychosis."8 Seth's comments about altered states are highly relevant, in that research has shown that altered states can also result from a combination of top-down and bottom-up processes, on which computational hallucinations are based.

6 Seth [6]. 7 Seth [6]. 8 Seth [6].
In fact, Seth uses the "hallucinations" of DeepDream to illustrate how the brain itself works. Others, such as Demis Hassabis, go further: "The brain is just a computer like any other... Traits previously considered innate to humans—imagination, creativity and even consciousness—may be just the equivalent of software programs."9 Whether or not the brain is actually computational, there are clear parallels between the operations of AI and the operations of the brain.
3 Generative Adversarial Networks (GANs)

The next step in the development of generative AI came with the introduction of Generative Adversarial Networks (GANs), a technique first proposed by Ian Goodfellow in 2014, which has undergone a process of rapid development.10 GANs represent a significant step forward in the quest to "synthesize" images, in that the issue of invariance to "pose" is no longer a problem, and the images generated have significantly better resolution. GANs are based on a competition between two neural networks. There is always a bottom-up generator—or "artist"—(typically a deconvolutional neural network) that generates images, and a top-down discriminator—or "critic"—(typically a convolutional neural network) that evaluates those images. In the competition, the generator attempts to fool the discriminator by producing images so realistic that the discriminator is unable to distinguish them from a real dataset. Effectively, GANs are a way of training a computer to perform complex tasks through a generative process. "[A] Generator (an artist) generates an image. The Generator does not know anything about the real images and learns by interacting with the Discriminator. Discriminator (an art critic) determines whether an object is 'real' or 'fake'... The Generator keeps creating new images and refining its process until the Discriminator can no longer tell the difference between the generated images and the real training images."11 The two work in tandem, so that the "artist" trains the "critic," and the "critic" trains the "artist." Once the "artist" has been trained, the "critic" can be removed. Although GANs have been used primarily to generate faces, they have also been used for clothing, shoes, industrial design objects, and even interiors and exteriors of buildings (Fig. 3).

A number of variations of GANs have now been developed. A recent variation, Progressive Growing of GANs (ProGANs), has improved the speed and stability of the training process by starting with a very low-resolution image and increasing the resolution progressively with every layer. "The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses."12

9 Ahmed [7].
10 Goodfellow et al. [8]. 11 Nayak [9]. 12 Karras et al. [10].
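The artist/critic competition can be stated compactly as two alternating optimization steps. Below is a minimal sketch of one training round, assuming a generator G, a discriminator D ending in a sigmoid, and their optimizers have already been defined elsewhere; all names here are illustrative:

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_G, opt_D, real, z_dim=128):
    """One adversarial round: the critic learns to separate real from fake,
    then the artist learns to fool the critic."""
    z = torch.randn(real.size(0), z_dim)

    # Critic step: real images -> 1, generated images -> 0
    pred_real = D(real)
    pred_fake = D(G(z).detach())                 # detach: don't update G here
    d_loss = (F.binary_cross_entropy(pred_real, torch.ones_like(pred_real))
              + F.binary_cross_entropy(pred_fake, torch.zeros_like(pred_fake)))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Artist step: push the critic's verdict on fresh fakes toward 'real'
    pred = D(G(torch.randn(real.size(0), z_dim)))
    g_loss = F.binary_cross_entropy(pred, torch.ones_like(pred))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()

    return d_loss.item(), g_loss.item()
```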
Fig. 3 Images of faces generated by Progressive GANs
This progressive growing allows the whole process to produce some highly convincing faces of non-existent people, with far greater realism than previously achieved. The results, while still not perfect, show that true photorealism is not far away.

Likewise, a further version of GANs has been developed to produce Creative Adversarial Networks [CANs]. CANs have been used, in particular, to generate art: "The system generates art by looking at art and learning about style; and becomes creative by increasing the arousal potential of the generated art by deviating from the learned styles."13 The intention here is to open up the range of creative possibilities "by maximizing deviation from established styles and minimizing deviation from art distribution."14 CANs do this by exchanging the critic (or discerner), whose role is ordinarily to ensure likeness, for an agent that produces novelty. As a result, the "artist" draws upon a database of established "art," and the CAN then introduces variations on a theme. Interestingly, it was found that human beings were unable to distinguish "art" created by CANs from art created by artists. As such, it would appear that CANs are capable of passing the Turing Test. In 2018, a portrait of Edmond de Belamy, generated using Creative Adversarial Networks [CANs] by the Paris-based art collective Obvious, was auctioned at Christie's for the remarkable sum of $432,500, almost 45 times the initial estimate, becoming the first AI-generated art work to be sold at auction (Fig. 4).15

A further version, StyleGAN, is the latest in this line of development and the most sophisticated GAN system to date. StyleGAN is an open-source style-based generator architecture that borrows from the style transfer literature to produce images, and offers remarkable improvements in terms of resolution and quality.

13 Elgammal et al. [11]. 14 Elgammal et al. [11]. 15 Is artificial intelligence set to become art's next medium? https://www.christies.com/features/A-collaboration-between-two-artists-one-human-one-a-machine-9332-1.aspx.
Fig. 4 Edmond de Belamy (2018), a CAN portrait generated by the Paris-based art collective Obvious, and auctioned at Christie's in 2018
In particular, it overcomes the problem of "entanglement," whereby any slight tweak or amendment to one feature would have a knock-on effect on other features, by reducing the correlation between different features.16 As a result, it generates faces so convincing that it is often almost impossible to tell the difference between the generated artificial faces and those of real human beings. Perhaps the most significant improvement, however, is that images can simply be fed in, and do not need to be tagged or classified as with previous versions of GANs. This saves considerable time. As the authors note: "The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis."17
16 Horev [12]. 17 Karras et al. [13].
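In caricature, the "style-based" part of the architecture is a mapping network that turns the latent z into an intermediate latent w, which then modulates every layer of the synthesis network, while per-layer noise supplies stochastic detail; this separation of signals is what loosens the entanglement. The following toy module illustrates only that data flow and is in no way the real StyleGAN code:

```python
import torch
import torch.nn as nn

class ToyStyleGenerator(nn.Module):
    """Caricature of a style-based generator: z -> w via a mapping network,
    then w modulates each synthesis layer; per-layer noise adds fine detail."""
    def __init__(self, z_dim=64, w_dim=64, channels=32, layers=4):
        super().__init__()
        self.mapping = nn.Sequential(            # mapping network: z -> w
            nn.Linear(z_dim, w_dim), nn.ReLU(),
            nn.Linear(w_dim, w_dim))
        self.styles = nn.ModuleList(
            [nn.Linear(w_dim, channels) for _ in range(layers)])
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels, channels, 3, padding=1) for _ in range(layers)])

    def forward(self, z, x):
        w = self.mapping(z)                      # intermediate latent
        for style, conv in zip(self.styles, self.convs):
            s = style(w)[..., None, None]        # per-layer style signal
            x = torch.relu(conv(x) * (1 + s))    # crude channel-wise modulation
            x = x + 0.1 * torch.randn_like(x)    # stochastic per-layer noise
        return x

g = ToyStyleGenerator()
out = g(torch.randn(1, 64), torch.randn(1, 32, 8, 8))
```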
4 Architectural Applications of GANs

One of the first applications of GANs to an architectural project was a projection onto the Walt Disney Concert Hall [WDCH] by the media artist Refik Anadol, in collaboration with Google AMI, the computational sound consultant Parag Mital, and others, commissioned to mark the 100th anniversary of the LA Philharmonic in October 2018. "When we dream, our minds process memories to form new combinations of images and ideas" (Google Arts and Culture). This, at any rate, is the premise behind the projection. The intention was to give "consciousness" to the WDCH in three distinct stages. The first stage of the projection—"Memory"—involved the "training" of a neural network by classifying thousands of images and sound recordings of events from the archives of the LA Philharmonic. These were then processed by the network, and the results projected onto the surface of the WDCH. The second stage—"Consciousness"—involved the use of a neural network to categorize and rearrange the material by finding similarities within it. The third stage—"Dreaming"—was probably the most spectacular. This was when the full "hallucination" took place (Fig. 5).

More recently, Anadol has explored the use of StyleGANs as a means of "hallucinating" an animated movie from which still images of fictional buildings can be extracted, drawing upon a database of thousands of photographs of more progressive buildings, designed by architects such as Gehry and Partners, Zaha Hadid Architects, Toyo Ito, Tadao Ando, and so on. This work raises the interesting question as to whether StyleGANs might be able to generate designs that are even more experimental.
Fig. 5 Refik Anadol with Google AMI, WDCH projection, October 2019, “Dream” section
Indeed, it might equally be possible to use alternative data, such as natural forms like flora and fauna—orchids, perhaps—or even rocks and landscapes, to "hallucinate" novel hybrid or cross-pollinated forms.

Xkool, an AI start-up in Shenzhen, China, whose CEO is Wanyu He, has probably engaged in the most sophisticated application of AI to architectural design to date. The intention is to streamline the design process and make it more efficient and creative by using deep learning not only to search through a vast range of possibilities, but also to generate designs automatically from the trained models, and then to evaluate and return the outcomes from various evaluation models. Xkool has also used StyleGANs and other AI techniques to generate images of buildings that do not exist. On the whole, these buildings are more traditional than those generated by Anadol, and are largely modernist in their aesthetic, reflecting the content of the database of images used. These images also reveal how much depends on the quantity of data: they draw upon a larger database, and are therefore somewhat clearer and more detailed than those generated by Anadol. However, they do not yet reach the same standard as the faces generated by StyleGANs, and at present there are still too many tell-tale glitches giving away the fact that they are not images of actual buildings (Figs. 6 and 7).
Fig. 6 Xkool, hallucinated building images, 2019
Fig. 7 Xkool, hallucinated building images, 2019
There are now literally hundreds of versions of GANs. Apart from ArchiGAN, the version developed by Stanislas Chaillou and described in this volume, CycleGAN has also proved popular with architects. This works with unpaired datasets and allows for cross-domain transfers between set A and set B. The network has to decide how to transfer concepts from dataset A to concepts from dataset B, working not on a stylistic level, but looking for the main features. Usually a GAN has one generator and one discriminator, but a CycleGAN has two generators and two discriminators. The advantage of a CycleGAN is that it avoids the possibility of "mode collapse," where the generator is not generating anything new, but the discriminator cannot complain about the output as the result is "true."18 It is therefore capable of learning the special characteristics of one image dataset and of figuring out how those characteristics might be translated into the other image dataset. And, importantly, it does this without having paired training examples (Figs. 8 and 9).
18 Instead, CycleGAN will learn translations from dataset A to dataset B, but it will have to prove that it is able to reconstruct the translation from dataset B to dataset A.
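That "proof of reconstruction" is the cycle-consistency loss: a translation A→B only counts if it can be mapped back to something close to the original A. A minimal sketch of the combined objective follows, assuming two generators (G_AB, G_BA) and two discriminators (D_A, D_B) already exist; the names and weighting are illustrative, and a full implementation would also train the discriminators separately:

```python
import torch
import torch.nn.functional as F

def cyclegan_loss(G_AB, G_BA, D_A, D_B, real_a, real_b, lam=10.0):
    """CycleGAN objective in outline: adversarial terms for both directions,
    plus cycle terms forcing A->B->A (and B->A->B) to return home."""
    fake_b = G_AB(real_a)
    fake_a = G_BA(real_b)

    # Adversarial terms: each generator tries to fool its discriminator
    pb, pa = D_B(fake_b), D_A(fake_a)
    adv = (F.mse_loss(pb, torch.ones_like(pb))
           + F.mse_loss(pa, torch.ones_like(pa)))

    # Cycle-consistency: reconstructing the input proves the translation
    cyc = (F.l1_loss(G_BA(fake_b), real_a)
           + F.l1_loss(G_AB(fake_a), real_b))

    return adv + lam * cyc
```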
Figs. 8–10 Daniel Bolojan, Deep Himmelblau
Figs. 11–13 Daniel Bolojan, Gaudi’s Hallucination
Daniel Bolojan has been responsible for using CycleGANs to "hallucinate" potential buildings for Coop Himmelb(l)au based on two unpaired datasets of images. In this case, dataset A is based on reference images of geomorphic formations, and dataset B is based on actual Coop Himmelb(l)au projects. The outcome is a video of a journey through an imaginary landscape of Coop Himmelb(l)au-like building forms. The important point to stress is that these buildings do not actually exist. They are merely "machine hallucinations." Bolojan has also been responsible for developing "Gaudi's Hallucinations," based on a workshop that he ran with Emmanouil Vermisso at eCAADe SIGraDi in 2019. Here he again used CycleGANs, but this time with one dataset consisting of images of the interior of the Sagrada Familia church in Barcelona, Spain, and one consisting of images of walking through a forest.

Although GANs can already generate some startlingly innovative images, a number of constraints are holding back their development. Firstly, the use of StyleGANs is constrained by the power of the Graphics Processing Units (GPUs) available. Anadol, for example, would not have been able to generate images so quickly without the support of extremely powerful GPUs provided by NVIDIA. Secondly, it should be borne in mind that StyleGAN has so far been restricted largely to the generation of 2D images. In order to generate truly convincing "designs," it would be necessary to move beyond the domain of two dimensions and operate in three. Thirdly, it would be necessary to deepen the level of information, to incorporate more highly refined details, and to engage with performative constraints. Since GANs are already in use for the purposes of optimization, some aspects of the introduction of performative constraints should be fairly straightforward. Moreover, the introduction of KoolPlan, an AI assistant for floor plan and facade design which builds upon the logic of StyleGANs, is already helping to increase the refinement of the detailing.
Meanwhile, Autodesk is continuing to explore the potential use of AI in structural optimization through Project Dreamweaver, automation in design through Fusion 360, and even AI-based software for risk assessment in the construction industry.19 Indeed, given the logic of Moore's Law, it is surely safe to assume that the process of development might even accelerate, and that a robust AI-based technique for generating 3D architectural drawings will be available within the next few years.
5 Can AI Be Creative?

Can AI be creative? In some senses, the answer would appear to be clearly "yes." Take the example of the highly novel Move 37 made by AlphaGo in the second game of its match against Lee Sedol. As Fan Hui comments, "When AlphaGo chose that move, I assumed that it had made a mistake. I immediately looked to see Lee Sedol's reaction. At first, he seemed to smile—as though he too thought it had made a mistake—but as the minutes rolled by it was clear that he was starting to realize its brilliance. In fact, after the match, he said that when he saw this move he finally realised that AlphaGo was creative."20 If, as Richard and Daniel Susskind note, AlphaGo could come up with such an original move, and it can be viewed as creative, does this not suggest that creativity amounts to little more than imagining a range of already available options that have not been thought about before?21 In other words, could we even say that being "creative" is little more than searching for the unexpected, or merely proposing that which has not been proposed before? As the Susskinds note: "Contrary to widespread belief, machines are now capable of generating novel outcomes, entirely beyond the contemplation of their original human designers."22

In order to answer the question more comprehensively, however, we would need to understand the nature of creativity in more detail. According to Margaret Boden, "Creativity can be defined as the ability to generate novel, and valuable, ideas."23 For Boden, "valuable" has many meanings: "interesting, useful, beautiful, simple, richly complex and so on." Meanwhile, "ideas" could take many forms, from ideas as such—"concepts, theories, interpretations, stories"—to artifacts such as "graphic images, sculptures, houses and jet engines."24 Boden notes that there is nothing magical about creativity. Although we have come to think of creativity as a mysterious process that often depends on the "black box" of intuition, this is simply because most people do not understand how it operates.

19 Rajagopal et al. [14]. 20 Hassabis and Hui [15]. 21 Susskind and Susskind [16]. 22 Susskind and Susskind [17]. 23 Boden [18]. 24 Boden [18].
In this sense, it is similar to magic: magicians do not actually perform magic; they simply conceal the processes that actually happen, so that the audience comes to attribute the results to magic.25 Nor does creativity need to operate at a high level, and Boden offers examples of some relatively banal jokes generated by AI that nonetheless display a low level of creativity, such as "What do you call a strange market? A bizarre bazaar."26

But the corollary to questioning whether computers can be creative is to question whether human beings are really so creative. It could be argued that architectural design is based on a "canon" of great designs, and that all designers need to stick closely to that canon. A design that looks like a pineapple, for example, would be clearly unacceptable, while one that looks similar—but not identical—to work by Rem Koolhaas, Zaha Hadid, etc. would be considered acceptable. Moreover, human beings tend to have their individual "signatures"—in the way they walk, talk, and indeed design—such that they are often constrained in the design moves that they make, much like pieces on a chessboard. In short, is not human creativity, as has been claimed, something of a "myth"?27
6 Conclusion

And so do robots dream of digital sheep? Or—to be more precise—can AI dream of digital buildings? Is Watanabe correct to claim that humans are the only ones who can create an image that does not exist, and that machines do not have dreams?28 Let us start by stating the obvious: none of the faces or buildings generated by StyleGAN actually exists. In terms of dreams, however, what becomes clear is that the word "dream" appears in many of the AI applications and projects cited above, whether it be "DeepDream," "Dreamweaver," or "WDCH Dreams." Clearly, in one sense, computers are able to "dream" not just of sheep but also of buildings through the use of various forms of GANs. However, the term "dream"—along with the terms "memory" and "hallucination"—should not be taken literally. They need to appear in inverted commas. Until computers have consciousness, they cannot literally dream. In this sense, Watanabe is correct.

But can AI really be as creative as human beings? For Boden, this is a philosophical question unanswerable for now, as many highly contentious philosophical issues would need to be resolved.29 For Boden, however, the key issue is not whether AI can match human creativity, but how AI has helped us to understand human intelligence.

25 Leach [19]; Boden [18]. 26 Boden [18]. 27 King et al. [20]. 28 Watanabe [3]. 29 Boden [18].
As Boden puts it, "Thanks in part to AI, we have already begun to understand what sort of phenomenon creativity is."30

Certainly, there are a number of specific questions about human creativity that we might pose in the light of what we know about AI. Could the process of classifying objects to form a database in machine learning be similar to how children themselves are taught to identify objects—"this is a cow," "this is a horse," and so on? Could we even surmise that the teaching of architectural design is based on a similar principle, since architectural design is not taught as an overarching theory, but rather by way of examples: "this is a good design by Le Corbusier," "this is a good design by Rem Koolhaas," and so on? Could we even claim that the architectural design process consists of a directed search of latent space based on a database of such examples, not dissimilar to how StyleGANs work? Does the reversal in the direction of operation of a neural network, pioneered with the development of DeepDream, suggest that the process of interpretation is—in some senses—the opposite of the process of creativity? If so, might this help to explain why architectural theorists/critics tend not to be so creative, while architectural designers tend to be less theoretical?

Blade Runner, as has been noted, has proved to be extraordinarily prescient in predicting the future. But are there not also uncanny parallels between Blade Runner and the operations of AI? Does the victory at chess of the replicant Roy Batty over Dr. Eldon Tyrell not foreshadow the moment when Deep Blue, IBM's supercomputer, beat the then leading chess player Garry Kasparov in 1997? Does not the way in which replicants are almost indistinguishable from human beings echo how StyleGAN-generated faces are almost indistinguishable from human faces? And does not the Voight-Kampff test, used to distinguish replicants from human beings, echo the Turing test, used to establish whether the operations of a computer might be indistinguishable from those of a human being?

There is, however, a further parallel that deserves to be mentioned. Just as, for Hauer, Blade Runner is primarily about the difference between human beings and replicants, such that replicants become a mirror in which to understand what it is to be human, so too AI—and robotic intelligence in general—can become a mirror in which to understand what it means to be human. As Hiroshi Ishiguro notes, "The robot is a kind of mirror that reflects humanity and by creating intelligent robots we can open up new opportunities to contemplate what it means to be human."31

38 years ago, Ridley Scott released a movie, Blade Runner, set in October–November 2019, that predicted a world inhabited by artificial life forms almost identical to human beings. Indeed, the fact that two replicants, Rachael and Deckard, could fall in love, and that Roy Batty could deliver a soliloquy as eloquent as "Tears in the Rain," implies that these artificial life forms had developed at least some form of consciousness. But whether AI will ever achieve full consciousness and match human creativity remains an open question.
30 Boden [18]. 31 Ishiguro [21].
References

1. Roxborough, S. (2018, February 18). Rutger Hauer on 'Blade Runner 2049' and why films today 'lack balls'. Hollywood Reporter. https://www.hollywoodreporter.com/heat-vision/rutger-hauer-blade-runner-2049-why-films-today-lack-balls-1085827.
2. Freud, S. (2010). The interpretation of dreams (trans. Strachey, J.). New York: Basic Books.
3. Watanabe, M. S. (2017). Can AI make designs? In N. Leach & P. Yuan (Eds.), Computational design. Shanghai: Tongji UP.
4. Sacks, O. (2012). Hallucinations. New York: Knopf/Picador.
5. y Arcas, B. (2016). How computers are learning to be creative. TED Talk. https://www.ted.com/talks/blaise_aguera_y_arcas_how_computers_are_learning_to_be_creative?language=en
6. Seth, A. (2017). Your brain hallucinates your conscious reality. TED Talk. https://www.ted.com/talks/anil_seth_how_your_brain_hallucinates_your_conscious_reality?language=en
7. Ahmed, M. (2015, January 30). Lunch with the FT: Demis Hassabis. Financial Times. https://www.ft.com/content/47aa9aa4-a7a5-11e4-be63-00144feab7de.
8. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al. (2014). Generative adversarial networks. In Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014) (pp. 2672–2680).
9. Nayak, M. (2018). Deep Convolutional Generative Adversarial Networks (DCGANs). Data-Driven Investor. https://medium.com/datadriveninvestor/deep-convolutional-generative-adversarial-networks-dcgans-3176238b5a3d.
10. Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2017, October 27). Progressive growing of GANs for improved quality, stability, and variation. https://www.arxiv.org/abs/1710.10196.
11. Elgammal, A., Liu, B., Elhoseiny, M., & Mazzone, M. (2017, June 21). CAN: Creative adversarial networks, generating 'art' by learning about styles and deviating from style norms. https://arxiv.org/abs/1706.07068.
12. Horev, R. (2018, December 30). Explained: A style-based generator architecture for GANs—Generating and tuning realistic artificial faces. Towards Data Science. https://towardsdatascience.com/explained-a-style-based-generator-architecture-for-gans-generating-and-tuning-realistic-6cb2be0f431.
13. Karras, T., Laine, S., & Aila, T. (2018, December 12). A style-based generator architecture for generative adversarial networks. https://arxiv.org/pdf/1812.04948.pdf.
14. Rajagopal, A., Tetrick, C., Lannen, J., & Kanner, J. (2018). The rise of AI and machine learning in construction. Autodesk University. https://www.autodesk.com/autodesk-university/article/Rise-AI-and-Machine-Learning-Construction-2018
15. Hassabis, D., & Hui, F. (2019). AlphaGo: Moving beyond the rules. In C. Wood, S. Livingston, & M. Uchida (Eds.), AI: More than human (p. 89). London: Barbican International Enterprises.
16. Susskind, R., & Susskind, D. (2017). The future of the professions: How technology will transform the work of human experts (p. x). Oxford: Oxford UP.
17. Susskind, R., & Susskind, D. (2017). The future of the professions: How technology will transform the work of human experts (p. xi). Oxford: Oxford UP.
18. Boden, M. (2009). Computer models of creativity. Association for the Advancement of Artificial Intelligence, 103(1998), 347–356.
19. Leach, N. (1999). Millennium culture. London: Elipsis.
20. King, R., Churchill, E., & Tan, C. (2017). Designing with data. Sebastopol, CA: O'Reilly.
21. Ishiguro, H. (2019). A reflection of ourselves: Robots and the journey towards understanding human intelligence. In C. Wood, S. Livingston, & M. Uchida (Eds.), AI: More than human (pp. 174–179). London: Barbican International Enterprises.
A Hybrid Creativity in Architecture: From Brain Plasticity to Augmented Design Intelligence

Chao Yan
Abstract Under the contemporary development of digital technology, design tools have gradually become more intelligent than ever, becoming creative in themselves and often forming a new hybrid creativity through collaboration with the human subject. In this posthumanist scenario, this paper conducts an interdisciplinary investigation into the mutual incorporation of human and digital tool. Through a close reading of phenomenological and neurophilosophical theories of the tool, it examines how the digital tool can be absorbed into the human body as a prosthesis, and how the prosthetic tool can embed the human subject within a larger network of intelligence. Further, through a critical analysis of the regulating effect of this human-technology collaboration, it reclaims a new way of forming design novelty, in which design intention gains dynamicity not only from the constant reconstruction of the tool, but also from the dynamic flux of information between the human subject and the larger network of social intelligence.

Keywords Design intelligence · Digital tool · Brain plasticity · The extended mind · Creativity
1 Introduction: Intelligence in Digital Tools

If the well-known statement that "architecture is a combination of art and technology" argues for the essential role of technology in driving the transformation of architecture, it is even more true in the contemporary context of rapidly developing digital technology. The past 30 years have witnessed the myriad ways in which digital technology has cultivated significant paradigm shifts in architectural research, pedagogy, and practice. With industrial robots and digital software as its revolutionary platforms, we can now see a clear transformation of the production process of architecture away from traditional crafts and industrial reproduction to a new production mode within our emerging digital age.
There’re two questions that are essential to this new production mode of architecture—automation and creation. The former question emerges from the widespread discourses on robotics in the architectural discipline in the past 20 years, which has been trying to replace notional drawings and human labors in building construction, so as to form a continuous process from conception to realization of architecture. Meanwhile, digital tool not only improves the dynamicity of communication between design and construction, but it also provides entirely new directions for innovation and creation. Through the process of making—the intuitive dialog between human and material–the creativity within the medieval crafts once again becomes the central topic in understanding architectural design. However, in this digitally structured process of making, the essence of creativity is no longer limited to the mere energy and contours of the human hand. It now includes an array of intelligent machines capable of extending human intentions outward, away from the body. In this continuous process of design-to-construction, applications of digital software and robotics are not limited to visualizations and realizations of the conceived form by human subjects, but can be involved directly in the creation process and become a source of creativity in itself. No longer is the human the only author of an architectural project. Machines are also part of the design subject, forming a new hybrid creativity in the production process. In this way, new possibilities for collaboration between human and machine question traditional design authorship and challenge the traditional concept of the creative genius [1]. As contemporary digital technology increasingly expands the creativity within the human–machine collaboration, these technique tools could also take over the creation process itself and restrict it in a highly mechanical and reductive way. Then, fundamental to this process will be how to keep the purely intuitive, empirical part of design in its relation to art. The key question will point at the primary role human mind will play in this increasingly hybrid process of architectural production. To unfold this question, a phenomenological analysis would be essential. By addressing the relationship between human and tool from a perceptual perspective, it could reveal the essential mechanism of this human–machine collaboration in its relation to the hybrid creativity, so that we could identify some key aspects to avoid the pitfalls of technological reductionism and to reinstate richness nature of creation in this hybrid way of architectural production.
2 A Phenomenological Approach to Tool Thinking

From a perceptual perspective, the relationship between human and tool was frequently discussed within the phenomenological movement throughout the twentieth century. Martin Heidegger's tool thinking reveals the authenticity of the tool in its technological correlation with the human being. In Heidegger's argument, the relation of human and tool can be distinguished into two categories—"present-at-hand" and "ready-to-hand". Heidegger uses his "hammer" example to articulate them: a human subject could either pick up the hammer and use it, or just stare at it from a distance and analyze its meaning as an object.
The former situation is defined by Heidegger as "ready-to-hand", while the latter is called "present-at-hand". According to Heidegger, the usual way of dealing with things is actually not observing them as "present-at-hand", but using them as "ready-to-hand". And only by using them can the authentic existence of things as tools be revealed [2]. According to Heidegger's theory, we should not conceptualize the design tool as an inert object, but as a functional entity in association with ourselves.

Furthermore, Heidegger describes a skillful scenario of using a tool, in which the functional object effectively disappears from the perception of the user [3]. In this way, as Jonathan Hale further explains, the focus of the user's perception shifts away from the tactile connection between body and tool, onto the contact between the metal end of the hammer and the nail. However, as Hale also points out in his comparison of Heidegger and Merleau-Ponty, although Heidegger reveals the recessed condition of the tool within perception, what he "fails to address is the progressive way that we come to appropriate tools, and absorb them into our body" [4].

Merleau-Ponty's phenomenology of perception reveals that the human subject is able not only to accommodate tools, but also to "think through" them unconsciously. With its own sensory abilities, the tool is then able to expand and transform human sensation. With "skillful" habit, "the blind man's stick has ceased to be an object for him, and is no longer perceived for itself; its point has become an area of sensitivity, extending the scope and active radius of touch, and providing a parallel to sight" [5]. To achieve this expanded sensation, the tool can be recognized as being effectively incorporated into the human body. As Jonathan Hale articulates it, the tool is incorporated into the extended "body-schema" in the perceptual dimension, so that the human subject can experience the rest of the external environment through it [6]. "Our organs are no longer instruments; on the contrary, our instruments are detachable organs" [7]. The relationship between human and tool thus essentially changes both the perception of the existential reality of the tool and the proprioception of the body itself. And, as the body is the only reference for us to understand our existence in the world, body-tool incorporation will fundamentally alter our cognition of the living environment.

In architecture, the designer is not only the subject creating the built environment, but also the subject sensing, wandering through, and inhabiting that environment. Tools in architecture—from construction tools as externalized motor organs to design tools as externalized brain and sensory organs—not only determine the way the human subject represents and realizes the designed built environment, but also influence the way of conceiving the object by altering the subject's understanding of its own existence in the environment. A phenomenological approach to design, which addresses the sensational dimension of creation, is then not only about retrieving the inspirations and memories hidden in the external place, but also involves the capacities of the external tool to transform our embodied perception.
Incorporated tools could influence the way the human subject thinks as much as the content of its thinking. In other words, incorporated tools could influence the unconscious mechanism of the design process as much as its conscious intention. In Merleau-Ponty's phenomenology, the theory of "flesh" also addresses this mutual incorporation between human subject and external object: "my body is a thing among things; it is caught in the fabric of the world…because it moves itself and sees, it holds things in a circle around itself. Things are an annex or prolongation of itself; they are encrusted into its flesh, they are part of its full definition" [8]. Further, as Taylor Carman has explained, "flesh is the identity of perception and perceptibility, even below the threshold of conscious awareness" [9].

In architecture, as the theory of flesh suggests that body-tool incorporation happens in the unconscious dimension, tools bear much more on the construction of the unconscious process of human thinking than on its semiotic meaning in consciousness. This is more related to the construction of the unconscious body-subject, in the phenomenological term. As Seamon puts it, "body-subject is the inherent capacity of the body to direct behaviors of the person intelligently, and thus function as a special kind of subject which expresses itself in a preconscious way" [10]. So, built upon the notion of the body-subject, a tool that can be unconsciously perceived as part of the body could transform the body-subject itself and, in turn, reconstruct the unconscious intention of the body. Regarding the essential working mechanism: how could tools as externalized organs affect the unconsciousness of the mind?
3 Brain Plasticity and Body-Tool Incorporation

In neuroscience, the theory of brain plasticity has contributed significantly to the exploration of the operational mechanism of the brain from both theoretical and pragmatic perspectives. Instead of adopting a one-way causal relation from inner neural activity to the use of tools to achieve certain intentions, many emerging theories address the reciprocal relationship of the body-tool incorporation indicated by phenomenology, claiming the dissolution of the boundary between mind, body, and environment [11]. According to the discussions by the scientists Bennett, Diamond, Krech, and Rosenzweig in 1964, brain plasticity derives from the long-standing speculation that our brain can change its size, the interconnections of its cells, and its chemical composition [12]. From a computational perspective, brain plasticity can be further explained by the connectionist model, in which the modification of the connectivity between neurons provides the brain's capability to modify its own information-processing structure [13]. The connectionism-based mechanism of human adaptation offers a possible explanation of our capacity to adapt to new tools as they become absorbed within our body schemas. According to the philosopher Andy Clark, it is precisely this brain plasticity that makes human nature that of "natural-born cyborgs"—the human mind operates partially through biological organs and partially through external devices.
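The connectionist reading of plasticity can be compressed into a single update rule: connection strengths change as a function of correlated activity, so repeated experience literally rewires the information-processing structure. A minimal Hebbian sketch follows; the network size, rates, and inputs are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 8))       # synaptic weights among 8 units

def hebbian_step(W, x, lr=0.01, decay=0.001):
    """Cells that fire together wire together: strengthen co-active
    connections, with mild decay so the weights stay bounded."""
    y = np.tanh(W @ x)                        # activity propagated through the net
    W = W + lr * np.outer(y, x) - decay * W   # Hebbian update plus forgetting
    return W, y

# Repeated 'experience' gradually reshapes the connectivity structure
for _ in range(100):
    W, _ = hebbian_step(W, rng.normal(size=8))
```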
In the article "The Extended Mind", Clark and David Chalmers initiated the discussion of the extended mind by posing the question "where does the mind stop and the rest of the world begin?" [14]. The question implies that the dynamic relations between human and tool can be pushed further, to question the biological boundary of the body. As Clark puts it in the book Natural-Born Cyborgs, "our sense of our own bodily limits and bodily presence is not fixed and immovable. Instead, it is an ongoing construct, open to rapid influence by tricks and by new technologies" [15]. Furthermore, the theory of the extended mind not only suggests that the tool can be incorporated into the construction of the body, but also reveals the scenario in which the human mind merges into an external intelligent system. Similar to the transformation from Bergson's vitalism to Deleuze's "machinic phylum", in which the key difference is that the distinctness of the human mind is eliminated and the mind is treated as "material" operating with others in a whole system, the extended mind theory implies that the human mind can join the operational mechanism of the external tool to establish a computational matrix that receives, processes, and responds to information in the perceptual dimension. In an example discussed by Clark in his book, when people use pen and paper to do mathematical calculations, the distinctive activities in the brain are actually the processing of the patterns of the calculation on paper rather than the control of the muscles of the arm and fingers [15]. In that sense, the pen, the paper, and the body merge to solve the problem of calculation; meanwhile, the information processing of the mind is extended across the biological brain and merged with the non-biological pen-and-paper system. Clark defines the technological aspect of a tool as either transparent or opaque: a transparent technology is "a technology that is so well fitted to, and integrated with, our own lives, biological capacities, and projects as to become almost invisible in use", while "an opaque technology, by contrast, is one that keeps tripping the user up, requires skills and capacities that do not come naturally to the biological organism, and thus remains the focus of attention even during routine problem-solving activity". Obviously, both pen and paper in the example are transparent tools for the human subject, as they recede from human perception during the calculation. Design tools as technological apparatus usually contain both opaque and transparent states at the same time. When driven by a complex task in architectural space, the subject consciously retrieves and processes information from the opaque part of the environment, and naturally or intuitively interacts with the transparent part. So, in the end, as Clark puts it, "it just doesn't matter whether the data are stored somewhere inside the biological organism or stored in the external world. What matters is how information is poised for retrieval and for immediate use as and when required" [15]. What matters is whether the subject can use the tool intuitively. As long as there is an intuitive information feedback loop between the human mind and the external tool, they can be perceived by the human subject as an integrated system for achieving certain tasks. In this process, the plasticity of the brain plays the important role of constructing the enlarged problem-solving system. With a plastic brain, the information processing between the mind and the external object can be operated as intuitively as the perceptual connection between the mind and the biological body. Furthermore, both transparent and opaque tools are relative terms; their states are dynamically determined through the adaptation process. Adapting to tools can be interpreted as possessing a mechanism similar to "learning", which implies the role of the brain, with its plasticity, in coordinating the mutual incorporation of body and tool by constantly reframing the structure of the perceptual mechanism for both of them. As articulated above, the brain constantly changes its internal mechanism (neuron connectivity) to gradually become able to instinctively retrieve and process information from an opaque tool, or to become estranged from a transparent one. In this way, an opaque tool can be transformed into a transparent one, and vice versa. In the end, as the brain constantly changes its internal structure during the adaptation process, the whole structure of body-tool incorporation becomes dynamic as well. And this dynamism of incorporation is one of the key sources of the richness of the hybrid creativity in design.
4 Digital Tools as the Extension of Human Intelligence

If, as Clark claims, being a cyborg is part of human nature, then today the incessant development of increasingly sophisticated digital tools and prostheses, from robotics to artificial intelligence, has been making us ever more cyborg-like. The theory of the extended mind reveals the information-processing mechanism of body-tool incorporation, in which the foundation of this posthumanist phenomenon is the intuitive feedback of information between human mind and tool. This intuitive feedback loop has been further enhanced by the development of digital technology. According to Katherine Hayles, "for the human is the tradition of liberal humanism; the posthuman appears when computation is taken as the ground of being, a move that allows the posthuman to be seamlessly articulated with intelligent machines" [16]. As the reciprocal intervention and mutual incorporation are essentially based on the information exchange between human mind and tool, once the tool is fully digital or digitally controlled, the information processing in the human mind and in the digital device can be integrated into a unified system without physical limitation. Nowadays, the development of digital technology already allows us to be seamlessly connected with tools through data exchange, so that body and tool can fully penetrate each other. In the design process, there is no longer a one-way relationship between us and digital tools. As users, we gradually change our design thinking in the process of exchanging data with the tool, as well as in the process of adapting to the character and prerogatives of digital technology. Along with the rise of digital technology, there have been continuous discussions of the concept of the cyborg, exploring how human nature could be transformed by the use of advanced digital tools through the second half of the twentieth century.
By the mid-1980s, the term had appeared in various scenarios: the scientific, the artistic, the fictional, the intellectual, and so on. In film, the cyborg expresses the situation in which "man witnesses himself being progressively dehumanized" and "become[s] an insensitive machine" [17]. In philosophy, as in Donna Haraway's A Cyborg Manifesto, the term has been profoundly addressed in the exploration of gender. These explorations of the transformation of human nature are deeply associated with design discourses, as they fundamentally unsettle both the subject and the object of design. However, the original idea behind these explorations is not so much about the dynamism of human nature as about the tool becoming an extension of the body, giving the human subject alternative capabilities for adapting to the environment. The first iteration of this idea appeared in the early 1960s, for both military use and space exploration. The scientists Manfred E. Clynes and Nathan S. Kline proposed the term cyborg to articulate the use of "tools" as extensions of the body to enhance the human capability to adapt to outer space. As they put it, "the cyborg deliberately incorporates exogenous components extending the self-regulatory control function of the organism in order to adapt it to new environments" [18]. Through data transmission, the condition of the astronaut's biological body could be monitored and controlled by the spacesuit and, through the spacesuit, further connected with the space center on the ground. The feedback loop between astronaut, spacesuit, and space center shows that the digital tool and its mediating effect can frame the interventions between the subject and a largely extended intelligent system. In general, the different aspects of posthumanist tool-thinking mentioned above all indicate a more open understanding of the adaptive nature of the human being: human beings can incorporate tools to form a larger intelligent system. Essentially, however, this understanding is always built upon the idea of challenging and reconstructing the physiological foundation of the human body. Now, with digital tools, the process of body reconstruction also dissolves the boundary between the virtual and the physical existence of the human being and revives the idea of body-space incorporation based on cybernetic information exchange. In design, the tool is not only an externalized organ with which designers interact with the physical object. Embedded with digital technologies, design tools can extend human intelligence to a larger context by cultivating different feedback loops between humans and other digital systems, such as cloud platforms and code libraries, which may involve other human subjects. In this way, as Benjamin Bratton states, it becomes uncertain whether what the human subject is engaging with is "a fully autonomous, partially autonomous, or completely human-piloted synthetic intelligence" [19]. Design intelligence is eventually formed in a highly sophisticated network of human–machine and human–human collaboration.
5 Design Novelty with Hybrid Intelligence

As Winston Churchill once said, "we shape our buildings, thereafter they shape us". The sentence applies equally to our relationship with tools. Tools are by nature artifacts, and the making of tools naturally involves human intelligence. The incorporation of body and tool therefore always contains a reciprocal feedback loop, in which the tool constantly transforms both human body and mind by mediating subjective perception. As extensions of the body, tools not only enhance the capabilities of human subjects but also regulate and structure them. Human beings and tools thus tend to conform to each other, creating linear correspondences through adaptation, until our perception and action fully reflect the mechanism of the externalized organ, the tool. The tool's reverse influence on the human being can be traced back to the concept of "enframing" in Heidegger's discussion of modern technology, where the relationship between human and tool is expanded into a criticism of modern technology. According to Heidegger, modern technology is not neutral; it constantly confines and frames how humans think and act by suggesting a certain functional affordance to the human subject. For example, the airplane is for flying, the car is for driving. Once the subject becomes completely the orderer of this functional tool, he is also confined to the suggested way of using it, and his thinking and acting are framed and structured by its affordance [20]. In the effect of "enframing", the reciprocal intervention between subject and tool is grounded primarily in physical interactions. Hayles traces the lineage of the changing relationship between human and tool and distinguishes their feedback loops on the basis of cybernetic communication in the digital age. In the scenario of "man the tool-maker" in the early twentieth century, although human and tool were associated closely enough to shape each other, the mutual intervention was still mechanical rather than informational. It was only from the 1960s, when the media of informational exchange multiplied, that tools began to be recognized as prostheses connecting to and fabricating the body virtually [16]. Through the continuous feedback loop of information exchange between the digital tool and its related larger intelligent system, the human subject is absorbed into a virtual network of intelligence. In this way, as part of the extended schema of the human mind, the networked intelligence acquires the active capability to alter how the subject perceives, acts, and even thinks. Following Churchill's statement that "thereafter they (architecture) shape us", body-tool incorporation may eventually offer the possibility of challenging the traditional understanding of human intelligence in the plural trend of posthumanism. If the human network essentially influences and even determines how each individual performs in design, how will design novelty remain in the process? To answer this question, body-tool incorporation has to be reimagined and rearticulated in the contexts of digitalization and artificial intelligence, in order to propose an alternative account of the immanent process of humans appropriating tools and to resist the digital rationalization of contemporary design practice.
In contemporary design practice, new tools for architectural design and construction are constantly invented and reinvented in order to seek new ways of conceptualizing and realizing form. Built upon human–machine collaboration, the object-centered mode of production also undergoes a significant transformation, resulting in radically new scenarios of architectural authorship and ownership. Architects are credited not only for offering building designs but also for the tools that produce them. The processes, interfaces, and systems of the design tool can then be customized or modified according to the intention of the architect. Because the digital tool is the medium of mutual intervention between the human subject and the networks of intelligence, inventing new tools is not so much about expanding the methods of generating and fabricating architectural objects as about cultivating a new context of human-tool collaboration. Technology can actively drive the creative process rather than being merely a passive precondition of the design. Tools can be invented, inventively used, or even misused in order to stimulate new creativity. In this upgrading of incorporation triggered by customized tools, the collaboration between human and technology becomes highly differentiated and can cultivate diversified design results. And when these differential feedbacks are magnified into the social production system, a new relationship between architects, technology, and the social network is established.
References

1. Carpo, M. (2011). The alphabet and the algorithm. Cambridge: The MIT Press.
2. Heidegger, M. (n.d.). s.l.: s.n.
3. Heidegger, M. (1962). Being and time (J. Macquarrie & E. Robinson, Trans.). New York: Harper and Row.
4. Hale, J. (2012). Architecture, technology and the body: From the prehuman to the posthuman. In C. G. Crysler, S. Cairns, & H. Heynen (Eds.), The SAGE handbook of architectural theory (pp. 513–533). London: SAGE Publications.
5. Merleau-Ponty, M. (1962). Phenomenology of perception. Abingdon: Routledge.
6. Hale, J. (2014). Materiality, movement & meaning: Architecture and the embodied mind. In Proceedings of the 6th Annual Architectural Research Symposium in Finland 2014: Designing and Planning the Built Environment for Human Well-Being (pp. 305–314). Finland: University of Oulu.
7. Merleau-Ponty, M. (2007). Eye and mind. In T. Toadvine & L. Lawlor (Eds.), The Merleau-Ponty reader (p. 178). Evanston: Northwestern University Press.
8. Merleau-Ponty, M. (1968). The visible and the invisible (A. Lingis, Trans.). Evanston: Northwestern University Press.
9. Carman, T. (2008). Merleau-Ponty. Abingdon: Routledge.
10. Seamon, D. (1979). A geography of the lifeworld. London: Croom Helm.
11. Thompson, E., & Varela, F. J. (2001). Radical embodiment: Neural dynamics and consciousness. Trends in Cognitive Sciences, 5, 418–425.
12. Bennett, E. L., Diamond, M. C., Krech, D., & Rosenzweig, M. R. (1964). Chemical and anatomical plasticity of brain. Science, 146, 610–619.
13. Clark, A. (2001). Mindware. Oxford: Oxford University Press.
14. Clark, A., & Chalmers, D. J. (1998). The extended mind. Analysis, 58, 10–23.
15. Clark, A. (2003). Natural-born cyborgs: Minds, technologies, and the future of human intelligence. Oxford: Oxford University Press.
16. Hayles, N. K. (1999). How we became posthuman: Virtual bodies in cybernetics, literature, and informatics. Chicago and London: The University of Chicago Press.
17. Picon, A. (2000). Anxious landscapes: From the ruin to rust. Grey Room, (1), 64–83.
18. Clynes, M. E., & Kline, N. S. (1960). Cyborgs and space. Astronautics, September, 26–76.
19. Bratton, B. H. (2015). The stack: On software and sovereignty. Cambridge: The MIT Press.
20. Heidegger, M. (1977). The question concerning technology and other essays (W. Lovitt, Trans.). New York: Harper and Row.
Collaborative Intelligence
Toward a Collaborative Robotic Platform: FUROBOT

Ming Lu, Wei Ran Zhu, and Philip F. Yuan
Abstract In typical robotic fabrication with 6-axis industrial robots such as those from KUKA, ABB, and other brands, axes 4, 5, and 6 converge at a single point (the so-called Pieper configuration). When such a robot executes movement commands, keeping the 4th-axis angle close to zero is ideal for motion stability, especially when a device connected to the tool head is mounted on the 4th-axis arm segment. In plastic-extrusion printing and other printing processes that are indifferent to rotation about the printing direction (the printing direction being the effector's output normal vector: the X axis for KUKA, the Z axis for ABB), this 4th-axis optimization not only stabilizes the motion but also improves print quality. The paper introduces a new algorithm that obtains the analytic solution, explained both mathematically and geometrically. At the end of the paper, a custom Grasshopper plugin containing this new algorithm is provided, with which users can obtain the optimized target path plane more easily. Keywords 3D printing · Brick fabrication · Robotic · Optimization algorithm · Grasshopper plugin
1 Introduction

In 1959, the Unimate #001 prototype developed by Joseph Engelberger and George Devol was installed on an assembly line for the first time, at a General Motors die-casting plant in Trenton, New Jersey. Manufacturing around the world has changed completely since then. In 1968, the Stanford Research Institute developed the world's first intelligent robot; in 1974, the first microcomputer-controlled industrial robot set a precedent. Through the wave of digitalization over the past sixty years, digital tools in the field of architecture have been integrated into every link from
design to fabrication, continuously renewing creative thinking and working modes in architecture. There is no doubt that the development of robotic platforms deserves more attention and discussion in an environment surrounded by speculation about the relationship between humans and machines. On the one hand, rediscovering the connection between architects and robots can bridge the long-standing gap between design and fabrication and put the initiative of fabrication back into architects' hands. On the other hand, researching and developing digital construction software that can be used in actual fabrication exposes and solves many specific problems in practice, and contributes to a future in which robots are widely used in fabrication.
2 Collaborative Thinking in Intelligent Fabrication

At the end of the twentieth century, the division between designers and constructors could no longer be ignored. At the beginning of the twenty-first century, Fabio Gramazio and Matthias Kohler established the world's first robotic laboratory in architecture at ETH Zurich [1], enabling virtual designs to be tested and fed back through digital fabrication. Gramazio Kohler Research thus became a pioneer of experimental robotic fabrication. When Ford's assembly line turned to industrial robots, the standard changed: industrial manufacturing achieved a leap toward an intelligent future. Meanwhile, computational thinking became integrated into human creation. This mode of sharing between machines and humans has profoundly influenced the dichotomy of design and fabrication. In the integration process, the data transmission between machines constitutes a network relationship between the various stages, while the human as subject maintains a symbiotic relationship with the machine [2]. In this way, fabrication can be re-incorporated into the architect's responsibility under the guidance of human–machine collaboration. Intelligent fabrication is not limited to digital design; it is a set of processes including end-effector design, robotic path programming, and control systems, of which the robotic platform is an essential part. Its research and development builds a bridge for the sharing of thinking between architect and robot. Shared design thinking and collaborative working modes allow numerous problems to be solved at the very beginning of digital fabrication in a human–machine way. This new human–machine working mode breaks through a single style of thinking and realizes a collaboration. It should be noted that the purpose is not to let robots lead the design, but to enhance human capabilities under preset conditions and to bring new possibilities for forms and materials. Additionally, their efficiency in repetitive production gives robots a particular advantage in non-standard processing that humans cannot match. As the core of digital fabrication, the architectural robot is a high-precision, high-efficiency, multi-degree-of-freedom tool that can be applied to various tasks. Above all, the premise of intelligent fabrication is a common understanding between architects and robots.
3 Collaborative Platforms of Robotic Fabrication

In recent years, robotics has gradually penetrated the construction industry, and more and more 6-axis robots are being used in it. The development of construction robot systems has attracted the attention of many research institutions and high-tech companies. Despite this, construction robots remain underdeveloped, especially in China. Building a bridge between architects and robots and lowering the threshold of the robot industry requires a universal software platform: just as with the rapid popularization of computers and smartphones, the main internal driving force is universal operating systems and application software. At present, the most widely used general robot software platform is ROS, developed by Willow Garage in the United States; its accessibility allows small research teams or even individuals to develop excellent application software for ROS systems. Some existing tools, such as KUKA|prc, HAL, Robots, and RoboDK, have already done important work, but some of their functions do not match the usage habits of other users, and many functions required by innovative processes are unavailable. In addition, as actual construction involves more and more robot tool heads and hardware integration, the requirements for easy programmability keep rising. Users also hope that processes such as layer printing and spatial printing can be encapsulated and programmed into a digital construction platform and adapted to many different robots such as KUKA, ABB, and UR. The root cause is the pursuit of higher collaboration. From an architectural perspective, the Grasshopper platform built into the Rhinoceros modeling software is a node-based parametric programming environment commonly used in the industry, and developers of robotic software platforms tend to develop within it. Stand-alone platforms, such as KUKA's WorkVisual and ABB's RobotStudio, have therefore failed to collaborate well with digital building design. RoboDK, running under Rhinoceros but independent of Grasshopper, is not convenient for parametric debugging. Among the Grasshopper-based tools, KUKA|prc supports only KUKA robot models and does not allow customized multi-axis external axes; HAL supports multiple robots such as KUKA, ABB, and UR, but its operation is complex, its hardware cannot be customized, and it is limited to use in foreign network environments; Robots also supports multiple robots, but its operation is likewise complicated. The need for a sustainable and practical collaborative robotic construction platform is the driving force behind the development of the FUROBOT software platform (Fig. 1). FUROBOT focuses on the construction field of architecture; its purpose is to connect design and construction, to adapt to multiple types of robots, and to integrate multiple processes. Taking collaboration as its responsibility, FUROBOT supports robots of different manufacturers and models, is easy to operate, and allows users to customize, expand, and share the hardware library (including
Fig. 1 FUROBOT toolbar
robots, external axes, and tool head libraries). After customizing a new hardware library, users can share it with other FUROBOT users through .gha files. In addition, with its robot simulation and programming module embedded in the design model environment, FUROBOT realizes robot programming within the design environment, helping users achieve rapid deployment of digital construction and greatly facilitating the production of prototype components and small-scale trial production.
4 Research and Development of FUROBOT

The digital environment built today tends to combine the brain with new tools, such as construction robots, to form a new relationship: designers no longer design appearances, but parameter logics. As node-based programming software built on Grasshopper, FUROBOT inherits Grasshopper's parameterized control mode: most parameters are transferred in the form of node inputs and outputs instead of the traditional dialog input mode.
4.1 Parameter Representation

Before starting with FUROBOT, several basic parameter concepts of robotics [3, 4] need to be explained. Among all the concepts that need to be clarified, the parameters in Table 1 have to be understood first.

Table 1 Different representations of pose
• Tool head center point plane (TCP): pose transformation of the robot's moving target relative to the working base coordinates
• Robot root (Root): pose of the robot base relative to the world coordinate system
• Work object (Work base): pose transformation of the robot work object coordinate system relative to the robot root coordinate system
• Tool (Tool): pose transformation of the end of the robot tool head relative to the center point of the flange
Fig. 2 Conversion of different rotation representations
FUROBOT, which is designed to support multiple robots simultaneously, needs to handle these different pose representations in a unified manner and provides tools for converting between them (Fig. 2). A robot can reach the same tool head center point plane (TCP) in more than one pose: a typical industrial 6-axis robot has 8 pose states to choose from within its normal working space (Fig. 3). UR robots expose no explicit pose-state parameter to the user; the pose state is derived implicitly from the robot's state at the previous moment. Among the three preset robot families, therefore, only KUKA and ABB require the pose state to be selected manually. Because FUROBOT supports a variety of robots, the pose parameters must be typed uniformly. By convention, FUROBOT adopts the parameter scheme officially designated by KUKA and uses it as a reference to map the ABB parameters one to one. The specific conversion is given in Table 2, and the corresponding options are available in the motion command input of FUROBOT (Fig. 4).
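To make this bookkeeping concrete, the sketch below shows how the four poses in Table 1 can compose into the flange pose that an inverse-kinematics solver ultimately has to reach. It is a generic illustration in Python/NumPy, with identity rotations and invented offsets as assumptions; it is not FUROBOT's internal code, and the frame names are ours.

```python
# Composition of the Table 1 poses as 4x4 homogeneous transforms.
import numpy as np

def pose(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Example poses (identity rotations, arbitrary offsets in mm, all assumed).
world_T_root   = pose(np.eye(3), [0, 0, 0])       # robot root in the world frame
root_T_base    = pose(np.eye(3), [800, 0, 0])     # work base relative to the root
base_T_target  = pose(np.eye(3), [100, 50, 20])   # TCP target in work-base coords
flange_T_tool  = pose(np.eye(3), [0, 0, 250])     # tool tip relative to the flange

# The TCP must coincide with the target, so the flange pose the controller
# actually needs is the target pose with the tool transform removed:
world_T_tcp    = world_T_root @ root_T_base @ base_T_target
world_T_flange = world_T_tcp @ np.linalg.inv(flange_T_tool)
print(world_T_flange[:3, 3])   # flange position handed to the IK solver
```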
Fig. 3 A general 6-axis robot has 8 pose states
Table 2 ABB and KUKA pose state conversion

KUKA   ABB   Description
110    1     Wrist in front of shoulder, wrist in front of elbow, negative wrist
100    3     Wrist in front of shoulder, wrist behind elbow, negative wrist
010    0     Wrist in front of shoulder, wrist in front of elbow, positive wrist
000    2     Wrist in front of shoulder, wrist behind elbow, positive wrist
111    7     Wrist behind shoulder, wrist behind elbow, negative wrist
101    5     Wrist behind shoulder, wrist in front of elbow, negative wrist
011    6     Wrist behind shoulder, wrist behind elbow, positive wrist
001    4     Wrist behind shoulder, wrist in front of elbow, positive wrist
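Since the conversion in Table 2 is a fixed one-to-one mapping, it reduces to a lookup table. The sketch below simply transcribes Table 2; it is illustrative and not FUROBOT's implementation.

```python
# One-to-one mapping between KUKA status bits and ABB configuration numbers,
# transcribed directly from Table 2; the inverse map is derived from it.
KUKA_TO_ABB = {
    "110": 1, "100": 3, "010": 0, "000": 2,
    "111": 7, "101": 5, "011": 6, "001": 4,
}
ABB_TO_KUKA = {abb: kuka for kuka, abb in KUKA_TO_ABB.items()}

assert ABB_TO_KUKA[0] == "010"   # wrist in front of shoulder and elbow, positive wrist
```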
Fig. 4 FUROBOT pose states parameter in move component
4.2 Workflow

The technological revolution triggered by digital technology has caused fundamental changes in architectural design: thinking, design, and construction have become organically integrated links. Digitalization has updated almost all the methods of traditional architectural design and has also profoundly affected the construction process. FUROBOT's internal operation process can be simplified as shown below (Fig. 5). This process is the operating logic of the software, and also a concept that needs to be clear during use.

1. Get final TCP

In order to meet the needs of different working scenarios, FUROBOT makes automatic rotation of the TCP an optional item. Generally speaking, the initial pose state is the first state from which the robot starts to move, and choosing it well helps the robot's subsequent movement; FUROBOT generally uses angle instructions to complete this task. For the sake of pose-state selection and the inverse kinematic solution, adding a point-to-point (PTP) command before running any straight-line command ensures smooth movement. Users can add a PTP instruction manually; otherwise, FUROBOT automatically adds one at the head of every LIN command group. The position and rotation of the mobile robot can be changed by external axes or by the root device. In the FUROBOT logic, only one of the external axis and the root coordinate needs to be selected; if both are input at the same time, only the external axis takes effect.

2. Simulation and offline programs

The response speed of the software is very important when simulating a large number of target points. FUROBOT uses an improved binary and region search, which enables the robot to quickly find the corresponding target plane. Transition simulations between successive target planes are also considered: the quaternion SLERP method is used in trajectory interpolation to smooth the entire
Fig. 5 FUROBOT internal workflow
simulation process. Different brands of robots use different proprietary programming languages to generate offline files; beyond the simple substitution of the same command into different robot languages, there are also differences in grammatical structure. The user only needs to provide the file storage path, and FUROBOT automatically outputs the offline program file with the extension corresponding to the input robot type.

3. Check

To operate a robot safely, inspection is essential. The inspection is divided into the following items:

Collision: Collision detection is the most critical item for safe construction. Detection covers two types: collisions between the robotic arm and external objects, and collisions between the robot body and the tool (Fig. 6).

Workspace access and limits: checks whether the robot arm can reach each target and whether any joint exceeds its limit angle.

Singularity detection: In a common 6-axis robot there are three kinds of singularities, corresponding to three pose states. The wrist singularity, where A5 is zero, is the most common kind (Fig. 7).
Fig. 6 Two kinds of collision
Fig. 7 When A5 is close to 0°, A4 and A6 will rotate very fast, causing the robot to report an error.
FUROBOT divides the test into two parts: the first is performed in the main component and only tests whether the above problems occur; the second is placed in an independent detection component responsible for global detection. A subdivision option at the input of the global detection sets its level of detail.
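As a note on the trajectory interpolation mentioned in the simulation step above, the quaternion SLERP used to smooth transitions between target planes can be sketched in its textbook form as follows; this is a generic formulation, not FUROBOT's internal implementation.

```python
# Generic spherical linear interpolation (SLERP) between two orientations
# given as unit quaternions (w, x, y, z); t runs from 0 to 1.
import numpy as np

def slerp(q0, q1, t):
    q0 = q0 / np.linalg.norm(q0)
    q1 = q1 / np.linalg.norm(q1)
    dot = np.dot(q0, q1)
    if dot < 0.0:                    # take the shorter arc on the 4D sphere
        q1, dot = -q1, -dot
    if dot > 0.9995:                 # nearly parallel: normalized lerp is stable
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(np.clip(dot, -1.0, 1.0))
    s0 = np.sin((1.0 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * q0 + s1 * q1

# Midpoint of a 90-degree rotation about Z, i.e. 45 degrees about Z:
q_start = np.array([1.0, 0.0, 0.0, 0.0])
q_end   = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])
print(slerp(q_start, q_end, 0.5))
```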
4.3 Customizing Hardware

FUROBOT comes with a range of hardware, including robots, tool heads, external axes, and mobile bases, which are numerous and essential in use. Much of the time, however, the user's hardware falls outside the supported set, given the large number of 6-axis robots from KUKA and ABB alone. FUROBOT currently supports more than 30 models, far fewer than the number of robots on the market, so custom robots are a solution that gives users more freedom. FUROBOT provides components for custom robots; users can follow the tutorial to customize robots based on existing ones. Similarly, FUROBOT provides components for tool heads, external axes, and mobile bases, which users can customize according to the tutorial. The first case is the custom robot. FUROBOT simplifies the operation as much as possible and adds a UI to the component for the user's convenience (Fig. 8). In addition to the robot body model, which must be entered, the user can modify the robot's length parameters on the component UI.
Fig. 8 Customize ABB and KUKA
Because FUROBOT is a universal robot platform, all existing and custom hardware, such as external axes (Fig. 9) and tool heads, can be used across different types of robots within FUROBOT. FUROBOT takes a shared, collaborative approach to this hardware: users can build hardware class libraries themselves (Fig. 10) and share them as files with the extension ".gha". Because FUROBOT has opened the API for the corresponding custom hardware, users can build their own hardware library in Visual Studio; after generating the .gha file, they simply place it in the Grasshopper component folder.
Fig. 9 Custom single-axis external axis (up to three axes are supported)
Fig. 10 Relationship between the hardware library and the main program
Using a shared hardware library is more convenient than building custom hardware components in Grasshopper, but such a library is harder to make than the custom components. Users can choose according to their needs.
5 Human–Machine Collaboration in Digital Fabrication

Once a performance-oriented construction platform realizes an open-source construction method, human design intent can be realized to the greatest extent, and objects can show stronger adaptability and flexibility. On a digital construction platform, the roles of tools and humans blend with each other, and craftsmanship profoundly influences design through to construction. In robot construction, the main programming task is to issue a series of motion instructions and a small number of parameter instructions for the TCP position of the tool head. These tasks are done before entering the simulation and are very time-consuming. FUROBOT has developed a set of mature process components to improve the efficiency of user programming, which in turn assists creation and opens up new possibilities for design and construction [5]. Beyond that, we used the robot operating parameter output data provided by FUROBOT to produce the following functions, which initially realize human–machine collaboration and path optimization and can better support various processes.
5.1 Real-Time Communication

Real-time communication (Fig. 11) is a must-have feature in robot construction scenarios that require sensors to participate. For KUKA robots, FUROBOT uses a communication software package based on EthernetKRL [6] and provides corresponding components through socket programming. After receiving instruction data from the computer, the robot converts it into instructions that the real robot can execute. After the robot completes the task, it sends its own state and sensor data back to the computer; the computer then sends new instruction data, forming a closed loop.
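A minimal sketch of the computer side of such a closed loop is given below, assuming the robot controller connects as a TCP client (as is typical with EthernetKRL) and that messages are newline-terminated XML strings. The port number, message format, and compute_next_target function are placeholders for illustration; the actual FUROBOT/EthernetKRL message schema differs.

```python
# PC-side sketch of a sensor-in-the-loop communication cycle.
import socket

def compute_next_target(robot_state_xml: str) -> str:
    # Placeholder: parse robot/sensor state, decide the next instruction.
    return "<Target X='100' Y='0' Z='250'/>\n"

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 54600))    # arbitrary port for this sketch
server.listen(1)
conn, addr = server.accept()       # the robot controller connects to us
with conn:
    buffer = b""
    while True:
        data = conn.recv(4096)
        if not data:
            break                  # robot disconnected
        buffer += data
        while b"\n" in buffer:     # one state message per line
            line, buffer = buffer.split(b"\n", 1)
            reply = compute_next_target(line.decode("utf-8"))
            conn.sendall(reply.encode("utf-8"))   # close the loop
```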
5.2 Path Plane Optimization

In general robot construction, applications such as layer printing, spatial printing, painting, and bricklaying share a characteristic: once the normal vector of the tool is determined, any rotation around that normal can be selected. The practical question is which rotation angle around the normal lets the whole robot spend the least joint rotation and keeps the rotation angle of the A4 axis small
Fig. 11 Robot real-time communication for printing board installation
(in order to maintain stability). FUROBOT provides a solution (currently for KUKA only). Under this solution, the robot produces the same printing effect while minimizing the changes in the robot's joints. FUROBOT provides a path-optimization component to do this conveniently (Fig. 12).
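The idea behind the component can be sketched as a one-dimensional search: for processes that are free to rotate about the tool normal, sample candidate rotations of the target plane and keep the one whose inverse-kinematics solution minimizes the A4 angle. The brute-force search and the solve_ik placeholder below are illustrative assumptions; the algorithm referenced in the abstract obtains this result analytically.

```python
# Schematic version of the path-plane optimization: rotate the target frame
# about the tool normal and keep the angle that minimizes |A4|.
import numpy as np

def rotate_about_normal(origin, x_axis, normal, theta):
    """Rotate the plane's x-axis about its normal by theta (Rodrigues formula)."""
    k = normal / np.linalg.norm(normal)
    x = (x_axis * np.cos(theta)
         + np.cross(k, x_axis) * np.sin(theta)
         + k * np.dot(k, x_axis) * (1 - np.cos(theta)))
    return origin, x, np.cross(k, x)      # origin, new x-axis, new y-axis

def optimize_plane(origin, x_axis, normal, solve_ik, samples=360):
    best = None
    for theta in np.linspace(0.0, 2 * np.pi, samples, endpoint=False):
        plane = rotate_about_normal(origin, x_axis, normal, theta)
        joints = solve_ik(plane)          # hypothetical IK: A1..A6 or None
        if joints is None:
            continue                      # unreachable candidate
        cost = abs(joints[3])             # penalize A4 rotation
        if best is None or cost < best[0]:
            best = (cost, plane)
    return best[1] if best else None
```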
Fig. 12 FUROBOT’s target plane is optimized. At this time, the A4 axis is 0°
Fig. 13 Robotic Spatial 3D Printing (left: Cloud Village, Venice Biennale 2019; right: West Bund World AI Conference Venue B)
5.3 Layer Printing and Spatial Structure Printing

Layer printing is the most widely used robotic construction application, including plastic printing, concrete printing, and clay printing. Layer printing generally requires cutting the geometry horizontally to obtain contour curves and then connecting these contours. For this process, FUROBOT packs the height of the horizontal cut, the selection of seam points, the angle of the up-slope between layers, the retention of sharp corner points, and the printing speed at different angles into several components related to the layer-printing process. In layer printing, the path plane optimization function provided by FUROBOT helps avoid singular points and reduces unnecessary jitter. Spatial structure printing is more complicated than layer printing [7]. Its complexity lies not only in the choice of path but also in the avoidance of the already-printed spatial structure by the robot and the tool head. In addition, the UV segmentation of the surface, the offset vector of the structural thickness, and the printing and travel speeds of the tool head need to be considered. FUROBOT has selected one of the most mature spatial structure printing processes (Fig. 13) and packaged it into a modular component with an innovative process. Spatial printing with modified plastic lets the robot weave rapidly solidifying material into a spatial grid, printing a lightweight, large-scale space grid structure in a short time.
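A simplified sketch of the seam handling described above: given one closed contour per layer (already sliced from the geometry), each layer's seam is offset from the one below and the path is lifted continuously so the layers join in a spiral. The function and parameter names are ours, for illustration only.

```python
# Simplified seam rotation for layer printing: each closed contour (an (N, 2)
# array of XY points for one layer) is re-indexed so its start point shifts by
# a fixed number of vertices per layer, avoiding a visible vertical seam line.
import numpy as np

def seam_shifted_path(contours, layer_height, seam_step=3):
    """contours: list of (N, 2) arrays, one closed XY loop per layer."""
    path = []
    for i, loop in enumerate(contours):
        start = (i * seam_step) % len(loop)            # rotate the seam
        loop = np.roll(loop, -start, axis=0)
        z = i * layer_height + np.linspace(0.0, layer_height, len(loop),
                                           endpoint=False)  # continuous up-slope
        path.append(np.column_stack([loop, z]))
    return np.vstack(path)   # one continuous (M, 3) polyline for the printer
```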
5.4 Other Processes

In addition to the modified plastic 3D printing process, FUROBOT's Grasshopper components also include brickwork, wood processing [8], and fiber winding, and these have been successfully applied in actual construction. Wood processing covers circular saw, drilling, chain saw, band saw, and other tools. The integration of these processes has gone through long-term practice (Fig. 14).
Fig. 14 Robotic Wooden Fabrication (left: Wooden Project, Shenzhen Bi-City Biennale of Urbanism\Architecture 2019; right: Inkstone House OCT Linpan Cultural Center)
Fiber winding is a cutting-edge technology. FUROBOT's implementation of this process is based on ordered anchor points and selects the optimal winding path while accounting for collisions between tools and anchor points. These process components were used in the carbon fiber bridge of the Digital Future Work Camp 2019 (Fig. 15). In the automatic fiber winding process, the industrial robot continuously conveys resin-saturated glass and carbon fibers, which are gradually fixed as the previous layer hardens and the subsequent layers are placed. The robotic filament winding technology allows the position of each fiber in space to be accurately calculated and maximizes the physical and mechanical properties of every fiber. Except for layer printing, which requires high continuity, all the construction processes mentioned above are suitable for real-time instruction transmission, so that the robot has the
Fig. 15 Carbon fiber bridge built at Digital Future Work Camp 2019
capability of adapting to the outside world. For example, through the pose-state switching introduced in Sect. 4.1 above, the various parameters provided by FUROBOT can be used programmatically to select an optimal pose state for the task, and collisions can be avoided in real time, especially in fiber winding, spatial structure printing, and spraying.
6 Conclusion

As the new digital culture evolves, robotics is improving rapidly. This technology supports experimental fabrication and paves the way for an intelligent vision. The accuracy and efficiency of robots promote the application of intelligent materials and experimental fabrication and make architectural robots an important part of future building techniques. Although the research and development of robotic platforms involves many interdisciplinary subjects, which makes the research seem more difficult, the intelligent vision of architecture still requires further exploration of robotic fabrication. FUROBOT, a digital fabrication software platform, has shown us that the coupling of software and hardware is a trend. Through this coupling, various hardware bases and technique packages can respond to specific needs. Sharing and collaboration are no longer out of reach.
References

1. Liu, Y. Robotic design construction: Digital fabrication strategies for freeform masonry casting and mobile assembly (Doctoral dissertation, Massachusetts Institute of Technology).
2. Yuan, P. F., Zhou, J., & Yan, C. (2019). Digital craftsman: The future of architecture with human–machine collaboration. The Journal of Architecture, 04, 1–8 (in Chinese).
3. Craig, J. J. (2017). Introduction to robotics: Mechanics and control (4th ed.). Pearson.
4. Spong, M. W., Hutchinson, S., & Vidyasagar, M. (2005). Robot modeling and control. Wiley.
5. Yuan, P. F., Zhu, W., & Gao, W. (2019). Why performance fabrication? Architecture Technique, 2019(09), 8–14 (in Chinese).
6. KUKA. (2016). Ethernet KRL 2.2 for KUKA System Software 8.2, 8.3. KUKA Roboter GmbH.
7. Chen, Z. W., Zhang, L. M., & Yuan, P. F. (2019). Innovative design approach to optimized performance on large-scale robotic 3D-printed spatial structure. In CAADRIA.
8. Chai, H., & Yuan, P. F. (2018). Investigations on potentials of robotic band-saw cutting in complex wood structures. In Robotic fabrication in architecture, art and design. Cham: Springer.
Interdisciplinary AI: A Machine Learning System for Streamlining External Aesthetic and Cultural Influences in Architecture

Güvenç Özel
Abstract Architecture does not exist in a vacuum. Its cultural, conceptual, and aesthetic agendas are constantly influenced by other visual and artistic disciplines ranging from film, photography, painting, and sculpture to fashion, graphic, and industrial design. The formal qualities of the cultural zeitgeist are perpetually influencing contemporary architectural aesthetics. In this paper, we aim to introduce a radical yet methodical approach toward regulating the relationship between human agency and computational form-making by using Machine Learning (ML) as a conceptual design tool for interdisciplinary collaboration and engagement. Through the use of highly calibrated and customized ML systems that can classify and iterate stylistic approaches that exist outside the disciplinary boundaries of architecture, the technique allows for machine intelligence to design, coordinate, randomize, and iterate external formal and aesthetic qualities as they relate to pattern, color, proportion, hierarchy, and formal language. The human engagement in this design process is limited to the initial curation of input data in the form of image repositories of non-architectural disciplines that the machine learning system can extrapolate from, and consequently in regulating and choosing from the iterations of images the Artificial Neural Networks are capable of producing. In this process, the architect becomes a curator who samples and streamlines external cultural influences while regulating their significance and weight in the final design. By questioning the notion of human agency in the design process and providing creative license to Artificial Intelligence in the conceptual design phase, we aim to develop a novel approach toward human–machine collaboration that rejects traditional notions of disciplinary autonomy and streamlines the influence of external aesthetic disciplines on contemporary architectural production.
1 Introduction: Disciplinary Autonomy and Cultural Engagement in the Post-digital Era

Since the commonplace integration of digital design and fabrication tools into architectural processes, much of the disciplinary effort has focused on using computation as a tool for automating the production and fabrication of complex and variable formal systems efficiently and economically. The conceptual objective behind these schemes was to liberate architectural production from the Modernist norms of repetition and mass production, allowing more diverse formal languages to emerge. Nevertheless, in most of these parametrically driven design and fabrication processes, the role of computation has traditionally been linear and subservient to the architect's creativity and disciplinary zeitgeist. The objective of the computational process was to iterate on and automate the production of the architect's sketch. Architecture was viewed as an autonomous discipline with its own set of tools and priorities, engaging with the culture of technology almost exclusively through its own disciplinary lens. Its affiliation with outside disciplines has largely been limited to the social sciences and philosophy, mined for metaphorical formal models to justify many of its seemingly novel aesthetic agendas [1]. This tool-centric approach allowed architectural discourse to create many unique formal agendas, yet its dependency on the capabilities of digital modeling tools eventually limited its output and created self-similar results. During this time, the computational process operated as a black box that was not open to outside cultural influences, incapable of making vital intellectual, visual, aesthetic, and conceptual connections to artistic disciplines other than itself. This isolation further limited its future potential and jeopardized its cultural relevance.
2 Contemporary Generative Design Tools: Simulation-Based Procedural Modeling Versus Machine Learning and Artificial Neural Networks

Since the inception of complex simulation tools that can generate and project formal schemes in a non-linear and meta-parametric fashion, we are able to duplicate the behavior of natural phenomena and harness the intelligence of natural systems as design models. Contemporary procedural modeling and animation software can accurately simulate the behavior of these natural systems, such as physics-based particle systems, fluid dynamics, agent-based systems, and so on, and link them with the formal agenda of the architect's choosing [2]. Nevertheless, this type of predictive intelligence that the software provides still aims to create a subservient hierarchy between the tool and its user. Its capabilities as a tool for form-finding are predictable and linear. Contemporary ML tools, however, based on artificial neural networks, allow for a much higher level of generative potential, wielding a level of autonomy and unpredictability never before seen in computational design tools. This type of
new design intelligence, one that is capable of producing its own interpretations as it interpolates between pre-determined computational objectives, provides an immense amount of creative potential [3]. This difference between simulation-based generative modeling and ML-based projective interpolation highlights the differentiation between the performative potential of machine intelligence and its ability to automate form-making processes. Contemporary use of ML, particularly in the context of machine vision and sensor systems, is primarily focused on their interactive potential and their ability to respond to outside contexts such as the environment, human occupation, comfort, and psychology. Through cyberphysical architectural systems that combine static components, responsive robotic components, and media applications controlled by machine vision and sensor networks, we can create programmed behaviors for architecture to interactively and autonomously respond to environmental and occupational contexts [4]. This approach yields productive results in understanding the human presence as it relates to architectural form and motion, but has not resulted in alternative spatial languages that can be deployed and explored further in order to pose alternatives to familiar models of computational form-making, which are heavily constricted by the limitations of parametric design tools. In many cases, the questions of enclosure as it relates to motion, novel material science experiments, and variable formal modulations striking a balance between static and dynamic qualities of space are explored through investigating the historic evolution of such forms in architectural and industrial design, independent of their potential to interact with the world of culture and aesthetics around them. In order to overcome this problem, this paper will focus on the relationship between human agency in design discourse and computational iteration by using machine learning as a generative concept design tool. We will be introducing highly customized machine learning tools in order to classify and iterate stylistic approaches that exist outside the discipline of architecture, allowing ML to design, coordinate, randomize, and iterate qualities as they relate to pattern, color, proportion, hierarchy, and formal language. The human engagement in this design process will be limited to the initial curation of input data that the ML system can learn from, and also to regulating and choosing the iterations, the final two-dimensional image outputs such systems are capable of producing. Considering that contemporary ML systems are limited in their generative abilities to 2D images only, an additional computational mediation process in the form of agent-based systems and procedural modeling is deployed afterwards to convert these images into 3D geometry, further automating and streamlining the formal production. By questioning the notion of human agency in the design process and providing creative license to Artificial Intelligence in the conceptual design phase, we aim to develop a novel approach toward human–machine collaboration that rejects traditional notions of disciplinary autonomy and streamlines the influence of external aesthetic disciplines on contemporary architectural production.
3 Process

3.1 Documentation of Selected Architectural Landmarks

In order to limit the architectural scope of our experimentation, we picked five landmark buildings in downtown Los Angeles as our sites of experimentation. By documenting these selected civic buildings and their urban presence in downtown Los Angeles, we aimed to bypass discussions of urban mass, adjacency, and other architectural and urban concerns that fall beyond the scope of the proposed design method. The existing massing of the selected buildings and their inherent connection to their context served as vessels for our machine learning operations. To achieve an accurate and comprehensive documentation of these buildings, we used existing drone footage from various angles as the basis for our generative machine learning process (Fig. 1).
Fig. 1 Drone footage of architectural landmarks in downtown Los Angeles
Fig. 2 Interdisciplinary aesthetic influences from visual art, sculpture, painting, fashion, and industrial design
3.2 Precedent Research and Selection of Interdisciplinary Aesthetic Influences

Our main cultural objective is to look outside the discipline of architecture to mine for inspiration for new aesthetic paradigms. Looking into the world of film, photography, painting, sculpture, fashion, and industrial design, we asked our students at the Technology Studio in UCLA Department of Architecture and Urban Design, IDEAS Program, to create a detailed repository of images from a particular body of work by an artist or designer who is alive and influential. This library of images later served as the basis for "Style-Transfer" from the image into the massing of the building as documented through the drone footage (Fig. 2).
4 Method

4.1 Artistic Style-Transfer with Convolutional Neural Networks on Architectural Drone Footage

We developed a technical pipeline for artistic Style-Transfer operations in which the influence image is classified as the style input, each frame of the drone footage as the target image, and the newly generated image as the output content (Fig. 3). Based on that workflow, we asked our students at the Technology Studio in UCLA Department of Architecture and Urban Design, IDEAS Program, to apply the technical pipeline
Fig. 3 Artistic style-transfer on architectural drone footage (Image Credit: Lu Geng, Sarah Kim, Qinyung Lu, Rui Ding, Allen Wang, Ying Huang, Hoodeen Hakimian, Kshama Swamy, Haocheng Dai, Yiran Zhou)
of artistic Style-Transfer to the drone footage of the five landmark buildings in downtown Los Angeles. The Style-Transfer algorithm we utilized in this workflow is a machine learning algorithm using Convolutional Neural Networks (CNN), executed through Google's TensorFlow and Nvidia's cuDNN API [5]. The algorithm combines a target image with one or multiple style input images in order to
generate a new output image. The newly generated images respect semantic features of the target and the style input image simultaneously. In this process of Style-Transfer, the content of the target image and the style of the input images are separated and recombined across multiple layers of the neural network in order to extract and deploy specific features. Each of these layers consists of small computational units in the CNN that process visual information and classify it [6]. Through this classification, we are able to distinguish patterns in style and content images and map features from one to the other. In Computer Vision (CV), this is treated as a texture-transfer problem addressed with machine learning and neural networks. We are particularly interested in applying this technique to content images (output) derived from architectural sources (target, drone footage) and stylizing them with artistic images (input), since the process maintains the architectural characteristics of the imagery. For example, the system is capable of recognizing elements such as windows, doors, and roofs and maintains their general morphology while transferring the semantic qualities of the input style image. This technical setup yields a highly accurate architectural definition that consequently respects the initial functional parameters of the building.
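At the core of this separation of content and style is the Gram matrix of the CNN feature maps, which encodes style as correlations between feature channels. The sketch below, in TensorFlow 2 and in the spirit of Gatys et al., is illustrative rather than the exact pipeline used in the studio.

```python
# Gram-matrix style loss: style is represented as channel correlations of the
# activations extracted from selected convolutional layers.
import tensorflow as tf

def gram_matrix(features):
    """features: (batch, H, W, C) activations from one convolutional layer."""
    gram = tf.einsum('bijc,bijd->bcd', features, features)
    n = tf.cast(tf.shape(features)[1] * tf.shape(features)[2], tf.float32)
    return gram / n

def style_loss(style_outputs, generated_outputs):
    """Mean-squared Gram difference, averaged over the chosen layers."""
    losses = [tf.reduce_mean(tf.square(gram_matrix(s) - gram_matrix(g)))
              for s, g in zip(style_outputs, generated_outputs)]
    return tf.add_n(losses) / len(losses)
```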
4.2 Relevance of the Style Image

In our workflow and research, the quality of the input style image is particularly important to control and steer the final result. We discovered that style images with clear patterns and a defined color palette have a much higher success rate when transferred onto content images with an architectural subject. The patterns in the style image can change in scale and density as long as they follow the same set of distinguishable rules. If the style influence is blurred, with no clear regions or boundaries, the result is equally undefined and only the color palette is transferred. This behavior stems from the object detection capabilities of the CNN and the pre-trained VGG-Network [7]. In this regard, we introduced an additional step to the process by editing the style and input images. By experimenting with the contrast ratios, saturations, and color values of the target and input images, we could calibrate the success rate of the output image. Through this process, we can increase the strength of pattern recognition at the level of the CNN and curate the final result toward a particular direction. Further, we discovered that the pixel ratios of the target and input style images can influence the final result. The Style-Transfer algorithm natively transforms the style image into the pixel ratio of the target image. If the format and pixel count of the input style image differ significantly from those of the target image, the style image will be scaled and interpolated. This operation within the algorithm can yield undesirable results in terms of resolution and sharpness. Therefore, the intermediate image-manipulation steps we introduced to the style image prior to executing the transfer heavily influenced the success rate of the generated result.
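A sketch of such a pre-editing step is shown below; the contrast and saturation factors are illustrative defaults, and the center-crop simply matches the style image to the content frame's pixel ratio before the transfer, so that the algorithm does not have to interpolate the style image itself:

```python
from PIL import Image, ImageEnhance

def prepare_style_image(style_path, target_size, contrast=1.3, saturation=1.2):
    """Pre-edit the style image so its patterns survive the transfer.

    target_size: (width, height) of the content (drone-frame) image.
    """
    img = Image.open(style_path).convert("RGB")
    img = ImageEnhance.Contrast(img).enhance(contrast)   # sharpen pattern boundaries
    img = ImageEnhance.Color(img).enhance(saturation)    # strengthen the palette
    # Scale up just enough to cover the target, then center-crop, to avoid
    # the distortion or interpolation the transfer would otherwise apply.
    tw, th = target_size
    w, h = img.size
    scale = max(tw / w, th / h)
    img = img.resize((round(w * scale), round(h * scale)), Image.LANCZOS)
    w, h = img.size
    left, top = (w - tw) // 2, (h - th) // 2
    return img.crop((left, top, left + tw, top + th))
```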
4.3 Style Weight and Multiple Input Styles

Beyond the choice of the style (input) image itself, we can control the number of style influences and their weights within the Style-Transfer algorithm. This gives us the possibility to adjust the trade-off between the target image and the style influence. When more than one style image is applied, the CNN tries to keep the semantic features of all images and generates a hybrid, or a pastiche, between the target and the style input images [5]. Further experimentation with different style weights allowed us to prioritize certain influences in a texture transfer process that is aware of all the semantic features and patterns in an image composition. In this method, each input image used as a style reference is given a particular weight. This alters the final outcome dramatically but is always applied globally to the whole content image. In order to apply multiple style images to selected regions of the content image, we use paired image masks that assign a particular style influence to a specific boundary on the content image (Fig. 4). This method is also referred to as semantic image segmentation in Computer Vision (CV) and adds an additional layer of control to the process [5].
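The two controls described above, global style weights and per-region masks, can be sketched as follows; this is our illustrative reading of the mechanism, not the exact neural-style-tf code [5]:

```python
import tensorflow as tf

def gram(f):
    # (1, H, W, C) -> (C, C), as in the previous sketch
    x = tf.reshape(f, (-1, f.shape[-1]))
    return tf.matmul(x, x, transpose_a=True) / tf.cast(tf.shape(x)[0], tf.float32)

def weighted_multi_style_loss(gen_feats, style_feats_by_name, weights):
    # weights, e.g. {"styleA": 0.7, "styleB": 0.3}, set the trade-off between
    # several influences; applied globally to the whole content image.
    loss = 0.0
    for name, feats in style_feats_by_name.items():
        s = tf.add_n([tf.reduce_mean(tf.square(gram(gen_feats[l]) - gram(feats[l])))
                      for l in feats])
        loss += weights[name] * s
    return loss

def mask_features(feats, mask):
    # mask: (H, W) in [0, 1] from a paired segmentation image, resized to the
    # layer's resolution so a style is only scored inside its assigned region.
    m = tf.image.resize(mask[..., None], feats.shape[1:3])
    return feats * m[None]
```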
Fig. 4 Influence of style weight and multiple input styles (Image Credit: Lu Geng, Sarah Kim, Qinyung Lu, Rui Ding, Allen Wang, Ying Huang, Hoodeen Hakimian, Kshama Swamy, Haocheng Dai, Yiran Zhou)

4.4 Feature Extraction of Architectural Elements

In our process, we were particularly concerned with the detection and recognition of architectural elements in 2D imagery and how formal characteristics are transferred to these elements through artistic Style-Transfer. In our artistic Style-Transfer process, we used the VGG-19 network for object recognition and feature extraction. This deep convolutional neural network was developed and trained by Oxford's Visual Geometry Group (VGG) [8]. This pre-trained neural network has 16–19 layers and was trained on vast datasets that are not specific to architecture and the built environment. Nevertheless, the VGG network has proven to be very effective for general object and feature detection across various datasets [8]. In our process as well, the VGG-19 network proved very effective in detecting and labeling architectural elements as features of the target image. In order to create a feature map from the style images and the target image, we reconstructed the input image from layers “conv1_1”, “conv2_1”, “conv3_1”, “conv4_1”, and “conv5_1” of the original VGG-Network [9]. We discovered that through the use of VGG-19, architectural elements such as windows, doors, facade patterns, patios, setbacks, and architectural ornaments are very well preserved as features and patterns in the output image. This yields precise results when transferring the feature map of a style input image to the target image. We realized that, with regard to architectural elements, contrast and light conditions in the target image are crucial for extracting geometric figures, such as windows and doors as rectangles or facade patterns, as features for the Style-Transfer. Beyond the pure feature extraction of architectural elements through the VGG-19 network, we also discovered that the CNN is capable of detecting and accounting for factors such as scale, perspective, and orientation of architectural surfaces and patterns. Localized structures and patterns are very well perceived and rendered in the style of the reference images. This was a crucial factor for the success of Style-Transfer onto existing architectural buildings and elements (Fig. 5).

Fig. 5 Feature extraction of architectural elements—windows, doors, facade patterns, patios, setbacks, and architectural ornaments (Image Credit: Lu Geng, Sarah Kim, Qinyung Lu, Rui Ding, Allen Wang, Ying Huang, Hoodeen Hakimian, Kshama Swamy, Haocheng Dai, Yiran Zhou)
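The feature-extraction side of this setup can be sketched with a pre-trained VGG-19 as provided by Keras; mapping the original layer names conv1_1 … conv5_1 to Keras's block1_conv1 … block5_conv1 naming is our assumption for this sketch:

```python
import tensorflow as tf

# Layers used for the feature maps, as listed above (Keras naming).
LAYERS = ["block1_conv1", "block2_conv1", "block3_conv1",
          "block4_conv1", "block5_conv1"]

def build_feature_extractor():
    vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
    vgg.trainable = False
    outputs = [vgg.get_layer(name).output for name in LAYERS]
    return tf.keras.Model(vgg.input, outputs)

def extract_features(model, image):
    # image: (H, W, 3) uint8; VGG-19 expects BGR, mean-centered inputs.
    x = tf.keras.applications.vgg19.preprocess_input(
        tf.cast(image[None], tf.float32))
    return dict(zip(LAYERS, model(x)))
```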
4.5 2D Image Translation to 3D Models

In our pipeline, we looked for methods of translating 2D images into 3D geometry with procedural 3D modeling techniques. Procedural modeling is a method of generating and describing 3D geometry through a set of rules. With this method, the geometry is constructed at runtime and can be described through a series of computational instructions [10]. We utilized this method in Houdini, Cinema 4D, Maya, and Rhino's Grasshopper with our students at UCLA in order to establish a direct link from the Style-Transfer process to 3D geometry translation (Fig. 6). More specifically, we automatically linked procedural modeling techniques and mesh operations such as extrusion, loft, height field, tessellation, sweep, revolve, and deform to the 2D imagery. The visual content of the style-transferred images is extracted by the procedural modeling software and translated into architectural 3D models. During this process, we again utilized low-level CV operations within the procedural modeling workflow, such as region detection, image segmentation, and feature extraction. This allows us to automate the translation from 2D images to 3D geometry and create a more coherent link for human–machine collaboration.

Fig. 6 2D image translation to 3D models through procedural modeling (Image Credit: Lu Geng, Sarah Kim, Qinyung Lu, Rui Ding, Allen Wang, Ying Huang, Hoodeen Hakimian, Kshama Swamy, Haocheng Dai, Yiran Zhou)
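As an illustration of the simplest of these rule-based operations, the height field, the sketch below (our own minimal example, not the studio's actual Houdini or Grasshopper definitions) extrudes a grid mesh from pixel luminance:

```python
import numpy as np

def image_to_heightfield_mesh(gray, cell=1.0, height=20.0):
    """Translate a (stylized) 2D image into 3D geometry: pixel luminance
    drives the extrusion height of a grid mesh, one simple instance of the
    rule-based procedural modeling described above.

    gray: (H, W) uint8 grayscale image as a numpy array.
    """
    h, w = gray.shape
    z = gray.astype(float) / 255.0 * height
    verts = [(x * cell, y * cell, z[y, x]) for y in range(h) for x in range(w)]
    faces = []
    for y in range(h - 1):
        for x in range(w - 1):
            a = y * w + x
            faces.append((a, a + 1, a + w + 1, a + w))  # quad face indices
    return verts, faces
```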
4.6 Architectural Style-Transfer on Drone Footage

Since we are applying the Style-Transfer to architectural imagery and translating the 2D images into 3D models, we are interested in capturing a building from more than one elevation. Therefore, we used drone footage as our content images in order to capture all elevations of a building and its context. To achieve such comprehensive documentation, we applied the Style-Transfer to a video sequence that was rendered in full HD on an Nvidia 1080 Ti graphics card with 11 GB of memory. In order to maintain a stable and continuous Style-Transfer across multiple frames, we used the latest loss functions and initializations for Style-Transfer with CNNs [11]. Once we obtained a stable video sequence stylized through the reference images, we were able to reconstruct and generate the 3D model of the building with various techniques of procedural modeling and photogrammetry. In our particular pipeline from 2D stylized images to 3D geometry, we used Cinema 4D's capabilities to reconstruct scenes and 3D models from video footage through photogrammetry [12]. This allows us to directly link the process of 3D generation to the video output of artistic Style-Transfer (Fig. 7). Further, we were able to use Cinema 4D's built-in rotoscoping tools to later visualize and render the stylized 3D geometry composited back into the drone footage (Fig. 8). This process allowed us to show the new style of the building in situ, so that we could regulate the influence of machine learning on the urban performance and presence of the building.

Fig. 7 3D photogrammetric reconstruction from drone footage (Cinema 4D with motion tracking tool)
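A minimal sketch of the per-frame processing is given below; the stylize_frame function stands in for one Style-Transfer optimization, and initializing each frame from the previous stylized result is only the simplest of the video-stabilization strategies described by Ruder et al. [11]:

```python
import cv2

def stylize_video(video_path, stylize_frame, out_pattern="frame_{:05d}.png"):
    """Split drone footage into frames and stylize them sequentially.

    stylize_frame(frame, init): runs one style-transfer optimization,
    warm-started from 'init' (the previous stylized frame, or None).
    """
    cap = cv2.VideoCapture(video_path)
    prev_styled, i = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        prev_styled = stylize_frame(frame, init=prev_styled)
        cv2.imwrite(out_pattern.format(i), prev_styled)
        i += 1
    cap.release()
```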
5 Conclusion

The procedural modeling process of extracting a 3D model from the ML-generated 2D image sequence requires qualitative decisions from the human designer regarding which modeling techniques should be deployed and automated to obtain the most accurate results. As in any design development process, a concept design goes through multiple steps of iteration with large teams; no current technological process or algorithmic scheme is capable of delivering a complete design from start to finish, and the goal of the ML integration is not to fully automate the design process. Our technique challenges the traditional workflow of computational design in the sense that it creates a more dynamic exchange of agency between humans and machines. Previously, the human designer would come up with a concept sketch and would devise computational design methods to automate the design and production of that concept. In this new workflow, the Machine Intelligence comes up with the concept design by considering design priorities determined by the human designer, but the human designer has to devise a combination of manual as well as procedural methods to convert the ML-generated sketch from a 2D concept image into a 3D model. The value judgment, intuition, and experience of the human designer are necessary to devise the most effective computational technique. Therefore, our workflow exemplifies a speculative instance in which Machine Intelligence engages with the design process in an intuitive and creative way for the first time, yet still relies on human collaboration to further the design.

Fig. 8 3D geometry composited back into the drone footage (Image Credit: Lu Geng, Sarah Kim, Qinyung Lu, Rui Ding, Allen Wang, Ying Huang, Hoodeen Hakimian, Kshama Swamy, Haocheng Dai, Yiran Zhou, Kevin Clark, Luo Lei, Ruiyang Cheng, Yixuan Ye, Christina Charalampaki)
References

1. Jarzombek, M. (2016). Digital Stockholm syndrome in the post-ontological age (Forerunners: Ideas First). University of Minnesota Press.
2. Lienhard, S., Lau, C., Müller, P., Wonka, P., & Pauly, M. (2017). Design transformations for rule-based procedural modeling. Computer Graphics Forum, 36, 39–48. https://doi.org/10.1111/cgf.13105
3. Karras, T., Laine, S., & Aila, T. (2018). A style-based generator architecture for generative adversarial networks. CoRR abs/1812.04948. https://arxiv.org/abs/1812.04948
4. Ozel, G. (2016). Toward a postarchitecture. In C. Davidson & G. Lynn (Eds.), Log 36 (pp. 99–105). Anyone Corporation.
5. Smith, C. (2016). neural-style-tf. GitHub, August 03, 2018. Retrieved May 26, 2019, from https://github.com/cysmith/neural-style-tf
6. Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). A neural algorithm of artistic style. arXiv:1508.06576
7. Gatys, L. A., Bethge, M., Hertzmann, A., & Shechtman, E. (2016). Preserving color in neural artistic style transfer. arXiv:1606.05897
8. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. Visual Geometry Group, Department of Engineering Science, University of Oxford. arXiv:1409.1556
9. Gatys, L., Ecker, A., & Bethge, M. (2016). Image style transfer using convolutional neural networks. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. https://doi.org/10.1109/cvpr.2016.265
10. Ullrich, T., Schinko, C., & Fellner, D. (2010). Procedural modeling in theory and practice. Fraunhofer IGD.
11. Ruder, M., Dosovitskiy, A., & Brox, T. (2016). Artistic style transfer for videos. In German Conference on Pattern Recognition (GCPR), LNCS 9796 (pp. 26–36). https://doi.org/10.1007/978-3-319-45886-1_3
12. Kersten, T., Lindstaedt, M., Maziull, L., Schreyer, K., Tschirschwitz, F., & Holm, K. (2015). 3D recording, modelling and visualisation of the fortification Kristiansen in Trondheim (Norway) by photogrammetric methods and terrestrial laser scanning in the framework of Erasmus programmes. ISPRS International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XL-5/W4, 255–262. https://doi.org/10.5194/isprsarchives-xl-5-w4-255-2015
ArchiGAN: Artificial Intelligence x Architecture

Stanislas Chaillou
AI will soon massively empower architects in their day-to-day practice. This article provides a proof of concept. The framework used here offers a springboard for discussion, inviting architects to start engaging with AI, and data scientists to consider Architecture as a field of investigation. In this article, we summarize part of our thesis, submitted at Harvard in May 2019, in which Generative Adversarial Neural Networks (GANs) are leveraged to design floor plans and entire buildings (Fig. 1). We believe that a statistical approach to design conception will shape AI's potential for Architecture. This approach is less deterministic and more holistic in character. Rather than using machines to optimize a set of variables, relying on them to extract significant qualities and mimic them throughout the design process represents a paradigm shift. We can unpack floor plan design into three distinct steps: (I) building footprint massing, (II) program repartition, and (III) furniture layout. Each step corresponds to a Pix2Pix GAN model trained to perform one of the three tasks. By nesting these models one after the other, we create an entire apartment-building “generation stack” while allowing for user input at each step. Additionally, by tackling multi-apartment processing, this project scales beyond the simplicity of single-family houses. Beyond the mere development of a generation pipeline, this attempt aims at demonstrating the potential of GANs for any design process: by nesting GAN models and allowing user input between them, we try to achieve a back-and-forth between humans and machines, between disciplinary intuition and technical innovation.

S. Chaillou (B)
Spacemaker, Oslo, Norway
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020
P. F. Yuan et al. (eds.), Architectural Intelligence,
https://doi.org/10.1007/978-981-15-6568-7_8
Fig. 1 GAN-Generated masterplan. Source Author
1 Representation, Learning, and Framework

Pix2Pix uses a conditional Generative Adversarial Network (cGAN) to learn a mapping from an input image to an output image. The network consists of two main pieces, the Generator and the Discriminator. The Generator transforms the input image into an output image; the Discriminator tries to guess whether an image was produced by the Generator or is an original. The two parts of the network challenge each other, resulting in higher-quality outputs that are difficult to differentiate from the original images. We use this ability to learn image mappings, which lets our models learn topological features and space organization directly from floor plan images. We control the type of information that the model learns by formatting the images. As an example, showing our model only the shape of a parcel and its associated building footprint yields a model able to create typical building footprints given a parcel's shape. We used Christopher Hesse's implementation of Pix2Pix. Figure 2 displays the results of a typical training run. It shows how one of our GAN models progressively learns how to lay out rooms and the position of doors and windows in space (also called fenestration) for a given apartment unit. Although the initial attempts proved imprecise, the machine builds some form of intuition after 250 iterations.
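The adversarial setup can be summarized by the two loss functions below, a minimal sketch following the published Pix2Pix formulation (an adversarial term plus an L1 reconstruction term weighted by 100), rather than Hesse's exact code:

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(disc_real, disc_fake):
    # The Discriminator tries to score (input, original) pairs as real
    # and (input, generated) pairs as fake.
    return bce(tf.ones_like(disc_real), disc_real) + \
           bce(tf.zeros_like(disc_fake), disc_fake)

def generator_loss(disc_fake, generated, target, l1_weight=100.0):
    # The Generator is rewarded for fooling the Discriminator, plus an L1
    # term keeping its output close to the paired ground-truth image.
    adv = bce(tf.ones_like(disc_fake), disc_fake)
    l1 = tf.reduce_mean(tf.abs(target - generated))
    return adv + l1_weight * l1
```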
2 Precedents

The early work of Isola et al. in November 2016, enabling image-to-image translation with their model Pix2Pix, has paved the way for our research.
Fig. 2 Apartment architectural sequence. Source Author
Zheng and Huang in 2018 first studied floor plan analysis using GANs. The authors proposed to use GANs for floor plan recognition and generation using Pix2PixHD [1]. Floor plan images processed by their GAN architecture get translated into programmatic patches of color; inversely, patches of color in their work turn into drawn rooms. If the user specifies the position of openings and rooms, the network lays out furniture elements as well. Nathan Peters' thesis [2] at the Harvard Graduate School of Design tackled the possibility of laying out rooms across a single-family home footprint. Peters' work turns an empty footprint into programmatic patches of color without specified fenestration. Regarding GANs as design assistants, Nono Martinez' thesis [3] at the Harvard GSD investigated the idea of a loop between the machine and the designer to refine the very notion of the “design process”.
3 Stack and Models

We build upon the previously described precedents to create a three-step generation stack. As described in Fig. 3, each model of the stack handles a specific task of the workflow: (I) footprint massing, (II) program repartition, (III) furniture layout. An architect is able to modify or fine-tune the model's output between each step, thereby achieving the expected machine–human interaction.
Fig. 3 Generation stack in three models. Source Author
3.1 Model I: Footprint

Building footprints significantly define the internal organization of floor plans. Their shape is heavily conditioned by their surroundings and, more specifically, the shape of their parcel. Since the design of a housing building footprint can be inferred from the shape of the piece of land it stands on, we have trained a model to generate typical footprints, using GIS data (Geographic Information System) from the city of Boston. We feed pairs of images to the network during training in a format suitable for Pix2Pix, displaying the raw parcel (left image) and the same parcel with a given building drawn over it (right image). We show some typical results in Fig. 4.
Fig. 4 Results of Model I. Source Author
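Assembling such training pairs can be sketched as below; the side-by-side concatenation is the paired-image format commonly used by Pix2Pix implementations, and the 256-pixel size is an illustrative default:

```python
from PIL import Image

def make_training_pair(parcel_img, footprint_img, size=256):
    """Concatenate (raw parcel | parcel with footprint) side by side,
    the paired-image format Pix2Pix expects during training."""
    a = Image.open(parcel_img).resize((size, size))
    b = Image.open(footprint_img).resize((size, size))
    pair = Image.new("RGB", (2 * size, size))
    pair.paste(a, (0, 0))       # left: input condition (raw parcel)
    pair.paste(b, (size, 0))    # right: target (parcel + building footprint)
    return pair
```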
Fig. 5 Results of Model II. Source Author
3.2 Model II: Program

Model II handles program repartition and fenestration. The network takes as input the footprint of a given housing unit produced by Model I, the position of its entrance door (green square), and the position of the main windows specified by the user (red lines). The plans used to train the network derive from a database of more than 800 apartment plans, properly annotated and given in pairs to the model during training. In the output, the program encodes rooms using colors while representing the wall structure and its fenestration using black patches. Some typical results are displayed in Fig. 5.
3.3 Model III: Furnishing

Finally, Model III tackles the challenge of furniture layout using the output of Model II. This model trains on pairs of images, mapping room programs in color to adequate furniture layouts. The program retains the wall structure and fenestration during image translation while filling the rooms with relevant furniture, specified by each room's program. Figure 6 displays some typical results.
Fig. 6 Results of Model III. Source Author

4 UI and Experience

We provide the user with a simple interface at each step of our pipeline. On the left, users can input a set of constraints and boundaries to generate the resulting plan on the right. The designer can then iteratively modify the input on the left to refine the result on the right. Figure 7 showcases this type of interface and process, set up for Model II.
5 Model Chaining and Apartment Building Generation

In this part, we scale the use of GANs to the design of entire apartment buildings. The project uses an algorithm to chain Models I, II, and III one after the other, processing multiple units as single images at each step. Figure 8 shows this pipeline. The challenge of drawing floor plates hosting multiple units marks the difference between single-family houses and apartment buildings. Strategically, the ability to control the position of windows and unit entrances is key to enabling unit placement while ensuring each apartment's quality. Since Model II takes door and window positions as input, the generation stack described above can scale to the generation of entire floor plates. The user is invited to specify the unit split between Model I and Model II, in other words, to specify how each floor plate divides into apartments and to position each unit's entrance door and windows, as well as potential vertical circulations (staircases, cores, etc.). The proposed algorithm then feeds each resulting unit to Model II (results shown in Fig. 9) and then to Model III (results in Fig. 10), to finally reassemble each floor plate of the initial building. The algorithm finally outputs all floor plates of the generated building as individual images.
Fig. 7 Design iterations using Model II interface. Source Author
Fig. 8 Apartment building generation pipeline. Source Author
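The chaining logic reduces to a short orchestration loop. The sketch below is our schematic reading of the pipeline, where model_I, model_II, and model_III wrap the trained networks as image-to-image functions, and unit_splits, openings, and reassemble are hypothetical stand-ins for the user input and image-stitching steps:

```python
def generate_building(parcel, unit_splits, openings,
                      model_I, model_II, model_III, reassemble):
    """Chain the three models over a whole apartment building.

    unit_splits(footprint): yields each floor plate as a list of unit images,
    following the user-drawn division of the plate into apartments.
    openings[unit]: user-specified entrance-door and window positions.
    """
    footprint = model_I(parcel)                 # I: parcel -> building footprint
    floor_plates = []
    for floor in unit_splits(footprint):        # user input: unit split per plate
        units = []
        for unit in floor:
            programmed = model_II(unit, openings[unit])  # II: program + fenestration
            units.append(model_III(programmed))          # III: furniture layout
        floor_plates.append(reassemble(floor, units))    # stitch units back together
    return floor_plates                         # one image per generated floor plate
```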
Fig. 9 Model II output, program for each floor plate. Source Author

Fig. 10 Model III output, furnishing of each individual unit. Source Author

6 Going Further

If generating standard apartments can be achieved with this technique, pushing the boundaries of our models is the natural next step. GANs offer remarkable flexibility for solving seemingly highly constrained problems. In the case of floor plan layout, partitioning and furnishing the space by hand can be a challenging process as the footprint changes in dimension and shape. Our models prove to be quite “smart” in their ability to adapt to changing constraints, as evidenced in Fig. 11. The ability to control the position of each unit's entrance door and windows, coupled with the flexibility of our models, allows us to tackle space planning at a larger scale, beyond the logic of a single unit. In Fig. 12, we scale our pipeline to the generation of entire buildings while investigating our models' reaction to odd apartment shapes and contextual constraints.
Fig. 11 Flexible building layouts. Source Author

7 Limitations and Future Improvements

If the above results lay down the premise of GANs' potential for Architecture, some clear limitations will drive further investigation in the future. First, as apartment units stack up in a multi-story building, we cannot yet guarantee the continuity of load-bearing walls from one floor to the next. Since the internal structure is laid out differently for each unit, load-bearing walls might not be aligned. For now, we consider the façades to be load bearing. However, the ability to specify the position of load-bearing elements in the input of Model II could potentially help address this issue. Additionally, increasing the size of the output layer to obtain larger images with better definition is a natural next step. We want to deploy the Pix2PixHD project developed by NVIDIA in August 2018 to achieve this. Finally, a major challenge comes from the data format of our outputs. GANs like Pix2Pix handle only pixel information. The resulting images produced in our pipeline cannot, for now, be used directly by architects and designers. Transforming this output from a raster image to a vector format is a crucial step in allowing the above pipeline to integrate with common tools and practices.
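As one possible direction for that raster-to-vector step, standard Computer Vision tools can already trace room and wall boundaries into simplified polylines; the sketch below uses OpenCV with illustrative thresholds and is not part of the original pipeline:

```python
import cv2

def raster_to_vector(plan_png, min_area=10.0):
    """Trace wall/room boundaries of a generated plan into polylines,
    one possible first step from pixels toward CAD-ready vectors."""
    img = cv2.imread(plan_png, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    polylines = []
    for c in contours:
        if cv2.contourArea(c) < min_area:
            continue                            # drop pixel noise
        eps = 0.01 * cv2.arcLength(c, True)
        poly = cv2.approxPolyDP(c, eps, True)   # simplify to straight segments
        polylines.append(poly.reshape(-1, 2))
    return polylines
```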
Fig. 12 GAN-enabled building layouts. Source Author

8 Conclusion

AI will soon massively empower architects in their day-to-day practice. As this potential is about to be demonstrated, our work contributes to the proof of concept, while our framework offers a springboard for discussion, inviting architects to start engaging with AI and data scientists to consider Architecture as a field of investigation. Today, our manifesto can be summarized in four major points.

Conceptually first, our belief is that a statistical approach to design conception shapes AI's potential for Architecture. Its less deterministic and more holistic character is undoubtedly a chance for our field. Rather than using machines to optimize a set of variables, relying on them to extract significant qualities and mimic them throughout the design process is a paradigm shift.

Second, we are convinced that our ability to design the right pipeline will condition AI's success as a new architectural toolset. Our choice of the “Grey Boxing” approach, as introduced by Prof. Andrew Witt in Log, will likely secure the best potential results. This method contrasts with the “black box” approach, which only allows users to input information upfront and to get finished design options at the end of the process, without any control over the successive generation steps. To the contrary, by breaking our pipeline into discrete steps, “Grey Boxing” permits the user to intervene all along the way. This tight control over the machine is the user's ultimate guarantee of the quality of the design process.

Third, we technically believe that the sequential nature of the application will facilitate its manageability and foster its development. The ability to intervene throughout the generating process is a fundamental dimension: as each step of the pipeline represents a distinct portion of architectural expertise, each model can be trained independently, opening the way to significant improvements and experimentation in the near future. Indeed, improving this entire pipeline end-to-end could be a long and cumbersome task, while amending it step by step remains a manageable process, within the reach of most architects and engineers in the industry.

Finally, we hope our framework will help address the endless breadth and complexity of the models to be trained and those used in any generation pipeline. Tackling parcels, footprints, room splits, and so on, as we do, is one possible approach among what we believe to be a large set of options. To encapsulate the necessary steps of space planning, the key is the principle more than the method. And with the growing availability of architectural data, we encourage further work and open-minded experimentation. Far from thinking about AI as the new dogma in Architecture, we conceive this field as a new challenge, full of potential and promise. We see here the possibility of rich results that will complement our practice and address some blind spots of our discipline.
References

1. Zheng, H., & Huang, W. (2018). Architectural drawings recognition and generation through machine learning. Cambridge, MA: ACADIA.
2. Peters, N. (2017). Master thesis: “Enabling alternative architectures: Collaborative frameworks for participatory design.” Cambridge, MA: Harvard Graduate School of Design.
3. Martinez, N. (2016). Suggestive drawing among human and artificial intelligences. Cambridge, MA: Harvard Graduate School of Design.
Advanced Timber Construction Platform: Multi-Robot System for Timber Structure Design and Prefabrication

Hua Chai, Liming Zhang, and Philip F. Yuan
Abstract Robotic timber construction has been widely researched in the last decade, with remarkable advancements. While existing robotic timber construction technologies were mostly developed for specific tasks, integrated platforms aiming at industrialization have become a new trend. Through the integration of the timber machining center and advanced robotics, this research develops an advanced timber construction platform with a multi-robot system. The Timber Construction Platform is designed as a combination of three parts: a multi-robot system, a sensing system, and a control system. While equipped with the basic functions of machining centers, allowing the prefabrication of multi-scale, multifunctional timber components, the platform also serves as an experimental facility for innovative robotic timber construction techniques, and as a service platform that integrates timber structure design and construction through real-time information collection and feedback. Thereby, this platform has the potential to be directly integrated into the timber construction industry and to contribute to a mass-customized mode of timber structure design and construction.

Keywords Timber construction platform · Multi-robot system · Sensing system · Control system · Application
1 Introduction

Technology development has always been a source of driving forces in the transformation of the construction industry. The last two decades have witnessed unprecedented development in material science, design methods, and fabrication techniques, which has not only contributed to the increasing complexity of architectural practice, but also gradually led to new interdisciplinary ways of handling that complexity. By introducing multidisciplinary fields such as mechanics, computer science, and material science into architecture, robotic fabrication research manages to integrate material properties, structural performance, fabrication constraints, and construction methods through the computational design process, which points to an integrated approach to architectural design and construction.

Timber has gained extensive attention in this process because of its great advantages in terms of sustainability and its enormous potential in the new technological context [7]. With the increase in practice, complexity in timber structures is gradually becoming more prominent, posing new challenges for design and construction [9]. Complexity lies not only in design and simulation, but also in the fabrication and construction of complex structures, components, and joints, which calls for innovative technologies with high accuracy and efficiency.

H. Chai · L. Zhang · P. F. Yuan (B)
College of Architecture and Urban Planning, Tongji University, 1239 Siping Road, Shanghai, China
e-mail: [email protected]

© Springer Nature Singapore Pte Ltd. 2020
P. F. Yuan et al. (eds.), Architectural Intelligence,
https://doi.org/10.1007/978-981-15-6568-7_9
1.1 Timber Machining Center

At present, large-scale timber machining centers, which constitute the core equipment of timber structure contractors such as Hess Timber, Simonin, and Blumer-Lehmann AG, are the main platforms for the production of complex timber structures. Timber machining centers are developed on the basis of five-axis CNC, with universal milling as the core function, plus auxiliary features including cutting and drilling. As wood-frame construction such as small houses dominated the early timber building market, timber machining centers such as Hundegger in Germany were mostly developed for wood-frame construction [5]. There are also machining centers, such as Technowood in Switzerland, which are mainly developed for heavy wood structures and large panels [10]. The advantage of the timber machining center lies in its extremely high processing efficiency in handling all kinds of geometric complexity.

While timber machining centers and other task-specific computer-aided manufacturing machines manage to handle some of the challenges, they often overlook material efficiency in the milling process and cause a lot of material waste. There are also difficulties in data transmission between the design model and the fabrication machines, impeding the integrity of the architectural process. For example, the Hundegger K2i machining center, which is quite popular among Chinese timber contractors, can only receive special format files made by a limited set of software such as CADWORK and VisKon. VisKon, developed by professional carpenters, is a typical and powerful design software for wood-frame buildings which supports in-depth parametric design of wood structures inside the software, provides detailed information models of all the components, and finally acts as an interface to transmit the necessary data to the CNC for prefabrication [17]. However, as very limited CAD formats are supported for import, VisKon can hardly interface with an external design model developed in other software.

As a free standard format, BTL facilitates data transmission from design software to prefabrication machines to some extent through its “parametric description of the geometry of wooden building components and their processings as well as structural information for prefabrication and assembly” [18]. The BTL format can be read by most timber-specific CAD software and timber machining centers. Generally, software like CADWORK and VisKon supports the export of components in BTL format. In this way, BTL provides the possibility to connect parametric models and timber machining centers. Woodpecker, a Grasshopper plug-in developed by Design-to-Production for LignoCam, provides a series of components that can transfer parametric models in Grasshopper into the BTL format.
1.2 Robotic Timber Construction

In contrast, robotic fabrication technology has demonstrated the ability to handle these complexities with its strengths in terms of flexibility, multi-functionality, large workspace, and accuracy. More importantly, with parametric robot control tools, architects can simulate the fabrication process directly from the design process, blurring the boundaries between design and construction. Robotic timber construction has been widely researched in the last decade and has achieved remarkable advancements. Both additive and subtractive construction technologies have been studied by a large range of research institutions such as Gramazio Kohler Research, the ICD, the AA School, and EPFL [7]. The research shows a wide range of concerns. Different robotic techniques, including milling, sewing, and saw cutting, have been developed for the construction of different timber materials, from logs to all kinds of engineered wood such as glulam and plywood boards. Despite this, it should be noted that existing technologies in robotic timber construction were mostly developed to meet the needs of specific tasks.

With the maturing of robotic techniques for timber construction, large platforms for timber structure prefabrication have started to emerge, bringing the research into the industrial field. The gantry robot at ERNE AG shows great potential, especially in robotic additive timber construction, for large-scale platforms in the timber industry [14]. The gantry robot setup at ETH has also supported a lot of research on bespoke timber structures [11]. TIM, a mobile robotic timber construction platform, was built by the ICD mainly for timber plate structures [13].

The introduction of advanced robot technology has made it possible to develop multifunctional robotic techniques and to introduce material properties into the fabrication process. As Prof. Menges said in his description of cyber-physical making: “As production machines and fabrication robots no longer remain dependent on a clear set of instructions … and are increasingly capable of sensing, processing and interacting with each other and with the physical world in real time, a novel point of convergence of design and materialization is on the horizon.” [6] Recently, advanced robotics research, including machine vision (robot localization [2], material monitoring [1], human–robot collaboration [12]), autonomous path planning [3, 4], and real-time robot control [8], has emerged in the construction field.

In this context, this article describes a multi-robot platform being built, which tries to cope with the challenges in timber construction by making use of the opportunities in advanced robotics. The platform consists of a multi-robot setup, a sensing system, a control system, and fabrication tools. Instead of being a simple construction machine, this platform will allow for innovation in timber structures by integrating multi-source information into the interactions between the design and construction processes.
2 Development of the Advanced Timber Construction Platform

2.1 Purpose

The design and development of the Advanced Timber Construction Platform was initiated in 2015, mainly by Fab-Union together with DDRC, Tongji University [15] (Fig. 1). The initial purpose of the platform was a multifunctional research laboratory for digital design and robotic construction. Therefore, the platform was developed as a compact physical facility which integrated a gantry robot system with tools for different fabrication processes. Serving as a general platform, the initial configuration focused on the physical system, leaving the content of the virtual processes, such as perception, localization, and communication, to each specific experiment.

The timber construction platform is developed on the basis of this existing facility, which needed to be upgraded and refined to meet the demands and vision of timber construction. The platform will serve as, first, a machining center that allows multifunctional, multi-scale prefabrication of timber components; it therefore needs to cover the general functions of a normal machining center. Second, it will serve as an experimental facility for innovative robotic timber construction techniques, and thus should provide basic functions and allow for flexible expansion of the effector set. Third, it will serve as a service platform integrating design and construction, which requires an interface that allows connection and interaction between the design and fabrication processes. Through real-time information interaction, the platform will contribute to a mass-customized design and construction mode for timber structures.

Fig. 1 The initial infrastructure of the Timber Construction Platform
2.2 Structure

In order to achieve this vision, the Timber Construction Platform consists of three parts: the multi-robot system, the sensing system, and the control system with its interfaces (Fig. 2). The multi-robot system is developed by integrating the current multi-robot setup with multifunctional plug-and-use robotic timber fabrication tools. The sensing system achieves material and process monitoring by embedding sensors and machine vision technology. In order to realize an integrated process from design to fabrication, the robotic control system integrates a Grasshopper-based robotic simulation software called FURobot, and a basic “digital twin” system for real-time control and feedback through system modeling and real-time communication.
Fig. 2 The structure of the Timber Construction Platform
2.3 Infrastructure and Tools

The main manufacturing resources of the platform comprise the following.

The gantry robot system, a 3-axis gantry which is 12 m long, 8 m wide and 6 m high. Two KUKA KR120 robots, one master and one slave, hang from the two z-axes of the gantry, which share the same y-axis (Fig. 3). Each robot is equipped with a quick tool changer that integrates the modules that may be required by different tools, including electrical, gas, liquid, and signal, which lays the foundation for flexible combination of tools and for expansion of the tool library. With the KUKA RoboTeam software, the two robots can execute instructions collaboratively. All parts of the gantry robot system are well coordinated and can be controlled from the same control panel.

The material handling tools, a set of robotic effectors customized for both additive and subtractive fabrication of timber structures (Figs. 4 and 5). A gripper integrated with a nail gun is provided for additive fabrication. The tools for subtractive fabrication include a universal milling module with a set of milling bits and a circular saw; a chain saw module for fast slotting of wood components; and a band saw module for curved beam cutting. Quick changers are provided to improve the efficiency of robotic tool changing. Each tool can be mounted on either robot arm; that is, the two robot arms can use and control the same set of tools. The tool library can be expanded as needed.
Fig. 3 The model of the multi-gantry robot system
Fig. 4 The tools library
Fig. 5 Robotic Effectors
2.4 Sensing System

To ensure process transparency and interaction between the control system and the fabrication process, a multi-sensor system is designed to monitor machine condition, material condition, and power consumption. At the core of the perception system are the motion capture devices, which are used for localization of tools and materials. Using high-quality infrared cameras as hardware, the motion capture system records the 3D coordinates of points at high speed, high frame rate, and high resolution. For a point in three-dimensional space, as long as it can be seen by two cameras at the same time, its position in three-dimensional space at that moment can be determined. When the cameras continuously shoot at a sufficiently high rate, the real-time position and motion trajectory of the point can be obtained from the image sequence. In this case, eight infrared motion capture cameras, evenly distributed on both sides of the space, record the coordinates of special reflective marker balls in the platform space at a frequency of 200 Hz. The information is gathered and connected to a wireless router, so the motion analysis software Cortex on the control computer can obtain the data from the wireless router and perform data processing and visualization. A Raspberry Pi running a ROS program receives the data from the motion capture system via Wi-Fi and sends it on to Grasshopper after processing. Using the ROS system on the Raspberry Pi as an intermediary, Grasshopper can visualize the spatial points through several components for subsequent construction operations. As marker points can be added and arranged at low cost, this system has a wide range of applications.

Despite the high speed and accuracy of the motion capture system, there are also some shortcomings. As points can only be identified in the overlapping fields of view of the cameras, the area that can be identified is smaller than the robots' reach. As the cameras can be affected by sunlight in the field of view, the system imposes strict restrictions on the lighting conditions of the laboratory space. Nevertheless, the motion capture system can be used for component localization in most construction scenarios (Fig. 6).
Fig. 6 Motion capture system
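The Raspberry Pi's bridging role can be sketched as a small ROS node; the topic name, message type, and the UDP address of the Grasshopper host are assumptions for illustration, since they depend on the Cortex-to-ROS driver and listener components actually in use:

```python
#!/usr/bin/env python
import socket
import rospy
from geometry_msgs.msg import PointStamped

UDP_TARGET = ("192.168.1.50", 6000)   # hypothetical address of the Grasshopper host

def on_marker(msg, sock):
    # Forward each marker coordinate as a plain "x,y,z" datagram that a
    # Grasshopper UDP-listener component can parse and visualize.
    p = msg.point
    sock.sendto("{:.4f},{:.4f},{:.4f}".format(p.x, p.y, p.z).encode(), UDP_TARGET)

if __name__ == "__main__":
    rospy.init_node("mocap_bridge")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Topic name is an assumption; the motion-capture driver in use defines it.
    rospy.Subscriber("/mocap/marker", PointStamped, on_marker, callback_args=sock)
    rospy.spin()
```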
In addition to vision sensors, most tools will be equipped with sensors to optimize the machining process. The motor current is monitored with a current transformer, which is connected to the robot through the PLC and used to optimize the energy consumption and the material feed rate (or robot processing speed). In the future, the information from the sensors is intended to be collected into a multi-source, heterogeneous information database. According to different analysis requirements, such as error analysis or deformation and temperature correlation analysis, specific data can be extracted from the database and fused to establish different analytical models.
2.5 Control System

The core of the control system is the Grasshopper plug-in FURobot, a universal online and offline programming software for multiple brands of robots, developed by Fab-Union (Fig. 7). By supporting multiple functions such as robotic simulation, real-time control, and programming of specific fabrication techniques, FURobot is committed to building a digital twin system for robotic fabrication. A virtual model of the robotic platform and tools has been built into FURobot, and different sensors will also be introduced later. As for the connectivity of the robot system, the robots and tools are interconnected via the PLC, and a host computer can control the whole system in real time. At the current state, the parametric design model on a personal computer can communicate with the platform through the KUKA software RSI (Robot Sensor Interface).

A programming toolkit for robotic timber fabrication called Tech-wood is built into FURobot. Tech-wood is used for path generation for the basic robotic timber fabrication processes corresponding to the tools of the robot platform. The components include band saw cutting, chain saw slotting, circular saw cutting, drilling, and several components for milling. As robotic milling is still the dominant method for timber fabrication, some basic path planning algorithms for 2D contour, 3D contour, and 3D surface milling are packed into Tech-wood. Further, some basic path generation tools are also being developed for different timber components and joint types. By combining different components during the parametric design process, Tech-wood can directly transform geometric information into robotic paths (Figs. 7 and 8).
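As an illustration of what such a path-generation component does internally, the sketch below (our own simplified example, not FURobot code) generates a zigzag 3D-surface milling path from a parametric surface callback:

```python
import numpy as np

def raster_milling_path(surface, rows=20, cols=100):
    """Zigzag (raster) tool path over a parametric surface patch.

    surface(u, v) -> (x, y, z) evaluates the target face for u, v in [0, 1];
    'rows' sets the number of passes (the stepover), 'cols' the sampling
    density along each pass. A post-processor would then convert the ordered
    points into robot targets (position plus tool axis).
    """
    us = np.linspace(0.0, 1.0, cols)
    path = []
    for row, v in enumerate(np.linspace(0.0, 1.0, rows)):
        line = [surface(u, v) for u in us]
        if row % 2 == 1:
            line.reverse()   # alternate direction for a continuous zigzag
        path.extend(line)
    return path

# Example: a gently curved test surface, 1000 x 500 mm
if __name__ == "__main__":
    path = raster_milling_path(lambda u, v: (u * 1000, v * 500, 30 * np.sin(np.pi * u)))
    print(len(path), "path points, first:", path[0])
```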
Fig. 7 The interface of FURobot

Fig. 8 Path generation and robotic simulation of different processes with FURobot

3 Application Scenarios

The efficiency of large-scale industrial production comes from a fixed factory configuration and a single application scenario. Unlike industrial products, every building is different in terms of construction requirements. To meet the needs of customized construction, the robot platform needs to be reconfigurable easily and quickly to adapt to different projects. At the same time, the robot platform can adapt to a variety of application scenarios: not only can traditional construction needs be met, but innovation can also be achieved through novel applications of the tools or the development of new ones. Compared with timber machining centers, the robotic platform is specialized without compromising flexibility. General construction requirements can be met through the combination of single or dual robots with various tools. The gripper can be used to perform additive assembly tasks, while the spindle and saws can be used for subtractive operations such as joint cutting and freeform milling. The two robots can perform different tasks separately and cooperate with each other to realize automated processing and assembly.

During the gradual development and transformation process, the platform has been used for a series of construction projects, including the “robotic timber tectonics” projects for Digital FUTURE 2016, the “Sewn Timber Shell 2017” for Design Society Shenzhen, and the “Nomadic Wood(looking)” project for the Bi-City Biennale of Urbanism/Architecture (Shenzhen). The Nomadic Wood(looking) project will be taken as an example to show the capabilities and possibilities of the robot platform, and how the collaboration of different tools on this platform makes innovative timber structures possible.
3.1 Nomadic Wood(Looking)

Fabricated with robotic timber construction technology, this project is dedicated to showcasing innovative design and construction techniques in architecture as part of the Bi-City Biennale of Urbanism/Architecture in Shenzhen. The installation is designed as nine freeform curved wood beams intertwined with each other to stimulate bodily behaviors such as sitting and crossing (Figs. 9 and 10).

While planar curved glulam can easily be produced in the factory, there were still no efficient solutions for the production of freeform curved glulam. Robotic band saw cutting techniques have been proven a feasible solution for producing curved wood beams without immense time consumption [16]. In this project, planar curved glulam is used as the raw material for the production of freeform curved beams. Each beam is divided into eight parts. After each part is processed, the parts are connected into a continuous curved beam through a specially designed interlocking joint. Each of the three-dimensional curved parts is cut out of a two-dimensional curved beam of minimum volume, produced by a timber construction contractor (Fig. 11).

The fabrication of all the components was done on this robot platform. Making full use of the advantages of dual-robot collaborative construction, the two robots are responsible for band saw cutting and joint milling, respectively: the master robot is equipped with the spindle, while the slave robot operates the band saw effector. Six cuts are needed for each component, from raw material to finished product. Four side surfaces are cut with the band saw as ruled surfaces, and the joints at the two ends are milled with the spindle. The fabrication sequence is designed as follows: after the raw beam is fixed on the processing table and located, the slave robot first cuts the top and bottom sides with the band saw; the beam is then flipped 90 degrees and re-located; the slave robot cuts the third surface; the joints at the two ends are then milled by the master robot with the spindle; after which the slave robot moves back and cuts the last surface. Seventy-two components were fabricated in one month and assembled in three days (Fig. 12).

By employing multi-tool and multi-robot collaboration, this project demonstrates the robot platform's ability to handle complex tasks through multi-process integration. In this way, the timber construction platform can not only realize the functions of a traditional machining center, but also provide possibilities for innovation in timber structure construction (Figs. 13, 14 and 15).

Fig. 9 Nomadic Wood(looking) installation design

Fig. 10 The design process of the Nomadic Wood(looking) installation

Fig. 11 Robotic simulation of joint milling in Nomadic Wood(looking)
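Since each side surface is a ruled surface, the band-saw blade sweeps along straight ruling lines between the two edge curves of the face. A minimal sketch of sampling those rulings (our illustration, not the project's actual FURobot component) is shown below:

```python
import numpy as np

def bandsaw_rulings(rail_a, rail_b, n=50):
    """Sample the ruling lines of one side surface of a curved beam.

    rail_a, rail_b: (m, 3) numpy arrays of points along the two edge curves
    of the face. Each returned pair (p, q) is one straight ruling line,
    i.e. one position of the band-saw blade as it sweeps the ruled surface.
    """
    def sample(rail, s):
        # arc-length parametrized, piecewise-linear interpolation along the rail
        d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(rail, axis=0), axis=1))]
        d /= d[-1]
        return np.array([np.interp(s, d, rail[:, k]) for k in range(3)])

    return [(sample(rail_a, s), sample(rail_b, s))
            for s in np.linspace(0.0, 1.0, n)]
```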
Fig. 12 The relationship between the raw beam and a component of the installation
Fig. 13 Joint milling with master robot and spindle
Fig. 14 Side surface cutting with slave robot and band saw effector

Fig. 15 Nomadic Wood(looking)

4 Discussion and Outlook

This paper describes the systematic design of a robotic timber construction platform. The platform integrates the advantages of the timber machining center and robotic fabrication technologies to form an advanced platform for timber structure innovation. The development, which is still in progress, is largely guided by the principles of cyber-physical systems, in which information acquisition and feedback stand at the core of the platform. The hardware, software, and techniques are all interrelated through information. In this way, the platform has the potential to be directly integrated into the future timber construction industry.

The target of the platform development is not full timber construction automation, but a degree of automation that enables the combination of timber properties and advanced robotics. Based on the platform, robotic timber construction techniques need to be developed through process organization, parameter adjustment, quality control, etc. to meet the mass-customization requirements of timber structures. Toward an integrated timber design method, this platform will be used to establish an integrated design workflow that synthesizes material properties, structural performance, and fabrication constraints, and can adjust in real time according to the fabrication information. Several key technical issues still need to be addressed in the development process; in particular, the integration and coordination of the multi-robot platform, sensing system, and effectors remains a great challenge for achieving systematic stability.

Acknowledgements This research is supported by the National Key R&D Program of China (Grant No. 2018YFB1306903).
References

1. Bard, J., Bidgoli, A., & Chi, W. W. (2018). Image classification for robotic plastering with convolutional neural network. In Robotic fabrication in architecture, art and design (pp. 3–15). Springer.
2. Brugnaro, G., Baharlou, E., Vasey, L., & Menges, A. (2016). Robotic softness: An adaptive robotic fabrication process for woven structures. In Posthuman frontiers: Data, designers, and cognitive machines, proceedings of the 36th conference of the Association for Computer Aided Design in Architecture (ACADIA). Ann Arbor.
3. Dubor, A., Camprodom, G., Diaz, G. B., Reinhardt, D., Saunders, R., Dunn, K., Niemelä, M., Horlyck, S., Alarcon-Licona, S., & Wozniak-O'Connor, D. (2016). Sensors and workflow evolutions: Developing a framework for instant robotic toolpath revision. In Robotic fabrication in architecture, art and design 2016. Springer.
4. Huang, Y., Carstensen, J., Tessmer, L., & Mueller, C. (2018). Robotic extrusion of architectural structures with nonstandard topology. In Robotic fabrication in architecture, art and design (pp. 377–389). Springer.
5. Hundegger. (2019). Product overview. Available at https://www.hundegger.de/en/machine-building/company.html. Accessed 19 Mar 2019.
6. Menges, A. (2015). The new cyber-physical making in architecture: Computational construction. Architectural Design, 85, 28–33.
7. Menges, A., Schwinn, T., & Krieg, O. D. (2016). Advancing wood architecture: A computational approach. Routledge.
8. Munz, H., Braumann, J., & Brell-Cokcan, S. (2016). Direct robot control with mxAutomation: A new approach to simple software integration of robots in production machinery, automation systems, and new parametric environments. In Robotic fabrication in architecture, art and design 2016. Springer.
9. Scheurer, F. (2010). Materialising complexity. Architectural Design, 80, 86–93.
10. Technowood. (2019). TW solutions. Available at https://www.technowood.ch/de/solutions. Accessed 29 Mar 2019.
11. Thoma, A., Adel, A., Helmreich, M., Wehrle, T., Gramazio, F., & Kohler, M. (2018). Robotic fabrication of bespoke timber frame modules. In Robotic fabrication in architecture, art and design 2018 (pp. 447–458). Springer.
12. Vasey, L., Nguyen, L., Grossman, T., Kerrick, H., Schwinn, T., Benjamin, D., Conti, M., & Menges, A. (2016). Human and robot collaboration enabling the fabrication and assembly of a filament-wound structure. In Proceedings of the 36th annual conference of the Association for Computer Aided Design in Architecture (ACADIA 2016): Posthuman frontiers: Data, designers, and cognitive machines (pp. 184–195).
13. Wagner, H. J. (2018). Introducing TIM—a mobile robotic timber construction platform. Available at https://icd.uni-stuttgart.de/?p=23427. Accessed 29 Mar 2019.
14. Willmann, J., Knauss, M., Apolinarska, A. A., Gramazio, F., & Kohler, M. (2016). Robotic timber construction—expanding additive fabrication to new dimensions. Automation in Construction, 61, 16–23.
15. Yuan, P., & Meng, H. (2016). Fab-Union: A collective online to offline robotic design platform. Architectural Design, 86, 52–59.
16. Chai, H., & Yuan, P. F. (2018). Investigations on potentials of robotic band-saw cutting in complex wood structures. In Robotic fabrication in architecture, art and design (pp. 256–269). Springer, Cham.
17. Weto AG. (2020). Viskon—the new generation of visual construction. Available at https://www.weto.com/viskon.html. Accessed 01 Jan 2020.
18. Design2machine. (2020). Welcome. Available at https://design2machine.com/index.html. Accessed 01 Jan 2020.
Iterative Intelligence
Bioinspired Generative Architectural Design: Form-Finding and Advanced Robotic Fabrication Based on Structural Performance

Ding Wen Bao, Xin Yan, Roland Snooks, and Yi Min Xie
Abstract Due to its potential to generate forms with high efficiency and elegant geometry, topology optimization is widely used in architectural and structural design. This paper presents a workflow of form-finding and robotic fabrication based on the BESO (Bi-directional Evolutionary Structural Optimization) method. Where there are additional functional requirements or condition limitations, some useful modifications are also implemented in the process. With this workflow, it is convenient to anticipate or control the direction of the structural optimization before the optimization process. Furthermore, some fabrication details of the optimized model are discussed, because there are many notable technical points between computational optimization and robotic fabrication.

Keywords Generative design · Form-finding · BESO method · Robotic fabrication · Topological optimization · 3d printing · Pavilion
1 Introduction

Throughout the history of architecture, there has always been a close relationship between the development of traditional architectural forms and the evolution of structural morphology. From Roman to Gothic architecture, the dominant prototype shifted from the barrel vault to the pointed arch, enabled by the technology of the flying buttress. Similarly, after the Renaissance, many physical analysis methods were developed to help architects achieve more structurally complex architectural forms, such as graphic statics and

D. W. Bao · X. Yan · Y. M. Xie (B) Centre for Innovative Structures and Materials, School of Engineering, RMIT University, Melbourne, VIC, Australia e-mail: [email protected]

D. W. Bao · R. Snooks (B) School of Architecture and Urban Design, RMIT University, Melbourne, VIC, Australia e-mail: [email protected]

X. Yan Centre of Architecture Research and Design, University of Chinese Academy of Sciences, Beijing, China
inverted hanging models by Antonio Gaudí. Furthermore, with Modernism, new construction techniques in glass and steel made the International Style popular around the world. Recently, with the rapid development of computational techniques, the topic of form-finding based on structural performance has gained new momentum [1]. Not only have researchers focused on structural form-finding methods; many architects have also been attracted to them, because they can generate forms of high structural efficiency and elegant shape potential. Among the form-finding methods, the topology optimization method of ESO (Evolutionary Structural Optimization), developed by Y. M. Xie and G. P. Steven in 1993 [2], and its modified version, known as BESO (Bi-directional Evolutionary Structural Optimization), published in 2006 [3], are widely implemented in architectural practice, for example in the Qatar National Convention Centre and the Shanghai Himalayas Center. With the further development of Ameba, a new Grasshopper (GH) plug-in based on the BESO algorithm by Y. M. Xie and his team [4], more and more architects and designers will have the opportunity to use a new intelligent method to work interactively with the computer, to create innovative, efficient and organic architectural forms, and to facilitate mass customization in the construction industry through advanced 3D printing technologies, such as large-scale robotic 3D printing and the hybrid fabrication strategies developed by Roland Snooks and his research team in the RMIT Architectural Robotic Lab. The concept of topological optimization and the inspiration of Gaudí's Sagrada Família Basilica are reflected in the pavilion form-finding and its optimization. The new approach to generative architectural design and fabrication is introduced in this project, which explores the architectural implications of topological optimization design through robotic 3D fabrication.
2 Basic Theory of BESO Method

The BESO algorithm aims to find the solution with the highest structural performance under certain material limitations by removing or adding material elements step by step. The basic problem can be described mathematically as follows:

$$\min\; C(\mathbf{X}) = \frac{1}{2}\,\mathbf{U}^{T}\mathbf{K}\mathbf{U} = \frac{1}{2}\sum_{i=1}^{N} x_{i}^{p}\,\mathbf{u}_{i}^{T}\mathbf{k}_{i}\mathbf{u}_{i} \tag{2.1}$$

$$\text{subject to}\quad V(\mathbf{X}) = \sum_{i=1}^{N} x_{i} v_{i} \le V^{*} \tag{2.2}$$
In which C, U, K, V* and X are the objective function (compliance), displacement vector, global stiffness matrix, objective volume and global design variable vector, respectively. The terms v_i, x_i, k_i and u_i are the volume, design variable, stiffness matrix and nodal displacement vector of the ith element. Furthermore, there are only two alternative values for x_i in BESO: 1 for a solid element or a prescribed value x_min for a void element. For the stiffness optimization problem, the sensitivity α_i of the ith element, which is the criterion for the design variable x_i, can be calculated as the gradient of the compliance with respect to the design variable [5]:

$$\frac{\partial C(\mathbf{X})}{\partial x_{i}} = -\frac{1}{2}\,p\,x_{i}^{p-1}\,\mathbf{u}_{i}^{T}\mathbf{k}_{i}\mathbf{u}_{i} \tag{2.3}$$

$$\alpha_{i} = -\frac{1}{p}\frac{\partial C}{\partial x_{i}} = \begin{cases} \dfrac{1}{2}\,\mathbf{u}_{i}^{T}\mathbf{k}_{i}\mathbf{u}_{i} & \text{when } x_{i} = 1 \\[2mm] \dfrac{x_{\min}^{p-1}}{2}\,\mathbf{u}_{i}^{T}\mathbf{k}_{i}\mathbf{u}_{i} & \text{when } x_{i} = x_{\min} \end{cases} \tag{2.4}$$

When the penalty coefficient p tends to infinity, the sensitivity number α_i becomes the one used in the hard-kill BESO method [5], i.e.

$$\alpha_{i} = \lim_{p \to \infty}\left(-\frac{1}{p}\frac{\partial C}{\partial x_{i}}\right) = \begin{cases} \dfrac{1}{2}\,\mathbf{u}_{i}^{T}\mathbf{k}_{i}\mathbf{u}_{i} & \text{when } x_{i} = 1 \\[2mm] 0 & \text{when } x_{i} = x_{\min} = 0 \end{cases} \tag{2.5}$$
The above sensitivity is usually modified to solve the mesh-dependency problem [6, 7] using a filtering scheme:

$$\tilde{\alpha}_{i} = \frac{\sum_{j=1}^{N} w_{ij}\,\alpha_{j}}{\sum_{j=1}^{N} w_{ij}} \tag{2.6}$$

$$w_{ij} = \max\!\left(0,\; r_{f} - d_{ij}\right) \tag{2.7}$$

in which d_ij is the distance between the centres of the jth and the ith elements, r_f is the filter radius and α_j is the original sensitivity of the jth element. To achieve a convergent solution, a historical average of the sensitivities in different iterations, ᾱ_i, is also introduced [8], i.e.

$$\bar{\alpha}_{i} = \frac{\tilde{\alpha}_{i}^{(n)} + \tilde{\alpha}_{i}^{(n-1)}}{2} \tag{2.8}$$
In the BESO method, the element sensitivities are ranked in each iteration to determine a threshold from the target volume of the next iteration, V^(n), which is defined based on the current volume V^(n−1) and the evolutionary ratio δ:

$$V^{(n)} = V^{(n-1)}\,(1 - \delta) \tag{2.9}$$
The threshold is then used to decide whether an element's state should change: if a solid element's sensitivity is lower than the threshold, its design variable is switched from 1 to x_min; conversely, the design variable of a void element is switched from x_min to 1 if its sensitivity is higher than the threshold.
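To make this update rule concrete, the following minimal Python sketch (an illustrative reconstruction, not the authors' Ameba code) performs one BESO iteration step: it shrinks the target volume by the evolutionary ratio as in Eq. (2.9), ranks the filtered sensitivities, and toggles the design variables around the resulting threshold. The function name and arguments are assumptions for illustration.

```python
import numpy as np

def beso_step(alpha, vol, v_prev, er, x_min=1e-3):
    """One BESO update: rank sensitivities, toggle design variables.

    alpha  : (N,) filtered, history-averaged sensitivity numbers
    vol    : (N,) element volumes
    v_prev : current total material volume
    er     : evolutionary ratio (delta in Eq. 2.9)
    """
    v_target = v_prev * (1.0 - er)                 # Eq. (2.9)

    # Keep the highest-sensitivity elements until the volume budget is met;
    # the sensitivity of the last kept element is the threshold.
    order = np.argsort(alpha)[::-1]                # descending sensitivity
    cum_vol = np.cumsum(vol[order])
    cut = np.searchsorted(cum_vol, v_target)
    threshold = alpha[order[min(cut, len(order) - 1)]]

    # Solid elements below the threshold become void, void ones above it
    # become solid again (the "bi-directional" part of BESO).
    x_new = np.where(alpha >= threshold, 1.0, x_min)
    return x_new, v_target
```

In a full loop this step alternates with an FEA solve that recomputes the sensitivities from the updated material distribution, until the volume target is reached and the compliance converges.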
3 Form-Finding Based on BESO

In this work, a pavilion structure is introduced to discuss the details of BESO form-finding for architecture. There are three main steps in the form-finding process: 1. model definition, 2. BESO topology optimization iterations, 3. post-modification. To generate an optimized model with not only high structural performance but also the characteristics needed to meet functional requirements or aesthetic preferences, many detailed modifications deserve close attention during the process.
3.1 Model Definition

The BESO method is an FEA (Finite Element Analysis)-based iterative process. The BESO sensitivity number, which is the criterion for adding or removing elements, is calculated from the FEA results. Therefore, defining an appropriate FEA model is fundamental to the BESO algorithm.
3.1.1 Initial Geometry
Before generating the FEA mesh, the initial geometry should be made as accurate as possible if a rough objective form exists. Theoretically, the BESO objective is to find the best material distribution within a solution domain, and the space of possible initial material distributions can be narrowed by modifying the initial geometry to fit the objective form. In this pavilion design, inspired by Gaudí's Sagrada Família Basilica, the concept is to generate a tree-like structural form (Fig. 1). If the initial geometry were designed as a solid block (Fig. 2a), the BESO algorithm would generate the form with the highest structural stiffness shown in Fig. 2b, which is far from the design concept. Therefore, to produce the branch-like supports and leave
Fig. 1 Natural tree branch (left), columns in Sagrada Familia Basilica (middle) and one structure in this pavilion (right)
Fig. 2 The initial geometry without modification (a) and its BESO optimized model (b)
enough space for visitors to go through, the initial geometry should be modified with some cavities and thin columns (Fig. 3).
3.1.2 Mesh Discretization
Another way to manually predesign the BESO result is to modify the calculation mesh. In the BESO method for continuum structures, the main element types of the FEA calculation mesh are solids for block structures and shells for surface structures, respectively. For this pavilion design, considering the symmetric geometry and boundary conditions of the structure, only a quarter of the model is meshed with solid elements in the discretization process.
Fig. 3 The modified initial geometry (a) and its BESO optimized model (b)
Fig. 4 Initial design mesh (a) and BESO optimized model (b) without modification
In the BESO method, the structure evolves through element addition and deletion, so the element size, together with a properly chosen BESO filter radius, influences the level of structural detail. In other words, the smaller the element size and the filter radius are, the more detail can be generated using BESO. Furthermore, some mesh modifications can also help to control the final model. For example, a void near the symmetry plane is imposed to interrupt certain force paths (Fig. 5) and prevent the unsafe horizontal beam (Fig. 4) from occurring.
3.1.3 Material Property
The material property settings are another important aspect of FEA modelling. Different material properties can also influence the BESO result indirectly. In the finite element analysis, the material is assumed to be homogeneous. For homogeneous, isotropic,
Fig. 5 Initial design mesh (a) and BESO optimized model (b) with modification
and linearly elastic materials, there are two main parameters: Young's modulus and Poisson's ratio. For a model with only one material, the BESO results change when different Poisson's ratios (Fig. 6) or Young's moduli (Fig. 7) are assigned, and Poisson's ratio correlates more closely with the BESO result than Young's modulus does.
Fig. 6 BESO result with different Poisson’s ratios (0.15 for the left, 0.30 for the middle and 0.45 for the right)
Fig. 7 BESO result with different Young’s moduli (0.01 GPa for the left, 1 GPa for the middle and 100 GPa for the right)
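As background to why these two parameters matter, the sketch below builds the standard 6 × 6 constitutive matrix of a homogeneous, isotropic, linearly elastic solid from E and ν; it is this matrix that enters each element stiffness matrix k_i in the FEA. This is a generic textbook construction, not the specific solver used by the authors.

```python
import numpy as np

def isotropic_elasticity_matrix(E, nu):
    """6x6 constitutive matrix D of a homogeneous, isotropic, linearly
    elastic 3D solid (Voigt notation: xx, yy, zz, xy, yz, zx)."""
    lam = E * nu / ((1 + nu) * (1 - 2 * nu))   # Lame's first parameter
    mu = E / (2 * (1 + nu))                    # shear modulus
    D = np.full((3, 3), lam)
    np.fill_diagonal(D, lam + 2 * mu)
    D = np.block([[D, np.zeros((3, 3))],
                  [np.zeros((3, 3)), mu * np.eye(3)]])
    return D

# Scaling E scales D (and every element stiffness) uniformly, whereas
# changing nu alters the coupling between normal strains, which helps
# explain why Poisson's ratio reshapes the optimized topology so strongly.
print(isotropic_elasticity_matrix(1.0e9, 0.30))
```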
154
D. W. Bao et al.
However, for a model with multiple materials, different relative Young's moduli can be assigned purposely to generate different forms. For example, Fig. 8 shows a control experiment on façade topology optimization. The initial design domain is divided into two parts, a non-design domain and a design domain. One bottom corner is fixed in all three displacement directions, and the non-design domain is loaded with uniform pressure. The optimized structures vary significantly with the material assigned to the non-design domain. Specifically, the following figures show the results for different Young's modulus values of the non-design domain; it can be concluded that as the non-design domain's Young's modulus decreases, the area at the top of the branch structures increases to hold the softer material (Fig. 9).
Fig. 8 Initial FEA model settings
Fig. 9 BESO results with different Young's moduli of the non-design domain: (a) 100 GPa, (b) 10 GPa, (c) 1 GPa, (d) 0.1 GPa, (e) 0.01 GPa
Fig. 10 Initial model with point load and its BESO result
3.1.4 Load Case and Boundary Condition
Unlike the geometry definitions above, the load case and boundary conditions define the forces in the FEA process, and they can significantly influence BESO results.
3.1.5 Load Case
The load case and boundary conditions are the two aspects of FEA that define the force field in which the model is located, and some noteworthy points describe the relationship between BESO results and this force field. In architectural design, concentrated loads and distributed loads are the two load types usually used to define load cases. For a concentrated load, especially a point load, material tends to gather around the local area where the load acts, forming a local structure in the load direction to support it. A distributed load, especially pressure, usually acts on a surface that is treated as a non-design domain in BESO and is supported by branches in the final BESO results (Figs. 10 and 11).
3.1.6 Boundary Condition
It is well known that a point in 3D space has six degrees of freedom: three displacements and three rotations. Boundary conditions describe which of these directions are fixed at the model boundaries. The boundary conditions should be set based on the physical conditions around the model. However, when designing a form, the BESO results can differ under various boundary conditions. For example, the BESO algorithm may generate structures in certain directions to resist the displacements or rotations of the boundary if there is not any
Fig. 11 Initial model with pressure load and its BESO result
Fig. 12 BESO results with different boundary conditions
constraint in that direction. In the pavilion design, if the bottom corners are fixed only in the z direction, a ring beam at the bottom is generated to resist the horizontal displacements of the bottom. However, if the bottom points are pinned in all three displacement directions, the ring beam becomes unnecessary and is avoided by BESO (Fig. 12).
3.2 BESO Topology Optimization Iterations

Besides the above details of FEA model definition, the BESO algorithm also provides users with algorithm constraints and parameters to modify the designs. In the past 10 years, the topic of modifying the topology optimization method to solve specific problems, such as generating symmetric or periodic structures, printing concrete and reserving functional parts, has attracted much attention. In the form-finding process of this pavilion, modifications concerning the non-design domain and
symmetric constraint have been introduced, and adjustments of the BESO parameters, such as the filter radius (FR), evolutionary ratio (ER) and volume fraction (VF), are also considered.
3.2.1 BESO Parameters
The main BESO parameters are the evolutionary ratio (ER), filter radius (FR) and volume fraction (VF), which control the number of elements changed per iteration, the sampling range for averaging the sensitivity numbers and the volume of the final model, respectively.
3.2.2 Evolutionary Ratio (ER)
With different ER values, the topology optimization process completes at different times, and the results can also differ significantly. This is because, with a large evolutionary ratio, too many elements change in each iteration to reach the globally optimized structure. In traditional topology optimization theory, an ER value smaller than 5%, and as small as possible, is suggested for obtaining effective structures. However, for designers the globally optimized structure is not always necessary, and changing the ER value is a simple way to generate diverse locally optimized results with similar structural performance, only slightly below the global best (Table 1 and Fig. 13).

Table 1 Iterations and compliances of BESO processes with different ER values

             ER = 4%   ER = 2%   ER = 1%
Iteration    83        167       307
Compliance   0.622     0.616     0.603
Fig. 13 The BESO results with different ER values (from left to right 4, 2, 1%)
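The relationship between ER and run length follows directly from Eq. (2.9): the volume shrinks geometrically, so the number of iterations needed to reach a target volume fraction grows roughly as ln(VF)/ln(1 − ER). The sketch below is illustrative only; the target of VF = 5% is an assumption, and the counts in Table 1 also include extra convergence iterations after the target volume is reached.

```python
import math

def iterations_to_target(vf_target, er):
    """Iterations for the volume to fall from 1.0 to vf_target
    under the geometric update of Eq. (2.9)."""
    return math.ceil(math.log(vf_target) / math.log(1.0 - er))

for er in (0.04, 0.02, 0.01):
    n = iterations_to_target(0.05, er)
    print(f"ER = {er:.0%}: about {n} iterations to reach VF = 5%")
```

Halving ER roughly doubles the iteration count, which matches the trend of Table 1 (83, 167 and 307 iterations), while the slightly lower compliance at small ER reflects the finer-grained search.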
Fig. 14 The BESO results with different FR values (from left to right 16, 24, 32 and 40 mm)
Table 2 Iterations and compliances of BESO processes with different FR values

             FR = 16 mm   FR = 24 mm   FR = 32 mm   FR = 40 mm
Iteration    82           83           96           106
Compliance   0.591        0.622        0.654        0.797
3.2.3 Filter Radius (FR)
The filter radius (FR) is vital in predesigning the BESO result. In topology optimization theory, the filter radius is introduced to solve the checkerboard problem. However, judging from the appearance of the final results, the filter radius can also be used to predetermine the minimum size of structural details. As the following figures show, the BESO results of the same model differ with different filter radii (Fig. 14 and Table 2).
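The following minimal sketch (an illustrative reconstruction, not Ameba's implementation) applies the filtering scheme of Eqs. (2.6)–(2.7): each element's sensitivity is replaced by a distance-weighted average of its neighbours within the filter radius r_f, which is what suppresses checkerboard patterns and enforces a minimum feature size.

```python
import numpy as np

def filter_sensitivities(alpha, centers, r_f):
    """Eqs. (2.6)-(2.7): distance-weighted averaging of sensitivities.

    alpha   : (N,) raw element sensitivities
    centers : (N, 3) element centroid coordinates
    r_f     : filter radius
    """
    # Pairwise centroid distances d_ij and linear hat weights w_ij
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    w = np.maximum(0.0, r_f - d)          # Eq. (2.7)
    return (w @ alpha) / w.sum(axis=1)    # Eq. (2.6)
```

A larger r_f averages over more neighbours, which is why the optimized members in Fig. 14 grow thicker, and the compliance in Table 2 rises, as FR increases.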
3.2.4 Volume Fraction (VF)
The volume fraction (VF) defines how much material remains, and it understandably has an obvious influence on BESO results. However, two points should be treated carefully. First, for some models the VF value cannot be too small, lest the whole structure collapse for lack of material. Second, at the same VF value a shell element model develops see-through openings more readily than a solid element model, while the solid element model can indicate which parts should be thicker than others (Figs. 15 and 16).
3.2.5 Algorithm Constraint
The three parameters above are the standard parameters of the traditional BESO algorithm, and there is also much research on algorithm modifications
Fig. 15 The BESO result of the top surface with solid elements
Fig. 16 The BESO result of the top surface with shell elements
Fig. 17 The initial domain’s settings of the pavilion
in topology optimization. For the pavilion design in this paper, the two main algorithm constraints are as follows.
3.2.6 Non-Design Domain
For some functional requirements, there are always local parts that should be preserved during the topology optimization process. Therefore, the BESO method permits users to set a non-design domain in the initial model, which takes part in the FEA calculation but is preserved through the subsequent optimization iterations. To generate the tree-like structures of the pavilion, the initial domains are set as in Fig. 17.
3.2.7 Symmetric Constraint
Because of its symmetric character, this pavilion needs to be kept symmetrical during the iterations. However, numerical errors or odd numbers of void elements may introduce asymmetries into the model. BESO therefore provides a constraint function that keeps the model symmetrical throughout (Figs. 18 and 19).
Fig. 18 The BESO result without symmetric constraint
Fig. 19 The BESO result with symmetric constraint
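One common way to enforce such a constraint, sketched below as an assumption rather than Ameba's documented behaviour, is to average the sensitivities of mirrored element pairs before ranking, so that both halves always receive identical add/remove decisions.

```python
import numpy as np

def symmetrize_sensitivities(alpha, mirror_index):
    """Enforce a symmetry constraint by averaging mirrored pairs.

    alpha        : (N,) filtered sensitivities
    mirror_index : (N,) index of each element's mirror partner across
                   the symmetry plane (elements on the plane map to
                   themselves)
    """
    return 0.5 * (alpha + alpha[mirror_index])
```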
Fig. 20 The smooth mesh workflow in Ameba
Fig. 21 The pavilion generation process
3.3 Post-Modification

It is easy to see that the rough mesh model optimized by the BESO method cannot satisfy the aesthetic and fabrication requirements of architecture. As a result, the optimized mesh model should be modified carefully after the optimization for the subsequent fabrication work.
3.3.1 Mesh Smoothing
For finite element analysis, the calculation mesh is composed of fundamental elements, such as triangles or quadrilaterals for shells and cubes or tetrahedra for solids. As a result, the BESO model is always a mesh with a coarse, irregular surface. Fortunately, the Grasshopper plug-in Ameba has very strong mesh optimization functions to deal with that problem. With the help of the Ameba mesh tools, it is easy to obtain a smooth mesh model for the subsequent fabrication work [9] (Figs. 20 and 21).
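As a generic illustration of what such smoothing does (Ameba's actual algorithm is not documented here), the sketch below applies simple Laplacian smoothing: each vertex is pulled toward the average of its neighbours, relaxing the stair-stepped surface left by the voxel-like elements.

```python
import numpy as np

def laplacian_smooth(vertices, neighbors, iterations=10, lam=0.5):
    """Generic Laplacian mesh smoothing (not Ameba's actual algorithm).

    vertices  : (V, 3) vertex positions
    neighbors : list of V index lists, the one-ring of each vertex
    lam       : step size toward the neighbourhood centroid (0..1)
    """
    v = vertices.copy()
    for _ in range(iterations):
        centroids = np.array([v[nbrs].mean(axis=0) for nbrs in neighbors])
        v += lam * (centroids - v)   # pull each vertex toward its one-ring mean
    return v
```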
3.3.2 Over-Hanging
In this pavilion work, large-scale 3D printing techniques are implemented. The current technique is limited when printing large overhang angles without any supporting material, so the model should be modified to avoid steep overhangs. The maximum printable overhang angle is 32°.
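A hedged sketch of how such a constraint can be checked on the smoothed mesh (an illustration, not the authors' tool): measure each face's inclination from the vertical build direction and flag faces steeper than the 32° limit.

```python
import numpy as np

MAX_OVERHANG_DEG = 32.0   # printable limit reported for this setup

def overhanging_faces(face_normals, max_overhang_deg=MAX_OVERHANG_DEG):
    """Flag downward-facing mesh faces steeper than the overhang limit.

    face_normals : (F, 3) unit normals; the build direction is +Z.
    The overhang of a downward-facing face is 90 deg minus the angle
    between its normal and straight down (-Z): 0 deg for a vertical
    wall, 90 deg for a horizontal ceiling.
    """
    down = np.array([0.0, 0.0, -1.0])
    cos_to_down = face_normals @ down
    angle_to_down = np.degrees(np.arccos(np.clip(cos_to_down, -1.0, 1.0)))
    overhang = 90.0 - angle_to_down           # negative for upward faces
    return overhang > max_overhang_deg        # True = needs redesign/support
```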
3.3.3 Fine-Tuning Based on the Feedback of FEA Analysis
Once the form of the pavilion was finalized, it was imported into Abaqus for finite element analysis to obtain more accurate structural performance feedback. This feedback helped to re-test and fine-tune the form, fixing structural defects and ensuring better structural performance while keeping the basic generated geometry (Fig. 22).
4 Advanced Robotic Fabrication

The digital pavilion structure (Fig. 23 left) was finalized using the generative method of topological optimization. For fabrication and construction it was further developed into three main parts: a transparent 12 mm acrylic top panel, 3D-printed main structural bodies and a timber base (Fig. 23 right).
4.1 Application of KUKA Robotics

The Architectural Robotic Lab (Fig. 24) sits within the RMIT University School of Architecture and Urban Design. Directed by Associate Professor Roland Snooks, it leads the school's development of architectural robotic research and advises on the development of its infrastructure. The lab consists of nine industrial robots ranging in scale from a large KUKA KR 150 mounted on a five-metre track down to small UR10
Fig. 22 FEA analysis: (a) displacement, (b) Mises stress, (c) strain energy density
Fig. 23 The digital model of the innovative pavilion for fabrication
Fig. 24 RMIT Architectural Robotic Lab
robots. Currently, the main robotic research is primarily focused on large-scale 3D printing of polymers. Roland Snooks and his team have developed a series of innovative 3D printing technologies to build several pilot projects and large-scale prototypes, such as the Monash SensiLab (2017) and the NGV Floe pavilion (2018) [10].
4.1.1 Advantages of Robotic 3D Printing
The innovative technology combining a KUKA KR 150 six-axis robot with a 3D polymer printing extruder is applied to printing large-scale prefabricated building
components. The folding, corrugated, translucent printed polymer components can refract intricate patterns of light and create varying transparencies. The X-Form 1.0 pavilion was printed in 8 pieces with screw-free joint connections (Fig. 25). This updated 'start–stop' 3D polymer printing approach is a development of the previous one-curve continuous printing path and achieves the aim of printing fractal-like geometries. The total printing time was 64 h, covering four upper-part tree columns and four lower-part base columns (Fig. 26).
Fig. 25 The eight pieces of components of the pavilion structure
Fig. 26 The digital model of fractal-like structure and grasshopper simulation
Fig. 27 The updated start–stop script based on grasshopper KUKA PRC
4.2 Modified Printing Path Code for Fractal-Like Geometries

4.2.1 Start–Stop Script Development
Due to the tree branches’ system of the pavilion columns, the new start–stop script can achieve the aim of printing fractal-like forms. The script is originally written in C# code by Roland Snooks’s research team (Fig. 27).
4.2.2 Import the Printing File into the KUKA Machine and Run Printing
Firstly, the geometry needs to be imported into Rhino/Grasshopper. The code then converts the geometry into a mesh, slices it in the Z direction and generates the printing file 'prc_kuka.src' based on the printing parameters (speed, layer height, temperature, etc.). The generated .src file is imported into the KUKA controller and the machine is run (Fig. 28).
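The slicing step can be sketched as follows. This is a hedged, generic reconstruction in Python (the authors' pipeline uses Grasshopper, KUKA|prc and C#), computing the planar contours where the mesh triangles cross each layer plane.

```python
import numpy as np

def slice_mesh(vertices, faces, layer_height):
    """Slice a triangle mesh into planar contour segments along Z.

    vertices : (V, 3) array of vertex positions
    faces    : (F, 3) array of vertex indices
    Returns a list of (z, segments) pairs, where segments holds the
    point pairs of the mesh's intersection with the plane at height z.
    """
    z_min, z_max = vertices[:, 2].min(), vertices[:, 2].max()
    layers = []
    z = z_min + 0.5 * layer_height
    while z < z_max:
        segments = []
        for tri in vertices[faces]:                # (3, 3) triangle corners
            pts = []
            for a, b in ((0, 1), (1, 2), (2, 0)):
                za, zb = tri[a, 2], tri[b, 2]
                if (za - z) * (zb - z) < 0:        # edge crosses the plane
                    t = (z - za) / (zb - za)
                    pts.append(tri[a] + t * (tri[b] - tri[a]))
            if len(pts) == 2:
                segments.append((pts[0], pts[1]))
        layers.append((z, segments))
        z += layer_height
    return layers
```

Each layer's segments would then be chained into closed polylines and converted into robot targets, with travel moves between disconnected branches realized as the extruder's 'stop' phases.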
4.3 Printing Parameters

Material behaviour during the printing process is an important factor affecting the quality of the result. Uncertainty is hard to avoid because of uncontrollable factors such as interior humidity, temperature and old versus new plastic; however, the risk can be reduced through repeated experiments and data recording. The main factors are printing speed, layer height, bead size, extrusion temperature and the purging step. As the table in Fig. 29 shows, printing stability is significantly affected once the printing speed exceeds 200 mm/s; the Z height affects the stability and speed of printing; the bead size determines the thickness of the extrusion; and one of the most influential parameters is the extrusion temperature, which significantly affects the transparency of the result. The most successful results were obtained with 60 mm/s speed, 2.8 mm Z height, 4.2 bead size and 210 °C extrusion temperature [10]. The purging step is important because it cleans the nozzle and ensures that the fused polymer extrudes evenly and smoothly (Fig. 30).
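These settings can be captured as a small configuration object with a sanity check against the reported stability limits; the field names and the check are illustrative assumptions, with the values taken from the text above.

```python
from dataclasses import dataclass

@dataclass
class PrintParams:
    speed_mm_s: float = 60.0       # print speed (stability degrades > 200 mm/s)
    layer_height_mm: float = 2.8   # Z height per layer
    bead_size_mm: float = 4.2      # extrusion bead width
    nozzle_temp_c: float = 210.0   # extrusion temperature (drives transparency)
    purge_before_print: bool = True

    def check(self):
        """Warn about settings outside the experimentally stable range."""
        if self.speed_mm_s > 200.0:
            print("warning: printing speed above 200 mm/s degrades stability")
        if not self.purge_before_print:
            print("warning: skipping the purge risks uneven extrusion")

PrintParams().check()   # the parameter set reported as most successful
```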
Fig. 28 The process of robotic large 3D printing using updated code
Fig. 29 Testing results of polymer materials with various parameters
Fig. 30 Printing examples with various qualities (from left to right: bad to good)
Fig. 31 Plug-in joint design for connecting lower part and upper part structures
4.4 Joint Design and Assembly Methodologies

Two types of joint design are applied to this pavilion. The first is a plug-in joint without any screws: it conveniently connects the lower and upper parts of the structures by printing an internal offset layer and inserting the lower-part tubes into the upper-part tubes (Fig. 31). The second joint design is applied to the connection between the top panel and the structural branches. Instead of screws, nails or glue, white reusable cable ties are threaded through holes reserved in the 3D-printed structures and the laser-cut acrylic top panel (Fig. 32). Thanks to the efficient fabrication process of large-scale robotic 3D printing in the lab, assembling the whole pavilion took only one hour for five students supervised by the authors: the students lifted the upper parts of the structures up to 1 m, then the authors moved the lower-part columns into the corresponding locations and plugged them into the upper-part columns (Fig. 33).
Fig. 32 White plastic cable ties are applied to connect the top panel and the structure
Fig. 33 On-site assembly process
5 Conclusion and Future Works

The paper explores the integration of emerging technologies in digital design and advanced manufacturing, respectively topological optimization-based generative architectural form-finding and advanced large-scale robotic 3D printing fabrication. In this work, a pavilion is introduced to demonstrate the combination of new design and construction techniques and to explain the design and construction process (Fig. 34). Pavilion X-Form 1.0 is an experimental prototype that tested the important role the optimized structure plays in architectural form-finding (Fig. 35). The Bi-directional Evolutionary Structural Optimization (BESO) method provides not only
Fig. 34 The final built pavilion (2 m × 2 m × 2.5 m)
Fig. 35 The X-Form 1.0 pavilion in the Digital FUTURES 2019 exhibition (From left to right: Feng ‘Philip’ Yuan, Wen Jun Zhi, Dingwen ‘Nic’ Bao, Mark Burry, Yi Min ‘Mike’ Xie, Xin Yan, Tong Yu Sun)
an efficient structure but also an elegant architectural form. The integrated technologies have the potential to serve the building industry thanks to their capability of producing large-scale free-form architectural components with high structural performance and efficient material use. In further studies, major barriers to the implementation of these technologies in the building industry will be addressed so that this new technology can be applied widely to mass-customized design and manufacturing. The follow-up research project X-Form 2.0 will investigate more complex topological optimization form-finding, a curved top panel and more advanced 3D printing techniques.

Acknowledgments The authors would like to thank several colleagues whose support helped fulfil the research project described in this paper:
• Professor Feng 'Philip' Yuan (Archi Union, Fab Union, DigitalFUTURES, Tongji University)
• Professor Yi Min 'Mike' Xie (Centre for Innovative Structures & Materials, RMIT University)
• Associate Professor Roland Snooks (School of Architecture and Urban Design, RMIT University)
• Dr. Jiawei Yao, Dr. Xiang Wang, Miss Reina Zhewen Chen (Tongji University).
References

1. Huang, X., & Xie, Y. M. (2010). Evolutionary topology optimization of continuum structures: Methods and applications. New York: Wiley.
2. Xie, Y. M., & Steven, G. P. (1993). A simple evolutionary procedure for structural optimization. Computers & Structures, 49(5), 885–896.
3. Huang, X., Xie, Y. M., & Burry, M. C. (2006). A new algorithm for bi-directional evolutionary structural optimization. JSME International Journal Series C Mechanical Systems, Machine Elements and Manufacturing, 49(4), 1091–1099.
4. Zhou, Q., Shen, W., Wang, J., Zhou, Y. Y., & Xie, Y. M. (2018). Ameba: A new topology optimization tool for architectural design. In: Proceedings of IASS Annual Symposia (Vol. 2018, No. 19, pp. 1–8). International Association for Shell and Spatial Structures (IASS).
5. Huang, X., & Xie, Y. M. (2010). A further review of ESO type methods for topology optimization. Structural and Multidisciplinary Optimization, 41(5), 671–683.
6. Sigmund, O., & Peterson, J. (1998). Numerical instabilities in topology optimization: A survey on procedures dealing with checkerboards, mesh-dependencies and local minima. Structural Optimization, 16(1), 68–75.
7. Sigmund, O. (2007). Morphology-based black and white filters for topology optimization. Structural and Multidisciplinary Optimization, 33(4–5), 401–424.
8. Huang, X., & Xie, Y. M. (2007). Convergent and mesh-independent solutions for the bi-directional evolutionary structural optimization method. Finite Elements in Analysis and Design, 43(14), 1039–1049.
9. https://ameba.xieym.com.
10. Bao, D. W., Yan, X., Snooks, R., & Xie, Y. M. (2019). Design and construction of an innovative pavilion using topological optimization and robotic fabrication. In: Proceedings of IASS Annual Symposia (Vol. 2019, pp. 474–481). International Association for Shell and Spatial Structures (IASS).
A Question of Style

Style, Artificial Intelligence, and Architecture

Matias del Campo, Sandra Manninger, and Alexandra Carlson
At the beginning of the twentieth century, the German architect and writer Hermann Muthesius (Fig. 1) proposed to rid the architecture discipline entirely of the term Style and to replace it with the notion of Type [1]. This was an attempt to divorce architecture from its pompous nineteenth-century Historicism, a Style that identified itself with the imitation and at times amalgamation of various historical styles. The grandeur of the Ringstraße in Vienna can be considered one of the prime specimens of this development, including a parliament in the Greek style [2], a city hall in a Flemish Neo-Gothic style [3] and museums in Neo-Renaissance style [4]. The formation of the Deutsche Werkbund [5] (Fig. 3) in 1907 moved the debate on the difference between Style and Type further. In contrast to Style, which is connoted with particular, identifiable features that position buildings in a specific timeframe and culture, Type speaks of the result of basic geometries, specific production technologies, and materialities. The time was calling for an association that did not specialize in any specific architecture (rural, urban, domestic, official, religious, or otherwise) but rather created a frame for the entirety of living (Fig. 2). The motto of the Werkbund says it all: 'vom Sofakissen zum Städtebau' (from the sofa cushion to urban design), and reflects the notion of the 'Gesamtkunstwerk', the all-encompassing piece of art. The ideas circulating at this time (also within other associations such as the Wiener Werkstätte or the Arts and Crafts movement in the UK) most certainly drew from Gottfried Semper's desire to reinvent the

M. del Campo (B) · S. Manninger (B) Taubman College of Architecture and Urban Planning, University of Michigan, Ann Arbor, USA e-mail: [email protected]

S. Manninger e-mail: [email protected]

A. Carlson Michigan Robotics, University of Michigan, Ann Arbor, USA e-mail: [email protected]
Fig. 1 Anna and Hermann Muthesius drinking tea around 1900. Wiki Commons
Fig. 2 Charles Jencks, evolutionary diagram, 2000
Fig. 3 Fritz Hellmut Ehmcke: Deutsche Werkbund Ausstellung Köln. Courtesy MOMA
discipline in the face of the impact of industrialized fabrication versus craftsmanship, as well as its methods of materialization. Semper, possibly the first materialist thinker in the discipline, was obsessed with the feedback loop between material and human ingenuity. It would be interesting to hear what he would have to say when facing the possibilities of a contemporary, post-human design universe. Muthesius' concept of the Type proposed that form presides over function, and that instead of inventing ever new styles and shapes, architects and designers should rely on standardized shapes ('Typisierung'), such as platonic solids, as the starting point for every kind of design, whether industrial design, architecture, or urban design. Although Muthesius is considered one of the ideological fathers of the Bauhaus, he was profoundly suspicious of it, as he considered the Bauhaus nothing but yet another Style [6], undermining his attempts to rid the discipline of the burden of Style.

…Now that the century is over it is time to ask what it meant for architecture. This is a harder question than it first appears. Did Modern architecture, as its apologists claim, triumph over other contenders? What was the relationship of commercial practice to quality—did the best architects lead or only influence the profession? Did good architecture trickle-down or was it dumbed down? Or did a hundred mini movements tell the real story of the century; or was it like that of the past, one of spec builders, the DIY industry and self-build?…
… It is true that certain architects of the previous century—how strange those words ring for Modernists—were creative forces that lasted for longer. Mies was a power to be reckoned with in the '20s and '60s. Le Corbusier, Frank Lloyd Wright, and Aalto, who with Mies made up the big four, were seminal at more times and Kahn, Stirling, Eisenman, and Gehry, the little four, each had two small periods of influence. But even these protean characters, in order to stay relevant and on top, had to reinvent themselves about every 10 years….Usually when historians look at the recent past they do so with the eyes and taste that rigidly exclude the variety, contradictions, mess, and creative wealth of a period, and we applaud them for so doing. All history writing is selective and based on theories of what really matters, and
there is no way around this limitation. But there are ways to compensate for perspectival distortion and over the last 30 years I have devised a method, the evolutionary tree, which if it is not completely inclusive is at least balanced in its selective effects. .…. As can be seen in the classifiers to the extreme left of the diagram, it is based on the assumption that there are coherent traditions that tend to self-organize around underlying structures. These deep structures, often opposed to each other psychologically and culturally, act like what are called, in the esoteric science of nonlinear dynamics, “attractor basins”: they attract architects to one line of development rather than another. Why? Not only because of taste, training, education and friendships, but because of typecasting and the way the market forces architects to have an identifiable style and skill. In a word, specialization…. [7]
Ever since the Style versus Type debate [8], the term Style has been viewed with suspicion in the discipline. The entire twentieth century mainly tried to avoid the term at all costs. However, if we take Muthesius' criticism of the Bauhaus seriously, we can state that the twentieth century, too, was profoundly driven by questions of style. From the Bauhaus to the Deconstructivists, Style reigned supreme in the twentieth century as well (Fig. 2).
1 A Closer Look into the Suspicious Noun Style

First and foremost, there are several ways to explain the noun style in a consensual manner. These definitions apply across a series of different disciplines and criteria. For example, in the fabrication industry style is understood as a particular designation, or the title of a machine or machine part. In literature it pertains to a distinctive manner of expression: just think of realist literature, romantic literature, or the flowery style of eighteenth-century poetry. Style can also be applied to human behavior with all its mannerisms, tics, and physiological and psychological behavioral features, all of which are encapsulated in a distinct style of behavior; conduct is a main expression of personal style: courteous, discreet, abrasive. Of course, the term is also associated with questions of manner and technique, in that things are done or created in a unique style (just recall, for example, the attribution of paintings of unknown origin with denominations such as 'in the style of Caravaggio' (Fig. 4)), and finally the term is applied to things that are fashionable: to care about aspects of AI in architecture is in style at the moment. The complexity of the term Style consists in the unusual weight and flexibility of the concept itself. In essence, the concept defines the basic rulesets of artistic achievement and excellence. The term Style itself is a latecomer to the examination of artistic endeavor and is still fiercely debated to this very day. The etymology of the term in various languages, such as the French and English Style, the Italian Stile and the German Stil, can be traced back to one Latin root: Stilus. The Latin stilus is a tool in the form of a stake or pike that transmuted in common understanding into 'a pointed tool for writing or engraving wax tablets.' [9] During the period of the Roman Empire this tool was fabricated from metal, bone, or ivory, with a pointed end to scratch letters into
Fig. 4 The Card Players, follower in the style of Caravaggio, Harvard Museum
the soft wax of the tablet, and a flattened end that allowed the wax to be smoothed again, effectively erasing the text. The metonymic transformation of the term from the purely descriptive, denoting a tool, to a metaphor describing the qualities of an orator (and thus the formal qualities of art, its style) occurred already in antiquity, for example through the works of Marcus Tullius Cicero [10]. A pointy stick showed the way to Style. An instrument of warfare, siege, and violence, the stake, the pike, turned into an expression capable of identifying the changing characteristics and the passage of time in architectural production. In a way it is befitting that a bellicose tool of leaguer, which in itself is a form of architecture, turned into the term describing (scribe-stylus-style!) the morphology of architectural production. The Roman Stilus, however, is also the ancestor of the writing stylus, the historic tool of the trade of the architecture discipline, whether for the penning of timeless architectural rulesets such as Vitruvius' books, or the literal scratching of
Fig. 5 Scratched Gothic drawings on the framing ground of the York Cathedral—the “Reissboden". In Oberösterreichische Heimatblätter, 37. Jahrgang, Heft 2, 1983, P. 128
plans on floors (Fig. 5). The German words Grundriss (plan) and Aufriss (elevation) still contain the word Riss, denoting the origin of scratching plans onto the plaster floors of medieval building huts and workshops using, you guessed it, pointed sticks. Ultimately the Stilus, the ancient Roman writing tool that paved the way to the term Style, creates a bridge to our contemporary writing of code, in which the written word, the lines of code, serves as a model for the accurate description of spatial, architectural conditions. The term Style in this frame of conversation can be divided into two specific cases. Case one is the idea of Style in Architecture. Case two is the use of the term in Computer Science. In combination, these two instances form the frame of this essay on the emergence of novel considerations of style in the architecture discipline through the application of Deep Learning/neural networks, which are a biologically inspired subset of machine learning and vision algorithms. More specifically: through the adoption of Style Transfer as an algorithmic design technique.
2 How Algorithms Define Style

Traditionally, architects are trained during their studies to operate like data miners: every new project is based on the hundreds and thousands of images ingested during the training received in architecture school. However, it is not only about mining; what goes beyond the ability to simply ingest imagery is the inherently human ability to perform pattern recognition and apply these patterns to achieve complex and abstract tasks. The human mind is particularly avid at recognizing events and objects and separating foreground from background, and it can even recognize that an error or mistake harbors the potential for a creative solution to a problem. How can this abstract and rather hard-to-grasp pattern-finding process be harnessed to quantify style? This is where the neural network's learned features, the salient pixel patterns it has learned within a given image, come into play. In the same way that an architect must observe instances/images of different architectural styles and elements to learn which of these features are unique to specific architectures, a neural network must be trained by observing millions of images and learning which pixel patterns, i.e., visual features, of an image are relevant to the task it is trying to solve. We can use these learned feature sets from neural networks to achieve novel representations of form and style for standard architectural motifs. For example, if given the task of image or object classification, what is the set of features learned for 'arches' by a neural network? Are there specific patterns to arches that the human visual system disregards when it performs classification but that a neural network utilizes? In deep learning, the 'style' of an image is synonymous with 'texture,' which describes the underlying 2D patterns that result from how the shape and material of objects in a scene interact with the light and camera to form the image. Continuing the line of thought from above, we can use trained neural networks to successfully
quantify and define textures within images without human intervention or bias, and thus transfer the sense of style into other, novel spaces. I will now give a brief overview of how neural networks learn features. Artificial neural networks are computing systems that are designed to loosely emulate the hierarchical structure of the human visual cortex. A neural network is composed of information processing nodes, called neurons, that are organized into groups, called layers, based upon how they connect to other nodes within the network. Input information flows through a neural network in a feed-forward, hierarchical manner: each neuron receives input from neurons in the preceding layer and transforms it into a new representation via a nonlinear function, which acts as a threshold that filters for the relevant information captured by its input. This new representation becomes the input to the neurons it is connected to in the succeeding layer. Since each layer operates on the activations of the previous layer, neurons can extract and detect groups of input features that become increasingly complex. The set of filtering transformations a neural network performs on images (and consequently the novel ways the network represents salient visual information captured by the images) are learned directly from the image pixel intensities within the training dataset; humans do not specify these features beforehand. For example, in an image classification task, a neural network learns to transform an input image into a new representation by decomposing it into a set of visual features that makes the image's semantic content easy to classify as, for example, 'Door' or 'Column.' The visual features that comprise this new image representation could be textural, like marble, concrete, glass, or metal, or pertain to geometry and shape, like curves or corners. Thus, the 'Column' class may be represented by a set of long vertical line features combined with stone textural features, whereas the 'Door' class could be represented by a set of corner and wood features. These visual features are extracted sequentially by the network, where the first network layers filter out simple lines, edges, and textures, and the later network layers filter out sets and combinations of these features, such as corners. The final network layer predicts the semantic class label, e.g., 'Column', based upon the set of features extracted from the image by the preceding layers. In this example, the neural network is trained for a discriminative task and functions as a prediction/classification machine. For this kind of task, the network learns only to model and capture the visual information that maximally differentiates the semantic classes present in the dataset. Neural networks can also be taught to perform generative tasks like image rendering. In the case of image generation, neural networks learn the statistics of visual information over all possible images in the input dataset. Conceptually, this means that the network learns the set and frequency of pixel patterns that define the training set of images. Independent of the task, neural networks learn how to represent images in terms of color, texture, and geometric structures. These representations can be used to perform image manipulations that result in a unique design. It is key to collect a large dataset that captures all possible variance of such styles so the neural network can learn accurate
and representative visual patterns. The dataset used to train a neural network can be thought of as the 'world' the network exists in. We can then use these learned patterns as a way to mathematically represent an image by decomposing it into its base spatial/geometric, style, or texture features. In the following paragraphs, we discuss the specifics of two style transfer techniques: first, Neural Style Transfer (NST), and second, Generative Adversarial Networks (GANs). The objective of NST is to alter a given input image so that it captures the style of a second, 'style guide' image without altering the original content, i.e., the geometric/spatial structure of the input image. As previously described, an input image can be decomposed into specific visual features by projecting it into a given network layer, i.e., transforming it into the set of visual features learned by that layer. The network layer representation of the image not only provides information as to what type of visual features are present in the image, but also where they occur within the image. Thus, we can change the pixel values of our input image such that the network's representation of its style features, like texture and color, resembles the network's representation of the style features of the guide image, while making sure that the network's representation of structural features in the input image, such as outlines of buildings or edges, remains unaltered. This technique gives us a quantifiable metric of style that can be used to probe how the nature of buildings, and other architectural components, is decomposed and represented in this style space. The objective of a GAN, in comparison, is to model the visual statistics of the real world as defined by the training dataset such that it can generate novel, unseen images in the style of the training dataset. In a GAN framework, there are two competing neural networks. The generative neural network, called the generator, is trained by a second neural network, called the discriminator. The generator network renders candidate images, and the discriminator evaluates them by comparing the rendered images to real images (from the training dataset). Both networks are trained simultaneously, and they are adversaries to one another. The generator's training objective is to 'fool' the discriminator by producing novel synthesized images that appear to have come from the set of real images. The discriminator, conversely, learns to detect patterns within the generated data that do not occur in real images. With this adversarial training strategy, the generator learns the distribution over real-world visual information, which results in the production of novel, photorealistic images that trick the discriminator. When a GAN is trained on a specially curated dataset, it learns the textural, color, and geometric properties that connect and define the visual space between the images in the dataset. Conceptually, this means that the generator network learns to 'interpolate' between the images in its training set by combining different pixel patterns it has learned (which can be textures or geometric structures) to 'imagine' previously unrealized images that incorporate elements/visual features across all the styles in the training dataset.
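As a concrete and deliberately minimal illustration of the NST objective described above, the sketch below computes the two losses balanced during the pixel optimization: a content loss on feature maps and a style loss on their Gram matrices. It assumes PyTorch and a generic list of feature maps per image; it is a simplified reconstruction in the spirit of Gatys et al. [16], not the authors' production code.

```python
import torch
import torch.nn.functional as F

def gram_matrix(features):
    """Channel-by-channel correlations of one layer's feature map;
    features: (C, H, W) activations for a single image."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.t() / (c * h * w)

def nst_loss(input_feats, content_feats, style_feats, style_weight=1e4):
    """Content loss preserves geometry/structure; style loss matches
    textures and colors. Each argument is a list of (C, H, W) feature
    maps, one per chosen network layer, for the input image, the
    content image, and the style-guide image respectively."""
    content_loss = sum(F.mse_loss(f, c)
                       for f, c in zip(input_feats, content_feats))
    style_loss = sum(F.mse_loss(gram_matrix(f), gram_matrix(s))
                     for f, s in zip(input_feats, style_feats))
    return content_loss + style_weight * style_loss
```

The input image's pixels are then updated by gradient descent on this combined loss, leaving edges and outlines (content) in place while pulling textures and colors toward the guide.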
In general, the learned visual representations of neural networks could capture new ways of 'seeing' and understanding structure and aesthetics. Thus, in design, these networks can be used to transform, and potentially remove, the constraints of the human visual system on not only the definition of style but the whole design process.
3 Neural Style Transfer (NST) as a Tool of Architectural Interrogation

In order to discuss the nature of style in architecture, I would like to rely on the works of Gottfried Semper as the grand seigneur of the conversation on style. In his opus magnum Style in the Technical and Tectonic Arts, or Practical Aesthetics, Semper laid out a comprehensive interrogation of style as a driver of architectural innovation, divided into the material-driven chapters on textiles, ceramics, carpentry and masonry. Semper's interrogation of the provisions of style transformed the discourse on aesthetics, architecture, and art history (Fig. 6). In Semper's assessment, Style should be driven by cultural sympathy, historical function, creative free will, and the inherent qualities of the distinctive materials. This can be summarized as a critical examination of nineteenth-century notions of artistic discussion, in which Semper's voice is critical toward ideas of aesthetics, historicism, and materialism. Materialism in this frame of conversation is not part of the Materialist philosophy of Marx [11], but rather an attempt at the organization and cataloging of architectural elements, parts, and blocks. In his book Style, Semper developed an intricate catalog of transformations, in particular Styles, based on a deep interrogation of distinct objects, resulting in an exceptional understanding of variations in culture. Based on this, it can be stated that Style can be considered through the lens of Materialist considerations, but also as a continuous series of forms produced through changes in the methodology of construction and technical progress in general. In this sense, every new style is a result of previous efforts and sits within the gravitational field of new materials, new construction methods and novel social values (see also Charles Jencks, evolutionary diagram, Fig. 2). Ultimately Style, to state a possible way to think about it, is simply the application of new materials, techniques, and purposes [12]. Style as an area of inquiry in architecture theory has a long and painful history, as laid out through the treatises of Gottfried Semper, John Ruskin, Viollet-le-Duc, Garnier and Alois Riegl [13], and the opposition to this area of critical interrogation by figures such as Otto Wagner, who started his career profoundly invested in the Historicism of the nineteenth century, only to reinvent himself as a proponent of modern considerations in architecture, as well as Muthesius, Loos, Augustus Welby Pugin, and other voices critical of a purely historic vantage point on architectural production. What makes the difference between the analog methods of inquiry utilized by these critics and the algorithmic methods described in the previous section that can be
applied today to plow through massive amounts of historic data about the discipline lies in the methods applied to structure and organize Big Data. However, there is more to discuss than just dropping the issue of Big Data on the table without closer scrutiny. In his book The Second Digital Turn [14], the historian Mario Carpo described 'the different ways, today's digital avant-garde has already started to use Big Data and computation to engage somehow the messy discreteness of nature as it is, in its pristine, raw state—without the mediation or the shortcuts of elegant, streamlined mathematical notations.' It is incredible how just a couple of years of development can overthrow ideas of specific paradigmatic shifts. What Mario Carpo missed in his book is the rise of Artificial Intelligence, specifically Deep Learning techniques, as a tool in contemporary architectural design: there is not a single mention of it in The Second Digital Turn. At least he did not consider it important enough to make it into the index of the book. How could he miss it? Deep neural networks have dominated a multitude of fields over the last few decades, including machine vision and natural language processing, due to their incredible accuracy at extracting salient features from input data and using these representations of our environment to perform tasks. This accuracy is thanks in part to the rapid development of powerful graphics computing technology, but also to the evolution of access to big data, such that datasets can now start to capture the huge amount of visual variance that exists in the world. In particular, the neural network framework trained for visual generation, Generative Adversarial Networks (GANs), came into being in 2014 as a machine learning methodology devised by Ian Goodfellow [15]. Leon Gatys's paper A Neural Algorithm of Artistic Style [16] was published in 2015. Experiments with the use of Neural Networks in architecture started around 2018 [17] (Fig. 6), and the first building project utilizing Neural Style Transfer (NST) as a design method was devised by SPAN (Matias del Campo & Sandra Manninger) in collaboration with Michigan Robotics (Alexandra Carlson) in 2019 (Fig. 7). This shows once more the slow pace of our discipline and its at times very slow adaptation to novel technologies. Circling back to the discussion of Big Data, architecture and Neural Networks: the lesson learned here is not so much about collecting Big Data, not even about Don't Sort: Search (another chapter in Mario Carpo's Second Digital Turn), but rather about how to crunch through this big data to extract the relevant information that informs a project. It is literally about processing data to reveal information. Or to put it this way: Data is the new Oil [18]. Why? Because, in an analogy to crude oil, it is almost useless in its unrefined state, and needs to be refined into gas, plastics, chemicals, etc. in order to create a valuable commodity. In a similar fashion raw data is pretty much inert, as it is illegible to the human mind; it needs to be broken down and analyzed in order to reveal the valuable information. Yes, Data and Information are two distinctly different things. This is also what makes the use of Neural Networks so incredibly powerful.
It would go far beyond the boundaries of this article to describe in detail the possible facets of the application of Neural Networks in architecture, reaching from site analysis, to plan analysis, to improved methods of Building Information Modeling, to aspects
Fig. 6 Gottfried Semper: Der Stil in den technischen und tektonischen Künsten, oder Praktische Aesthetik, Verlag der Kunst und Wissenschaft, Frankfurt 1860
of the ecological, economic, and social impact of a project; the opportunities to reveal the profound nature of a project are gigantic. As this article is primarily concerned with aspects of style, using style transfer techniques aided by Neural Networks, it is worth examining the use of Big Data to interrogate aspects of style, to understand the visual information and features that comprise a given style (see Fig. 8). The larger a dataset of anything is, the more accurate the results generated by a Neural Network will be. In researching the issue of style in architecture, the size of the datasets is crucial; the images that comprise the dataset define, in essence, the 'world' of the neural network: any visual information not included in the dataset cannot be known to the network, and thus cannot be used for learning. Consider the task of image classification as an example: it starts with at minimum thousands of images per class. As the technique is based on learning through the examination of features within an image, not only the quantity of images but also the semantic quality of the images in a database is very important. In one of the initial design stages of the Robot Garden (Fig. 9) by SPAN, a Deep Dream framework was created using a classification network that was trained to discriminate fountains from other architectural elements that were desired for the garden, such as columns,
Fig. 7 SPAN (Matias del Campo & Sandra Manninger): Austrian Pavilion for the Dubai Expo 2020. A first attempt at harnessing Neural Style Transfer techniques for architecture design
Fig. 8 Consider the visual task of building recognition/classification with the above images. At first glance, the panel on the far left, Le Corbusier's Villa Savoye, is easily separable from the two on the right; it differs significantly from the other two images in terms of shape (of the three it is the only building with columns and simple geometric edges/corners), in terms of color, and in terms of background. The right-hand panels, the Casa Batllo by Antoni Gaudi (middle image) and Coop Himmelb(l)au's Haus des Brotes (far right image), have similar bluish backgrounds and curved roofs. Therefore, more complex visual features, like the texture and color of the Casa Batllo roof, must be used to differentiate the two. If presented with this small dataset from which to learn building classification, a neural network would only need to learn the features "white" and "columns" to separate the Villa Savoye from the others. In contrast, the network would need to learn more complex, dense visual feature sets to separate the Casa Batllo image from the Coop Himmelb(l)au image, e.g., the different roof shapes as spatial locations of curves in the image, the different colors in the roofs, and perhaps the locations of the roof ornaments or windows in the Casa Batllo image. These example features are highlighted in red in the figure
Fig. 9 SPAN (Matias del Campo & Sandra Manninger), the Robot Garden for the Robotics Department of the University of Michigan, USA, 2019
boulders, and stairs, with the goal of "dreaming" these features into the garden landscape. After several iterations of the training, this classification network made a weird connection—it identified fountains by only the spout of water emanating from a spring fountain, which in part was due to there being only spring fountains within the fountain class of the dataset (Fig. 10). During the dreaming process, this resulted in waterspouts being visualized at all vertical lines, as well as in strange color distortions in the vertical dimension of the image. This example demonstrates how novel representations and features learned by a neural network with a specific training dataset (or view of the world) can allow us as designers to rethink the low-level visual components that comprise a given "style," and how neural representations of style can be used as a method of architectural interrogation. In the tradition of historic architectural analysis, the tool can expand on the abilities of the human mind to scrutinize in detail the qualities of an architectural project. The main difference is that, in contrast to the model of examination of figures such as Gottfried Semper, it is not about a "revival" of historical forms, but rather about the discovery of novel architectural opportunities dormant in the discipline's historical core matter and uncovered through the hallucinations of machines.
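As a hedged illustration of the "dreaming" process described above, the sketch below performs the core operation behind Deep Dream-style visualization: gradient ascent on an input image to maximize a chosen class activation of a pretrained classifier. This is a minimal sketch, not SPAN's actual framework; the use of Keras VGG19 and the ImageNet class index for "fountain" are assumptions made here for illustration.

```python
import tensorflow as tf
from tensorflow.keras.applications import vgg19

# ImageNet-trained classifier; we treat `img` as already lying in the
# network's preprocessed input space for simplicity.
model = vgg19.VGG19(weights="imagenet")
img = tf.Variable(tf.random.normal((1, 224, 224, 3), stddev=0.1))
class_index = 562  # assumed ImageNet index for "fountain" (illustrative)
opt = tf.keras.optimizers.Adam(learning_rate=0.05)

for step in range(200):
    with tf.GradientTape() as tape:
        # Minimizing the negative class score = gradient *ascent* on the input.
        loss = -model(img)[0, class_index]
    grads = tape.gradient(loss, img)
    opt.apply_gradients([(grads, img)])
```

With a network trained only on spring fountains, this kind of ascent would paint waterspout-like patterns wherever vertical structures appear, which is exactly the failure mode reported above.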
4 Style and Artificial Intelligence Expanding on the basis created by Gottfried Semper's theory of Style, it would be interesting to examine the changes in, or relationships to, Semper's universe of thought through the lens of contemporary design techniques related to Artificial
Fig. 10 The deep dream visualization of the learned features/representation for the fountain class learned by a classification neural network trained on ImageNet, a dataset which has over 1600 classes of objects represented and over 1 million images (left image), versus the representation learned by the same network trained on a significantly smaller dataset of about 8,000 images of only architectural classes (e.g., column, ditch, stepping stone, fountain, boulder, stairs, arch)
Intelligence research. As stated above, the main trajectory of interrogation for Semper was the idea to utilize aspects of new materialities and new construction methods as well as social values. This being considered, we can revisit ideas on Style in the light of the application of novel design techniques, more specifically Style transfer techniques using Neural Networks. A question that comes to mind is: Why did computer science pick up the term style to explain the nature of this neural network? The origin of the application of the term Style for particular networks can be traced to a paper on Computer Vision and Pattern Recognition. In it, Leon Gatys, Alexander Ecker, and Matthias Bethge explain that "in fine art, particularly in painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image" [16]. It is not yet known how this quality of human behavior can be encoded in an algorithm, and there are no synthetic structures able to imitate that profoundly human quality. Deep Neural Networks, on the other hand, have proven able to recognize objects and faces with an almost human degree of accuracy. The Style transfer Neural Network is essentially a synthetic system based on a Deep Neural Network capable of generating images of high artistic value. It would go beyond the frame of this essay to discuss the meaning and relevance of artistic value, so for now we will leave this claim in a vague cloud of what this pairing of words actually entails. What is relevant for this conversation, though, is the methodology by which a Deep Neural Network is able to work through the problem of style. In short, style transfer algorithms utilize the ability to create neural representations of images, isolating and associating the "content" (which is the global structures in an image, such as a building) and style (which are the low-level
visual structures in images, such as brush strokes or blurring of edges) of images. Once these two representations are extracted and isolated, the algorithm can recombine content and style in novel and unseen ways, resulting in an algorithm capable of artistic expression. With this process, style artifacts can be exaggerated to a point of hyperbole, transforming the natural balance/harmony of human style and design into a pareidolic and compositionally unstable, but novel form rooted in post-human (in the sense that they were not primarily authored by human ingenuity), yet humanly accessible and interpretable features. Let us revisit here for a second Semper's definition of style: "Style should be driven by cultural sympathy, historical function, creative free will and the inherent qualities of the distinctive materials." If we break down the components of this definition and attempt to apply them to a Neural Network, it would mean that NNs have to be able to comprehend sympathy as well as history, and to possess a creative free will as well as an understanding of the inherent qualities of a material world. The question here is whether it is possible to prove scientifically that a neural network is able to replicate aspects such as sympathy and creative will. By now we can state that there is no evidence that aspects of sympathy and free will can be synthetically replicated in an artificial neural network, despite our perception of the network's ability to mimic them. That an algorithm is capable of analyzing history, however, can be considered possible; ultimately this is all in the quality and methodology of the training of an NN—the same applies to aspects of materials. It is absolutely possible to train an NN to differentiate between various materials based on databases of images of distinct materials. Combine this with additional data about the material, such as tensile strength, compression strength, and methods of treatment, and you have a well-informed Neural Network capable of aiding the work of an architect on a day-to-day basis. How about aspects of creativity? There seems to be evidence that AIs can be creative. We can name two instances here: the Bob and Alice experiment of Facebook AI comes to mind, or the incredible game between AlphaGo (a creation of Google DeepMind) and Lee Sedol. In both these cases, the neural network(s) were able to learn representations for the task that could be considered more efficient than those of their human counterparts; for example, in the Bob and Alice experiment, the neural networks learned a short-hand, "code-speak" version of English that satisfied the specified training objective, but was not in fact a useful or interpretable language to humans. However, this "efficiency" can also be a big limitation and is the main reason why neural networks are rigid, fragile systems. The learned representations of neural networks are dependent upon whatever patterns exist within the training dataset, and these patterns may not be present in or generalize to other datasets/scenarios. This can be described as a neural network overfitting to a dataset: it finds patterns that are unique and specific to a particular dataset and not representative of the real world, which ultimately places limits on the network's "creativity"; it is always dependent upon the training dataset.
The consequence of this is that Semper’s argument about style is only partially applicable to the area of style and Artificial Intelligence, which demonstrates the
evolution of the idea of style from the mid-nineteenth-century universe of Gottfried Semper to the post-human age. It is rather just one part of a combination of various possibilities to consider style in the discipline of architecture through the lens of Artificial Intelligence. First, a practical example of the use of NNs in the arts. The painting Portrait of Edmond Belamy (Fig. 11), by the Paris-based art collective Obvious, is based on a Generative Adversarial Network (GAN) that was trained with a set of 15,000 portrait examples spanning from the fourteenth to the nineteenth century. This set was utilized to generate a series of family portraits entitled La Famille de Belamy. Though not a Style transfer per se, the example shows the possibility of creative results emerging from Neural Network research. Another popular example is the use of van Gogh paintings (or any other artist—van Gogh is just a popular victim) as the style to transfer into any given image through the application of Neural
Fig. 11 The painting Portrait of Edmond Belamy, by the Paris-based art collective Obvious, is based on a Generative Adversarial Network (GAN) that was trained with a set of 15,000 portrait examples spanning from the fourteenth to the nineteenth century
Style Transfer—a category of algorithm that allows for the manipulation of the pixels of an image. In doing so, these algorithms absorb the visual qualities of one image (the style) and imprint them on any other given image (the target). The approach is closely tied to deep neural networks, which supply the image representations it operates on. The goal is to transfer style information from input images to any provided target image. Neural Style Transfer (NST) algorithms belong to the family of image stylization techniques, which have been examined in the area of non-photorealistic rendering for more than twenty years. Before the NST algorithm was devised, image transfer techniques relied on machine learning methods based on image analogy: from a training pair of images (an image of an artwork and a photo of any given object), a transformation can be learned and then applied to any given image by analogy. For architecture, this presents a very particular problem, as all these techniques are 2D. Considering the description in the previous section, we can outline this very specific limitation of Style transfer techniques, at least in their current state of development: they are 2D design techniques, meaning they can only deal with two-dimensional data such as images, plans, and sections. This also represents the bulk of the current research in terms of architectural production. In our own research, we have explored various avenues that include databases of various historic styles such as Gothic, Baroque, and Modern—not for the sake of imitating said styles, not to revive the Historicism of Semper's era, but to find novel possibilities within the gigantic repositories of architectural imagination. Neural Networks are profoundly well suited for the examination of the big data enclosed in the vaults of architecture history. On closer examination, current techniques of Style transfer open avenues of inspiration for novel architectural approaches. As described before, Style comes from the Latin Stylus—a writing tool. Neural Network algorithms are nothing but written text, lines of code that allow us to explore massive amounts of data in order to find the essence of a style. Maybe even the first genuine style of the twenty-first century.
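Since this essay repeatedly refers to the mechanics of Neural Style Transfer, a minimal sketch of its core operation may be useful: following Gatys et al. [16], the "style" of an image can be represented by Gram matrices of VGG feature maps, which capture which feature channels co-activate regardless of where they do so. The layer selection below is an assumption for illustration, not the authors' configuration.

```python
import tensorflow as tf
from tensorflow.keras.applications import vgg19

def gram_matrix(features):
    # features: (1, H, W, C) feature map. The Gram matrix records which
    # channels co-activate, i.e. the low-level "style" statistics.
    f = tf.reshape(features, (-1, features.shape[-1]))           # (H*W, C)
    return tf.matmul(f, f, transpose_a=True) / tf.cast(tf.shape(f)[0], tf.float32)

base = vgg19.VGG19(weights="imagenet", include_top=False)
style_layers = ["block1_conv1", "block2_conv1", "block3_conv1"]  # assumed choice
extractor = tf.keras.Model(base.input,
                           [base.get_layer(n).output for n in style_layers])

image = tf.random.uniform((1, 224, 224, 3))                      # stand-in image
style_rep = [gram_matrix(f) for f in extractor(image)]           # one Gram per layer
```

An NST optimization then searches for an image whose Gram matrices match the style image while its deeper feature maps match the content image, which is what allows content and style to be recombined.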
References

1. Muthesius, H. (1903). Stilarchitektur und Baukunst. Mülheim an der Ruhr: Verlag v. Schimmelpfeng.
2. The Austrian Parliament Building (Parlamentsgebäude, das Parlament). (1883). Architect: Theophil Hansen.
3. The Vienna City Hall (Wiener Rathaus). (1883). Architect: Friedrich von Schmidt.
4. Kunsthistorisches & Naturhistorisches Museum. (1891). Architects: Gottfried Semper & Carl von Hasenauer.
5. The Deutsche Werkbund was founded around the Viennese architect Joseph Maria Olbrich after leaving Vienna for the city of Darmstadt, Germany, following an invitation by Ernest Louis, Grand Duke of Hesse, in order to design and form an artist colony. The founding members of the Werkbund in 1907 included Joseph Maria Olbrich, Peter Behrens, Richard Riemerschmid and Bruno Paul. The constituting meeting was conducted in Munich at the instigation of Hermann Muthesius. This first iteration of the Werkbund was in place until 1934, when the Nazis shut it down. It was re-established after the second world war in 1950.
6. Rujivacharakul, V., Hazel Kahn, H., Tadashi Oshima, K., & Christensen, P. (Eds.). (2013). Architecturalized Asia: Mapping a continent through history (p. 108). Hong Kong University Press.
7. Jencks, C. (2000). Jencks theory of evolution—An overview of twentieth-century architecture. In Architectural Review (pp. 76–79). London.
8. Schwartz, F. J. (1996). The Werkbund: Design theory and mass culture before the first world war (p. 122). New Haven and London: Yale University Press.
9. Vercelloni, L. (2016). The invention of taste: A cultural account of desire, delight and disgust in fashion, food and art (p. 132). Bloomsbury Publishing.
10. Cicero, M. T., & Piderit, K. W. (1886). De oratore. BG Teubner.
11. Historical materialism, also known as the materialist conception of history, is a methodology used by some communist and Marxist historiographers that focuses on human societies and their development through history, arguing that history is the result of material conditions rather than ideals.
12. Mallgrave, H. F. (1983). The idea of style: Gottfried Semper in London (p. 76). University of Pennsylvania, Philadelphia.
13. Alois Riegl was an Austrian art historian and is considered a member of the Vienna School of Art History. He was one of the major figures in the establishment of art history as a self-sufficient academic discipline, and one of the most influential practitioners of formalism.
14. Carpo, M. (2017). The second digital turn—Design beyond intelligence. Cambridge, Massachusetts: MIT Press.
15. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., et al. (2014). Generative adversarial networks. In Proceedings of the International Conference on Neural Information Processing Systems (NIPS 2014) (pp. 2672–2680).
16. Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). A neural algorithm of artistic style. arXiv:1508.06576.
17. For example, the Austrian Pavilion for the Expo 2020 in Dubai, which utilized Style Transfer techniques to generate the ceiling. SPAN 2018—Matias del Campo & Sandra Manninger. This design was a primitive effort in that it used the online solution by Google Deepdream to create Style Transfers. It used the Deep Style option to experiment with combinations between Baroque and Modern ceilings, transforming the resulting image into a 3D model using zBrush.
18. The phrase 'data is the new oil' was apparently coined in 2006 by Clive Humby, the British mathematician and architect of the Tesco Clubcard, a supermarket reward programme. (J. Bridle, New Dark Age—Technology and the End of the Future, Verso, 2019, p. 245).
Day–Night Scene Transfer for Architectural Renderings Based on Color Transfer Approaches He Wanyu, Nie Guang Yang, Wang Chuyu, and Jackie Yong Leong Shong
Abstract This paper presents a refined algorithm based on main color transfer approaches to turn architectural renderings from a daytime scene into a night scene. The aim is to achieve a more accurate day–night scene transfer between architectural images which share some semantically related content but may vary dramatically in appearance or structure, such as two images of two different buildings in different scenes. To that end, two key improvements, feature normalization and adding extra features, are made to the main approaches for color transfer, and a new algorithm implementation scheme is presented to effectively perform day–night transfer for architectural renderings. This method addresses the time- and cost-consuming problems of manual scene transfer for architectural renderings. By testing with various architectural renderings, the proposed method shows good performance in achieving the desired goal. Keywords Scene transfer · Color transfer · Semantic matching · Deep learning
1 Introduction Architects sometimes face the task of turning a fully rendered daytime scene into a night scene. Normally, setting up night scenes and lighting can be done in an external render engine like Kerkythea or V-Ray, or by resorting to Photoshop. Those ways require a series of manual operations such as removing sharp shadows, darkening the image, lighting, etc., which is time-consuming and inefficient. In the past one or two H. Wanyu (B) · N. G. Yang · W. Chuyu · J. Y. L. Shong Xkool Tech. Co. LTD, Vanke Design Commune, Liu Guang Rd., Nanshan, B210 Shenzhen, China e-mail: [email protected] N. G. Yang e-mail: [email protected] W. Chuyu e-mail: [email protected] J. Y. L. Shong e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 P. F. Yuan et al. (eds.), Architectural Intelligence, https://doi.org/10.1007/978-981-15-6568-7_12
years, transferring the color style of a reference image onto a source image has made tremendous progress due to deep network-based methods, such that automating the process of turning renderings from a daytime scene into a night scene becomes feasible and effective. With the advances of these techniques, computers can help architects alter the color style without changing the original rendering content, in order to emulate different illumination, weather conditions, scene materials, or even artistic color effects. In fact, color transfer has been a long-standing research direction in the field of machine learning and computer vision, but for years the transfer results were not particularly effective or precise, due to the key challenge of establishing semantically dense correspondences between two images, which is necessary to achieve accurate color transfer. Traditionally, matching methods depending on hand-crafted features, additional segmentation [1, 2], or user specifications [3] are not effective or precise enough. Since deep neural networks were introduced to solve the dense correspondence problem [4, 5], transfer precision has largely improved, but the color transfer results still suffered from artifacts such as ghosting or halos. To refine correspondences and reduce interferential artifacts, He et al. proposed a progressive framework that estimates correspondences in the deep feature domain (extracting features from a pre-trained VGG19 [6]) and performs color transfer in the image domain, which achieves natural and consistent color transfer effects [7]. Inspired by the work of He et al., we refined the algorithm to perform day–night scene transfer for architectural renderings. Given an image S (source) and another image R (reference), whose main objects are both buildings but whose buildings may vary dramatically in appearance or structure, our algorithm builds semantic correspondences between them and then applies color transfer to image S, thereby generating a new image T (target) which preserves the structure and content of S while emulating the color style of R, as shown in Fig. 1. In our method, the two major adjustments listed below are made: (1) Feature normalization is applied to balance the weight of each channel when using the well-trained VGG19 to extract image features (Sect. 3.2).
Fig. 1 Color transfer example of architectural renderings
(2) Clustering and semantic segmentation are applied to add extra features in order to better build the mapping between image S and image R, which improves the accuracy of color transfer (Sect. 3.3). The above two techniques strengthen the extraction and matching of similar semantics such that the refined algorithm better fits scene transfer requirements in the architectural field. At the end, by testing with various architectural renderings, we show how our techniques can be effectively applied to a variety of real day–night scene transfer scenarios.
2 Related Work Color transfer is a rather broad research topic that includes transferring colors to grayscale images or to color source images. Early research focused on the colorization of grayscale images, depending on the user sketching similar regions for computers to fill with different colors [8]. In recent years, learning-based algorithms appeared to automate grayscale image colorization by learning image statistics (pixel distributions) from large extensive datasets [9, 10], which further inspired approaches for transferring chrominance from one image to another containing similar image statistics [11]. However, apart from chrominance, luminance also has much to do with how an image looks. Therefore, He et al. [7] further proposed a feasible method to transfer both luminance and chrominance between a color image pair. Our focus is to refine and adjust this method to deal with day–night scene transfer for architectural renderings.
2.1 Image Feature Extraction Feature extraction is a well-studied problem in machine learning. It greatly influences the accuracy of object recognition in image processing, and better object recognition leads to better semantic correspondence between the processed image pair. If the content of the source image S is well corresponded to the reference image R (for instance, buildings in S corresponding to buildings in R, sky in S to sky in R, road in S to road in R, etc.), transferring chrominance and luminance from R to S is more accurate and natural. Many feature extraction techniques have accumulated over the years, such as independent component analysis, kernel PCA, and autoencoders. Each of them can be a large research topic; thus, instead of going deep into this issue, we can apply existing excellent accomplishments. For instance, He et al. applied a pre-trained deep learning model, VGG19, to address the feature extraction issue in order to improve the efficiency and accuracy of building semantic matching between S and R.
Fig. 2 Illustration of the network architecture of VGG-19 model [13]
VGG-19 is a Convolutional Neural Network (CNN) trained on more than a million images from the ImageNet database.¹ The network can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. A typical VGG-19 structure is shown in Fig. 2. As a result, the network has learned rich feature representations for a wide range of images [12]. Since pre-trained models like VGG-19 have learned features of many categories of objects, the weights and biases of these networks can be transferred to different data. For instance, as VGG-19 has been trained with a large amount of building images, it contains learned features like edges, lines, etc. of the building category. By feeding image pairs to a pre-trained model for object recognition, we can save time and compute resources to achieve effective feature extraction in our particular scenarios. Applying VGG-19 to the source image S and the reference image R, feature maps can be extracted from different layers. At a layer L, the feature maps of S and R can be defined as $F_S^L$ and $F_R^L$, respectively. As feature maps contain abstract features identifying certain objects, things of the same category will have similar feature maps even though they have largely different appearances. This begs a question: how to measure the similarity of feature maps? Computing nearest-neighbor fields (NNF), introduced in the next section, has been proven a useful technique to address this issue.
¹ The ImageNet project is a large visual database designed for use in visual object recognition software research. It contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry," consisting of hundreds of images.
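A minimal sketch of this feature extraction step might look as follows; the specific layer names and file names are assumptions for illustration, since the paper does not list them.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import vgg19

base = vgg19.VGG19(weights="imagenet", include_top=False)
# One convolutional output per "layer L"; the exact choice is an assumption.
layer_names = ["block1_conv1", "block2_conv1", "block3_conv1",
               "block4_conv1", "block5_conv1"]
extractor = tf.keras.Model(base.input,
                           [base.get_layer(n).output for n in layer_names])

def feature_maps(image_path):
    # Load, resize to the VGG input size, and apply VGG preprocessing.
    img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
    x = vgg19.preprocess_input(np.expand_dims(np.array(img, np.float32), 0))
    return extractor(x)   # list of F^L for L = 1..5

F_S = feature_maps("source_day.png")        # hypothetical file names
F_R = feature_maps("reference_night.png")
```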
2.2 Nearest-Neighbor Field Search In the image domain, patch matching means that, given two images S and R, for every patch in S a similar patch in R must be found; this is what the PatchMatch method does. The milestone PatchMatch algorithm [14] is fast for computing approximate nearest-neighbor correspondences between patches of two image regions and has been used in various computer vision problems, including object removal from images, reshuffling or moving contents of images, inpainting/retargeting images, texture synthesis, etc. Though patch matching is usually carried out in the image domain, it can also be performed in the feature domain, since a CNN keeps the spatial relationship of input images, i.e., correspondence between feature maps represents correspondence between images to a great extent. The goal of the PatchMatch algorithm is to find the patch correspondence by defining a nearest-neighbor field (NNF) as a mapping function. In this paper, for each patch around image coordinate p in the source feature map $F_S^L$, its nearest-neighbor patch around $q = \phi_{S \to R}^L(p)$ in the reference feature map $F_R^L$ needs to be found. If we searched every point in $F_R^L$ (brute-force search), the computation cost would be too high. Inspired by natural coherence in imagery, Barnes et al. suggested that if we can find some good patch matches via random sampling, other good matches can be found quickly in the surrounding areas, which avoids global searching [14]. The three key phases of Barnes' randomized nearest-neighbor algorithm are shown in Fig. 3.
Fig. 3 a Patches are initially assigned with random values; b The blue patch of A checks above/green and left/red neighbors to see if they will improve the blue mapping (denoted as blue arrow). In this example, the red patch in B has higher similarity with blue patch in A, then assign the offset of left/red neighbor patch to the blue patch; c The patch searches randomly for improvements in concentric neighborhoods
The complete algorithm starts from phase (a) in Fig. 3 and then performs the iterative phases (b) and (c) to improve the NNF. It has been proven that after a few iterations (4–5) the NNF converges to its limit, which is very efficient. With this technique, the mapping denoted as $\phi_{S \to R}^L$ from $F_S^L$ to $F_R^L$ as well as the reverse mapping denoted $\phi_{R \to S}^L$ from $F_R^L$ to $F_S^L$ can be built efficiently. The bidirectional correspondences allow us to use Bidirectional Similarity (BDS) voting [15] to reconstruct, respectively, the guidance image $G^L$ and the corresponding feature map $F_G^L$. $G^L$ contains S's content and R's color style; therefore, it can serve as the guidance for color transfer in the next step, local color transfer.
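For illustration, a toy, unvectorized Python sketch of the three phases on numpy feature maps is given below (the actual implementation in this paper optimizes this search in C, see Sect. 3.4); patch radius, iteration count, and function names are choices made here, not the paper's.

```python
import numpy as np

def _dist(A, B, ay, ax, by, bx, r):
    # SSD between the (2r+1)x(2r+1) patches centred at (ay, ax) and (by, bx).
    return float(np.sum((A[ay-r:ay+r+1, ax-r:ax+r+1] -
                         B[by-r:by+r+1, bx-r:bx+r+1]) ** 2))

def patchmatch(A, B, iters=5, r=1, seed=0):
    """Toy randomized NNF phi_{A->B} over feature maps A, B of shape (H, W, C)."""
    H, W = A.shape[:2]; Hb, Wb = B.shape[:2]
    rng = np.random.default_rng(seed)
    # (a) random initialization of the field
    ys = rng.integers(r, Hb - r, (H, W)); xs = rng.integers(r, Wb - r, (H, W))
    cost = np.full((H, W), np.inf)
    for y in range(r, H - r):
        for x in range(r, W - r):
            cost[y, x] = _dist(A, B, y, x, ys[y, x], xs[y, x], r)
    for it in range(iters):
        coords = [(y, x) for y in range(r, H - r) for x in range(r, W - r)]
        if it % 2: coords.reverse()           # alternate the scan order
        step = -1 if it % 2 else 1
        for y, x in coords:
            # (b) propagation: try neighbours' offsets shifted by one pixel
            for dy, dx in ((step, 0), (0, step)):
                ny, nx = y - dy, x - dx
                if not (r <= ny < H - r and r <= nx < W - r): continue
                cy, cx = ys[ny, nx] + dy, xs[ny, nx] + dx
                if r <= cy < Hb - r and r <= cx < Wb - r:
                    d = _dist(A, B, y, x, cy, cx, r)
                    if d < cost[y, x]:
                        ys[y, x], xs[y, x], cost[y, x] = cy, cx, d
            # (c) random search in shrinking concentric windows
            rad = max(Hb, Wb)
            while rad >= 1:
                cy = int(np.clip(ys[y, x] + rng.integers(-rad, rad + 1), r, Hb - r - 1))
                cx = int(np.clip(xs[y, x] + rng.integers(-rad, rad + 1), r, Wb - r - 1))
                d = _dist(A, B, y, x, cy, cx, r)
                if d < cost[y, x]:
                    ys[y, x], xs[y, x], cost[y, x] = cy, cx, d
                rad //= 2
    return np.stack([ys, xs], -1)

A = np.random.rand(32, 32, 8); B = np.random.rand(32, 32, 8)
phi = patchmatch(A, B)    # (32, 32, 2) field of nearest-neighbor coordinates
```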
2.3 Local Color Transfer However, $G^L$ and S have different resolutions, which leads to great difficulty in building in-place correspondence between them; therefore, a well-designed color transfer algorithm is needed to change the colors of the source image S to better match those of $G^L$. Given the guidance image $G^L$ at layer L, we can downscale S to $S^L$ to match the resolution of $G^L$. To perform local color matching, He et al. constructed the color transfer function as a linear function for every pixel p in $S^L$. Thus, the transferred result $T_p^L$ can be obtained by Eq. (1).

$T_p^L(S^L(p)) = a^L(p) \, S^L(p) + b^L(p)$  (1)
In Eq. (1), $a^L$ and $b^L$ are the linear coefficients, and the aim becomes to estimate $a^L(p)$ and $b^L(p)$ for each pixel p such that the target result $T_p^L(S^L(p))$ is as similar to $G^L(p)$ as possible. Estimating proper coefficients for linear functions given a certain data distribution is a common problem, which can be solved by constructing an objective function and then minimizing or maximizing it according to specific requirements. As He et al. considered, such an objective function should involve the three terms listed below: (1) a term $E_D$ to make the color transfer result similar to the guidance $G^L$; (2) a term $E_L$ to encourage locally adjacent pixels to have similar linear transforms while preserving edges in the source $S^L$; (3) a term $E_{NL}$ to enforce a non-local constraint that penalizes global inconsistency, which is sourced from the assumption that pixels with identical colors in the source should get similar transferred colors in the result. This leads to the overall objective function in Eq. (2).

$E_{total} = E_D + \lambda_L E_L + \lambda_{NL} E_{NL}$  (2)
In Eq. (2), $\lambda_L$ and $\lambda_{NL}$ are the weights of $E_L$ and $E_{NL}$, with $\lambda_L = 0.125$ and $\lambda_{NL} = 2.0$ by default. By minimizing the objective function, the coefficients $a^L$ and $b^L$ of the color transfer function can be estimated from higher layers to lower layers (the higher the layer, the lower the resolution), thereby obtaining the corresponding color transfer result.
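As a sketch of this estimation, the snippet below fits $a^L(p)$ and $b^L(p)$ by least squares over a small local window, which covers the data term $E_D$ only; the smoothness terms $E_L$ and $E_{NL}$ of Eq. (2) are omitted, so this is a simplification of He et al.'s solver, not a reimplementation.

```python
import numpy as np

def estimate_ab(S, G, win=3, eps=1e-4):
    """Per-pixel linear coefficients with T(p) = a(p)*S(p) + b(p) ~ G(p).
    A local-window least-squares fit stands in for the full objective."""
    H, W = S.shape
    a = np.zeros((H, W)); b = np.zeros((H, W))
    r = win // 2
    Sp = np.pad(S, r, mode="edge"); Gp = np.pad(G, r, mode="edge")
    for y in range(H):
        for x in range(W):
            s = Sp[y:y + win, x:x + win].ravel()
            g = Gp[y:y + win, x:x + win].ravel()
            var = s.var() + eps                       # eps regularizes flat areas
            a[y, x] = ((s * g).mean() - s.mean() * g.mean()) / var
            b[y, x] = g.mean() - a[y, x] * s.mean()
    return a, b

# Toy check on single-channel images in [0, 1]:
S = np.random.rand(32, 32); G = 0.5 * S + 0.2
a, b = estimate_ab(S, G)
T = a * S + b    # Eq. (1): the transferred result approximates G
```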
2.4 Fast Guided Filter Since the parameters $a^L(p)$ and $b^L(p)$ in the above section are estimated from low-resolution to high-resolution layers, in order to get the full-resolution result at the finest layer (L = 1) we need to upsample $a^L(p)$ and $b^L(p)$. One of the tricky issues in this process is edge-preserving smoothing.² The guided filter [16, 17] is one of the popular algorithms for edge-preserving smoothing. It depends on an important assumption that the guide image has a linear relation with the filtered output image, involving coefficients $a_\uparrow^L(p)$ and $b_\uparrow^L(p)$. A linear relation can preserve both regional information and edge information: for instance, if regions A and B are both smooth color areas and region C is an edge, the gradient between A and B will not be too large, while the gradient between A and C or between B and C will be relatively huge. Therefore, a linear transformation between the guide image $G^L$ and the filtered output image $\tilde{S}^L$ can amplify the original gradients by the same proportion, such that the sharp edges in the source image will not be smoothed away in the de-noising process. The aforesaid properties make the guided filter algorithm very useful in edge-preserving smoothing, image matting, image haze removal, etc. In 2015, K. He et al. proposed the Fast Guided Filter [18], which speeds up the guided filter method and further popularized it. The difference is made by first reducing the number of pixels by subsampling (corresponding to $a^L(p)$ and $b^L(p)$) and then recovering the full resolution by upsampling (corresponding to $a_\uparrow^L(p)$ and $b_\uparrow^L(p)$),³ which is consistent with the layered color transfer process described in Sect. 2.3. By least-squares estimation, the upscaled $a_\uparrow^L$ and $b_\uparrow^L$ can be obtained, and the desired high-resolution output image follows from Eq. (3):

$\tilde{S}^L(p) = a_\uparrow^L(p) \, S^L(p) + b_\uparrow^L(p), \quad \forall p \in \tilde{S}^L$  (3)
² Edge-preserving smoothing is an image processing technique that smooths away noise or textures while retaining sharp edges. Refer to https://en.wikipedia.org/wiki/Edge-preserving_smoothing.
³ Subsampling refers to making an image smaller and upsampling refers to making an image larger.
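A compact sketch of the subsample–estimate–upsample idea, using box filters for the local means, might look as follows; the parameter values and the bilinear upsampling choice are assumptions made here.

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def fast_guided_filter(S_lr, G_lr, S_hr, radius=4, eps=1e-3):
    """Sketch of the fast guided filter idea: estimate a, b at low resolution,
    then upsample a, b and apply them to the full-resolution source S_hr."""
    size = 2 * radius + 1
    mean = lambda x: uniform_filter(x, size)          # box-filter local mean
    mS, mG = mean(S_lr), mean(G_lr)
    cov = mean(S_lr * G_lr) - mS * mG
    var = mean(S_lr * S_lr) - mS * mS
    a = cov / (var + eps)                             # local linear model
    b = mG - a * mS
    a, b = mean(a), mean(b)                           # smooth coefficient maps
    scale = tuple(np.array(S_hr.shape) / np.array(S_lr.shape))
    a_up = zoom(a, scale, order=1)                    # bilinear upsampling
    b_up = zoom(b, scale, order=1)
    return a_up * S_hr + b_up                         # Eq. (3)

S_hr = np.random.rand(128, 128)                       # toy full-resolution source
G_lr = 0.7 * S_hr[::4, ::4] + 0.1                     # toy low-resolution guidance
out = fast_guided_filter(S_hr[::4, ::4], G_lr, S_hr)
```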
3 Method The goal of this paper is to tune He et al.'s color transfer approach from an engineering point of view such that it can fit day–night scene transfer requirements for architectural renderings. Our scenario is to perform day–night scene transfer given two input architectural renderings sharing semantically related contents whose appearance or structure can vary dramatically. The key challenge is to build semantic correspondence between these two input images, and we also resort to a CNN, VGG19, to encode the input images from low-level features to high-level semantics. However, a generalized model purely relying on VGG19 cannot achieve satisfying results in recognizing objects in architectural renderings. Observing this, we optimize the color transfer algorithm by normalizing features (Sect. 3.2) and adding extra features (Sect. 3.3) to fit our application scenario.
3.1 Overview Given a source image S and a reference image R, our algorithm can be described as below. For a given layer L (5 layers in total by default):
• Extract the feature maps $F_S^L$ and $F_R^L$ of S and R by VGG19, respectively.
• Perform normalization in each channel of $F_S^L$ and $F_R^L$ (Sect. 3.2).
• Add more features by clustering or semantic segmentation techniques to help build mappings (Sect. 3.3).
• Use the PatchMatch method to build bidirectional mappings $\tilde{\phi}_{S \to R}^L$, $\tilde{\phi}_{R \to S}^L$ by use of the VGG19 features and the new features.
• Reconstruct the guide image $G^L$ from $R^L$ and $\tilde{\phi}_{S \to R}^L$, $\tilde{\phi}_{R \to S}^L$.
• Build a linear relation between $S^L$ and $G^L$, and estimate the linear coefficients $a^L$ and $b^L$.
• Use the Fast Guided Filter to amplify $a^L$ and $b^L$ to $a_\uparrow^L$ and $b_\uparrow^L$, thereby upsampling $S^L$.
• Construct the result image with full resolution by $\tilde{S}^L = a_\uparrow^L S^L + b_\uparrow^L$.
According to previous research, the focus is how to accurately match regions with similar semantics between input images. We mainly consider two aspects in our scenario: (1) how to deal with the features extracted by VGG19 such that important semantic information will not be lost (feature normalization); (2) how to add extra features to help build more accurate mappings (clustering and semantic segmentation).
3.2 Feature Normalization When using VGG19 to extract image features, the feature maps in different layers have different sizes. For instance, as Fig. 2 shows, the feature map $F_S^2$ is of size (112, 112, 128), which means the feature map is 112 pixels in both width and height, and the number of channels is 128. The feature map in each channel represents a feature of the input image; therefore, the weights of the features should be balanced to avoid missing important features. By observation, we noticed that the pixel values can vary dramatically across channels. In some channels, the pixel value range can be [−500, +100], while in other channels the pixel value range is [−1, +1]. This leads to a situation where some important semantic information has been extracted in a channel but has little influence in the PatchMatch process due to its narrow pixel value range, which further impacts the accuracy or even effectiveness of the correspondence between S and R. Feature normalization techniques are useful to balance the contributions of all features. There are many ways to normalize features, among which Standardization and Min-Max Scaling are the two most commonly used methods. In this paper, we apply the Standardization method to normalize the data. The general process, shown as Eq. (4), is to carry out the calculation below for each feature map in each channel:

$z = (x - \mu) / \sigma$  (4)
In Eq. (4), x stands for a pixel value, and $\mu$ and $\sigma$ are the mean pixel value and standard deviation of a channel, respectively. After normalization, the pixel value ranges of all channels are of an equal order of magnitude, thereby avoiding emphasized contributions of high-value data and impaired influence of low-value data.
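Applied to an (H, W, C) feature map, the standardization of Eq. (4) is essentially a one-liner per channel; the snippet below is a direct sketch.

```python
import numpy as np

def standardize_channels(F, eps=1e-8):
    """Apply z = (x - mu) / sigma per channel of an (H, W, C) feature map."""
    mu = F.mean(axis=(0, 1), keepdims=True)
    sigma = F.std(axis=(0, 1), keepdims=True)
    return (F - mu) / (sigma + eps)      # eps guards against constant channels

# Toy feature map with very uneven channel ranges, as observed above:
F = np.random.randn(112, 112, 128) * np.linspace(1, 500, 128)
F_norm = standardize_channels(F)         # every channel now has mean 0, std 1
```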
3.3 Adding Extra Features As aforesaid, VGG19 is a generalized pre-trained model that may not perform well enough in some specific application scenarios. Therefore, apart from applying feature normalization to the original image features extracted by VGG19, adding extra features to join the original features can be considered an effective way to increase recognition accuracy and build better correspondence between S and R. Clustering and semantic segmentation techniques are often used to meet this goal.
3.3.1 Clustering
Clustering is a useful technique for statistical analysis and data mining. Usually, a cluster involves some patterns, and there are more similarities
between patterns within a cluster than between patterns in different clusters. Considering the similarities of some features, we use clustering techniques to produce feature clusters as extra features to increase the accuracy of the mappings between S and R. In our scenario, the feature maps $F_S^L$ and $F_R^L$ extracted by VGG19 are of size (224, 224, 64) at layer 1 (L = 1). After massive experiments, we found that classifying all the 64 feature maps at this layer into 3 feature clusters (representing sky, building, and ground) achieves the best performance. The specific approach is described in the following steps; a code sketch follows the list.
Step 1: Concatenate $F_S^L$ and $F_R^L$ in the horizontal direction to get the new feature map $F_{(S,R)}^L$, whose size is (224, 448, 64).
Step 2: Use the K-Means clustering algorithm to cluster all pixels of the new feature map $F_{(S,R)}^L$ to obtain a one-hot format clustering result image $C_{(S,R)}^L$ with size (224, 448, 3).
Step 3: Divide $C_{(S,R)}^L$ horizontally into two cluster images $C_S^L$, $C_R^L$, both of size (224, 224, 3).
Step 4: Join the cluster images $C_S^L$, $C_R^L$ and the original feature maps $F_S^L$, $F_R^L$ along the channel dimension to produce new feature maps $FC_S^L$ and $FC_R^L$, both of size (224, 224, 63 + 3).
Step 5: Replace $F_S^L$ and $F_R^L$ with $FC_S^L$ and $FC_R^L$ to build mappings between S and R.
As new extra features are added to help build semantic correspondences, the precision of the scene transfer is increased, which will be demonstrated in Sect. 4.
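The five steps can be sketched directly with scikit-learn's K-Means. Note one assumption: the paper reports 63 + 3 channels in Step 4, but does not specify which original channel, if any, is dropped, so the sketch simply keeps all 64 plus the k = 3 one-hot cluster channels.

```python
import numpy as np
from sklearn.cluster import KMeans

def add_cluster_features(F_S, F_R, k=3, seed=0):
    """Steps 1-5 as a sketch: cluster pixels of the concatenated feature
    map and append one-hot cluster maps as extra channels."""
    H, W, C = F_S.shape
    F_SR = np.concatenate([F_S, F_R], axis=1)             # Step 1: (H, 2W, C)
    labels = KMeans(n_clusters=k, random_state=seed,
                    n_init=10).fit_predict(F_SR.reshape(-1, C))
    onehot = np.eye(k)[labels].reshape(H, 2 * W, k)       # Step 2: one-hot maps
    C_S, C_R = onehot[:, :W], onehot[:, W:]               # Step 3: split
    FC_S = np.concatenate([F_S, C_S], axis=-1)            # Step 4: (H, W, C + k)
    FC_R = np.concatenate([F_R, C_R], axis=-1)
    return FC_S, FC_R                                     # Step 5: use these

F_S = np.random.rand(224, 224, 64); F_R = np.random.rand(224, 224, 64)
FC_S, FC_R = add_cluster_features(F_S, F_R)  # clusters ~ sky/building/ground
```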
3.3.2 Semantic Segmentation
Semantic segmentation is a very important field in computer vision. It refers to the pixel-level recognition of images, i.e., marking out the object category of each pixel in the image. In simple terms, taking a grayscale or RGB image as input, the goal of segmentation is to output a segmentation image where each pixel contains the label of its category. For instance, given an RGB image that contains only three categories of things: person, bicycle, and background, denoted as labels 1, 2, and 3, respectively, the output segmentation image replaces each pixel value with its corresponding label value. Because of this property, semantic segmentation has similar effects to clustering in that it can classify pixels into a certain number of groups. There are many semantic segmentation architectures available, like FCN, SegNet, U-Net, PSPNet (Pyramid Scene Parsing Network), and Mask-RCNN. In our scenario, PSPNet⁴ trained with the ADE20K dataset of MIT is adopted to segment the input images. The specific approach is described in the following steps: Step 1: Input S and R to PSPNet to obtain output images $M_S$ and $M_R$, which are of size (473, 473, 150). Step 2: Reduce the size of $M_S$ and $M_R$ to (224, 224, 150) to fit VGG19. At each layer of VGG19, the reduced images are denoted as $M_S^L$ and $M_R^L$. ⁴ https://github.com/Vladkryvoruchko/PSPNet-Keras-tensorflow.
Step 3: Concatenate $M_S^L$, $M_R^L$ and the feature maps $F_S^L$, $F_R^L$ extracted by VGG19 along the channel dimension to produce new feature maps $FM_S^L$ and $FM_R^L$, both of size (224, 224, 64 + 150). Step 4: Replace $F_S^L$ and $F_R^L$ with $FM_S^L$ and $FM_R^L$ to build mappings between S and R. It is worth noting that when adopting a pre-trained semantic segmentation model, not all channel results will fit our application scenario, since some different channels may represent the same contents. Therefore, manually adjusting (combining or abandoning some channels) the outputs of the semantic segmentation model is recommended to achieve better performance. Apart from using pre-trained semantic segmentation models, users can also utilize self-trained semantic segmentation models that meet specific application needs to provide directive guidance for the process of color transfer.
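The recommended manual adjustment of segmentation outputs can be sketched as below; the merge groups and kept channels are placeholders for illustration, not real ADE20K class indices.

```python
import numpy as np

def adjust_channels(M, merge_groups, keep):
    """M: (H, W, 150) segmentation scores. merge_groups: lists of channel
    indices whose scores are summed into one channel (channels that
    represent the same content); keep: extra channels kept unchanged.
    All other channels are abandoned."""
    merged = [M[..., g].sum(axis=-1, keepdims=True) for g in merge_groups]
    kept = [M[..., [i]] for i in keep]
    return np.concatenate(merged + kept, axis=-1)

M_S = np.random.rand(224, 224, 150)          # stand-in for a resized PSPNet output
# Placeholder indices: merge three "building-like" channels, keep two others.
M_S_adj = adjust_channels(M_S, merge_groups=[[0, 1, 25]], keep=[2, 3])
```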
3.4 Algorithm Implementation Scheme The complete algorithm is described in Sect. 3.1 and is based on He et al.'s progressive color transfer algorithm plus the proposed feature reinforcing techniques (Sects. 3.2 and 3.3). The pseudo-code of our implementation is listed in Algorithm 1, shown in Fig. 4. In our refined algorithm, we cancel the loop from layer 5 to layer 1 and only perform NNF search and local color transfer at one layer in order to increase the transfer speed. We therefore initialize the layer value to a specific number between 1 and 5; our test results showed that layer 1 yields the best performance, since at this layer the image resolution is highest. Besides, to add extra features that help build more accurate correspondence between S and R, one of the two methods, clustering or semantic segmentation, can be chosen. We will evaluate the effects of these two methods, respectively, in the next section. In the algorithm implementation [19, 20], the C language is applied to optimize heavy computation modules like NNF search and local color transfer, thereby reducing the whole runtime compared with a pure Python implementation.
4 Evaluation and Results 4.1 Performance Our core algorithm was developed in the environment shown in Table 1. When using clustering to add extra features to optimize pixel correspondence building, the runtime is about 65 s for a single-reference day–night scene transfer with an average resolution of 800 × 600. The time cost mainly occurs in three
Fig. 4 Algorithm pseudo-code
Table 1 Algorithm development environment

Experiment environment
  Machine type            X64 compatible personal computer
  Operating system        Windows 10, X64
  CPU                     Intel Core i7-7700 @ 3.60 GHz
  Main board              ASUS PRIME B250M-K (200 Series, Family—A2C8)
  RAM                     16 GB (GLOWAY DDR4 2133 MHz)
  Primary hard drive      Samsung SSD 850 EVO 1 TB
  GPU                     Nvidia GeForce GTX 1050 Ti (4 GB)
  Deep learning platform  Keras + TensorFlow
processes: (1) setting up the architecture, including initiating TensorFlow and importing VGG19, etc. (approximately 10–20 s); (2) building mappings between S and R by the PatchMatch method (~40 s), which involves extensive computation to measure similarities on hundreds of channels; (3) the optimization of color transfer by feature normalization and adding extra features, which also takes a relatively small portion of the time cost (~10 s).
If PSPNet is used as the semantic segmentation network to enhance feature extraction, the overall runtime is about 30 s longer than when using clustering. The extra time mainly occurs in the process of setting up PSPNet and building mappings between S and R.
4.2 Evaluation Mainstream color transfer algorithms have been verified by many previous works. Our algorithm mainly refers to He et al.'s work, which has been verified with a series of images ranging from artistic images to scenery images. In order to adjust the color transfer algorithm to fit day–night scene transfer for architectural renderings, we used feature normalization and added extra features as feature enhancement techniques. To evaluate these techniques, we conducted two sets of experiments: feature normalization plus clustering, and feature normalization plus semantic segmentation. In our tests, daytime architectural renderings are used as source images while night renderings are used as reference images. By comparing the results of applying these two ways to the same S and R, we found that both performed well enough under fine-tuning. Feature normalization plus clustering. In this set of tests, feature normalization and clustering were used to optimize the process of building mappings between S and R. We found that with these two feature enhancement techniques, good results (penultimate column in Fig. 5) can be achieved for most of our test images, which proved that our algorithm shows good adaptability in dealing with architectural renderings. Feature normalization plus semantic segmentation. As aforesaid, semantic segmentation requires manually tuning the outputs of segmentation networks, which depends on experience or even intuition; therefore, using this way to enhance feature extraction leads to relatively unstable results in application. With careful tuning, we also obtained satisfying transfer results (last column in Fig. 5) in this set of tests, and the results differ little from those of the feature normalization plus clustering method.
4.3 Day–Night Scene Transfer To validate our approach to day–night scene transfer for architectural renderings, we first explored the main approaches of color transfer (Sect. 2) and proposed our improvements (Sect. 3). In this section, part of our experimental results is shown in Fig. 6. These results prove that by making proper adjustments to well-developed color transfer technologies based on deep learning, image processing tasks like turning a fully rendered daytime scene into a night scene, indoor scene transfer, weather transfer, etc. can be finished automatically by computers, which saves time and human resource costs. Though the experiments in this paper focus on day–night scene
Fig. 5 Comparisons between clustering and semantic segmentation applied to architectural renderings
Fig. 6 Partial results of day–night scene transfer using our approach
transfer, the adopted approach can easily be extended to more application scenarios with little adjustment; for instance, based on proper luminance and chrominance transfer, an extra light source can be added to the transferred images to create a new scene graph. In the next section, we demonstrate the results of our attempts to create mid-autumn full-moon night renderings.
4.4 Creating Full-Moon Night Renderings Since our approach transfers both luminance and chrominance, it makes scene transfer more effective and natural. Inspired by this property, we tried to create mid-autumn full-moon night renderings from daytime architectural images. This is a novel creation and the outputs are striking (Fig. 7).
5 Conclusion In this paper, we demonstrate a refined algorithm based on main color transfer methods for day–night scene transfer between semantically related architectural renderings. It handles both color and luminance transfer, which improves the naturalness of the scene transfer, while the feature normalization and extra features techniques strengthen consistency and accuracy. The evaluation results have shown that this approach is applicable and effective for day–night scene transfer, and that it can also be adapted to further scene transfers such as different weather conditions and indoor scenes. Though it is a successful application attempt in the field of architecture, there are still limitations to this approach. The first is the precision of color transfer, which also troubles scholars of computer vision. To address this issue, we generally need to figure out better ways to build pixel correspondences between image pairs rather than fully relying on pre-trained models like VGG19. Because those models are trained on very generic datasets, though they save us a large amount of time in semantic matching across images, they sacrifice accuracy or adaptability to some extent. Particularly in our application scenario, more targeted semantic segmentation models should be developed to improve object (building, sky, tree, road, etc.) matching precision, thereby achieving more accurate color transfer. Secondly, the computation cost should be well considered in the future. Currently, dealing with normal-size renderings on a personal computer costs about 60 s, and more for high-resolution images. The time cost can be reduced by accelerating the process of building mappings between image pairs or by resorting to a GPU-based parallel computing framework. For future work, we will explore tailored semantic segmentation models for the field of architecture and further refine the color transfer algorithm. We can foresee
Fig. 7 Partial results of creating full-moon night renderings
that color transfer technologies will make greater progress in day–night scene transfer and will be extended to various scene transfer scenarios.
References 1. Dale, K., Johnson, M. K., Sunkavalli, K., Matusik, W., & Pfister, H. (2009). Image restoration using online photo collections. In 2009 IEEE International Conference on Computer Vision (ICCV) (pp. 2217–2224). IEEE.
2. Luan, F., Paris, S., Shechtman, E., & Bala, K. (2017). Deep photo style transfer. arXiv:1703.07511 [cs.CV].
3. An, X., & Pellacini, F. (2008). AppProp: All-pairs appearance-space edit propagation. In ACM Transactions on Graphics (TOG) (Vol. 27, p. 40). ACM.
4. Chen, D., Liao, J., Yuan, L., Yu, N., & Hua, G. (2017). Coherent online video style transfer. In Proceedings of the International Conference on Computer Vision (ICCV).
5. Chen, D., Yuan, L., Liao, J., Yu, N., & Hua, G. (2017). StyleBank: An explicit representation for neural image style transfer. In Proceedings of the CVPR (Vol. 1, p. 4).
6. Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556.
7. He, M., Liao, J., Chen, D., Yuan, L., & Sander, P. V. (2018). Progressive color transfer with dense semantic correspondences. arXiv:1710.00756 [cs.CV].
8. Levin, A., Lischinski, D., & Weiss, Y. (2004). Colorization using optimization. In ACM Transactions on Graphics (TOG) (Vol. 23, pp. 689–694). ACM.
9. Endo, Y., Iizuka, S., Kanamori, Y., & Mitani, J. (2016). DeepProp: Extracting deep features from a single image for edit propagation. In Computer Graphics Forum (Vol. 35, pp. 189–201). Wiley Online Library.
10. Yan, Z., Zhang, H., Wang, B., Paris, S., & Yu, Y. (2016). Automatic photo adjustment using deep neural networks. ACM Transactions on Graphics (TOG), 35(2), 11.
11. Arbelot, B., Vergne, R., Hurtut, T., & Thollot, J. (2017). Local texture-based color transfer and colorization. Computers & Graphics, 62, 15–27.
12. Internet contributors. Pretrained VGG-19 convolutional neural network [Internet]. MathWorks [cited 6 Sep, 2019]. Retrieved from https://www.mathworks.com/help/deeplearning/ref/vgg19.html.
13. Zheng, Y., Yang, C., & Merkulov, A. (2018). Breast cancer screening using convolutional neural network and follow-up digital mammography. In Proceedings of the SPIE 10669, Computational Imaging III, 1066905. https://doi.org/10.1117/12.2304564.
14. Barnes, C., Shechtman, E., Finkelstein, A., & Goldman, D. B. (2009). PatchMatch: A randomized correspondence algorithm for structural image editing. In SIGGRAPH.
15. Simakov, D., Caspi, Y., Shechtman, E., & Irani, M. (2008). Summarizing visual data using bidirectional similarity. In 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1–8). IEEE.
16. He, K., Sun, J., & Tang, X. (2010). Guided image filtering. In ECCV (pp. 1–14).
17. He, K., Sun, J., & Tang, X. (2013). Guided image filtering. TPAMI, 35(6), 1397–1409.
18. He, K., & Sun, J. (2015). Fast guided filter. arXiv:1505.00996 [cs.CV].
19. GitHub contributors. Implementation of Neural Color Transfer between Images by PyTorch [Internet]. GitHub 2018 [cited 5 Sep, 2019]. Retrieved from https://github.com/rassilon712/Neural_Color_Transfer.
20. GitHub contributors. Fast End-to-End Trainable Guided Filter [Internet]. GitHub 2018 [cited 5 Sep, 2019]. Retrieved from https://github.com/wuhuikai/DeepGuidedFilter/tree/master/GuidedFilteringLayer/GuidedFilter_TF
Form Finding and Evaluating Through Machine Learning: The Prediction of Personal Design Preference in Polyhedral Structures Hao Zheng
Abstract 3D Graphic Statics (3DGS) is a geometry-based structural design and analysis method, helping designers to generate 3D polyhedral forms by manipulating force diagrams with given boundary conditions. By subdividing 3D force diagrams with different rules, a variety of forms can be generated, resulting in more members with shorter lengths and richer overall complexity in the forms. However, it is hard to evaluate the preference toward different forms from the aspect of aesthetics, especially for a specific architect with his own sense of beauty and taste in forms. Therefore, this article proposes a method to quantify the design preference of forms using machine learning and to find the form with the highest score based on the result of a preference test taken by the architect. A dataset of forms was first generated; then the architect was asked to repeatedly pick a favorite form from a set of forms in order to record his preference. After being trained with the test results, the neural network can evaluate a newly inputted form with a score from 0 to 1, indicating the predicted preference of the architect and showing the possibility of using machine learning to quantitatively evaluate personal design taste. Keywords Machine learning · Form finding · 3DGS · Generative design
1 Introduction 1.1 3D Graphic Statics Graphic Statics (2D/3D) is a geometry-based structural design and analysis method. The history of graphic statics can be traced back to the Hellenistic Age, when Archimedes used algebraic formulas and illustrations to explain, in his book On the Equilibrium of Planes, that under equilibrium conditions in the law of levers the weight of an object is inversely proportional to its distance from the fulcrum. In 1864, after systematically organizing and expanding this knowledge, Karl Culmann named this H. Zheng (B) School of Design, University of Pennsylvania, Philadelphia, USA e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 P. F. Yuan et al. (eds.), Architectural Intelligence, https://doi.org/10.1007/978-981-15-6568-7_13
Fig. 1 2D versus 3D funicular solutions and their corresponding force diagrams [2]
subject as “Graphische Statik” (Graphic Statics) in his book Die Graphische Statik [4], which was widely accepted by the academic community. However, 2D Graphic Statics has its own limitations. As [1] states, it has been developed as a 2D method. Thus, only 2D abstractions of 3D structures can be designed. With the development of computing power, 3D Graphic Statics attracted the attention of researchers. Under the support of computers, architects developed digital algorithms to generate 3D forms from 3D force diagrams, such as [2, 3]. The computational solution of 3D Graphic Statics helps designers to generate 3D polyhedral forms by manipulating force diagrams with given boundary conditions, which is easier and more efficient than the direct modeling of forms. In the form finding of 3D Graphic Statics, the transformation rules from force diagrams to form diagrams work similarly as the situation in 2D. Figure 1 shows the comparison of 2D and 3D Graphic Statics, where figure (b) and (d) are the force diagrams and figure (a) and (c) are the form diagrams. Each applied load Fi in the force diagram represents a corresponding load force F i in the form diagram, both of which are perpendicular to each other. Each exterior supporting force F ei in the force diagram results in a structural member ei in the form diagram, which shows the corresponding form of a force diagram. By generating or adjusting the polyhedral force geometries, different truss-like forms can be provided using 3D Graphic Statics. The advantage of this form finding algorithm is that the generated structures are always equilibrium under given boundary conditions. As long as the force diagram is a set of closed polyhedrons, the corresponding form can stay balanced under the action of applied forces. So when designing a form with given applied loads, architects can divide the force polyhedrons with additional interior faces to achieve complexity while keeping the form equilibrium.
1.2 Subdivision of Force Diagrams

To set up the boundary condition, a tetrahedron is selected as the original force diagram, indicating the four applied forces. In the Cartesian coordinate system, the bottom three vertices of the tetrahedron are always on the XY plane, so they can
be expressed as (x1, y1), (x2, y2), and (x3, y3) without a Z value, while the top vertex contains three numbers (x4, y4, z4). Based on random coordinates of the four vertices, different boundary conditions can be generated, further extending the method to explore complex forms. The subdivision happens inside the tetrahedron, keeping the directions and magnitudes of the applied forces constant. To subdivide a polyhedral force diagram, [5] proposes eight possible algorithms, six of which are programmable and practical. Figure 2 shows subdivisions 1, 2, and 3, which do not subdivide the original exterior faces. Different from subdivisions 1, 2, and 3, subdivisions 6, 7, and 8 (Fig. 3) subdivide the original exterior faces to achieve further complexity. But since the total area and direction of each exterior face remain the same, the overall boundary condition does not change after subdivision. Among the six subdivision rules, subdivision 7 is the most complicated, resulting in 24 new cells, while subdivision 1 is the simplest, constructing only 4 new cells. Different boundary conditions cause similar but distinct force and form diagrams when the same subdivision rule is applied, while different subdivision rules result in totally different force and form diagrams. Since each subdivision rule takes one cell as input and returns several new cells as output, it is possible to subdivide the sub-cells further, iteratively. Figure 4 shows the resulting grid of the iterative subdivision, in which two subdivision rules work in turn to subdivide the force diagrams. The numbers indicate which rules are applied: for example, 2 indicates that only rule 2 is applied once, while 36 means that rule 3 is applied to the original force tetrahedron and then rule 6 is applied to the sub-cells generated by rule 3. In total, 36 iterative subdivision rules are created, as enumerated in the sketch below.
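The bookkeeping of this setup can be sketched in a few lines; the coordinate ranges below are arbitrary placeholders, and the actual polyhedral subdivision geometry is left to the 3DGS solver.

```python
import itertools
import random

RULES = (1, 2, 3, 6, 7, 8)  # the six programmable subdivision rules from [5]

def random_boundary_tetrahedron(seed=0):
    """Three base vertices on the XY plane plus one apex above it;
    the coordinate ranges here are illustrative only."""
    rng = random.Random(seed)
    base = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
    apex = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0.5, 2.0))
    return base, apex

# A two-step label such as "36" applies rule 3 to the boundary tetrahedron
# and then rule 6 to every resulting sub-cell, giving 6 * 6 = 36 rules.
iterative_rules = [f"{a}{b}" for a, b in itertools.product(RULES, repeat=2)]
assert len(iterative_rules) == 36
print(iterative_rules[:8])  # ['11', '12', '13', '16', '17', '18', '21', '22']
```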
Fig. 2 Subdivision rules 1, 2, 3, and the corresponding forms
Fig. 3 Subdivision rules 6, 7, 8, and the corresponding forms
Fig. 4 6 basic subdivision rules and 36 iterative subdivision rules
2 Methodology

2.1 Preference Test

Among the variety of generated forms, in addition to structural performance, architects also take visual effect and inexplicit aesthetics into consideration, which are hard to express as a formula. But with the help of machine learning, programs can learn the relationship between two sets of data, including the relationship between forms and scores, which encodes the personal preference of an architect. So, in order to train a computer to grade a form, a preference test was designed and given to voluntary testers to collect the training dataset for the neural network. First, a dataset of 400 pairs of force and form diagrams was generated, based on random boundary tetrahedrons and iterative subdivision rules. However, it is very hard and time-consuming to ask a tester to grade them all, since the scoring process would be influenced by a variety of factors if the test lasted a long time, such as the mood of the tester and fluctuations in the scoring standard. So rather than asking the tester for 400 scores, the testing system shows 200 batches of forms, each batch containing 6 forms listed from left to right (Fig. 5). For each batch, the tester is asked to pick the favorite of the 6 forms, and a recorder adds a score of 0.33 to that form. Since each form is shown exactly 3 times (200 * 6 / 400), after the whole test each form will have a score of 0,
Fig. 5 Testing batch and accumulated score system
0.33, 0.66, or 1, showing the preference of the tester. In this way, it is much easier for the tester to grade all 400 forms, simply by picking the comparative favorite out of 6 forms. The whole test takes around 10 min for a well-educated architect, an average of 3 s per batch. In order to prove the feasibility of machine learning in predicting the scores in the next step, three tests are given to the same tester. In the first test, the tester is asked to always pick the simplest form from his perspective, while in the second test the tester is asked to do the opposite and always pick the most complex form. The reason for these two comparative tests is to reduce personal preference to a degree of complexity that is easy to evaluate by common sense. If the machine learning algorithm successfully gives higher scores to simple forms in the first model and higher scores to complex forms in the second model, the feasibility is proved, making the third test convincing; in the third test, the tester is asked to pick forms based on his own preference.
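The accumulation of scores can be made concrete with a short sketch. The random pick below stands in for the human tester, and the simple shuffled pool means a batch could in principle repeat a form, which the real test presumably avoided.

```python
import random

def run_preference_test(n_forms=400, batches=200, batch_size=6, seed=0):
    """Each form enters the pool 3 times (200 * 6 / 400 = 3); the pick in
    each 6-form batch earns 0.33, so final scores land on 0, 0.33,
    0.66, or ~1."""
    rng = random.Random(seed)
    pool = list(range(n_forms)) * 3   # every form is shown exactly 3 times
    rng.shuffle(pool)
    scores = [0.0] * n_forms
    for b in range(batches):
        batch = pool[b * batch_size:(b + 1) * batch_size]
        favorite = rng.choice(batch)  # the tester's pick for this batch
        scores[favorite] += 0.33
    return scores
```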
2.2 Machine Learning

After collecting the data from the three tests, and similar to the work by [8], a vector-based artificial neural network was used as the machine learning framework to learn the mapping from forms to scores, figuring out the rules for grading a form based on the preference test. Other neural networks, such as the pixel-based CNN used by [6] and the voxel-based 3D CNN used by [7], are not suitable for learning the polyhedral forms, because only a 3D representation of line-like models can adequately describe polyhedral forms, rather than a 2D representation of solid-like models.

To apply the neural network to the data, the first step is to transform the data into a well-structured format, so that it can be understood digitally by the computer. In this case, to build a polyhedral form based on 3DGS, the necessary data comprises the original boundary tetrahedron and the iterative subdivision rule. As Fig. 6 shows, a tetrahedron with a horizontal bottom face can be expressed as (x1, y1), (x2, y2), (x3, y3), and (x4, y4, z4), which is a set of nine real numbers. The iterative subdivision rule can be expressed as a set of 12 Boolean numbers of 0 or 1, in which the first 6 numbers represent the first subdivision rule and the last 6 numbers represent the second subdivision rule. For example, (0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0) means subdivision 7 is applied first, followed by subdivision 2. So a set of 21 numbers in total is enough to identify a form and also fits the requirements of the input data of a neural network. The output data is much simpler: it contains only one real number, the score of a form.

Fig. 6 Data format of a form and its score. Left: original force diagram. Middle: generated subdivided form. Right: input and output data structure

Next, an artificial neural network with 2 hidden layers was built (Fig. 7), which contains 21 input neurons, 1 output neuron, and 2 layers of 50 hidden neurons. The activation function, which acts as a transformation rule to calculate the value of the current neuron from the previous neurons, is a sigmoid function. Since all data are real numbers from 0 to 1, the sigmoid function can achieve a high prediction accuracy while keeping the slope of the learning rate proper in each learning period. The formula is expressed as follows, where ŷ is the value in the current neuron, x is the value in the previous neuron(s), and w and b are the parameters of the network, which represent its learning ability and are determined by training:

ŷ = Sigmoid(w · x + b)

Fig. 7 Neural network structure

To evaluate the accuracy of the predicted score against the ground truth score and activate the backpropagation that updates the network parameters, a Mean Squared Error (MSE) function is set as the loss function, calculating the difference between the predicted value ŷ and the true value y:

LOSS(y, ŷ) = (1/n) · Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²
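A minimal PyTorch sketch of the encoding and the network described above might read as follows; the one-hot placement of the rule flags is an assumption consistent with the 12-Boolean example in the text, not necessarily the author's exact encoding.

```python
import torch
import torch.nn as nn

# 21 inputs = 9 tetrahedron coordinates + 2 * 6 one-hot rule flags;
# two hidden layers of 50 neurons; sigmoid activations; MSE loss.
model = nn.Sequential(
    nn.Linear(21, 50), nn.Sigmoid(),
    nn.Linear(50, 50), nn.Sigmoid(),
    nn.Linear(50, 1),  nn.Sigmoid(),
)
loss_fn = nn.MSELoss()

def encode_form(base_xy, apex_xyz, first_rule, second_rule):
    """Pack one form into the 21-number input vector: the rule arguments
    are indices 0..5 into the six subdivision rules (assumed ordering)."""
    coords = [c for xy in base_xy for c in xy] + list(apex_xyz)  # 9 numbers
    flags = [0.0] * 12
    flags[first_rule] = 1.0
    flags[6 + second_rule] = 1.0
    return torch.tensor(coords + flags, dtype=torch.float32)
```

A training step is then the usual loop: compute loss_fn(model(x), target), backpropagate, and step an optimizer over the recorded (form, score) pairs.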
Lastly, three separate models were trained using the three preference datasets. After 18,000 epochs of training, the loss value decreased to an acceptable range near 0.00005 per training item, which means the average difference between the predicted score and the ground truth score is inside the ideal range. Thus, mathematically speaking, the three trained neural network models all have the ability to predict the score of a given polyhedral form, but they provide different scores, since they were trained on different preference test results.
3 Results

3.1 Comparative Models Evaluation

Based on the trained models, for a given boundary tetrahedron, scores for all subdivision rules can be provided by the neural network; that means the best form can be found by searching the 36 subdivided forms, predicting the design preference of the tester. Before searching and evaluating with the preference model, the two comparative models are analyzed to prove the feasibility of this machine learning form-finding method. To test the scoring system, a randomly generated tetrahedron and all 36 subdivision rules are used as input. Figure 8 shows selected scoring results of the two comparative models. It can be clearly seen that, in the first, simple-picking comparative model, the forms with higher scores are often simpler than the others, while the situation in the second, complex-picking comparative model is reversed: complex forms are often granted higher scores. That means both models work correctly to
Fig. 8 Top: evaluated scores of the simple-picking comparative model. Bottom: evaluated scores of the complex-picking comparative model
Fig. 9 Evaluated scores of the preference model for the testing boundary tetrahedron
predict the scores. Concluded from this testing and observation, the trained neural network has been shown to have the ability to grade a polyhedral form based on the design preference of the architect who trained it.
3.2 Preference Model Evaluation

With the conclusion above, the preference model was then used to evaluate the generated forms. Figure 9 shows selected forms and scores for the testing boundary tetrahedron. There is no obvious tendency in the complexity of the forms, since the training dataset was produced from the inexplicit personal preference of the tester. That means, in the preference model, it is the boundary condition together with the subdivision rules that decides the score of a form. This phenomenon further proves that complexity is not, or at least not the only, factor influencing the tester's preference. But unlike the previous comparative models, it is impossible to prove the feasibility of the preference model by a common-sense standard, unless the tester himself evaluates the forms again and compares his scores with the predicted scores from the neural network. So, based on this assumption, a test re-evaluating the forms was taken by the tester. First, since it is still hard to ask a tester to grade so many forms with different scores from 0 to 1 while keeping his standards constant during the test, four levels, A–D, were used to describe the grades of the forms, in which A means the tester feels the form is perfect, B means the tester likes the form, C means the tester feels just fine about the form, and D means the tester does not like the form. In the same way, the predicted scores from the neural network were translated into this grading system: a score higher than 0.66 is graded A, a score between 0.33 and 0.66 is graded B, a score between 0.01 and 0.33 is graded C, and the rest are graded D. The reason for setting 0.66, 0.33, and 0.01 as the grade boundaries is that, in the previous preference test, a score higher than 0.66 means the corresponding form must have been picked all three times it was shown, so the form must be perfect to the tester. The same reasoning applies to the 0.33 and 0.01 boundaries, which mean the form must have been picked twice or once. According to the result of the re-evaluation, although the two
Fig. 10 Form finding by the highest score for three random boundary conditions
rankings are not exactly the same, two grades for the same form never differ by more than one level. Generally speaking, around 80% of the predicted rankings are correct, which is already remarkable. It proves that this scoring system can serve as a reference for the architect when deciding which form to use and develop. Using this preference model, a form-finding method can be achieved, which evaluates all 36 subdivision rules for a given boundary tetrahedron and selects the form with the highest score as the best form. Figure 10 shows the outcome of the form finding for five random boundary tetrahedrons, with the re-evaluated grades from the same tester shown at the top right of the forms. While 3 of the 5 forms are marked A, the other 2 are marked B, even though they have the highest scores compared with the other 35 forms. But if the top three forms are presented together and further selected by the tester, as Fig. 11 shows, there is a much greater chance that at least one form is graded “A”. Although this is a compromise, it already largely narrows down the range of forms for the architect to choose from. The search itself reduces to a few lines, as sketched below.
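The sketch below combines the exhaustive search over the 36 rule combinations with the ABCD translation described above; encode_form and model refer to the earlier sketch and remain assumptions about the exact encoding.

```python
import torch

def grade(score):
    """Translate a predicted score into the ABCD grades used above."""
    if score > 0.66: return "A"
    if score > 0.33: return "B"
    if score > 0.01: return "C"
    return "D"

def best_forms(model, base_xy, apex_xyz, top_k=3):
    """Score all 36 rule combinations for one boundary tetrahedron and
    return the top-k as (score, grade, (first_rule, second_rule))."""
    results = []
    with torch.no_grad():
        for a in range(6):
            for b in range(6):
                s = model(encode_form(base_xy, apex_xyz, a, b)).item()
                results.append((s, grade(s), (a, b)))
    return sorted(results, reverse=True)[:top_k]
```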
Fig. 11 Secondary selection from the user
4 Conclusion and Discussion

In conclusion, 3DGS is a powerful and convenient method for the architect to find a form in structural design. By subdividing the polyhedral geometries in the force diagrams, a variety of forms can be generated, each quite different, and the set of parameters expressing a form is unique and distinguishable from the others. The clear data structure of the polyhedral forms makes them suitable for a neural network to learn their features and then evaluate them. Thus, the neural network can also learn the design preference of a specific architect, by learning the results of a preference test taken by that architect. What seemed inexplicit and unclear before can now be quantified through a test and mathematically evaluated by machine learning. Forms with higher scores can be generated and selected automatically from the solution space, and these have a higher possibility of satisfying the architect's personal design taste. As long as there is data, machine learning can work to learn the relationships within it. In the future, the tendency toward design cooperation between human and machine will become clearer. The machine will assist the design process not only in simple repeated work but also in creative work, by learning from design examples made by human beings. The next step of this research is to extend the use of machine learning to other kinds of design, and to develop neural networks for learning and generating architectural geometries in different design tasks.
References

1. Akbarzadeh, M. (2016). 3D graphic statics using polyhedral reciprocal diagrams. Ph.D. thesis, ETH Zurich, Zürich, Switzerland.
2. Akbarzadeh, M., Van Mele, T., & Block, P. (2016). Three-dimensional graphic statics: Initial explorations with polyhedral form and force diagrams. International Journal of Space Structures, 31, 217–226.
3. Bolhassani, M., Ghomi, A. T., Nejur, A., Furkan, M., Bartoli, I., & Akbarzadeh, M. (2018). Structural behavior of a cast-in-place funicular polyhedral concrete: Applied 3D graphic statics. In Proceedings of the International Association for Shell and Spatial Structures (IASS) Symposium 2018, MIT, Boston, USA, July 2018.
4. Culmann, C. (1864). Bericht an den hohen schweizerischen Bundesrath über die Untersuchung der schweiz. Wildbäche: vorgenommen in den Jahren 1858, 1859, 1860 und 1863. Zürcher und Furrer.
5. Ghomi, A. T., Bolhassani, M., Nejur, A., & Akbarzadeh, M. (2018). The effect of subdivision of force diagrams on the local buckling, load-path and material use of founded forms. In Proceedings of IASS Symposium 2018, MIT, Boston, USA.
6. Huang, W., & Zheng, H. (2018). Architectural drawings recognition and generation through machine learning. In Proceedings of the 38th Annual Conference of the Association for Computer Aided Design in Architecture, Mexico City, Mexico.
7. Newton, D. (2018). Multi-objective qualitative optimization (MOQO) in architectural design. In Proceedings of the 36th International Conference on Education and Research in Computer Aided Architectural Design in Europe, Poland. 8. Sjoberg, C., Beorkrem, C., & Ellinger, J. (2017). Emergent syntax. In Proceedings of the 37th Annual Conference of the Association for Computer Aided Design in Architecture, Boston, United States.
Comparison of BESO and SIMP to Do Structural Topology Optimization in Discrete Digital Design, and then Combine Them into a Hybrid Method Gefan Shao
Abstract On account of the high efficiency of discrete digital design compared with 3D printing in the context of additive manufacturing, this essay introduces a hybrid, high-efficiency method combining BESO and SIMP for solving topology optimization in discrete digital design. The motivation is that both BESO in Karamba3D and SIMP in Millipede have disadvantages and cannot optimize a structure in an extremely efficient way in discrete design. Based on the project TRANSFOAMER (Chen Ran, Chen Zhilin, Shao Gefan, Wei Na, 2016–2017), RC4, Bartlett School of Architecture, numerous tests are conducted to demonstrate how the hybrid method operates and why it is more efficient than each single method. Finally, the method is applied to the project to design several products.

Keywords Discrete design · Topology optimization · BESO · SIMP
1 Context of Discrete Design and Topology Optimization

1.1 Background of Digital Design and Discrete Assembly

According to philosopher Goodman [1], the digital computer is a theory of counting and transferring signals by a binary system (1968, pp. 159–160). Architect Retsin [2] argues that digital discrete design, a kind of additive manufacturing method, means that both the design process and the physical organization of material should be digital and discrete (2015, p. 144), unlike continuous methods such as 3D printing and CNC milling. Physicist Neil Gershenfeld et al. [3] claim that the implication of the use of digital materials should be defined as “reversibly assembled from a discrete set of parts with a discrete set of relative positions and orientations into large scale projects” (2015, p. 122). A construction can be regarded as an aggregation combined from thousands of discrete elements, like the “bits” of the computer field,
and developed automatically by a set of rules with limited positions and orientations. Unlike projects such as The Programmed Wall (2006) by Kohler [4], which piles bricks with analogue orientations of increasing degrees of freedom, architect Sanchez [5], p. 91 argues that differentiated and complex space can also be produced with serialized and serially repetitive elements. Retsin et al. [6], p. 180 demonstrate that discrete digital design aims to produce efficient heterogeneous space from identical asymmetric pieces with a limited set of combinatorial design strategies.
1.2 Topology Optimization and Architecture Design

Most problems that designers need to solve are what is known as “wicked problems” [7]. These contain incomplete, contradictory, and changing requirements that are often difficult to recognize (Australian Public Service Commission, 2007, p. 3), so solutions to wicked problems cannot easily be regarded as true or false, only better or worse. The concept of topology optimization was first raised by Bendsøe and Kikuchi [8] (pp. 197–198). They claim that topology optimization is conducted by giving some constraint conditions while satisfying other design requirements, and that it is related to modern manufacturing techniques in the augmented age, in which the computer participates in the optimization process to calculate the optimal distribution in space of an anisotropic material. This issue also appears in architecture, where design concerns numerous factors, such as structure, costs, material, appearance, and scale. The Roman architect Vitruvius claims that a building should follow three basic principles: strength, utility, and beauty (firmitas, utilitas, venustas) [9]. However, there is usually a chasm between the vision of the architect and that of the structural engineer in the traditional process. Since architects focus more on aesthetics and engineers focus more on techniques, their cooperation, if they divide the work, would be (at worst) a compromise or (ideally) a synergy [10], pp. 715–717. Historically, there have been architects whose aesthetic design also follows very creative structural design, for example, Antoni Gaudí, who used physical models to calculate sophisticated structures, and Frei Otto, who tested membrane structures for long-span architecture through experiments with soap bubbles. When it comes to additive manufacturing, it becomes feasible to manufacture complex topologies: theoretically, any shape improved by topology optimization is able to be built by digital design. The method allows the designer to define constraint conditions such as loads, support points, and joint types of the final shape and mass; then, by computing the optimal distribution, the shape is changed into a structure with less material that still follows the optimal structural strength. In this case, the advantages of topology optimization, such as weight and material saving, can be exploited to the maximum [2], p. 17. Since the biggest advantage of discrete digital design is its efficiency in producing a wealth of heterogeneous space with limited combination methods, topology optimization further increases efficiency by regarding minimum material weight as the primary condition of this wicked problem.
2 Existing Topology Optimization Methods: SIMP and BESO

2.1 Solid Isotropic Material with Penalisation (SIMP)

Before 1989, topology optimization had only been studied with integer values of 0 and 1. By giving less influence to intermediate values of the structural analysis variables, [11], p. 194 proposed a method to vary the design variables continuously, resulting in a non-discrete solution. This was the beginning of the development of the method named Solid Isotropic Material with Penalization (SIMP) [11]. Later it was refined by engineers such as Rietz [12], Sigmund [13], and Martinez [14]. SIMP is a density-based topology optimization method. According to Aremu et al. [15], p. 684, the process of SIMP is a kind of integral analysis: after repeated density analyses, the density of the mass is updated. The process stops once mass has converged into the high-density parts and drained from the low-density parts. Therefore, as the mass ratio drops off, the result looks like a continuous mass with the higher-density volume remaining and the lower-density volume deleted. The whole process can be generalized into the chart below (Fig. 1), which shows the algorithm of this method. A Grasshopper plugin called Millipede implements a similar theory to SIMP.
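As a concrete illustration of the density update at the heart of SIMP, the sketch below shows a penalized stiffness interpolation and an optimality-criteria style density update, loosely following the structure of Sigmund's 99-line code [13]. The sensitivity array dC is assumed to come from a finite element solve, which is not shown; this is an illustrative fragment, not Millipede's actual implementation.

```python
import numpy as np

def penalized_modulus(rho, E0=1.0, Emin=1e-9, p=3):
    """SIMP stiffness interpolation: intermediate densities are penalized
    so the optimizer is pushed toward 0/1 solutions."""
    return Emin + rho ** p * (E0 - Emin)

def simp_density_update(rho, dC, vol_frac, move=0.2, rho_min=1e-3):
    """One optimality-criteria update: density grows where the compliance
    sensitivity dC (negative, from an FE solve not shown here) is large
    in magnitude, under a global volume constraint."""
    lo, hi = 1e-9, 1e9
    while hi - lo > 1e-6 * (lo + hi):
        lmid = 0.5 * (lo + hi)
        rho_new = np.clip(rho * np.sqrt(np.maximum(-dC, 0.0) / lmid),
                          np.maximum(rho - move, rho_min),
                          np.minimum(rho + move, 1.0))
        # Bisect the Lagrange multiplier until the volume target is met.
        if rho_new.mean() > vol_frac:
            lo = lmid
        else:
            hi = lmid
    return rho_new
```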
Fig. 1 Flow chart of SIMP [15]

2.2 Bi-Directional Evolutionary Structural Optimization (BESO)

BESO is a method that combines ESO (an algorithm for removal) with AESO (an algorithm for addition) [16], p. 1031. In 2006, Huang et al. updated the existing BESO method by Querin et al. into a new algorithm with only one parameter, i.e., the removal ratio of volume (weight) [17], pp. 1091–1092. BESO operates more like a discrete method: although the finite elements are subdivided into quite tiny beams, the results are nothing but 1 (add) or 0 (remove). Aremu et al. [15], p. 684 also give a flowchart of the BESO algorithm for the minimization of strain energy (SE) for a given volume fraction constraint (V*) (Fig. 2). In this chart,1 an evolution rate (ER), at which the volume is allowed to change per iteration, a filter radius (FR), which is a distance limit, V*, and the design domain are supplied to the algorithm. The volume is thus optimized into a relatively more efficient shape after the set number of iterations. There is a component called BESO analysis in the Grasshopper plugin Karamba3D in Rhinoceros, to optimize topology in structural grids and structural shapes [18]. For discrete digital design, the analysis of loads on beams can be regarded as a kind of conversion of discrete elements; thus, in this essay, most tests will be done with BESO for Beams to optimize the structure of discrete digital architecture.

1 Nomenclature: FR = filter radius; ER = evolution rate; V* = volume fraction constraint; λα = sensitivity of element α; λth,add = element sensitivity threshold value (adding); λth,del = element sensitivity threshold value (deleting); ΔSE = change in strain energy; tol = convergence tolerance, 1e-5.
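In the same spirit, the hard-kill add/remove logic that BESO applies to the discrete beams can be sketched as a ranking step: elements whose sensitivity exceeds a threshold are kept (1), the rest removed (0), with the kept volume moving toward the target by at most the evolution rate per iteration. Again, the sensitivities are assumed to come from an FE solve, and filtering by FR is omitted; this is not Karamba3D's implementation.

```python
import numpy as np

def beso_step(sensitivity, current_volume, target_volume, er=0.02):
    """One BESO iteration: keep the highest-sensitivity elements so that
    the kept volume fraction moves from current_volume toward
    target_volume by at most the evolution rate er."""
    next_volume = max(target_volume, current_volume * (1.0 - er))
    n_keep = max(1, int(round(next_volume * sensitivity.size)))
    threshold = np.sort(sensitivity)[-n_keep]
    return sensitivity >= threshold  # Boolean mask: 1 = keep, 0 = remove
```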
3 Project TRANSFOAMER

Project TRANSFOAMER aims to cut mega blocks of high-density XPS foam into pieces with a hot-wire cutter and then pick and place them into a large-scale aggregation through a spatial design strategy (Fig. 3). Retsin [19] describes the discrete digital system as an efficient method to design, fabricate, automate, and build at the scale of architecture (2016, p. 144). The main target of the research project is therefore efficiency: not only the design process but also the fabrication must be very fast. In terms of material, foam itself can be delivered and fabricated very quickly; in terms of fabrication, each foam piece can be precisely manufactured by a robotic hot-wire cutter within only 18 min; in terms of design, the concept of efficiency should also be the main purpose. Given the known fabrication time of each discrete piece, efficiency is embodied in the limited amount of digital material, but it also
Fig. 2 Flow chart of BESO [15]

Fig. 3 Robotic hot-wire cutting-project TRANSFOAMER
meets the strongest structural requirements. In other words, the point of topology optimization in discrete digital design is to find the most efficient organization of discrete pieces with which to build a construction. The discrete piece follows two triangular grids and one square grid, and it introduces a method of point-to-point connection in the computational logic. Each piece has a center point (red) and eight serialized surrounding points (blue) (Fig. 4). At every aggregation step, a cloned piece connects its red point to one of the blue points of a previous piece and is then rotated by one of eight allowed Euler angles until a strong, collision-free combination is found (Fig. 5). The line between the center points of the first and second pieces shows the direction of aggregation, as sketched in the code below.
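The point-to-point aggregation logic can be sketched as a simple rejection loop; the Piece object, its blue_points, its placed() method, and the collides() test are hypothetical helpers standing in for the actual geometric operations in the project's Grasshopper definition.

```python
import random

def aggregate(seed_piece, n_pieces, euler_rotations, collides, seed=0):
    """Grow an aggregation: the clone's center (red) point snaps to one of
    the eight surrounding (blue) points of an existing piece, then the
    eight allowed Euler rotations are tried until one avoids collision."""
    rng = random.Random(seed)
    placed = [seed_piece]
    while len(placed) < n_pieces:
        host = rng.choice(placed)                      # piece to grow from
        anchor = rng.choice(host.blue_points)          # one of its 8 blue points
        for rotation in rng.sample(euler_rotations, len(euler_rotations)):
            candidate = host.placed(anchor, rotation)  # clone at anchor, rotated
            if not collides(candidate, placed):
                placed.append(candidate)
                break
    return placed
```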
Fig. 4 Piece analysis-project TRANSFOAMER (left)
Fig. 5 Different rotations of pieces-project TRANSFOAMER (right)
In this way, the piece can be guided to propagate along a series of structure lines after the structural analysis. In order to find the equivalent transformation between pieces and structure lines, structural tests between just two pieces were also run in Karamba3D, showing the displacement (m) of each combination and the orientation between their center points (Fig. 6). After many different combinatorial structural tests, the piece shows a connection feature between center points: the force follows the direction between center points, so the combination of pieces can be regarded as a structural line at a 60-degree angle in a special grid of 0.15 * 0.15 * 3√3/20 (Fig. 7). Beyond that, properties such as the material and cross sections need to be set to bring the analysis closer to reality. In this project, the material is high-density XPS foam, with a density of 0.3 kN/m3, a Young's modulus of 15,000 kN/cm2, a shear modulus of 6481 kN/cm2, and a yield strength of 23.5 kN/cm2; the cross section is a trapezoid with an upper width of 1 cm and a lower width
Fig. 6 Structure analysis of different combinations-project TRANSFOAMER
Fig. 7 Piece grid-project TRANSFOAMER
of 15 cm, which is the exact cross section of the piece. Further research is therefore all based on this special grid, with its structural lines (beams), material, and cross-section shape. The main comparison between the two methods concerns efficiency, which means comparing the number of pieces (beams) that remain after optimization while the structure stays stable. Typically, the maximum deformation of a beam is limited to the beam's span length divided by 250, so the tests aim to find the minimum number of pieces while keeping the displacement below roughly 1/250 of the beam's span [20], p. 90.
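This serviceability criterion is simple enough to state directly in code; the example numbers reuse the SIMP beam result reported later in Table 1 purely as an illustration.

```python
def passes_deflection_limit(displacement_m, span_m, ratio=250):
    """The criterion from [20]: the maximum deformation must stay below
    span / 250 for the structure to count as serviceable."""
    return displacement_m <= span_m / ratio

# Example: the 12 m beam may deflect at most 12 / 250 = 0.048 m,
# so a displacement of 0.0235 m passes the check.
print(passes_deflection_limit(0.0235, 12.0))  # True
```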
4 BESO and SIMP Based on Project TRANSFOAMER

No matter which method is applied to optimize a digital discrete building, the aim is to search for the best constraint conditions to balance structure, material, form, and aesthetics. This part presents a series of tests on some typical shapes (Fig. 8), using both BESO in Karamba3D and SIMP in Millipede.
4.1 BESO in TRANSFOAMER

In the BESO method, the load is set to 1 kN/m3, the number of change iterations is 10, and the number of convergence iterations is 20. The tension factor, compression factor, and BESO factor are set to 1, 2, and 0, respectively. In order to make the result more intuitive, all pieces are replaced by 60-degree structure lines to show the percentage of remaining volume (Fig. 9). The most efficient result occurs when the distortion reaches 1/250 of the cantilever length. Some typical percentages are listed as follows.
Fig. 8 a Cube with only 1 load surface and 1 support part, 3 m * 3 m * 3 m; b Truss with 1 load surface and 2 supports, 6 m * 1.5 m * 1.5 m; c Column with 4 load points and 1 support part, 3 m * 3 m * 4 m; d One-floor building with support points in 4 corners, 6 m * 6 m * 3 m; e Beam with 1 load surface and 2 supports, 12 m * 1.5 m * 1.5 m
Fig. 9 Topology optimization by Karamba3D with typical structures: a 55%, b 63%, c 3%, d 45%, e 87%
4.2 SIMP in TRANSFOAMER

In the SIMP method, the load is also set to 1 kN/m3. The cell size is 0.15 m, and the number of optimization iterations is set to 5. In order to get a relatively smooth result, the smooth factor is 0.2 with a target density of 0.05. Compared with BESO in Karamba3D, SIMP in Millipede shows a relatively continuous result, but the criterion for approaching the most efficient result remains the same (Fig. 10).
Fig. 10 Topology optimization by Millipede with typical structures. a 29%, b 14%, c 5%, d 10%, e 59%
Fig. 11 Topology optimization by hybrid method-step 1: a 30%, b 15%, c 5%, d 10%, e 60%
4.3 Hybrid Method of BESO and SIMP

In this part, a hybrid method is introduced. The structure is optimized by SIMP in step 1 (Fig. 11) to get a rough, irregular structural shape; after transformation into the discrete grid, the inner structure is optimized further by BESO into a more efficient, stable structure. Results for the same structures are shown in Fig. 12, and a schematic of the pipeline follows below.
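A schematic of the two-step pipeline might read as follows; fe_sensitivities, to_discrete_grid, and beam_sensitivities are hypothetical stand-ins for the finite element analyses that Millipede and Karamba3D perform internally, and simp_density_update and beso_step refer to the sketches given earlier.

```python
import numpy as np

def hybrid_optimize(n_cells, vol_frac=0.05, simp_iters=5, beso_target=0.015):
    """Two-step hybrid scheme: SIMP produces a rough continuous density
    field; its high-density cells are mapped onto the discrete 60-degree
    piece grid, which BESO then thins further toward the target ratio."""
    rho = np.full(n_cells, vol_frac)
    for _ in range(simp_iters):            # step 1: SIMP (as in Millipede)
        rho = simp_density_update(rho, fe_sensitivities(rho), vol_frac)
    keep = to_discrete_grid(rho)           # map densities to the piece grid
    volume = keep.mean()
    while volume > beso_target:            # step 2: BESO (as in Karamba3D)
        keep = beso_step(beam_sensitivities(keep), volume, beso_target)
        volume = keep.mean()
    return keep
```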
4.4 Comparison of the Results

After the previous tests, the two methods show different advantages and optimization effects in different structures. From the tests and the data shown in Table 1, the comparison results are as follows.

(1) In most cases, SIMP performs more efficiently than BESO. Firstly, this is probably because BESO in Karamba3D analyses the structure by subdividing it into a discrete grid first, while SIMP in Millipede analyses the structure in a more precise, continuous way and only then divides it into discrete grids. Since discrete grids always lead to redundancy and are less precise than a continuous shape, SIMP produces more precise optimization than BESO. It can also be seen that some structure lines optimized by Karamba3D appear in low-density parts of the model optimized by Millipede. Another reason may be the analysis
Fig. 12 Topology optimization by hybrid method-step 2: a 1.5%, b 8.55%, c 1.35%, d 1.6%, e 45%
mechanism of the two methods. Millipede deletes the low-density parts and converges the high-density parts together, which makes the high-density structure stronger. BESO, by contrast, deletes structurally weak discrete lines in a scattered way without converging, so to some extent BESO gradually attenuates the strong parts of the structure as the target ratio decreases.

(2) The larger the span of the structure, the more efficient SIMP is. For a large-span structure, the key part of the structure is more important than in a small-span structure; as SIMP converges the high-density structure and deletes the low-density parts while BESO attenuates the volume, the disparity in effect is enlarged.

(3) For both BESO in Karamba3D and SIMP in Millipede, the displacement sometimes plunges within a certain range of values. This is probably because, at a certain ratio, all load points can transmit force to the support points through continuous structure lines.

(4) As for the hybrid method, on the basis of SIMP in Millipede, BESO in Karamba3D further optimizes the structure by deleting redundant lines. The hybrid results show higher efficiency in topology optimization across the different typical structures, and the longer the span of the structure, the less significant the effect. In some structures, like the Cube and the Column, only around 1–2% of the lines need to remain to keep the structure stable.
Fig. 13 Structure test of table design-project TRANSFOAMER
Fig. 14 Pavilion in B-pro show 2017, in Bartlett-project TRANSFOAMER
5 Applying to Project TRANSFOAMER

This hybrid method is also applied to the project TRANSFOAMER to design practical structures such as a table (Fig. 13) and a pavilion (Fig. 14).
6 Conclusion

Though SIMP and BESO attain topology optimization efficiently, the results are often local, since they are constrained by the loads, supports, element sizes, and other parameter values, so the redundancy cannot be cleaned up before the structural deformation starts to grow. Tests show that the hybrid method is more efficient than SIMP or BESO individually in additive manufacturing.
Table 1 Comparison of 3 methods doing topology optimization in typical structures

                            Cube      Truss     Column    Floor     Beam
BESO    Ratio (%)           55        63        5         45        87
        Displacement/m      0.0004    0.0036    0.0036    0.000096  0.007
SIMP    Ratio (%)           29        14        5         10        59
        Displacement/m      2 × 10⁻⁶  0.000235  0.000244  0.000027  0.0235
Hybrid  Minimum ratio (%)   1.5       8.55      1.35      1.6       48
        Displacement/m      0.001349  0.0047    0.0064    0.0021    0.0238
This is perhaps because the local redundancy has already been removed in the first SIMP step, so that BESO can then optimize integrally and reach a more efficient result; this still needs further research. As for digital design, the complexity of design makes it a problem without a clear, standard answer [21, p. 5]. A variety of limiting factors also prevent designers from solving the problem perfectly. For example, in the previous topology optimization tests, the main factor I focused on was efficiency with reasonable structural strength; on the other hand, conditions such as aesthetic value were lost. When it comes to the second digital turn [22, pp. 168–173], many design factors can become digital, such as material weight, costs, time consumption, and structural behavior. Although topology optimization is still a quite cutting-edge technology to introduce into the architectural field at the moment, hopefully in the future more methods can be produced to make the whole process fundamentally digital.
References

1. Goodman, N. (1968). Analogues and digits. Languages of art: An approach to a theory of symbols (pp. 159–164). Indianapolis, IN: Hackett Publishing.
2. Razvan, C. (2014). Overview of structural topology optimization methods for plane and solid structures. Annals of the Oradea University. Fascicle of Management and Technological Engineering, 23(13), 2014/3(3).
3. Gershenfeld, N., Carney, M., Jenett, B., Calisch, S., & Wilson, S. (2015). Macro-fabrication with digital materials: Robotic assembly. Architectural Design: Material Synthesis: Fusing the Physical and the Computational, 85(5), 122–127.
4. Kohler, G. (2006). ROK—Rippmann Oesterle Knauss GmbH | projects | the programmed wall. Rok-office.com. Available at https://www.rok-office.com/projects/040-programmed-wall/.
5. Sanchez, J. (2014). Polyomino—reconsidering serial repetition in combinatorics. In ACADIA 14: Design agency (pp. 91–100). Los Angeles: ACADIA/Riverside Architectural Press.
6. Retsin, G., Gracia, M., & Soler, V. (2017). Discrete computation for additive manufacturing. Fabricate, 2017, 178–183.
7. Rittel, H. (1984). “Second generation design methods,” interview in design methods group, 5th anniversary report, DMG occasional paper 1, 1972, pp. 5–10. In N. Cross (Ed.), Developments in design methodology (pp. 317–327). Chichester: John Wiley & Sons.
8. Bendsøe, M., & Kikuchi, N. (1988). Generating optimal topologies in structural design using a homogenization method. Computer Methods in Applied Mechanics and Engineering, 71(2), 197–224.
9. Marcus Vitruvius Pollio (14 BC). The ten books on architecture. 1st ed.
10. Beghini, L., Beghini, A., Katz, N., Baker, W., & Paulino, G. (2014). Connecting architecture and engineering through structural topology optimization. Engineering Structures, 59, 716–726.
11. Bendsøe, M. (1989). Optimal shape design as a material distribution problem (1st ed.). [Lyngby]: Danmarks Tekniske Højskole, Matematisk Institut.
12. Rietz, A. (2001). Sufficiency of a finite exponent in SIMP (power law) methods. Structural and Multidisciplinary Optimization, 21(2), 159–163.
13. Sigmund, O. (2001). A 99-line topology optimization code written in Matlab. Structural and Multidisciplinary Optimization, 21(2), 120–127.
14. Martinez, J. (2004). A note on the theoretical convergence properties of the SIMP method. Structural and Multidisciplinary Optimization, 29(4), 319–323.
15. Aremu, A., Ashcroft, I., Hague, R., & Wildman, R. (2010). Suitability of SIMP and BESO topology optimization algorithm for additive manufacture (pp. 679–692). PhD, Wolfson School of Mechanical and Manufacturing Engineering: Loughborough University.
16. Querin, O., Steven, G., & Xie, Y. (1998). Evolutionary structural optimisation (ESO) using a bidirectional algorithm. Engineering Computations, 15(8), 1031–1048.
17. Huang, X., & Xie, Y. M. (2007). Convergent and mesh-independent solutions for the bidirectional evolutionary structural optimization method. Finite Elements in Analysis and Design, 43, 11.
18. Moritz, H., Orlinski, A., Clemens, P., Matthew, T., Robert, V., & Christoph, Z. (2017). BESO for Karamba. Vienna: Clemens Preisinger.
19. Retsin, G. (2015). Discrete assembly and digital materials in architecture. In Proceedings of ECAADE 34, Fabrication | Robotics: Design & Assembling, Vol. 1 (pp. 143–151).
20. Saddiqi, Z. A. (1997). Concrete structures (2nd ed.). Lahore: Help Civil Engineering Publisher.
21. Buchanan, R. (1992). Wicked problems in design thinking. Design Issues, 8(2), 5.
22. Carpo, M. (2014). Breaking the curve. ArtForum, 52(6), 168–173.
Interactive Intelligence
Emotional Intelligence: Affective Computing in Architecture and Design Behnaz Farahi
Abstract What if material interfaces could adapt physically to the user’s emotional state in order to develop a new affective interaction? By using emotional computing technologies to track facial expressions, material interfaces can help to regulate emotions. They can serve either as a tool for intelligence augmentation or as a means of leveraging an emphatic relationship by developing an affective loop with users. This paper explores how color and shape-changing can be used as an interactive design tool to convey emotional information, and is illustrated by two projects, one at the intimate scale of fashion, and one at a more architectural scale. By engaging with design, art, psychology, computer and material science, this paper envisions a world where materials can detect the emotional responses of a user and reconfigure themselves in order to enter into a feedback loop with the user’s affective state and influence social interaction. Keywords Affective computing · Responsive materials · Active matter · Interactive design · Robotics
1 Introduction

Does matter “have” emotion? Can matter “recognize” emotion and “provoke” certain emotional responses in users? Can new materials be imbued with the right integration of sensing, actuation, and communication so as to serve as affective matter to detect and respond to emotions? (Fig. 1).

In the past many Western thinkers have viewed emotion as an obstacle to rational and intelligent thinking; there has been a large gap between rational and emotional perspectives. Conventionally, computers are considered as being rational and logical. They are also thought to be good at accomplishing certain cognitive tasks at which humans are not so good. Anything related to emotion would, therefore, need to be dismissed or simply not taken seriously by the scientific community [1].
Fig. 1 Mesolite: An emotive display
In the past decade, this view has changed dramatically. It is now widely accepted that emotional systems can also influence cognition. This means that everything we do has both a cognitive and an affective component, which assigns meaning as well as value. Emotions—whether positive or negative—can directly influence cognition and other behaviors, including our perception, attention, motivation, and, in general, our decision-making capabilities. Advances in neuroscience and psychology in understanding the role of emotion—such as the research by the leading neuroscientist Antonio Damasio—have led many computer scientists to attempt to create computers which can understand emotions.1

1 Note that there are many different theories of emotion, borrowing from various disciplines including psychology, neuroscience, physiology, and cognitive science.
2 Detecting Emotion: Emotional Computing

“Affective computing” is a term coined by Rosalind Picard in a 1995 computer science paper [2]. However, the origins of this enquiry can be traced further back. For instance, in 1972 Manfred Clynes invented a machine called a “sentograph” for measuring emotions. In his experiments, subjects used touch and finger pressure to express a sequence of emotions—anger, hate, grief, neutral, love, sex, joy, and reverence—while experiencing 25-min cycles of music. He aimed to use his research as evidence that it is possible to “counter a negative emotional state by inducing a rather rapid shift into a positive one.” In his book, Sentics: The Touch of the Emotions, he outlined his findings about “emotional perception and response at the intersection of music, art and mathematics” [3] and
elaborated on the notion of “sentic” forms: “The emotional character is expressed by a specific subtle modulation of the motor action involved which corresponds precisely to the demands of the sentic state” [4]. “Affective computing” is now widely used to refer to the “computational modeling of emotion and implementations of autonomous agents capable of affective processing” [5]. Put simply, affective computing is about developing systems which can recognize, interpret, and simulate human emotions by measuring physiological responses. In fact, studies have shown that the majority of affective communication takes place non-verbally or para-linguistically, through facial expressions, gestures, and vocal inflections [6, 7]. Thanks to sensor technologies, various data from the user's physiological or neurological responses can be captured and processed. In a manner not dissimilar to how we understand each other's emotions through various modalities of information, these systems can perceive cues for emotions. For instance, computer vision sensors can capture bodily gestures and even facial expressions, while biometric sensors can directly measure physiological data such as skin temperature and galvanic resistance and help us better understand our emotional state. Although there is an ongoing debate as to whether emotions are socially and culturally constructed or universal,2 in the 1960s the American psychologist Paul Ekman aimed to show that certain types of emotion are not culturally specific and are in fact universal in all walks of life. As Evans puts it, “Our common emotional heritage binds humanity together, then, in a way that transcends cultural difference” [8]. Ekman called the following “basic emotions”: joy, distress, anger, fear, surprise, and disgust. Through his research, he attempted to argue that the facial expressions associated with the basic emotions are innate and universal [9]. Universal emotions, manifested physiologically through facial expressions, can be detected and recognized by computational systems. And if materials are enhanced with these computational systems, how might matter represent or simulate an emotional response accordingly? In other words, how can we map various emotions to various responses? Or, to put it another way, how might we use techniques for detecting facial expressions to control responsive behavior?

2 For instance, in her latest book, How Emotions Are Made, Lisa Feldman Barrett argues that emotions are socially and culturally constructed.
3 Provoking an Emotional Response

…Emotion, as the word indicates, is about movement, about externalized behavior, about certain orchestrations of reactions to a given cause, within a given environment [10].
In 1944, Fritz Heider and Marianne Simmel conducted a very interesting experiment exploring how the brain assigns various emotional characteristics and constructs a story out of a series of events [11]. In their experiment, they showed participants
a short, simple animation and asked them to describe what they saw happening. What they discovered is that many people assigned certain characteristics, such as emotions, intentional movements, and goals, to simple shapes and their associated movements, even though there was no evidence of any facial expression or even any indication of human representation. In 1986, Valentino Braitenberg made another fascinating observation in his book, Vehicles: Experiments in Synthetic Psychology [12]. In his thought exercises, Braitenberg envisioned simple robots that could produce surprisingly complex—and seemingly cognitive—behaviors. Even though these vehicles simply move around autonomously based on how their sensors and wheels are wired together, they appear to have a form of agency that allows them to achieve a goal or even display various characteristics such as being aggressive, explorative, or passionate. For instance, a robot avoiding a source of light can represent the emotion of “fear”, not dissimilar to how a bug escapes from the light in order not to be caught; a minimal simulation of this wiring is sketched below. Of course, one of the big questions in all these observations is how to exploit the natural phenomenon of anthropomorphism—the process by which we attribute mental characteristics to animated objects. It is fair to argue that material movement or color change can be utilized as interaction design tools for emotional communication. Most of these material interfaces have already been used and studied for their visual and haptic communication properties, but not necessarily for emotional expression. However, this area of research is growing. As Strohmeier et al. note, “Recent explorations into shape changing interfaces have begun to explore how shapes might be used to output emotions” [13]. So, how could shape- and color-changing designs convey emotions, and how might this development change the experience of design? How would the application of such developments in wearables potentially benefit those who suffer from an incapacity to understand emotional cues from their environment? Or how might an architectural element benefit from engaging emotionally with visitors? This paper uses two projects, which engage with the notion of emotional computing at different scales, in an attempt to answer these questions.
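As promised above, Braitenberg's wiring argument is concrete enough to simulate in a few lines. The sketch below implements the classic “fear” vehicle, in which each wheel is driven by the light sensor on its own side so the vehicle steers away from the light; all numbers are illustrative.

```python
import math

def fear_vehicle_step(x, y, heading, light, dt=0.1, gain=2.0):
    """One step of a Braitenberg 'fear' vehicle: ipsilateral sensor-to-wheel
    wiring makes it speed up near the light and turn away from it,
    behavior an observer readily reads as fear."""
    def intensity(sx, sy):  # light falls off with squared distance
        return 1.0 / (0.1 + (sx - light[0]) ** 2 + (sy - light[1]) ** 2)
    # Two sensors, offset to the left and right of the heading.
    left = intensity(x + math.cos(heading + 0.5), y + math.sin(heading + 0.5))
    right = intensity(x + math.cos(heading - 0.5), y + math.sin(heading - 0.5))
    v_left, v_right = gain * left, gain * right   # same-side wiring
    heading += dt * (v_right - v_left)            # stronger left light -> turn right
    speed = dt * (v_left + v_right) / 2.0
    return x + speed * math.cos(heading), y + speed * math.sin(heading), heading
```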
4 Facial Expression Tracking

Mark Weiser's notion of “ubiquitous computing” has already become a reality. We now live in a world where computational devices are embedded throughout the environment. Smart environments and smart gadgets are becoming more and more part of the fabric of our lives, from Fitbit tracking the number of calories we burn, to apps tracking our sleeping patterns, to the Nest thermostat learning from the patterns of our behavior in buildings. It is time for computational systems not only to engage with the quantitative aspects of our lives but also to create an interface with our emotions. Applications of facial expression tracking systems include
1. Studying the affective reactions of customers for marketing purposes, to understand their satisfaction with a certain product
2. Serving customized media content for advertising purposes
3. Observing the mental health of patients for clinical psychology and healthcare
4. Monitoring facial responses for security purposes in airports.

For example, in the commercial world, the chocolate manufacturer Hershey has developed a dispenser for their retail outlets which rewards you with a sample if you smile. The intention is to enhance the in-store experience and create a sales opportunity. Meanwhile, their competitor, Mondelez International, plays commercials based on the age and gender of the detected customers [14]. Likewise, facial expression tracking has been implemented in the media and entertainment industry in order to create experiences such as FaceDance,3 which allows people to control the movement of a virtual Michael Jackson through their facial muscle movements. This paper argues that facial expression tracking can be embedded into the fabric of materials for a number of purposes:

1. The application of emotional computing to smart wearables could augment emotional intelligence. It could not only provide a better understanding of our emotional state in an objective way but also give us clues about our social settings.
2. The application of emotional computing to smart environments and objects could create a more empathic and engaging experience by establishing an affective loop with the user.

3 Designed by filmmakers Ariel Schulman and Henry Joost and creative coders Aaron Meyers, Lauren McCarthy, and James George.
5 Opale: An Emotive Soft Robotic Garment

How might clothing sense aggression and go into defensive mode accordingly? This section describes the design strategy behind Opale, an emotive garment which can recognize and respond to the emotional expressions of people around it. The aim is to develop a soft robotic wearable fitted with an electro-mechanical system that controls shape-changing behaviors mimicking the emotional expressions of onlookers (Fig. 2).

Human hair and animal fur are among the most inspiring natural phenomena, in terms of both their morphology and their communicative role during social interaction. Inspired by animal fur, Opale is composed of a forest of 52,000 fiber optics embedded in silicone, which bristle when the wearer is under threat. The intention behind the material development of this project was to control the location and orientation of hair-like elements so that they might respond to underlying forces, not dissimilar to how hair stands up due to contractions of the micro-muscles attached to hair follicles when we experience goosebumps, or piloerection. These involuntary responses within the skin are due either to temperature changes or the experience
Fig. 2 Opale: An emotive soft robotic garment
of emotions such as fear, sexual arousal, and excitement.4 To achieve this goal, a series of inflatable silicone pockets was incorporated beneath the fur-like skin in order to generate deformations in the texture and surface volume. The distribution of fibers on the surface was based on a study of the architecture of the human body. Data captured from an analysis of the surface curvature of the human body and the underlying contours of the muscles informed the location, density, and height of the fiber distribution. The intention was to exaggerate the movement of the underlying muscles by having the denser and longer fibers follow the contours of the curvature beneath (Figs. 3 and 4).

The inflatable behaviors were controlled using a custom-designed electrical board attached to an Adafruit Feather microcontroller (M0 with ATSAMD21G18 ARM Cortex M0 processor) capable of controlling an array of six low-powered three-port medical solenoids (LHLA Series). This facilitated the computational control of air pressure and rapid inflation through the Arduino programming environment. As a result, each of the six pneumatic soft composite pockets was capable of providing dynamically controlled texture patterns that could vary in speed and frequency of change. For this, a miniature CO2 capsule (16 grams, LeLand), a regulator (15 psi output, Beswick), and a lithium polymer battery (3.7 V, 2000 mAh) were used (Figs. 5 and 6).

From the perspective of interaction design, this project looks closely at the dynamics of social interaction. We tend to respond to people around us through our unconscious facial expressions and bodily movements. When surrounded by smiling people, we often smile back. And when threatened, we often take on a

4 The fabrication process for this project consisted of manually inserting 52,000 fiber optics into a laser-cut mounting surface (1/4″ clear acrylic sheet). After placing all the fibers into the surface, the fibers were carefully moved to a bath of silicone. After 48 h, once the silicone was fully cured, the mounting surface was removed gently from the fiber landscape.
Fig. 3 Left: Placing fibers into the clear laser-cut acrylic sheet. Right: Data from surface curvature analysis of the human body informs the location, density, and the height of fibers
Fig. 4 Left: Placing fibers into the clear laser-cut acrylic sheet. Right: Data from surface curvature analysis of the human body informs the location, density, and the height of fibers
Fig. 5 Left: A pneumatic control circuit consisting of six three-port solenoid valves with coax cable connections. Right: 16 g CO2 capsule and regulator (15 psi output)
Fig. 6 Left: A pneumatic control circuit consisting of six three-port solenoid valves with coax cable connections. Right: 16 g CO2 capsule and regulator (15 psi output)
defensive stance. Through the logic of mirror neurons, we mimic each other's emotional expressions. Likewise, animals use their skin, fur, and feathers as means of communicating. Dogs, cats, and mice bristle their fur as a mechanism of defense or as a form of intimidation (Fig. 7). Darwin was the first to examine the emotional signals in humans and animals, in his book The Expression of the Emotions in Man and Animals (1872). He argued that the way our hair stands on end when we are scared is a “leftover from a time when our ancestors were completely covered in fur” and that its role was to make them look bigger and more intimidating [15]. The challenge, then, is to develop clothing that can likewise express emotions. For example, might it be possible for clothing to sense aggression and go into defensive mode accordingly? The aim was to explore whether emotions expressed in our social interactions could be represented in a non-verbal way through the motion of a garment. Thus, the garment would become an expressive tool or apparatus that empathizes with the
Fig. 7 Piloerection in mice. Accessed from https://www.nature.com/articles/1300989
For this purpose, the dress is equipped with a facial-tracking camera that can detect a range of facial expressions on the onlooker's face: happiness, sadness, surprise, anger, and neutral. Each detected emotion is sent to a microcontroller (Teensy 3.2) capable of activating the solenoids to generate various patterns and inflation speeds in each air pocket (Fig. 8). For example, if the dress were to detect an expression of "surprise" on an onlooker's face, the wearer's shoulder area would start to inflate. Or when an onlooker expresses "anger", the wearer's shoulder and chest would start to inflate and deflate with a frantic, aggressive motion. When people around start to smile and demonstrate "happiness", the dress would ripple subtly from top to bottom (Figs. 9 and 10).

Although the current application of emotional computing and soft robotics for wearables still has limitations, it opens up exciting opportunities for shape-changing clothing in the future for both communication and healthcare purposes.
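The mapping from detected expression to inflation behavior just described can be summarized as a simple dispatch. The sketch below is an illustrative Python rendering of that logic (the project itself ran on a Teensy 3.2 programmed in the Arduino environment); the pocket indices and timing values are hypothetical.

```python
# Simplified dispatch from a detected expression to an inflation pattern.
# Pocket indices and timings are hypothetical illustrations of the
# behaviors described in the text.
import time

SHOULDER, CHEST = [0, 1], [2, 3]          # indices into the six pockets
ALL_TOP_TO_BOTTOM = [0, 1, 2, 3, 4, 5]    # ripple order, top to bottom

def inflate(pockets, duration):
    """Placeholder for opening the solenoids of the given pockets."""
    print(f"inflate {pockets} for {duration}s")
    time.sleep(duration)

def react(emotion):
    if emotion == "surprise":
        inflate(SHOULDER, 1.0)                # shoulder area inflates
    elif emotion == "anger":
        for _ in range(6):                    # frantic inflate/deflate
            inflate(SHOULDER + CHEST, 0.15)
    elif emotion == "happiness":
        for pocket in ALL_TOP_TO_BOTTOM:      # subtle top-down ripple
            inflate([pocket], 0.3)

react("anger")
```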
Fig. 8 Facial-tracking camera embedded into the silicone dress, which can detect onlookers' facial expressions
Fig. 9 The dress responding to the onlooker's emotions
Fig. 10 The dress responding to the onlooker's emotions
Not only does the smart garment promise to become part of the apparatus of human intelligence, but it can also benefit many people with autism who have difficulties recognizing facial expressions. People with autism might be paying attention to what you are saying but be unable to tell if you are happy, sad, or angry. As a result, their responses might not match the desired expectation, leading to isolation and rejection by others. Such a system can help them blend more easily with others and, over time, learn appropriate responses.
6 Mesolite: An Emotive Display

How can we create an engaging emotional experience for customers through the design of an interactive display? How can a life-like quality be conveyed through materials, dynamic behavior, and color transformation? How can persona and character be used as tools for interaction design? Mesolite is an emotive display commissioned by Adidas to showcase the 2020 Predator concept shoe and to offer retail and other consumer experiences. The aim is to explore how computer vision and facial-tracking technology can be implemented in the design of a display in order to influence patterns of social interaction and maximize engagement with viewers by giving animal-like qualities to the designed display object (Fig. 11).

Inspired by the form of a soccer ball, the Mesolite sphere consists of 31 hexagonal and pentagonal modules CNC milled out of black acrylic. The sphere is equipped with a facial-tracking camera and has an irregular opening in which the latest soccer shoe is showcased.5 Inspired by the natural formation of Mesolite crystal, the sphere is mounted with 1,800 acrylic tubes (1/4 in. diameter)—varying in length—lit up with various lighting effects to convey the speed and movement of the player and soccer ball on the field (Figs. 12 and 13).

5 Each hexagon and pentagon was milled (using three- and five-axis milling tools) from a 1 in. thick black acrylic sheet, with a designed indent allowing each module to be connected to another module with two sets of screws on every edge. The aim was to have a modular system both for ease of assembly and for fabrication cost. Each module has about 70 unique angled holes into which acrylic tubes are mounted.
Fig. 11 Mesolite, an interactive display showcasing the latest soccer shoe, responds to the excitement of the visitors

Fig. 12 Left: five-axis CNC milling of side holes for modules inter-connections. Right: Assembly of seven modules, showing one module with a dedicated LED panel
Fig. 13 Left: five-axis CNC milling of side holes for modules inter-connections. Right: Assembly of seven modules, showing one module with a dedicated LED panel
The piece has a certain magic and wonder to it, similar to how fireflies emit light in nature. Embedded with 1000 individually addressable RGB LED pixels, the tips of the acrylic tubes illuminate, creating a mesmerizing dance of light. For this purpose, every hexagon and pentagon has, on the inside, a dedicated 1/8 in. panel, which can be attached with sets of small magnets and which houses the LED pixels. Each panel connects with a female-male latching connector to the neighboring module. All the wires that connect the LEDs meet at the bottom of the sphere, where they connect to the dedicated microcontrollers. Inside the sphere, there is a platform with an organic-like landscape of fiber optics (similar to Opale) equipped with 300 LED pixels and an aluminum shaft connected to a stepper motor in the middle. The product—the Adidas Predator 2020 shoe—is mounted on the shaft above the landscape of fiber optics, creating an organic interaction between the body of the shoe and the fiber optic landscape.

The Mesolite display has an "eye", a facial-tracking camera which can detect the facial expressions of up to 35 visitors and the locations of their heads relative to the opening of the sphere. When a face is detected, Mesolite comes out of "dream mode" and acknowledges the viewer's presence. "Dream mode" refers to when the shoe is not moving and the LED light patterns have a subtle movement with a white glimmering effect. Once the face of a visitor is detected, Mesolite comes alive, acknowledging the presence of the viewer by generating a red ripple of light which travels across the surface from the opening to the back of the sphere, as though welcoming the visitor's presence. The shoe inside also comes alive by tracking the head of the visitor, thereby creating an intimate engagement with the viewer. For this purpose, the value related to the head location of the viewers captured from the camera is mapped to the rotational position of the shoe, giving the illusion of the shoe facing the viewers. In this way, Mesolite is given a form of attention directionality. If there are multiple people in front of Mesolite, the camera computes the values of the detected faces all together, and the shoe starts looking at multiple faces as though it were trying to capture the attention of multiple people. Once a person has the attention of Mesolite, she has to try to keep it by getting closer. If not, the shoe starts switching its attention to as many faces as are detected in front of the opening (Figs. 14 and 15).

The more the viewer engages, the more Mesolite comes to life. When the viewers express surprise, the red lighting starts to ripple in and out with deep, breath-like rhythms. When the viewers smile and express happiness, Mesolite will share the happiness by having the shoe spin around while the red light flashes rapidly a few times, as though it too were excited and happy! (Figs. 16 and 17).

The computing system for the piece includes a central brain (a Raspberry Pi) and four microcontrollers which receive information from it. The task of the Pi is to process data from, and send data to, the microcontrollers that interact with the physical world.6 As such, the Pi contains the main application loop, which is orchestrated to run approximately 30 times a second, communicating with the connected microcontrollers to obtain camera data and drive the motor and LEDs.
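The attention behavior described above can be sketched compactly. The following is an illustrative reconstruction in Python (the Pi's brain was written in Python 3, per the footnote below); the field-of-view constant, function names, and the attention-cycling policy are assumptions, not the project's actual code.

```python
# Hedged sketch: map detected face positions (normalized 0..1 across the
# camera frame) to a target rotation for the shoe's stepper motor.
# Constants and behavior details are illustrative, not from the project.

FIELD_OF_VIEW_DEG = 90.0  # assumed horizontal camera field of view

def face_to_angle(x_normalized):
    """Map a face's horizontal position to a motor angle in degrees."""
    return (x_normalized - 0.5) * FIELD_OF_VIEW_DEG

def target_angle(faces, t, dwell=1.5):
    """Pick a rotation target from the list of detected face positions.

    With one face, track it directly; with several, cycle attention
    between them every `dwell` seconds, as though the shoe were trying
    to engage multiple people.
    """
    if not faces:
        return 0.0                      # rest position ("dream mode")
    index = int(t / dwell) % len(faces)
    return face_to_angle(faces[index])

# Example: two viewers, attention alternating between them over time
for t in range(6):
    print(target_angle([0.3, 0.8], t))
```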
6 The brain of the Pi is written in Python 3, with some high-performance Cython code reserved for the LED animations, while the microcontroller code is written in a subset of C++ using the standard Arduino and Teensyduino libraries. The code for each can be compiled and deployed using the Arduino IDE (v 1.8.5).
Fig. 14 Diagram of placement of each element in the sphere. On the right-hand side from top to bottom: Camera, Predator shoe, Platform with embedded LED and fiber optics, Computational brain
Fig. 15 Computational brain includes (1) Two power supplies (40 A, 5 V), (2) Three microcontrollers (Teensy 3.6), (3) Motor (Mechaduino Controller), and (4) Raspberry Pi
Fig. 16 The face location and emotional facial expressions of the viewer, including surprise and happiness, can be detected. Mesolite comes alive when a face is detected

Fig. 17 The face location and emotional facial expressions of the viewer, including surprise and happiness, can be detected. Mesolite comes alive when a face is detected
One of the challenges in programming the lighting system for this piece was to map the irregular structure of LED pixels, that is, to establish the exact location of each pixel, with its dedicated ID, in the 3D vector space of the sphere (Fig. 15). To achieve this, we exported the coordinates of each point from the 3D file and stored them as a text file on the Pi. Prior to starting the loop, the Pi parses the files related to the coordinate system of the LEDs to obtain the coordinates of each individual RGB LED unit. Three Teensy 3.6 microcontrollers were used, two of them attached to LED pixels and one attached to the camera. A Mechaduino motor with its dedicated microcontroller was used to control the shoe movements, which required the development of our own customized software to calibrate the PID for the motor7 (Fig. 18).
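The coordinate-file parsing and ripple animation described in this paragraph can be illustrated with a minimal sketch. It assumes a text file of `id x y z` rows exported from the 3D model; the file name, normalization, and wave parameters are assumptions rather than the project's values.

```python
# Minimal sketch: drive a red ripple across the sphere using each LED's
# stored 3D coordinate. File format and wave constants are assumptions.
import time

def load_coordinates(path):
    """Parse 'id x y z' rows exported from the 3D model."""
    coords = {}
    with open(path) as f:
        for line in f:
            pixel_id, x, y, z = line.split()
            coords[int(pixel_id)] = (float(x), float(y), float(z))
    return coords

def ripple_frame(coords, t, speed=0.5, width=0.15):
    """Return per-pixel red intensity for a wavefront moving along x,
    i.e., from the opening toward the back of the sphere."""
    frame = {}
    for pixel_id, (x, _, _) in coords.items():
        distance = abs(x - (t * speed) % 1.0)   # x assumed normalized 0..1
        frame[pixel_id] = max(0.0, 1.0 - distance / width)
    return frame

coords = load_coordinates("led_coordinates.txt")  # hypothetical file name
while True:
    frame = ripple_frame(coords, time.monotonic())
    # here `frame` would be sent to the Teensys driving the LED pixels
    time.sleep(1 / 30)                            # ~30 updates per second
```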
Fig. 18 Custom-designed visual interface for calibrating the PID controller of the motor
This was an essential step in understanding the motor torque, position, and acceleration, which were the key factors in designing the behavior.

The scope and application of this project suggest numerous design opportunities for the world of architecture and product design. What is clear is that in designing smart/robotic objects with life-like behaviors, these artifacts may deliberately exploit the divergence between the object's characteristics and preferences and the human frame of reference. Anthropomorphism in this context refers to the emergence of interaction between a user and the robotic environment. According to Epley et al., this includes emotional states, motivations, and intentions ascribed by the user to the robot [16]. However, the task of expressing various emotions via shape-changing interfaces, as also noted by Strohmeier et al. [13], is a significant design challenge which requires an interdisciplinary approach uniting design, material science, engineering, computer science, and HCI. Nonetheless, it opens up radically new opportunities for addressing psychosocial issues.
7 A proportional–integral–derivative controller (PID controller) is a control-loop feedback mechanism mostly used in industrial control systems. We have written our own software visualizer for the PID controller, which will soon be released on GitHub as an open-source project.
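For readers unfamiliar with the controller named in the footnote, a textbook discrete PID update has the form below. The gains shown are placeholders, and this is a generic illustration, not the project's own calibration software.

```python
# Textbook discrete PID update (illustrative; gains are placeholders).
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # The summed terms drive the motor toward the target position
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

controller = PID(kp=2.0, ki=0.1, kd=0.05)
command = controller.update(setpoint=90.0, measured=72.5, dt=1 / 30)
```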
7 Conclusion and Future Work

This paper has sought to illustrate how materials augmented with computational tools can serve as an emotional interface. In this process, two challenges have been addressed: one is how to detect emotions from the user, and the other is how to provoke a certain emotional response in the user through the implementation of dynamic behaviors such as color and shape changes. In terms of the first challenge, the goal is to detect the viewer's emotion using a computer vision system and to store the information in the material. In terms of the second challenge, the material then responds physically to the detected emotion in order to establish an affective loop with users. The intention is to explore how shape- and color-changing interfaces might be used in the future to express and simulate various emotions through nonhuman representation. These material interfaces could be a very effective tool for the communication of emotions.

This paper has presented the design process behind two projects: Opale, an emotive soft robotic dress, and Mesolite, an emotive display. The work described here serves as a proof of concept for the application of emotional computing in design, which can be used as a tool for emotional regulation either to augment emotional intelligence (e.g., with a smart garment) or to develop an emotional bond by establishing an affective loop with the users (e.g., with a smart object display). The intention has been to demonstrate that despite our anxieties about smart environments and technologies in general, there is potential for empathy, where these objects can become companions and co-exist with us rather than against us. Moreover, while the capacity of computer vision to recognize different facial expressions has been exploited for many commercial and advertising purposes, the integration of such
a system into clothing or an architectural installation is a new venture which could open up novel opportunities for the worlds of design, HCI, and architecture.
References

1. Picard, R. W. (2015). The promise of affective computing. In R. Calvo, S. D'Mello, J. Gratch, & A. Kappas (Eds.), The Oxford handbook of affective computing. Oxford: Oxford University Press.
2. Picard, R. W. (1995). Affective computing. MIT Media Laboratory, Perceptual Computing Section Technical Report No. 321.
3. Popova, M. (2013). Sentics: Emotional healing through music and touch. Retrieved from https://www.brainpickings.org/2011/09/05/manfred-clynes-sentics/; see also Clynes's lecture on Sentic Cycles, https://www.youtube.com/watch?v=quIcl5UG1z4.
4. Clynes, M. (1989). Sentics: The touch of the emotions (Revised ed.). Bridport, Dorset: Prism Press.
5. Scherer, K. R. (2010). Emotion and emotional competence: Conceptual and theoretical issues for modeling agents. In K. R. Scherer, T. Banziger, & E. Roesch (Eds.), A blueprint for affective computing: A sourcebook and manual (1st ed.). Oxford, New York: Oxford University Press.
6. Mehrabian, A. (1971). Silent messages. Belmont, CA: Wadsworth.
7. Picard, R. W. (1998). Towards agents that recognize emotion. In IMAGINA, Actes proceedings (pp. 153–155). Monaco.
8. Evans, D. (2003). Emotion: A very short introduction. Oxford, New York: Oxford University Press.
9. Ekman, P., & Lazarus, R. S. (1980). The face of man: Expressions of universal emotions in a New Guinea village. New York: Garland STPM Press.
10. Damasio, A. (2000). The feeling of what happens: Body and emotion in the making of consciousness (1st ed.). San Diego, CA: Mariner Books.
11. Heider, F., & Simmel, M. (1944). An experimental study of apparent behavior. The American Journal of Psychology, 57(2), 243. https://doi.org/10.2307/1416950.
12. Braitenberg, V. (1986). Vehicles: Experiments in synthetic psychology (Reprint ed.). Cambridge, MA: A Bradford Book.
13. Strohmeier, P., Carrascal, J. P., Cheng, B., Meban, M., & Vertegaal, R. (2016). An evaluation of shape changes for conveying emotions. In Proceedings of the 2016 CHI conference on human factors in computing systems (pp. 3781–3792).
14. Nieburg, O. (2015). Smile for candy: Hershey eyes in-store excitement with facial recognition sampler. Retrieved from https://www.confectionerynews.com/Article/2015/07/31/Hershey-Smile-Sample-Facial-recognition-to-dispense-chocolate.
15. Darwin, C. (1983). The expression of the emotions in man and animals. London; Dover, NH: Pinter Pub Ltd.
16. Epley, N., Waytz, A., Akalis, S., & Cacioppo, J. T. (2008). When we need a human: Motivational determinants of anthropomorphism. Social Cognition, 26(2), 143–155.
Designing an Architectural Robot: An Actuated Active Transforming Structure Using Face Detection Ji Shi, Yujie Wang, and Shang Liu
Abstract Although the advances of autonomous control in robotics have broken new ground in the realization of architecture, they have hardly been integrated with architectural design intention. Very few examples of architecture-specific robots exist. This results from multiple practical factors, including scalability and cost. However, the fundamental incentive lies in the stereotypical design ideology that fails to develop new spatial agendas to stimulate the integration. This paper presents a design workflow that revolves around designing an architectural robot. An operational full-scale architectural robot of an actuated active transforming structure was prototyped and tested to demonstrate the workflow. The design is based on a prototyping model with a pneumatic actuation system and sensing using computer vision. The structure interacts with humans by detecting facial features and actively transforming its gesture to prevent humans from approaching. The communication is based on a simulation-model-free host program that constantly reads sensor feedback and sends actuation values.

Keywords Architectural robot · Human–computer interaction · McKibben pneumatic artificial muscle · Face detection · Design and cultures
J. Shi (B) PILLS, Chaoyang, Beijing 100018, China e-mail: [email protected] Y. Wang Massachusetts Institute of Technology, Cambridge, MA 02139, USA e-mail: [email protected] S. Liu Carnegie Mellon University, Pittsburgh, PA 15213, USA e-mail: [email protected] © Springer Nature Singapore Pte Ltd. 2020 P. F. Yuan et al. (eds.), Architectural Intelligence, https://doi.org/10.1007/978-981-15-6568-7_16
1 Introduction

With the infiltration of digital technology into nearly all aspects of our life, we are now living with increasingly autonomous objects that are hybrids of the digital and the physical. However, architecture, as the most frequently used "object" we encounter in everyday life, is still understood as a purely physical and non-computational construct. In fact, contemporary architectural practice has developed a complex workflow that weaves together the digital and the physical. The architectural process has been divided into multiple standardized and specialized stages. Digital technology and autonomous features are introduced into each stage, ranging from generative design of geometry to specialized fabrication techniques. The reason people stereotypically consider architecture a non-computational construct is that neither human design intention nor design outcome is included in any computational or autonomous processes. The advances of autonomous control are still confined to the realization of architecture. Thus, designers often encounter difficulties working with computation, since the original intents are not initiated within such a context. Likewise, the users may also have trouble interpreting the computational aspects after their inhabitation, since most of them are only shown in the creation process, not after completion.

Thus, it is productive to extend the meaning of computational design toward both ends simultaneously. It's critical for designers not to isolate their design intentions from the computational processes of realization; rather, they should be constructed in one holistic setup that combines the creation, realization, and inhabitation of architecture. This calls for a new design model in which the conventional interaction between human and architecture is rethought. This new model asks the fundamental question of how autonomous features can benefit architectural design intention and to what extent the autonomous features last in the whole process (Fig. 1).
2 Related Work

The development of autonomous processes in architecture has focused primarily on object-scale robots or machines. The practice within this thread has a conceptual link with the design of interactive installations. Two inspiring precedents are the Table by Max Dean and Raffaello D'Andrea in 20011 and the Robotic Chair by Max Dean, Raffaello D'Andrea, and Matt Donovan in 2006.2 These two art installations were modeled on the basis of generic furniture pieces, but they were designed with robotic hardware and programs which allowed them to interact with humans. The installations were designed for gallery performance and collection; however, they presented strong

1 Raffaello D'Andrea and Max Dean, Table. Accessed: January 14, 2019. https://raffaello.name/projects/table/.

2 Raffaello D'Andrea, Max Dean and Matt Donovan, Robotic Chair. Accessed: January 14, 2019. https://raffaello.name/projects/robotic-chair/.
Fig. 1 a Multiple humans interacting with the architectural robot simultaneously. Interaction scenarios processed by a host program and presented to the public; b architectural robot constantly “observing” human using face detection method while the human looking at the robot; c module of the architectural robot in design prototyping iterations. d final setup of the architectural robot. System combines actuation, sensing, and programming in an integral installation
architectural relevance by introducing autonomous processes into daily living scenarios within the confines of a room. Within the field of architecture, the research on object-scale robots is also strongly connected with the concept of assembly. Smaller objects are designed as modules for a larger system, i.e., buildings in most cases, and have the physical and programmable capability of being assembled. Typical precedents can be found in the research projects of Theodore Spyropoulos and his teaching at the AADRL. Project Hexy by Yuan Yao, Yuhan Li, and Yang Hong in 20173 developed a transformable hexagonal robot that autonomously assembled into reconfigurable room layouts. Similar studies on collective behavior and self-assembly have also been done in the field of biologically inspired engineering, for example, the Termite-inspired Robot Construction Team [1] and the Thousand-robot Swarm [2]. Another thread focuses on applying the autonomous concept directly to a full-scale architectural setup. The precedents include the Actuated Bending Tower Project (also known as the Bowtower Project), which developed a simulation-model-free approach for combining the abstract computational and the material artifact as a method for exploration in the design process [3], and the Flexing Room Project, which developed a
3 Yuan Yao, Yang Hong, and Yuhan Li. Hexy_AADRL Research Book 2016-2017. Accessed: January 14, 2019. https://issuu.com/yuanyao2014/docs/finalupload_small/.
room-sized architectural prototype that communicates with its human inhabitants [4]. This paper is largely inspired by the research within this thread.
3 Architectural Robot with Lifelong Autonomous Agenda

The motivation of this research is to rethink architecture as an architectural robot which contains a certain extent of autonomy, expanding the concept of material computation into the realm of embodied computation [5]. The meaning of embodiment refers to a holistic understanding of architecture as a computational spatial construct. Architectural robots should not be understood as machines or technical procedures used in isolated stages of architectural realization; rather, an architectural robot is a robust mechanism that engages the complex process of design, realization, and inhabitation in the lifelong development of an autonomous agenda.

Current practice has developed divergent interpretations of the meaning of the architectural robot. What this paper presents should not be confused with the following perspectives. Firstly, the concept doesn't refer to what is known as the Smart Room. An architectural robot is nothing like a room drowned in all kinds of autonomous appliances and objects. An architectural robot uses interactions generated directly from its spatial setup instead of the appliances or objects it contains. Humans live with the architectural robot, not only within it. Also, an architectural robot isn't merely an automated process. Automated processes are all defined within the realm of the creation of architecture. However, the intervention of the designer should not come to a stop at the completion of creation, since the stages after are also important. In addition, the architectural robot isn't about designing anthropomorphic robots. Humans have a long obsession with anthropomorphic robots. Architectural practice has also developed projects that encompassed the design of human-like robots, for example, the robots DEKU and DEME [6] by Japanese architect Arata Isozaki for Expo '70 at Osaka. The autonomous agenda doesn't necessarily lead to a formal imitation; rather, it addresses the autonomous interactions.

The central task is to define an architecture-specific interpretation of the "autonomous agenda" in contemporary digital culture and apply it to architectural processes in practice. The types of exchange between human and architecture are different from the established paradigm of human–machine interaction presented in object-scale computer devices [4]. The key of the architectural robot is to combine these previously established features with architectural qualities. An architectural robot must be inhabitable, which by nature brings more interaction scenarios to the system. From the perspective of the designer, it's critical to establish a new design method that organizes all design parameters within this setting. The parameters should not be limited to the material realm; rather, they should be extended to prototyping systems through a combination of sensing, actuation, and feedback programming. Architectural robotics is largely based on such a prototyping model.
4 Method: Design Through Prototyping Systems

The method of designing an architectural robot includes an autonomous-agenda-centric design workflow in which designers start simultaneously from prototyping, sensing, and actuation (Fig. 2). The agenda is critical in this method and indicates the fundamental meanings of the design. However, it is not only the starting point of a linear process; rather, it unifies the design intents from all aspects and exists throughout the entire cycle.

Prototyping. Prototyping approaches the agenda through making 1:1 prototypes. Prototyping allows the designer to understand the design intention through physical impressions. Prototyping requires 1:1 models, not scaled models, and it contains continuous iterations of hands-on experiments. This is also different from what is known as a mock-up. Prototyping requires a functional construct as a whole instead
Fig. 2 Stereotypical design workflow (top) versus proposed autonomous-agenda-centric design workflow (bottom)
of a selected study fragment. Prototyping is critical in the proposed workflow. It informs the designer of the potential parameters in sensing and the proper mechanisms in actuation.

Sensing. Sensing approaches the agenda by reading specific parameters and writing them into a feedback loop. The concept of sensing and feedback is borrowed from the field of automated control systems. The parameters to be sensed may come from the designer's initial intents, the human interactions after inhabitation, the environmental factors of the design, etc. Specific sensing instruments or devices need to be studied, and this requires the designer to conduct research in areas which were not conventionally considered within the field of architecture.

Actuation. Actuation approaches the agenda by constructing schemes. Actuation should not be simplified as a series of mechanical motions; rather, it stands for iterations of design schemes in the agenda. Actuation is the material realm of the agenda and is the front end of the spatial construct, which is developed through multiple iterations of prototyping and sensing.
5 Case Study: An Actuated Active Transforming Structure

The proposed workflow was tested in a design workshop4 the author instructed at Tsinghua University in 2017. The objective of the workshop was to familiarize students with the concept of the architectural robot and the autonomous-agenda-centric workflow. Students were required to participate in multiple hands-on sessions of prototyping, sensing, and actuation. One project is presented in this paper, showcasing the design methodology (Fig. 3).

The students designed an autonomous gallery installation for displaying artworks. The installation autonomously responds to the surrounding environment and incoming visitors by constantly changing its geometry. On the one hand, the project itself was well developed following the proposed workflow, and the students showed a thorough understanding of the notion of an autonomous agenda in architecture; on the other hand, the selected topic, i.e., a temporary gallery installation, is a good vehicle for the test run: the students, with architectural training, are experienced in making 1:1 prototypes of objects at such a scale and are skilled in comprehending and developing an autonomous agenda for such a topic.

Here, the steps within the development of the project are presented in the sequence of agenda, prototyping, actuation, and interaction; however, in reality the developments in those aspects happened simultaneously and the parameters in each aspect affected each other. Additionally, the paper focuses on showcasing the overall workflow and how it could potentially benefit the field of architectural design. Thus, the

4 Parametric Design Workshop, School of Architecture, Tsinghua University, Beijing, 2017. Instructor: Ji Shi, Zigeng Wang. Teaching Assistant: Dinglu Wang, Pengcheng Sun, Jingsong Wang. Student: Xiangfan Chen, Ziyao Geng, Yao Jiang, Zhan Zhang, Wen Si, Shang Liu, Yujie Wang. The workshop was organized by Prof. Weiguo Xu.
Fig. 3 Experiment setup in axonometric view and workflow diagram
technical procedures and setups are presented, but the raw data and measurements might be partially omitted in the following discussion.
6 Autonomous Agenda

Visitor circulation in conventional galleries follows the curatorial arrangement of the artworks. Thus, the location of a specific artwork may influence the distribution of the crowd; for example, the spot where the Mona Lisa is presented in the Louvre always draws a tremendous gathering of people. It's interesting that in this model architecture itself, though it contains the human crowd, has no authority in deciding how to distribute it. The project presented below questioned this stereotypical setup by proposing a new gallery installation for displaying artworks that constantly observes the crowd and responds to it through a series of geometrical transformations.

This project showed a promising agenda. The transformability and reconfigurability are not exclusively based on the designers' intents; rather, they are generated from an autonomous program. Also, the project questioned a conventional architectural concept, specifically the definition of a wall, at a very fundamental level. Additionally, the notion of walls looking back at humans is fascinating, proposing a counterintuitive metaphor which motivates humans to rethink their everyday behavior and relationship with architecture. This project is based on case studies of multiple precedents, among which the studio project by Francois Sabourin5 was found most inspiring. The proposal was initially inspired by a similar concept and developed unique, innovative, and complex features during the workshop.
7 Modular Transformation

A cubic geometry of 1 × 1 × 1 m was physically modeled to represent the installation unit (Fig. 4). Here, the dimension is a reductive abstraction that represents a generic exhibition wall which isn't associated with any artwork or venue. The dimension was chosen based on the operability of the actuation system and the prototyping feasibility. In reality, the actual dimension may change depending on the site. The cube geometry was fabricated with wood frames and fabric. The vertical faces represent the surfaces on which artworks will be mounted, and the horizontal face at the bottom represents the base of the unit. The remaining faces of the cube were omitted from this prototype for operational and representational reasons. The joining vertical edges and base corner connections are designed to be flexible.

5 Project for Studio ARC505B Architectural Robotics––Embodied Computation, Fall 2016, taught by Prof. Axel Kilian and Ji Shi, Princeton University.
Fig. 4 Modular transformation from a cube (left) to a uniform rhombic prism (right)
The assembly details were designed for a modular transformation in which the base can shear from a square to a rhombus, which in 3D transforms the cube into a uniform rhombic prism. The geometrical transformation of the module represents the change of orientation of each artwork on that face, which directly changes the relationship between the human visitor and the artwork. This describes the fundamental interaction scenario. The system may also become more complex if more modules are involved and connected to each other. Here, the project included three modules connected to each other side by side as a research prototype.
8 Actuation System: Pneumatic Artificial Muscle and Bow Connection

To realize such a transformation, the system requires a contraction force that pushes the two opposite corners at the base toward each other; meanwhile, it also requires a corresponding spring action which pushes back and resets the system to its initial state. The actuation system developed for this project is inspired by the Bowtower experiment [3] and the Flexing Room Architectural Robot [4]. The contraction force was realized by a McKibben Pneumatic Artificial Muscle (hereinafter referred to as "the PAM actuator"), and the spring counteraction that resets the system was provided by a bow connection made with a prestressed 15 mm diameter nylon rod. At the base of each module, two opposite corners were connected by one PAM actuator, while the other two corners were prestressed by two bow connections (Fig. 5).

During the actuating phase, the contraction of the PAM dominates the transformation and pulls the two opposite corners toward each other. This shears the base geometry from a square into a rhombus and the cube into a rhombic prism. During the releasing phase, the prestress in the bent nylon bow connection counteracts the transformation and the system springs back to its initial state. The combination of the PAM actuator and the bow connection collectively provides a robust actuation cycle.
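Since the side lengths of the base are fixed, its shape is fully determined by the PAM's current length: for a rhombus of side s with diagonals p and q, p² + q² = 4s². The short sketch below works through this relation for the 75–80% contraction reported in Sect. 9; it is a geometric illustration, not the project's control code.

```python
# Geometric illustration: how PAM contraction shears the 1 m square base.
# For a rhombus of side s, the diagonals p and q satisfy p^2 + q^2 = 4 s^2.
import math

s = 1.0                              # side length of the module base (m)
initial_diagonal = s * math.sqrt(2)  # PAM spans one diagonal of the square

for contraction in (0.80, 0.75):     # PAM reaches 75-80% of initial length
    p = contraction * initial_diagonal     # contracted diagonal (m)
    q = math.sqrt(4 * s**2 - p**2)         # the opposite diagonal expands
    corner_angle = 2 * math.degrees(math.asin(p / (2 * s)))
    print(f"contraction {contraction:.0%}: p={p:.3f} m, q={q:.3f} m, "
          f"corner angle {corner_angle:.1f} deg (down from 90 deg)")
```

At 80% contraction, the square's 90° corner closes to roughly 69°, which matches the modest but legible geometrical change described in the experiments.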
Fig. 5 Module (indicated with red dashed line) with PAM actuator and bow connections. (1) PAM actuator; (2) Bow connections
9 Fabrication, Experiment, and Measurement of the PAM Actuator

The key to this actuation cycle is to design and make a robust PAM actuator. The PAM actuator, first developed in the 1950s [7], contains an elastomer inner tube surrounded by a double-helix-braided sheath. When an internal pressure is generated inside the inner tube, the free tip of the artificial muscle contracts while the external sheath maintains its cylindrical shape. The circumferential stress of a pressurized inner tube is transformed into an axial contraction force by means of a double-helix braided sheath whose geometry corresponds to a network of identical pantographs [8].

The reason for selecting the PAM actuator is primarily its flexibility in prototyping. The actuator can be assembled from ready-made and easy-to-get materials and accessories. Its properties, i.e., length, diameter, actuation speed, etc., can be customized and iterated through a series of modifications in material and code. Compared to other actuators, for example linear actuators, the PAM actuator works better as a tool for design. The PAM actuator used in this project (Fig. 6) was fabricated in-house with a combination of a bicycle inner tube, a braided cable sleeve, and customized hose fittings.6 During the design exploration and experiments, the team tested different dimensions of the PAM with different internal pressures. The contraction showed the consistency of reaching approximately 75–80% of the PAM's initial length (Fig. 7). This result was not ideal in terms of producing dramatic geometrical change, but it is

6 Hardware details: Inner tube of the PAM actuator: KENDA bicycle inner tube 700*23/25 C 60L; Sheath of the PAM actuator: LEIXINTE Terminal 35 mm diameter PETE braided cable sleeve; Custom-made PAM fittings at the two ends: CNC fabricated 35 mm diameter nylon rod.
Fig. 6 a (left) Schematic illustration of the custom-made PAM actuator; b (right) Sectional illustration of design details: (1) Standard silicone tubing connecting to the pneumatic circuit. (2) Standard barbed fitting. (3) CNC fabricated fitting for the PAM actuator made from hard nylon round rod. (4) Worm-drive hose clamp bundling all layers into a sealed system. (5) Outside layer of the PAM made with braided sheath. (6) Inside layer of the PAM made with elastomer tubing
Fig. 7 Experiment results of the PAM actuator contraction showing a consistent maximum contraction of 80% of initial length
good enough for the selected module in this design project. In addition, the rapidity and firmness met expectations, and the system performed robustly as the module continuously transformed through multiple cycles over time.
10 Pressure Control Unit (PCU)

The pneumatic setup (Fig. 8) for controlling the PAM actuators contains a combination of standard pneumatic appliances and custom-made pressure control units (hereinafter referred to as "PCUs"). A single-stage air compressor (800 W, 60 L/min, 220 V) with an air tank and regulator gauge was used as the exclusive pressure source for all actuators. The air was then divided into three streams by standard manifolds and tee fittings. Each stream was channeled into a PCU, in which the designer has full control of the timing and on/off states of the air stream. Lastly, the programmed air stream was directed to the three PAMs to perform the desired behaviors. The air compressor unit has its own feedback to maintain the desired pressure at the output port.

The PCU in each stream uses valves to control the volume of air being directed into the actuator. Each PCU is based on a pair of simple open-close solenoid valves (two-way/two-position valve, 2V025-08 G1/4 N.C. DC12 V). Design schemes involving dual valves are popular in similar research [3, 4, 9]. In the pneumatic circuit, the two valves are connected in series and the actuator is connected between them. This dual-valve setup allows three control states: inflating, holding, and deflating. When inflating, Valve-1 is switched on while Valve-2 is off, and air is constantly directed into the actuator. When holding, both Valve-1 and Valve-2 are switched off, and the air is held in the actuator. When deflating, Valve-1 is switched off and Valve-2 is on, and the air is released out of the system. The pressure inside the actuator is controlled by setting the timing and on/off states of the PCU, and a pressure sensor is connected in parallel with the actuator to monitor the internal pressure changes; see the sketch after Fig. 8 for the valve truth table in code form.

An alternative solution is to use proportional valves to control the flow/speed. Appliances of this category can be found from more specialized pneumatic control companies such as FESTO. These solutions are normally expensive and not very feasible for prototyping in a design studio project.
Fig. 8 Air logic diagram and pneumatic circuit schematics of the PCU (left); Operational diagram showing valve state in relation to system state using dual-valve setup (right)
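The three control states of the dual-valve PCU reduce to a small truth table. The sketch below encodes it in Python; the `write_valve` callback is a hypothetical stand-in for the actual digital-output call (the project drove the valves from an Arduino UNO, as described in the next section).

```python
# Truth table of the dual-valve PCU: (Valve-1, Valve-2) per state.
# The write_valve callback is a hypothetical placeholder.
STATES = {
    "inflate": (True, False),   # air flows from the source into the actuator
    "hold":    (False, False),  # air is trapped inside the actuator
    "deflate": (False, True),   # air is vented out of the system
}

def set_pcu(state, write_valve):
    """Drive both valves of one PCU; write_valve(index, on) stands in
    for the actual digital-output call on the microcontroller."""
    v1, v2 = STATES[state]
    write_valve(1, v1)
    write_valve(2, v2)

# Example usage with a stub that just prints the commanded states
set_pcu("inflate",
        lambda i, on: print(f"valve {i} -> {'on' if on else 'off'}"))
```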
11 Electronics and Integral Prototype of the PCU

All six valves for the three PAM actuators were controlled digitally. The power for the DC12V solenoid valves came from an AC220V outlet and was adapted with a 12V/10A (max) DC adaptor. All valves were connected to an Arduino UNO board with a high-power control setup using transistors and diodes (Fig. 9). The prototyping of the electronics was based on a standard breadboard and jump-wires (Fig. 10).
Fig. 9 Electronic circuit schematics using transistor to control high current load in solenoid valves with ASDX amplified board-mount pressure sensor
Fig. 10 Integral prototype of the PCU showing a combination of pneumatic circuits and electronic circuits
Fig. 11 The architectural robot shifted between the Wandering and Teasing states based on the result of face detection. The program was designed to challenge humans' established convictions about human–robot interaction scenarios
Three webcams, anchored at the top of each module, were oriented to the front, i.e., perpendicular to the vertical face representing the artwork, in search of incoming human visitors. The webcams were connected to a PC laptop through USB cables. The live video footage was constantly downloaded and processed by a program in the software Processing. In the workflow, the Processing sketch works as a host program and constantly uploads signals to the Arduino. The Processing sketch captures human faces in each frame of the footage in real time. The face detection sketch in the program is based on the OpenCV library for Processing.7 The communication between the Arduino UNO board and the Processing host program was done using the Arduino Firmata Library for Processing.8 As a result, the control of the valves was associated with the detection of faces, and all coordination was done in the master host program in Processing without any stand-alone code stored on the Arduino board.

The workflow of controlling the Arduino through Processing proved to be a very effective method, since the human designer can visually observe, control, and iterate the design through the interface of a PC laptop. A user-friendly interface was developed for the Processing program with the intent of making the program not only a control firmware but also a visual demonstration for people to observe the system from a global perspective and understand the interactions better.
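The project implemented this loop in Processing with the OpenCV and Firmata libraries cited in the footnotes below. As a hedged equivalent, the same host-program pattern can be sketched in Python with the opencv-python and pyFirmata packages; the serial port, pin numbers, and actuation pattern here are assumptions, not the workshop's values.

```python
# Hedged Python equivalent of the Processing host program: detect faces
# in the webcam feed and switch valve pins on the Arduino via Firmata.
# Port, pins, and the actuation pattern are illustrative assumptions.
import cv2
from pyfirmata import Arduino

board = Arduino("/dev/ttyACM0")            # Arduino UNO running Firmata
valve_pins = [board.digital[p] for p in (2, 3, 4, 5, 6, 7)]
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # Teasing if any face is detected, Wandering otherwise
    teasing = len(faces) > 0
    for pin in valve_pins:
        pin.write(1 if teasing else 0)     # placeholder actuation pattern
```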
12 Interactions: Wandering and Teasing

The interaction is divided into two typical scenarios, respectively named Wandering and Teasing (Fig. 11). The Wandering scenario will be triggered if no faces are detected by the webcam. This represents the situation in which no one is looking at the artwork.
face detection Processing sketch used in this case is based on Greg Borenstein’s OpenCV Computer vision library for processing. Greg Borenstein, Open CV. Accessed: January 14, 2019. https://github.com/atduskgreg/opencv-processing. 8 Jeff Hoefs. Firmata Firmware for Arduino. Accessed: January 14, 2019. https://github.com/fir mata/arduino.
In this situation, the wandering state is activated and the module performs random transformations between its two typical states (the cube and the uniform rhombic prism). In this phase, the orientation of each vertical face constantly changes, as if the artwork were in search of human visitors. Since the pressure of each PAM actuator is controllable, the Wandering transformation presents a random pattern with various gestures and speeds which resembles the behavior of human wandering.

The Teasing scenario will be triggered when the program detects faces. This represents the situation in which someone is directly looking at the artwork. Under this circumstance, the module immediately dodges the person through a fast orientation change of its physical state. Similar interactions will continue if the visitor tries to approach the artwork from an alternative direction. Thus, every time the human visitor tries to approach the artwork, they will be rejected by the system. This human-like behavior is understood as a metaphor: the installation teases human visitors by not letting them see the artwork.

This interaction supports our initial argument that a spatial construct can work as an active agent that brings new living scenarios. The interaction is also interesting since it provides counterintuitive perceptions. In most cases, autonomous objects are programmed to assist humans in achieving specific goals. However, an autonomous agenda should also include the opposite scenario, in which humans' initial intents are obstructed.
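Read as a state machine, the two scenarios reduce to a single per-tick decision. The sketch below is an illustrative reconstruction of the event chart mentioned in the next section, not the workshop's actual Processing code; the timing values and the actuation interface are assumptions.

```python
# Illustrative two-state machine for the Wandering/Teasing behavior.
# Timing values and the actuation interface are assumptions.
import random

def step(faces_detected, set_module_state):
    """One tick of the interaction loop for a single module.

    set_module_state(target, speed) is a placeholder that commands the
    PCU toward target ('cube' or 'rhombic_prism') at a given speed.
    """
    if faces_detected:
        # Teasing: dodge immediately with a fast orientation change
        set_module_state(random.choice(["cube", "rhombic_prism"]),
                         speed=1.0)
    else:
        # Wandering: drift between the two typical states at random speeds
        set_module_state(random.choice(["cube", "rhombic_prism"]),
                         speed=random.uniform(0.1, 0.4))

step(True, lambda target, speed: print(target, speed))
```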
13 Results and Future Work

The public test run was held during the workshop final review, and it was intriguing to see people being confused by the behavior of the system. It seemed that no one had assumed that the robot would disobey human intention. The project performed a robust actuation in which the digital control of the PAMs and the physical prototype all showed desirable features. However, the experiment didn't reach a more complex stage in terms of behavior and interaction. The limitations lie primarily in the following areas.

Firstly, the actuation of the PAMs, though fast enough, may still fail to keep up with human movements. This is due to the delay in the actuation and reset process. The speed of actuation is directly related to the inflation of the PAM actuators. To speed up this process, more tests with the valves need to be done. In the experiment, pressure sensors were not yet successfully integrated into the PCUs and need more detailed research. The pressure was exclusively controlled by setting the timing and on/off states of the valves. Thus, the initial pressure of the air compressor cannot be set too high, since upon doing so the timing becomes very short and hard to control precisely. However, when a pressure sensor is applied, the speed of inflation may potentially go higher without compromising controllability and safety. The speed of the reset process is associated with the prestressed bow connections. In the experiment, we used a nylon rod and prestressed it by bending it into an arc. A stiffer material can be applied to the same design schematics and may result in a faster reaction.
Also, the programming was largely based on the single module. The prototype connected three modules together, and the behavior of each module interfered with its neighbors. This is the primary area to be focused on in future work. On the one hand, more experiments need to be implemented to study the collective behavior, and more composition typologies need to be tested; for example, the modules can also be connected in a rectangular or hexagonal grid instead of the linear connection of the current stage. On the other hand, the interactions programmed in the Processing sketch were based on a chart which contains all possible events. The program needs to be improved, and future work includes developing a better sketch that runs the system in a more advanced autonomous manner. Additionally, the feasibility of a larger scale or a permanent structure remains unsolved. The system, as noted earlier in the paper, contains an external package of valves, electronics, PC laptops, etc. The prototype was designed primarily for a pedagogical demonstration of the workflow, not for a fully autonomously operated permanent structure.
14 Discussion

The development of this architectural robot and its design workflow should be considered a longer process than any single project or research can cover. The workflow requires a holistic understanding of the meaning of an autonomous agenda within the field of architectural design, which can only be developed through continuous tests and experiments. Thus, the result of this experiment is not an answer to a specific question; rather, it is the question itself. The experiment posed a question to the field and aimed to inspire peer designers, as well as design students and teachers.

It's critical for designers to establish a design thinking ideology which includes the participation of autonomous technology in the core area of the discipline. Autonomous features should no longer stay as mediated steps in the realization of a conventional architectural concept; rather, they should bring novelty to the concept itself. Also, this paper calls for attention that designers should establish a robust workflow in which the prototyping system is considered the fundamental tooling when iterating the design. The digital technologies used in the workflow, the know-how of electronics, programming, coding, etc., are introduced to solve design questions from a higher level of thinking, specifically the autonomous design agenda for architecture. Thus, the acquisition of these abilities is part of the designers' responsibility, not the responsibility of consultants or manufacturers.

The research also addressed the cultural driving force in contemporary digital practice. The manipulation of the digital should not be restrained to the form of material or automated processes, nor should it be limited to the applications of optimization or simulation. The manipulation should also include unpredictable behaviors in which the computational spatial construct participates in social interactions as an active agent with equal importance to its human participants. The very essential concept of architectural design, as the way it relates to our everyday social
life, is by nature open-ended. The introduction of digital technology, whether as mediated processes, as is the case today, or as a central agenda and workflow, as proposed in this paper, should be prepared for open-ended results, covering all aspects of design, from intent to outcome, and in a lifelong timeframe, from concept to inhabitation.

Acknowledgments This research was supported by the Tsinghua University School of Architecture. The author would like to express great appreciation to the instructors and teaching assistants of the workshop, specifically Zigeng Wang, Pengcheng Sun, Dinglu Wang, and Jingsong Wang, for their contributions in teaching. The author would also like to express deep gratitude to Yujie Wang for the post-workshop work and continuous insightful inputs to the research. The author would also like to extend grateful thanks to Prof. Axel Kilian, whose work has greatly inspired and motivated the author to do the research.
References

1. Werfel, J., Petersen, K., & Nagpal, R. (2014). Designing collective behavior in a termite-inspired robot construction team. Science, 343(6172), 754–758.
2. Rubenstein, M., Cornejo, A., & Nagpal, R. (2014). Programmable self-assembly in a thousand-robot swarm. Science, 345(6198), 795–799.
3. Kilian, A., & Sabourin, F. (2017). Embodied computation–an actuated active bending tower: Using simulation-model-free sensor guided search to reach posture goals. In T. Nagakura, S. Tibbits, M. Ibañez, & C. Mueller (Eds.), Disciplines & disruption: Proceedings of the 37th annual conference of the Association for Computer Aided Design in Architecture (pp. 324–329). Cambridge: ACADIA Publishing Company.
4. Kilian, A. (2018). The flexing room architectural robot: An actuated active-bending robotic structure using human feedback. In P. Anzalone, M. D. Signore, & A. J. Wit (Eds.), Recalibration: On imprecision and infidelity: Proceedings of the 38th annual conference of the Association for Computer Aided Design in Architecture (pp. 232–241). Mexico City: ACADIA Publishing Company.
5. Kilian, A. (2016). Prototypes as embodied computation. In C. Gengnagel, E. Nagy, & R. Stark (Eds.), Rethink! Prototyping: Transdisciplinary concepts of prototyping (pp. 37–48). Cham: Springer International Publishing Switzerland.
6. Daniell, T. (2016). Bug eyes and blockhead. Log, 36, 34–47.
7. Chou, C. P., & Hannaford, B. (1996). Measurement and modeling of McKibben pneumatic artificial muscles. IEEE Transactions on Robotics and Automation, 12, 90–102.
8. Tondu, B. (2012). Modelling of the McKibben artificial muscle: A review. Journal of Intelligent Material Systems and Structures, 23, 225–253.
9. Yao, L., Niiyama, R., Ou, J., Follmer, S., Silva, C. D., & Ishii, H. (2013). PneUI: Pneumatically actuated soft composite materials for shape changing interfaces. In Proceedings of the 26th annual ACM symposium on user interface software and technology (pp. 13–22). New York City: ACM.
CoFabs: An Interactive Fabrication Process Framework Teng-Wen Chang, Chi-Fu Hsiao, Chun-Yen Chen, and Hsin-Yi Huang
Abstract In the early stages of maker practice, there are many practical problems or design errors that makers need to take care of on their own. In order to lower the manufacturing threshold and obtain as much information on the relevant services as possible, makers sometimes try to simulate and test their works in digital ways before manufacturing them. However, after the process of digital simulation, the maker must hand over the design drawings to the manufacturing unit to develop the prototype. This materialization step requires a long period of time for the consultation, waiting, manufacturing, and adjustment phases. Thus, how to shorten and facilitate the timeline of the prototyping process has become a very important issue in the analysis of a maker's design practice and manufacturing workflow. This article advances a "seeing-moving-seeing" design thinking model that can help designers refine their ideas through observations, as they can directly see the result of their intuitive gestures and present their works with design media. This model, therefore, makes it easier for the makers to explain what they are thinking. In addition, through the model, they can further extend the possibilities of multiple production processes by designing, manufacturing, and integrating tools through the use of a robotic arm. The "seeing-moving-seeing" model proposes that there are three different aspects that respond to the temporary design prototyping process: human features, physical features, and robotic arm manufacturing. We simplify the communication of computing systems through an immersive visual interface and the rendering of real-time feedback, so that makers can focus upon their practice of implementing design ideas.

Keywords Robot arm · Digital fabrication · Mixed reality · Design process
1 Introduction

The process of architectural design often involves many complex and cumbersome construction processes. The tools used in these processes require professional technical support, which also severs the relationship between design and manufacturing. Among digital manufacturing tools, the robotic arm is a kind of manufacturing tool that integrates virtual computing and physical manufacturing. The features that facilitate cooperation with robotic arms, such as real-time control, correspondence to the real world, repeatability, and programmability, have created many more possibilities in design development. However, the learning threshold for using a robotic arm among digital manufacturing tools is also quite high.

In recent years, the maker movement has emerged alongside a trend of handmade implementation around the world. As makers work with their own hands, practical problems or design errors are not likely to appear until they try to combine pieces together in the real world. Therefore, the process of rapid prototyping is very important for the maker. In order to lower the manufacturing threshold and obtain more information on relevant services, makers often manufacture their designs in digital ways. Furthermore, makers must sometimes hand over the design drawings to the manufacturing unit for construction during the process of digital manufacturing. This is due to the fact that the process is often too dangerous for makers to do themselves and may often necessitate a professional division of labor. Therefore, the prototyping process requires a lot of time for consultation, waiting, and manufacturing [1–3].

Consequently, this article advances a "seeing-moving-seeing" design model, allowing designers to refine their ideas [4]. Designers can observe and present their ideas with design media, making it easier to explain their thought patterns between the development of the handmade product and the test. A coexisting space representation [5, 6] is used for modeling such interplay that oscillates between cyber and physical spaces [7]. The possibilities of enacting multiple operations by designing, manufacturing, and integrating tools through digital means, such as a robotic arm, are expanded through this model.

The three aspects inherent in an interactive fabrication process based on the "seeing-moving-seeing" model are human features, the digital stream, and the physical entity. Three additional factors, comprised of control, communication, and computation, exist in the interplay between human behavior and the digital stream, the digital stream and the physical entity, and the physical entity and human behavior. The relationship between these three aspects and the related interplays is shown in Fig. 1. By interacting with a digital stream and a physical entity, each maker can embed their product into a network of manufacturing systems.

After describing the "seeing-moving-seeing" model, we propose the implementation of a Co-existing Fabrication System (CoFabs) to support an interactive process for makers that allows for real-time modification and manufacturing in the design sequence. We also analyze each thread's results and create a system model to simplify the workflow of controlling the physical machine and lower its technical threshold (see Fig. 1).
Fig. 1 The CoFabs system involves three aspects of the fabrication process
1.1 Human Features: Technical Threshold for Digital Manufacturing

The theoretical "seeing-moving-seeing" model proposed by Schon and Wiggins in 1992 [4] is mainly used in design, since handmade artifacts and design thinking are requisite in the design process. Designers need to observe and refine their works as their ideas develop before presenting them through a design medium. This makes it easier to explain the circular, loop-like thinking that connects handmade implementation and testing. Makers generally cannot intervene in the digital manufacturing process, and so the relationship between makers and their products is severed.

In the design industry, physical models and prototypes still carry more weight than 3D computer renderings and animated models [8]. A physical model or prototype can effectively assist face-to-face interactions among different stakeholders, such as clients, designers, and consumers. Through physical interaction, these models also provide more information about the design ideas and thereby highlight unforeseen problems [9].

In order to allow products to present the maker's ideas properly, CoFabs integrates digital and physical information with embedded computing, network monitoring, and control mechanisms to establish a feedback system [10].
The cycle of this system appears as a recursive, repetitive, and coexisting loop between virtual space and reality. We can therefore draw on several interactive architecture cases, which represent space solely by recording user behavior and generating feedback in different ways, to build up a specular spatial experience.
1.2 Digital Features: Digital Interface into the Reality

In digital manufacturing, designers often design virtual models through CAD/CAM [11] and transfer the models to a machine for execution, but the virtual representations and their corresponding entities remain separated during this manufacturing process. To allow users to program faster and more intuitively, and to facilitate the operation of manufacturing tools, custom applications are needed, usually accompanied by complex programming that mediates the interaction between human agents and manufacturing tools.

Collaborative work has been fundamental to solving complex engineering challenges. In this sense, cloud storage services, collaborative environments, and social networks offer various advantages and limitations [12]. Collaborative platforms are, however, mostly useful for communicating, writing, and presenting results when 3D modeling is not supported. Fusion 360 from Autodesk [13, 14], a collaborative 3D modeling environment, is easy to use, integrates several assisted design environments, and offers a smartphone application for viewing models as well as a public weblink to visualize the model in the browser [15].

The wireless head-mounted Mixed Reality (MR) display device HoloLens [16] offers another way to present designers' ideas. MR is the integration of the real world and the virtual world to create a new environment replete with visualization, where physical and digital objects coexist and interact instantly. MR not only spans the physical and virtual worlds, but also combines the real world and virtual reality through Augmented Reality (AR) and enhanced virtual immersive technology. HoloLens provides a platform for developing applications in many fields that benefit from placing a user within an environment containing holographic elements that seem real, even though they are not. Microsoft has therefore made a major bet that AR or MR will emerge into a new mass market. Rivalry among smartphone manufacturers became intense after Apple introduced the original iPhone; if Microsoft is right, and MR follows the smartphone's trajectory, the opportunity for independent developers is bound to be huge. Every new application increases the value of the HoloLens platform, drawing more customers and making the pie bigger for all HoloLens developers [17].

Szalavári et al. [18] have proposed five major advantages for designers collaborating on their projects within MR environments:
• Virtual Reality: Objects that do not exist in the real world can be viewed and examined.
• Augmentation: Real objects can be augmented by virtual annotations.
• Cooperation: Multiple users can see each other and cooperate in a natural way.
• Independence: Each user controls his/her own independent viewpoint.
• Individuality: Displayed data can be different for each viewer.
1.3 Physical Features: Manufacturing Tools into the Digital

The convenience and rapid prototyping capability of digital manufacturing have become a trend in the development of manufacturing tools. The precision of 3D printing and other manufacturing variabilities have been a focus of interest in the scientific literature, including studies comparing the differences between CAD models and physical models [19, 20]. Other scholars have addressed the use of different digital manufacturing techniques for materials such as concrete [21, 22], or numerical models for analyzing the behavior of timber folded surface structures using semi-rigid multiple tab and slot joints [23].

Although the use of 3D modeling programs in engineering environments has increased with respect to the visualization and evaluation of designed elements [8], tangible models remain indispensable, because physical interaction with the model helps one to understand, evaluate, and detect design problems [24]. The additional information incorporated in such models has been a focus of analysis in historical research [25]. These models allow for more direct and personal interaction and visualization on the part of investors, clients, designers, and other professionals involved in the project.

Lavallee's 2011 study demonstrated that the manual assembly of parts is as complex as manufacturing-tool construction [26]. We therefore need a holistic approach to establishing manufacturing methods that successfully integrates manufacturing tools, human agents, and material organizations in the field. In traditional industry, manufacturing equipment relied on manual controls, as in metal bending, the use of table saws, and other manufacturing tools. Because of the danger inherent in these processes, the manufacturing industry has been searching for control systems that require less specialized knowledge or experience than professional technology demands.

HoloLens makes good use of MR characteristics and the simulation performance of the real-time 3D engine Unity3D [27] to meet the requirement of simulating virtual robot dynamics and a kinematics control system of sufficient complexity to achieve the expected performance. More specifically, the system can simulate how a robot control system interacts with the actual environment, and it can quickly and effectively verify the influence of a given parameter when the robot control system is applied in that environment.
CoFabs with a robotic arm is a manufacturing tool that integrates virtual and physical elements to expand the possibilities of the manufacturing process with a multiple-operation workflow. The robotic arm has thus entered the design field, saving a great deal of time on complicated manual work. Different processing methods can be imagined as different movements of the human palm, so it is necessary to understand different processing methods in order to develop different robotic arm tools that let designers control more diverse or customized prototypes. Extending the robotic arm with such tools in industrial applications allows the production phase of the product to be deferred even further into the design process [28].
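Simulating a kinematics control system of the kind described above amounts to chaining joint frames to obtain the pose of each link and of the end effector. The actual simulation in this research runs inside Unity3D; the following Python sketch is only a language-neutral illustration of planar forward kinematics for a serial arm, and all names are hypothetical.

```python
import math

def forward_kinematics(link_lengths, joint_angles):
    """Planar forward kinematics for a serial arm: accumulate each joint
    rotation along the chain and return the (x, y) position of every joint.
    A real 6-axis arm requires full 3D homogeneous transforms; this 2D
    sketch only conveys how a simulator chains the joint frames."""
    x, y, theta = 0.0, 0.0, 0.0
    joints = [(x, y)]
    for length, angle in zip(link_lengths, joint_angles):
        theta += angle                 # rotation accumulates along the chain
        x += length * math.cos(theta)  # advance along the rotated link
        y += length * math.sin(theta)
        joints.append((x, y))
    return joints

# Example: a three-link arm; the last tuple is the end-effector position.
print(forward_kinematics([0.5, 0.4, 0.2], [math.pi / 4, -math.pi / 6, 0.1]))
```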
2 Related Work

Along with the rise of the maker movement, many people have started to learn how to make things themselves. In the prototype implementation stage, the traditional manufacturing process is replaced by digital manufacturing because the former workflow is too dangerous and its technical threshold too high. The manufacturing process is thus systematized through computer calculations and machines that work from programs. Digital manufacturing is an approach that lets makers quickly produce tailor-made products by themselves. However, during digital manufacturing the maker often cannot obtain prototyping feedback immediately within the design process, owing to the time-consuming nature of multi-disciplinary cooperation and the division of manufacturing processes.

This article therefore seeks to resolve the current digital manufacturing problems through the following three features: (1) the threshold and limits of digital manufacturing with respect to human-related factors, (2) the application and status of the robotic arm with respect to physical features, and (3) the application and status of MR with respect to digital features. The robotic arm facilitates physical operation through its three-dimensional moving path and its extended control by coexisting integration. By combining such techniques with MR gesture controls, the complexity of controlling the robotic arm can be reduced; users can then operate the robotic arm more intuitively and focus on solving the problems that arise in the digital manufacturing process.
2.1 Human Features: Threshold for Digital Manufacturing

Digital manufacturing is changing the way people design, produce, and interact with objects and devices. The diverse array of current manufacturing processes includes laser cutting, 3D printing, CNC milling, and Printed Circuit Board (PCB) manufacturing, each of which can produce parts in a variety of forms and materials. With the rise of digital manufacturing technology, industrial manufacturing has been both redefined and integrated into new processes. Designers must be capable of abstract design thinking, effective description, and keeping design results under control.
With the advent of a series of Rapid Prototyping (RP) technologies and their absorption into practice, designers are able to see design results more quickly. As a result, designers and design colleges have begun to use RP equipment during the design process. This intermediary relationship has started to de-emphasize the characteristics of the machine and to place more weight on the nature of the design process. The evolution of digital technologies is inseparable from the transformation of conventional building techniques. The use of digital fabrication in architecture allows for the mass production of customized complex structures. Practitioners have therefore become committed to working on-site, but this still does not solve the problem of immediate on-site correction.

In the contemporary design workflow, the prototype has always been considered an effective design method for examining and testing how a new idea will be made and how it will be used [8]. The concept of a prototype emerges from a focus on the utility of models beyond the final result, and so there are different model categories for the objectives of each phase of a design project. These categories can be classified as follows [9]:

• Soft model: Modeled by hand; allows exploration and evaluation of the size, proportion, and shape of concepts and ideas.
• Hard model: Usually made of wood, plastic, metal, or dense foam. It is not technically functional, but it is presented as a replica that is very close to the final design.
• Presentation model: Carries every detail of the composition, with the components simplified to save time.
• Prototype: Produced with high quality and functionality to exhibit a design solution.

The "seeing-moving-seeing" model described above treats design as a "process of dialog between designers and design media," which relies heavily on observation (seeing) activities. Observations are made, the designer's ideas are refined, and the results are then presented via the design media in a thinking loop that makes the design more complete, as shown in Fig. 2. The designer draws on paper, observes how the finished product gradually takes shape, and uses different "seeing" methods, including, for example, visual apprehension and literal seeing. After the drawing is refined, design features and associations can be defined, and deeper reflections accumulate.
Fig. 2 Seeing-moving-seeing model
2.2 Physical Features: Robot Arm Application

Since the 1980s, the six-axis robotic arm has been widely applied in the automobile and aerospace industries thanks to its capacity to execute workflows ever more intuitively, helping designers produce results precisely and repeatably. At present, the countries manufacturing robotic arms are mostly those with advanced heavy-industry technology, such as Germany, the United States, and Japan. Robotic arms have been used in several experimental settings [29, 30], such as 3D printing [31], concrete printing [32], and cutting projects [33].

Different materials call for different processing methods, and the production workflow involves different processing stages. A robotic arm can execute many complicated tasks with a precision and continuity that cannot be achieved manually. The end effector is like the palm of our hand: it can "learn" different ways of processing through a kind of training, while the robotic arm escapes the joint limitations and muscle fatigue of the human palm. A variety of processing methods have been developed by design, which have not only improved the efficiency of the manufacturing process but have also advanced manufacturing automation.

In the 2015 "Extrusion Structure" course [34], the designers initially started with conventional triangulated space-frame structures. They then expanded their research toward combining multi-sided rectangles and hexagonal frames, and developed custom printing sequences for non-standard spatial systems (Fig. 3).

The hot wire cutting process has been commonly used in the foam industry for making sculptures and extracting raw materials by carving out large volumes in a short time. The process is related to stereotomy (the art of stone cutting) and can quickly and efficiently generate three-dimensional shapes, because material removal is volumetric: an entire ruled surface can be processed with a single motion, avoiding the need for layer-by-layer processing. Reinhardt et al. [29] demonstrated the efficiency and reliability of hot wire cutting as a method for the fast production of large volumes and the formation of a space gauge block by stacking the cut modules (Fig. 4).
Fig. 3 The process of applying the robot arm to 3D printing (left) and results (right)
Fig. 4 Large-scale Robotic Hotwire Cutting (RHWC) production at Odico [33] (left) and a hot wire cut production sample (right)
2.3 Digital Features: Mixed Reality Applications

MR is the integration of the real and virtual worlds to create new environments and visualizations in which physical and digital objects coexist and interact in real time. MR occurs not only in the physical or virtual world, but also in the combination of reality and virtual reality, including AR and enhanced virtual immersive technology. The range of MR technology and applications has now expanded to include entertainment and interactive arts as well as engineering and medical applications.

In 2014, Weichel et al. proposed an MR environment for personal fabrication that lowered the barrier for users to engage in personal fabrication. Users were able to design objects in an immersive AR environment, interact with virtual objects in a direct gestural manner, and introduce existing physical objects effortlessly into their designs [35]. As an example of this approach, users started by creating the base shape by drawing a circular outline (see Fig. 5a). The system recognized the drawing as a circle. To confirm the height, they tapped with the other hand (see Fig. 5b). To cut the left side, they placed their left hand where they wanted to cut and confirmed with the right one (see Fig. 5c). They then created the first hole, which would hold the glue-stick: they positioned the real glue-stick where they wanted the hole to be within the virtual object. Once it was in position, they selected "capture outline" and moved their hands out of the frame (see Fig. 5d). The system then captured the outline of the glue-stick and extruded its height. Confirming that initial height with the left hand turned the object into a virtual glue-stick replica. As the glue-stick stood on the ground of the frame, the virtual glue-stick replica was on the ground as well. To leave some material at the bottom of the hole, they grabbed the virtual glue-stick, moved it up a few millimeters, and released it to fix it in that position (see Fig. 5e). Eventually, they assembled the virtual glue-stick and the previously created base to create the hole for the stick. After selecting "assembly," they were asked to choose the method (see Fig. 5f). Choosing "subtract" removed the material where the glue-stick was, leaving a hole of the correct size and position.
Fig. 5 MixFab’s user interface
Lastly, they repeated the steps above for a pen, placing it in its desired position, capturing its outline, extruding it, and moving it up a few millimeters. To make the pen easier to access, they tilted it forward by grabbing a point in space and forming a lever with which the object was reoriented (see Fig. 5g). Once it was in the correct position and orientation, they assembled the virtual pen replica, resulting in the final desktop organizer (see Fig. 5h).

The second case used a 3D collaborative approach with HoloLens and Skype, in which the HoloLens AR system developed by Microsoft served as an endpoint through which other users' supported devices could join the collaboration [36]. Traditional AR collaboration involves two or more Head-Mounted Display (HMD) users. Chen et al. described collaboration between a single HMD user and others who joined the space by assuming the HMD user's point of view. Remote peers could participate in the process through Skype-enabled devices, such as tablets or PCs. This interaction is novel in using a 3D space with digital objects, where the interaction of the remote parties can be carried out asynchronously and reflected back to the primary user. First, this case supported long-distance spatial annotation, with HoloLens Skype providing real-time collaboration applications that supported the design requirements. Second, it explained two main methods for capturing and annotating the 3D information associated with HoloLens users' video delivery, capturing spatial data using static frames [37] and series of video clips. Finally, this case demonstrated how users could reintegrate spatial data back into the original AR environment so that the primary users could understand and interact with them (see Fig. 6). This approach allows multiple participants to use commercially available equipment, such as tablet computers, to interact remotely with the 3D video content captured through the HoloLens equipment.
Fig. 6 HoloLens Skype in an augmented reality scene
A case presented in 2012 proposed a situated modeling method in which new 3D objects are positioned against existing physical objects, much as new 3D renderings are aligned with existing 3D geometry. Lau et al. combined an AR modeling framework with this positioning work, generating new virtual object designs that could be displayed alongside existing ones [38]. They introduced a shape-stamping technique with tangible primitives, with tactile feedback for the stamping, and demonstrated the design of a variety of furniture and household products of their own with the system. In Fig. 7, the table, dish rack, and candlestick are presented. The shelves were built around laptops and books on the table. A chair with a curved surface differed from a table in that its surface consisted of a continuous, connected set of cylinders. The chair at the top right of the figure demonstrates the use of a square prism and small cube primitives. The shelf at the bottom right of the figure is tightly embedded in the available space, and its wheels were created from small sphere elements. Symmetry tools can be used to create a virtual copy of all shapes, with the tool serving as the plane of symmetry. Such tools make it easier to create symmetrical parts and to generate well-aligned and configurable shapes.
Fig. 7 Situated modeling: a shape-stamping interface with tangible primitives
2.4 Summary

The literature reviewed above motivates introducing the "seeing-moving-seeing" model into the digital manufacturing process. Under operating conditions that blend RP and virtual reality, the designer can use different ways of looking and seeing to observe the finished product. Design media have gradually evolved from a purely passive assisting role to an active supporting role that may call for interaction on the part of the designer.

Generally, MR devices are limited by their own resolution and the interactivity of their display interface. The HoloLens headset developed by Microsoft not only provides a wireless HMD, but also supports assembling AR commands and gives users the opportunity to visualize such applications. It has a unique capability to integrate virtual 3D content with the real world, and its simple gesture-recognition input system allows users to easily manipulate 3D reconstructions and create 3D models in space. HoloLens was therefore used in this research as the MR interface device for the system.
3 Materials and Methods

Makers are inextricably reliant upon digital manufacturing. This article draws on participatory behavioral observations, with makers with cross-domain interests as the main targets, and analyzes various RP-capable manufacturing methods, with a focus on thermoforming. The manufacturing methods considered included metal bending, metal printing, hot wire cutting, incremental sheet forming, CNC engraving, and panel striking. Because metal-printed material offers great plasticity, strength, and hardness, we selected metal printing for the research test.

Arc welding requires a welding power source to create and maintain an arc between the electrode and the solder material, melting the metal at the solder joint to form a molten pool. Arc welders use either direct or alternating current and consumable or non-consumable electrodes. Sometimes an inert or semi-inert shielding gas is applied near the molten pool, and a weld repair material is added. The molten pool is formed by heating the workpieces to be joined and then solidifies.

We divided the manufacturing process of metal printing into four steps: (1) preparation, (2) warm-up, (3) processing, and (4) finishing. Different technical knowledge is required in each step, including the method of assembling the tool head, the adjustment of parameters, and the setup of safety equipment (see Fig. 8). It is hoped that the RP application might be extended to other manufacturing processes through the cyber-physical interaction system proposed in this article.
Fig. 8 Traditional manufacturing process for metal printing
After this analysis, we reduced the metal printing steps into distinct tasks and systematically modularized each step by designing the operations. CoFabs, combining MR and a machine, was used throughout the process: the basic design flow was set through the MR interface, the values were transmitted to the machine through the system, and finally the material was heated, moved, and processed (see Fig. 9).

When the maker starts to manufacture a product, the basic settings of the machine are first made through the MR interface, such as the work area, the initial model, the tool head, the positioning of the workpiece, the initial position, and the contact point of the tool head. After these basic values are set, the heating stage begins; here the maker must attend to the parameters of power supply adjustment, heat concentration, and deposition speed from the MR interface. During mobile processing, the maker sets the printing path via gestures, and the system automatically calculates the moving path, angles, and feed switching points that control the movement path printed by the machine.
Fig. 9 CoFabs system process
Fig. 10 User journey map
At the final stage of completion, the progress of the physical printing was displayed on the MR interface through system operations. We observed the progress through the virtual interface and its corresponding reality as the finished product gradually took shape, and we tried different "seeing" methods to modify the finished product.
3.1 The User Journey Map

From the usage flow it can be seen that the steps of setting up the initial working area and positioning the tool head are too cumbersome, and the maker may be confused by the difference between the virtual space and reality. Therefore, in the preparation stage, the positioning function needs to be supported by snapping to reference points, to reduce the maker's difficulty in placing them. In the stages of adjusting each parameter and starting to move along the printing path, however, the maker can customize manufacturing according to his or her own preferences, and once the progress is displayed, the subsequent printing paths can still be changed. In these steps the maker performs simpler operations and can defer the design-model phase within the overall manufacturing process (see Fig. 10).
3.2 CoFabs System Architecture

To render the model and generate the model interface, CoFabs requires a HoloLens and a computer server running a MySQL database. The HoloLens provides the UI, and the computer server generates the g-code for the robotic arm. The HoloLens was first set up with Unity3D. The assets of the Unity project included prefabs of UI objects plus sender and listener scripts that transmitted messages. Listeners and senders needed internal port numbers, external IPs, and external ports matching the specific IP address of our computer server. The HoloLens could communicate with other devices over UDP/IP, enabling the virtual model in the HoloLens to generate data strings and pass them back to the computer server through the MySQL database.
Fig. 11 CoFabs system architecture
We also built a C# script for Grasshopper to download the string-type data as JSON from the MySQL database table, which contained geometric data, behavior variables, and remapping parameters linking the robotic arm and the HoloLens UI system. Finally, we used the Taco ABB [39] Rhino/Grasshopper plug-in to convert the points at the edge of the model into a path from the server database along which the robotic arm is driven (see Fig. 11).
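In CoFabs itself, the sender and listener are C# scripts inside the Unity project, and the strings are relayed through the MySQL database. Purely as an illustration of the UDP exchange described above, the following Python sketch mimics one message hop; the address, port, and field names are hypothetical.

```python
import json
import socket

SERVER_ADDR = ("192.168.0.10", 5005)  # hypothetical server IP and port

def send_model_update(points, scale):
    """Headset side: serialize a UI event as a JSON string and push it
    to the computer server over UDP, as the Unity sender script does."""
    payload = json.dumps({
        "geometry": [list(p) for p in points],  # geometric data
        "behavior": {"scale": scale},           # behavior variables
    })
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), SERVER_ADDR)

def receive_model_update(port=5005):
    """Server side: receive one datagram and decode it, before the data
    would be written into the MySQL table for Grasshopper to download."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        data, sender = sock.recvfrom(65535)
    return json.loads(data.decode("utf-8")), sender
```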
4 The Experiment

To reify the CoFabs process, we conducted an experiment on immediate deformation using the Robotic Hotwire Cutting (RHWC) workflow. Building parts for a "Fab Car" through this RHWC manufacturing process was implemented with CoFabs, in keeping with the spirit of digital fabrication embedded in the Fab Car design. We hoped this process could help designers style their projects differently, with the ability to modify, customize, and adapt shapes to their specific needs through gesture poses. The goal was to study real-time manufacturing that integrates the virtual and the real, so an MR UDP transmission system was implemented with an ABB IRB2600 robotic arm to reduce the threshold of operating robots. Through Grasshopper, the geometric model was divided into three types of data. The UDP transmission system uploaded the data to the cloud server, where they were converted into a database. The HoloLens device visualized the rendering immersively in real space, and gestures were then used to plan the moving path of the cut. The moving path was passed back to the cloud server, and the movement of the robotic arm was executed (see Fig. 12). The remote CoFabs setup included ABB IRB2600 robotic arms and a Microsoft HoloLens device, which uses human perception to interact with the other variables.
Fig. 12 MR and the robot arm real-time view
Because modern robotic arms tend to use distributed control systems and algorithms, the infrastructure is easy to implement and integrate with the cloud-based approach. The following sections highlight the various design considerations involved in merging these parts into one integrated system. On the input side, the system first sensed the moving position and hand gestures through the HoloLens. As the hands acted, the system presented the 3D models so that the user could enlarge, reduce, or deform the model in real time. Once these variant operations were decided, the geometric data of the dotted-line surface were uploaded back to the cloud server, and finally the data were converted through Grasshopper into the moving path of the robotic arm, thereby simplifying the complex programming of operating the arm.

In the first stage (see Fig. 13), users could shape the Fab Car body through the cutting model with the sensing of the HoloLens. To explore a variety of types, our experiment also presented several models from which a user could choose the best one by comparison; the system then uploaded the selected model's features to the cloud server, where the default Grasshopper definition derived the cutting path and transmitted it to the robotic arm, which manufactured the body. Upon completion of the cutting, the body's details were revised once again (see Fig. 14).

During the experiment, the continuity of the cutting surface and the gestures were simulated in a test.
Fig. 13 Fab Car type simulation
Fig. 14 Cutting process simulation (left) and finished cut (right)
When the hands moved to generate the desired path and the surface was converted into a cutting path, the smooth surface could not be cut accurately, because it is difficult for human hands to remain stable while floating in the air. We therefore experimented with tracking the median value of the hand tremor and smoothing the path of hand movement. Additionally, with a single-direction cut it was impossible to cut out a symmetrical model, so we tested a symmetrical path by mirroring the cutting path about the model center, confirmed the cutting path against a preset example path, and modified it.

In the actual cutting process, the cutting tool initially moved too fast, and its center of gravity was not stable. A variety of machine parameters were involved. We noticed that the speed of the robotic arm movement, the material quality, and the curvature of the cutting path were the most important parameters, because of their interaction with the melted material, which caused the cutting line to stretch during the cutting operation. The speed was therefore reduced to 50% when ready for cutting, but the cutting line still produced some tension, which caused the cutting path to shift. The speed was then reduced to 25%, and the cutting line cut into the material, but the line slowly began to stretch again after cutting half of the material. Finally, the speed was reduced to 10%, so that the cutting line had enough time to preheat, and the material was cut smoothly for the remainder of the process.

With the initial cutting method, the tool cut from left to right. Owing to the shift of the center of gravity, the cutting tool shook while moving, and the model deviated somewhat from the simulation. We then ran the cutting tool from top to bottom, but the center of gravity still shifted, causing the tool to shake. Next, a fixed cutting tool was tested, with the material moved for the cutting process. Finally, the material was fixed on the table to reduce the shaking of the two robotic arms. However, the details of the actual cut model were not obvious when the cutting path was divided into too few points. We therefore increased the number of division points to 15, and the cutting became smoother, but the undulating surface began to shift and the cutting speed was too fast. Finally, the number of division points was increased to 20 and the speed reduced to 10%; the offset was significantly reduced, and the surface accuracy improved.
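The article does not specify how the median of the hand tremor was tracked; one plausible reading is a per-axis sliding-window median filter over the sampled hand positions, sketched below in Python with hypothetical names.

```python
import statistics

def median_smooth(path, window=5):
    """Median-filter a hand-drawn cutting path to suppress tremor spikes.

    path:   list of (x, y, z) samples of the tracked hand position
    window: odd number of neighbouring samples considered per point
    """
    half = window // 2
    smoothed = []
    for i in range(len(path)):
        lo, hi = max(0, i - half), min(len(path), i + half + 1)
        neighbourhood = path[lo:hi]
        smoothed.append(tuple(
            statistics.median(p[axis] for p in neighbourhood)
            for axis in range(3)
        ))
    return smoothed

# A mirrored path for a symmetrical cut can then be derived by reflecting
# each smoothed point about the model's centre plane.
```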
Fig. 15 Cutting component simulation (left) and finished components (right)
In the second stage (see Fig. 15), we manufactured the components after the car body was confirmed complete. We fixed the Styrofoam on the table so that unidirectional cutting could be performed faster along the Fab Car head, tail, and sides, and the columns were manufactured one by one. During the process, we continuously observed the original body model and compared against it to confirm the joints of the parts.

In the final phase, after each part had been manufactured, the simulated position of each part of the body assembly was checked through the HoloLens and confirmed for assembly. The mechanism had not been considered during assembly, which caused the mechanism and the model to collide. After the position of the mechanism design was confirmed, a correction was made, and part of the designed mechanism was manually dug out with a cassette to enable the model to be mounted. In the design phase, therefore, the mechanism design should be included in the simulation to avoid any conflict between the model and the mechanism (Fig. 16).

The designer's role was also redefined during the process. When the user acquires more powerful customized tools with an easy-to-operate interface, the designer no longer has to wait for the creation to emerge in its final form to determine whether it is feasible, since the results of the creation take effect earlier than expected. As the designer drew on paper, the finished product gradually took shape. The designer's idea changed and could be instantly modified, simplified, or extended through the robotic arm's activities during the operating process. While the virtual model was completed by the HoloLens, the robotic arm also finished the solid model (Fig. 17).
Fig. 16 Assembly simulation
Fig. 17 The finished Fab Car project
5 Results and Discussion

Generally speaking, when a prototype is made to study shape, 3D printing might be the most common and intuitive choice. However, 3D printing materials consist mainly of ABS or PLA, usually deposited by fused deposition modeling. These materials are relatively fragile and weak, and prints are generally limited to smaller sizes. When makers try to develop a prototype, they therefore need to consider the use of different materials.

In the Fab Car experiment, the speed of the robotic arm movement, the center position of the cutting object, and the curvature of the cutting path were the most important parameters, because of their interaction with the melted material, which caused the cutting line to stretch during the manufacturing process when the robotic arm operation was executed. To optimize the experimental results, we therefore reduced the speed to 10%; this parameter change gave the cutting line enough time to preheat. To mitigate the offset of the center of gravity, we fixed the material on the table, reducing the shaking problem of the two robotic arms. We also set the number of curve division points to 20, which fixed the jitter of the cutting path. In the overall fabrication process, we added a mechanical design simulation to the preparation stage, allowing the user to avoid any conflict between the model and the mechanism when designing the Fab Car.

In existing models, it is necessary to hand the design draft over to the builder, and both parties must reach a consensus through communication before manufacturing; defects in the design draft are therefore discovered only after the product is finished. In contrast, we have developed a system
that functions within a hybrid environment, through which the maker can reduce the technical threshold by augmenting the real interface and remotely manipulating the machine for metal printing. Through the virtual interface and the actual product, we may observe the finished products and use different “seeing” methods to modify them. Such an approach will reduce the time it takes for the maker to communicate with the builder and wait for the finished product, while also decreasing the number of revisions required after the product is completed.
6 Conclusion

To reduce the threshold for makers to use different manufacturing processes, this article applies a "seeing-moving-seeing" design model to reconceptualize the digital manufacturing process. Under operating conditions that blend RP and virtual reality, designers can use different "seeing" methods to observe the finished product. The CoFabs process was therefore developed to create an interactive manufacturing process spanning virtual and real spaces, so that the manufacturing process can be modified as it progresses in real time. Furthermore, through this revised process, the designer can watch the model being built, and the manufacturing process enacted by the robotic arm in real space and time can be observed, establishing a remote wireless virtual reality integration system. Through such human–machine collaboration, the time needed to manufacture complex building assemblies can be reduced, building structures can be formed quickly, and the fabrication of complex spatial structures can be realized while reducing costs, risks, complexity, and the traditional dependence on 2D documents.
References

1. Chen, C.-Y., Chang, T.-W., Hsiao, C.-F., & Huang, H.-Y. (2019). Developing an interactive fabrication process of maker based on "seeing-moving-seeing" model. In P. F. Yuan, Y. M. M. Xie, J. Yao, & C. Yan (Eds.), Proceedings of the 2019 DigitalFUTURES. CDRF 2019 (pp. 312–321), 05 July 2019, Shanghai, China. Springer.
2. Hsieh, T.-L., & Chang, T.-W. (2019). ViDA: A visual system of DFA process for interactive surface. In T. G. Wyeld, E. Banissi, A. Ursyn, M. Bannatyne, N. Datia, & M. Sarfraz (Eds.), 23rd International Conference Information Visualisation (pp. 68–73), 16–19 July 2019. Flinders University, Adelaide, Australia.
3. Huang, H.-Y., Chang, T.-W., Wu, Y.-S., & Chen, J.-Y. (2017). Collective fabrication—A responsive dynamic skin design case. In P. Janssen, P. Loh, A. Raonic, & M. A. Schnabel (Eds.), The 22nd International Conference on Computer Aided Architectural Design Research in Asia (pp. 99–100). Xi'an Jiaotong-Liverpool University, Suzhou, China. CAADRIA.
4. Schon, D. A., & Wiggins, G. (1992). Kinds of seeing and their functions in designing. Design Studies, 13, 135–156.
5. Lai, I.-C., & Chang, T.-W. (2003). Companying physical space with virtual space—A coexistence approach. In A. Choutgrajank (Ed.), The 8th Annual Conference of Computer Aided Architectural Design Research in Asia (CAADRIA) (pp. 359–370), Bangkok, Thailand.
6. Lu, K.-T., & Chang, T.-W. (2005). Experience montage in virtual space. In Proceedings of the 10th International Conference on Computer Aided Architectural Design Research in Asia (pp. 426–435), 28–30 April 2005, New Delhi, India. CAADRIA.
7. Wesugi, S., Ishikawa, K., Suzuki, N., & Miwa, Y. (2004). Interactive spatial copy wall for embodied interaction in a virtual co-existing space. In RO-MAN 2004. 13th IEEE International Workshop on Robot and Human Interactive Communication (IEEE Catalog No. 04TH8759) (pp. 265–270). IEEE.
8. Hallgrimsson, B. (2012). Prototyping and modelmaking for product design. London: Laurence King.
9. Isa, S., & Liem, A. (2014). Classifying physical models and prototypes in the design process: A study on the economical and usability impact of adopting models and prototypes in the design process. In DS 77: Proceedings of the DESIGN 2014 13th International Design Conference.
10. Lee, E. A. (2008). Cyber physical systems: Design challenges. In 2008 11th IEEE International Symposium on Object and Component-Oriented Real-Time Distributed Computing (ISORC) (pp. 363–369). IEEE.
11. Groover, M., & Zimmers, E. (1983). CAD/CAM: Computer-aided design and manufacturing. Pearson Education.
12. Al-Samarraie, H., & Saeed, N. (2018). A systematic review of cloud computing tools for collaborative learning: Opportunities and challenges to the blended-learning environment. Computers & Education, 124, 77–91.
13. Autodesk. (2019). Fusion 360: Integrated CAD, CAM, and CAE software. Retrieved October 10, 2019, from https://www.autodesk.com/products/fusion-360/overview#banner.
14. Verma, G. (2018). Autodesk Fusion 360 black book. BPB Publications.
15. Vila, C., Ugarte, D., Ríos, J., & Abellán, J. (2017). Project-based collaborative engineering learning to develop Industry 4.0 skills within a PLM framework. Procedia Manufacturing, 13, 1269–1276.
16. Microsoft. (2019). HoloLens: A new way to see your world. Microsoft HoloLens. Retrieved October 10, 2019, from https://www.microsoft.com/microsoft-hololens/en-us/hardware.
17. Taylor, A. G. (2016). Develop Microsoft HoloLens apps now. Springer.
18. Szalavári, Z., Schmalstieg, D., Fuhrmann, A., & Gervautz, M. (1998). "Studierstube": An environment for collaboration in augmented reality. Virtual Reality, 3, 37–48.
19. Chang, T.-W., Huang, H.-Y., & Datta, S. (2019). Design and fabrication of a responsive carrier component envelope. Buildings, 9, 84.
20. Dimitrov, D., Schreve, K., & de Beer, N. (2006). Advances in three dimensional printing—State of the art and future perspectives. Rapid Prototyping Journal, 12, 136–147.
21. Lowke, D., Dini, E., Perrot, A., Weger, D., Gehlen, C., & Dillenburger, B. (2018). Particle-bed 3D printing in concrete construction—Possibilities and challenges. Cement and Concrete Research, 112, 50–65.
22. Wang, S.-Y., Sheng, Y.-T., Shih, J. Y.-C., Frank, F., Breitenberger, G., & Chang, T.-W. (2018). Transient materialization. In W. Huang, M. Williams, D. Luo, Y.-S. Wu, & Y. Lin (Eds.), The Short Paper Proceedings of the 23rd International Conference on Computer Aided Architectural Design Research in Asia. Beijing, China: China Architecture & Building Press.
23. Stitic, A., Nguyen, A. C., Rezaei Rad, A., & Weinand, Y. (2019). Numerical simulation of the semi-rigid behaviour of integrally attached timber folded surface structures. Buildings, 9, 55.
24. Yan, Y., Li, S., Zhang, R., Lin, F., Wu, R., Lu, Q., et al. (2009). Rapid prototyping and manufacturing technology: Principle, representative technics, applications, and development trends. Tsinghua Science & Technology, 14, 1–12.
25. Granado Castro, G., Barrera Vera, J. A., & Aguilar Camacho, J. (2016). La maqueta de Cádiz de 1779. Utilidad militar o metáfora de poder. Proyecto, Progreso, Arquitectura, 15, 16–29.
26. Lavallee, J., Vroman, R., & Keshet, Y. (2011). Automated folding of sheet metal components with a six-axis industrial robot. In ACADIA 2011: Integration Through Computation, 11–16 October 2011, Calgary/Banff, Alberta, Canada.
27. Unity. (2008). Unity game engine-official site. http://unity3d.com.
28. Sha, X., Jia, Z., Sun, W., Hao, Y., Xiao, X., & Hu, H. (2019). Development of mixed reality robot control system based on HoloLens. In International Conference on Intelligent Robotics and Applications (pp. 571–581). Springer.
29. Reinhardt, D., Saunders, R., & Burry, J. (2016). Robotic fabrication in architecture, art and design 2016. Springer.
30. Willette, A., Brell-Cokcan, S., & Braumann, J. (2014). Robotic fabrication in architecture, art and design 2014. Springer.
31. Yuan, P. F., Meng, H., Yu, L., & Zhang, L. (2016). Robotic multi-dimensional printing based on structural performance. In Robotic fabrication in architecture, art and design 2016. Springer.
32. Culver, R., Koerner, J., & Sarafian, J. (2016). Fabric forms: The robotic positioning of fabric formwork. In Robotic fabrication in architecture, art and design 2016. Springer.
33. Søndergaard, A., Feringa, J., Nørbjerg, T., Steenstrup, K., Brander, D., Graversen, J., et al. (2016). Robotic hot-blade cutting. In Robotic fabrication in architecture, art and design 2016. Springer.
34. Helm, V., Willmann, J., Thoma, A., Piškorec, L., Hack, N., Gramazio, F., et al. (2015). Iridescence print: Robotically printed lightweight mesh structures. 3D Printing and Additive Manufacturing, 2, 117–122.
35. Weichel, C., Lau, M., Kim, D., Villar, N., & Gellersen, H. W. (2014). MixFab: A mixed-reality environment for personal fabrication. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 3855–3864). ACM.
36. Chen, H., Lee, A. S., Swift, M., & Tang, J. C. (2015). 3D collaboration method over HoloLens™ and Skype™ end points. In Proceedings of the 3rd International Workshop on Immersive Media Experiences (pp. 27–30). ACM.
37. Büttner, S., Mucha, H., Funk, M., Kosch, T., Aehnelt, M., Robert, S., et al. (2017). The design space of augmented and virtual reality applications for assistive environments in manufacturing: A visual approach. In Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments (pp. 433–440). ACM.
38. Lau, M., Hirose, M., Ohgawara, A., Mitani, J., & Igarashi, T. (2012). Situated modeling: A shape-stamping interface with tangible primitives. In Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction (pp. 275–282). ACM.
39. Frank, F., Wang, S.-Y., & Sheng, Y.-T. (2016). Taco for ABB. http://blickfeld7.com/architecture/rhino/grasshopper/Taco/.
Materializing Intelligence
3d-Printed Bending-Active Formwork for Shell Structures Xiang Wang, Kam-Ming Mark Tam, Alexandre Beaudouin-Mackay, Benjamin Hoyle, Molly Mason, Zhe Guo, Weizhe Gao, Ce Li, Weiran Zhu, Zain Karsan, Gene Ting-Chun Kao, Liming Zhang, Hua Chai, Philip F. Yuan, and Philippe Block
Abstract This paper presents a novel building technique for the formwork of thin shell structures using 3d-printed bending-active mesh sheets. To enhance the structural stiffness of the flexible plastic material, a bending-active form is applied, exploiting the geometric stiffening effect produced by large bending deformations. Since the main problem is to determine the final geometry of the bent surface, design methods incorporating numerical simulation are researched, and simulations via both dynamic relaxation and the finite element method are presented. Several demonstrator pavilions and their building processes are shown to test the feasibility of the presented building techniques in a real shell project. It is expected that this method can be applied to more thin shell projects to realize an efficient building technology with less waste of material.

Keywords Bending-active · Shell structure · Formwork · 3d printing · Simulation
X. Wang (B) · K.-M. M. Tam
Tongji University, Shanghai, People's Republic of China
e-mail: [email protected]
K.-M. M. Tam e-mail: [email protected]
Swiss Federal Institute of Technology in Zurich, Zürich, Switzerland
A. Beaudouin-Mackay · B. Hoyle · M. Mason · Z. Karsan
Department of Architecture, School of Architecture and Planning, Massachusetts Institute of Technology, Cambridge, USA
K.-M. M. Tam · Z. Guo · G. T.-C. Kao · P. Block
Block Research Group, Institute of Technology in Architecture, Swiss Federal Institute of Technology in Zurich, Zürich, Switzerland
X. Wang · W. Gao · C. Li · W. Zhu · L. Zhang · H. Chai · P. F. Yuan
College of Architecture and Urban Planning, Tongji University, Shanghai, People's Republic of China
1 Introduction

A perfect shell structure relies not only on an elegant structural design that enables the form to follow the forces, but also on an appropriate use of materials and the most efficient fabrication and building techniques. The past several decades have witnessed great development in the design and building of shell structures that cover large spans with little material. At the same time, their lightweight form conforms to the aesthetics of structural mechanics and engineering. Nevertheless, the differing interests of architects, structural engineers, and shell builders have also produced a separation of concerns between design and fabrication techniques. This has made shell structure research today a very comprehensive and multidisciplinary topic.

The ideal behaviour of a shell structure depends on the activation of membrane action, which places great demands on the doubly curved geometry [1]. Throughout the history of shells designed by master builders such as Antonio Gaudi, Heinz Isler and Frei Otto, many ingenious practical methods of finding suitable shell forms have been presented based on physical simulations (Fig. 1). Compression-only shells, in this sense, gained their historical importance as a typical model for such designs. With the development of advanced computational tools and algorithms, form-finding methods based on the Force Density Method, the Dynamic Relaxation Method, the Thrust Network Analysis Method and the Particle Spring Method have been developed to provide fast, direct solutions from given boundary-condition parameters [2]. In particular, design tools such as Kangaroo [3] and RhinoVAULT [4] have enabled more and more architects and shell designers to consider and learn proper ways to design a shell.

However, as the form-found geometry of a shell usually defines a doubly curved surface, it is always a big challenge in the final fabrication process to build the shell to the proper geometry. Normally, large scaffolding systems are used to control the form and to carry the load of the partial shell structure before it completes its shape (Fig. 2). It is for this reason that the building process of a shell structure is not as efficient as its geometry and structural behaviour suggest. To solve this problem, architects and researchers have devised many concepts through the smart application of different geometries and materials.
Fig. 1 Physical simulation (A: hanging; B: pneumatic) in form finding of shell structures
Fig. 2 Scaffolding system as formwork for the building of concrete shell structures
Felix Candela developed the way of using ruled surfaces to build with only a linear mould system [5]. Heinz Isler suggested the pneumatic formwork system, enabling a fast building process with removable, lightweight, flexible material [6]. Frei Otto utilized the degrees of freedom of a quadrilateral grid system to develop the grid-shell structure concept [7].

Nowadays, additive manufacturing techniques, especially 3d printing technology, provide the construction industry with many new possibilities. With bio-based materials as well as fully recyclable modified plastics, it is now possible to realize a fully green building technology. At the same time, robotic-aided 3d printing techniques give architects the ability to create far more complicated geometry than they could in the past. In this research, 3d-printed sheets with structural patterns are used as the supports of the shell structure, to enable a feasible and efficient building method.
2 Bending-Active Structure and the Geometry of Bending

In 2014, Julian Lienhard proposed a structural typology that uses the curvature of its form to improve the stiffness of the structure [8]. With suitable materials such as FRP, which offers sufficient strength together with a sufficient ability to bend, it is possible to use the curved bending geometry to stabilize a structure made of very flexible materials. The first basic target of this research is to define the principles of the geometry by combining the requirements of a deliberately designed, form-found geometry with the geometrical restrictions of the active-bending form.

Drawing on experience from related research based on the curved crease folding technique and a shell design with thin sheets [9], a Y-shaped prototype was used to define the total structure and was used predominantly in form finding (Fig. 3). Based on a primary compression-only state of equilibrium obtained through form finding, spatial curves could be generated from each pair of legs of the Y-branch.
Fig. 3 A relevant design experiment with the Y-shaped prototype and bending form based on the curved crease folding
By developing a quasi-developable surface, dividing the adjacent curves into a large number of elements and connecting the points to compose many triangles, this definition of the surface makes the polysurface unrollable into a planar one that can be described by a simple contour. By either laser-cutting the sheet material or 3d printing an arbitrary pattern inside the contour, the structural geometry can be recovered via bending to make a spatial shell whose shape is close to the original design (Fig. 3).

Experience shows that the final form may deviate from the ideal one derived by the optimization for the developable surface, because the final forming process relies only on the bending behaviour of the material and the moving of the supports to the target position. For this problem, simulation and analysis of the active-bent form are carried out as an important part of the design workflow.
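As a sketch of the quasi-developable definition just described, the hypothetical Python function below samples two adjacent form-found curves at matching parameters and connects the points into a strip of triangles; the uniform parameter sampling and all names are assumptions.

```python
def triangulate_between(curve_a, curve_b, n=100):
    """Build a triangulated, quasi-developable strip between two spatial
    curves, each given as a callable mapping t in [0, 1] to an (x, y, z)
    point. Returns (vertices, faces), with faces as vertex-index triples."""
    pts_a = [curve_a(i / (n - 1)) for i in range(n)]
    pts_b = [curve_b(i / (n - 1)) for i in range(n)]
    vertices = pts_a + pts_b
    faces = []
    for i in range(n - 1):
        a0, a1 = i, i + 1            # consecutive samples on curve A
        b0, b1 = n + i, n + i + 1    # matching samples on curve B
        faces.append((a0, b0, a1))   # lower triangle of the quad
        faces.append((a1, b0, b1))   # upper triangle of the quad
    return vertices, faces
```

Because each triangle is planar, such a strip unrolls exactly into a flat contour, and its approximation of the designed surface improves as the number of elements grows.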
3 Simulation and Determination of the Bending-Active Form

3.1 Dynamic Relaxation Method

The simulation of the bending-active geometry can generally be achieved either in a dynamic way, with the dynamic relaxation method used to analyse the post-buckling behaviour, or with the finite element method, simulating the quasi-static problem with an explicit dynamic procedure [10]. Compared with the dynamic approach, a finite element analysis usually takes much longer, which makes it less suitable for application in the early stages of design. For quick design feedback on the bent geometry, a simple kinetic analysis tool was built in this research with the plugin Kangaroo in the Grasshopper environment of the software Rhinoceros. Based on the Particle Spring Method, an initial sheet surface is discretized into meshes, and springs are generated along the mesh edges. With the springs as the main structural elements, the dynamic behaviour can be simulated by adding an actuation that moves the springs. The target actuation of the system (the movement of the supports) is defined as large forces added on specific lines. The system load is applied step by step to slow down the actuation process, avoiding sudden large deformations in the structural system and hence producing a quasi-static result for the bent geometry (Fig. 4).
Fig. 4 Bending simulation of a planar strip based on different boundary conditions and actuation in different steps (simulated with GH plugin Kangaroo)
3.2 Finite Element Method

In parallel, similar simulation and analysis workflows were tested with Finite Element Analysis (FEA) software. In this research, ABAQUS is used together with its Python interface to export the complicated boundary conditions directly from the Rhinoceros working environment. Different step types (Static-Riks, Dynamic-Explicit) are tested, corresponding to different formulations of the problem (treating it either as a static problem with large deformation or as a quasi-static problem over a long time span). With the FE analysis, not only the final geometry but also the internal stresses caused by the bending can be analysed and visualized (Fig. 5).
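The step setup can be scripted through the Abaqus Python interface. The fragment below is a minimal sketch, assuming a model named 'Model-1' that already contains the imported strip geometry, sections and boundary conditions; the step and job names are illustrative.

```python
# run with: abaqus cae noGUI=setup_steps.py
from abaqus import mdb
from abaqusConstants import ON

model = mdb.models['Model-1']      # geometry and BCs assumed already imported

# treat the problem as a static one with large deformation (arc-length method)
model.StaticRiksStep(name='Bend-Riks', previous='Initial',
                     nlgeom=ON, maxNumInc=400)

# alternative, in a separate model: a slow quasi-static dynamic process
# model.ExplicitDynamicsStep(name='Bend-Explicit', previous='Initial',
#                            timePeriod=10.0)

mdb.Job(name='bending-strip', model='Model-1').submit()
```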
Fig. 5 Bending simulation of a planar strip with the internal stresses caused by bending visualized (simulated with the FE software ABAQUS)
4 Printing Technique and a Pre-research Pavilion

4.1 3d Printing Technique

With the above-mentioned methodologies and design tools for the active-bending shell formwork, a pre-research project was carried out to verify the feasibility and the restrictions of this design method as well as of the fabrication technique. Additive manufacturing, and 3d printing in particular, offers the construction industry many new possibilities. With bio-based materials as well as fully recyclable modified plastics, a fully green building technology is now within reach. At the same time, robotic-aided 3d printing gives architects the ability to create far more complicated geometry than was previously possible. In this research, 3d-printed sheets with structural patterns are used as the supporting formwork of the shell structure, enabling a feasible and efficient building method. The 3d printing technique is based on robotic-aided FDM extrusion (Fig. 6). The material used in this experiment, as well as in the following prototype pavilion, is a custom-modified ABS plastic, whose mechanical properties are shown in Table 1.
Fig. 6 3d printing technique based on a gantry six-axis robot with an extrusion system
Table 1 Mechanical properties of the custom-modified ABS in this research

Name | Density (g/cm³) | Young's modulus (GPa) | Tensile strength (MPa) | Bending strength (MPa)
ABS  | 1.05            | 2.25                  | 40.7                   | 61.5
4.2 Design and Bending Test

The pre-research aims to establish a simple design method and to test the feasibility of applying state-of-the-art techniques in each step of the design process. Three main topics are tested. Firstly, as the initial geometry is to be 3d printed as a grid system, the design of its pattern is researched and tested. Secondly, because the form-found geometry of a shell structure is commonly a doubly curved surface, methods for modifying the initial shell form and unrolling it into a planar geometry are investigated. Lastly, the application of the above-mentioned bending simulation to a more complicated geometry is tested. In this test, the initial design started from a triangular planar design domain, and the form finding was carried out with RhinoVAULT to find a suitable compression-only form for the shell structure. To enhance the structural behaviour of the shell, principal stress lines were also generated and post-processed to be printed as the structural pattern of the shell. Two methods were tested in the post-processing of the initial form-found curved geometry. An 'orange-peeling' method was tested first, changing the initial discretized mesh into several separate strips that can be unrolled flat (Fig. 7). The final shell is built by gluing and connecting the strips during the erection process. With a tension tie added on the supported edges, a shell with the desired geometry can be built. However, such a building process requires too much work on the connections, and the local geometry is strongly affected by the properties of the adhesives as well as by the quality of each glued detail. To achieve simple bending and to reduce the welding procedure and the tolerances it introduces, a simple quasi-loft optimization method is used in the second test to directly obtain the
Fig. 7 Initial design test of the 'orange-peeling' methodology on a simple triangular shell structure, printed at a scale of 1:20
unrolled shape of the bent shell. The initial shell form is first redefined as a combination of a series of Y-shaped elements. By resolving each element into a central triangle and three strips that can be optimized with a planar lofting method, the whole geometry can be made developable. The bending process is then analysed with the above-mentioned FE method and tested with a 1:10 model to measure the deviations. Construction details as well as the building sequence for casting concrete on the formwork are tested, and the feasibility is verified with a small-scale model (Fig. 8). To demonstrate the feasibility of the building technique in a complicated design case of a shell structure, a more complex form was designed with the same methodology. The intercrossed geometry is then optimized into a combination of a series of lofted developable surfaces. To improve the building process, several side trusses are added to support the structure and to help position the shell surface. With such guides, the assembly of the 3d-printed formwork into the target geometry becomes a purely geometric problem (Fig. 9 Top). In the final building process, 10 mm Glass Fibre Reinforced Concrete (GRC) was cast on-site on the formwork with only
Fig. 8 Refinement of the developable surface and the casting test
Fig. 9 Assembly and concrete casting on the active-bent formwork for shell in the pre-research project
supports on the curved edges, to test the load-bearing capacity (Fig. 9 Bottom). The 80 kg 3d-printed formwork successfully carried the 400 kg dead load.
Fig. 10 Site selection and the initial concept generation of the prototype tile-vaulted shell
5 Workflow and the Prototyping Tile-Vaulted Brick Shell

5.1 Design Workflow

5.1.1 Concept Generation and Initial Geometry
As the geometric rule in the design process relies on the Y-shaped surface geometry, the geometry must first be defined in the early design stage as a linear system. In this sense, a shell can first be simplified as a branching system like a tree (Fig. 10). With new branches added onto the existing skeleton, a complicated shell geometry and its projection on the XY plane can be derived by giving an offset distance and redefining the boundaries as curves rather than the initial polylines. For the final design of the prototype tile-vaulted shell, a rectangular public courtyard was selected as the site of the project; it accommodates the shell within a 7 m * 7 m * 2 m space, together with a circular tension ring made of 6 mm steel as the possible support.
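The skeleton-to-boundary step can be prototyped quickly outside the CAD environment, for instance with Shapely's buffer operation; the leg coordinates and the offset distance below are purely illustrative, and in the actual workflow the offsetting is done on curves in Rhinoceros.

```python
from shapely.geometry import MultiLineString

# a Y-shaped skeleton in plan: three legs meeting at the origin (illustrative)
skeleton = MultiLineString([
    [(0.0, 0.0), (0.0, 3.0)],
    [(0.0, 0.0), (-2.6, -1.5)],
    [(0.0, 0.0), (2.6, -1.5)],
])

footprint = skeleton.buffer(0.7)   # the offset distance gives the plan outline
boundary = footprint.exterior      # closed curve replacing the initial polylines
```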
5.1.2 Form Finding and Optimization
The design of the final form of the shell consists of the initial form finding of a compression-only geometry, followed by an optimization process to make the geometry developable. With only the boundary curves as input in the initial design, a discretization process is applied to turn the base geometry into a fine mesh. Considering both the boundary curves and the Y-shaped midlines, surfaces can be defined and subdivided into quadrilateral- and triangular-based meshes. With
Fig. 11 Form-finding process of the shell geometry based on the TNA method
tools provided in the Python framework COMPAS (in the packages compas_main as well as compas_pattern) [11], a fully parametric design process can be established in the Grasshopper and Rhinoceros environment. With the processed mesh as the basic design concept, a form-finding procedure based on the Thrust Network Analysis (TNA) method can then be applied. To achieve this, the design tool RhinoVAULT is used to control both the form and force diagrams of the system, in order to find a final shape that both provides a compression-only form and satisfies the designers' multiple geometric requirements (local height and how the legs overlap) (Fig. 11). The geometry found in the form-finding process is mainly a doubly curved surface. A subsequent rationalization and optimization process is applied to make the geometry developable. In this process, the midlines of the surface and the boundary curves are selected first. By moving the midlines slightly higher in the Z-direction, the geometry can be changed into strips lofted between the midline curves and the corresponding boundary curves (Fig. 12). With this geometry, all the surfaces can be defined as developable, and the V-shaped cross section derived from the corrugation also stabilizes the structure in its weak direction. As the bending moments concentrate mostly at the foot areas of the structure, a local refinement is further made, redefining the local structure as conical surfaces by removing the centre parts, which are hard to bend (Fig. 13). With all these methods and a final subdivision of the surface, an optimized mesh is found as the final design and as input for the 3d printing fabrication process.
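For readers without RhinoVAULT at hand, the compression-only form finding can be approximated with the force-density method shipped with COMPAS; this is a stand-in, not the TNA workflow of the paper (control of the horizontal force diagram is exactly what RhinoVAULT adds). A minimal sketch against the COMPAS 0.x API:

```python
import compas
from compas.datastructures import Mesh
from compas.numerical import fd_numpy

mesh = Mesh.from_obj(compas.get('faces.obj'))    # small sample quad patch

vertices = [mesh.vertex_coordinates(v) for v in mesh.vertices()]
edges = list(mesh.edges())
fixed = [v for v in mesh.vertices() if mesh.vertex_degree(v) == 2]  # corners
q = [1.0] * len(edges)                            # uniform force densities
loads = [[0.0, 0.0, -0.1]] * len(vertices)        # downward load: hanging model

xyz, q, f, l, r = fd_numpy(vertices, edges, fixed, q, loads)

# flip the hanging (tension-only) form upside down -> compression-only vault
vault = [[x, y, -z] for x, y, z in xyz]
```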
Fig. 12 Basic shape refinement and optimization towards a developable surface
Fig. 13 Global and local refinements to enhance the stability of the structure
5.1.3 3d Printing Path Generation
The main consideration in the 3d printing process is to find a balance between the density of the printed pattern and the thickness of the printed sheet, as these two factors affect both the bending capacity and the final stiffness of the structure. In the final design, the mesh is generated based on the UV isocurves of the developable surface. At the same time, to avoid local buckling of the surface, diagonal edges are added to the system as additional layers. In this way, a final eight-layer printing pattern is designed for the robotic fabrication (Fig. 14).
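The layer logic can be illustrated in the UV space of the unrolled strips: two orthogonal isocurve families plus the two diagonal families, stacked up to eight layers. In the sketch below the grid density n and the stacking order are assumptions, not the values used for the pavilion.

```python
def pattern_layers(n=12, layers=8):
    """Polylines (in UV space on the unrolled strip) for the printing pattern:
    U lines, V lines and the two diagonal families against local buckling."""
    u_lines = [[(i / n, 0.0), (i / n, 1.0)] for i in range(n + 1)]
    v_lines = [[(0.0, j / n), (1.0, j / n)] for j in range(n + 1)]
    diag_a, diag_b = [], []
    for k in range(-n + 1, n):
        c = k / n
        u0, u1 = max(0.0, c), min(1.0, 1.0 + c)
        diag_a.append([(u0, u0 - c), (u1, u1 - c)])          # family u - v = c
        diag_b.append([(1 - u0, u0 - c), (1 - u1, u1 - c)])  # family u + v = 1 - c
    base = [u_lines, v_lines, diag_a, diag_b]
    return (base * 2)[:layers]   # repeat the four families to reach eight layers
```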
5.2 Fabrication

The fabrication process was carried out in a prefabrication factory with a gantry system composed of two KR120R12800 industrial robot arms. As the gantry system
Fig. 14 The eight-layer printing pattern designed for the robotic fabrication
limits the printing area to about 4 m * 8 m, the final unrolled mesh was divided into six individual pieces, designed as an overlapping system for the final assembly (Fig. 15). With the customized tool 'FURobot' [12], which is designed for robotic fabrication and programming in Grasshopper, the final SRC files are generated directly from the input mesh. In total, the panels encompassed a print length of 2,200 m and required approximately 750 h to print. The printed parts were relocated to the site and assembled piecewise, one base support at a time. Temporary bracing for the strip panels was provided at the side boundaries of the shell formwork and by support poles inserted below the shell structure at an average density of 4 poles/m2; the poles also doubled as a height adjustment and validation mechanism to ensure the geometric fidelity of the as-built formwork.
5.3 Bending Test and On-Site Assembly

Because the final bending-actuated assembly is, structurally, a large deformation of the planar sheet with obvious non-linear behaviour, both the global final shape and the local deformation during the bending process must be considered. In the analysis done in this research, the local behaviour of the sheets during bending is double-checked via quasi-static analysis with the explicit dynamic method in Abaqus. Through the simulation, a proper bending method is examined to understand its mechanism and
Fig. 15 Division of the final unrolled mesh into six individual pieces with overlaps for the final assembly
to help the designer identify the stress concentrations during the process (Fig. 16). The global behaviour, however, is not fully considered due to the complicated geometry: global convergence is hard to reach in the dynamic analysis, and this is still being examined in ongoing research. Nevertheless, as the strips are all coupled through the corrugation geometry, it is assumed that the bent form will be to a large extent similar to the designed form of the shell.
Fig. 16 Quasi-static simulation of the bending process showing the stress concentrations in the printed strips
The final assembly of the shell formwork was finished on-site piecewise, one support at a time (Fig. 17). Local connections are made with zip ties by matching corresponding points. The tile selected for application on the assembled geometry is a type of clay brick commonly found in vernacular construction in the Yangtze River delta region of China, which surrounds the site. Binding of the tiles, which measured
Fig. 17 On-site assembly of the 3d-printed formwork
Fig. 18 Construction details of the tile-vaulted shell with the bending-active formwork
approximately 200 mm × 100 mm × 20 mm, relied on a custom-prepared fast-setting cement mortar incorporating an additive called 801 adhesive. As with traditional tile-vaulting methods, a total of three layers were applied (Fig. 18). Structural stability considerations informed the tiling layout: at the top and bottom layers, the tiling orientation closely follows the vault's force paths and principal curvatures, which are approximately equivalent in this design. For the intermediate layer, tiles were applied at a 45° rotation to provide sufficient overlap with the two sandwiching layers. Approximately 4500 tiles were applied to the vault, which measures approximately 30 m2 in area; the tiling was completed over a period of four days (Figs. 19 and 20).
6 Conclusion

This paper presents a thoroughly documented research process for a novel construction technique for shell structures using 3d-printed bending-active plates. The whole process was developed through a multiphase design-to-fabrication joint workshop organized by MIT, ETH Zurich and Tongji University. Aiming at a systematic design strategy, the geometric considerations, the structural behaviour and the bending simulation are discussed and tested. Specifically, software tools such as the COMPAS framework, RhinoVAULT and FURobot are selected to work together in the
Fig. 19 Building sequences of the tile-vaulted brick shell
Fig. 20 The inside view of the completed brick shell pavilion
whole design process. To demonstrate the feasibility of this concept, several prototype pavilions were built from large 3d-printed mesh panels of modified plastic. The simple and fast assembly demonstrates an efficient building process for complicated shell forms. Parts of this paper are revised from previously published work, including the introduction of this methodology in the documented book of the workshop [13] and a conference paper discussing the design of this tile-vaulted pavilion at IASS 2019 [14]. The authors would therefore like to thank again all the participants in the ETH/MIT/Tongji joint workshop 'Robotic Force Printing'.
Acknowledgements Prof. Philippe Block, Prof. Philip F. Yuan, Dr. Ing. Xiang Wang, Kam-Ming Mark Tam, Gene Ting-Chun Kao, Zain Karsan, Alex Beaudouin, Ce Li, Ben Hoyle, Molly Mason, Weizhe Gao, Weiran Zhu, Zhe Guo, Dalma Foldesi, Hyerin Lee, Jung In Seo, Anna Vasileiou, Youyuan Luo, D, Xiao Zhang, Liming Zhang and Hua Chai. This research is funded by the National Natural Science Foundation of China (Grant No. 51578378).
References
1. Flügge, W. (2013). Stresses in shells. Springer Science & Business Media.
2. Adriaenssens, S., Block, P., Veenendaal, D., & Williams, C. (2014). Shell structures for architecture: Form finding and optimization. Routledge.
3. Piker, D. (2013). Kangaroo: Form finding with computational physics. Architectural Design, 83, 136–137.
4. Rippmann, M., Lachauer, L., & Block, P. (2012). RhinoVAULT: Designing funicular form with Rhino. Computer software. ETH Zurich.
5. Garlock, M. E. M., Billington, D. P., & Burger, N. (2008). Félix Candela: Engineer, builder, structural artist. Princeton University Art Museum.
6. Chilton, J. (2010). Heinz Isler's infinite spectrum: Form-finding in design. Architectural Design, 80, 64–71.
7. Hennicke, J., et al. (1974). Gitterschalen: Bericht über das japanisch-deutsche Forschungsprojekt STI, durchgeführt am Institut für Leichte Flächentragwerke (IL), Universität Stuttgart; Weiterführung im Sonderforschungsbereich 64 "Weitgespannte Flächentragwerke" der Deutschen Forschungsgemeinschaft. Krämer.
8. Lienhard, J. (2014). Bending-active structures: Form-finding strategies using elastic deformation in static and kinetic systems and the structural potentials therein. PhD thesis, University of Stuttgart.
9. Wang, X., Guo, Z., Zhang, X., Jin, J., & Yuan, P. F. (2019). Design, analysis and robotic fabrication of a bending-active shell structure with thin sheets based on curved-crease-folding technique.
10. Schleicher, S., Rastetter, A., La Magna, R., Schönbrunner, A., Haberbosch, N., & Knippers, J. (2015). Form-finding and design potentials of bending-active plate structures. In Modelling behaviour. Springer.
11. Van Mele, T., Liew, A., Mendez, T., & Rippmann, M. (2017). COMPAS: A framework for computational research in architecture and structures.
12. Fab-Union. (2019). FURobot, version 0.5.1. Computer software.
13. Yuan, P. F., & Block, P. (2019). Robotic force printing: A joint workshop of MIT/ETH/Tongji. Shanghai, China: Tongji University Press.
14. Wang, X., Tam, K.-M. M., Beaudouin-Mackay, A., Hoyle, B., Mason, M., Guo, Z., et al. (2019). Tile-vault construction on bending-actuated robotically 3D-printed formwork. In Proceedings of the IASS Symposium 2019, Barcelona.
A Computational Approach for Knitting 3D Composites Preforms

Yige Liu, Li Li, and Philip F. Yuan
Abstract This paper presents a computational approach for knitting net-shape preforms with bespoke 3D shapes and patterns. The approach takes partial knitting as the major shaping technique and uses its fabrication constraints to generate multicolored pixel-based knitting maps from given 3D meshes. The generation process includes five steps: (1) generation of wales, (2) generation of courses, (3) generation of 2D knitting maps, (4) stitch placement optimization, and (5) pattern variations. At the final stage, users obtain a knittable 3D mesh with each face representing a stitch, as well as a 2D pixel-based knitting map. The knittable 3D mesh allows designers to further design pattern variations; the 2D knitting map can be used directly to generate knitting information in knitting software, or can easily be followed by users.
Keywords 3D knitting · Machine knitting · Net shape · Composites preform · Knittable 3D mesh
1 Introduction

Knitting is one of the most ancient techniques in human civilization. With the development of knitting technology and advanced fibers, designers are increasingly exploring the potential of knitted textiles. In the architecture field, multiple knitted-textile hybrid prototypes have been investigated [1–8]: the "Listener" project responds to its surroundings, while the "Tower" and "Isoropia" projects feature lightweight systems made of bending-active load-carrying GFRP rods and CNC-knitted non-homogeneous textile membranes. Moreover, post-forming knitted textile composites [9] and pneumatic textile systems integrating
silicone tubes and textiles have also been explored in the architecture field [10]. Some designers have tested bespoke knitted tubes to form complex architectural systems, as in the "MyThread" Pavilion [11]; others look to the possibilities of using knitted technical textiles as lightweight stay-in-place formwork for complex concrete shells [12–14]. In this research, knitted textiles are used as net-shape preforms for textile-reinforced composite structures. Knitted net-shape preforms minimize material wastage and make it possible to customize material density, porosity, and microstructure locally. However, there is still no simple, effective tool to transform designers' 3D models into a machine's knitting data. In the current textile industry, the transformation from arbitrary 3D shapes or 3D patterns to knitting information is laborious and time-consuming, and relies heavily on experienced technicians. Therefore, this research aims to provide a computational approach allowing smooth transitions between designers' 3D models and machine-knitting information.
2 State of the Art

In terms of generating knitting information from 3D models, multiple approaches have been proposed. Igarashi et al. introduce a system that generates hand-knitting patterns for rotund 3D animal models [15]. Yuksel et al. propose a yarn-level modeling technique for knitted garments [16], and Wu et al. further develop this technique into a fully automatic pipeline capable of converting arbitrary 3D meshes into knit models [17]. Yet, this pipeline does not guarantee that the result is knittable. Later, Wu et al. introduce knittable stitch meshes, adding shift paths to connect neighboring knitting rows and short rows to create shape variations [18]. This method requires the user to have certain knowledge of knitting in order to determine the knitting structure, and currently the input geometries are not arbitrary. McCann et al. present a compiler that automatically transforms shape primitives into low-level instructions for knitting machines [19]. The shape primitives are sheets and tubes with parameters of height, circumference, short rows, time, skew, and spin. Desired 3D geometries are achieved through combinations of primitives and manipulations of parameters. Yet, this compiler cannot take arbitrary 3D geometry as direct input. Popescu et al. describe an approach that transforms undevelopable surfaces into 2D knitting patterns [13]. The 2D knitting patterns require the yarn to be cut or replaced by a new yarn at the end or start of the short rows, and manual transformation of the 2D knitting patterns is necessary for the knitting software. Narayanan et al. provide a fully automated computational approach that transforms arbitrary 3D meshes directly into knitting instructions for computer-controlled V-bed knitting machines [20]. They use time-field-guided procedures to produce knitting graphs based on the input 3D surfaces, and such graphs are transformed into low-level knitting operations by a tracing algorithm.
Our approach aims at generating 2D knitting maps from designers' models with arbitrary shapes and textures. Compared with previous research, the features of our approach can be summarized as follows:
1. The final 3D knit mesh generated by this approach can be knitted continuously using a single yarn.
2. The final 3D knit mesh shows the placement of each stitch on the 3D geometry and allows designers to customize knit patterns at the stitch level.
3. Partial knitting is the major shaping technique, so the final knitting map can be applied to both common domestic single-bed machines and industrial knitting machines to produce 3D preforms.
4. The 2D knitting map can be used directly to generate knitting information in knitting software without manual transformation. It allows quick production of large-scale preforms with hundreds or thousands of knitting wales and courses. The knitting map also carries a layer of index numbers for wales and courses, making it easy for users to read and follow.
3 Partial Knitting

Like other knitted textiles, textiles using partial-knitting techniques are composed of repetitive stitches in a course–wale structure. In our approach, such a textile is abstracted into a network of quadrilaterals and a few triangles: each quadrilateral represents a regular stitch and each triangle represents the end stitch of a short course (Fig. 1). There are multiple knitting techniques for creating 3D structures on a flat knitting machine, such as tubular knitting, transfer stitches, and partial knitting [21]. Tubular knitting uses double-bed knitting machines to create seamless tubes. Transfer stitches locally increase or decrease the width of the fabric; yet exceeding a certain number of transfer stitches in a knitting course may cause the knitting machine to fail, since transfer stitches increase the yarn tension. Partial knitting creates a raised or sunken area by adding short knitting courses, and a 3D surface can be achieved through the accumulation of short courses.
Fig. 1 Abstraction of a knitted fabric into a network of quadrilaterals and a few triangles
Fig. 2 Fabrication process and features of partial knitting. a Knit a full-length course; b knit only the stitches on the right side, creating a short course. Stitches on the left side do not knit and are still held on needles; c knit a new course from the end stitch of last knitting course. Stitches on the left side still do not knit; d stitches on left side later join knitting again, and a raised area is formed by the short courses created in steps (b) and (c); Steps (a), (b), (c), and (d) could be continuously knitted using a single yarn
Our approach adopts partial knitting as the major shaping technique because of its ease of operation and fewer fabrication constraints. An ideal partial-knitting structure can form a 3D geometry using a single continuous yarn. The major fabrication features of partial knitting are the following: the knitting direction of each course is always opposite to that of its previous and next courses, and every new course starts knitting from the same wale where the end stitch of the previous course lies (Fig. 2).
4 Description of the Approach

Inputs to this approach include a 3D mesh, stitch parameters, and reference points. The input mesh should be a triangulated mesh. The input stitch parameters are measured from knitted textile samples: an average stitch's width along the course direction (W) and its height along the wale direction (H). These parameters often change when machine parameters, knitting parameters, stitch structures, or yarn types change. The input points can be either one point or multiple adjacent points; multiple points are used as the approximation of a curve (Fig. 3). This approach is implemented in the 3D modeling tools Rhinoceros 5.0 and Grasshopper, with custom code for generating knitting courses, 2D patterns, and stitch
Fig. 3 a input triangular mesh; b input stitch parameters include an average stitch’s width along course direction (W) and height along wale direction (H); c input points could be either one or multiple adjacent points
placement optimization written in Python. The generation process includes five steps:
1. Generation of Wales. The surface is covered with isocurves at a constant distance W based on a geodesic distance field. Since fabric using the partial-knitting technique has parallel wales, we take these isocurves as wale edges.
2. Generation of Courses. Each wale edge is divided at constant intervals by points, and each point is connected to the closest point on the neighboring curves. These connections constitute the course edges. Course edges and wale edges together constitute the initial 3D knit mesh.
3. Generation of the 2D Knitting Map. The initial 3D knit mesh is mapped onto 2D as a knitting map. Each stitch is marked as a square with a specific color representing its typology.
4. Stitch Placement Optimization. A three-step optimization adjusts the stitch structures over the 3D knit mesh and 2D knitting map, following the fabrication constraints of partial knitting.
5. Pattern Variations. Based on the final 3D knit mesh, stitches for pattern variations are selected and represented on the 3D knit mesh and the 2D knitting map.
4.1 Generation of Wales

Wale edges are generated based on a geodesic distance field with equal intervals of the loop width W. There are multiple ways of computing a geodesic distance field on a given mesh [22, 23]; we use the geodesic algorithm proposed by Surazhsky, as implemented by Laurent Delrieu for Rhinoceros 5.0 and Grasshopper. This algorithm uses a parameterization of the distance function over edges to determine the geodesic distance field of a triangle mesh. Before generating knitting courses, the sequence and curve direction of the wale edges need to be adjusted. Wale edges should be arranged from near to far from the input points and aligned in the same curve direction. Since we assume the knitting direction of the first course starts from the left, and the knitting maps are read from the bottom up, the direction of the first wale edge should be examined and the directions of the remaining wale edges aligned to it. To be specific, we take the starting point of the first wale edge (P0) and its closest point on the second wale edge (P1) as sampling points. These sampling points are mapped into a plane where P0 is the origin and the curve tangent vector V0 at P0 is the positive direction of the x-axis. The curve direction is flipped if P1 is not located in the first or second quadrant. Furthermore, when the wale edges at a certain geodesic distance include two or more curves, the curves are sorted following the curve direction. Lastly, since we do not consider connections between the first and last knitting courses, wale edges should not be closed curves; closed curves are cut with an opening about twice the width of an average stitch (Fig. 4).
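The quadrant test for the first wale edge reduces to the sign of P1's coordinate along the local y-axis of the frame (P0, V0); a minimal sketch with 2D tuples in the projection plane (names illustrative):

```python
def must_flip_first_wale(p0, v0, p1):
    """True if the first wale edge needs flipping: p1 falls below the local
    x-axis (quadrants III/IV) of the frame with origin p0 and +x axis v0."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    y_local = -v0[1] * dx + v0[0] * dy   # component along v0 rotated +90 deg
    return y_local < 0.0

# usage: if must_flip_first_wale(P0, V0, P1), reverse the first wale edge,
# then align the directions of all remaining wale edges to it
```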
Fig. 4 a Wale edges have aligned curve directions and are sorted from near to far from the input point; b a reference coordinate defined by P0, P1, and V0 is used to judge the curve direction of first wale edge; c when there are multiple curves at certain distance from the input points, they are arranged following the direction of wale edge; d closed wale edges are not considered in this approach and should be cut with small openings
4.2 Generation of Courses

To generate courses, each wale edge is divided into segments of equal length, around twice the H value (the reason for twice the H value is explained later, in the stitch placement optimization step). Each dividing point is then visited and connected to the closest dividing point on the next wale edge. During the connection, a maximum connection length is set to guarantee the accuracy of the model, and connections exceeding this length are eliminated. Yet these steps cannot guarantee that every point is connected to a point on its previous wale edge, so points without connections to their previous wale edge are revisited and connected. While generating courses, our tool detects whether newly added lines cross existing lines; if so, the end point of the new line is adjusted to the end point of its closest intersected line (Fig. 5).
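The connection rule can be sketched as follows; this pure-Python stand-in covers the closest-point pass, the length cut-off and the revisit pass, while the crossing detection described above is omitted. `math.dist` requires Python 3.8+.

```python
import math

def connect_courses(curr, nxt, max_len):
    """Connect dividing points on the current wale edge to their closest
    points on the next wale edge (points are coordinate tuples)."""
    edges, used = [], set()
    for p in curr:
        q = min(nxt, key=lambda pt: math.dist(p, pt))
        if math.dist(p, q) <= max_len:        # drop over-long connections
            edges.append((p, q))
            used.add(nxt.index(q))
    for j, q in enumerate(nxt):               # revisit pass: orphan points on
        if j not in used:                     # the next edge get connected back
            p = min(curr, key=lambda pt: math.dist(pt, q))
            edges.append((p, q))
    return edges
```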
4.3 Generation of the 2D Knitting Map

Course edges and wale edges together constitute the initial 3D knit mesh. Each intersecting point of a course edge and a wale edge is given a label (i, j), with i representing the index number of the course edge and j representing the index number of the wale edge (Fig. 6). In the 3D knit mesh, each stitch is represented as an area enclosed by four neighboring points; for example, stitch (i, j) is the area defined by point (i, j), point (i + 1, j), point (i, j + 1), and point (i + 1, j + 1). Each stitch is assigned a type attribute based on the relationship between the four points, of which there are four possibilities. When point (i, j), point (i + 1, j), point (i, j + 1), and point (i + 1, j + 1) do not overlap each other, they form a quadrilateral area, representing a "Regular Stitch." When only point (i, j) and point
Fig. 5 a Course edges and wale edges together constitute the initial 3D knit mesh; b wale edges are divided by equal length; c only connecting dividing points to their next wale edges may leave points without connections to their previous wale edges; those points will be revisited and connected to their previous wale edges; d connections over maximum length are eliminated; e intersected connections are adjusted
Fig. 6 The network of course edges and wale edges in (a) is translated into a 2D pattern in (b)
(i + 1, j) overlap, or only point (i, j + 1) and point (i + 1, j + 1) overlap, the four points define a triangular area, representing a short course's "End Stitch." When point (i, j) overlaps point (i + 1, j) and point (i, j + 1) overlaps point (i + 1, j + 1), no area is defined between the four points, and stitch (i, j) belongs to the "No Stitch" type. After labeling each stitch in the initial knit structure, the stitches are mapped onto a 2D plane as 1 mm squares. The i value of each square serves as its y-coordinate, representing the index number of a specific knitting course; the j value is its x-coordinate, representing the index number of a knitting wale. Type attributes are expressed in colors, such as dark gray for "Regular Stitch" and "End Stitch," and white for "No Stitch."
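The type assignment therefore reduces to two overlap tests on the four corner points. A minimal sketch, with `P` as a lookup from (i, j) labels to coordinates and a small tolerance standing in for exact overlap:

```python
def stitch_type(P, i, j, tol=1e-6):
    """Classify stitch (i, j) from its four corner points."""
    def same(a, b):
        return all(abs(x - y) <= tol for x, y in zip(a, b))

    left = same(P[i, j], P[i + 1, j])              # corners on wale j coincide
    right = same(P[i, j + 1], P[i + 1, j + 1])     # corners on wale j+1 coincide
    if left and right:
        return 'NO_STITCH'                          # no area between the points
    if left or right:
        return 'END_STITCH'                         # triangle: short-course end
    return 'REGULAR_STITCH'                         # quadrilateral
```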
4.4 Stitch Placement Optimization

The initial knit mesh is usually not an ideal partial-knitting structure and cannot be knitted continuously using a single yarn; therefore, we introduce a three-step
Fig. 7 a Optimized 3D knit mesh could be knitted using a single yarn, a continuous yellow curve connects each stitch and represents the knitting sequence; b double course edges; c adjust stitches’ placement
optimization into our approach. Specifically, the optimization ensures that the last stitch of every course lies in the same wale as the start stitch of the next course. Step 1 deals with situations where neighboring courses have no common wales. Steps 2 and 3 adjust even courses and odd courses, respectively, to bring the last stitch of every course into the same wale as the start stitch of its next course (Fig. 7). In the final 3D mesh, a one-stroke curve representing the knitting sequence can be drawn through all the stitches. The optimization of the 2D knitting map and the 3D knit mesh is carried out simultaneously. During optimization, our approach assumes that every even course is knitted from the knitter's left to right and every odd course from right to left. Steps 2 and 3 are inspired by the approach proposed by Narayanan et al.; the major difference is that our approach directly adjusts the knit structure, whereas Narayanan's approach applies a tracing algorithm through an existing knit graph. Obtaining a correct knit mesh is important for the further step of pattern design over the 3D mesh (Fig. 8).
4.4.1 In Case of No Common Wales in Neighboring Courses
For every course i and course i + 1, if there are no stitches with common wales, extra stitches are added to course i until the two courses share a stitch in the same wale. The size of these newly added stitches is half the height of regular stitches. This way of adding extra stitches sacrifices local accuracy to maintain the overall knit structure.
4.4.2 Adjust Stitches in Even Courses
Since the height of each stitch is twice the regular stitch height, here we divide each stitch into two smaller units of regular height. By doing so, the end stitch of
Fig. 8 a In case of no common wales in adjacent courses, such as in course i and course i + 1, extra small stitches are added; b adjust even courses by doubling the courses; c adjust odd courses by varying the stitch structure of every odd course and its following two courses. Transformations are controlled within three courses
every even course and the start stitch of its next odd course are in the same wale; furthermore, the structure of every even course is now exactly the same as that of its next course.
4.4.3 Adjust Stitches in Odd Courses
The adjustment of odd courses depends on a comparison of the stitches' wale index numbers. Specifically, for an odd course i whose last stitch is in wale j0, and its next course i + 1 whose start stitch is in wale j1, we compare j0 and j1. If j0 does not equal j1, the stitches of courses i and i + 2 within the wale domain [j0, j1] exchange their stitch types; if j0 equals j1, no adjustment is needed. This optimization starts from the first odd course and proceeds to the penultimate odd course. This way of adjustment limits the transformations to within three courses, and the outline of those three courses remains unchanged, minimizing the impact on the global stitch structure.
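On the 2D map, this adjustment is a plain swap of stitch types between course i and course i + 2 over the wale interval bounded by j0 and j1; a minimal sketch over a grid of type codes (rows are courses, columns are wales; names illustrative):

```python
def adjust_odd_course(types, i, j0, j1):
    """Exchange the stitch types of courses i and i + 2 over the wale
    domain [j0, j1]; types[i][j] is the type code of stitch (i, j)."""
    lo, hi = min(j0, j1), max(j0, j1)
    for j in range(lo, hi + 1):
        types[i][j], types[i + 2][j] = types[i + 2][j], types[i][j]

# the sweep runs from the first odd course to the penultimate odd course,
# comparing j0 and j1 for each pair of neighbouring courses before swapping
```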
4.5 Pattern Variations

Our approach allows users to knit textiles with customized patterns. The 3D knit mesh records the location of every single stitch on the 3D geometry; when given designed 3D pattern models, the stitches closest to those models are selected and represented in a different color. For representation, stitches are modeled as beam systems based on the 3D knit mesh. Each stitch is an inverted triangular frame made of three beams and fits within a face of the 3D knit mesh. Each triangular frame has a wide top and a pinched bottom,
Fig. 9 a Each stitch is represented as an inverted triangular frame made of three beams; b each triangular frame is located within a face of the 3D knit mesh
Fig. 10 a Input 3D model and patterns; b representation of 3D knit mesh and pattern variations; c 2D knitting map; d knitted preform
and the three beams are of identical diameter. Since the knitted preforms are later sprayed with resin for the composite structure, we assume all the connections between beams are fixed (Figs. 9 and 10).
5 File Conversion

The final 2D knitting map contains two or more colors, each representing a specific stitch type and a series of machine actions. Most stitches of the "Regular Stitch" or "End Stitch" type are mapped to the 2D pattern as gray squares; each gray square represents a series of basic loop-forming actions. All stitches of the "No Stitch" type are mapped to the 2D pattern as white squares, which represent no knitting actions. Orange and pink squares represent the knitting actions of decreasing a stitch to its right stitch and to its left stitch, respectively; they cover "End Stitch" or "Regular Stitch" squares at the ends of wales. The final 2D pattern has a layer of numbers showing the wale index number and course index number of each stitch, so a user can follow the 2D pattern from the bottom up to knit each course. To use an automatic knitting machine, the layer of index numbers can be turned off and the pattern image processed in Photoshop, turning each square into a pixel. This pixel-based image can then be used directly in knitting software, such as M1 Plus for Stoll. Machine actions are represented by colors and can be set in advance using the software's color arrangement function. After setting the color arrangement and loading a default cast-on template, the knitting information can be generated immediately, without manual transformation (Fig. 11).
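Writing the map out as a pixel image for the knitting software can be sketched with Pillow; the color codes follow the naming above, while the grid contents, color values and file name are illustrative:

```python
from PIL import Image

COLORS = {
    'REGULAR': (128, 128, 128),   # gray: basic loop-forming actions
    'NO':      (255, 255, 255),   # white: no knitting action
    'DEC_R':   (255, 165, 0),     # orange: decrease to the right stitch
    'DEC_L':   (255, 192, 203),   # pink: decrease to the left stitch
}

def write_knitting_map(types, path='knitting_map.png'):
    """One pixel per stitch; row i is course i, and the map reads bottom-up."""
    h, w = len(types), len(types[0])
    img = Image.new('RGB', (w, h))
    for i, row in enumerate(types):
        for j, t in enumerate(row):
            img.putpixel((j, h - 1 - i), COLORS[t])   # course 0 at the bottom
    img.save(path)
```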
Fig. 11 a The 2D knitting map has a layer of course and wale index numbers for users to follow; b the knitting map in the software M1 Plus is the same as the input 2D knitting map, with a default cast-on template added at the bottom; c interface of the M1 Plus software; d the machine actions corresponding to the gray, pink, and orange colors can be defined in the color arrangement function
6 Results

We tested our approach with different 3D meshes, such as a helicoid, a saddle, a horn tube, and a twisted tube. The yarn is a heat-setting 100% polyester yarn, so the textile maintains its shape after mold removal (Fig. 12). Objects with customized patterns were also tested. We tried different stitch structures for texture variation, such as two-color jacquard and lace stitch. The different stitch structures were distributed following given design patterns, such as voronoi cells, arbitrary curves, and points (Fig. 13). We also tested a customized full-scale 3D preform for a composite chair. The chair is developed from the typology of a saddle surface with strengthened flanges on the lateral and back edges. The final knitting pattern covers 5302 courses and 407
Fig. 12 Input 3D meshes, final knittable 3D meshes, knitted samples, and 2D knitting maps of helicoid (a), saddle (b), horn tube (c), and twisted tube (d)
Fig. 13 a Representation model; b knitted preforms; c multicolored 2D knitting map
wales. It was knitted on a Stoll CMS 502 HP+, and the whole knitting process took 3 h 22 min (Fig. 14). The physical tests mentioned above were made on different knitting machines, proving that this approach works for high-end industrial knitting machines, such as the Stoll CMS 502 HP+, domestic knitting machines, such as the Brother KH970/KR850, and even toy knitting machines, such as the Silver Reed LK100. The major difference is that the Silver Reed LK100 can only produce basic 3D preforms without complex pattern variations; it has only 100 needles, so the width of the preforms is limited. The Brother
Fig. 14 a Design model; b knitted preform; c knitted chair after curing with resin; d knitting map
Fig. 15 a Silver Reed LK100 single-bed knitting machine; b Brother KH970/KR850 V-bed knitting machine; c Stoll CMS 502 HP+ automated knitting machine
KH970/KR850 can produce 3D preforms with more texture variation choices. Since it is semi-automated, the knitting process is time-consuming, taking 2–3 h to knit a 3D preform of around 200 courses. The Stoll CMS 502 HP+ machine allows the production of larger and more accurate 3D preforms as well as more complex patterns. Yet not every school is equipped with an industrial knitting machine (Fig. 15).
7 Discussion and Outlook

Shape Limitation. Currently, our approach is mainly suitable for 3D meshes with open edges. For tube-like shapes, our approach leaves one or more seams, and those seams need to be processed after knitting. Our approach cannot directly work with
Fig. 16 a Bar graphs showing the length variations of stitches’ edges compared with the input stitch parameters; b stitches’ edges with larger deformation appears more frequently at the boundary of given meshes
3D meshes with holes or 3D meshes with more than one significantly raised or sunken area; those geometries can be achieved by segmenting the original geometry into small patches (Fig. 16).
Accuracy. The knitting results fit the input mesh in general: our approach keeps 85% of the edges within a length variation of 20%. Large deformations appear mostly at the boundary of the given meshes. In addition, the resolution of the input mesh, the accuracy of the input stitch parameters, stitch variations, the tension distribution during the knitting process, and the molding technique also affect the knitting result. A good mesh structure, a higher-resolution input mesh, and more accurate knitting parameters may increase the accuracy. Stitch structure variations may result in uneven stitch sizes, which can be improved by setting different knitting densities for different stitch types. In terms of tension distribution, partial knitting may cause an uneven distribution of knitting tension within the textile, affecting the stitches' actual sizes as well as the success rate of the loop-forming process; one solution to this problem is to introduce special sinkers and roller systems. In terms of molding techniques, molds composed of flat plates may create bumps in a surface; this can be improved by increasing the density of plates or by replacing plate molds with 3D-printed or CNC-milled molds with smooth surfaces.
Size Limitation. The width of knitted preforms is restricted by the width of the needle bed; in other words, the number of wales should not exceed the number of needles in one machine. While the width is limited by the knitting bed, the length of a knitted preform is relatively unrestricted. Furthermore, the size of a 3D preform is also limited by computing power, computing time, and storage size during the generation of the 3D knit mesh and 2D knitting map. Patches, larger stitches, and computers with stronger computing and storage power are helpful for producing large-scale preforms for architectural applications.
Comparison with 3D Printing. 3D knitting offers an alternative way to produce 3D composite objects. Yet knitted 3D textiles are too flexible to stand under their self-weight and need extra molds or supporting frames to form the desired shape; the design of the molding system is therefore part of the design of the overall composite preform. Future work will focus on knitted composite structures for architecture. The main focus includes how to design a lightweight mold system and integrate it with the knitted textile to form a structural whole. Beyond this, form finding and structural simulation for heterogeneous textile-reinforced composites will also be explored.
Acknowledgements This research is supported by the Open Projects Fund of the Key Laboratory of Ecology and Energy-saving Study of Dense Habitat (Tongji University), Ministry of Education, 201810102. The machine-knitting experiments were done in collaboration with Stoll and Chemtax Industrial Co Ltd.
References
1. La Magna, R., et al. (2018). Isoropia: An encompassing approach for the design, analysis and form-finding of bending-active textile hybrids. In Proceedings of IASS Annual Symposia (pp. 1–8). International Association for Shell and Spatial Structures (IASS).
2. Tamke, M. (2015). Designing CNC knit for hybrid membrane and bending active structures. In Textiles Composites and Inflatable Structures VII: Proceedings of the VII International Conference on Textile Composites and Inflatable Structures. Barcelona: CIMNE.
3. Tamke, M., et al. (2016). Bespoke materials for bespoke textile architecture. In Proceedings of IASS Annual Symposia (pp. 1–10). International Association for Shell and Spatial Structures (IASS).
4. Thomsen, M. R., et al. (2016). Knit as bespoke material practice for architecture. In ACADIA 2016: Posthuman Frontiers: Data, Designers, and Cognitive Machines—Proceedings of the 36th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 280–289). Ann Arbor.
5. Ahlquist, S., & Menges, A. (2013). Frameworks for computational design of textile micro-architectures and material behavior in forming complex force-active structures. In ACADIA 13: Adaptive Architecture—Proceedings of the 33rd Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 281–292). Cambridge.
6. Ahlquist, S., et al. (2013). Exploring materials reciprocities for textile-hybrid systems as spatial structures. In M. Stacey (Ed.), Prototyping Architecture: The Conference Papers (pp. 187–210). London.
7. Ahlquist, S., et al. (2013). Physical and numerical prototyping for integrated bending and form-active textile hybrid structures. In Rethinking Prototyping: Proceedings of the Design Modelling Symposium (pp. 1–14). Berlin.
8. Ahlquist, S. (2015). Integrating differentiated knit logics and pre-stress in textile hybrid structures. In Modelling Behaviour (pp. 101–111). Cham: Springer.
9. Ahlquist, S. (2014). Post-forming composite morphologies: Materialization and design methods for inducing form through textile material behavior. In ACADIA 14: Design Agency—Proceedings of the 34th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 267–276). Los Angeles.
10. Ahlquist, S., McGee, W., & Sharmin, S. (2017). PneumaKnit: Actuated architectures through wale- and course-wise tubular knit-constrained pneumatic systems. In ACADIA 2017: Disciplines & Disruption—Proceedings of the 37th Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 38–51). Cambridge.
11. Sabin, J. E. (2013). myThread Pavilion: Generative fabrication in knitting processes. In ACADIA 13: Adaptive Architecture—Proceedings of the 33rd Annual Conference of the Association for Computer Aided Design in Architecture (ACADIA) (pp. 347–354). Cambridge.
12. Popescu, M., et al. (2018). Building in concrete with an ultra-lightweight knitted stay-in-place formwork: Prototype of a concrete shell bridge. In Structures. Elsevier.
13. Popescu, M., et al. (2017). Automated generation of knit patterns for non-developable surfaces. In K. de Rycke et al. (Eds.), Humanizing Digital Reality—Design Modelling Symposium (pp. 271–284). Paris: Springer.
14. Popescu, M., et al. (2016). Complex concrete casting: Knitting stay-in-place fabric formwork. In Proceedings of IASS Annual Symposia. International Association for Shell and Spatial Structures (IASS).
15. Igarashi, Y., Igarashi, T., & Suzuki, H. (2008). Knitting a 3D model. Computer Graphics Forum, 27(7), 1737–1743.
16. Yuksel, C., et al. (2012). Stitch meshes for modeling knitted clothing with yarn-level detail. ACM Transactions on Graphics (TOG), 31(4), 37.
17. Wu, K., et al. (2018). Stitch meshing. ACM Transactions on Graphics (TOG), 37(4), 130.
18. Wu, K., Swan, H., & Yuksel, C. (2019). Knittable stitch meshes. ACM Transactions on Graphics (TOG), 38(1), 10.
19. McCann, J., et al. (2016). A compiler for 3D machine knitting. ACM Transactions on Graphics (TOG), 35(4), 49.
20. Narayanan, V., et al. (2018). Automatic machine knitting of 3D meshes. ACM Transactions on Graphics (TOG), 37(3), 35.
21. Underwood, J. (2009). The design of 3D shape knitted preforms. RMIT.
22. Surazhsky, V., et al. (2005). Fast exact and approximate geodesics on meshes. ACM Transactions on Graphics (TOG), 24(3), 553–560.
23. Crane, K., Weischedel, C., & Wardetzky, M. (2013). Geodesics in heat: A new approach to computing distance based on heat flow. ACM Transactions on Graphics (TOG), 32(5), 152.