The Metabolic Pathway Engineering Handbook: Fundamentals [1 ed.] 1439802963, 9781439802960

This first volume of the Metabolic Pathway Engineering Handbook provides an overview of metabolic pathway engineering wi


English Pages 680 Year 2009


THE METABOLIC PATHWAY ENGINEERING HANDBOOK Fundamentals

The Metabolic Pathway Engineering Handbook, 1st Edition
The Metabolic Pathway Engineering Handbook: Fundamentals
The Metabolic Pathway Engineering Handbook: Tools and Applications


Edited by

Christina D. Smolke

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2010 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper
10 9 8 7 6 5 4 3 2 1

International Standard Book Number-13: 978-1-4398-0296-0 (Hardcover)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

The metabolic pathway engineering handbook : fundamentals / editor, Christina D. Smolke.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-1-4398-0296-0 (hardcover : alk. paper)
1. Genetic engineering--Handbooks, manuals, etc. 2. Biosynthesis--Handbooks, manuals, etc. I. Smolke, Christina D. II. Title.
[DNLM: 1. Genetic Engineering--methods. 2. Metabolic Networks and Pathways. 3. Biological Products--metabolism. 4. Biotechnology--methods. 5. Models, Biological. QU 450 M5871 2010]
TP248.6.M478 2010
660.6’5--dc22
2008051635

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com

Contents

Introduction
Editor
Contributors

Section I  Cellular Metabolism
Andy Ekins and Vincent J.J. Martin

1  Solute Transport Processes in the Cell
   Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset

2  Catabolism and Metabolic Fueling Processes
   Olubolaji Akinterinwa and Patrick C. Cirino

3  Biosynthesis of Cellular Building Blocks: The Prerequisites of Life
   Zachary L. Fowler, Efendi Leonard, and Mattheos Koffas

4  Polymerization of Building Blocks to Macromolecules: Polyhydroxyalkanoates as an Example
   Si Jae Park, Soon Ho Hong, and Sang Yup Lee

5  Rare Metabolic Conversions—Harvesting Diversity through Nature
   Manuel Ferrer and Peter N. Golyshin

Section II  Balances and Reaction Models
Walter M. van Gulik

6  Growth Nutrients and Diversity
   Joseph J. Heijnen

7  Mass Balances, Rates, and Experiments
   Joseph J. Heijnen

8  Data Reconciliation and Error Detection
   Peter J.T. Verheijen

9  Black Box Models for Growth and Product Formation
   Joseph J. Heijnen

10  Metabolic Models for Growth and Product Formation
    Walter M. van Gulik

11  A Thermodynamic Description of Microbial Growth and Product Formation
    Joseph J. Heijnen

Section III  Bacterial Transcriptional Regulation of Metabolism
James C. Liao

12  Transcribing Metabolism Genes: Lessons from a Feral Promoter
    Alan J. Wolfe

13  Regulation of Secondary Metabolism in Bacteria
    Wenjun Zhang, Joshua P. Ferreira, and Yi Tang

14  A Synthetic Approach to Transcriptional Regulatory Engineering
    Wilson W. Wong and James C. Liao

Section IV  Modeling Tools for Metabolic Engineering
Costas D. Maranas

15  Metabolic Flux Analysis
    Maria I. Klapa

16  Metabolic Control Analysis
    Joseph J. Heijnen

17  Structure and Flux Analysis of Metabolic Networks
    Kiran Raosaheb Patil, Prashant Madhusudan Bapat, and Jens Nielsen

18  Constraint-Based Genome-Scale Models of Cellular Metabolism
    Radhakrishnan Mahadevan

19  Multiscale Modeling of Metabolic Regulation
    C.A. Leclerc and Jeffrey D. Varner

20  Validation of Metabolic Models
    Sang Yup Lee, Hyohak Song, Tae Yong Kim, and Sung Bum Sohn

Section V  Developing Appropriate Hosts for Metabolic Engineering
Jens Nielsen

21  Escherichia coli as a Well-Developed Host for Metabolic Engineering
    Eva Nordberg Karlsson, Louise Johansson, Olle Holst, and Gunnar Lidén

22  Metabolic Engineering in Yeast
    Maurizio Bettiga, Marie F. Gorwa-Grauslund, and Bärbel Hahn-Hägerdal

23  Metabolic Engineering of Bacillus subtilis
    John Perkins, Markus Wyss, Hans-Peter Hohmann, and Uwe Sauer

24  Metabolic Engineering of Streptomyces
    Irina Borodina, Anna Eliasson, and Jens Nielsen

25  Metabolic Engineering of Filamentous Fungi
    Mikael Rørdam Andersen, Kanchana Rucksomtawin, Gerald Hofmann, and Jens Nielsen

26  Metabolic Engineering of Mammalian Cells
    Lake-Ee Quek and Lars Keld Nielsen

Index

Introduction

Progression of Biological Synthesis Methods toward Commercial Relevance

The advent of recombinant DNA in the 1970s brought transformative technologies for the synthesis and manipulation of artificial genetic material. The ability to amplify, cut, and piece together fragments of DNA outside of a cell and to get (or transform) that DNA into a cell of interest resulted in a set of molecular cloning tools that enabled the field of genetic engineering. In genetic engineering, foreign DNA that encodes new or altered functions or traits is inserted into an organism of interest. Many early applications of recombinant DNA technology focused on heterologous protein production in microbial hosts. The first medicine made through recombinant DNA technology to be approved by the United States Food and Drug Administration was synthetic “human” insulin produced in Escherichia coli. This was an important early application of recombinant DNA technology, as the success of producing a safe and effective synthetic hormone in a bacterium led to widespread acceptance of the technology and to significant resources and funding being directed to its support and advancement. As the technologies in support of synthesizing and manipulating artificial DNA matured and advanced, so did the applications to which they were applied. The early successful applications of recombinant DNA technology resulted in alternative routes to the synthesis of medicines (such as insulin, human growth factor, and erythropoietin), vaccines, and even genetically modified organisms, including crops that exhibit more desirable traits. Technologies were developed for the manipulation of artificial DNA in both prokaryotic and eukaryotic host organisms, including mammalian and plant cells.
In addition, inspired by the diversity of natural products, chemicals, and materials synthesized by biological systems in the natural world, researchers began to look beyond applications limited to the synthesis of a single heterologous protein product in a cellular host to more complicated engineering feats. In particular, these new applications focused on the manipulation of sets or combinations of proteins, or enzymes, that act in conjunction within metabolic pathways in a cell to convert energy and precursor chemicals into desired natural and non-natural products. The production of chemicals, materials, and energy through biology presents an alternative to traditional chemical synthesis routes. While the development of chemical synthesis methods for the production of valuable chemicals and small molecule pharmaceuticals is a more mature field and has demonstrated significant successes, many chemicals remain difficult to synthesize through such strategies, particularly those with many chiral centers. Biological catalysts, or enzymes, have demonstrated remarkable adeptness at the synthesis of very complex molecules. Cellular biosynthesis strategies also offer several advantages over traditional chemical synthesis strategies in that the former are often conducted under less harsh conditions, thereby enabling “green” synthesis strategies that are associated with the production of fewer toxic by-products. In addition, cellular biosynthesis takes advantage of the cell’s natural ability to replenish enzymes and cofactors and to provide precursors from often inexpensive and renewable starting materials. Such advantages are particularly compelling in light of the global challenges we face today in energy, the environment, and sustainability. However, new challenges are presented when manipulating the metabolic pathways in cellular hosts that link energy sources and starting materials to products of commercial interest. The unique challenges faced in engineering metabolic pathways, when compared to the early genetic engineering applications of heterologous protein production, require the development of new enabling technologies, spanning experimental and analytical techniques and computational tools.

The Field of Metabolic Engineering

Metabolic engineering is a field that includes the construction, redirection, and manipulation of cellular metabolism through the alteration of endogenous and/or heterologous enzyme activities and levels to achieve the biosynthesis or biocatalysis of desired compounds. Researchers in metabolic engineering often view the biological system as a chemical factory that converts starting materials to different value-added products. Because the yield or productivity of the process is linked to its commercial viability, the ability to precisely regulate the flow of energy and materials through different cellular pathways becomes critical to the optimization of the overall process, drawing parallels to the more traditional engineering discipline of chemical process design. The basic tenet of metabolic engineering, the use of biology as a technology for the conversion of energy, chemicals, and materials to value-added products, has a long history. Early applications can be cited, even prior to the development of recombinant DNA technology, in the food and beverage industry, where more traditional methods of strain development based on evolution, mating, and selection strategies were used to develop more desirable production hosts for particular applications. However, recombinant DNA technology made it possible to introduce new enzymatic activities and pathways into production hosts, allowing access to different energy resources and starting materials and to the production of different chemicals and materials. Such technologies support the forward design of more complex synthetic pathways in host organisms or the targeted manipulation of endogenous pathways, enabling more directed manipulation of the cellular host. Current metabolic engineering efforts are focused on the synthesis of products such as chemical commodities, small molecule drugs, and alternative energy sources including biofuels.
In addition, significant effort is directed to the engineering of host metabolisms to utilize renewable, low-cost energy resources. Many of the challenges faced in metabolic engineering are related to the engineering of energy and material flow within complex systems. More specifically, metabolic pathways make up complex interconnected networks in cells, which can rarely be manipulated in isolation from the rest of the network. Highlighting the interconnections between cellular metabolites is the fact that all metabolites are made from a set of 12 common precursors. In addition, the flow of metabolites through a network of enzymes, and in the background of other cellular enzymes that may exhibit activity on these metabolites, is often controlled through layered processes that act at different time scales, implement dynamic feedback control, and utilize localization and transport. Metabolic engineering requires a breadth of skill sets to tackle different points of system design and as a result has developed into a very interdisciplinary field. Researchers with expertise spanning a variety of disciplines, including chemical engineering, biological engineering, environmental engineering, biochemistry, molecular biology, cell biology, bioinformatics, and control theory, are working in different areas of metabolic engineering. However, as an academic endeavor, metabolic engineering has remained an interdisciplinary research discipline, with courses covering aspects of the field that depend on the expertise of the department in which they are taught. As it has matured, metabolic engineering has gained greater industrial significance. Initial industrial interest, largely from groups within larger chemical companies, was directed to the synthesis of chemical commodities in microorganisms. However, many smaller startup companies have developed in recent years that are focused on the synthesis of specialty chemicals such as pharmaceuticals and biofuels, on the development of computational and modeling programs to direct metabolic engineering efforts, and on the discovery and development of new enzyme activities in support of engineering new synthetic pathways into host organisms. The intersection of metabolic engineering with other emerging areas of systems and synthetic biology presents exciting opportunities to develop solutions to many of the global challenges we face in energy, the environment, health and medicine, resources, and sustainability, and will likely continue to fuel a significant sector of the biotechnology industry in future years.

An Overview of The Metabolic Pathway Engineering Handbook

The purpose of The Metabolic Pathway Engineering Handbook is to provide a thorough overview of the field of metabolic engineering. Each section provides an overview, written by experts in that area, of different aspects of a particular topic that is a central component of the field. Sections are introduced by section editors to provide a perspective on the topic and a description of how the chapters in that section link together to form an integrated overview of that particular topic. The sections are split into two books: the first book focuses on “fundamentals,” or basic principles of metabolic engineering, and the second book focuses on “tools and applications” in metabolic engineering. Due to its organization, the handbook can be used as a reference book and read for individual sections or chapters, or it can be used as a book for advanced courses in metabolic engineering.

Section I in The Metabolic Pathway Engineering Handbook: Fundamentals provides an overview of the basic processes that support cellular metabolism. The boundary of a cell is defined by its cellular membrane, which acts to separate cellular constituents from the environment. Metabolism begins with systems that allow the import of nutrients and starting materials across the cellular membrane, and efforts to engineer transport systems for particular chemicals have been important strategies in enabling cells to convert those chemicals to desired products. Once inside the cell, nutrients are broken down into common precursors for metabolic syntheses, in processes that also provide the energy and reducing power necessary for cell survival. In addition, precursors are channeled into the synthesis of important building blocks that the cell then utilizes to build larger macromolecules, including lipids, nucleic acids, and proteins.
An understanding of the central metabolic pathways and the general flow of metabolism through a small number of common precursors and carriers is critical to being able to effectively link new synthetic nutrient or product pathways to endogenous metabolisms. Finally, the wealth of untapped diversity in nature, particularly in the microbial biosphere, provides significant opportunities in harvesting new enzymatic activities from nature that can be applied to the production of new chemicals and materials in engineered hosts.

Section II provides an overview of mass balances and reaction models applied to predicting product formation and microbial growth in fermentation processes. Various models with varying levels of detail have been proposed and utilized in the field to provide predictions of product yield and cell growth. Conversion rates are calculated from mass balances and rate equations that take into account the basic nutrients and constituents of cellular systems. Different models, such as those based on thermodynamic or metabolic network constraints, can be utilized to predict product yield and cell growth in fermentation processes. Different models may be more or less appropriate based on the specifics of the fermentation. The application of such models to experimental systems can allow minimization of error in detection strategies, resulting in optimized control schemes for fermentations based on such experimental measurements.

Section III provides an overview of transcriptional regulation of metabolic pathways in bacterial systems. Bacterial cells use a variety of mechanisms to regulate the transcription of enzymes involved in primary and secondary metabolisms. Transcriptional regulatory strategies exist that regulate a small set of genes in response to specific environmental chemicals, such as operon-specific regulation and two-component systems.
However, other strategies exist that regulate larger sets of genes in response to significant environmental changes, such as heat shock or nitrogen starvation, through sigma factors and global transcription factors. An understanding of the strategies used to regulate the expression of enzymes in a cellular host is critical in metabolic engineering to developing effective strategies to alter the expression of endogenous enzymes and to design synthetic systems that exhibit more sophisticated regulatory schemes to balance and coordinate the expression of multiple enzymes to ultimately optimize flux through desired pathways.

Section IV is an overview of modeling tools that have been developed for metabolic engineering applications. Earlier modeling and computation efforts that resulted in tools for metabolic flux analysis (MFA) and metabolic control analysis (MCA) have been very powerful for the elucidation of fluxes and control strategies in metabolic networks given partial sets of data. Computation tools based on network and graph concepts have enabled structure and flux analyses that provide optimization tools for metabolic engineering. In addition, metabolic network reconstruction and modeling efforts have resulted in genome-scale models of cellular metabolism for specific organisms based on sets of constraints that enable prediction of flux distributions under different conditions. Multi-scale modeling tools are extending current predictive capabilities by integrating stoichiometry, kinetics, and regulatory and control responses in metabolic networks, and can be utilized by metabolic engineers to predict the dynamic metabolic response.

Section V provides an overview of common cellular hosts that are used in metabolic engineering applications. In particular, the bacterial hosts Escherichia coli, Bacillus subtilis, and Streptomyces have been utilized in various metabolic engineering applications, with E. coli being the most well-developed and utilized host, largely due to the genetic tools available for manipulating pathways in this host organism.
In addition, two lower eukaryotic hosts, yeast and filamentous fungi, have been utilized in various metabolic engineering applications for the production of natural products or for pathway enzymes that are more readily expressed in functional forms in eukaryotic organisms. Finally, much effort has also been put toward the development of mammalian cell culture hosts for the production of metabolites and products that are more readily produced in mammalian cells. Each host may present advantages and disadvantages in the synthesis of a desired chemical based on the genetic tools available for manipulating pathways and the endogenous metabolism and processing pathways present in that organism, such that the selection of a suitable host is driven largely by the properties of the pathway of interest.

Section I in The Metabolic Pathway Engineering Handbook: Tools and Applications provides an overview of the evolutionary tools widely in use in the engineering of metabolic enzymes and networks. Evolutionary strategies have been traditionally used in metabolic engineering to select for desired phenotypes in host organisms. As biological organisms naturally undergo processes of evolution and selection, design strategies that integrate evolutionary engineering objectives with metabolic engineering objectives may result in a more robustly performing engineered cellular system. Directed evolution is a laboratory tool that is used to mimic the evolutionary process in a test tube, by generating diversity in cellular components and then screening or selecting through this diversity for optimized component properties. Various experimental strategies have been utilized for generating and screening through component diversity. In addition, computational tools have been developed that optimize the design of laboratory evolution strategies.
These experimental and computational tools have been applied to the directed evolution of enzymes, regulatory systems, pathways, and whole genomes for the optimization of flux through targeted metabolic pathways.

Section II provides an overview of gene expression tools that have been utilized in metabolic engineering applications. Various tools have been developed that regulate DNA copy number and enable chromosomal engineering in host organisms. In addition, a variety of other genetic tools have been developed that precisely regulate gene expression levels through post-transcriptional and translational mechanisms. Still other tools have been developed that regulate the activity of enzymes through post-translational engineering strategies. The application of the tools described in this section is critical to balancing the expression of multiple enzymes, such that individual conversion steps do not limit product yield, toxic intermediates do not accumulate, and cellular resources and energy are efficiently utilized by the host cell. Several examples exist of engineered systems that have utilized such genetic tools for the optimization of flux through metabolic pathways.


Section III provides an overview of emerging technologies and their application to metabolic engineering. Genome-wide technologies that allow global profiling of cellular transcripts, proteins, metabolites, and phenotypes are critical for efficient troubleshooting and debugging of engineered systems. Bioinformatics tools that allow for management and analysis of the vast amounts of data collected from these techniques are also critical. As these technologies mature and become more available, their implementation as standard techniques in metabolic engineering will improve our understanding of the engineered system response and result in efficient troubleshooting and optimization strategies.

Section IV provides an overview of key future prospects in metabolic engineering. The integration of metabolic engineering with new computational tools, such as genome-scale models, and with new approaches for analyzing and understanding complex systems, such as systems biology, is rapidly advancing the success with which metabolic networks can be forward engineered. In addition, alternative strategies to cellular biosynthesis that remove complications associated with engineering living, evolving systems, such as cell-free synthesis systems, have demonstrated impressive successes. Finally, the modeling and optimization of engineered metabolic pathways in silico, prior to construction and characterization, will significantly transform the field of metabolic engineering and integrate advances in computational modeling, systems biology, and engineering design.

Section V provides an overview of common tools that are utilized to determine flux through metabolic pathways. Various types of isotope labeling strategies have been widely used to monitor flux through metabolic pathways, where the data from such experiments are typically integrated into the modeling tools described in Section IV.
In addition, various analytical strategies are utilized to profile cellular metabolites, where current and future efforts are focused on developing strategies to profile and quantify global metabolite levels.

Section VI provides an overview of various metabolic engineering application areas. One broad application area is focused on the engineering and regulation of the energy state, cofactor supply, and redox balance of cellular hosts. This is a challenge that affects most if not all metabolic engineering applications, where the introduction of new pathways or the manipulation of endogenous pathways can result in imbalances in cellular pathways and stress responses. Metabolic engineering applications are generally directed toward the synthesis of commercially relevant molecules including specialty or commodity chemicals, small molecule drugs, or alternative energy sources. Each of these application areas of metabolic engineering presents distinct challenges that must be addressed in the process design based on chemical and pathway complexity, market cost of the product, volume demand of the product, end use of the product, and purity requirements.
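
As a brief illustration of the mass-balance and flux-analysis ideas summarized above for Sections II and IV of the Fundamentals volume, the sketch below solves steady-state metabolite balances (S·v = 0) for a small network. The five-reaction network, the flux labels v1 to v5, the function name solve_fluxes, and the two "measured" values are all assumptions invented for this example and do not come from the handbook. It covers only the simplest, exactly determined case, so it should be read as a minimal sketch and not as a full MFA implementation.

```python
from fractions import Fraction

# Hypothetical toy network (illustrative only, not from the handbook):
#   v1:   -> A   (substrate uptake)
#   v2: A -> B
#   v3: A -> C
#   v4: B ->     (product secretion)
#   v5: C ->     (drain to biomass)
# Rows are the balanced metabolites A, B, C; columns are fluxes v1..v5.
S = [
    [1, -1, -1,  0,  0],  # A
    [0,  1,  0, -1,  0],  # B
    [0,  0,  1,  0, -1],  # C
]


def solve_fluxes(S, measured):
    """Solve the steady-state balances S.v = 0 for the unmeasured fluxes.

    `measured` maps a column index to its measured flux value. The sketch
    assumes the remaining system is square and non-singular.
    """
    n_flux = len(S[0])
    unknown = [j for j in range(n_flux) if j not in measured]
    # Move measured terms to the right-hand side: S_u v_u = -S_m v_m
    A = [[Fraction(row[j]) for j in unknown] for row in S]
    b = [-sum(Fraction(row[j]) * Fraction(measured[j]) for j in measured)
         for row in S]
    n = len(unknown)
    if len(A) != n:
        raise ValueError("balances do not form a square system")
    # Gauss-Jordan elimination with row swaps, in exact arithmetic
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            raise ValueError("system is singular")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
                b[r] -= f * b[col]
    v = dict(measured)
    for i, j in enumerate(unknown):
        v[j] = b[i] / A[i][i]
    return [v[j] for j in range(n_flux)]


# Assumed measurements: uptake v1 = 10 and product secretion v4 = 6
fluxes = solve_fluxes(S, {0: 10, 3: 6})
print([float(x) for x in fluxes])
```

With more unknown fluxes than balances, the system becomes underdetermined; that is the situation in which additional measurements, isotope labeling data, or constraint-based optimization are typically brought in.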

Metabolic Engineering: Looking toward the Future

Metabolic engineering as a field has evolved significantly over the past 10 to 15 years, in large part due to the scientific and technological advances made during this time frame in support of this application area. The future prospects of metabolic engineering are extremely exciting, and as other supporting scientific and engineering fields mature, the field is likely to see transformative advances that direct it further toward an engineering discipline. There are several key supporting fields that will aid in directing this transformation.

First, enzyme engineering and enzyme discovery will be critical to expanding the diversity of natural and non-natural products that can be produced in engineered organisms. Much of the living world has not been cultured and characterized. Even for those organisms that have been cultured, we often lack genome sequence information, have not mapped functions to many of the sequenced genes, or have not characterized many of the enzyme activities. For example, many pathways in plants responsible for the synthesis of diverse pharmacologically relevant molecules have not been elucidated, although many of these activities and their corresponding genes are currently present in large expressed sequence tag (EST) libraries. Because we cannot forward design enzymes to exhibit specific catalytic activities, the existing limitations in characterized enzyme activities severely limit the pathways that we can reconstruct in organisms. In addition, programs that will allow us to predict and design enzyme function from sequence will be critically enabling for the design of new activities that have not been recovered from natural systems.

Second, because metabolic engineering is largely a systems engineering challenge, continued advances in systems biology will provide important insights into the function of biological systems that will inform engineering design and strategies directed at manipulating metabolic pathways. Many analytic techniques in support of systems biology, including strategies that allow global profiling of transcript, protein, and metabolite levels, are providing vast amounts of information regarding levels of cellular constituents under different conditions. In addition, computational tools are being developed to process the vast amounts of data coming from these techniques. Newer and future efforts in systems biology must focus on taking the information coming from these techniques and abstracting from it the organizing principles governing cellular metabolism and regulation. An understanding of how cells generally layer metabolic pathways with different regulatory strategies will allow engineers to design more robustly performing synthetic pathways that are better integrated with the endogenous metabolic pathways. In addition, such understanding will allow better identification of manipulation points in endogenous networks to alter flux through pathways.

Third, the integration of information theory and control theory with systems biology and metabolic engineering will likely have a significant impact on our understanding of biological systems.
Such tools will enable a deeper understanding of architectures and properties of complex networks that support robustness, evolvability, and fragility of the system, providing a conceptual framework to systems biology. In addition, such tools will allow researchers to more quantitatively examine models of control schemes around metabolic pathways to better elucidate the design principles around regulating lux through metabolic pathways. Such tools can also be used to examine synthetic network and control scheme designs and guide the more efective design of engineered systems. Finally, metabolic engineering is seeing a transformation with the emerging ield of synthetic biology. Synthetic biology is the design, construction, and characterization of biological systems using engineering design principles. To support a framework for engineering biology, synthetic biology is rooted in foundational technologies that enable the construction of more complex, heterologous networks in living systems. With advances in DNA sequencing and synthesis it is becoming common practice to synthesize entire genes and pathways from scratch, no longer limiting researchers to the physical DNA that they obtain from natural organisms. In addition, abstraction frameworks have been proposed to enable rapid assembling and reassembling of basic biological components (or parts) into larger networks (or devices) and systems, supporting the rapid prototyping and troubleshooting and reliable construction of complex metabolic pathways in cellular hosts (or chassis). An example of a synthetic biology approach to the rapid prototyping of a metabolic pathway in Escherichia coli was recently described (http://parts. mit.edu/wiki/index.php/MIT_2006). here are also eforts directed to the engineering of speciic chassis, or cellular hosts, optimized for metabolic engineering applications. 
Finally, enabling genetically encoded technologies are being developed for use in precise and quantitative manipulation of pathway components such as enzymes.

Christina D. Smolke Editor-in-Chief

Editor

Christina Smolke is an assistant professor in the Department of Bioengineering at Stanford University. She graduated with a BS in chemical engineering with a minor in biology from the University of Southern California in 1997. She conducted her graduate training as a National Science Foundation Fellow in the Chemical Engineering Department at the University of California at Berkeley and earned her PhD in 2001. Christina conducted her postdoctoral training as a National Institutes of Health Fellow in cell biology at UC Berkeley. She started her independent research program as an assistant professor in the Division of Chemistry and Chemical Engineering at the California Institute of Technology, where she worked from 2003 to 2008. She has pioneered a research program in developing foundational technologies for the design and construction of engineered ligand-responsive RNA-based regulatory molecules, their integration into molecular computation and signal integration strategies, and their reliable implementation into diverse cellular engineering applications. These technologies are resulting in scalable platforms for the construction of molecular tools that work across many cellular systems and allow regulation of targeted gene expression levels in response to diverse endogenous or exogenous molecular ligands. Her research is rapidly advancing current capabilities of noninvasive detection of cellular state and programming cellular function. In particular, her laboratory is examining the application of these tools to the optimization of metabolic pathway engineering strategies in organisms such as yeast. Dr. Smolke’s innovative research program has recently been recognized with the receipt of a National Science Foundation CAREER Award, a Beckman Young Investigator Award, an Alfred P. Sloan Research Fellowship, and the listing of Dr. Smolke as one of Technology Review’s Top 100 Young Innovators in the World.

She is also a member and adjunct faculty of the Comprehensive Cancer Center’s Cancer Immunotherapeutics Program at the City of Hope, where she has several translationally oriented collaborative projects exploring the clinical applications of these technologies. She is the inventor of over nine patents and serves on the Scientific Advisory Board of Codon Devices. Dr. Smolke is currently serving as the President of the Institute of Biological Engineering. She is a member of AIChE, ACS, the RNA Society, and IBE.


Contributors

Olubolaji Akinterinwa

Andy Ekins

Peter N. Golyshin

Department of Chemical Engineering Pennsylvania State University University Park, Pennsylvania

Department of Biology Centre for Structural and Functional Genomics Concordia University Montreal, Quebec, Canada

Department of Environmental Microbiology HZI-Helmholtz Centre for Infection Research Braunschweig, Germany

Anna Eliasson

Marie F. Gorwa-Grauslund

Mikael Rørdam Andersen Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Prashant Madhusudan Bapat

Adelfo Escalante

Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Maurizio Bettiga Department of Applied Microbiology Lund University Lund, Sweden

Irina Borodina Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Patrick C. Cirino Department of Chemical Engineering Pennsylvania State University University Park, Pennsylvania

Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico

Joshua P. Ferreira Department of Chemical and Biomolecular Engineering University of California Los Angeles, California

Manuel Ferrer

Department of Applied Microbiology Lund University Lund, Sweden

Guillermo Gosset Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico

Bärbel Hahn-Hägerdal Department of Applied Microbiology Lund University Lund, Sweden

Department of Biocatalysis Institute of Catalysis Consejo Superior de Investigaciones Científicas Madrid, Spain

Joseph J. Heijnen

Zachary L. Fowler

Gerald Hofmann

Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York

Center for Microbial Biotechnology, BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Bioprocess Technology Group Department of Biotechnology Delft University of Technology Delft, the Netherlands


Hans-Peter Hohmann

C.A. Leclerc

Vincent J.J. Martin

DSM Nutritional Products Ltd Basel, Switzerland

Department of Chemical Engineering McGill University Montreal, Quebec, Canada

Department of Biology Centre for Structural and Functional Genomics Concordia University Montreal, Quebec, Canada

Olle Holst Department of Biotechnology Lund University Lund, Sweden

Sang Yup Lee

Department of Chemical Engineering and Bioengineering University of Ulsan Ulsan, Republic of Korea

Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea

Louise Johansson

Effendi Leonard

Department of Chemical Engineering Lund University Lund, Sweden

Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York

Soon Ho Hong

Eva Nordberg Karlsson Department of Biotechnology Lund University Lund, Sweden

Tae Yong Kim Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea

Maria I. Klapa

James C. Liao Chemical and Biomolecular Engineering Department University of California Los Angeles, California

Gunnar Lidén Department of Chemical Engineering Lund University Lund, Sweden

Radhakrishnan Mahadevan

Alfredo Martínez Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico

Jens Nielsen Systems Biology Department of Chemical and Biological Engineering Chalmers University of Technology Gothenburg, Sweden and Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Lars Keld Nielsen Australian Institute for Bioengineering and Nanotechnology The University of Queensland Brisbane, Australia

Department of Chemical and Biomolecular Engineering Institute of Chemical Engineering and High-Temperature Chemical Processes Foundation for Research and Technology-Hellas Patras, Greece

Department of Chemical Engineering and Applied Chemistry Institute of Biomaterials and Biomedical Engineering University of Toronto Toronto, Ontario, Canada

Mattheos Koffas

Costas D. Maranas

Kiran Raosaheb Patil

Department of Chemical and Biological Engineering State University of New York at Buffalo Buffalo, New York

Department of Chemical Engineering Pennsylvania State University Fenske Laboratory University Park, Pennsylvania

Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Si Jae Park Corporate R&D LG Chem, Ltd. Daejeon, Republic of Korea


John Perkins

Christina D. Smolke

Jeffrey D. Varner

DSM Nutritional Products Ltd Basel, Switzerland

Division of Chemistry and Chemical Engineering California Institute of Technology Pasadena, California

Department of Chemical and Biomolecular Engineering Cornell University Ithaca, New York

Seung Bum Sohn

Peter J.T. Verheijen

Lake-Ee Quek Australian Institute for Bioengineering and Nanotechnology The University of Queensland Brisbane, Australia

Manuel Rivera Cellular Engineering Biocatalysis Department Biotechnology Institute National Autonomous University of Mexico Cuernavaca, Mexico

Kanchana Rueksomtawin Center for Microbial Biotechnology BioCentrum-DTU Technical University of Denmark Lyngby, Denmark

Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea

Hyohak Song Department of Chemical and Biomolecular Engineering Center for Systems and Synthetic Biotechnology Institute for the BioCentury Korea Advanced Institute of Science and Technology Daejeon, Korea

Department of Biotechnology Delft University of Technology Delft, the Netherlands

Alan J. Wolfe Department of Microbiology and Immunology Loyola University at Chicago Stritch School of Medicine Maywood, Illinois

Wilson W. Wong Chemical and Biomolecular Engineering Department University of California Los Angeles, California

Yi Tang Department of Chemical and Biomolecular Engineering University of California Los Angeles, California

Markus Wyss DSM Nutritional Products Ltd Basel, Switzerland

Uwe Sauer

Walter M. van Gulik

Wenjun Zhang

Institute for Molecular Systems Biology ETH Zürich Zürich, Switzerland

Bioprocess Technology Group Department of Biotechnology Delft University of Technology Delft, the Netherlands

Department of Chemical and Biomolecular Engineering University of California Los Angeles, California

I Cellular Metabolism Andy Ekins and Vincent J.J. Martin Concordia University

1 Solute Transport Processes in the Cell
Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset
Introduction • Structure and Function of the Bacterial Membrane • The Transporter Classification (TC) System

2 Catabolism and Metabolic Fueling Processes
Olubolaji Akinterinwa and Patrick C. Cirino
Introduction • Classification of Organisms • Thermodynamics of Fueling Processes • Products of Fueling Processes • Redox Potentials and Mobile Electron Carriers • Examples of Catabolic Processes in Different Organisms • Concluding Remarks

3 Biosynthesis of Cellular Building Blocks: The Prerequisites of Life
Zachary L. Fowler, Effendi Leonard, and Mattheos Koffas
Introduction • Amino Acid Biosynthesis • Nucleotides as Building Blocks • Synthesis of Carbohydrates for Building Cells • Cell Synthesis of Lipids

4 Polymerization of Building Blocks to Macromolecules: Polyhydroxyalkanoates as an Example
Si Jae Park, Soon Ho Hong, and Sang Yup Lee
Introduction • PHAs • PHA Synthases • Metabolic Engineering of Microorganisms for PHA Production • Conclusion

5 Rare Metabolic Conversions—Harvesting Diversity through Nature
Manuel Ferrer and Peter N. Golyshin
Introduction • How Diverse Are Functional Groups? • Diversity of Enzymes and Current Frontiers for Bioconversions • Main Chemical Conversions Mediated by Enzymes: Putative “Rare” Conversions • How Can New Catalytic Functions Be Achieved? • Recent Advances in Metagenomics: The Untapped Reservoir of Proteins from Unculturable Microbes

THE FIELD OF METABOLIC ENGINEERING has advanced over time and will undoubtedly continue to do so, based on a solid scientific foundation and continued research and innovation. Of critical importance is a solid understanding of the metabolism of the cell, which will ultimately be manipulated through various techniques to produce a desired end product or, alternatively, to remove or break down an undesirable one. This section describes the well-defined knowledge of how cells, in particular Escherichia coli, transport a variety of nutrients and use them, through a variety of metabolic pathways, to derive energy and synthesize the wide spectrum of cellular components required for the maintenance of a cell. In some instances, the alteration of such metabolic pathways of a particular cell may lead to the production of a desired product, while in others the synthesis of the desired product may require the introduction of foreign genes, isolated and characterized from other organisms, in order to allow synthesis to proceed. Furthermore, with the advent of metagenomics, one can sift out genes allowing unique metabolic conversions that have yet to be described in cultured microorganisms.

While the organism of choice for many metabolic engineering studies is E. coli, all cells are, by definition, enveloped by a membrane that separates cellular components from the extracellular environment. Highly efficient import and export systems have evolved to allow exchange across this barrier. A sound knowledge of the transport systems that import the nutrients required for cell growth and drive the metabolism of the cell is essential in order to ensure that the desired metabolic pathway receives the necessary precursors and energy required for the production of a selected compound. As the outer membrane of E. coli allows the passive diffusion only of molecules with a molecular weight less than approximately 600 Da, it is of utmost importance to determine whether the import of precursors, for instance, can cause a bottleneck in the synthesis of a particular compound. Additionally, the type of transport system present in the cell can have an impact on the carbon flux within a cell. As an example, modification of the E. coli phosphotransferase system has been a strategy successfully applied to metabolically engineered strains (Gosset, 2005). Manipulation of the transport systems can increase the diversity of nutrients imported, while manipulation of the regulatory systems of the cell can allow the simultaneous import and use of multiple carbon sources, as is the case for carbon catabolite repression mutants (Dien et al., 2002).

As nutrients are catabolized, precursors for metabolic syntheses are generated along with the energy and reducing power required to drive the synthesis of all the components required by the cell. Solid knowledge of the metabolic pathways within the cell allows one to ensure that the proper precursors, reducing power, and energy are in ample supply in order to produce the molecule of interest. Furthermore, culture components and conditions can be altered to enhance the efficiency of a particular metabolic conversion.

In some instances, one may wish to overproduce a compound in an organism that does not naturally produce it. One such example is the production of polyhydroxyalkanoates (PHAs) in E. coli, an organism that does not naturally produce PHAs. It is therefore necessary to express PHA synthase genes isolated from foreign organisms. Additionally, in order to increase production of the desired PHA, it is crucial to evaluate the metabolic flux of the host organism and perhaps amplify certain endogenous pathways to increase the availability of precursors and reducing power, without decreasing the overall health of the expressing organism. In the case of poly(3-hydroxybutyrate) [P(3HB)] production, it was found that overexpression of glucose-6-phosphate dehydrogenase and 6-phosphogluconate dehydrogenase led to an increase in the NADPH/NADP+ ratio and a subsequent rise in the concentration of P(3HB). There was, however, a detrimental effect on the producing cells, and the observed increase in P(3HB) production was due to a lower cell concentration (Lim et al., 2002). In another experiment, 2-D gel electrophoresis and metabolic flux analysis were performed on E. coli producing P(3HB), revealing an increase in certain glycolytic pathway enzymes. Subsequently, amplification of the glycolytic pathway enzymes led to increased production of acetyl-CoA, which could then be used to increase yields of P(3HB) (Park and Lee, unpublished results).
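The P(3HB) observation above turns on the difference between the amount of polymer in the culture and the amount per unit of cell mass. The following is a minimal sketch of that arithmetic, assuming the reported quantity is normalized to dry cell weight. The function name `phb_content` and all numbers are hypothetical and chosen only for illustration; none are taken from Lim et al. (2002).

```python
def phb_content(phb_g_per_l: float, cell_dry_weight_g_per_l: float) -> float:
    """Return P(3HB) as a fraction of total dry cell weight (both in g/L)."""
    if cell_dry_weight_g_per_l <= 0:
        raise ValueError("cell dry weight must be positive")
    return phb_g_per_l / cell_dry_weight_g_per_l


# Hypothetical values for illustration only (not measured data).
control_titer, control_cdw = 2.0, 10.0        # g/L P(3HB), g/L total dry cell weight
engineered_titer, engineered_cdw = 2.0, 6.0   # same titer, but fewer cells

control_content = phb_content(control_titer, control_cdw)
engineered_content = phb_content(engineered_titer, engineered_cdw)

print(f"control:    titer {control_titer} g/L, content {control_content:.0%}")
print(f"engineered: titer {engineered_titer} g/L, content {engineered_content:.0%}")
```

In this sketch the engineered strain shows a higher P(3HB) content even though the volumetric titer is unchanged, which is why a cell-mass-normalized measure is best read alongside the cell concentration.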


The pursuit of rare conversions has, up to this point, focused on the ability of cultured organisms and their enzymes to perform such functions. The products of such rare conversions are invaluable to a variety of industries, including the agricultural, pharmaceutical, food additive, and bioremediation fields. While many advances have been made in the realm of protein engineering using techniques such as directed evolution, it is reasonable to assume that an even greater diversity exists within the genomes of unculturable microorganisms. The diversity of the cultured microbial world has led to the discovery of many rare conversions; there remains, however, a large pool of untapped genetic material within the many “unculturable” microorganisms that are currently estimated to represent close to 99% of the microbial world (Fütterer et al., 2004). Given that each sequenced microbial genome yields on average 30–50% of genes with unknown function (Bode and Müller, 2005), and that the recent shotgun sequencing of DNA isolated from the Sargasso Sea revealed more than 1.2 million genes of unknown function (Venter et al., 2004), it appears reasonable to assume that there exists a vast pool of untapped genetic resources that can be applied to all realms of biotechnology.

References

Bode, H.B., and Müller, R. The impact of bacterial genomics on natural product research. Angew. Chem. Int. Ed. Engl., 44, 6828, 2005.
Dien, B.S., Nichols, N.N., and Bothast, R.J. Fermentation of sugar mixtures using Escherichia coli catabolite repression mutants engineered for production of L-lactic acid. J. Ind. Microbiol. Biotechnol., 29, 221, 2002.
Fütterer, O., et al. Genome sequence of Picrophilus torridus and its implications for life around pH 0. Proc. Natl. Acad. Sci. USA, 101, 9091, 2004.
Gosset, G. Improvement of Escherichia coli production strains by modification of the phosphoenolpyruvate:sugar phosphotransferase system. Microb. Cell Fact., 4, 14, 2005.
Lim, S.J., et al. Amplification of the NADPH-related genes zwf and gnd for the oddball biosynthesis of PHB in an E. coli transformant harboring a cloned phbCAB operon. J. Biosci. Bioeng., 93, 543, 2002.
Venter, J.C., et al. Environmental genome shotgun sequencing of the Sargasso Sea. Science, 304, 66, 2004.

1 Solute Transport Processes in the Cell

Adelfo Escalante, Alfredo Martínez, Manuel Rivera, and Guillermo Gosset National Autonomous University of Mexico

1.1 Introduction
1.2 Structure and Function of the Bacterial Membrane
Structure of the Cellular Membrane • Functions • Kinetics of Transport Processes
1.3 The Transporter Classification (TC) System
Channels and Pores • Electrochemical Potential-Driven Transporters • Primary Active Transporters • Group Translocators • Transmembrane Electron Flow Systems
References

1.1 Introduction

The cell membrane constitutes a hydrophobic barrier that isolates the cytoplasm from the external medium. The entry and exit of most of the nutrients required for cell growth and the byproducts generated by metabolism are highly restricted by this cellular structure. However, to sustain high growth rates, microbes require a high rate of nutrient import. The presence of specialized transport proteins in the membrane allows the cell to circumvent the permeability restrictions imposed by this barrier. Analyses of microbial genomes have revealed that approximately 10% of the genes encode proteins involved in transport [1]. These transport systems participate in the import and export of different classes of molecules and also in other important cellular functions. They allow the entry of nutrients to sustain metabolism and of ion species to maintain concentration gradients leading to membrane potential and energy generation. Transporters allow the excretion of metabolic byproducts and other toxic substances, such as drugs or certain metal ions. Transport systems also participate in the secretion of lipids, carbohydrates, and proteins into membrane(s) or the external medium. They enable the transfer of nucleic acids between organisms, contributing to microbial diversity. Finally, transporters participate in the uptake of different types of signaling molecules, such as alarmones and hormones, thus allowing cellular communication [2].

Solute transport and metabolism are linked processes in the cell. Genetic organization in bacteria frequently reflects this functional coupling by the clustering of genes encoding both transport and metabolic activities in transcriptional units. This association is generally observed in operons encoding catabolic pathways for carbon sources [3]. Transport and regulatory systems participate in the process whereby the bacterial cell can select from a mixture of nutrients those that afford the highest growth rate [4]. In addition, the differential expression of genes encoding distinct transporters for a specific substrate allows the cell to select the transport mechanism according to the physiological state and environmental conditions [5].

Transport systems are potential targets for modification with the aim of improving microbial production strains. Metabolic engineering efforts usually focus on modifying metabolic enzyme activities. However, it can be envisioned that high-performance production strains will also require the modification of other cellular functions, including transport. Modification of transport systems can result in the improvement of several cellular properties, including: (a) increasing the range of carbon source utilization [6]; (b) increasing metabolic precursor availability for the synthesis of amino acids, shikimate pathway intermediates, TCA cycle intermediates, and fermentation products like ethanol [7–10]; (c) increasing the efficiency of sugar mixture utilization by partial disruption of catabolite repression [11]; and (d) controlling overflow metabolism, thus reducing acetate production [12].

1.2 Structure and Function of the Bacterial Membrane

1.2.1 Structure of the Cellular Membrane

The cell membrane, also known as the cytoplasmic membrane, plasma membrane, or cell surface membrane, is a thin structure that surrounds the cell. It is the barrier that defines the boundaries of the cell, separating the cytoplasm from its environment. If the membrane is damaged, the integrity of the cell is altered and the cytoplasm leaks into the environment, causing cell death. The general structure of prokaryotic and eukaryotic cell membranes (and the outer membranes of Gram-negative bacteria) is a bilayer composed of phospholipids, which contain both hydrophobic (fatty acid) and hydrophilic (glycerol-phosphate) components. Phospholipids can exist in many chemical forms as a consequence of the diversity of compounds attached to the glycerol backbone. As phospholipids aggregate in aqueous solution they spontaneously organize to form two parallel rows, known as a lipid bilayer. Phospholipid molecules align with the fatty acids pointing inward toward each other to form a hydrophobic environment, whereas the hydrophilic portions face both the external side and the internal or cytoplasmic side of the membrane. The bilayer structure represents the most stable arrangement of the lipid molecules in an aqueous environment. The whole structure of the plasma membrane is stabilized by hydrogen bonds and hydrophobic interactions. In addition, cations such as Mg2+ and Ca2+ help to stabilize the membrane through ionic interactions with the negative charges of the phospholipids. A model of the structures of the cell membranes of Gram-positive and Gram-negative bacteria is shown in Figure 1.1 [13–15].

A substantial amount of protein and other materials is partially or completely embedded in the membrane layer. A typical bacterial membrane contains up to 200 different kinds of proteins (approximately 75% of the mass of the membrane). Protein molecules in the membrane are arranged in a variety of ways. Some proteins are fully embedded in the membrane and are thus called integral or transmembrane proteins. They can be removed from the membrane only after disrupting the lipid bilayer. Some of these proteins are channels that have a pore through which substances enter and exit the cell. Other proteins, called peripheral, are easily removed from the membrane by mild treatments and are firmly associated with the inner or outer surface of the membrane. They may function as enzymes that catalyze chemical reactions, as scaffolds for support of cell components, and as mediators of changes in membrane shape during movement. Some peripheral membrane proteins contain a lipid tail on the amino terminus that anchors the protein to the membrane. These proteins are called lipoproteins and interact directly with integral proteins in important cellular processes such as energy metabolism. Many proteins and some of the lipids on the outer surface of the plasma membrane have carbohydrates attached to them. These structures are known as glycoproteins and glycolipids, respectively. Both of these structures help to protect the cell and are involved in cell-to-cell interactions.

Sterols and related molecules are present in eukaryotic membranes. They are rigid and planar molecules, whereas fatty acids are flexible; their presence stabilizes membranes and makes them less flexible. Sterols are absent in prokaryotic cellular membranes, except for those of methanotrophs and mycoplasmas. Polycyclic compounds known as hopanes (derivatives of pentacyclic triterpenoids) are widely distributed among bacteria, and it is proposed that they may play a role in maintaining membrane rigidity (Figure 1.2). One widely distributed hopane is the C30 hopanoid diploptene. Hopanes are not present in species of Archaea [16].

FIGURE 1.1 Cell membranes of Gram-positive and Gram-negative bacteria. Schematic representation of the inner and outer membrane lipid bilayers of Gram-negative bacteria (upper panel) and Gram-positive bacteria (lower panel). Several structures associated with cell membranes, such as porins, integral or transmembrane and peripheral proteins, and cell wall components, are shown.

FIGURE 1.2 Structure of membrane sterols and hopanoids. (a) Structure of the cholesterol molecule, a typical sterol present in cell membranes of eukaryotic cells, methanotrophic bacteria, and mycoplasmas. (b) Structure of a hopane, a polyterpenoid present in prokaryotic cell membranes.


Membranes have a viscosity similar to that of light-grade oil. Experimental evidence has demonstrated that at temperatures that permit growth, membrane molecules are not static but move quite freely within the membrane surface. Individual lipid molecules are also generally free to exchange places with other lipids in the membrane, so that the membrane resembles a two-dimensional fluid. It is proposed that this movement is most probably associated with the functions of the plasma membrane. These dynamics of phospholipids and proteins are known as the fluid mosaic model [17]. However, it is also proposed that some membrane regions have considerable order, because some lipid molecules are not free to move due to their association with specific membrane proteins and some other components [18].

The phospholipids of the bacterial cell membrane contain ester linkages bonding the fatty acids to glycerol, whereas in Archaea the membrane lacks fatty acids (Figure 1.3). Instead, the side chains of archaeal lipids are composed of repeating units of the five-carbon hydrocarbon isoprene, linked to glycerol by an ether bond. The overall architecture of the cytoplasmic membrane of Archaea, with inner and outer hydrophilic surfaces and a hydrophobic interior, is nevertheless the same as in bacteria. Glycerol diethers and glycerol tetraethers are the major lipids present in membranes from Archaea. In the tetraether molecule, the phytanyl side chains (composed of four linked isoprenes) from each glycerol molecule are covalently bonded together (Figure 1.4), leading to a lipid monolayer instead of a bilayer cytoplasmic membrane. This structure is widely distributed among hyperthermophilic Archaea, helping to maintain the membrane architecture at high temperatures [19–20].

1.2.2 Functions

The most important function of the cell membrane is to serve as a selective barrier through which material enters and exits the cell. The cytoplasm consists of an aqueous solution containing salts, sugars, amino acids, nucleotides, vitamins, coenzymes, proteins, and a variety of other soluble materials. The hydrophobic nature of the internal region of the plasma membrane constitutes a tight diffusion barrier with selective permeability, allowing certain molecules and ions to pass through and blocking passage to others. Some smaller molecules, such as water, oxygen, carbon dioxide, and some simple sugars, usually pass freely through the membrane by diffusion (Table 1.1). This is also the case for molecules that are dissolved easily in lipids (oxygen, carbon dioxide, and nonpolar organic molecules). In contrast, hydrophilic and small charged molecules such as the hydrogen ion (H+) do not pass through the membrane but instead must be specifically transported. Water is a molecule that freely crosses the membrane, because it is sufficiently small to pass through the phospholipid bilayer. However, water transport through the membrane can be accelerated

FIGURE 1.3 Chemical diversity of lipidic bonds in cell membranes. (a) An ester linkage found in lipids of bacteria and eukaryotic cells. (b) An ether linkage in lipids of the cell membrane of Archaea. (c) Structure of isoprene, the parent structure of the hydrophobic side chains present in Archaea.


Solute Transport Processes in the Cell

FIGURE 1.4 Structures of the Archaea cell membranes. (a) Glycerol diether: schematic representation of bilayers of isoprenoids (phytanyl chains) linked to glycerol by ether bonds. (b) Diglycerol tetraether: structure of the monolayers of the isoprenoid biphytanyl glycerol ether.

TABLE 1.1 Comparison of Diffusion-Controlled and Carrier-Mediated Solute Fluxes across Bacterial Plasma Membranes

Typical transfer rate [µmol min^-1 (g dry mass)^-1]

                        Diffusion-controlled at a
                        concentration difference of     Carrier mediated
Transported solute      10 µM          10 mM            (Vmax)
Potassium ion (K+)      0.00002        0.02             100
Glutamate
Glucose
Isoleucine
Phenylalanine
Urea

Kj) a change in Cj has hardly any effect on qi. This is called zero (0) order behavior, and occurs under batch conditions. This last observation is very relevant because it shows that when the concentration of, e.g., a trace metal, vitamin, or hormone remains far above its Kj value, a decrease in its concentration Cj (which will occur because, e.g., a trace metal j or vitamin j is consumed during biomass growth) will not significantly change the value of qi. This condition (Cj >> Kj) is called the nonlimiting condition for compound j. It should be applied to each medium component that should not have an effect on the q-rates. Furthermore, saturation kinetics offers us a simple and practical kinetic format.

As has been pointed out above, a proper cultivation medium should be designed such that all nutrients, except one, are nonlimiting. The medium should thus be designed in such a way that for each nonlimiting nutrient the concentration during the experiment is at all times much larger than the affinity constant (Cj >> Kj). For the limiting nutrient, however, the concentration must be in the range of its affinity constant.

The choice of the type of nutrient limitation has a very significant effect on cellular behavior. For example, if a vitamin is the limiting nutrient, this will limit the activity of an enzyme that depends on this vitamin. This metabolic bottleneck can then lead to drastic changes in secreted metabolic products. A famous example is citric acid production by Aspergillus niger. The cultivation medium used in the production process of citric acid should not contain any manganese (Mn). The absence of this metal blocks the conversion of isocitrate to alpha-ketoglutarate in the TCA cycle, because the enzyme for this reaction cannot function without Mn. The result of this blockage is that large amounts of citric acid are secreted by the cells.
Clearly, if the N-source is the limiting nutrient, the formation of biomass is restricted, e.g., due to the limitation of protein biosynthesis. This will result in a surplus of, e.g., the electron donor, which might lead to the formation of large amounts of byproducts (e.g., S. cerevisiae (baker's yeast) produces ethanol from glucose under N-limited conditions).

The concept of single nutrient limitation leads to important kinetic simplifications. The only extracellular concentration that influences the q-values under single limiting nutrient conditions is that of the limiting nutrient itself; hence the q-rates depend only on the concentration Cj of the limiting nutrient j. The very complex kinetic function for qi can thus be simplified to:

\[ q_i = f(\text{pH},\, T,\, \text{pressure},\, \text{concentration } C_j \text{ of the single limiting nutrient}) \tag{9.3} \]

And, if pH, T, and pressure are kept constant, it can be further simplified to:

\[ q_i = f(C_j \text{ only}) \tag{9.4} \]

Example 1: Design of a Nonlimiting Medium

Biotin is a cofactor in certain enzymes. Assume that the value of the affinity constant K of microorganisms for biotin is equal to 1 * 10^-6 M. Also assume that for the synthesis of 1 g dry matter of biomass 1.1 * 10^-6 mol of biotin is consumed.

Task: If one desires to reach a final concentration of 15 g/l biomass, how much biotin is needed in the medium to keep biotin nonlimiting?

Answer: Biotin remains nonlimiting if C >> K, e.g., Cbiotin > 10 * K = 10 * (1 * 10^-6) = 10 * 10^-6 M. For growth, 15 * (1.1 * 10^-6) = 16.5 * 10^-6 mol biotin/l is needed. The total biotin concentration added to the medium should therefore be: 10 * 10^-6 + 16.5 * 10^-6 = 26.5 * 10^-6 mol biotin/l.
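The arithmetic of Example 1 can be sketched in a few lines of Python. This is only an illustration under the example's own assumptions; the function name and the `margin` parameter (the factor 10 used above to express C >> K) are hypothetical and not part of the original text.

```python
def nonlimiting_dose(k_affinity, demand_per_g, biomass_final, margin=10.0):
    """Medium concentration (mol/l) that keeps a nutrient nonlimiting.

    k_affinity    : affinity constant K (mol/l)
    demand_per_g  : nutrient consumed per g dry biomass formed (mol/g)
    biomass_final : desired final biomass concentration (g/l)
    margin        : assumed factor expressing C >> K (10 in Example 1)
    """
    residual = margin * k_affinity            # must remain at the end of growth
    consumed = biomass_final * demand_per_g   # taken up during growth
    return residual + consumed


total_biotin = nonlimiting_dose(1e-6, 1.1e-6, 15.0)
print(f"Biotin needed: {total_biotin:.3e} mol/l")
```

A larger safety margin than 10 simply increases the residual term; the consumed term is fixed by the target biomass concentration.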

Black Box Models for Growth and Product Formation


9.3 Fermentor Transport Mechanisms as a Tool to Control Extracellular Concentrations and Therewith Control the q-Rates: The Chemostat

9.3.1 Transport Mechanisms Can Be Applied to Control Extracellular Concentrations

It has been outlined above that the biomass specific rates (q-rates) for uptake and secretion of compounds are in general influenced by the properties of the organism (the genes) and the environmental conditions to which the organism is exposed. Obvious environmental factors of influence are pH and T. Therefore these are usually experimentally controlled at a selected constant value. A proper design of the cultivation medium in principle allows one to study the effect of a single nutrient on the behavior of the organism, that is, on the biomass specific conversion rates (q-rates). As has been argued before, batch cultivation is not the preferred way to carry out these studies because in batch culture the concentrations cannot be controlled by the experimenter. Instead, a cultivation method is required that allows us to precisely control the extracellular concentration of a compound of choice at desired levels (Cj ≈ Kj) in order to study its effect on the q-rates. The question is how control of extracellular concentrations can be achieved while there is ongoing consumption and production by the cells/organisms present in the vessel. The answer is that properly designed transport mechanisms (which can be different for the different compounds) must be implemented in the cultivation vessel/space in which the organisms are cultivated.

9.3.2 Control of a Constant Extracellular Substrate Concentration Using Substrate Transport

We have seen that in a batch experiment the substrate concentration drops due to cellular consumption. Such a drop can only be stopped by adding substrate to the cultivation vessel from an external source at a certain rate. Hence we need a mechanism to transport substrate from a substrate storage vessel into the cultivation vessel. One can think of many possible ways to achieve this, but a particularly simple method is to have a sterilized substrate solution available in a storage vessel and pump this solution into the cultivation vessel at a controlled flow rate. Assuming that the substrate concentration in the substrate solution is equal to Cs,in (mol/m3) and that the flow rate of this substrate solution equals Φin (m3/h), we can write for the rate of transport of substrate to the cultivation vessel:

\[ \text{Substrate feed rate} = C_{s,\mathrm{in}} \, \Phi_{\mathrm{in}} \quad (\text{molS/h}) \tag{9.5} \]

When this transport rate is kept equal to the consumption rate of substrate by the cells, which can be achieved by manipulating Φin, the amount of substrate Ms in the cultivation vessel remains constant. Because the substrate amount present in the cultivation vessel equals Ms = V · Cs, a constant substrate concentration Cs can only be achieved if the broth volume V in the cultivation vessel is also kept constant. However, the continuous addition of substrate solution at flow rate Φin will lead to an increase in broth volume; hence V will increase with time and Cs (= Ms/V) will still drop in time. To avoid this we need to keep V constant while feeding substrate solution, which requires that liquid be transported out of the cultivation vessel. This can be done by pumping out broth. However, this also results in the removal of biomass.


Balances and Reaction Models

9.3.3 Control of a Constant Biomass Concentration Using Biomass Transport

We have also observed that in the batch experiment the biomass amount Mx increases due to cellular growth and hence Cx increases. The biomass concentration under conditions of growth can only be kept constant by removing the produced biomass from the cultivation vessel. If the rate of biomass removal from the fermentor (in C-molX/h) equals the rate of biomass production, Ratex (in C-molX/h), then the biomass amount Mx (in C-molX) in the cultivation vessel no longer changes. A simple method to remove biomass is to pump out the complete broth, which contains extracellular water (called supernatant) and biomass, at a flow rate Φout (m3/h). Usually a cultivation vessel is well mixed using, e.g., a stirring device. The term well mixed means that concentrations inside the cultivation vessel have the same value at each position inside the vessel. Hence the biomass concentration, Cx, is the same everywhere, and it can be safely assumed that the biomass concentration is also Cx at the point where the broth is removed from the fermentor. For the transport rate of biomass from the cultivation vessel (in C-molX/h) one can write:

\[ \text{Removal rate of biomass} = C_x \, \Phi_{\mathrm{out}} \quad (\text{C-molX/h}) \tag{9.6} \]

Continuous removal of broth from the cultivation vessel can thus result in a constant total amount of biomass Mx when there is continuous production of biomass. However, several additional aspects must now be considered:

•  If broth is removed, the broth volume V inside the cultivation vessel will decrease. Maintaining a constant volume V requires that the outflow of broth be compensated by a sufficient inflow of another solution. The most logical choice is the inflow of a substrate solution as discussed before. The problem of a changing volume, due to either inflow of substrate solution or broth outflow, can be solved by using a simultaneous in- and outflow. This allows V to be kept constant at a value chosen by the experimenter. Note that this does not imply Φin = Φout; exact equality of the two flow rates hardly ever occurs.
•  It should be realized that the broth does not only contain biomass. It also contains supernatant, in which substrate, products, and other nutrients are present. Hence transport of biomass by broth removal also creates a transport of substrate, products, and nutrients from the fermentor:

\[ \text{Removal rate of substrate} = C_s \, \Phi_{\mathrm{out}} \quad (\text{molS/h}) \tag{9.7} \]

\[ \text{Removal rate of product} = C_p \, \Phi_{\mathrm{out}} \quad (\text{molP/h}) \tag{9.8} \]

whereby Cs and Cp are the substrate and product concentrations in the fermentor. One should realize that the broth supernatant contains many more compounds, e.g., vitamins, minerals, hormones, NH4+, H2PO4-, and SO4^2-, which are not completely consumed. These are, therefore, also transported out of the cultivation vessel by the broth removal. Because these compounds are also consumed for cellular growth, the amount of each of these compounds would only decrease (due to transport out and consumption). To achieve constant amounts in the cultivation vessel, these compounds also need to be transported to the vessel. This is most easily achieved by adding them to the solution containing the growth limiting substrate, which is pumped into the cultivation vessel to keep the substrate concentration in the vessel constant. Hence it is necessary to pump in a complete medium solution and not a solution containing only the substrate.


9.3.4 Control of a Constant Extracellular Product Concentration Using Product Transport

In a batch experiment the product concentration can only rise, because product is produced by the organism. A constant product concentration therefore requires that product be transported out of the cultivation vessel. A constant product amount Mp, and hence a constant product concentration Cp, in the cultivation vessel will be achieved when the rate of production by the organisms in the fermentor (Ratep, molP/h) equals the rate of transport out of the cultivation vessel. We have seen above that this product transport already occurs when broth is removed to control the biomass concentration, because the product is also present in the broth.

Other possibilities for product transport

There are more possibilities to control the product concentration by using alternative transport mechanisms (compared to broth removal):

•  One could add product to the medium inflow. This would create a second transport mechanism where product is transported into the vessel. In this way the product concentration in the cultivation vessel can be increased, for example to study the effect of higher product concentrations on the q-values (e.g., product inhibition).
•  Some products are volatile (examples are ethanol and CO2) and are transferred easily from the supernatant to a gas phase. It is then possible to remove the product by sparging gas through the broth (called "stripping").

The above shows that a cultivation vessel with an inflow of fresh growth medium and a simultaneous outflow of broth has sufficient transport mechanisms to achieve constant concentrations of all compounds (Cs, Cx, Cp, Ci, …) in the broth supernatant in a situation where there is simultaneous cellular consumption and production of s, x, p, i, … This cultivation system is called a chemostat.

9.3.5 Manipulation of Biomass Specific Conversion Rates in a Chemostat

A classical chemostat is a well mixed cultivation vessel with a constant inflow rate of medium, containing a single growth limiting nutrient, and an outflow rate of broth that is controlled in such a way that the culture volume is kept at a desired value within narrow limits. Although the culture volume V in a chemostat can be assumed constant (dV/dt ≈ 0), it is quite possible that Φin and Φout are not the same. Explanations are:

•  Evaporation of water from the broth always occurs, due to aeration of the broth (needed to transport O2 into and CO2 out of the broth). Evaporation causes Φout < Φin.
•  Densities of medium and broth may be different. This is usually of minor importance.

The most characteristic property of a chemostat is that after sufficient time a steady state is reached, which means that all concentrations, T, pH, and V become constant in time. Hence for a steady state chemostat it holds that:

\[ \frac{dV}{dt} = 0 \quad \text{and} \quad \frac{dC_i}{dt} = 0 \]


Total conversion rates can be calculated from the proper mass balances. The mass balance for compound i in a chemostat reads:

\[ \frac{d(V \cdot C_i)}{dt} = \mathrm{Rate}_i + \Phi_{\mathrm{in}} \cdot C_{i,\mathrm{in}} - \Phi_{\mathrm{out}} \cdot C_i \tag{9.9} \]

Compared to the mass balances for a batch culture system, the mass balances for a chemostat system also contain transport terms to and from the culture system. After a chemostat has reached a steady state, the accumulation term becomes equal to zero and thus the mass balance for compound i can be simplified to:

\[ 0 = \mathrm{Rate}_i + \Phi_{\mathrm{in}} \cdot C_{i,\mathrm{in}} - \Phi_{\mathrm{out}} \cdot C_i \tag{9.10} \]

So where in the case of a batch culture system the mass balance contains zero transport terms but a nonzero accumulation term, the mass balance for a steady state chemostat has a zero accumulation term but nonzero transport terms. Because of the presence of transport, the chemostat is the most suitable cultivation system to manipulate the q-rates of microorganisms or cultured cells. The fact that specific conversion rates can be set by the experimenter, by manipulating the transport rates, becomes clear from the biomass mass balance. If we assume that biomass is not present in the feed of the chemostat (Cx,in = 0), it follows from the steady state mass balance for biomass that:

\[ 0 = \mathrm{Rate}_x - \Phi_{\mathrm{out}} \cdot C_x \tag{9.11} \]

This result shows that in a steady state chemostat the rate of biomass production equals its removal rate in the broth outflow. By definition it holds that:

\[ \mathrm{Rate}_x = \mu \cdot M_x = \mu \cdot V \cdot C_x \tag{9.12} \]

Combination of Equations 9.11 and 9.12 yields:

\[ \mu \cdot V \cdot C_x = \Phi_{\mathrm{out}} \cdot C_x \tag{9.13} \]

which can be rewritten as:

\[ \mu = \Phi_{\mathrm{out}} / V \tag{9.14} \]

This wonderfully simple result shows that the experimenter (who can set the broth volume V and the broth outflow rate Φout) can set the value of the biomass specific growth rate µ imposed on the organism in the chemostat. The ratio Φout/V is called the dilution rate D. Hence the chemostat makes it possible to do different experiments with an organism at different µ-values. In each chemostat experiment (see example below) one can then measure the concentrations of different compounds i, the flow rates, and the volumes, which can be entered into the mass balances for the different compounds, from which, e.g., µ, qs, qp, etc. can be calculated. In general, sets of qi- and Cs-values (limiting substrate) can be obtained for different µ-values by performing chemostat cultivations at different values of Φout/V. These sets of q-rates (qs, qp, µ), together with the measured substrate concentrations in the broth, are the basis of a stoichiometric and kinetic understanding of cultured microorganisms or cells.
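As a minimal sketch of how the steady state relation µ = Φout/V might be used to plan a series of chemostat runs, the following Python fragment computes the outflow rate needed to impose a chosen growth rate. The function names and the numerical values (a 1.0 l broth volume and three target growth rates) are hypothetical and serve only as an illustration.

```python
def dilution_rate(phi_out, volume):
    """Dilution rate D = phi_out / V (1/h); equals mu in a steady state chemostat."""
    return phi_out / volume


def outflow_for_growth_rate(mu, volume):
    """Broth outflow rate (volume units per h) that imposes growth rate mu (1/h)."""
    return mu * volume


volume = 1.0  # l, hypothetical broth volume
for mu_target in (0.05, 0.10, 0.20):  # 1/h, hypothetical set points
    phi_out = outflow_for_growth_rate(mu_target, volume)
    print(f"mu = {mu_target:.2f} 1/h  ->  outflow = {phi_out:.3f} l/h")
```

The relation only holds at steady state and for a feed without biomass, as assumed in the derivation above.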


Example 2: Calculation of q-Rates from a Chemostat Experiment

A microorganism is grown in a chemostat on a cultivation medium containing substrate s. The broth volume is kept at a fixed value of V = 1.25 l. The feed solution contains 10 g/l of substrate s and no biomass. The inflow rate of the feed solution is 0.10 l/h. The broth is pumped out of the reactor at a flow rate of 0.13 l/h and contains 4 g/l substrate and 2 g/l biomass. The difference between the inflow and outflow rates is caused by the addition of an alkali solution, needed to maintain the pH at the proper value.

Task: Calculate the total rates of biomass formation (Ratex) and substrate consumption (Rates) and give their proper units.

Answer: The total rate of substrate consumption Rates follows from the substrate mass balance, which can be written as:

0 = Rates + 0.10 * 10 - 0.13 * 4

This gives Rates = -0.480 g/h (negative!). The value of Ratex follows from the biomass mass balance, 0 = Ratex - 0.13 * 2, as: Ratex = 0.260 g/h (positive).

Task: Calculate the biomass specific rates qs and µ and provide the proper units.

Answer: The biomass specific rates of substrate consumption (qs) and growth (µ) can be calculated directly from the previously calculated total rates and the total amount of biomass Mx present in the reactor:

Mx = 1.25 * 2 = 2.50 gX
qs = Rates/Mx = -0.480/2.50 = -0.192 gS/gX/h
µ = Ratex/Mx = 0.260/2.50 = 0.104 gX/gX/h

From this experiment it has been found that for Cs = 4 g/l, qs = -0.192 gS/gX/h and µ = 0.104 h^-1.
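The steady state balances of Example 2 can be checked with a short Python sketch. The helper name `steady_state_rate` is hypothetical; it simply rearranges the steady state balance 0 = Rate + Φin·Cin − Φout·C. With the flows and concentrations stated in the example, the substrate balance evaluates to −0.48 g/h and the biomass balance to 0.26 g/h.

```python
def steady_state_rate(phi_in, c_in, phi_out, c_out):
    """Total conversion rate from 0 = Rate + phi_in*c_in - phi_out*c_out."""
    return phi_out * c_out - phi_in * c_in


volume = 1.25                 # l
phi_in, phi_out = 0.10, 0.13  # l/h
c_s_in, c_s = 10.0, 4.0       # g/l substrate in feed and in broth
c_x_in, c_x = 0.0, 2.0        # g/l biomass in feed and in broth

rate_s = steady_state_rate(phi_in, c_s_in, phi_out, c_s)  # gS/h, negative = consumed
rate_x = steady_state_rate(phi_in, c_x_in, phi_out, c_x)  # gX/h, positive = produced

m_x = volume * c_x            # gX present in the reactor
q_s = rate_s / m_x            # gS/gX/h
mu = rate_x / m_x             # 1/h

print(f"Rate_s = {rate_s:.3f} g/h, Rate_x = {rate_x:.3f} g/h")
print(f"q_s = {q_s:.3f} gS/gX/h, mu = {mu:.3f} 1/h")
```

The value of µ obtained this way equals Φout/V = 0.13/1.25, as expected for a feed without biomass.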

It has been shown above that the chemostat is an excellent tool to obtain the kinetics (qs, µ, qp) under single nutrient (substrate) limited conditions. Before we do so, however, it is necessary to introduce the kinetic functions, which will be done below. It will furthermore be shown how the chemostat can be used to obtain kinetic parameters.

9.4 Black Box Kinetic Functions for qs, qp, µ under Single Nutrient (Substrate) Limited Conditions

9.4.1 Substrate Uptake Rate

Cells consume their carbon substrate with a certain specific rate (qs). Generally this substrate is used at different rates for different purposes:

•  Growth (rate µ)
•  Maintenance (rate ms)
•  Product formation (rate qp)


An important question is now how each rate qs, qp, and µ depends on the extracellular concentration of substrate, under the condition that the carbon substrate (which is often identical to the electron donor) is the only growth limiting nutrient (single nutrient limitation). If we further assume that T, pressure, and pH are constant, it can be understood that only the extracellular concentration of substrate has an effect on the value of qS; hence we can write:

\[ q_S = f(C_S) \tag{9.15} \]

The question is now to reflect on the form of this function. Here we have to consider our global knowledge of the metabolism of the substrate. Clearly, the substrate first has to be transported over the cellular membrane, usually by a specific membrane-associated protein called a transporter. Hence one can expect that qS increases with increasing extracellular concentration of substrate, CS. The next question is the form of this increase. A transporter always has a maximum specific transport rate (similar to enzymes). In addition there is a limit to the amount of transporter proteins present in the cellular membrane, because of space limitations or due to genetic regulation. Both factors explain why there is always a maximal value for qS, called qSmax. Describing the mechanism of transport of substrate over the cell membrane allows one to derive a rate equation for substrate uptake. Assume that the cell membrane contains a transporter protein (Tr) which is able to form a reversible substrate-transporter complex (STr) when extracellular substrate is present:

\[ (\mathrm{STr}) \rightleftharpoons \mathrm{S} + \mathrm{Tr} \tag{9.16} \]

The dissociation equilibrium constant of the (STr) complex follows as:

\[ K_S = \frac{C_S \, C_{Tr}}{C_{STr}} \tag{9.17} \]

The transporter exists in two forms, the unbound form, with concentration CTr, and the substrate-bound form, with concentration CSTr. Note that the sum of both concentrations is constant (indicated with CTr^tot):

\[ C_{STr} + C_{Tr} = C_{Tr}^{tot} \tag{9.18} \]

Combination of these two equations yields an expression for the fraction of substrate-bound transporters (CSTr/CTr^tot):

\[ \frac{C_{STr}}{C_{Tr}^{tot}} = \frac{C_S}{K_S + C_S} \tag{9.19} \]

It can be inferred from this equation that if CS = 0, CSTr/CTr^tot = 0, and that if CS >> KS the value of CSTr/CTr^tot = 1.

The complex STr is formed at the outside of the membrane and is subsequently translocated to face the inside. Because the intracellular concentration of S is very low (due to consumption by metabolic reactions), the complex dissociates with a rate qSmax and releases S inside. This implies that the substrate transport rate is proportional to CSTr/CTr^tot, leading to a hyperbolic function (see Figure 9.1):

\[ q_S = q_S^{max} \, \frac{C_S}{K_S + C_S} \tag{9.20} \]

This function for qS resembles Michaelis–Menten kinetics for a single enzyme, but this is only a mathematical resemblance; qS holds for the overall kinetics of a complex biological system (microorganisms, tissues, etc.).
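The hyperbolic uptake function of Equation 9.20 can be evaluated with a small Python sketch. The parameter values below (qSmax = −0.5 mol S/C-molX/h and KS = 0.1 mol/m3) are hypothetical and chosen only to show the first order and saturated regimes; they are not taken from the text.

```python
def substrate_uptake_rate(c_s, q_s_max, k_s):
    """Hyperbolic uptake rate q_S = q_S_max * C_S / (K_S + C_S)."""
    return q_s_max * c_s / (k_s + c_s)


q_s_max = -0.5  # mol S/C-molX/h, hypothetical (negative: substrate is consumed)
k_s = 0.1       # mol/m3, hypothetical affinity constant

for multiple in (0.01, 0.1, 1.0, 20.0):
    c_s = multiple * k_s
    q_s = substrate_uptake_rate(c_s, q_s_max, k_s)
    print(f"C_S = {multiple:5.2f} * K_S  ->  q_S / q_S_max = {q_s / q_s_max:.3f}")
```

At CS = KS the rate is half of qSmax, and at CS = 20 KS it reaches about 95% of qSmax, which is the nonlimiting regime described in the text.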


FIGURE 9.1 Hyperbolic function for qS. The plot shows -qS against CS according to qS = qSmax CS/(KS + CS): -qS reaches 0.5 qSmax at CS = KS and approaches qSmax at high CS.

The hyperbolic function contains two kinetic parameters: qSmax and KS. These parameters:

•  Can be estimated from experiments in which qS and CS are varied.
•  Will change when the same microorganism is grown on a different substrate (electron donor) or electron acceptor.
•  Will change when a different T and pH is used.

The rate qS is first order in CS for CS << KS. qSmax is the maximum substrate uptake rate (which has a negative value, in mol S/C-molX/h). KS is the substrate affinity (mol substrate/m3). The substrate limited condition can now be quantified precisely, as a substrate concentration such that qS < qSmax. If CS >> KS, e.g., CS = 20 * KS, then qS = 0.95 qSmax ≈ qSmax, which means that no nutrient is limiting the microbial rates, meaning that all rates qi are at their so-called batch values qimax.

Problems in measuring CS in a fermentor under nutrient limited conditions

Unfortunately the substrate concentration in a fermentor cannot be measured easily by a substrate specific sensor. On-line measurement systems have been developed, but they are expensive and still not robust enough, and are therefore not used very often. The usual approach is still to withdraw a broth sample from the fermentor, remove the biomass by filtration or centrifugation, and subsequently analyze the substrate in the supernatant. It should be realized, however, that if the substrate to be measured is the growth limiting nutrient, the concentration is very low and thus time is a critical factor. Suppose that the real substrate concentration is 10 mg/l, the fermentor volume is 1.0 l, and the microorganisms present in the broth consume the substrate at RateS = 3600 mg substrate/h, which is equivalent to 1 mg/second. Compared to the total substrate amount in the fermentor, which is 10 mg, the substrate uptake rate is very high. Therefore, when the sampling process or biomass filtration takes several seconds, the substrate concentration will drop significantly because the microorganisms keep consuming the substrate, and the analysis of the substrate concentration in the sample will give completely wrong results.
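The size of this sampling error can be estimated with a short Python sketch using the numbers from the paragraph above (10 mg/l, 1.0 l, 1 mg/s). It assumes, as a simplification that is not part of the original text, that the uptake rate stays constant during the delay; since uptake slows as CS falls, this is an upper estimate of the depletion. The function name is hypothetical.

```python
def residual_concentration(c0, volume, uptake_rate, delay):
    """Substrate concentration (mg/l) left after `delay` seconds.

    Assumes a constant total uptake rate (mg/s) during the delay,
    which overestimates depletion when uptake slows at low C_S.
    """
    remaining = c0 * volume - uptake_rate * delay  # mg substrate left
    return max(remaining, 0.0) / volume


c0, volume, uptake_rate = 10.0, 1.0, 1.0  # mg/l, l, mg/s (values from the text)
for delay in (0.5, 1.0, 3.0, 5.0):        # s, hypothetical sampling delays
    c = residual_concentration(c0, volume, uptake_rate, delay)
    error = 100.0 * (c0 - c) / c0
    print(f"delay {delay:3.1f} s: measured {c:4.1f} mg/l ({error:.0f}% too low)")
```

Under this assumption even a delay of a few seconds between sampling and biomass removal changes the measured value by tens of percent.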

9.4.2 Substrate Consumption for Maintenance

The substrate which is taken up with rate qs partially has to be used for maintenance. Maintenance stands for the rate of energy expenditure needed to maintain the viability of a living cell. This energy is expressed as a rate of Gibbs energy mG, in kJ of Gibbs energy used per hour per amount of biomass present in the experimental system.


The units for mG are therefore (kJ per hour used for maintenance/C-mol biomass present in the fermentor). A literature survey has shown (Heijnen, 1991) that the rate of maintenance Gibbs energy mG is similar for many microorganisms/cells:

\[ m_G = 4.5 \exp\left[ -\frac{69000}{R} \left( \frac{1}{T} - \frac{1}{298} \right) \right] \quad \frac{\text{kJ per hour}}{\text{C-mol biomass present in the fermentor}} \tag{9.21} \]

In this equation R is the gas constant (8.314 J/mol K) and T is the absolute temperature (273 + °C). This relation shows that mG depends only on temperature, according to a typical Arrhenius relation (with an activation energy of 69,000 J/mol). The temperature effect is strong; it can be calculated from this equation that a difference of 8°C (e.g., from 298 K to 306 K, meaning 25–33°C) approximately doubles mG from 4.5 kJ/C-molX/h to 9 kJ/C-molX/h. Another point of interest is that mG does not depend significantly on the nature of the C-source or of the electron donor and electron acceptor used in catabolism to generate the maintenance energy. This is understandable because maintenance relates to biomass which has already been synthesized and for which viability must be maintained at the expense of a defined rate of Gibbs energy mG; it does not relate to new biomass that is being formed.

The need for maintenance energy can be increased significantly by the addition of so-called energy uncoupling agents. For example, a weak acid like benzoic acid, present at pH = 4–5, easily crosses the cell membrane and releases H+ in the cell interior. To maintain the proton motive force and to avoid unacceptably high accumulation of the benzoate ion (Ac-) inside the cell, both H+ and Ac- must be exported at the expense of energy (ATP). This cyclic transport (in and out) of benzoic acid and (H+ + Ac-) represents an energy dissipating cycle.

It is obvious that the energy needed for maintenance is generated in a catabolic reaction, in which electron donor (or substrate S), electron acceptor, and catabolic products, e.g., ethanol, CO2, etc., are involved. Hence maintenance is not only characterized by mG; the generation of this energy also leads to associated so-called "chemical maintenance rates" of electron donor, electron acceptor, and catabolic products, which are consumed and produced in the catabolic reaction with rates mS, mO2, meth, mCO2, etc. The relation between the various mi-values follows directly from the catabolic reaction used by the cellular system (see examples below).
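The temperature correlation for mG (Equation 9.21) can be evaluated numerically. The following Python sketch is only an illustration; the function name is hypothetical, while the constants 4.5, 69,000, 298, and R = 8.314 J/mol K are those given in the text.

```python
import math

R_GAS = 8.314  # J/mol K, gas constant as given in the text


def maintenance_gibbs_energy(temp_kelvin):
    """m_G in kJ per C-molX per h from the Arrhenius-type correlation (Eq. 9.21)."""
    exponent = -(69000.0 / R_GAS) * (1.0 / temp_kelvin - 1.0 / 298.0)
    return 4.5 * math.exp(exponent)


for celsius in (25, 33, 37):
    m_g = maintenance_gibbs_energy(273.0 + celsius)
    print(f"{celsius} degC: m_G = {m_g:.2f} kJ/C-molX/h")
```

Evaluating the function at 298 K and 306 K reproduces the approximate doubling of mG over 8°C mentioned above.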

Example 3: Calculation of All Chemical Maintenance Coefficients mi from the Known mG

Consider the yeast Saccharomyces cerevisiae that grows aerobically with glucose as electron donor. The catabolic reaction under these conditions is:

\[ -1\,\mathrm{C_6H_{12}O_6} - 6\,\mathrm{O_2} + 6\,\mathrm{HCO_3^-} + 6\,\mathrm{H^+} \tag{9.22} \]

Under standard conditions (25°C = 298 K, pH = 7) the -∆GR = ∆Gcat = 2843.1 kJ. The energy need for maintenance at 25°C (= 298 K) is (see correlation before) mG = 4.5 kJ/C-molX/h. To generate this Gibbs energy the organism must catabolize glucose with a rate mS = -(4.5/2843.1) = -0.00158 mol glucose/C-molX/h. In addition O2 is needed to catabolize glucose, with a stoichiometry of 6 O2 per mol glucose. Hence: mO2 = -6 * 0.00158 = -0.0095 mol O2/C-molX/h.


The production of CO2 (as HCO3-) is equal to: mHCO3- = +6 * 0.00158 = 0.0095 mol HCO3-/C-molX/h, and the production of protons equals: mH+ = 6 * 0.00158 = 0.0095 mol H+/C-molX/h. Note that mS and mO2 are negative because substrate and oxygen are consumed.

Consider now the case that the yeast S. cerevisiae is cultured in the absence of O2 (anaerobically). It is known that under these conditions a different catabolic reaction is used, involving the production of ethanol (C2H6O) from glucose according to the following overall reaction:

\[ -1\,\mathrm{C_6H_{12}O_6} - 2\,\mathrm{H_2O} + 2\,\mathrm{C_2H_6O} + 2\,\mathrm{HCO_3^-} + 2\,\mathrm{H^+} \tag{9.23} \]

For this reaction ∆GR = ∆Gcat = -225.4 kJ. The stoichiometry of the catabolic reaction provides the chemical mi-values for catabolic reactants. Using mG = 4.5 kJ/C-molX/h it is easy to calculate that:

mS = -4.5/225.4 = -0.020 mol glucose/C-molX/h
meth = 2 * 0.02 = 0.040 mol ethanol/C-molX/h
mHCO3- = 2 * 0.02 = 0.040 mol HCO3-/C-molX/h
mH+ = 2 * 0.02 = 0.040 mol H+/C-molX/h
mH2O = 2 * -0.02 = -0.040 mol H2O/C-molX/h

Same maintenance energy requirement, but different mS!

It should be noted that mS under anaerobic conditions is about 13 (0.020/0.00158) times higher than under aerobic conditions, although the maintenance energy requirement (mG) is the same (4.5 kJ/C-molX/h). The reason for this is that the catabolic energy gain from 1 mol glucose under aerobic conditions is 13 times (2843.1/225.4) higher than under anaerobic conditions.

In conclusion, the kinetics of the maintenance energy requirement are relatively straightforward. It is assumed that the maintenance energy requirement is independent of the growth rate, and it is therefore usually expressed as a constant mS. The only relevant factor is temperature, where roughly speaking mS doubles for each 8°C increase in temperature. All other associated maintenance related rates mi (mG, mO2, mCO2, meth, etc.) follow from the catabolic reaction used to generate the energy needed for maintenance.
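The calculation of Example 3 can be generalized in a few lines of Python. This is a sketch only: the function name and the dictionary layout are hypothetical, and the stoichiometric coefficients are written per mol of glucose catabolized (negative for consumed compounds), as in Equations 9.22 and 9.23.

```python
def maintenance_coefficients(m_g, gibbs_per_mol_donor, stoichiometry):
    """Chemical maintenance rates m_i (mol i per C-molX per h).

    m_g                 : maintenance Gibbs energy rate (kJ/C-molX/h)
    gibbs_per_mol_donor : Gibbs energy released per mol donor catabolized (kJ, positive)
    stoichiometry       : coefficients per mol donor (donor itself = -1)
    """
    catabolic_rate = m_g / gibbs_per_mol_donor  # mol donor/C-molX/h
    return {name: coeff * catabolic_rate for name, coeff in stoichiometry.items()}


aerobic = maintenance_coefficients(
    4.5, 2843.1, {"glucose": -1, "O2": -6, "HCO3-": 6, "H+": 6}
)
anaerobic = maintenance_coefficients(
    4.5, 225.4, {"glucose": -1, "H2O": -2, "ethanol": 2, "HCO3-": 2, "H+": 2}
)

for label, result in (("aerobic", aerobic), ("anaerobic", anaerobic)):
    print(label, {name: round(value, 5) for name, value in result.items()})
```

The ratio of the two glucose maintenance rates equals the ratio of the catabolic Gibbs energies, which is the factor of about 13 discussed above.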

9.5 The Herbert–Pirt Substrate Distribution Equation

It has already been noted that the substrate which is taken up is used for three purposes: maintenance (rate ms), growth (rate µ), and product formation (rate qp). This allows postulating the following substrate distribution equation:

\[ q_s = a \mu + b \, q_p + m_s \tag{9.24} \]

This is the famous Herbert–Pirt equation for substrate distribution (Pirt, 1965). Note that a, b, and ms are negative numbers, whereas µ and qp are positive. Hence, qs is by definition negative.


The units of the parameters of the Herbert–Pirt equation depend on the units of qs and qp. Assuming that all amounts are expressed in mol, the units are:

a: mol substrate consumed per C-molX produced
b: mol substrate consumed per mol product produced
ms: molS/h catabolized for maintenance per C-molX present in the cultivation vessel

Several important aspects of the Herbert–Pirt equation will be discussed below.

9.5.1 Distribution of Consumed Substrate

(Micro)organisms consume expensive substrate with rate qs and use it for growth, product formation, and maintenance. A relevant problem is to find out how the consumed substrate is distributed over these three independent processes. This is best illustrated using an example. Let us consider the following Herbert–Pirt equation for aerobic growth with lysine as a product. (All rates are expressed in mol per amount of biomass per time.)

\[ \underbrace{q_S}_{\text{total uptake of substrate}} = \underbrace{-0.333\,\mu}_{\text{part used for growth}} \; \underbrace{-\,1.5\,q_P}_{\text{part used for lysine production}} \; \underbrace{-\,0.005}_{\text{part used for maintenance}} \tag{9.25} \]

Question: Consider the Herbert–Pirt equation above. Assume that µ = 0.05 h⁻¹ and qP = 0.05 mol lysine/C-molX/h. Calculate the substrate distribution over growth, product formation, and maintenance.

Answer: The total substrate consumption equals qS = -0.333 × 0.05 - 1.5 × 0.05 - 0.005 = -0.0967 mol glucose/C-molX/h. The distribution of substrate is then:
• Growth: (0.333 × 0.05)/0.0967 = 0.172
• Lysine production: (1.5 × 0.05)/0.0967 = 0.776
• Maintenance: 0.005/0.0967 = 0.052
From this we can conclude that substrate is used for growth (17%), product formation (78%), and maintenance (5%). The organism is therefore already highly efficient with respect to the production of lysine.
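The calculation in this answer can be repeated for other combinations of µ and qP with a few lines of code. The following is a minimal sketch, assuming the coefficients of Equation 9.25; the function and variable names are illustrative and do not come from the text.

```python
# Herbert-Pirt coefficients of Equation 9.25 (negative = substrate consumed).
A_GROWTH = -0.333    # mol glucose per C-mol biomass formed
B_PRODUCT = -1.5     # mol glucose per mol lysine formed
M_S = -0.005         # mol glucose per C-mol biomass per hour (maintenance)


def substrate_distribution(mu, q_p):
    """Return q_S and the fraction of it spent on each of the three processes."""
    growth = A_GROWTH * mu
    product = B_PRODUCT * q_p
    q_s = growth + product + M_S
    fractions = {
        "growth": growth / q_s,
        "product": product / q_s,
        "maintenance": M_S / q_s,
    }
    return q_s, fractions


q_s, fractions = substrate_distribution(mu=0.05, q_p=0.05)
print(f"q_S = {q_s:.5f} mol glucose/C-molX/h")
for purpose, share in fractions.items():
    print(f"{purpose}: {share:.3f}")
```

Because the three fractions are ratios of terms with the same sign, they are positive and sum to one regardless of the sign convention used for qS.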

9.5.2 Theoretical Maximum Yields
Consider the general Herbert–Pirt relation. Assume the theoretical case that only product formation occurs: no biomass growth takes place (µ = 0) and maintenance is negligible (mS = 0). In this case the Herbert–Pirt equation reduces to qS = b qP. This shows that all substrate consumed is used for product formation only. The yield of product on substrate, YSP = qP/(-qS) (mol product/mol substrate), is then at its theoretical maximal value because no substrate is used for the production of new biomass and no substrate is spent on maintenance. In such a theoretical situation YSP = 1/b = YSPmax (with b taken as its absolute value). For the lysine case shown above it can thus be calculated that YSPmax = 1/1.5 = 0.666 mol lysine/mol glucose. Hence the coefficient b of the Herbert–Pirt equation represents the reciprocal of the maximal theoretical product yield on substrate. This is essential information because this maximum can be compared to the actual, operational yield, and this comparison shows how much room there is for improvement of the operational product yield. It should be kept in mind that the operational product yield will always be lower than the theoretical maximum yield, because part of the substrate will be spent on growth and maintenance. Similarly, the coefficient a of the Herbert–Pirt relation represents the reciprocal of the theoretical maximum biomass yield, YSXmax = 1/a (in C-molX/mol substrate, with a taken as its absolute value).
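As a worked illustration of this comparison, the operational yield for the lysine example of Section 9.5.1 (µ = 0.05 h⁻¹, qP = 0.05 mol lysine/C-molX/h, qS = -0.0967 mol glucose/C-molX/h) can be set against the theoretical maximum. The numbers below follow directly from those values; the symbol names are the ones used in the text.

```latex
Y_{SP} = \frac{q_P}{-q_S} = \frac{0.05}{0.0967} \approx 0.52
  \ \text{mol lysine/mol glucose},
\qquad
Y_{SP}^{max} = \frac{1}{1.5} \approx 0.67
  \ \text{mol lysine/mol glucose},
\qquad
\frac{Y_{SP}}{Y_{SP}^{max}} \approx 0.78
```

The ratio of the operational to the maximum yield equals the fraction of the consumed substrate that is channeled to product formation (78% in that example).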

9.6 Kinetics of Product Formation
9.6.1 The qP(µ) Function
In industrial fermentation processes microorganisms are usually applied to produce an economically attractive product. The performance of the microorganisms in producing this product is represented by the biomass specific rate of product formation qP, which is therefore an important rate. Under the single nutrient (substrate) limited conditions considered here, qP is only a function of the extracellular substrate concentration CS, and thus we can write:

$$q_P = f(C_S) \tag{9.26}$$

The nature of this function is not easily deduced theoretically, as was done earlier for the qS(CS) function. Therefore an experimental approach is often applied. Although the experimental quantification of qP is relatively easy using the product mass balance, the measurement of CS under nutrient limited conditions is very difficult, as has been illustrated above. It can be argued, however, that it is not necessary to measure CS. Because under single nutrient limited conditions µ = function (CS) (see below), it is formally possible to use this (unknown) function to eliminate CS from qP = function (CS) to obtain:

$$q_P = \text{another function}(\mu) \tag{9.27}$$

This function is in most cases nonlinear. Because µ is easily manipulated experimentally (in a chemostat), it is fairly easy to measure the relation between qP and µ. This is the qP(µ) concept, which only holds under single nutrient limited conditions.

9.6.2 Categories of Product Formation
It is important to distinguish the different categories of product formation which might occur. The first category is catabolic product formation. In this case the product is produced in the catabolic reaction and therefore the rate of product formation is directly coupled to the rate of the catabolic reaction. Examples are the anaerobic formation of acetate, lactate, ethanol, etc. Because the catabolic reaction is the sole source of energy generation, and energy generation is stoichiometrically (that is, linearly) coupled to growth and maintenance, qP is coupled to growth and maintenance in a linear fashion:

$$q_P = \alpha\mu + \beta \tag{9.28}$$

with α and β being the parameters of this linear qP(µ) relation. The second category is noncatabolic product formation. In this case the product is derived from the anabolic network. Examples are vitamins, amino acids, antibiotics, proteins, etc.


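Some examples of qP–µ relations for noncatabolic products are (α, β, γ are kinetic constants):

$$
\begin{aligned}
\text{decrease of } q_P \text{ with } \mu: \quad & q_P = \frac{\alpha}{\beta + \mu} \\
\text{power law relation:} \quad & q_P = \alpha\,\mu^{\beta} \\
\text{hyperbolic function of } \mu: \quad & q_P = \frac{\alpha\mu}{\beta + \mu} \\
\text{function with a maximum:} \quad & q_P = \frac{\alpha\mu}{\beta + \mu + \gamma\mu^{2}}
\end{aligned}
\tag{9.29}
$$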

Depending on the specific case, any relation might exist between the rate of noncatabolic product formation qP and the growth rate µ. Under the condition of single nutrient limitation the relation between the rate of product formation and the growth rate can be expressed by an algebraic function: qP = function (µ). In some cases this function is linear in µ, which happens especially in the case of catabolic product formation. Usually the qP(µ) function is nonlinear, especially for noncatabolic products. The function itself and the parameter values must be obtained from proper experiments.
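For the last relation of Equation 9.29 (the function with a maximum), the growth rate at which qP peaks follows from setting the derivative to zero. The short derivation below is a sketch that assumes α, β, and γ are all positive; the symbol µopt is introduced here for the location of the maximum.

```latex
q_P(\mu) = \frac{\alpha\mu}{\beta + \mu + \gamma\mu^{2}},
\qquad
\frac{dq_P}{d\mu}
  = \frac{\alpha\,(\beta - \gamma\mu^{2})}{(\beta + \mu + \gamma\mu^{2})^{2}} = 0
\;\Longrightarrow\;
\mu_{opt} = \sqrt{\beta/\gamma},
\qquad
q_P(\mu_{opt}) = \frac{\alpha}{1 + 2\sqrt{\beta\gamma}}
```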

9.6.3 Kinetics of Growth
In the previous sections we have introduced:
• Hyperbolic kinetics for substrate uptake qS
• A (non)linear qP(µ) relation
• The linear Herbert–Pirt equation for substrate distribution, with constant kinetics for maintenance (mS)
These three kinetic functions are sufficient to calculate how µ depends on CS. Two cases can be distinguished.

Case 1: The µ(CS) function when there are no anabolic but only catabolic products
In this case (all q-rates in mol i/C-molX/h) the Herbert–Pirt equation only relates qS and µ; there is no separate contribution for qP. Let us consider aerobic growth on glucose (CO2 is the only catabolic product). Assume the following Herbert–Pirt relation for substrate distribution:

$$q_S = -0.3125\,\mu - 0.0015 \tag{9.30}$$

In this equation the maintenance coefficient can be recognized as mS = -0.0015 mol glucose/C-molX/h and YSXmax = 1/0.3125 = 3.2 C-molX/mol glucose. Let us now assume that hyperbolic kinetics apply for the specific rate of substrate consumption qS as a function of CS according to:

$$q_S = \frac{-0.03\,C_S}{18 + C_S} \tag{9.31}$$

In this hyperbolic relation CS is the extracellular substrate concentration in mg/l. From this equation it can be inferred that qSmax = -0.03 mol glucose/C-molX/h and KS = 18 mg glucose/l. Combining these two equations by eliminating qS yields the following relation between µ and CS:

$$\mu = \frac{1}{0.3125}\left(\frac{0.03\,C_S}{18 + C_S}\right) - \frac{0.0015}{0.3125} \tag{9.32}$$
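Equation 9.32 can be evaluated numerically to see its limiting behavior. The sketch below assumes the parameter values of Equations 9.30 and 9.31; the names are illustrative, and the concentration at which µ = 0 (here called c_s_min) is obtained by setting the uptake rate equal to the maintenance rate.

```python
# Parameters of Equations 9.30 and 9.31 (negative = substrate consumed).
A = -0.3125        # Herbert-Pirt growth coefficient, mol glucose/C-molX
M_S = -0.0015      # maintenance coefficient, mol glucose/C-molX/h
Q_S_MAX = -0.03    # maximal specific uptake rate, mol glucose/C-molX/h
K_S = 18.0         # affinity constant, mg glucose/l


def q_s(c_s):
    """Hyperbolic substrate uptake rate (Equation 9.31)."""
    return Q_S_MAX * c_s / (K_S + c_s)


def mu(c_s):
    """Specific growth rate from q_S = A*mu + M_S (Equation 9.32)."""
    return (q_s(c_s) - M_S) / A


mu_at_zero = mu(0.0)                       # no substrate: biomass decays
mu_max = (Q_S_MAX - M_S) / A               # limit for C_S >> K_S
c_s_min = K_S * M_S / (Q_S_MAX - M_S)      # uptake just covers maintenance

print(f"mu(C_S = 0) = {mu_at_zero:.4f} 1/h")
print(f"mu_max      = {mu_max:.4f} 1/h")
print(f"C_S at mu=0 = {c_s_min:.3f} mg/l")
```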


A plot of Equation 9.32 is shown in Figure 9.2. Several remarks can be made about this kinetic equation for µ as a function of CS. In the literature the Monod equation is often used to express µ as a function of CS, that is

$$\mu = \mu^{max}\cdot\frac{C_S}{K_S + C_S} \tag{9.33}$$

The equation derived before (Equation 9.32) is clearly not identical to the Monod equation (Equation 9.33). It should be noted, however, that Equation 9.32 becomes identical to the Monod type equation only if maintenance is absent, because the maintenance term (in the above case 0.0015/0.3125) then disappears. Furthermore it should be noted for this example that:
• At CS >> 18 mg/l, µ approaches µmax, which equals (0.03/0.3125) - (0.0015/0.3125) = 0.0912 h⁻¹.
• At CS = 0, µ is negative and equal to -0.0048 h⁻¹. The interpretation is that at CS = 0 (see Figure 9.1) there is no substrate uptake (qS = 0). However, maintenance energy is still required. In practice it is observed that, in the absence of substrate (CS = 0), organisms start to catabolize part of themselves; they lose weight, which means that the cell mass decreases and hence µ < 0.

[…]

For CS >> 18 mg/l, qS = qSmax = -0.03 mol glucose/C-molX/h, which is correct. However, from this equation it follows that when there is no substrate (CS = 0), there is still substrate uptake (qS = -0.0015 mol glucose/C-molX/h). This is of course nonsense. The problem is eliminated by introducing the hyperbolic kinetic function for qS into the Herbert–Pirt substrate distribution equation as shown above, which leads to the µ(CS) function shown earlier (Equation 9.32).

Case 2: The µ(CS) function in case of noncatabolic product formation
Let us now consider the case where growth is accompanied by the formation of a noncatabolic product. The procedure to obtain the kinetic function µ = ƒ(CS) is most easily demonstrated with an example.

Example 4: Corynebacterium: Aerobic Growth and Lysine Production
Assume that the following Herbert–Pirt substrate distribution has been found:

$$q_S = -0.333\,\mu - 1.5\,q_P - 0.005 \tag{9.38}$$

This equation shows that mS = -0.005 mol glucose/C-molX/h, YSXmax = 3 C-molX/mol glucose, and YSPmax = 0.666 mol lysine/mol glucose. Let us further assume that the (hyperbolic) glucose uptake kinetics are given by (where CS is the substrate concentration in mg/l):

$$q_S = \frac{-0.10\,C_S}{5 + C_S} \tag{9.39}$$


This shows that qSmax = -0.10 mol glucose/C-molX/h and KS = 5 mg glucose/l. The lysine production kinetics are also known, with the following hyperbolic qP(µ) function (qP in mol lysine/C-molX/h):

$$q_P = \frac{0.03\,\mu}{0.01 + \mu} \tag{9.40}$$

Introducing the relations for qS and qP into the Herbert–Pirt substrate distribution relation yields:

$$\frac{0.10\,C_S}{5 + C_S} = 0.333\,\mu + 1.5\,\frac{0.03\,\mu}{0.01 + \mu} + 0.005 \tag{9.41}$$

The result is a nonlinear relation between µ and CS. Let us first consider the properties of this relation:
• It can be shown that µ increases monotonically with CS to a maximal value, called µmax. For CS >> 5 mg/l, the left side becomes constant and independent of CS (qS then has its maximal value of -0.10 mol glucose/C-molX/h). Under these conditions µ also achieves its maximal value, which can be found by solving the equation:

$$0.10 = 0.333\,\mu^{max} + 1.5\cdot\frac{0.03\,\mu^{max}}{0.01 + \mu^{max}} + 0.005 \tag{9.42}$$

This can be solved to give µ = µmax = 0.1580 h⁻¹.
• By combining the µ(CS) relation (Equation 9.41) and the qP(µ) function, a relation between qP and CS is obtained.
• Also in this case µ = 0 at a certain CSmin. At this value of CS, qS equals the maintenance rate. For µ = 0, the nonlinear relation between µ and CS (Equation 9.41) becomes:

$$\frac{0.10\,C_S}{5 + C_S} = 0.005 \tag{9.43}$$

From this result it can be calculated that CSmin = 0.26 mg/l. At this concentration µ = 0 and qP = 0, but qS = mS = -0.005 mol glucose/C-molX/h.
• If product formation were absent, and assuming the same maintenance and substrate uptake kinetics, the µ(CS) relation would be:

$$\frac{0.10\,C_S}{5 + C_S} = 0.333\,\mu + 0.005 \tag{9.44}$$

The µmax value (at CS >> 5 mg/l) now follows as µmax = (0.10 - 0.005)/0.333 = 0.285 h⁻¹. This µmax value is much higher than when product formation occurs (µmax = 0.1580 h⁻¹). The reason is that when no substrate is used for product formation, all consumed substrate can be channeled to growth and maintenance, resulting in a higher growth rate.
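The three numerical results of this example (µmax with and without product formation, and CSmin) can be checked with a short script. This is a sketch under the stated parameter values; rewriting Equation 9.41 as a quadratic in µ is a choice made here for convenience and is not taken from the text, and all names are illustrative.

```python
import math

# Example 4 parameters (Equations 9.38-9.40), written as positive magnitudes.
A, B, M_S = 0.333, 1.5, 0.005   # Herbert-Pirt coefficients (mol glucose basis)
Q_S_MAX, K_S = 0.10, 5.0        # uptake: mol glucose/C-molX/h and mg glucose/l
ALPHA, BETA = 0.03, 0.01        # q_P = ALPHA*mu/(BETA + mu)


def uptake(c_s):
    """Magnitude of the hyperbolic glucose uptake rate, -q_S (Equation 9.39)."""
    return Q_S_MAX * c_s / (K_S + c_s)


def mu_from_uptake(rate):
    """Solve rate = A*mu + B*ALPHA*mu/(BETA + mu) + M_S for mu (rate >= M_S).

    Multiplying by (BETA + mu) gives the quadratic
    A*mu**2 + (A*BETA + B*ALPHA - r)*mu - r*BETA = 0 with r = rate - M_S;
    the non-negative root is returned.
    """
    r = rate - M_S
    b_coef = A * BETA + B * ALPHA - r
    disc = b_coef ** 2 + 4.0 * A * r * BETA
    return (-b_coef + math.sqrt(disc)) / (2.0 * A)


mu_max = mu_from_uptake(Q_S_MAX)            # Equation 9.42: uptake saturated
c_s_min = K_S * M_S / (Q_S_MAX - M_S)       # Equation 9.43: uptake = maintenance
mu_max_no_product = (Q_S_MAX - M_S) / A     # Equation 9.44 at saturation

print(f"mu_max with lysine production: {mu_max:.4f} 1/h")
print(f"C_S,min:                       {c_s_min:.3f} mg/l")
print(f"mu_max without product:        {mu_max_no_product:.4f} 1/h")
```

The same `mu_from_uptake(uptake(c_s))` call gives µ at any intermediate substrate concentration above CSmin.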

It can be concluded from the above example that noncatabolic product formation has a significant effect on the µ(CS) relation: the growth rate can be much lower when product formation occurs. This is logical, and the phenomenon is called "metabolic burden."


9.6.4 A Single Degree of Freedom under Single Nutrient Limited Conditions
The kinetic model for substrate limited growth (hyperbolic substrate uptake equation, Herbert–Pirt substrate distribution, qP(µ) relation) is now complete. The three basic q-rates (qS, qP, µ) are completely specified as functions of CS. Alternatively, because µ and CS are uniquely related (by the µ(CS) function), one can also state that all rates are determined when one rate is known, for example µ. At a chosen µ, the qP(µ) function yields qP. The Herbert–Pirt relation then yields the value of qS. Finally the hyperbolic qS relation yields CS. This consideration shows that the complete black box kinetic model contains only one degree of freedom. Choosing the free variable (e.g., CS, µ, qS, or qP) determines which variables are fixed by the kinetic equations (Table 9.1). Which variable is chosen as the free variable is a matter of practical consideration. In chemostat experiments the growth rate µ is a logical choice (it is equal to the dilution rate and can easily be set by the experimenter); in a fed-batch culture the feed rate of the substrate is a logical choice (it is directly related to qS under the condition of single nutrient limitation), etc.
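The chain of calculations described above (µ gives qP, then qS, then CS) can be sketched as follows, assuming the kinetic parameters of Example 4 (Equations 9.38 to 9.40). The function name and the error handling are illustrative choices, not part of the original model description.

```python
def rates_at_growth_rate(mu):
    """Return (q_P, q_S, C_S) fixed by the kinetic model for a chosen mu.

    Uses the Example 4 kinetics; rates in mol/C-molX/h, C_S in mg glucose/l.
    """
    q_p = 0.03 * mu / (0.01 + mu)              # q_P(mu), Equation 9.40
    q_s = -0.333 * mu - 1.5 * q_p - 0.005      # Herbert-Pirt, Equation 9.38
    uptake = -q_s
    if uptake >= 0.10:
        # The hyperbolic uptake relation cannot deliver more than q_S,max.
        raise ValueError("mu exceeds the maximal growth rate of the model")
    c_s = 5.0 * uptake / (0.10 - uptake)       # Equation 9.39 solved for C_S
    return q_p, q_s, c_s


q_p, q_s, c_s = rates_at_growth_rate(0.05)
print(f"q_P = {q_p:.4f}, q_S = {q_s:.5f}, C_S = {c_s:.2f} mg/l")
```

Choosing a different free variable only changes the order in which the same three equations are solved.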

9.7 Estimation of the Parameters of the Kinetic Model from Chemostat Experiments
The kinetic and stoichiometric description of growth and product formation of cultured microorganisms, or cells from higher organisms, under single substrate limited conditions requires information on:
• The (hyperbolic) substrate uptake kinetics (the qS(CS) function with qSmax and KS as parameters)
• The qP(µ) relation with its parameters (α, β, γ)
• The Herbert–Pirt substrate distribution equation with parameters a, b, mS

9.7.1 Minimal Number of Chemostat Experiments Needed
As has been shown earlier, chemostat cultivation allows manipulating the growth rate µ, and therefore µ is the most obvious free variable for such a system. Experiments at different growth rates µ can therefore be carried out to obtain the parameters of the black box kinetic model. Previously it was shown for batch systems how the biomass specific consumption (or production) rate of a compound of interest, e.g., qS for substrate or µ for biomass, is calculated from the experimental measurements (volumes and concentrations) in combination with the proper mass balances. Here we will show how the q-rates can be obtained from chemostat experiments. An important question is what minimal number of different experiments is needed to obtain the parameters. In case of noncatabolic product formation, three different data sets on µ, qS, qP, and CS are needed to obtain the values of a, b, and mS of the Herbert–Pirt equation by solving the resulting set of linear equations, as is shown in the example below.

TABLE 9.1 Choices of Free Variables for the Black Box Kinetic Model

| Free Variable | Determined by Kinetic Model |
|---|---|
| CS | qS, µ, qP as function of CS |
| µ | qP, qS, CS as function of µ |
| qS | CS, µ, qP as function of qS |
| qP | µ, qS, CS as function of qP |


Furthermore the parameters qSmax and KS can be obtained from a plot of qS versus CS, and the relation between qP and µ can be found from a plot of qP versus µ. If no noncatabolic product formation occurs, the Herbert–Pirt equation reduces to:

$$q_S = a\mu + m_S \tag{9.45}$$

In this case minimally two different data sets on µ, qS, and CS are required to obtain the parameters a, mS, qSmax, and KS. In this case the qP(µ) function is a linear function of the growth rate:

$$q_P = \alpha\mu + \beta \tag{9.46}$$

Only in this case can a and mS be obtained graphically. According to the Herbert–Pirt equation a straight line is expected if qS is plotted versus µ, which is indeed found in most cases. The slope of this line equals a. The intercept with the vertical axis equals mS. The two parameters (α, β) of the qP(µ) relation and the parameters of the hyperbolic relation (KS, qSmax) also require minimally two experiments. These are obtained from plots of qS versus CS (hyperbolic qS function) and of qP versus µ (qP(µ) function). In practice, however, it is wise to carry out more experiments than the minimum number (two or three) in order to obtain statistically reliable parameter values.
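The graphical procedure for a and mS amounts to fitting a straight line through (µ, qS) data. The sketch below does this by ordinary least squares, using as input the three product-free steady states (experiments 1 to 3) of Table 9.3 in Example 5; the helper function is illustrative and is not the authors' procedure.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


# Experiments 1-3 of Table 9.3, where q_P = 0 so that q_S = a*mu + m_S.
mu_data = [0.030, 0.100, 0.300]          # 1/h
q_s_data = [-0.032, -0.060, -0.140]      # mol glucose/C-molX/h

a, m_s = fit_line(mu_data, q_s_data)
print(f"a   = {a:.3f} mol glucose/C-molX")
print(f"m_S = {m_s:.3f} mol glucose/C-molX/h")
print(f"Y_SX,max = {1.0 / abs(a):.2f} C-molX/mol glucose")
```

With only three points the fit has a single residual degree of freedom, which is why more experiments than the minimum are advisable.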

9.7.2 Chemostat Experiments and Obtaining the Model Parameters
A typical set of chemostat experiments consists of cultivations at different known flow rates φout, whereby the concentrations of substrate, product, and biomass, the flow rates, and V are measured under steady state conditions. From these measurements and the mass balances the q-values are calculated at each µ imposed on the biological system. This set of calculated biomass specific rates and measured CS values can then be used to establish the kinetic and stoichiometric functions and their parameters. The required procedures are outlined in the example below.

Example 5: Kinetics and Stoichiometric Model from Chemostat Experiments
A microorganism is cultivated in a chemostat at different inflow rates. The chemostat broth volume is 1.2 m3 and is kept at this value by controlling the outflow rate. The organism grows aerobically, uses glucose as carbon source and NH4+ as N-source, and produces alanine (C3H7O2N) as (noncatabolic) product. The pH is controlled using a 10 N solution of NaOH. Air sparging is used to transfer O2 to the culture and to remove the produced CO2. A stirrer is used to achieve ideal mixing of the contents of the culture vessel. The nutrient solution which is fed into the chemostat contains glucose at a concentration of 2000 mol/m3. For each flow rate applied, the chemostat is allowed to achieve steady state, where glucose is the single limiting nutrient. From the considerations outlined above it can be inferred that a minimum of three experiments is needed in this case, because alanine is a noncatabolic product. In practice, however, six experiments are performed at different inflow rates of the nutrient solution. Achievement of a steady state is observed from the measured concentrations in the chemostat, which reach constant values after some time. During each steady state, measurements are performed on:
• φin: flow rate of the nutrient solution into the chemostat (m3/h)
• φout: flow rate of broth out of the chemostat (m3/h)
• CSin: the glucose concentration in the inflowing nutrient solution (mol/m3)
• CS: the glucose concentration in the chemostat (mol/m3)
• V: the volume of the broth in the chemostat (m3)
• CX: the biomass concentration in the chemostat (C-mol/m3)
• CP: the alanine concentration in the chemostat (mol alanine/m3)
• φalk: the supply rate of NaOH solution (alkali), to control the pH

Results can be found in Table 9.2.

Task 1: Calculate for experiment 4 the biomass specific rates µ, qS, and qP.
Answer: It has been derived earlier in this chapter that, from the mass balance for biomass in a steady state chemostat, the specific growth rate µ is equal to the dilution rate of the chemostat (Equation 9.14), thus µ = D = φout/V, and therefore

$$\mu = \frac{0.42}{1.2} = 0.35\ \text{C-molX/C-molX/h}$$

The value of qS is the total rate of consumed substrate divided by the total biomass present in the chemostat, qS = RateS/MX. The rate of consumed substrate, RateS, is obtained from the substrate mass balance:

d(V CS)/dt = 0 = RateS + rate of substrate entering - rate of substrate leaving

This gives RateS = -0.3623 × 2000 + 0.42 × 0.097 = -724.6 + 0.0407 = -724.56 mol glucose/h. Note that this rate is negative, which is logical because substrate is consumed. Subsequently qS is calculated by dividing RateS by the total amount of biomass present in the fermentor (MX = V CX = 1.2 × 2829 = 3394.8 C-molX): qS = RateS/MX = -724.56/3394.8 = -0.2134 mol glucose/C-molX/h. The value of qP follows similarly from the product mass balance as qP = +0.10 mol alanine/C-molX/h. In a similar way these rates are calculated for the other five chemostat experiments. The results are shown in Table 9.3.

Task 2: Make a graph of -qS versus CS, obtain the values for qSmax and KS, and give their proper units.
Answer: This graph shows that -qS increases with CS in a nonlinear, hyperbolic way. The exact values of KS and qSmax can be obtained by nonlinear fitting of the qS and CS data to the hyperbolic substrate uptake relation. A popular alternative is to rewrite the hyperbolic substrate uptake relation in its inverse form (Lineweaver–Burk plot):

$$\frac{1}{q_S} = \frac{1}{q_S^{max}} + \frac{K_S}{q_S^{max}}\cdot\frac{1}{C_S} \tag{9.47}$$

This shows that a plot of 1/qS versus 1/CS gives a straight line with slope KS/qSmax and intercept 1/qSmax. The slope and intercept can be obtained by linear regression, after which qSmax and KS follow. This method is not advised because it gives a disproportionate weight to the low concentration data. Using nonlinear regression one obtains KS = 0.1 mol glucose/m3 and qSmax = -0.433 mol glucose/C-molX/h.
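A nonlinear fit of the hyperbolic uptake relation can be sketched without any fitting library. The approach below is one simple possibility and not necessarily the method used by the authors: for each candidate KS on a grid, the best qSmax follows in closed form from linear least squares, and the KS with the smallest sum of squared errors is kept. The data are the CS values of Table 9.2 and the -qS values of Table 9.3; the grid range and step are assumptions made here.

```python
# Glucose concentrations (mol/m3) and uptake magnitudes -q_S (mol/C-molX/h).
C_S = [0.008, 0.016, 0.048, 0.097, 0.190, 1.390]
UPTAKE = [0.032, 0.060, 0.140, 0.213, 0.287, 0.404]


def best_q_max(k_s):
    """Least-squares q_max for a fixed K_S (the model is linear in q_max)."""
    x = [c / (k_s + c) for c in C_S]
    return sum(xi * yi for xi, yi in zip(x, UPTAKE)) / sum(xi * xi for xi in x)


def sum_squared_error(k_s):
    """Sum of squared residuals at K_S with its optimal q_max."""
    q_max = best_q_max(k_s)
    return sum((y - q_max * c / (k_s + c)) ** 2 for c, y in zip(C_S, UPTAKE))


# Assumed search grid: K_S from 0.0005 to 1.0 mol/m3 in steps of 0.0005.
candidates = [i * 0.0005 for i in range(1, 2001)]
k_s_fit = min(candidates, key=sum_squared_error)
q_max_fit = best_q_max(k_s_fit)

print(f"K_S     = {k_s_fit:.4f} mol glucose/m3")
print(f"q_S,max = {-q_max_fit:.3f} mol glucose/C-molX/h")
```

Because the residuals are minimized in the original (untransformed) rates, all data points carry equal weight, in contrast to the Lineweaver–Burk linearization.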

Task 3: Propose an equation for the qP(µ) function.
Answer: A plot of qP versus µ using the data of the six experiments shows that:

TABLE 9.2 Results from a Series of Steady State Chemostat Experiments

| Experiment | φin (m3/h) | φout (m3/h) | CSin (mol/m3) | CS (mol/m3) | V (m3) | CX (C-mol/m3) | CP (mol/m3) | φalk (m3/h) |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.0347 | 0.036 | 2000 | 0.008 | 1.200 | 1805 | 0 | 0.0013 |
| 2 | 0.1125 | 0.12 | 2000 | 0.016 | 1.200 | 3126 | 0 | 0.0075 |
| 3 | 0.3316 | 0.36 | 2000 | 0.048 | 1.200 | 3941 | 0 | 0.0284 |
| 4 | 0.3623 | 0.42 | 2000 | 0.097 | 1.200 | 2829 | 809 | 0.0577 |
| 5 | 0.4017 | 0.4797 | 2000 | 0.190 | 1.200 | 2335 | 1167 | 0.0780 |
| 6 | 0.4699 | 0.5759 | 2000 | 1.390 | 1.200 | 1939 | 1454 | 0.1060 |

TABLE 9.3 Calculated Specific Rates for the Chemostat Experiments

| Experiment | CS (mol/m3) | qS (mol glucose/C-molX/h) | µ (C-molX/C-molX/h) | qP (mol alanine/C-molX/h) |
|---|---|---|---|---|
| 1 | 0.008 | -0.032 | 0.030 | 0 |
| 2 | 0.016 | -0.060 | 0.100 | 0 |
| 3 | 0.048 | -0.140 | 0.300 | 0 |
| 4 | 0.097 | -0.213 | 0.350 | 0.100 |
| 5 | 0.190 | -0.287 | 0.400 | 0.200 |
| 6 | 1.39 | -0.404 | 0.480 | 0.360 |

For µ < 0.30 h⁻¹, qP = 0. For µ > 0.30 h⁻¹, qP increases linearly with µ. The slope is 2. Hence qP = 2(µ - 0.30). This type of product formation kinetics is typical of overflow metabolism, where there is an imbalance between the uptake rate of the substrate and the rate of biomass formation. The surplus of the substrate taken up is spent by secretion of a product. If the cell did not have such an "escape," the surplus of substrate taken up (which cannot be converted into biomass) would lead to very high levels of unprocessed intracellular intermediates.

Task 4: Provide the substrate Herbert–Pirt relation for the experiments where µ < […]

[…] µ > 0.05 h⁻¹) 0.40µ + 0.021   (9.59), (9.60)

From a plot of YSP as a function of µ (see Figure 9.3) it can be observed that YSP has a maximum value at µ = 0.05 h⁻¹. This µ value is called the optimal µ, µopt. In this example µopt = 0.05 h⁻¹ and qPopt = 0.0075 mol P/C-molX/h.

Example 8: Catabolic Product Formation
Consider the fermentative growth of yeast on glucose and ammonium, with production of ethanol (C2H6O) as catabolic product. From chemostat experiments the linear equation for substrate consumption has been obtained:

$$q_S = -1.111\,\mu - 0.020 \tag{9.61}$$

FIGURE 9.3 Plot of the operational yield of product on substrate, YSP (mol/mol), as a function of the specific growth rate µ (h⁻¹) for the lysine example.

If the elemental composition of the biomass is known (here we assume the standard average composition), this is sufficient information to derive the independent growth reaction:

-1.111 C6H12O6 - 0.2 NH4+ + 1.8722 C2H6O + 1 C1H1.8O0.5N0.2 + 1.9222 CO2 + 0.2 H+ + 0.45 H2O = 0   (9.62)

The independent maintenance (catabolic) reaction, which is the fermentation of glucose to ethanol and CO2 (rate 0.02 mol glucose/C-molX/h), is:

-1 C6H12O6 + 2 C2H6O + 2 CO2 = 0   (9.63)

Task 1: Give the algebraic relation for the ethanol (symbol e) yield on glucose, YSE, as a function of µ and give its units.
Answer: Ethanol is produced both in the growth reaction and in the maintenance reaction. From the ethanol stoichiometry of both reactions and the reaction rates (µ for the growth reaction and 0.02 for the maintenance reaction), the linear expression for the specific ethanol production rate is obtained as

$$q_e = 1.8722\,\mu + 0.04 \tag{9.64}$$

Now the expression for the operational yield of ethanol on glucose can be derived from YSE = qe/(-qS):

$$Y_{SE} = \frac{q_e}{(-q_S)} = \frac{1.8722\,\mu + 0.04}{1.111\,\mu + 0.02} \tag{9.65}$$

A plot of YSE as a function of µ is shown in Figure 9.4. The units of YSE are mol ethanol per mol glucose. This equation shows that YSE depends on µ. At µ = 0 the yield is 2; at high µ, YSE approaches 1.685. These values are the ethanol/glucose ratios of the two independent reactions. The highest YSE is obtained at µ = 0, hence µopt = 0 and YSEopt = 2 mol/mol.

Task 2: Calculate the biomass yield on glucose as a function of µ, YSX(µ).
Answer:

$$Y_{SX} = \frac{\mu}{(-q_S)} = \frac{\mu}{1.111\,\mu + 0.020} \tag{9.66}$$
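Equations 9.65 and 9.66 are easily tabulated over a range of growth rates. The short sketch below does so for the anaerobic yeast example; the function names and the chosen µ values are illustrative.

```python
def ethanol_yield(mu):
    """Operational ethanol yield Y_SE (mol ethanol/mol glucose), Equation 9.65."""
    return (1.8722 * mu + 0.04) / (1.111 * mu + 0.020)


def biomass_yield(mu):
    """Operational biomass yield Y_SX (C-molX/mol glucose), Equation 9.66."""
    return mu / (1.111 * mu + 0.020)


for mu in (0.0, 0.01, 0.05, 0.1, 0.3):
    print(f"mu = {mu:4.2f} 1/h: Y_SE = {ethanol_yield(mu):.3f}, "
          f"Y_SX = {biomass_yield(mu):.3f}")

# Limits for mu -> infinity: the stoichiometry of the growth reaction alone.
y_se_limit = 1.8722 / 1.111
y_sx_limit = 1.0 / 1.111
```

The ethanol yield falls from 2 toward about 1.685 as µ increases, while the biomass yield rises from 0 toward 1/1.111, about 0.90 C-molX/mol glucose, as the share of maintenance shrinks.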

FIGURE 9.4 Plot of the operational yield of ethanol on substrate, YSE (mol/mol), as a function of the specific growth rate µ (h⁻¹) for the anaerobic yeast example.

FIGURE 9.5 Plot of the operational yield of biomass on substrate, YSX (C-molX/mol glucose), as a function of the specific growth rate µ (h⁻¹) for the anaerobic yeast example.

A plot of YSX as a function of µ is shown in Figure 9.5. It can be seen from this figure that YSX decreases with decreasing µ, due to the increasing contribution of maintenance at low growth rates.

9.8.3 Calculation of the Stoichiometry of the Overall Growth Plus Product Reaction
It has been shown above that, from the independent reactions for growth, product formation, and maintenance and the linear equation for substrate consumption, mathematical expressions can be derived for the operational yields of biomass and product on the substrate as a function of the growth rate µ. Note that in the case of a noncatabolic product an expression for qP as a function of µ is also needed. In a similar way the relations for the operational yields of the other relevant compounds of the system as a function of µ can be obtained. These relations allow calculating the stoichiometry of a single overall reaction for growth and product formation at a certain growth rate µ. As can be inferred from these relations, the stoichiometry of this overall growth plus product reaction changes as a function of µ. This will be illustrated in the following example:

Example 9: Calculation of the Stoichiometry of the Overall Growth Plus Product Reaction for Noncatabolic Product Formation
Assume three independent reactions for growth, product formation, and maintenance.

Independent growth reaction (rate µ):

-0.333 C6H12O6 - 0.2 NH4+ - 0.95 O2 + 1 C1H1.8O0.5N0.2 + 1 CO2 + 0.2 H+ + 1.40 H2O = 0   (9.67)

Independent lysine (C6H15O2N2+) production reaction (rate qP):

-1.5 C6H12O6 - 2 NH4+ - 2 O2 + 1 C6H15O2N2+ + 3 CO2 + 1 H+ + 5 H2O = 0   (9.68)

Independent maintenance reaction (rate -mS = -(-0.005) = 0.005 mol glucose/C-molX/h):

-1 C6H12O6 - 6 O2 + 6 CO2 + 6 H2O = 0

The coefficients of the linear equation for substrate consumption,

$$q_S = -0.333\,\mu - 1.5\,q_P - 0.005 \tag{9.69}$$

have been used to derive these reactions. From the stoichiometries of the three reactions given above, similar linear equations can be derived for the specific conversion rates of the other reactants:

qNH4+ = -0.2µ - 2qP
qO2 = -0.95µ - 2qP - 6 × 0.005
qCO2 = +1µ + 3qP + 6 × 0.005
qH+ = +0.2µ + 1qP
qW = +1.4µ + 5qP + 0.030

These linear relations can be used to calculate the overall growth (plus product) reaction at different growth rates µ and specific rates of product formation qP. Assume, e.g., that µ = 0.05 C-molX/C-molX/h and that qP = 0.05 mol lysine/C-molX/h at this growth rate. The linear relations then yield the following q-values ((C-)mol i/C-molX/h):

µ = +0.05
qP = +0.05
qS = -0.0967
qNH4+ = -0.11
qO2 = -0.1775
qCO2 = +0.230
qH+ = +0.060
qW = +0.35

Dividing all conversion rates by the specific rate of lysine production provides the stoichiometric coefficients of the overall growth and product reaction, normalized to 1 mol lysine (C6H15O2N2+) produced:

-1.934 C6H12O6 - 2.2 NH4+ - 3.55 O2 + 1 C1H1.8O0.5N0.2 + 1 C6H15O2N2+ + 4.6 CO2 + 1.2 H+ + 7 H2O = 0   (9.70)

For a different µ and qP a different overall growth (plus product) reaction can be calculated in the same way.
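The procedure of this example, summing the three independent reactions weighted by their rates and normalizing to the product, can be sketched for arbitrary µ and qP. The dictionaries below encode the stoichiometries of the growth, lysine production, and maintenance reactions given above; the data layout and function name are illustrative choices.

```python
# Stoichiometric coefficients per unit rate of each independent reaction
# (negative = consumed, positive = produced).
GROWTH = {"glucose": -0.333, "NH4+": -0.2, "O2": -0.95, "biomass": 1.0,
          "CO2": 1.0, "H+": 0.2, "H2O": 1.4}
LYSINE = {"glucose": -1.5, "NH4+": -2.0, "O2": -2.0, "lysine": 1.0,
          "CO2": 3.0, "H+": 1.0, "H2O": 5.0}
MAINTENANCE = {"glucose": -1.0, "O2": -6.0, "CO2": 6.0, "H2O": 6.0}
MAINTENANCE_RATE = 0.005    # mol glucose/C-molX/h


def overall_reaction(mu, q_p):
    """Overall stoichiometry per mol lysine produced at given mu and q_P."""
    rates = {}
    weighted = ((GROWTH, mu), (LYSINE, q_p), (MAINTENANCE, MAINTENANCE_RATE))
    for reaction, rate in weighted:
        for compound, coefficient in reaction.items():
            rates[compound] = rates.get(compound, 0.0) + coefficient * rate
    return {compound: rate / q_p for compound, rate in rates.items()}


reaction = overall_reaction(mu=0.05, q_p=0.05)
for compound, coefficient in reaction.items():
    print(f"{coefficient:+.3f} {compound}")
```

With unrounded intermediate rates the glucose coefficient comes out as 1.933 rather than the 1.934 obtained from the rounded qS of -0.0967; the other coefficients match Equation 9.70.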

9.9 Conclusions
In this chapter the basic concepts of black box modeling of fermentation processes have been introduced. It has been shown that a relatively simple approach, which does not require detailed information on the metabolism of the applied microorganism, is very useful for the design and optimization of fermentation processes. First the concept of single nutrient limited growth has been introduced. It has been shown that the kinetics of microbial growth and product formation under these conditions can be described with only a few parameters:
• A hyperbolic kinetic relation for substrate uptake, requiring only two parameters, KS and qSmax.
• A relation between the specific rate of product formation, qP, and the specific growth rate, µ. For a noncatabolic product this function must be established experimentally.
• The Herbert–Pirt substrate distribution relation, which gives:
  • Information on the maintenance requirements, expressed in the parameter mS
  • The stoichiometry of the independent reactions for
    - growth (which requires the stoichiometric parameter a = 1/YSXmax)
    - product formation (which requires the stoichiometric parameter b = 1/YSPmax)
    - maintenance (catabolism, which does not require a stoichiometric parameter)
This model description contains surprisingly few parameters (qSmax, KS, mS, YSPmax, YSXmax, and several parameters in the qP(µ) function), but still provides a description of how all q-rates depend on CS, or equivalently how all q-rates and yields depend uniquely on µ. It is remarkable that the enormous complexity of the living cell can be well described, with respect to all relevant uptake and secretion rates, using a relatively simple black box model with only a few parameters.

References and Further Reading
de Poorter, L.M.I., Geerts, W.J., and Keltjens, J.T., 2007. Coupling of Methanothermobacter thermautotrophicus methane formation and growth in fed-batch and continuous cultures under different H2 gassing regimes. Appl. Environ. Biotechnol., 73:740–49.
Geerdink, M.J., van Loosdrecht, M.C.M., and Luyben, K.Ch.A.M., 1996. Biodegradability of diesel oil. Biodegradation, 7:73–81.
Jansen, M.L.A., Krook, D.J.J., de Graaf, K., van Dijken, J.P., Pronk, J.T., and de Winde, J.H., 2006. Physiological characterization and fed-batch production of an extracellular maltase of Schizosaccharomyces pombe CBS 356. FEMS Yeast Res., 6:888–901.
Heijnen, J.J., Roels, J.A., and Stouthamer, A.H., 1979. Application of balancing methods in modeling the penicillin fermentation. Biotechnol. Bioeng., 21(12):2175–201.
Heijnen, J.J., 1991. A new thermodynamically based correlation of chemotrophic biomass yields. Antonie Van Leeuwenhoek, 60(3–4):235–56.
Lineweaver, H. and Burk, D., 1934. The determination of enzyme dissociation constants. J. Am. Chem. Soc., 56:658–66.
Michaelis, L. and Menten, M., 1913. Die Kinetik der Invertinwirkung. Biochem. Z., 49:333–69.
Pirt, S.J., 1965. The maintenance energy of bacteria in growing cultures. Proc. R. Soc. Lond. Ser. B, 163:224–31.
Revilla, G., Lopez-Nieto, M.J., Luengo, J.M., and Martin, J.F., 1984. Carbon catabolite repression of penicillin biosynthesis by Penicillium chrysogenum. J. Antibiot. (Tokyo), 37(7):781–89.
Savageau, M.A., 1995. Michaelis-Menten mechanism reconsidered: Implications of fractal kinetics. J. Theor. Biol., 176:115–24.
Schill, N., van Gulik, W.M., Voisard, D., and von Stockar, U., 1996. Continuous cultures limited by a gaseous substrate: Development of a simple, unstructured mathematical model and experimental verification with Methanobacterium thermoautotrophicum. Biotechnol. Bioeng., 51:645–58.
Smolders, G.J.F., Van der Meij, J., Van Loosdrecht, M.C.M., and Heijnen, J.J., 1994. Model of the anaerobic metabolism of the biological phosphorus removal process; stoichiometry and pH influence. Biotechnol. Bioeng., 43:461–70.
Van Gulik, W.M., Antoniewicz, M.R., deLaat, W.T., Vinke, J.L., and Heijnen, J.J., 2001. Energetics of growth and penicillin production in a high-producing strain of Penicillium chrysogenum. Biotechnol. Bioeng., 72(2):185–93.
Van Gulik, W.M., ten Hoopen, H.J., and Heijnen, J.J., 2001. The application of continuous culture for plant cell suspensions. Enzyme Microb. Technol., 28(9–10):796–805.
Xu, F. and Ding, H., 2007. A new kinetic model for heterogeneous (or spatially confined) enzymatic catalysis: Contributions from the fractal and jamming (overcrowding) effects. Appl. Catal. A Gen., 317:70–81.

10 Metabolic Models for Growth and Product Formation

Walter M. van Gulik
Delft University of Technology

10.1 Introduction
10.2 Modular Approach
10.3 Detailed Stoichiometric Models
     Estimation of ATP Stoichiometry Parameters  •  ATP Stoichiometry in Metabolic Networks  •  Calculation of Maximum Yields of Biomass  •  Calculation of Maximum Yields of Biomass and Product  •  Calculation of Metabolic Network Topology for Growth on Mixed Substrates  •  Theoretical Yield Limits to the Overproduction of Amino Acids  •  Limit Functions for Maximum Product Yields
10.4 Conclusions
References

10.1 Introduction

In many fermentation processes the product yield Y_SP is a parameter of major economic importance. The theoretical maximum product yield for a given organism and product, Y_SP^max, is determined by the stoichiometry of the product pathway and the connected central metabolic pathways. The theoretical maximum product yield can be changed by altering the stoichiometry of the metabolic network through genetic interventions, such as:
•  Replacement of a transporter which consumes ATP by a transporter which does not (active transport becomes passive transport, or vice versa)
•  Replacement of a decarboxylation reaction
•  Replacement of an NADPH-consuming reaction by an NADH-consuming reaction (or vice versa)
•  Introduction of a catabolic pathway for a novel substrate
•  Replacement of an ATP-consuming reaction by a non-ATP-consuming reaction
•  Introduction of an alternative product pathway

The quantitative effects of such changes on the maximum theoretical yields of product, and also of biomass, on substrate can be calculated with so-called stoichiometric metabolic models. In principle, a stoichiometric metabolic model is tailor-made for each organism and incorporates all the available biochemical information on the studied organism. With the advent of (partially) annotated genomes, the present knowledge is rather extensive (and is increasing every day), and large-scale stoichiometric metabolic models based on genomic information may easily contain more than 1000 metabolic reactions (Feist et al., 2007; Oh et al., 2007; Duarte et al., 2004).


TABLE 10.1 Setting Up a Stoichiometric Metabolic Model

Detailed Approach
•  Include all potentially available enzymes based on textbook, literature, and (annotated) genome information.
•  Use transcript analysis to identify the expressed genes and the enzymes present.
•  Formulate the complete stoichiometry of each of the about 200–1000 reactions.
•  Define the stoichiometric matrix S.
•  Apply matrix calculations for analysis of the model and flux analysis.

Modular Approach
•  Central metabolism lumped into the biosynthesis reactions for the 12 key carbon metabolites.
•  Lumped reactions for regeneration of key cofactors such as ATP, NADH, NADPH.
•  Lumped reactions for the biosynthesis of monomers (e.g., amino acids, nucleotides, fatty acids) from the key carbon metabolites.
•  Reactions for polymerization of monomers to polymers.
•  Single, lumped reaction for biomass formation.
•  Lumped reaction for product formation.
•  Calculations can be performed by hand.

In principle there are two approaches to formulating a stoichiometric metabolic model (see Table 10.1). The detailed approach, which results in a large and detailed model containing hundreds of reactions, is becoming more and more feasible with genome-wide approaches. Here the computer must be used, and software is now under development which allows the direct definition of the stoichiometric metabolic model from the annotated genome (together with, e.g., transcriptome information, which yields the expressed genes, and hence the enzymes and therefore the reactions which are present). It has been shown that, because of their detailed structure, genome-scale metabolic models are well suited for the in silico analysis of, e.g., the behavior of certain organisms and the a priori assessment of the effects of alterations of the metabolic network stoichiometry by means of genetic intervention. However, the various applications of genome-scale models will not be treated here, but can be found in Section IV of this book. In this section we will focus on the applications of stoichiometric metabolic models of moderate size, containing 100–200 reactions, as well as the modular approach.
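The step "define the stoichiometric matrix S" in Table 10.1 can be illustrated with a minimal sketch. The three-reaction network below is hypothetical (reaction names, metabolites, and coefficients are illustrative assumptions, not taken from any organism); the code builds S from a reaction list and checks whether a given flux vector v satisfies the steady-state balance S·v = 0 for the intracellular metabolites.

```python
# Hypothetical toy network; names and coefficients are illustrative only.
# Each reaction maps an intracellular metabolite to its stoichiometric
# coefficient (negative = consumed, positive = produced).
# External compounds are not balanced and are therefore omitted.
REACTIONS = {
    "uptake":     {"A": 1.0},              # substrate_ext -> A
    "conversion": {"A": -1.0, "B": 2.0},   # A -> 2 B
    "secretion":  {"B": -1.0},             # B -> product_ext
}


def stoichiometric_matrix(reactions):
    """Return (metabolites, reaction_names, S), with S[i][j] the coefficient
    of metabolite i in reaction j."""
    metabolites = sorted({m for coeffs in reactions.values() for m in coeffs})
    names = list(reactions)
    S = [[reactions[r].get(m, 0.0) for r in names] for m in metabolites]
    return metabolites, names, S


def net_production(S, fluxes):
    """Compute S . v, the net production rate of every balanced metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, fluxes)) for row in S]


metabolites, names, S = stoichiometric_matrix(REACTIONS)
v = [1.0, 1.0, 2.0]  # fluxes in the order of `names`
residual = net_production(S, v)
is_steady_state = all(abs(x) < 1e-9 for x in residual)
print(metabolites, names, residual, is_steady_state)
```

For a network of realistic size the same bookkeeping is normally delegated to a matrix library; the hand-written version is used here only to keep the sketch self-contained.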

10.2 Modular Approach

The synthesis of each of the hundreds of different molecules which are present in the cells (amino acids present in proteins, nucleotides present in RNA and DNA, fatty acids, glycerol, and other compounds present in the membrane lipids, carbohydrates and other compounds present in cell walls, cofactors, etc.) and in secreted products occurs in the biosynthetic pathways. Here each particular compound synthesized has its own pathway, often a sequence of enzyme-catalyzed steps. In the modular approach many parts of the metabolic network are lumped into a single reaction. This reduces the complexity to a large extent, but at the cost of losing detailed stoichiometric information. It should be realized here that, if the lumping procedure is performed in a proper way, the simplified model should still provide the correct information on the overall stoichiometry of metabolism, e.g., maximum theoretical yields of biomass and product(s). This approach was followed by Ingraham et al. (1983). They showed that precursor synthesis in central metabolism can be reduced to 12 reactions for the production of the 12 key intermediates from the C-source supplied. As an example they tabulated the biosynthesis costs, in terms of C-source and associated consumption/production of ATP, NADPH, NADH, and CO2, for two different substrates, namely glucose and malate. Furthermore, reactions were defined for the biosynthesis of the monomers, i.e., amino acids, nucleotides, fatty acids, lipopolysaccharides, carbohydrates, etc., from the 12 precursors, 1-C compounds, NH3, and S, taking into account the consumption/production of ATP, NADPH, and NADH. The energy requirements for polymerization of the monomers to macromolecules were also given. Finally, from the biochemical composition of the biomass, the required amounts of building blocks were calculated.
This approach can be considered a relatively simple stoichiometric metabolic model which provides understanding of how cellular metabolism is organized and demonstrates what resources are needed to produce all the building blocks and coenzymes for the production of a certain amount of cells. If, in addition, the production of ATP and reducing equivalents from the substrate is included, this approach allows the calculation of biomass yields on different C-sources. However, in order to perform metabolic flux analysis (MFA), i.e., to calculate the fluxes through metabolic pathways under certain conditions, more detail is required. One of the first published papers on the application of a stoichiometric metabolic model is probably that of Verhoff and Spradlin (1976). They used a stoichiometric model of the TCA cycle, including variations thereof, to analyze different possible metabolic routes for the production of citric acid by Aspergillus niger, which can be considered a first example of elementary mode analysis.

10.3 Detailed Stoichiometric Models

One of the first examples of the application of a detailed stoichiometric metabolic model for the quantitative estimation of metabolic fluxes was published by Rabkin and Blum in 1985. They used a complete stoichiometric model of the "upper" metabolic pathways (gluconeogenesis, glycolysis, pentose phosphate pathway) and a minimal model of the "lower" pathways (mitochondrial and associated reactions) to perform MFA of hepatocytes in the presence and absence of the hormone glucagon. At present the availability of (partly) annotated genomes for an increasing number of microorganisms offers the possibility of genome-scale metabolic reconstruction, i.e., the construction of detailed metabolic models based on the available genes. Such models therefore consist of a large number of biochemical reactions, often more than 1000, and contain many parallel pathways, a large number of transport reactions for many different compounds which may serve as alternative substrates, and many connected catabolic reactions. In fact, a genome-scale metabolic reconstruction should not be considered a model but rather a database of all biochemical reactions for which a certain organism has the genes available. It should be realized that, because most genomes are not yet completely annotated, these genome-scale reconstructions contain many dead-end reactions, that is, reactions which produce a certain compound for which no reaction is available to consume it. This is not a problem, because genome-scale reconstruction is an ongoing process, and as annotation of genomes proceeds the reaction databases can be extended and dead ends can be resolved. It should also be realized that in real life a microorganism growing under certain conditions and in the presence of certain nutrients never expresses all available genes.
Depending on the conditions which the microorganisms encounter, genetic regulation will alter the topology of the biochemical reaction network such that the organism is optimally adapted to the growth conditions. Thus, depending on the growth conditions, the biochemical reaction network consists of a certain subset of all available reactions. Genome-scale metabolic reconstructions can be used, e.g., to explore the metabolic capabilities of a certain microorganism to adapt to certain conditions, to predict the effects of genetic alterations, to identify and characterize all possible phenotypes, to calculate optimal reaction network states for maximum product formation, etc. (Price et al., 2004). Genome-scale metabolic models can also be used to calculate the flux distributions through the metabolic network from measured net conversion rates under certain conditions. However, this is not possible as such with a complete metabolic reconstruction for a certain microorganism, because many parallel pathways and futile cycles exist. Meaningful flux distributions can only be obtained by using a relevant subset of reactions from the complete database. Subsets of reactions can be obtained in several ways: either manually, by using available biochemical and/or transcriptome information (i.e., on which enzymes are expressed under certain conditions), or computationally, by means of constraint-based optimization. In constraint-based optimization, a technique such as linear programming (LP) is used to calculate the flux distribution through the metabolic network under the condition that a certain objective function, e.g., yield of biomass on substrate, is maximized. It has been shown that from a large database of reactions, obtained from genomic reconstruction, largely reduced minimal models can be obtained for the description of growth under certain defined conditions, e.g., on a defined minimal medium (Burgard et al., 2001), as is often the case under laboratory and even in an increasing number of cases under industrial conditions. Burgard et al. showed that a large stoichiometric model for E. coli, consisting of 720 biochemical reactions, could be reduced to 224 reactions to support growth on a glucose-only medium and 229 for an acetate-only medium. Such reduced models can subsequently be applied to calculate flux distributions by means of MFA. However, uncertainties remain concerning specific details of the metabolic network, e.g., possible alternative pathways, intracellular compartmentation, and cofactor specificity of particular reactions. If these details are important, knowledge on specific aspects can be obtained through additional biochemical research. An exception to this is the operational ATP stoichiometry of oxidative phosphorylation, as well as the growth-dependent and growth-independent maintenance energy costs. These are generally not known beforehand and cannot be obtained in a straightforward manner from biochemical research. Furthermore, the values of these parameters vary between microorganisms. Therefore the ATP balance is normally not used as a constraint in the flux balancing procedures. However, the unknown ATP-stoichiometry parameters can be estimated from experimental data, as will be shown later on in this chapter. It will also be demonstrated that, if the ATP stoichiometry is known, the stoichiometric model can be applied to perform a priori flux calculations and to calculate maximum theoretical yields of biomass and product on single and multiple substrates.
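The dead-end reactions mentioned in this section, i.e., reactions producing a compound that no other reaction consumes, can be located mechanically in a reaction database. The following is a minimal sketch under simplifying assumptions: the reaction fragment and metabolite names are hypothetical, every reaction is treated as irreversible in the written direction, and compounds allowed to leave the system must be listed explicitly.

```python
def dead_end_metabolites(reactions, exchanged=()):
    """Return the metabolites that are produced by some reaction but
    consumed by none.

    reactions: dict mapping reaction name -> {metabolite: coefficient},
               negative coefficients are consumed, positive are produced.
    exchanged: metabolites that may cross the system boundary and are
               therefore not reported as dead ends.
    """
    produced, consumed = set(), set()
    for coeffs in reactions.values():
        for metabolite, coeff in coeffs.items():
            if coeff > 0:
                produced.add(metabolite)
            elif coeff < 0:
                consumed.add(metabolite)
    return sorted(produced - consumed - set(exchanged))


# Hypothetical fragment of a reconstruction (illustrative names only).
toy = {
    "r1": {"glc": -1, "g6p": 1},
    "r2": {"g6p": -1, "f6p": 1},
    "r3": {"g6p": -1, "x": 1},   # x is produced but never consumed
}

print(dead_end_metabolites(toy, exchanged=("glc", "f6p")))
```

In this toy fragment only the compound `x` is reported, because `f6p` is declared as leaving the system; without that declaration it would be reported as well.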

10.3.1 Estimation of ATP Stoichiometry Parameters

By using different carbon substrates, or ratios of mixed substrates, and different growth rates, the relative contributions of substrate-level and oxidative phosphorylation to the total generation of ATP can be manipulated experimentally. This allows the estimation of the unknown ATP-stoichiometry coefficients of oxidative phosphorylation (P/O ratio), growth-dependent maintenance (K_X), and growth-independent maintenance (m_ATP) of the metabolic network model from experimental data (van Gulik and Heijnen, 1995; Vanrolleghem et al., 1996; van Gulik et al., 2001). A correct estimation of these coefficients, and moreover a verification of whether these coefficients may be considered constant for a certain range of experimental conditions, is crucial for acceptable flux predictions within the range of experimental conditions (interpolation) and beyond (extrapolation). This is, for instance, the case if the metabolic network model is used to predict maximum biomass and/or product yields. So far this aspect has received little attention. The method to estimate the ATP stoichiometry parameters is directly based on the ATP balance, which contains the parameters only in a linear form. This makes the application of the estimation procedure relatively straightforward.

10.3.2 ATP Stoichiometry in Metabolic Networks

The basis of metabolic network models is formed by the balance equations formulated for the components that take part in the biochemical conversions in the cell. For ATP, a component considered to be in pseudo steady state, the production equals the consumption, which sets the net result of the balance to zero. Although the ATP stoichiometry coefficients of many ATP-generating and ATP-consuming reactions are known, difficulties arise with respect to the uncertain ATP stoichiometry of oxidative phosphorylation, additional ATP costs of anabolism, and the ATP consumption in maintenance processes. As a result, the ATP balance can be written as:

$$\frac{P}{O} \cdot q_{2e} - \sum_i q_{ATP}^{i} - K_X \cdot \mu - m_{ATP} = 0 \tag{10.1}$$

where q_2e is the specific flux of electrons through the respiratory chain, Σq_ATP^i is the net rate of ATP consumption in the part of the metabolic network model of which the ATP stoichiometry is known (i.e., the result of all stoichiometrically fixed ATP usage, as well as production in substrate-level phosphorylation), and µ is the specific growth rate of the cells.


The parameters K_X and m_ATP are operational values for growth-associated maintenance and nongrowth-associated maintenance, respectively. It should be realized that P/O, being the rate of ATP synthesis divided by the rate of oxygen consumption in oxidative phosphorylation, cannot be considered a parameter. The reason is that this ratio is determined by the division of the electron flux over the different proton-translocating complexes (I, III, and IV) of the respiratory chain, which have different H+/2e− stoichiometries. This division, and thus the P/O ratio, is a function of the growth conditions, e.g., the carbon substrate used, the growth rate, the rate of product formation, etc. However, if the metabolic model applied for the metabolic flux balancing is sufficiently detailed, the origin of the reducing equivalents generated in microbial catabolism is known, and thus so are the relative contributions of complexes I, III, and IV of the respiratory chain to oxidative phosphorylation. To include this, the ATP balance has to be extended to:

$$\delta \cdot \left( q_{2e}^{NADH:mit} + \alpha \cdot q_{2e}^{NADH:cyt} + \beta \cdot q_{2e}^{FADH} \right) - \sum_i q_{ATP}^{i} - K_X \cdot \mu - m_{ATP} = 0 \tag{10.2}$$

where α and β represent the relative contributions to proton translocation of electrons delivered by cytosolic NADH and FADH, respectively. The values of these parameters depend on the construction of the electron transport chain of the organism under study. If electrons derived from cytosolic NADH and FADH bypass complex I, both α and β may have, for example, a value of 2/3. If complex I is not operative, e.g., in the case of Saccharomyces cerevisiae, α and β are both equal to 1. The parameter δ represents the maximum P/O ratio, i.e., the ratio obtained when all electrons pass the complete respiratory chain, and is thus, by definition, not equal to the P/O ratio as defined in Equation 10.1. However, if the H+/2e− stoichiometry in proton translocation as well as the H+/ATP stoichiometry of the ATP synthase can be considered independent of the growth conditions, δ can also be considered independent of the growth conditions. Estimates of the parameters δ, K_X, and m_ATP are obtained by calculating, for each experimental condition, the values of q_2e^NADH:mit, q_2e^NADH:cyt, q_2e^FADH, Σq_ATP^i, and µ from metabolic flux balancing without using the ATP balance as a constraint in the flux balancing procedure. This can be accomplished either by leaving out the ATP balance from the network model altogether, or by including an ATP-hydrolysis reaction:

$$\mathrm{ATP} + \mathrm{H_2O} \rightarrow \mathrm{ADP} + \mathrm{P_i} + \mathrm{H^+} \tag{10.3}$$

Subsequently, after the fluxes have been obtained from metabolic flux balancing, the coefficients δ, K_X, and m_ATP are estimated using Equation 10.2. This requires at least three different sets of the above-mentioned fluxes (q's and µ). This can, e.g., be achieved by performing chemostat cultivations on different carbon substrates, or on a varying ratio of mixed substrates (van Gulik and Heijnen, 1995; Vanrolleghem et al., 1996; van Gulik et al., 2001). Because Equation 10.2 is linear in the parameters, the estimation procedure is straightforward. If only sets at a single growth rate are available, growth-dependent and nongrowth-dependent maintenance cannot be distinguished, and Equation 10.2 must be simplified to:

$$\delta \cdot \left( q_{2e}^{NADH:mit} + \alpha \cdot q_{2e}^{NADH:cyt} + \beta \cdot q_{2e}^{FADH} \right) - \sum_i q_{ATP}^{i} - K' \cdot \mu = 0 \tag{10.4}$$

in which K′ is the overall result of growth-related maintenance and nongrowth-related maintenance:

$$K' = K_X + \frac{m_{ATP}}{\mu} \tag{10.5}$$

The resulting values of δ, K_X, and m_ATP (or δ and K′) will be the best estimates for the given sets of fluxes.
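Because the ATP balance is linear in δ, K_X, and m_ATP, the estimation reduces to ordinary linear least squares. The sketch below assumes Equation 10.2 is read as δ·x − Σq_ATP − K_X·µ − m_ATP = 0, where x stands for the bracketed electron-flux term, so that Σq_ATP = δ·x − K_X·µ − m_ATP. The function names and the data are hypothetical: the "measurements" are generated from assumed parameter values purely to show that the procedure recovers them, and they are not experimental results.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def estimate_atp_parameters(conditions):
    """Least-squares estimate of (delta, K_X, m_ATP).

    conditions: list of (x, mu, sum_q_atp) tuples, one per steady state,
    where x = q_NADH:mit + alpha * q_NADH:cyt + beta * q_FADH.
    Assumed model: sum_q_atp = delta * x - K_X * mu - m_ATP.
    """
    rows = [[x, -mu, -1.0] for x, mu, _ in conditions]
    y = [q for _, _, q in conditions]
    # Normal equations (A^T A) p = A^T y for the three parameters.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(AtA, Aty)


# Synthetic, noise-free data from assumed "true" values (illustration only).
TRUE_DELTA, TRUE_KX, TRUE_MATP = 1.8, 0.4, 0.03
points = [(0.5, 0.05), (0.9, 0.10), (0.7, 0.03), (1.2, 0.08)]  # (x, mu)
data = [(x, mu, TRUE_DELTA * x - TRUE_KX * mu - TRUE_MATP) for x, mu in points]

delta, K_X, m_ATP = estimate_atp_parameters(data)
print(delta, K_X, m_ATP)
```

At least three conditions that differ in both x and µ are needed; if all conditions share one growth rate, the columns for K_X and m_ATP become proportional and only the lumped coefficient K′ of Equation 10.5 can be identified.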


10.3.3 Calculation of Maximum Yields of Biomass

A detailed stoichiometric metabolic model which contains the proper ATP stoichiometry allows the calculation of maximum theoretical yields of biomass and product(s) on the carbon substrate and therefore (see Chapter 9 of this section on black box modeling) also on consumed oxygen, produced carbon dioxide, etc. An example of this approach can be found in van Gulik and Heijnen (1995). They used published data on biomass yields in steady-state carbon-limited chemostat cultures of S. cerevisiae and C. utilis at a dilution rate of 0.1 h⁻¹ (Verduyn, 1991). The different conditions were anaerobic growth of S. cerevisiae on glucose; aerobic growth of S. cerevisiae on glucose, ethanol, and acetate; and aerobic growth of C. utilis on acetate, citrate, ethanol, gluconate, glucose, glycerol, lactate, pyruvate, and succinate. Determined stoichiometric metabolic models were constructed for the two yeast strains and the different growth conditions. All models contained three degrees of freedom, which means that three rates had to be specified in order to calculate all other rates (net conversion rates and reaction rates). The rates to be specified were the specific growth rate of the cells, µ, the rate of ATP hydrolysis to account for maintenance, and (in the case of aerobic growth) the rate of NADH oxidation, which was required to introduce the P/O ratio as a parameter in the stoichiometric model:

ATP hydrolysis for maintenance: 1 ATP + 1 H2O → 1 ADP + 1 Pi + 1 H+
NADH oxidation: 1 NADH + 0.5 O2 + 1 H+ → 1 H2O + 1 NAD+

The fact that all experimental data were collected at one and the same growth rate prevents the distinction between growth-associated and nongrowth-associated maintenance energy requirements. As a consequence, the growth-dependent and growth-independent maintenance coefficients could not be estimated separately, but only a combination of the two, namely K′ (see Equation 10.5).
It is known, however, that nongrowth-associated maintenance needs of yeasts are relatively low (Verduyn et al., 1991), certainly at a specific growth rate of 0.1 h⁻¹. Therefore, it could be assumed that for both S. cerevisiae and C. utilis nongrowth-associated maintenance energy needs were negligible. In this study the costs for peptide chain elongation were assumed to be 4 ATP per amino acid. However, as pointed out by Verduyn et al. (1991), it should be realized that this is a relatively uncertain figure which might well be higher due to extra energy costs associated with the addition of incorrect amino acids to the chain and with subsequent proofreading. For this reason, growth-associated maintenance energy needs were assumed to be proportional to the rate of protein synthesis and to equal an amount of K′ mol ATP/C-mol of protein synthesized. The rate of extra ATP consumption for growth-associated maintenance therefore equals:

$$r_{ATP,maint} = K' \cdot X_P \cdot \mu \tag{10.6}$$

where X_P is the protein fraction of the biomass. From the metabolic network for anaerobic growth of S. cerevisiae on glucose, which had two degrees of freedom, namely the growth rate µ and the rate of ATP hydrolysis for maintenance, the following expression for the biomass yield as a function of ATP consumption for maintenance was obtained:

$$Y_{SX,netw}^{an} = \frac{1}{0.904 + 0.5 \cdot K' \cdot X_P(i)} \tag{10.7}$$

Under aerobic conditions the biomass yield is a function of two parameters, the effective P/O ratio, δ, and the maintenance coefficient, K′, and has the general form:

$$Y_{SX,netw}^{aer}(i) = \frac{\alpha_1(i) + \alpha_2(i) \cdot \delta}{\alpha_3(i) + \alpha_4(i) \cdot \delta + \alpha_5(i) \cdot K' \cdot X_P(i)} \tag{10.8}$$


where α1(i)–α5(i) (i = 1–12 for the 12 different yeast-substrate combinations under aerobic conditions) are constant coefficients. It was assumed that the value of K′ is the same for both S. cerevisiae and C. utilis. However, the effective P/O ratio, δ, was assumed to be different, which follows from the differences between the electron transport chains of the two yeasts, i.e., S. cerevisiae lacks phosphorylation site I while C. utilis does not. Both parameters, K′ and δ, were assumed not to be influenced by the carbon substrate used. Thus, three parameters had to be estimated to describe the growth yields of the two yeasts on different carbon substrates. This was accomplished by minimizing the sum of the squared differences between the experimental biomass yields, Y_SX,exp(i), and the biomass yields calculated from the model, Y_SX,netw(i). The estimated values for the effective P/O ratio and growth-associated ATP needs for maintenance were, respectively, δ = 1.20 for S. cerevisiae and δ = 1.53 for C. utilis, and K′ = 1.37 mol ATP/C-mol protein for both yeasts. The lower estimate of the effective P/O ratio of S. cerevisiae agrees well with the absence of phosphorylation site I. Using an average figure for the protein content of the biomass of 50%, or 0.022 C-mol protein/g biomass, the estimated growth-associated maintenance was 30 mmol ATP/g biomass. With the estimated values of the ATP stoichiometry parameters, δ and K′, the biomass yields for the different experimental conditions can be calculated using Equations 10.7 and 10.8. A comparison between the yields calculated from the metabolic networks with the fitted ATP stoichiometry parameters and the experimental yields obtained for S. cerevisiae and C. utilis is shown in Figure 10.1. As can be seen from this figure, the experimental biomass yields could be predicted well for growth on a wide variety of substrates using the estimated values for the effective P/O ratio and growth-associated maintenance.
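The figure of 30 mmol ATP per gram of biomass follows directly from the estimated K′ and the assumed protein content. As a quick check of the arithmetic, using only the two values quoted in the text:

```latex
K' \cdot X_P
  = 1.37\ \frac{\text{mol ATP}}{\text{C-mol protein}}
    \times 0.022\ \frac{\text{C-mol protein}}{\text{g biomass}}
  \approx 0.030\ \frac{\text{mol ATP}}{\text{g biomass}}
  = 30\ \frac{\text{mmol ATP}}{\text{g biomass}}
```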

[Figure: predicted yield (g/g) plotted against measured yield (g/g), both axes from 0 to 0.8; data points numbered 1–13.]

FIGURE 10.1 Predicted versus measured biomass yields of S. cerevisiae and Candida utilis in carbon-limited chemostat culture at a dilution rate of 0.1 h⁻¹. (∆): S. cerevisiae: 1. anaerobic growth on glucose; 2. aerobic growth on glucose; 3. aerobic growth on ethanol; 4. aerobic growth on acetate. (■): Aerobic growth of C. utilis: 5. acetate; 6. succinate; 7. lactate; 8. gluconate; 9. glucose; 10. citrate; 11. glycerol; 12. ethanol; 13. pyruvate. (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)

10.3.4 Calculation of Maximum Yields of Biomass and Product

An approach comparable to that described above was followed for P. chrysogenum, although a more detailed metabolic network was constructed in which cellular compartmentation, i.e., division of the cells into cytosolic, mitochondrial, and peroxisomal compartments, was also taken into account (van Gulik et al., 2000). From chemostat experiments on three different carbon sources, carried out at a range of different dilution rates, the ATP stoichiometry parameters, that is, the P/O ratio and the growth-dependent and growth-independent maintenance coefficients, were estimated (van Gulik et al., 2001). In addition, a parameter K_P was introduced to account for extra ATP dissipation for penicillin-G production. Because the penicillin biosynthesis pathway is divided over different compartments of the cell and the final product is actively excreted, it was anticipated that additional ATP would be required for transport processes. However, the estimated value of the parameter K_P appeared to be surprisingly high, namely 73 mol ATP per mol of penicillin-G produced. The estimated parameters are shown in Table 10.2. In a similar way as described above, expressions for the maximum yields of biomass and of the product penicillin-G on substrate, as a function of the P/O ratio and the growth-dependent and growth-independent maintenance coefficients, were derived from the stoichiometric metabolic models for the different substrates. The relations obtained are shown in Table 10.3. After substitution of the estimated ATP stoichiometry parameters, the numerical values of the yield and maintenance coefficients on the supplied carbon source and on oxygen can be calculated (see Tables 10.4 and 10.5, respectively). Validation of the predictions of the biomass yield under producing and nonproducing conditions in independent experiments showed that model predictions and experimental results corresponded very well (see Table 10.6).

TABLE 10.2 Estimated Values of the ATP-Stoichiometry Parameters for P. chrysogenum with Their 95% Confidence Intervals

Parameter    Value
δ            1.84 ± 0.08 mol ATP/mol O
K_X          0.38 ± 0.11 mol ATP/C-mol biomass
K_P          73 ± 20 mol ATP/mol penicillin
m_ATP        0.033 ± 0.012 mol ATP/C-mol biomass/h

Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke, J.L., and Heijnen, J.J., Biotechnol. Bioeng., 72:185–193. With permission.

10.3.5 Calculation of Metabolic Network Topology for Growth on Mixed Substrates

Using a relatively simple, uncompartmented stoichiometric model for the growth of S. cerevisiae on different carbon sources, van Gulik and Heijnen (1995) showed that constraint-based optimization can provide correct predictions of changes of the metabolic network structure initiated by changes in the environmental conditions. The subset of reactions applied in the model allowed for growth on both ethanol and glucose as carbon substrates, because in addition to the central metabolic pathways for glucose catabolism the pathways required for growth on ethanol (i.e., gluconeogenesis and the glyoxylate shunt) were also present. This resulted in a stoichiometric metabolic model with seven degrees of freedom, which was underdetermined because the only input variables were the consumption rates of glucose and ethanol. Therefore constrained linear optimization was applied to estimate the metabolic flux pattern as a function of the glucose/ethanol ratio in the feed. The objective chosen for the optimization was maximum biomass yield on the mixed carbon substrate. Subsequently the fluxes through the metabolic network of S. cerevisiae were estimated for growth on glucose and ethanol alone and for growth on a range of glucose/ethanol mixtures, using the estimated values for the ATP stoichiometry parameters. It was calculated that changes in the topology of the metabolic network, that is, switching on and switching off of metabolic pathways, occurred at ethanol fractions of the feed of 0.09, 0.48, 0.58, and 0.73 C-mol/C-mol (Figure 10.2a through f).

TABLE 10.3 Derived Relations for the Calculation of the Maximum Biomass and Penicillin Yields and Maintenance Coefficients on Substrate and Oxygen from the Estimated ATP-Stoichiometry Parameters for P. chrysogenum

Growth on Glucose

$$Y_{SX}^{max} = \frac{\delta + 0.283}{1.07\delta + 0.566K_X + 1.02} \qquad Y_{OX}^{max} = \frac{\delta + 0.283}{0.0256\delta + 0.566K_X + 0.727}$$

$$Y_{SP}^{max} = \frac{\delta + 0.283}{11.2\delta + 0.566K_P + 12.3} \qquad Y_{OP}^{max} = \frac{\delta + 0.283}{1.72\delta + 0.566K_P + 9.58}$$

$$m_S = \left( \frac{5 - 2\delta}{9.83\delta + 2.78} + 0.203 \right) \cdot m_{ATP} \qquad m_O = \left( \frac{5 - 2\delta}{9.83\delta + 2.78} + 0.203 \right) \cdot m_{ATP}$$

Growth on Ethanol

$$Y_{SX}^{max} = \frac{\delta - 0.179}{0.732\delta + 0.357K_X + 0.82} \qquad Y_{OX}^{max} = \frac{\delta - 0.179}{0.0471\delta + 0.536K_X + 1.42}$$

$$Y_{SP}^{max} = \frac{\delta - 0.179}{7.71\delta + 0.357K_P + 9.29} \qquad Y_{OP}^{max} = \frac{\delta - 0.179}{2.07\delta + 0.536K_P + 15.6}$$

$$m_S = \left( \frac{5 - 2\delta}{13\delta - 2.32} + 0.154 \right) \cdot m_{ATP} \qquad m_O = \left( \frac{5 - 2\delta}{8.67\delta - 1.55} + 0.231 \right) \cdot m_{ATP}$$

Growth on Acetate

$$Y_{SX}^{max} = \frac{\delta - 0.278}{1.14\delta + 0.556K_X + 1.37} \qquad Y_{OX}^{max} = \frac{\delta - 0.278}{0.0833\delta + 0.556K_X + 1.66}$$

$$Y_{SP}^{max} = \frac{\delta - 0.278}{11.9\delta + 0.556K_P + 16.8} \qquad Y_{OP}^{max} = \frac{\delta - 0.278}{2.36\delta + 0.556K_P + 19.4}$$

$$m_S = \left( \frac{5 - 2\delta}{8\delta - 2.22} + 0.25 \right) \cdot m_{ATP} \qquad m_O = \left( \frac{5 - 2\delta}{8\delta - 2.22} + 0.25 \right) \cdot m_{ATP}$$

Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke, J.L., and Heijnen, J.J., Biotechnol. Bioeng., 72:185–193. With permission.

TABLE 10.4 Calculated Yield and Maintenance Parameters of Penicillin and Biomass on Carbon Source with Their 95% Confidence Intervals

C-source    Y_SX^max (C-mol/C-mol)    Y_SP^max (mol/C-mol)    m_S (C-mol/C-mol/h)
Glucose     0.663 ± 0.013             0.029 ± 0.004           0.0088 ± 0.0032
Ethanol     0.721 ± 0.015             0.034 ± 0.005           0.0071 ± 0.0026
Acetate     0.425 ± 0.010             0.020 ± 0.003           0.0117 ± 0.0042

Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke, J.L., and Heijnen, J.J., Biotechnol. Bioeng., 72:185–193. With permission.
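The substitution step that leads from Tables 10.2 and 10.3 to Table 10.4 can be sketched as follows. The coefficients are copied from the carbon-source relations of Table 10.3 and the parameter values from Table 10.2; the function and variable names are illustrative, and the grouping of the maintenance relation as ((5 − 2δ)/(eδ + f) + g)·m_ATP is an assumed reading of the printed table, adopted here because it reproduces the m_S values of Table 10.4 to within rounding.

```python
# Estimated ATP-stoichiometry parameters for P. chrysogenum (Table 10.2).
DELTA = 1.84    # maximum P/O ratio, mol ATP / mol O
K_X = 0.38      # mol ATP / C-mol biomass
K_P = 73.0      # mol ATP / mol penicillin
M_ATP = 0.033   # mol ATP / C-mol biomass / h

# Coefficients of the Table 10.3 relations on the carbon source.
# Yield relations:      Y   = (delta + a) / (b * delta + c * K + d)
# Maintenance relation: m_S = ((5 - 2 * delta) / (e * delta + f) + g) * m_ATP
#   (the grouping of the maintenance relation is an assumed reading)
TABLE_10_3 = {
    "glucose": {"a": 0.283, "Y_SX": (1.07, 0.566, 1.02),
                "Y_SP": (11.2, 0.566, 12.3), "m_S": (9.83, 2.78, 0.203)},
    "ethanol": {"a": -0.179, "Y_SX": (0.732, 0.357, 0.82),
                "Y_SP": (7.71, 0.357, 9.29), "m_S": (13.0, -2.32, 0.154)},
    "acetate": {"a": -0.278, "Y_SX": (1.14, 0.556, 1.37),
                "Y_SP": (11.9, 0.556, 16.8), "m_S": (8.0, -2.22, 0.25)},
}


def max_yield(delta, k, a, b, c, d):
    """Maximum yield on substrate; k is K_X (biomass) or K_P (penicillin)."""
    return (delta + a) / (b * delta + c * k + d)


def maintenance(delta, m_atp, e, f, g):
    """Maintenance coefficient on substrate."""
    return ((5.0 - 2.0 * delta) / (e * delta + f) + g) * m_atp


def carbon_source_parameters(source):
    """Return (Y_SX_max, Y_SP_max, m_S) for one carbon source."""
    row = TABLE_10_3[source]
    y_sx = max_yield(DELTA, K_X, row["a"], *row["Y_SX"])
    y_sp = max_yield(DELTA, K_P, row["a"], *row["Y_SP"])
    m_s = maintenance(DELTA, M_ATP, *row["m_S"])
    return y_sx, y_sp, m_s


for source in TABLE_10_3:
    y_sx, y_sp, m_s = carbon_source_parameters(source)
    print(f"{source:8s} Y_SX={y_sx:.3f}  Y_SP={y_sp:.3f}  m_S={m_s:.4f}")
```

The same pattern applies to the oxygen-based relations of Table 10.3; only the coefficient tuples change.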

Figure 10.2a shows the calculated metabolic luxes through the central metabolic pathways for growth on 100% glucose. he irst change in the network structure, which was predicted when the ethanol fraction of the feed was 0.09 (C-mol/C-mol), was that the lux through transketolase converting xylulose 5-phosphate฀+฀erythrose 4-phosphate into glyceraldehyde 3-phosphate฀+฀fructose 6-phosphate became equal to zero (Figure 10.2b). he reason for this is that, at this point, there is no need for NADPH synthesis through the pentose phosphate pathway because suicient NADPH can be produced in a more economic way through NADP linked acetaldehyde dehydrogenase and NADP linked isocytrate dehydrogenase. However, it should be realized that the predicted switch depends on the assumption that

TABLE 10.5 Calculated Yield and Maintenance Parameters of Penicillin and Biomass on Oxygen with Their 95% Confidence Intervals

C-source   Y_OX^max (C-mol/mol)   Y_OP^max (mol/mol)   m_O (mol/C-mol/h)
Glucose    2.15 ± 0.143           0.039 ± 0.008        0.0088 ± 0.0032
Ethanol    0.97 ± 0.04            0.028 ± 0.005        0.0106 ± 0.0038
Acetate    0.77 ± 0.03            0.024 ± 0.004        0.0117 ± 0.0042

Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke, J.L., and Heijnen, J.J., Biotechnol. Bioeng., 2001, 72:185–193. With permission.

TABLE 10.6 Measured and Calculated Effect of the Specific β-Lactam Production on the Steady State Biomass Concentration in Glucose Limited Chemostat Cultures of P. chrysogenum

                              Measured Biomass Concentration (g/L)    Calculated Biomass Concentration (g/L)
                              High Production    Low Production       High Production    Low Production
Chemostat at µ = 0.01 h^-1    1.95               2.74                 1.98               2.60
Chemostat at µ = 0.03 h^-1    2.77               3.27                 2.63               3.28
Chemostat at µ = 0.06 h^-1    3.25               3.69                 3.31               3.65

Source: van Gulik, W.M., Antoniewicz, M.R., DeLaat, W.T.A.M., Vinke, J.L., and Heijnen, J.J., Biotechnol. Bioeng., 2001, 72:185–193. With permission.

these two NADP-linked enzymes are active under these conditions. The second change in the network was predicted to occur at an ethanol content of the feed of 0.48 C-mol/C-mol. At this point all acetyl-CoA is synthesized from ethanol and therefore the flux through pyruvate dehydrogenase becomes equal to zero (Figure 10.2c). When the ethanol content of the feed is further increased, the filling-up of the citric acid cycle can no longer be provided by pyruvate carboxylase alone, and the metabolic network predicted that the glyoxylate shunt (i.e., isocitrate lyase and malate synthase) became operative. At an ethanol fraction of 0.58 C-mol/C-mol the flux through pyruvate carboxylase became equal to zero and the carbon flux was predicted to be channeled through PEP-carboxykinase to convert oxaloacetate to PEP (Figure 10.2d). Thereafter, a further increase of the ethanol fraction resulted in a predicted reversal of several reversible steps in glycolysis until, at an ethanol content of 0.73 C-mol/C-mol, the calculated flux through phosphofructokinase fell to zero and was replaced by the reversed reaction through fructose-1,6-bisphosphatase (Figure 10.2e). After this last change the minimal model for growth on ethanol was obtained. Figure 10.2f shows the metabolic flux pattern for growth on 100% ethanol. The question is now how to validate these model predictions experimentally. One way to do so is to cultivate S. cerevisiae in carbon limited chemostat cultures on different mixtures of glucose and ethanol and then measure the activities of the relevant enzymes. This was done by de Jong-Gubbels et al. (1995). They found that the activities of isocitrate lyase, malate synthase, PEP-carboxykinase, and fructose-1,6-bisphosphatase in cell free extracts were negligible in chemostat cultures on 100% glucose.
However, when the cells were cultivated on glucose/ethanol mixtures, malate synthase activity was detected at an ethanol content of 0.4 C-mol/C-mol and above, and fructose-1,6-bisphosphatase activity was detected at an ethanol content of 0.7 C-mol/C-mol and higher. This corresponds very well with the predictions obtained from the metabolic network. However, activities of isocitrate lyase and PEP-carboxykinase were already detectable at low ethanol fractions of the feed. Furthermore, it was found that the pyruvate kinase activity decreased at increasing ethanol content. This was indeed predicted by the metabolic network (Figure 10.2a through f), although it was also predicted that the flux through this enzyme reached a low but constant level above an ethanol content of 0.58 C-mol/C-mol. Unfortunately, the experimental data on pyruvate kinase activity contained too much scatter to draw further conclusions. The model predicted that the flux

FIGURE 10.2 Estimated optimal metabolic flux patterns for aerobic growth of S. cerevisiae on a mixture of glucose and ethanol. All fluxes are given in C-mol of carbon transferred and are presented as fractions of the consumption rate of mixed carbon substrate (C-mol/h). (a) Growth on 100% glucose. (b) Cessation of NADPH production in the pentose phosphate pathway at an ethanol fraction of 0.09 C-mol/C-mol in the feed. (c) Cessation of the flux through pyruvate decarboxylase and start of glyoxylate cycle at 0.48 C-mol ethanol/C-mol. (d) Cessation of the flux through pyruvate carboxylase and, instead, reversal of the carbon flux via PEP-carboxykinase at an ethanol fraction of 0.58 C-mol/C-mol. (e) Reversal of several reversible steps in glycolysis, cessation of the carbon flux through phosphofructokinase, and instead reversal of the carbon flux via fructose-1,6-bisphosphatase at an ethanol fraction of 0.73 C-mol/C-mol. (f) Growth on 100% ethanol. (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)

through pyruvate carboxylase was constant up to an ethanol content of 0.48 C-mol/C-mol. Up to this ethanol fraction this is the sole anaplerotic route to fill up the TCA cycle. Between ethanol fractions of 0.48 and 0.58 C-mol/C-mol the model predicted that the anaplerotic function of pyruvate carboxylase is gradually taken over by the glyoxylate shunt. In contrast to this, enzyme activity measurements did not reveal significant changes in the activity of pyruvate carboxylase upon transition from 100% glucose to 100% ethanol (de Jong-Gubbels et al., 1995). Finally, the flux through phosphofructokinase was predicted to decrease at increasing ethanol fractions and to fall to zero at an ethanol fraction of 0.73 (Figure 10.2e). In this case too, the in vitro measured enzyme activity was not influenced by the ethanol fraction in the feed. The authors concluded, however, that the actual fluxes through the enzymes have most probably been modulated at the metabolome level, instead of the enzyme level.


It is clear that in vitro enzyme activity measurements can be used to verify the presence of certain enzymes; however, they do not provide proof of an actual flux through an enzyme. A much more elegant approach to verify the model predictions for growth of S. cerevisiae on glucose/ethanol mixtures was followed by Stückrath et al. (2002). They constructed null mutants for the glyoxylate cycle enzymes malate synthase and isocitrate lyase and the gluconeogenic enzymes PEP carboxykinase and fructose bisphosphatase. Subsequently these null mutants were cultivated in carbon limited chemostat cultures on glucose/ethanol mixtures ranging from 0 to 100% ethanol. Following this approach the metabolic switching points can be found experimentally. At an increasing ethanol content of the feed the cells need certain pathways in order to be able to metabolize all the ethanol supplied. If a key enzyme is not available, the cells will only be able to catabolize the ethanol supplied up to a certain ethanol/glucose ratio. Above this ratio the surplus ethanol cannot be consumed, which results in measurable amounts of residual ethanol in the effluent of the chemostat. From the experiments carried out by Stückrath et al. (2002) it was found that the null mutants for isocitrate lyase and for malate synthase, both of which lack a functional glyoxylate cycle, could grow in ethanol + glucose limited chemostats up to an ethanol fraction in the feed of 0.50 C-mol/C-mol. A further increase of the ethanol content of the feed resulted in a proportional increase of the residual ethanol concentration and a proportional decrease of the biomass concentration, up to an ethanol content of 100%, where growth of these mutants was not possible at all. This observation corresponded very well with the ethanol fraction of 0.48 C-mol/C-mol which was predicted by the stoichiometric model as the switch point for requirement of the glyoxylate cycle (see Figure 10.3).
It was found that, as was predicted by the model calculations, the PEP carboxykinase null mutant was able to grow at higher ethanol fractions.


However, above an ethanol fraction of 0.60 C-mol/C-mol the residual ethanol concentration increased and the biomass concentration decreased in a linear fashion until no growth occurred at 100% ethanol. These experimental observations also corresponded very well with the predicted switch point for this enzyme of 0.58 C-mol/C-mol. Only the behavior of the fructose bisphosphatase null mutant during chemostat growth on the ethanol/glucose mixtures deviated from the model predictions, although the trend was predicted well. Already at an ethanol fraction of 0.60 C-mol/C-mol the measured biomass concentration was significantly lower than predicted by the model, although no residual ethanol could be detected. Residual ethanol appeared only above an ethanol fraction of 0.84 C-mol/C-mol, while the model predicted the metabolic switch to occur at an ethanol fraction of 0.73 C-mol/C-mol. It can be concluded from these results that metabolic models of still moderate complexity can provide a fairly accurate description of changes in metabolic network structure as a result of changes in growth conditions. Furthermore, the approach of validating the model predictions experimentally by constructing the proper null mutants proved to be very successful. The experimental results showed that the metabolic model was able to provide a quantitative description of the behavior of these null mutants during growth on ethanol/glucose mixtures.
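The chemostat reasoning above (ethanol is consumed only up to a fixed ethanol/glucose ratio, and any surplus appears as residual ethanol) can be illustrated with a small sketch. The function `ethanol_balance` and the assumption that all glucose is consumed while ethanol uptake is capped at a fixed ratio to glucose uptake are simplifications introduced here; they are not the stoichiometric model itself. The switch points in the dictionary are the values quoted in the text.

```python
def ethanol_balance(feed_fraction, switch_fraction):
    """Split the ethanol in the feed into a consumed and a residual part.

    feed_fraction   : ethanol fraction of the feed (C-mol ethanol/C-mol substrate)
    switch_fraction : highest ethanol fraction the mutant can consume completely
    Returns (consumed, residual), both per C-mol of substrate fed.
    """
    if not 0.0 <= feed_fraction <= 1.0:
        raise ValueError("feed_fraction must lie between 0 and 1")
    if not 0.0 < switch_fraction < 1.0:
        raise ValueError("switch_fraction must lie strictly between 0 and 1")
    if feed_fraction <= switch_fraction:
        return feed_fraction, 0.0
    # Assumption: ethanol uptake is capped at a fixed ratio to glucose uptake.
    max_ratio = switch_fraction / (1.0 - switch_fraction)
    consumed = (1.0 - feed_fraction) * max_ratio
    return consumed, feed_fraction - consumed


# Predicted (model) and observed (chemostat) switch points quoted in the text.
SWITCH_POINTS = {
    "glyoxylate cycle": (0.48, 0.50),
    "PEP carboxykinase": (0.58, 0.60),
    "fructose bisphosphatase": (0.73, 0.84),
}

for name, (predicted, observed) in SWITCH_POINTS.items():
    print(f"{name}: predicted {predicted:.2f}, observed {observed:.2f}, "
          f"difference {observed - predicted:+.2f}")

for fraction in (0.50, 0.75, 1.00):
    consumed, residual = ethanol_balance(fraction, 0.50)
    print(f"feed {fraction:.2f}: consumed {consumed:.2f}, residual {residual:.2f}")
```

Under this simplification the residual ethanol rises linearly with the ethanol fraction above the switch point and no ethanol is consumed at 100% ethanol, in line with the proportional increase described for the glyoxylate cycle mutants.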

10.3.6 Theoretical Yield Limits to the Overproduction of Amino Acids

As has been shown above for penicillin-G production in P. chrysogenum, stoichiometric metabolic models can be applied to calculate limits to maximum product yields, if they contain the proper ATP stoichiometry parameters. In the following example theoretical yield limits to the overproduction of amino

FIGURE 10.3 Predicted and measured biomass yields (C-mol/C-mol) of S. cerevisiae grown in carbon limited chemostat cultures on different ratios of glucose and ethanol in the feed, plotted against the ethanol fraction in the feed (%). (a) Wild type; (b) Δmls1 and Δicl1; (c) Δpck1; (d) Δfbp1. (From Stückrath, I., Lange, H.C., Kötter, P., van Gulik, W.M., Entian, K.-D., and Heijnen, J.J., Biotechnol. Bioeng., 2002, 77(1), 61–72. With permission.)

acids will be calculated using the uncompartmented metabolic network model for S. cerevisiae (van Gulik and Heijnen, 1995). When nongrowth-associated maintenance energy needs are negligible, the well-known linear equation for substrate consumption for growth and product formation can be written as:

$$q_S = \frac{\mu}{Y_{SX}^{\max}} + \frac{q_P}{Y_{SP}^{\max}} \tag{10.9}$$

The operational yield of product on substrate is then given by:

$$Y_{SP} = \frac{q_P}{q_S} = \frac{q_P}{\dfrac{\mu}{Y_{SX}^{\max}} + \dfrac{q_P}{Y_{SP}^{\max}}} \tag{10.10}$$
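The effect of growth on the operational yield can be made concrete with a short sketch of Equations 10.9 and 10.10. The function name and all numerical values below are hypothetical and serve only to show the trend; they are not estimates for any organism.

```python
def operational_yield(mu, q_p, y_sx_max, y_sp_max):
    """Operational product yield Y_SP = q_P / q_S.

    q_S follows the linear relation q_S = mu / Y_SX^max + q_P / Y_SP^max,
    which neglects growth independent maintenance.
    """
    q_s = mu / y_sx_max + q_p / y_sp_max
    return q_p / q_s


# Hypothetical parameter values, chosen only for illustration.
Y_SX_MAX = 0.60   # C-mol biomass per C-mol substrate
Y_SP_MAX = 0.70   # mol product per C-mol substrate
Q_P = 0.01        # specific production rate

yield_without_growth = operational_yield(0.0, Q_P, Y_SX_MAX, Y_SP_MAX)
yield_with_growth = operational_yield(0.05, Q_P, Y_SX_MAX, Y_SP_MAX)

print(f"mu = 0.00: Y_SP = {yield_without_growth:.3f}")
print(f"mu = 0.05: Y_SP = {yield_with_growth:.3f}")
```

At zero growth rate the operational yield equals the maximum yield parameter, and any substrate diverted to biomass lowers it.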

By applying the metabolic network for growth of S. cerevisiae, possible stoichiometric limits to amino acid production were studied. Using the estimated values of δ′ and K′ and glucose as the substrate, the metabolic network provides values for the $Y_{SP}^{\max}$ parameter for each of the 20 amino acids which can theoretically be produced. From Equation 10.10 it follows that, at zero growth rate µ, the maximum theoretical value of the operational product yield, $Y_{SP}$, is equal to the parameter $Y_{SP}^{\max}$. For each amino acid, the value of $Y_{SP}^{\max}$ can be calculated from the metabolic network. However, it was found that calculation


of the fluxes through the metabolic network for the production of each of the 20 amino acids at zero growth rate (µ = 0) resulted, in some cases, in thermodynamic inconsistencies (e.g., backward operation of the citric acid cycle). It appeared that these inconsistencies occurred only for amino acids for which the production was accompanied by a net production of ATP. These thermodynamic inconsistencies could be avoided by dissipating the excess ATP produced. In these cases, biomass production might be a sink for excess ATP produced. Another possibility the cells might have is hydrolysis of ATP in futile cycles. For this example it was assumed that excess ATP could only be consumed through biomass production. For each amino acid produced, the minimum biomass production rate was calculated for which no thermodynamic inconsistencies occurred. From Equation 10.10 it can be inferred that, when biomass growth is required for production of these amino acids, and thus part of the carbon substrate is necessarily consumed for biomass formation, this will result in a limit to the maximum theoretical yield, $Y_{SP}^{\lim}$, where:

$$Y_{SP} \le Y_{SP}^{\lim} < Y_{SP}^{\max} \tag{10.11}$$

In such cases, ATP dissipation by other means, e.g., by increased maintenance energy requirements, would increase $Y_{SP}^{\lim}$. These limits have been calculated for all 20 amino acids. The results are shown in Figure 10.4. It can be seen from this figure that, for some amino acids, $Y_{SP}^{\lim}$ may reach values of only 50% of $Y_{SP}^{\max}$ or less.

10.3.7 Limit Functions for Maximum Product Yields

Given the stoichiometry of the metabolic network, the linear equation for substrate consumption for growth and amino acid production can be derived, as has been shown above for penicillin production

FIGURE 10.4 Metabolic network estimation of maximum theoretical yields (C-mol/C-mol) for production of each of the 20 amino acids in S. cerevisiae. Grey bars: Maximum theoretical yield of product on the carbon source under the assumption of zero biomass growth. Black bars: Limits to the theoretical product yields resulting from thermodynamic constraints (see text). (From van Gulik, W.M. and Heijnen, J.J., Biotechnol. Bioeng., 1995, 48, 681–698. With permission.)


in P. chrysogenum. As an example this was done for aerobic growth of S. cerevisiae on glucose with production of leucine, using the metabolic model of van Gulik and Heijnen (1995). The resulting equation contains the ATP-stoichiometry parameters and reads

$$-q_S = \left(\frac{0.176\,\delta + 0.0833\,K'_X + 0.166}{\delta + 0.400}\right)\mu + \left(\frac{1.25\,\delta + 0.667}{\delta + 0.400}\right) q_P \tag{10.12}$$

where δ is the P/O-ratio and $K'_X$ (mol ATP/C-mol biomass) is the growth dependent maintenance coefficient, which was estimated from chemostat data obtained at a specific growth rate of 0.1 h^-1 (van Gulik and Heijnen, 1995). From this equation the maximum yields of biomass and leucine on glucose follow as:

$$Y_{SX}^{\max} = \frac{\delta + 0.400}{0.176\,\delta + 0.0833\,K'_X + 0.166} \quad\text{and}\quad Y_{LEU}^{\max} = \frac{\delta + 0.400}{1.25\,\delta + 0.667} \tag{10.13}$$

From the relation for the maximum product yield it could be inferred that the minimum yield of leucine on glucose (for δ = 0) would be 0.60 mol/mol and that the maximum yield for the estimated P/O-ratio (δ = 1.20) would be 0.74 mol/mol. However, as has been pointed out above, if the biosynthesis of a product is accompanied by a net production of ATP, there should be a sink for the produced ATP as well. For the example on amino acid overproduction the formation of biomass has been assumed as the ATP sink, which resulted in lower operational yields for the amino acids alanine, glutamate, glutamine, glycine, leucine, phenylalanine, proline, serine, tyrosine, and valine. For each of these amino acids limit functions can be derived from the stoichiometry of the metabolic network, giving the upper limit to the yield of product on substrate as a function of the ATP stoichiometry parameters, such that thermodynamic constraints (i.e., no reversal of reactions which are irreversible under physiological conditions) are not violated. The yield limit functions for the overproduction of two amino acids, namely leucine and valine, are shown below:

$$Y_{LEU}^{\lim} = \frac{-\delta + 2.48\,K'_X + 2.46}{0.932\,\delta + 4.14\,K'_X + 4.10} \tag{10.14}$$

$$Y_{VAL}^{\lim} = \frac{-\delta + 2.48\,K'_X + 2.46}{-0.127\,\delta + 2.90\,K'_X + 2.87} \tag{10.15}$$
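As a numerical check, the maximum yield and the two limit functions can be evaluated directly. This is a minimal sketch that reads Equations 10.13 through 10.15 as the ratios coded below; the function names are hypothetical, and the parameter values δ = 1.20 and K′X = 0.644 mol ATP/C-mol biomass are the estimates quoted in the text.

```python
def y_leu_max(delta):
    """Maximum leucine yield on glucose (mol/mol), leucine part of Equation 10.13."""
    return (delta + 0.400) / (1.25 * delta + 0.667)


def y_leu_lim(delta, k_x):
    """Thermodynamic limit to the leucine yield (mol/mol), Equation 10.14."""
    return (-delta + 2.48 * k_x + 2.46) / (0.932 * delta + 4.14 * k_x + 4.10)


def y_val_lim(delta, k_x):
    """Thermodynamic limit to the valine yield (mol/mol), Equation 10.15."""
    return (-delta + 2.48 * k_x + 2.46) / (-0.127 * delta + 2.90 * k_x + 2.87)


DELTA = 1.20   # P/O ratio, mol ATP per 0.5 mol oxygen
K_X = 0.644    # mol ATP per C-mol biomass

print(f"Y_LEU^max at delta = 0:    {y_leu_max(0.0):.3f}")
print(f"Y_LEU^max at delta = 1.20: {y_leu_max(DELTA):.3f}")
print(f"Y_LEU^lim: {y_leu_lim(DELTA, K_X):.3f}")
print(f"Y_VAL^lim: {y_val_lim(DELTA, K_X):.3f}")
```

The computed values should land close to the figures given in the text (about 0.60 and 0.74 mol/mol for the maximum leucine yield, and about 0.36 and 0.62 mol/mol for the two limits); small differences in the last digit are to be expected from rounding of the published coefficients.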

Substitution of the estimated values of the ATP stoichiometry parameters, δ = 1.20 (mol ATP/0.5 mol oxygen) and $K'_X$ = 0.644 (mol ATP/C-mol biomass), yields $Y_{LEU}^{\lim}$ = 0.363 (mol/mol) and $Y_{VAL}^{\lim}$ = 0.624 (mol/mol). As can be inferred from these equations, these yield limits are not a function of the specific growth rate µ. The reason for this is that in the stoichiometric model for yeast of van Gulik and Heijnen (1995) growth independent maintenance energy requirements were not taken into account, because the data used were obtained from chemostat cultures carried out at the same growth rate and thus the growth independent maintenance could not be estimated. In order to close the ATP balance, the production rate of an amino acid which leads to a net production of ATP should be accompanied by a certain biomass production rate to consume the produced ATP. This implies that for each of these amino acids a ratio between the specific rate of amino acid production and specific growth rate exists for which the net production of ATP is equal to zero. This results in a fixed limit to the yield of amino acid on substrate which is independent of the growth rate. However, if growth independent maintenance is taken into account, substitution of $K'_X = K_X + (m_{ATP}/\mu)$ in

FIGURE 10.5 Predicted theoretical limits to the yield of the amino acid leucine on glucose (mol/mol) as a function of the growth rate (h^-1): (dotted line) without thermodynamic constraints and without taking growth independent maintenance requirement into account; (dashed line) with thermodynamic constraints and without growth independent maintenance; (solid line) with thermodynamic constraints and with growth independent maintenance.

Equation 10.14 yields an expression for the thermodynamic limit of the leucine yield as a function of the growth rate:

$$Y_{LEU}^{\lim} = \frac{-\delta + 2.48\left(K_X + \dfrac{m_{ATP}}{\mu}\right) + 2.46}{0.932\,\delta + 4.14\left(K_X + \dfrac{m_{ATP}}{\mu}\right) + 4.10} \tag{10.16}$$

Assuming a growth independent maintenance coefficient $m_{ATP}$ = 0.033 mol ATP/C-mol biomass/h and a growth dependent maintenance coefficient $K_X$ = 0.31 mol ATP/C-mol biomass, which yields $K'_X$ = 0.64 mol ATP/C-mol for a growth rate of 0.1 h^-1, a plot can be made of the thermodynamic limit to the leucine yield on glucose as a function of the growth rate (see Figure 10.5). As a comparison, the yield limit for the case where growth independent maintenance was not taken into account (Equation 10.14 with K′ = 0.64 mol ATP/C-mol) is plotted in the same figure (dashed line), as well as the maximum theoretical yield if thermodynamic constraints are violated (dotted line). It can be seen from this figure that, if growth independent maintenance is taken into account, a decrease of the growth rate toward zero results in a progressive increase in the yield limit, which reaches a value of 0.6 for zero growth.
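The growth rate dependence described above can be reproduced with a few lines of code. This sketch assumes Equation 10.16 has the form coded below and uses the parameter values quoted in the text as defaults; the function name is hypothetical.

```python
def y_leu_lim(mu, delta=1.20, k_x=0.31, m_atp=0.033):
    """Thermodynamic limit to the leucine yield on glucose (mol/mol).

    mu    : specific growth rate (1/h), must be positive
    delta : P/O ratio
    k_x   : growth dependent maintenance (mol ATP/C-mol biomass)
    m_atp : growth independent maintenance (mol ATP/C-mol biomass/h)
    """
    if mu <= 0.0:
        raise ValueError("mu must be positive")
    k_eff = k_x + m_atp / mu   # lumped maintenance coefficient K'_X
    numerator = -delta + 2.48 * k_eff + 2.46
    denominator = 0.932 * delta + 4.14 * k_eff + 4.10
    return numerator / denominator


growth_rates = [0.6, 0.4, 0.2, 0.1, 0.05, 0.01, 0.001]
limits = [y_leu_lim(mu) for mu in growth_rates]

for mu, limit in zip(growth_rates, limits):
    print(f"mu = {mu:5.3f} 1/h -> Y_LEU^lim = {limit:.3f} mol/mol")

# As mu approaches zero the maintenance term dominates and the limit
# tends to the ratio of its coefficients, 2.48 / 4.14.
zero_growth_limit = 2.48 / 4.14
print(f"limit for mu -> 0: {zero_growth_limit:.3f} mol/mol")
```

The ratio 2.48/4.14 is approximately 0.6, which matches the zero-growth value read from Figure 10.5.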

10.4 Conclusions

It has been shown in this chapter that stoichiometric metabolic models of moderate complexity can be successfully applied to provide a fairly accurate description of changes in metabolic network structure as a result of changes in growth conditions. It has also been shown that such models can be applied, in combination with experimental results, to estimate the ATP stoichiometry of oxidative phosphorylation and maintenance requirements for a certain microorganism. Incorporating the estimated ATP stoichiometry in the model allows the prediction of maximum yields of biomass and products for different substrates, substrate mixtures and metabolic network topologies. An important prerequisite for these calculations is that thermodynamic constraints are not violated.


References

Burgard P.A., Vaidyaraman S., and Maranas C.D., 2001. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog., 17:791–97.

de Jong-Gubbels P., Vanrolleghem P.A., Heijnen J.J., van Dijken J.P., and Pronk J.T., 1995. Regulation of carbon metabolism in chemostat cultures of Saccharomyces cerevisiae grown on mixtures of glucose and ethanol. Yeast, 11:407–18.

Duarte N.C., Herrgård M.J., and Palsson B.O., 2004. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res., 14(7):1298–309.

Feist A.M., Henry C.S., Reed J.L., Krummenacker M., Joyce A.R., Karp P.D., Broadbelt L.J., Hatzimanikatis V., and Palsson B.O., 2007. A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol. Syst. Biol., 3(121).

Ingraham J.L., Maaloee O., and Neidhardt F.C., 1983. Growth of the Bacterial Cell. Sinauer Associates, Sunderland, MA.

Oh Y.K., Palsson B.O., Park S.M., Schilling C.H., and Mahadevan R., 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem., 282(39):28791–9.

Price N.D., Reed J.L., and Palsson B.O., 2004. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2:886–97.

Rabkin M. and Blum J.J., 1985. Quantitative analysis of intermediary metabolism in hepatocytes incubated in the presence and absence of glucagon with a substrate mixture containing glucose, ribose, fructose, alanine and acetate. Biochem. J., 225:761–86.

Stückrath I., Lange H.C., Kötter P., van Gulik W.M., Entian K.-D., and Heijnen J.J., 2002. Characterization of null mutants of the glyoxylate cycle and gluconeogenic enzymes in S. cerevisiae through metabolic. Biotechnol. Bioeng., 77(1):61–72.

van Gulik W.M. and Heijnen J.J., 1995. A metabolic network stoichiometry analysis of microbial growth and product formation. Biotechnol. Bioeng., 48:681–98.

van Gulik W.M., De Laat W.T.A.M., Vinke J.L., and Heijnen J.J., 2000. Application of metabolic flux analysis for the identification of metabolic bottlenecks in the biosynthesis of penicillin-G. Biotechnol. Bioeng., 68:602–18.

van Gulik W.M., Antoniewicz M.R., De Laat W.T.A.M., Vinke J.L., and Heijnen J.J., 2001. Energetics of growth and penicillin production in a high-producing strain of Penicillium chrysogenum. Biotechnol. Bioeng., 72:185–93.

Vanrolleghem P.A., de Jong-Gubbels P., van Gulik W.M., Pronk J.T., van Dijken J.P., and Heijnen J.J., 1996. Validation of a metabolic network for Saccharomyces cerevisiae using mixed substrate studies. Biotechnol. Prog., 12(4):434–48.

Verhoff F.H. and Spradlin J.E., 1976. Mass and energy balances of metabolic pathways applied to citric acid production by Aspergillus niger. Biotechnol. Bioeng., 18:425–32.

Verduyn C., Stouthamer A.H., Scheffers W.A., and van Dijken J.P., 1991. A theoretical evaluation of growth yields of yeasts. Ant. van Leeuwenhoek Int. J. Gen. Mol. Microbiol., 59:49–63.

Verduyn C., 1991. Physiology of yeasts in relation to growth yields. Ant. van Leeuwenhoek Int. J. Gen. Mol. Microbiol. (Special issue: Growth and Metabolism of Microorganisms), 60:325–53.

11 A Thermodynamic Description of Microbial Growth and Product Formation

Joseph J. Heijnen
Delft University of Technology

11.1 Introduction
11.2 Thermodynamics of Microbial Growth Stoichiometry, $Y_{DX}^{\max}$
The Anabolic Reaction for Biomass Synthesis • Calculation of the Electron Donor Needed for Anabolism Using the Balance of Degree of Reduction • Calculation of the Gibbs Energy from the Catabolic Reaction • The Required Amount of Gibbs Energy for Anabolism and Calculation of the Amount of Electron Donor That Must Be Catabolized • A Thermodynamic Relation to Calculate the Biomass Yield on Electron Donor, $Y_{DX}^{\max}$
11.3 Thermodynamics of Maintenance
11.4 Calculation of the Operational Stoichiometry of a Growth Process at Different Growth Rates, Including Heat Using the Herbert–Pirt Relation for Electron Donor
11.5 A Correlation to Estimate the Maximum Specific Growth Rate, µmax
11.6 Thermodynamic Prediction of Minimal Concentration Electron Donor and Maximal Concentration of Catabolic Product
11.7 Thermodynamics and Stoichiometry of Product Formation
11.8 Conclusions
References and Recommended Reading

11.1 Introduction

Growth of (micro)organisms occurs under a wide range of conditions (such as pH 0–13, temperature 0–110°C, salt concentration 0.1–2 M), using a huge variety of electron donors, electron acceptors, carbon, and nitrogen sources, each of which can be organic or inorganic. Growth of organisms is usually described by four parameters which belong to the hyperbolic substrate uptake relation (µmax, Ks) and the Herbert–Pirt relation ($Y_{SX}^{\max}$, $m_s$). The values of these four parameters are essential to design processes in which growing organisms are used. However, their values depend on the nutrients used (nature of carbon and nitrogen sources, electron donor and acceptor), temperature and pH, and easily span a range of two orders of magnitude. For example Escherichia coli grows


on glucose (electron donor) using O2 (aerobically) with $Y_{SX}$ = 0.50 g biomass/g glucose and µmax = 1 h^-1. In contrast, methane bacteria using acetate show $Y_{SX}$ = 0.01 g biomass/g acetate and µmax = 0.005 h^-1. A general method to predict the values of these four parameters for different growth systems is therefore of great value. Here we will present a thermodynamic approach to predict these four parameters for any growth system for which the C- and N-source, the electron donor and acceptor, the biomass specific growth rate µ, and the cultivation temperature are specified. This thermodynamic method also allows one to understand the effect of changes in nutrients, temperature, and pH on these four parameters.

11.2 Thermodynamics of Microbial Growth Stoichiometry, $Y_{DX}^{\max}$

Growth stoichiometry is of major interest for biotechnological process design and is reflected in the parameter $Y_{SX}^{\max}$ (from the Herbert–Pirt relation, see Chapter 9 of this section). $1/Y_{SX}^{\max}$ represents the amount of electron donor, in mol, needed to synthesize 1 Cmol of biomass. Usually $Y_{SX}^{\max}$ is also written as $Y_{DX}^{\max}$, because both the term substrate (S) and electron donor (D) are used to indicate the same compound.

11.2.1 The Anabolic Reaction for Biomass Synthesis

Microorganisms are composed of protein, RNA, DNA, lipid, and carbohydrates. Comparing many different (micro)organisms, it is found that their relative contents are similar (40–70% protein, 1–2% DNA, 5–15% RNA, 2–10% lipid, 3–10% carbohydrate). This similarity leads to an elemental composition of biomass which is also very similar, such that the organic part of biomass can be represented by a simple 1-C-formula:

Biomass = $\mathrm{C_1H_{1.8}O_{0.5}N_{0.2}}$

This composition formula holds closely for many organisms. However, for each specific situation one can perform an elemental analysis of the biomass and obtain a more precise elemental composition. For convenience we excluded the other elements (P, S, K+, Mg2+, etc.) present in biomass, because of their minor contribution (

0, ε3 > 0). We can now distinguish different experimental measurement situations following perturbations, as before (Section 16.6.1.1).

• Only flux and enzyme measurements
We can eliminate $\ln(X/X^0)$ from the three kinetic equations (Figure 16.6), leading to two equations involving only enzyme and flux ratios:

$$\frac{J_1}{J_1^0}\frac{e_1^0}{e_1} - 1 = \frac{\varepsilon_1^0}{\varepsilon_2^0}\left(\frac{J_2}{J_2^0}\frac{e_2^0}{e_2} - 1\right)$$

$$\frac{J_2}{J_2^0}\frac{e_2^0}{e_2} - 1 = \frac{\varepsilon_2^0}{\varepsilon_3^0}\left(\frac{J_3}{J_3^0}\frac{e_3^0}{e_3} - 1\right)$$

In these two equations one can enter the flux results from a single perturbation experiment, e.g., in enzyme 1:

$$\frac{e_1}{e_1^0} = 2.0,\quad \frac{e_2}{e_2^0} = 1,\quad \frac{e_3}{e_3^0} = 1,\quad \frac{J_1}{J_1^0} = \frac{2.86}{2.0},\quad \frac{J_2}{J_2^0} = \frac{1.29}{1},\quad \frac{J_3}{J_3^0} = \frac{1.57}{1}$$

Entering these results in the two relations gives $\varepsilon_2^0/\varepsilon_3^0 = 0.50$ and $\varepsilon_1^0/\varepsilon_2^0 = -1.00$. The above two relations together with the mass balance ($J_1 = J_2 + J_3$) give the relations for each $J_i/J_i^0$ as a function of the three enzyme levels, with only the two elasticity ratios as parameters. It has also been shown (Heijnen et al., 2004) that these two ε-ratios completely define the nine $C^J$-values of this network. The relations are given in Appendix B.
• Metabolite, enzyme and flux measurements
Assume that in the above perturbation the change in metabolite was also measured, $X/X^0$ = 1.77. We can now enter all information in the lin-log kinetic relations for the three reactions:

$$\frac{2.86}{2} = 2\left(1 + \varepsilon_1^0 \ln 1.77\right)$$

$$\frac{1.29}{1} = 1\left(1 + \varepsilon_2^0 \ln 1.77\right)$$

$$\frac{1.57}{1} = 1\left(1 + \varepsilon_3^0 \ln 1.77\right)$$

This directly gives the three elasticity values $\varepsilon_1^0 = -0.50$, $\varepsilon_2^0 = 0.50$, $\varepsilon_3^0 = 1$, which allows one to calculate all nine $C^J$ and three $C^x$-values (see Appendix B). Note that the ε-ratios previously obtained agree with these values and that only one perturbation is minimally required to obtain all ε-values. Of course more perturbations lead to ε-values with smaller error (Heijnen et al., 2004).

• Noncharacterized perturbations
Suppose we perform a noncharacterized perturbation (see also Section 16.6.1.1) in reaction 1 (hence $e_1/e_1^0$ is not known, but $e_2/e_2^0 = 1$ and $e_3/e_3^0 = 1$). The perturbation leads to $J_1/J_1^0$ = 2.86/2, $J_2/J_2^0$ = 1.29/1, $J_3/J_3^0$ = 1.57/1, $X/X^0$ = 1.77. For reactions 2 and 3 the lin-log kinetics follow:

$$\frac{1.29}{1} = 1\left(1 + \varepsilon_2^0 \ln 1.77\right)$$

16-20

Modeling Tools for Metabolic Engineering

1.57 = 1(1 + ε 03 ln1.77) 1 Next we perform a second noncharacterized perturbation, but now in reaction 2 leading to J1 J1o ฀=฀2.33/2, J2 Jo2 ฀=฀1.67/1, J3 Jo3 ฀=฀0.67/1, X/Xo฀=฀0.72. For reaction 1 and 3 there follow as lin-log kinetic relations: 2.33 = 1(1 + ε10 ln 0.72) 2 0.67 = 1(1 + ε 03 ln 0.72) 1 Solving these four equations for the elasticities gives ε1o ฀=฀–0.50, ε o2 ฀=฀0.50, ε o3 ฀=฀1.0, as before leading (see Appendix B) to all CJ and Cx-values. A remarkable aspect of noncharacterized perturbations is that, when only lux measurements are available, without e/eo information, one can still obtain CJ as follows! For the two perturbations one enters the results in the equations derived before (see “only lux and enzyme measurements” above). Uncharacterized perturbation in e1 (hence e 2 eo2 = 1, e3 e3o = 1 ): ε 0 1.57 1.29 × 1 - 1 = 02 ( × 1 - 1) ε3 1 1 Uncharacterized pertubation in e2 (or e1 e1o = 1, e3 eo3 = 1) 2.33 ε 0  0.67  × 1 - 1 = 10  × 1 - 1  2 ε3  1 his gives ε 02 / ε 03 = 0.50 and ε10 / ε 03 = -0.50, which agrees with the irst approach (only lux and enzyme measurements) and gives all CJ-values according to Appendix B. Note that one needs 1 characterized lux perturbation (knowing e/eo and J/Jo) or two noncharacterized perturbations (knowing only J/Jo) to solve the branch. his approach has been applied to experimental branch point data (lysine production, glutamate production, and glycolysis) in a recent paper by Heijnen et al. (2004).
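The arithmetic of the three measurement situations above can be checked with a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names are illustrative assumptions, and the inputs are the rounded numbers quoted in the text, so the results only approximately match the stated elasticities.

```python
import math

def elasticity(flux_ratio, enzyme_ratio, metabolite_ratio):
    """Solve J/Jo = (e/eo) * (1 + eps * ln(X/Xo)) for eps (lin-log kinetics)."""
    return (flux_ratio / enzyme_ratio - 1.0) / math.log(metabolite_ratio)

def elasticity_ratio(flux_a, enzyme_a, flux_b, enzyme_b):
    """eps_a / eps_b from flux and enzyme ratios only; ln(X/Xo) is eliminated."""
    return (flux_a / enzyme_a - 1.0) / (flux_b / enzyme_b - 1.0)

# Characterized perturbation in enzyme 1 (e1/e1o = 2), rounded data from the text.
J1, J2, J3 = 2.86 / 2.0, 1.29 / 1.0, 1.57 / 1.0
e1, e2, e3 = 2.0, 1.0, 1.0
x = 1.77  # measured X/Xo

# Flux and enzyme measurements only: two elasticity ratios.
ratio_12 = elasticity_ratio(J1, e1, J2, e2)   # eps1/eps2, close to -1.0
ratio_23 = elasticity_ratio(J2, e2, J3, e3)   # eps2/eps3, close to 0.5

# Metabolite, enzyme and flux measurements: individual elasticities.
eps1 = elasticity(J1, e1, x)
eps2 = elasticity(J2, e2, x)
eps3 = elasticity(J3, e3, x)

# Two noncharacterized perturbations, flux data only. Only reactions whose
# enzyme level is known to be unchanged (e/eo = 1) are used in each case.
ratio_23_unchar = elasticity_ratio(1.29, 1.0, 1.57, 1.0)        # perturbation in e1
ratio_13_unchar = elasticity_ratio(2.33 / 2.0, 1.0, 0.67, 1.0)  # perturbation in e2

print(round(eps1, 2), round(eps2, 2), round(eps3, 2))
print(round(ratio_12, 2), round(ratio_23, 2), round(ratio_13_unchar, 2))
```

Because each elasticity enters its rate equation linearly, every unknown is obtained by a single division; no iterative fitting is needed for this small branch.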

16.6.2 Dynamic Perturbation Experiments

16.6.2.1 The Measurement Problem for Steady State Perturbation Experiments

For the previously mentioned steady state perturbation experiments of metabolic networks one needs different measurements:

Extracellular concentrations. These concentrations give (using proper mass balances) the uptake/secretion rates. These rates are used in flux balance analysis to calculate the steady state fluxes. This gives $J_i/J_i^o$.

Enzyme activities. In a mutant in which a target gene has been changed, leading to a change $e_k/e_k^o$ in enzyme k, a measurement must be done for each other enzyme in the network to provide the other values $e_i/e_i^o$. It is not sufficient to quantify only the change in, e.g., the enzyme whose level was modified using genetic techniques. Due to genetic regulation mechanisms, which respond to the changed metabolite levels, in principle all enzyme levels (and not only the target enzyme) have changed in the mutant (Niederberger et al., 1992). In practice this poses severe problems. Often traditional enzyme activity assays are the only method of quantification, but for many enzymes these are not available. Here the recent developments in protein identification and quantification using mass spectrometry (de Groot et al., 2007) are a big leap forward. We have seen (Section 16.6.1) that in principle, using $J_i/J_i^o$ and $e_i/e_i^o$ data, one can calculate the flux control coefficients $C^J$. It is not possible to obtain $C^x$; for this one needs metabolite measurements.

Intracellular metabolite measurements. For each perturbed steady state one needs to measure the concentrations of all intracellular metabolites, to provide $X_j/X_j^o$. The experimental effort for intracellular metabolite measurements is significant. Due to the rapid turnover of metabolites one needs rapid sampling and quenching of the biomass, such as the cold methanol method (Lange et al., 2001). Subsequently one needs to wash the biomass under cold conditions to remove the metabolites present extracellularly. Then the washed biomass is extracted for intracellular metabolites (using, e.g., the boiling ethanol method). Finally the intracellular metabolites must be quantitatively analysed. This requires sophisticated MS/MS techniques and the use of 13C standards (van Dam et al., 2002, Wu et al., 2005, Mashego et al., 2004).

The combined data of $J/J^o$, $X_j/X_j^o$ and $e_i/e_i^o$ allow the calculation of the metabolite concentration control coefficients ($C^x$) and the flux control coefficients ($C^J$). This is called the direct approach. $C^J$ and $C^x$ values can then be used to obtain the elasticity coefficients using a general matrix equation (Westerhoff and Kell, 1987). It is, however, much simpler to use $J_i/J_i^o$, $e_i/e_i^o$, $X_j/X_j^o$, $C_k/C_k^o$ directly with lin-log kinetics to obtain the elasticities using linear regression, as shown in Section 16.6.1.
This avoids the nasty problem of (usually unknown) dependencies between the $C^J$ and $C^x$ parameters (Appendix B), which must be taken into account when one applies the direct method. It is clear that to obtain all elasticities from steady state perturbation experiments one needs extensive quantitative datasets of fluxes, enzyme levels and metabolite levels. This is a formidable task.

16.6.2.2 Dynamic Perturbations Only Require Metabolite Measurements

Dynamic perturbation experiments are an interesting alternative for obtaining ε-values. In such experiments the organism in steady state is perturbed extracellularly. This can be the addition of substrate, a switch of electron acceptor, addition of an inhibitor, a change in dissolved O2 or CO2, a change in pH, etc. Subsequently the dynamic pattern of intracellular metabolites is measured in a short time frame, e.g., during a hundred seconds. In these rapid pulse experiments the change in enzyme activity is considered absent due to the short time (Rizzi et al., 1997, Theobald et al., 1997, Vaseghi et al., 1999, Kresnowati et al., 2006). This method only requires extracellular and intracellular metabolite analysis; enzyme activity levels are not needed (being constant).

Recently it was shown that these rapid pulse experiments can be performed in a mini (3 ml) reactor, called the Bioscope (Visser et al., 2002 and Mashego et al., 2006). The Bioscope is fed from the chemostat with a constant (about 1 ml/min) broth stream containing steady state biomass, which is perturbed and sampled in the Bioscope. This is highly advantageous compared to performing the pulse in the fermentor, because many different pulses can be performed with biomass from the same steady state chemostat. Also the amount of sample per time point is unlimited. Delgado and Liao (1991, 1992) have shown that $C^J$ and $C^x$ parameters can be directly obtained from such concentration time traces.
However, it would be more convenient to obtain the elasticity parameters directly from such rapid pulse experiments. The data set of dynamic concentrations can be used for parametrization of a classical nonlinear model of, e.g., glycolysis (Rizzi et al., 1997 and Chassagnole et al., 2002). Such a nonlinear kinetic model then allows, for a given steady state, the calculation of elasticities, followed by $C^J$- and $C^x$-values. The key problem is that parameter estimation in such nonlinear models is troublesome (Degenring et al., 2004, Haunschild et al., 2005, 2006, Nikerel et al., 2008, Wahl et al., 2006, Wiechert, 2002, Wiechert and Takors, 2004). The reason is that a nonlinear parameter estimation algorithm needs an initial guess of the parameters (which is not available), so that finding the best global estimate is not guaranteed.


In addition, many parameters are hardly identifiable. This leads to model reduction afterwards. Kresnowati et al. (2005) and Nikerel et al. (2006) have shown that the direct use of lin-log kinetics enables the direct estimation of elasticities. The key in lin-log kinetics is that the elasticity parameters appear linearly in the equations, while the concentrations appear nonlinearly (as logarithms). A dynamic experiment is then described (using lin-log kinetics) using the independent mass balances for intracellular metabolites (Equation 16.1b):

$$\frac{dX}{dt} = S\,[J^o]\left[E^{xo}\ln\!\left(X/X^o\right) + E^{co}\ln\!\left(c/c^o\right)\right] \qquad (16.14a)$$

In addition there are the mass balances for the extracellular concentrations, which contain the transport terms ($DC_{in}$ for in transport and $DC$ for out transport) and the biomass concentration $C_x$:

$$\frac{dC}{dt} = C_x\,S_c\,[J^o]\left[E^{xo}\ln\!\left(X/X^o\right) + E^{co}\ln\!\left(c/c^o\right)\right] + DC_{in} - DC \qquad (16.14b)$$

Integration of the left and right sides of these equations between time intervals (using a linear approximation of $X/X^o$ to calculate the integral of $\ln(X/X^o)$) leads to a set of equations which are linear in the elasticities. Linear regression then gives a first estimate of the ε-parameters. This estimate is the initial parameter set for a conventional nonlinear parameter estimation algorithm (Nikerel et al., 2006, 2007, 2008).

Figure 16.8 shows an example toy network (branch) which has two intracellular metabolites (X1, X2), a substrate S and two products P1 and P2. There is inhibition of X2 on reaction 1 and of P2 on reaction 3. The kinetics are highly nonlinear, as shown. The stoichiometry matrix (Sc, S) is shown. A reference steady state is shown, in which the indicated elasticities hold. Using the elasticities and reference fluxes, a lin-log model is constructed. The steady state in a chemostat is perturbed at t = 0 by shifting S (1 → 2), X1 (2 → 1.5), X2 (1 → 1.5), P1 (1 → 0.8). The figure shows the calculated metabolite response from the original model (dots) and from the lin-log kinetics (line). Clearly, lin-log kinetics describes these large perturbations very well. Subsequently these metabolite data points were used to estimate the elasticity set using the above approach (assuming that kinetic knowledge allowed certain elasticities in Ec,x to be set to zero; see Figure 16.8). Figure 16.9 shows the estimated elasticities, which are very close to the expected values. The dynamic lin-log model with the estimated ε-values also performs very well.

Recently (Nikerel et al., 2006, 2008) glycolysis was studied in silico with respect to estimating the elasticities from dynamic concentration data and a lin-log model. It appeared that not all elasticities could be identified. However, because lin-log kinetics are linear in the parameters, the required model reduction could be performed a priori, using only dynamic metabolite measurement data.
In addition it was shown that all parameters could be identified using a proper combination of dynamic and steady state perturbations, which were used simultaneously for the correct estimation of elasticities.

Having obtained the elasticities of the metabolic reaction network, together with the reference fluxes, lin-log kinetics provides a complete dynamic model to be used for simulation and network optimization (Visser et al., 2004). The lin-log model also directly allows:

•  The calculation of $C^J$ and $C^X$.
•  The calculation of large changes in X and J upon large changes in enzyme levels.
•  The inverse calculation, where the new J and X are specified and the design equation gives the required changes in enzymes ($e/e^o$) (Visser and Heijnen, 2003).
•  Unravelling of silent mutations (Raamsdonk et al., 2001) using only flux and metabolite data (Wu et al., 2005). This approach seems very powerful for using metabolome and flux data in detailed functional genomics (Wu et al., 2005).
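The integrate-then-regress idea described above (integrate the lin-log mass balance over each sampling interval, then solve a linear least-squares problem for the elasticities) can be illustrated on a deliberately small case. The following Python sketch is an assumption-laden toy, not the published algorithm: it uses a hypothetical one-metabolite balance dx/dt = k(a ln x + b ln c) with invented parameter values, generates data by simulating that same lin-log model, and approximates the integral of ln x with the trapezoidal rule rather than the linear interpolation of X/X^o mentioned in the text.

```python
import math

# Hypothetical toy system (all values are illustrative assumptions):
#   dx/dt = K * (a * ln(x) + b * ln(c)),  with x = X/Xo and c = C/Co.
# 'a' and 'b' play the role of elasticities, K the role of Jo/Xo.
K = 1.0
A_TRUE = -0.75
B_TRUE = 0.5
C_RATIO = 2.0  # extracellular concentration stepped to 2x at t = 0

def rhs(x):
    """Right-hand side of the toy lin-log mass balance."""
    return K * (A_TRUE * math.log(x) + B_TRUE * math.log(C_RATIO))

def simulate(t_end=5.0, dt=0.01, sample_every=25):
    """Generate 'measurements' with fixed-step RK4, keeping every 25th point."""
    x, times, values = 1.0, [0.0], [1.0]
    for step in range(1, int(round(t_end / dt)) + 1):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * dt * k1)
        k3 = rhs(x + 0.5 * dt * k2)
        k4 = rhs(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if step % sample_every == 0:
            times.append(step * dt)
            values.append(x)
    return times, values

def estimate(times, values):
    """Least-squares estimate of (a, b) from the integrated balance.

    For each interval: x[i+1] - x[i] = a * K * int(ln x dt) + b * K * int(ln c dt),
    which is linear in a and b. The 2x2 normal equations are solved directly.
    """
    suu = suw = sww = suy = swy = 0.0
    for i in range(len(times) - 1):
        h = times[i + 1] - times[i]
        u = K * 0.5 * (math.log(values[i]) + math.log(values[i + 1])) * h  # trapezoid
        w = K * math.log(C_RATIO) * h
        y = values[i + 1] - values[i]
        suu += u * u
        suw += u * w
        sww += w * w
        suy += u * y
        swy += w * y
    det = suu * sww - suw * suw
    a = (suy * sww - suw * swy) / det
    b = (suu * swy - suw * suy) / det
    return a, b

times, values = simulate()
a_est, b_est = estimate(times, values)
print(a_est, b_est)
```

In this toy the estimates land close to the values used to generate the data; the small remaining deviation comes from the trapezoidal approximation of the integral. With real, noisy data such a regression result would serve as the starting point for the subsequent nonlinear estimation step described in the text.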

[Figure 16.8: image not reproduced. Panels: network diagram with rate equations v1 to v4, stoichiometric matrix, reference steady state, time courses of S, X1, X2, P1, P2 (vertical axis) versus Time (horizontal axis, 0 to 30), and the reference elasticity matrix Ec,x.]

Stoichiometric matrix (rows S, P1, P2 form Sc; rows X1, X2 form S):

        v1     v2     v3     v4
S       –1     0      0      0
P1      0      0      0      1
P2      0      0      1      0
X1      1      –1     0      –1
X2      0      1      –1     0

Reference elasticities Ec,x:

        S      P1     P2     X1     X2
v1      0.50   0      0      0      0
v2      0      0      0      0.5    –0.25
v3      0      0      –1.86  0      0.5
v4      0      0      0      1.5    0

Figure 16.8 Toy metabolic network with nonlinear kinetics, the stoichiometric matrix, the reference steady state used, generated perturbation data (•), calculated reference elasticities, and simulation of the same perturbation with the calculated elasticities (solid line).

16.7 Conclusion and Outlook

It has been shown that traditional (small perturbation) MCA can be extended easily, using lin-log kinetics, to a full nonlinear kinetic model which describes large perturbations well. This kinetic model has the MCA elasticities (and $J^o$) as kinetic parameters. The model is nonlinear in concentrations, but linear in parameters. This last property is a key difference from traditional nonlinear enzyme kinetics (which are nonlinear in both concentrations and parameters). The parameter linearity in lin-log kinetics allows the powerful toolbox of linear algebra to be used for parameter (elasticity) estimation. In addition, this parameter linearity reveals parameter identifiability problems and allows a priori model reduction. Finally, because lin-log kinetics are linear in the parameters, these methods of parameter identification scale favorably to large, realistic reaction networks.

[Figure 16.9: image not reproduced. Panels: time courses of S, X1, X2, P1, P2 (vertical axis) versus Time (horizontal axis, 0 to 30), and the estimated elasticity matrix Ec,x.]

Estimated elasticities Ec,x:

        S      P1     P2     X1     X2
v1      0.49   0      0      0      0
v2      0      0      0      0.47   –0.26
v3      0      0      –1.68  0      0.40
v4      0      0      0      1.47   0

Figure 16.9 Generated dynamic perturbation data using the mechanistic model (•), simulation of the perturbation of the toy network using the estimated elasticities of the lin-log kinetics (solid line), and the estimated elasticities.

In future, the aspects of noise and experimental design need to be studied more extensively. Even more challenging is the design of in vivo experiments which allow elasticities to be extracted in an unbiased way, by allowing all entries in $E^{xo}$ and $E^{co}$ to be different from zero (inverse engineering). This allows an unbiased investigation of metabolite/enzyme allosteric interactions. These results are all very relevant for the quantitative description of metabolic reaction networks. However, enzyme levels/activities are present in these models as parameters. In reality there is a coupling between metabolite status and gene expression and (de)phosphorylation cascades of enzymes. This regulation level must also be put into a convenient mathematical framework. The challenge will be to use model formats which are scalable to large networks and which can accommodate large perturbations, easy parameter estimation and identifiability studies. In the past, the power law approach (Savageau, 1976, Voit, 2000) has shown promise here.

Appendix A Enzyme Kinetics in the Presence of Conserved Moieties

Assume that the reaction rate of an enzyme is influenced by an intracellular metabolite X1, three metabolites which belong to a conserved moiety (X2, X3, X4) and an extracellular metabolite present in concentration Cs (Equation 16.1). We can then write for the reaction rate V:

V = f(X1, X2, X3, X4, Cs, enzyme, parameters)

This kinetic equation is linearized (using the approach in Section 16.4.4) around a steady state:

$$\frac{V}{J^o} - 1 = \left(\frac{e}{e^o} - 1\right) + \varepsilon_1^o\left(\frac{X_1}{X_1^o} - 1\right) + \varepsilon_2^o\left(\frac{X_2}{X_2^o} - 1\right) + \varepsilon_3^o\left(\frac{X_3}{X_3^o} - 1\right) + \varepsilon_4^o\left(\frac{X_4}{X_4^o} - 1\right) + \varepsilon_s^o\left(\frac{C_s}{C_s^o} - 1\right) \qquad (A.1a)$$

There is also a conserved moiety sum with sum total T:

$$X_2 + X_3 + X_4 = T \qquad (A.2a)$$

In the reference steady state the conserved moiety sum follows as

$$X_2^o + X_3^o + X_4^o = T^o \qquad (A.2b)$$

Here $T^o$ is the conserved moiety sum in the reference steady state, which can be perturbed, hence the general sum T. We can combine Equations A.2a and A.2b (subtraction) and rewrite:

$$X_2^o\left(\frac{X_2}{X_2^o} - 1\right) + X_3^o\left(\frac{X_3}{X_3^o} - 1\right) + X_4^o\left(\frac{X_4}{X_4^o} - 1\right) = T^o\left(\frac{T}{T^o} - 1\right) \qquad (A.2c)$$

To obtain independent mass balances, the stoichiometric matrix S was reduced by eliminating a row corresponding to a chosen metabolite present in the conserved moiety sum. The metabolite vector is reduced accordingly (matrix $S_{full}$ becomes S and vector $X_{full}$ becomes vector X). We now also have to eliminate the removed metabolite from the enzyme kinetic relations, where present. If we assume that X4 is the removed metabolite, then we have to adapt Equation A.1a by using Equation A.2c to eliminate $(X_4/X_4^o - 1)$. This gives the new kinetic equation:

$$\frac{V}{J^o} - 1 = \left(\frac{e}{e^o} - 1\right) + \varepsilon_1^o\left(\frac{X_1}{X_1^o} - 1\right) + \left(\varepsilon_2^o - \varepsilon_4^o\frac{X_2^o}{X_4^o}\right)\left(\frac{X_2}{X_2^o} - 1\right) + \left(\varepsilon_3^o - \varepsilon_4^o\frac{X_3^o}{X_4^o}\right)\left(\frac{X_3}{X_3^o} - 1\right) + \varepsilon_s^o\left(\frac{C_s}{C_s^o} - 1\right) + \varepsilon_4^o\frac{T^o}{X_4^o}\left(\frac{T}{T^o} - 1\right) \qquad (A.1b)$$

This kinetic relation still has the linear format, but:

•  For the remaining conserved moiety metabolites (X2, X3) new composite elasticities (marked with a prime) arise:

$$\varepsilon_2^{o\prime} = \varepsilon_2^o - \varepsilon_4^o\frac{X_2^o}{X_4^o}, \qquad \varepsilon_3^{o\prime} = \varepsilon_3^o - \varepsilon_4^o\frac{X_3^o}{X_4^o}$$

It should be noted that these elasticities can be quite different from 1, because the metabolite ratios can be very different from 1.

•  A new independent metabolite shows up, the conserved moiety sum $T/T^o$, with its own composite elasticity

$$\varepsilon_T^{o\prime} = \varepsilon_4^o\frac{T^o}{X_4^o}$$

This shows that the vector of independent concentrations C is extended with the conserved moiety sum.

In the case of lin-log kinetics the conserved moiety sum (e.g., Equation A.2c) is approximated by its logarithmic format ($y - 1 \approx \ln y$), which is therefore only accurate for not too large (plus or minus 30%, Heijnen, 2005) changes in conserved moiety metabolites. This kinetic equation (Equation A.1b) is the equation to be introduced in the independent metabolite mass balances represented by the reduced matrix S and reduced vector X. In these mass balances the elasticity matrices $E^x$ and $E^c$ contain the composite elasticities. The steady state metabolite and flux solutions then also show the effect of changed conserved moiety sums on metabolite and flux changes. These (composite) elasticities are the only kinetic parameters which can be estimated using proper perturbation experiments (see Section 16.6).

If the conserved moiety sum does not change in such experiments (hence $T = T^o$), it directly follows that $\varepsilon_T^{o\prime}$ cannot be obtained; only the composite elasticities for X2 and X3 ($\varepsilon_2^{o\prime}$ and $\varepsilon_3^{o\prime}$) can be obtained. These two composite elasticities are made up of the three unknown individual elasticities (one for each metabolite X2, X3, X4 of the conserved moiety) and the known reference state metabolite levels of the conserved moiety ($X_2^o$, $X_3^o$, $X_4^o$). Therefore, the original elasticities ($\varepsilon_2^o$, $\varepsilon_3^o$, $\varepsilon_4^o$) cannot be resolved, showing that conserved moieties lead to an identifiability problem. If the conserved moiety sum is perturbed, then $\varepsilon_T^{o\prime}$ can be found, and with it all original elasticities (as expected, because conserved moieties are absent if T is allowed to vary).
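The elimination of X4 can be verified numerically. The short Python sketch below is illustrative only: all numerical values (elasticities, reference concentrations, perturbed levels) are hypothetical and chosen simply to show that the full linearized rate (Equation A.1a) and the reduced rate with composite elasticities (Equation A.1b) agree whenever the conservation relation holds, and that composite elasticities can differ strongly from the individual ones.

```python
# Hypothetical individual elasticities (assumed values, not from the chapter).
eps = {"1": 0.4, "2": 0.3, "3": -0.2, "4": 0.6, "s": 0.5}

# Hypothetical reference levels of the conserved moiety; X4 is the eliminated one.
X2o, X3o, X4o = 2.0, 0.5, 0.1
To = X2o + X3o + X4o

# Composite elasticities from Appendix A.
eps2_c = eps["2"] - eps["4"] * X2o / X4o
eps3_c = eps["3"] - eps["4"] * X3o / X4o
epsT_c = eps["4"] * To / X4o

def rate_full(e, x1, x2, x3, x4, cs):
    """V/Jo from Equation A.1a; all arguments are ratios to the reference state."""
    return (1.0 + (e - 1.0) + eps["1"] * (x1 - 1.0) + eps["2"] * (x2 - 1.0)
            + eps["3"] * (x3 - 1.0) + eps["4"] * (x4 - 1.0) + eps["s"] * (cs - 1.0))

def rate_reduced(e, x1, x2, x3, t, cs):
    """V/Jo from Equation A.1b; X4 is replaced by the moiety sum ratio t = T/To."""
    return (1.0 + (e - 1.0) + eps["1"] * (x1 - 1.0) + eps2_c * (x2 - 1.0)
            + eps3_c * (x3 - 1.0) + eps["s"] * (cs - 1.0) + epsT_c * (t - 1.0))

# A perturbed state (hypothetical): absolute moiety levels and the resulting sum.
X2, X3, X4 = 2.2, 0.45, 0.12
T = X2 + X3 + X4

v_full = rate_full(1.5, 1.2, X2 / X2o, X3 / X3o, X4 / X4o, 0.9)
v_reduced = rate_reduced(1.5, 1.2, X2 / X2o, X3 / X3o, T / To, 0.9)

print(eps2_c, eps3_c, epsT_c)
print(v_full, v_reduced)
```

With these assumed numbers the composite elasticity for X2 is roughly forty times larger in magnitude than the individual one, because the eliminated metabolite X4 is present at a much lower level than X2.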

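Appendix B Analysis of Control Coefficients and Dependency Relations for a Branch Point

The branch point split ratio $a = J_2^o/J_1^o$ is defined in the reference state. Solving this network gives the following equations for the metabolite x and for $J_1$, $J_2$ and $J_3$ as a function of the enzyme levels:

$$\frac{J_3}{J_3^o}\cdot\frac{e_3^o}{e_3} = \frac{\dfrac{e_1}{e_1^o}\left(\dfrac{\varepsilon_3^o}{\varepsilon_1^o} - 1\right) + a\,\dfrac{e_2}{e_2^o}\left(\dfrac{\varepsilon_2^o}{\varepsilon_1^o} - \dfrac{\varepsilon_3^o}{\varepsilon_1^o}\right)}{a\,\dfrac{e_2}{e_2^o}\,\dfrac{\varepsilon_2^o}{\varepsilon_1^o} + (1-a)\,\dfrac{e_3}{e_3^o}\,\dfrac{\varepsilon_3^o}{\varepsilon_1^o} - \dfrac{e_1}{e_1^o}} \qquad (B.1)$$

$$\frac{J_2}{J_2^o}\cdot\frac{e_2^o}{e_2} = \frac{(1-a)\,\dfrac{e_3}{e_3^o}\left(\dfrac{\varepsilon_3^o}{\varepsilon_1^o} - \dfrac{\varepsilon_2^o}{\varepsilon_1^o}\right) + \dfrac{e_1}{e_1^o}\left(\dfrac{\varepsilon_2^o}{\varepsilon_1^o} - 1\right)}{a\,\dfrac{e_2}{e_2^o}\,\dfrac{\varepsilon_2^o}{\varepsilon_1^o} + (1-a)\,\dfrac{e_3}{e_3^o}\,\dfrac{\varepsilon_3^o}{\varepsilon_1^o} - \dfrac{e_1}{e_1^o}} \qquad (B.2)$$

$$\frac{J_1}{J_1^o} = a\cdot\frac{J_2}{J_2^o} + (1-a)\cdot\frac{J_3}{J_3^o} \qquad (B.3)$$

$$-\ln\left(\frac{x}{x^o}\right) = \frac{-\dfrac{e_1}{e_1^o} + a\,\dfrac{e_2}{e_2^o} + (1-a)\,\dfrac{e_3}{e_3^o}}{-\dfrac{e_1}{e_1^o}\,\varepsilon_1^o + a\,\dfrac{e_2}{e_2^o}\,\varepsilon_2^o + (1-a)\,\dfrac{e_3}{e_3^o}\,\varepsilon_3^o} \qquad (B.4)$$

The following relations are obtained for the nine flux control coefficients:

$$C^{J^o}_{11} = \left(a\,\frac{\varepsilon_2^o}{\varepsilon_1^o} + (1-a)\,\frac{\varepsilon_3^o}{\varepsilon_1^o}\right)\Big/ D \qquad (B.5a)$$

$$C^{J^o}_{12} = -a / D \qquad (B.5b)$$

$$C^{J^o}_{13} = (-1 + a) / D \qquad (B.5c)$$

$$C^{J^o}_{21} = \frac{\varepsilon_2^o}{\varepsilon_1^o}\Big/ D \qquad (B.5d)$$

$$C^{J^o}_{22} = \left((1-a)\,\frac{\varepsilon_3^o}{\varepsilon_1^o} - 1\right)\Big/ D \qquad (B.5e)$$

$$C^{J^o}_{23} = \left(-(1-a)\,\frac{\varepsilon_2^o}{\varepsilon_1^o}\right)\Big/ D \qquad (B.5f)$$

$$C^{J^o}_{31} = \frac{\varepsilon_3^o}{\varepsilon_1^o}\Big/ D \qquad (B.5g)$$

$$C^{J^o}_{32} = \left(-a\,\frac{\varepsilon_3^o}{\varepsilon_1^o}\right)\Big/ D \qquad (B.5h)$$

$$C^{J^o}_{33} = \left(a\,\frac{\varepsilon_2^o}{\varepsilon_1^o} - 1\right)\Big/ D \qquad (B.5i)$$

where the denominator D is defined as:

$$D = a\,\frac{\varepsilon_2^o}{\varepsilon_1^o} + (1-a)\,\frac{\varepsilon_3^o}{\varepsilon_1^o} - 1$$

Eliminating the elasticity ratios ($\varepsilon_2^o/\varepsilon_1^o$ and $\varepsilon_3^o/\varepsilon_1^o$) gives seven relations between the $C^J$, showing strong dependency.

Mass balance derived constraints:

$$C^{J^o}_{1i} = a\cdot C^{J^o}_{2i} + (1-a)\cdot C^{J^o}_{3i}, \qquad i = 1, 2 \qquad (B.6)$$

Summation constraints:

$$C^{J^o}_{11} + C^{J^o}_{12} + C^{J^o}_{13} = 1 \qquad (B.7a)$$

$$C^{J^o}_{21} + C^{J^o}_{22} + C^{J^o}_{23} = 1 \qquad (B.7b)$$

$$C^{J^o}_{31} + C^{J^o}_{32} + C^{J^o}_{33} = 1 \qquad (B.7c)$$

Branch point constraints:

$$(1-a)\cdot C^{J^o}_{12} - a\cdot C^{J^o}_{13} = 0 \qquad (B.8a)$$

$$(1-a)\cdot C^{J^o}_{21} + C^{J^o}_{23} = 0 \qquad (B.8b)$$

For the metabolite control coefficients the following relations are obtained:

$$C^{x^o}_{1} = 1 / D' \qquad (B.9a)$$

$$C^{x^o}_{2} = -a / D' \qquad (B.9b)$$

$$C^{x^o}_{3} = -(1-a) / D' \qquad (B.9c)$$

The denominator D′ is defined as:

$$D' = \varepsilon_1^o D = -\varepsilon_1^o + a\cdot\varepsilon_2^o + (1-a)\cdot\varepsilon_3^o$$

These three relations contain only one ε-group (= D′), which can be eliminated. The two resulting constraints are the metabolite control summation theorem and the kinetics based relation:

$$C^{x^o}_{1} + C^{x^o}_{2} + C^{x^o}_{3} = 0 \qquad (B.10)$$

$$a\cdot C^{x^o}_{1} + C^{x^o}_{2} = 0 \qquad (B.11)$$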
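The relations of this appendix are easy to evaluate numerically. The Python sketch below is a hedged illustration, not code from the source: the function names are assumptions, the elasticities are the values found in Section 16.6.1 (ε1 = -0.50, ε2 = 0.50, ε3 = 1), and the split ratio a = 0.5 is inferred from the reference fluxes (2, 1, 1) that appear as denominators in the worked example. It computes the control coefficients from Equations B.5 and B.9 and uses Equations B.1 to B.3 to predict the flux response to a doubling of enzyme 1.

```python
def flux_control_coefficients(a, eps1, eps2, eps3):
    """Nine flux control coefficients C[i][k] (flux i+1, enzyme k+1), Eq. B.5."""
    r2, r3 = eps2 / eps1, eps3 / eps1
    d = a * r2 + (1.0 - a) * r3 - 1.0
    return [
        [(a * r2 + (1.0 - a) * r3) / d, -a / d, (-1.0 + a) / d],
        [r2 / d, ((1.0 - a) * r3 - 1.0) / d, -(1.0 - a) * r2 / d],
        [r3 / d, -a * r3 / d, (a * r2 - 1.0) / d],
    ]

def concentration_control_coefficients(a, eps1, eps2, eps3):
    """Three metabolite control coefficients, Eq. B.9."""
    d_prime = -eps1 + a * eps2 + (1.0 - a) * eps3
    return [1.0 / d_prime, -a / d_prime, -(1.0 - a) / d_prime]

def flux_ratios(a, eps1, eps2, eps3, e1, e2, e3):
    """J1/J1o, J2/J2o, J3/J3o for enzyme ratios e1, e2, e3 (Eqs. B.1 to B.3)."""
    r2, r3 = eps2 / eps1, eps3 / eps1
    den = a * e2 * r2 + (1.0 - a) * e3 * r3 - e1
    j3 = e3 * (e1 * (r3 - 1.0) + a * e2 * (r2 - r3)) / den
    j2 = e2 * ((1.0 - a) * e3 * (r3 - r2) + e1 * (r2 - 1.0)) / den
    j1 = a * j2 + (1.0 - a) * j3
    return j1, j2, j3

# Values taken or inferred from the worked example in Section 16.6.1.
a, eps1, eps2, eps3 = 0.5, -0.50, 0.50, 1.0

CJ = flux_control_coefficients(a, eps1, eps2, eps3)
Cx = concentration_control_coefficients(a, eps1, eps2, eps3)
row_sums = [sum(row) for row in CJ]

# Large perturbation: enzyme 1 doubled, enzymes 2 and 3 unchanged.
j1, j2, j3 = flux_ratios(a, eps1, eps2, eps3, 2.0, 1.0, 1.0)

for row in CJ:
    print([round(c, 3) for c in row])
print([round(c, 3) for c in Cx])
print(round(2.0 * j1, 2), round(1.0 * j2, 2), round(1.0 * j3, 2))
```

Under these assumptions each row of the flux control matrix sums to one, the metabolite control coefficients sum to zero, and the predicted fluxes for a doubled enzyme 1 reproduce the perturbation data used in the text (about 2.86, 1.29 and 1.57).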

References

Chassagnole, C. et al. 2002. Dynamic modelling of the central carbon metabolism of Escherichia coli. Biotechnol. Bioengin., 79: 53–73.
Degenring, D. et al. 2004. Sensitivity analysis for the reduction of complex metabolism models. J. Process Contr., 14: 729–745.
de Groot, M.J.L., Daran-Lapujade, P., van Breukelen, B., Knijnenburg, T.A., de Hulster, E.A.F., Reinders, M.J.T., Pronk, J.T., Heck, A.R., and Slijper, M. 2007. Quantitative proteomics and transcriptomics of anaerobic and aerobic yeast cultures reveals post-transcriptional regulation of key cellular processes. Microbiology, 153: 3864–3878.
Delgado, J. and Liao, J.C. 1992. Metabolic control analysis using transient metabolite concentration. Biochem. J., 285: 965–972.
Delgado, J.P. and Liao, J.C. 1991. Identifying rate-controlling enzymes in metabolic pathways without kinetic-parameters. Biotechnol. Prog., 7: 15–20.
Fell, D. 1996. Understanding the Control of Metabolism. Portland Press, London.
Hatzimanikatis, V. and Bailey, J.E. 1997. Effects of spatiotemporal variations in metabolic control: approximate analysis using (log)-linear kinetic models. Biotechnol. Bioeng., 57: 75–87.
Haunschild, M.D. et al. 2005. Investigating the dynamic behaviour of biochemical networks using model families. Bioinformatics, 21: 1617–1625.
Haunschild, M.D. et al. 2006. A general framework for large scale model selection. Optimiz. Methods Software, 21: 901–917.
Heinrich, R. and Rapoport, T.A. 1974. A linear steady-state treatment of enzymatic chains: general properties, control and effector strength. Eur. J. Biochem., 42: 89–95.
Heijnen, J.J. 2005. Approximative kinetic formats used in metabolic network modelling. Biotechnol. Bioeng., 91: 534–545.
Heijnen, J.J., van Gulik, W.M., Shimizu, H., and Stephanopoulos, G. 2004. Metabolic flux control analysis of branch points: an improved approach to obtain flux control coefficients from large perturbation data. Metabol. Engin., 6: 391–400.
Heinrich, R. and Schuster, S. 1996. The Regulation of Cellular Systems. Chapman & Hall, New York.
Hofmeyr, J.H. and Cornish-Bowden, A. 1996. Co-response analysis: a new strategy for experimental metabolic control analysis. J. Theor. Biol., 182: 371–380.
Kacser, H. and Burns, J.A. 1973. Rate control in biological processes, Davies, D.D., ed., 65–104, Cambridge University Press, Cambridge.
Kresnowati, M.T.A.P., van Winden, W.A., and Heijnen, J.J. 2005. Determination of elasticities, concentration and flux control coefficients from transient metabolite data using linlog kinetics. Metabol. Eng., 7: 142–153.
Kresnowati, M.T.A.P., van Winden, W.A., Almering, M.J.H., ten Pierick, A., Ras, C., Knijnenburg, T.A., Daran-Lapujade, P.A.S., Pronk, J.T., Heijnen, J.J., and Daran, J.M. 2006. When transcriptome meets metabolome: fast cellular responses of yeast to sudden relief of glucose limitation. Mol. Systems Biol., 2 (49): 1–16.
Lange, H.C., Eman, M., van Zuijlen, G., Visser, D., van Dam, J.C., Frank, J., Teixeira de Mattos, M.J., and Heijnen, J.J. 2001. Improved rapid sampling for in-vivo kinetics of intracellular metabolite in Saccharomyces cerevisiae. Biotechnol. Bioeng., 75 (4): 406–415.
Mashego, M.R., Wu, L., van Dam, J.C., Ras, C., Vinke, J.L., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2004. Miracle: mass isotopomer ratio analysis of U-13-C labeled extracts. A new method for accurate quantification of changes in concentrations of intracellular metabolites. Biotechnol. Bioeng., 85 (6): 620–628.
Mashego, M.R., van Gulik, W.M., Vinke, J.L., Visser, D., and Heijnen, J.J. 2006. In-vivo kinetics with rapid perturbation experiments in Saccharomyces cerevisiae using a second generation Bioscope. Metabol. Eng., 8: 370–383.
Nasution, U., van Gulik, W.M., Pröll, A., van Winden, W.A., and Heijnen, J.J. 2006. Generating short-term kinetic responses of primary metabolism of Penicillium chrysogenum through glucose perturbation in the Bioscope mini reactor. Metabol. Engin., 8 (5): 395–405.
Niederberger, P., Prasad, R., Miozarri, G., and Kacser, H. 1992. A strategy for increasing an in-vivo flux by genetic manipulations. Biochem. J., 287: 473–479.
Nielsen, J. 1995. Physiological engineering aspects of Penicillium chrysogenum. DSc thesis, Technical University of Denmark, Lyngby, Denmark.
Nikerel, I.E., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2006. A method for estimation of in-vivo elasticities in metabolic networks using data from steady-state and rapid sampling experiments with linlog kinetics. BMC Bioinformatics, 7: 540.
Nikerel, I.E., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2007. Linear-logarithmic kinetics; a framework for modeling kinetics of metabolic reaction networks. Simulation News Europe, 17 (1): 19–26.
Nikerel, I.E., van Winden, W.A., Verheijen, P.J.T., and Heijnen, J.J. 2009. Model reduction and a priori kinetic parameter identifiability analysis using metabolome time series for metabolic reaction networks with lin-log kinetics. Met. Eng., 11: 20–30.
Raamsdonk, L.M., Teusink, B., Broadhurst, D., Zhang, N.S., Hayes, A., and Walsh, M.C. et al. 2001. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nature Biotechnol., 19: 45–50.
Rizzi, M. et al. 1997. In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: II. Mathematical model. Biotechnol. Bioeng., 55: 592–608.
Savageau, M.A. 1976. Biochemical Systems Analysis: A Study of Function and Design in Molecular Biology. Addison-Wesley, London.
Small, J.R. and Kacser, H. 1993a. Response of metabolic systems to large changes in enzyme activities and effectors 2. The linear treatment of branched pathway and metabolite concentrations. Assessment of the general non-linear case. Eur. J. Biochem., 213: 625–640.
Small, J.R. and Kacser, H. 1993b. Responses of metabolic systems to large changes in enzyme-activities and effectors. 1. The linear treatment of unbranched chains. Eur. J. Biochem., 213: 613–624.
Teusink, B., Passarge, J., Reijenga, C.A., Esgalhado, E., van der Weijden, C.C., and Schepper, M. et al. 2000. Can yeast glycolysis be understood in terms of in vitro kinetics of the constituent enzymes? Testing biochemistry. Eur. J. Biochem., 267: 5313–5329.
Theobald, U. et al. 1997. In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae: I. Experimental observations. Biotechnol. Bioengin., 55: 305–316.
van Dam, J.C., Eman, M.R., Frank, J., Lange, H.C., van Dedem, G.W.K., and Heijnen, J.J. 2002. Analysis of glycolytic intermediates in Saccharomyces cerevisiae using anion exchange chromatography and electrospray ionisation with tandem mass spectrometric detection. Analytica Chimica Acta, 460 (2): 209–218.
van Gulik, W.M., De Laat, W.T.A.W., Vinke, J.L., and Heijnen, J.J. 2000. Application of metabolic flux analysis for the identification of metabolic bottlenecks in the biosynthesis of Penicillin G. Biotechnol. Bioeng., 68: 602–618.
van Gulik, W.M., van Winden, W.A., and Heijnen, J.J. 2003. Metabolic flux analysis, modeling and engineering solutions. In Handbook of Industrial Cell Culture, Vinci, V. and Parekh, S.R., eds. Humana Press, Totowa, New Jersey.
Vaseghi, S., Baumeister, A., Rizzi, M., and Reuss, M. 1999. In-vivo dynamics of the pentose phosphate pathway in Saccharomyces cerevisiae. Metabol. Eng., 1 (1): 128–140.
Visser, D. and Heijnen, J.J. 2002. The mathematics of metabolic control analysis revisited. Metabol. Engin., 4: 114–123.
Visser, D. and Heijnen, J.J. 2003. Dynamic simulation and metabolic re-design of a branched pathway using lin-log kinetics. Metabol. Engin., 5: 164–176.
Visser, D., Schmid, J.W., Mauch, K., Reuss, M., and Heijnen, J.J. 2004. Optimal redesign of primary metabolism in Escherichia coli using lin-log kinetics. Metabol. Eng., 6: 378–390.
Visser, D. et al. 2002. Rapid sampling for analysis of in vivo kinetics using the BioScope: a system for continuous-pulse experiments. Biotechnol. Bioengin., 79: 674–681.
Voit, E.O. 2000. Computational Analysis of Biochemical Systems. Cambridge University Press, Cambridge.
Wahl, S.A. et al. 2006. Unravelling the regulatory structure of biochemical networks using stimulus response experiments and large scale model selection. IEE Proc. Systems Biol., 153: 275–286.
Westerhoff, H.V. and Kell, D.B. 1987. Matrix method for determining steps most rate limiting to metabolite fluxes in biotechnological processes. Biotechnol. Bioengin., 30: 101–107.
Wiechert, W. 2002. Modeling and simulation: tools for metabolic engineering. J. Biotechnol., 94: 37–63.
Wiechert, W. and Takors, R. 2004. Validation of metabolic models: concepts, tools, and problems. Metabolic engineering in the post genomic era. Horizon Biosci., Wymondham, England.
Wu, L., Wang, W., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2004. A new framework for the estimation of control parameters in metabolic pathways using lin-log kinetics. Eur. J. Biochem., 271: 3348–3359.
Wu, L., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2005. Application of metabolome data in functional genomics: A conceptual strategy. Metabol. Eng., 7: 302–310.
Wu, L., van Dam, J.C., Schipper, D., Kresnowati, M.T.A.P., Pröll, A., Ras, C., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2006. Short term metabolome dynamics and carbon, electron and ATP balances in chemostat-grown Saccharomyces cerevisiae CEN-PK.113-7D following a glucose pulse. Appl. Environ. Microbiol., 72 (5): 3566–3577.
Wu, L., Mashego, M.R., van Dam, J.C., Pröll, A., Vinke, J.L., Ras, C., van Winden, W.A., van Gulik, W.M., and Heijnen, J.J. 2005. Quantitative analysis of the microbial metabolome by isotope dilution mass spectrometry using uniformly 13C-labeled cell extracts as internal standards. Anal. Biochem., 336: 164–171.

17 Structure and Flux Analysis of Metabolic Networks

Kiran Raosaheb Patil and Prashant Madhusudan Bapat
Technical University of Denmark

Jens Nielsen
Chalmers University of Technology

17.1 Introduction
17.2 Metabolic Network Structure
     Representation of Metabolic Networks  •  Structure–Function Relationship
17.3 Network Functionality at Metabolite Level
     Experimental Estimation of Fluxes  •  In Silico Prediction of Fluxes  •  The Fluxome in Metabolic Engineering: Applications  •  Kinetic Models for Flux Simulations
17.4 Conclusions and Future Perspective
References

17.1 Introduction Conceptual understanding of complex cellular organization can be facilitated through a perspective based on the central dogma of biology1 (Figure 17.1). Accordingly, information coded in a genome is translated into proteins via mRNA. Proteins play a variety of roles in a cell, including that of enzymes, which selectively catalyze chemical transformation between metabolites. Ensemble of all nongenetically encoded compounds (thus, excluding mRNA, proteins, etc.) and enzymes operating on them is generally referred to as a metabolic network.2 In essence, metabolic networks convert nutrients available from environment into fundamental building blocks for the synthesis of proteins, DNA, and other cellular components. By providing energy and building blocks for growth and maintenance of cells, metabolic networks play a central role in sustaining life. his key role of metabolic networks in cellular operations is evident by two facts. Firstly, the basic architecture of metabolic networks is largely conserved across several diferent species ranging from microscopic bacteria to humans.3 Second, cellular response and adaptation to genetic/environmental perturbations is oten mediated through or relected in the operation of metabolic networks.4 Although the structure of metabolic networks difer signiicantly at local levels (e.g., speciic pathway structures),3,5 their large-scale conservancy across diferent species implies common biochemical and evolutionary principles underlying their operation.6,7 Understanding such general principles has great implications for: (i) correlating and extrapolating knowledge across diferent species, especially from model organisms (such as yeast) to humans, (ii) devising rational strategies for metabolic engineering, iii) inding remedies for metabolism related diseases, and (iv) synthetic biology. Most metabolic engineering problems are concerned with optimization of metabolic network function at the level of luxes. 
Important exceptions may be found in higher eukaryotes such as plants, where optimization of certain metabolic pools may be of more relevance.8 A flux for any reaction can be defined

[Figure 17.1 diagram. Panel (a): DNA (replication), transcription to mRNA, translation to protein (structural element, enzyme, regulatory protein). Panel (b): enzyme k converting metabolite Mi to Mj, with Flux = f(Mi, Mj, Enzyme k); a metabolic network taking up nutrients and yielding by-products, energy, and building blocks for growth. Example reactions: M1 + M2 -> M3 + M4 :: Enzyme j; M4 <-> M5 :: Enzyme k; M5 + M6 -> M1 + M7 :: Enzyme j.]

FIGURE 17.1 (a) The central dogma in molecular biology. DNA replication can be thought of as information flow (back-up) from genome to genome. Information coded in genes flows to proteins via transcription and translation. Proteins may play a variety of functional roles in a cell; only three roles are shown as examples. (b) Enzymes catalyze chemical transformations of metabolites. The rate of an enzyme-catalyzed reaction (flux) is a function not only of enzyme availability and properties, but also of the concentrations of substrates and products. Several such enzymatic steps constitute a metabolic network in which the products of some reactions serve as substrates for other reactions, thus creating an interconnected reaction web. The overall function of a metabolic network can then be viewed as utilizing environmentally available nutrients to generate energy and building molecules for growth and maintenance of the cell.

as the amount of substrates processed (or products produced) per unit time. The whole metabolic network can be viewed as an interconnected set of mass flow channels. Most microbial metabolic engineering problems can then be represented as an optimization of a certain set of cellular exchange fluxes, i.e., rates of secretion/uptake of compounds of interest. Knowledge of the intracellular flux distribution and computational tools for predicting fluxes in mutant strains are thus of prime importance for metabolic engineering. Some of the key aspects of network structure and flux analysis relevant for metabolic engineering applications are depicted in Figure 17.2. We note here that most of the discussion in this chapter is presented with a global (genome-scale) view of the metabolic network. Although most flux and structure analysis tools are usually applied to semi-global reduced networks, the use of genome-scale metabolic models will be inevitable in modern metabolic engineering studies. We have therefore also refrained from discussing tools and approaches that are based on the isolated analysis of selected pathways.

17.2 Metabolic Network Structure

17.2.1 Representation of Metabolic Networks

Use of appropriate representations for depiction and analysis is an essential element in discovering universal organizational and operational principles in metabolic networks. Convenient and biologically

[Figure 17.2 diagram. Inputs: micro-organism; genome sequence, biochemical data, literature. Structure analysis (graph topology, Petri net, flux coupling, network-based data integration) yields a genome-scale model and a reduced model. Flux analysis combines modeling tools (FBA, ROOM, MOMA, EBA; simulated fluxes) with experimental tools (13C flux analysis, metabolite analysis; experimental fluxes). Structure–function relationships feed the metabolic engineering tools: OptGene, OptKnock, OptStrain, heuristic-based methods, dynamic optimization.]

FIGURE 17.2 Schematic overview of tools and information flow in global structure and flux analysis of metabolic networks. Structure analysis: the information retrieved from the genome sequence, biochemistry, and the literature can be utilized for deducing the metabolic network structure for the given organism. Existing reduced models are often used as templates for reconstructing global models. Flux analysis: the connectivity and stoichiometry from structure analysis are systematically exploited for measurement and simulation of intracellular fluxes. The integrated information can then be applied for deducing the underlying structure–function relationship. Metabolic engineering tools: flux analysis and the structure–function relationship are used for identifying metabolic engineering targets.

meaningful representations help not only in comparing different metabolic networks, on both global and local scales, but also in quantitatively categorizing different network structures. Furthermore, network structure has also been shown to be an inherent element in genome-scale data integration.4,9 We hence begin with a brief overview of the different representations of metabolic networks and then discuss the structure–function relationship. To facilitate the description of the different network representations, an example metabolic network is illustrated in the different forms discussed in the following text (Figure 17.3).

17.2.1.1 Pathway Representation

A pathway is the oldest and perhaps the most commonly used way to represent a metabolic network. A pathway generally depicts a part of metabolism (a collection of enzyme-catalyzed reactions and the corresponding metabolites) that performs a certain biochemical task. Examples of pathways include the TCA cycle, the Embden–Meyerhof–Parnas pathway, histidine biosynthesis, etc. Such representations are familiar to biologists due to their widespread use in biochemistry textbooks and online databases such as KEGG (Kyoto Encyclopedia of Genes and Genomes, http://www.kegg.com).10 In a pathway representation, metabolites are shown as nodes (usually simply as text) and enzymes as arrows connecting the corresponding metabolites. Currency metabolites, i.e., the metabolites taking part in a large number of reactions (e.g., NADH, ATP, CO2, etc.), are shown only at the individual reaction level. Although the pathway representation is generally used only for small parts of metabolism (e.g., KEGG), there are examples of pathway-style representation of genome-scale metabolic networks (e.g., see the EcoCyc database, http://ecocyc.org/). Indeed, categorization and pictorial depiction of biochemical functionality is an important aid to the human understanding of the operational and regulatory logic of metabolism.
Consequently, several pathway-based analysis tools are routinely used for the analysis of metabolic networks. Despite their usefulness, there are certain drawbacks and pitfalls in the use of pathway-based representations. Most of these drawbacks can be attributed to two facts: (i) pathways often fail to account for the high connectivity of metabolic networks, and (ii) the definition of a pathway is vague and does not necessarily correspond strictly to a particular physiological functionality (e.g., homeostasis of metabolites, balanced flux distribution, etc.). Indeed, many metabolites span several different pathways, since the end-points or intermediates of one pathway often act as substrates/products in other pathways (see Table 17.1 for a list of selected metabolites and the number of yeast KEGG pathways they participate in). Consequently, the choice of reactions shown as part of a pathway is rather arbitrary. These drawbacks are becoming more apparent as large amounts of genome-wide data on gene expression, protein abundances, metabolite levels, and fluxes are being generated. Complex patterns observed in these datasets seldom fit into standard pathway definitions, and thus the operation of metabolic networks cannot be explained through them. Consequently, other, more comprehensive representations are being sought in order to systematically describe metabolic networks.

17.2.1.2 Graph Theoretical Representation

Complex cellular organization can be viewed as an ensemble of several biomolecular interaction networks, such as protein–protein and protein–DNA interaction networks. Functional relationships between cellular species can also be conceptualized as interactions, in addition to the interactions arising from physical contacts between biomolecules. Metabolic reactions can thus be seen as functional relationships between metabolites and vice versa.
An interaction-centered view of cellular metabolism can be used to construct a graph theoretical representation (Figure 17.3b) in which metabolites and reactions (or the corresponding enzymes) are represented as nodes, while interactions between them form edges. The resulting graph is essentially bi-partite, meaning that there are two classes of nodes, viz. metabolite nodes and enzyme nodes, and no two nodes in the same class are directly connected to each other. Two other uni-partite metabolic graphs can be derived from the bi-partite representation. In the reaction interaction graph (Figure 17.3c), only reactions are represented as nodes, and two reactions sharing one or more common metabolites are connected by an edge. Similarly, a metabolite interaction graph can also be

[Figure 17.3 diagrams, panels (a) to (e). Reaction nodes: Glycolysis, LDH, PFL, PDH, ALS, ILVB, ALDB, BUTA, BUTB, PTA, ACKA, ADHE, ADHA, chemical oxidation. Metabolite nodes: glucose, pyruvate, lactate, formate, acetyl-CoA, acetyl-P, acetate, acetaldehyde, ethanol, 2-acetolactate, acetoin, diacetyl, 2,3-butanediol, CO2, O2, ATP, ADP, NADH, NAD+.]

FIGURE 17.3 Different depictions of a metabolic network that are commonly used for visualization, conceptual representation, or unraveling underlying structural and functional properties. A small schematic representation of pyruvate metabolism in Lactococcus lactis54 is used as an example. (a) Traditional pathway-style representation. Reactions are usually represented as arrows showing conversion of the corresponding metabolites. Highly connected metabolites (especially cofactors such as NADH and ATP) are displayed only locally, at the individual reaction level. (b) Graph theoretical representation of the metabolic network (bi-partite view). Reactions (circles) and metabolites (squares) form two classes of nodes in a graph, and the interactions among them form the corresponding edges. (c) Reaction interaction graph. This is another graph theoretical view of the metabolic network, in which only reactions form the nodes while the metabolites form the edges connecting the corresponding reactions. (d) Metabolite interaction graph. A metabolite-centered graphical representation in which metabolites form the nodes while enzymes act as edges connecting the corresponding metabolites. (e) Flux coupling graph. A concept based on the stoichiometric constraints on the operation of the network. All reactions are represented as nodes. Two reactions for which the corresponding fluxes are correlated are connected by an edge (directional coupling: dashed line; full coupling: thick solid line).

TABLE 17.1 Selected Metabolites from Yeast Central Metabolism That Participate in Several Different Pathway Definitions as per the KEGG Database

Metabolite              Number of KEGG Pathways
Glucose 6-phosphate     6
NH3                     7
Glucose                 6
L-Glutamine             5
Glyoxylate              5
2-Oxoglutarate          12
Urea                    4
Isocitrate              3
Oxaloacetate            9
Acetyl-CoA              23
Malate                  7

constructed (Figure 17.3d). Graph theoretical representations of metabolism offer several advantages, including conservation of the global connectivity in the network: it is not necessary to remove the highly connected nodes from the network in order to simplify the analysis. Another gain is the algorithm-friendly data structure offered by graph-theoretical representations. Indeed, several algorithms for the analysis of metabolic networks are based on graph data structures. Graph-theoretical representations thus provide a platform for systematic integration of omics data with metabolic networks in order to discover new biological patterns and hypotheses.

17.2.1.3 Stoichiometric Matrix

Although not falling strictly into the category of visual representation, a collective stoichiometric matrix of all reactions comprising a network has important theoretical and practical implications in the analysis of network structure and function. The stoichiometric information that is usually missing from the previously described network depictions is systematically arranged in a stoichiometric matrix. The flow of mass through metabolic networks (fluxes) can be calculated, estimated, measured, or understood only in the light of the stoichiometry of the reactions occurring in the network. Indeed, at (pseudo-) steady-state conditions the stoichiometric matrix defines the feasible space for all possible flux solutions in the network. By using a stoichiometric matrix, it is thus possible to enumerate all possible steady-state flux solutions of a given network. For example, the boundaries of the feasible solution space can be identified in terms of elementary flux modes. An elementary flux mode is defined as a set of reactions that can operate at steady state and cannot be further decomposed into smaller such sets. Consequently, any flux distribution at steady state can be expressed as a weighted linear combination of elementary flux modes.
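The steady-state condition and the decomposition into elementary flux modes can be illustrated with a minimal sketch. The four-reaction network, its matrix `S`, and the function names below are hypothetical illustrations and are not taken from Figure 17.3; the two elementary modes were written down by inspection rather than computed by an enumeration algorithm.

```python
# Hypothetical toy network (not from the chapter):
#   R1: -> A      R2: A -> B      R3: B ->      R4: A ->
# Rows are internal metabolites (A, B); columns are reactions R1..R4.
S = [
    [1, -1, 0, -1],  # A
    [0, 1, -1, 0],   # B
]


def net_production(stoich, fluxes):
    """Return S*v: the net production rate of each internal metabolite."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in stoich]


def is_steady_state(stoich, fluxes, tol=1e-9):
    """True if no internal metabolite accumulates or is depleted."""
    return all(abs(rate) <= tol for rate in net_production(stoich, fluxes))


def combine(modes, weights):
    """Weighted sum of flux modes, one entry per reaction."""
    n_reactions = len(modes[0])
    return [sum(w * m[j] for w, m in zip(weights, modes))
            for j in range(n_reactions)]


# The two elementary flux modes of the toy network (found by inspection).
efm_via_b = [1, 1, 1, 0]  # R1 -> R2 -> R3
efm_via_a = [1, 0, 0, 1]  # R1 -> R4

# Any weighted combination of the modes is again a steady-state flux vector.
v = combine([efm_via_b, efm_via_a], [2.0, 3.0])
print(v, is_steady_state(S, v))
```

For genome-scale networks the number of elementary modes grows very quickly, so a sketch like this is useful only for building intuition on small examples.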
Elementary flux modes for the example metabolic network shown in Figure 17.3a are listed in Table 17.2. Further applications of the stoichiometric matrix for flux estimation are discussed in the second part of this chapter. The different representations of metabolic networks discussed above must be seen only as different ways of depicting the same data. In this sense, the stoichiometric matrix is perhaps the most complete data structure, as it holds stoichiometric coefficients in addition to connectivity information. However, the bi-partite graph can also be easily extended to incorporate stoichiometric coefficients (e.g., as edge properties). On the other hand, any graph structure can always be represented in a matrix format (the most common example being the adjacency matrix). Thus, the choice of representation should be dictated by the intended application. It must be noted, however, that the pathway representation is perhaps mostly useful for pictorial depiction, due to the limitations of this approach discussed before. In contrast, since other approaches usually do not make any a priori assumptions on a particular set of reactions/metabolites

TABLE 17.2 Elementary Flux Modes for the Reactions Depicted in Figure 17.3a

Mode    Overall Reaction
E1      Glucose + 2 ADP = 2 ATP + 2 Lactate
E2      Glucose + 2 ADP = 2 ATP + 2 Ethanol + CO2
E3      Glucose + 2 ADP + O2 = 2 ATP + 2,3-Butanediol + 2 CO2
E4      Glucose + 3 ADP = 3 ATP + Ethanol + Acetate + 2 Formate
E5      1.5 Glucose + 3 ADP = 3 ATP + Ethanol + 2,3-Butanediol + 2 CO2 + Formate

being a part of a particular process, they are more successful in uncovering the principles underlying complex operations of metabolic networks, both in terms of fluxes and their regulation. Indeed, it is only for visual and conceptual convenience that large, highly connected metabolic networks are partitioned into pathways. Some examples illustrating this idea are discussed in the following section.
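That the representations are different views of the same data can be made concrete with a short sketch that derives the bi-partite graph, the reaction interaction graph, and the metabolite interaction graph from one stoichiometric matrix. The toy network and all function names are hypothetical illustrations; stoichiometric coefficients are kept as edge properties of the bi-partite graph.

```python
from itertools import combinations

# Hypothetical toy network (not from the chapter):
#   R1: -> A      R2: A -> B      R3: B ->      R4: A ->
metabolites = ["A", "B"]
reactions = ["R1", "R2", "R3", "R4"]
S = [
    [1, -1, 0, -1],  # A
    [0, 1, -1, 0],   # B
]


def bipartite_edges(stoich, mets, rxns):
    """Metabolite-reaction edges, with the coefficient as an edge property."""
    return {(m, r): stoich[i][j]
            for i, m in enumerate(mets)
            for j, r in enumerate(rxns)
            if stoich[i][j] != 0}


def reaction_interaction_edges(stoich, rxns):
    """Connect two reactions whenever they share at least one metabolite."""
    edges = set()
    for row in stoich:
        users = [r for r, c in zip(rxns, row) if c != 0]
        edges.update(frozenset(pair) for pair in combinations(users, 2))
    return edges


def metabolite_interaction_edges(stoich, mets):
    """Connect two metabolites whenever they take part in the same reaction."""
    edges = set()
    for j in range(len(stoich[0])):
        members = [m for m, row in zip(mets, stoich) if row[j] != 0]
        edges.update(frozenset(pair) for pair in combinations(members, 2))
    return edges


print(bipartite_edges(S, metabolites, reactions))
print(reaction_interaction_edges(S, reactions))
print(metabolite_interaction_edges(S, metabolites))
```

The two uni-partite graphs discard the coefficients, which is one way to see why the stoichiometric matrix is the more complete data structure.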

17.2.2 Structure–Function Relationship

17.2.2.1 Topological and Functional Features of Network Elements

One of the simplest and most intuitive measures of the topological importance of elements in a network is the degree of a node. The degree of a node is defined as the number of edges connected to that node (or the number of its immediate neighbors). It may also be convenient to distinguish between in-degree and out-degree in a directed graph (see Figure 17.4 for an illustration of some graph-related definitions). Although relatively simple, the distribution of node degrees in a network can elucidate several structural characteristics of the network. The degree distribution of metabolite nodes in the bi-partite graph of Saccharomyces cerevisiae is shown in Figure 17.5. This distribution reveals an important feature of metabolic networks, namely the existence of a few metabolites that participate in a large number of reactions (e.g., ATP, NADH, and NADPH), while most metabolites take part in relatively few reactions. The degree distribution of metabolites thus follows a power law, $P(k) \propto k^{-\gamma}$, where $\gamma$ is a constant. Although the existence of highly connected metabolites has long been known in biology (as currency/cofactor metabolites), the network structure at genome scale allows a systematic study of network topology and structure–function relationships from applied and evolutionary perspectives. Study of several metabolic networks across all three domains of life has revealed that the power law degree distribution is prevalent among them.11,12 Interestingly, the power law degree distribution indicates a scale-free organization of metabolic networks, in line with other physical/biological networks occurring in nature. As the name implies, “scale-free” networks show similar basic topological features irrespective of the scale at which the network is viewed.
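As a rough illustration of how a degree distribution and its exponent might be obtained, the sketch below counts metabolite degrees from a stoichiometric matrix and estimates γ from the slope of log P(k) against log k. The function names are hypothetical, and the least-squares fit on a log-log scale is only a crude estimator assumed here for brevity; it is not a method described in this chapter.

```python
import math
from collections import Counter


def metabolite_degrees(stoich):
    """Degree of each metabolite node: number of reactions it takes part in."""
    return [sum(1 for c in row if c != 0) for row in stoich]


def degree_distribution(degrees):
    """Map each observed degree k to its relative frequency P(k)."""
    counts = Counter(degrees)
    total = len(degrees)
    return {k: c / total for k, c in sorted(counts.items())}


def fit_power_law_exponent(distribution):
    """Estimate gamma in P(k) ~ k**(-gamma) by least squares on log-log data."""
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    if len(points) < 2:
        raise ValueError("need at least two distinct degrees to fit a slope")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx  # gamma is the negative of the fitted slope


# Synthetic check: an exact power law with gamma = 2.2 (unnormalized).
synthetic = {k: k ** -2.2 for k in range(1, 51)}
print(round(fit_power_law_exponent(synthetic), 6))
```

On real networks the tail of the distribution is sparse and noisy, so the fitted exponent should be read as indicative only.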
As mentioned above for metabolic networks, such networks are characterized by the presence of a few highly connected nodes (hubs), while the rest of the nodes have relatively few links. Hubs bestow the small-world property on metabolic networks,13 meaning that any two nodes are, on average, at a relatively small distance from each other. Although scale-free networks have been found in many biological and physical systems, a scale-free structure would not be expected if these networks had been created through a random process. Thus, scale-free networks also display certain properties that are not found in random networks. Perhaps the most interesting property of scale-free networks in the metabolic engineering context is their robustness against random failures.11 Since most of the nodes have relatively low connectivity, deletion of randomly selected nodes is unlikely to substantially alter the connectivity in the network. On the other hand, the presence of a few highly connected nodes (hubs) makes the network susceptible to targeted attacks. To what extent do these simple topological measures explain the functioning and evolutionary origin of metabolic network structure? A metabolic network can often be conceptually viewed as a collection of modules working together (see the previous section on metabolic network representation). Such

[Figure 17.4 diagram. Nodes n1 to n5 drawn as (a) an undirected graph labeled with node degrees and (b) a directed graph labeled as in-degree + out-degree = degree.]

FIGURE 17.4 Illustration of some basic graph-related definitions. (a) Undirected graph. Edges do not carry any information about the direction of flow (of information, mass, energy, etc.) between the nodes. This implies either that such flow is possible in both directions or that no information is available on the directionality of edges. (b) Directed graph. In contrast to an undirected graph, edges are “arrowed” and imply the possible direction of flow in the network.

[Figure 17.5 plot. x-axis: degree of metabolites (0 to 160); y-axis: frequency (0 to 500).]

FIGURE 17.5 Distribution of the degree of metabolites in the bi-partite graph representing the genome-scale metabolic network of Saccharomyces cerevisiae. (From Forster, J., Famili, I., Fu, P., Palsson, B.O., Nielsen, J. Genome Res., 2003, 13 (2), 244–253. With permission.)

modularization could be based on, e.g., the chemical nature of metabolites.14 Indeed, it has been computationally shown that the scale-free nature of metabolic networks can be explained through a hierarchical modular organization that evolves based on the “rich get richer” principle.12 Thus, a network is built starting with small non-scale-free modules that replicate and connect preferentially to other modules. Several of the modules obtained by decomposing the E. coli metabolic network were found to coincide with known biochemical functional modules in metabolism. Thus, the network structure not only provides clues to the evolutionary origin of organization in metabolic networks but may also help in automated and robust classification of metabolism into different functional units.15,16 Another key piece of information emerging from (or confirmed through) the topological analysis is the “bow tie” architecture of metabolic networks.16,17 Several nutrients thus enter the central knot, while different biosynthetic building blocks fan out from this knot. The central knot represents the 12 precursor molecules from which amino acids, nucleotides, and other essential components are built. Furthermore, redox and energy cofactors and other hub metabolites act as connecting links between the central knot and other parts of the metabolic network. The bow tie architecture bestows a remarkably balanced flexibility, robustness, and thus evolvability on metabolic networks. This architecture can be seen as a combination of standardization and “plug and play” type modularity of nutrient intake and secondary metabolism, achieved through a fixed set of precursor molecules. For example, new pathways for antibiotic synthesis can easily be acquired by an organism through horizontal gene transfer, since their synthesis will start from the existing precursor molecules. Robust and global control of the complex network is achieved via hub metabolites.
This modular yet flexible design also allows the cell to keep a minimum inventory of metabolites and to synthesize the necessary building blocks for growth just in time. On the other hand, the bow tie nature of metabolic networks also makes them fragile against changes in the central core and hub metabolites. From a metabolic engineering point of view, the bow tie architecture can be used to formulate rules of thumb for choosing/rejecting certain targets for genetic manipulation. Indeed, it has been observed on several occasions that perturbations in the metabolic network that adversely affect either precursor or hub metabolites (e.g., ATP) often lead to deleterious phenotypes. The central role of metabolite hubs can, on the other hand, also be exploited for redirecting fluxes toward desired products.18,19 The plug and play nature of the “fan-in” and “fan-out” parts of bow ties can be exploited for creating super hosts for the production of heterologous proteins or secondary metabolites. Moreover, these rules of thumb will also help in devising better strategies for combating infectious microorganisms20 and understanding metabolic diseases.17 Understanding of large-scale organization principles can thus lead to the formulation of more complete modeling platforms for in silico metabolic engineering. This can be achieved, e.g., by exploiting the general principles of operation rather than focusing on very detailed kinetic modeling, for which reliable in vivo information is difficult to obtain on a whole-network scale.

17.2.2.2 Fluxes and Metabolic Network Structure

In contrast to protein–protein interaction networks, where a good correlation has been observed between the essentiality of a protein for growth and the number of interactions that it takes part in, no such strong correlation was observed in metabolic networks.18,21 Thus, the operation of metabolic networks appears to be fundamentally different from that of other biological networks (e.g., protein–protein interaction networks) and technological networks (e.g., the internet).
Furthermore, connectivity in commonly used graph theoretical representations of a metabolic network does not completely represent mass and energy flow through the network. This is because the stoichiometry and the transfer of structural moieties between metabolites are not generally accounted for in graph representations. Consequently, although the topology of a metabolic network implies a small world, this characteristic high connectivity does not hold when strict biochemical transformation networks are considered.22 Since most metabolic engineering applications are aimed at manipulation and redirection of fluxes, it is vital to account for the relationships between different reactions in the network not only at the shared-metabolite level (as reflected in the topology), but also at the flux level. Elucidation of such relationships can only be achieved by systematically accounting for the stoichiometry of all reactions involved.


Flux coupling analysis, an elegant mathematical formulation reported by Burgard et al.,23 can be used to identify connectivity at the level of fluxes (flux coupling) under the assumption of steady-state operation. Flux coupling analysis uses linear programming to decide whether flux through a particular reaction implies a fixed/variable flux through other reactions such that no metabolites are accumulated or depleted in the cell. Thus, two fluxes f1 and f2 can be (i) fully coupled, i.e., a nonzero flux for f1 implies a nonzero and fixed flux for f2 and vice versa; (ii) partially coupled, i.e., a nonzero flux for f1 implies a nonzero, though variable, flux for f2 and vice versa; (iii) directionally coupled, i.e., a nonzero flux for f1 implies a nonzero flux for f2 but not necessarily the reverse; or (iv) uncoupled, i.e., the two fluxes operate independently. Comparison of Figure 17.3c and e shows the difference between the reaction interaction graph and the flux coupling graph for the same metabolic network. In particular, the flux coupling graph extends much further than the connectivity implied by the metabolites participating in the corresponding reactions. Flux coupling analysis can thus not only greatly aid metabolic engineering by revealing distant and nonintuitive relationships, but also provide a new representation of the metabolic network that can be used as a data integration scaffold. Interestingly, the topology of the genome-scale E. coli flux coupling graph also shows a scale-free architecture,23 and so does the global organization of fluxes.24 Thus, a metabolic network is characterized by large fluxes through a few reactions, while most reactions carry relatively low fluxes. A few of the fluxes also act as hubs by being coupled to a large number of fluxes throughout the network. This topological similarity between the different structural counterparts of a metabolic network underscores the common global principle of their operation.
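The four coupling classes can be illustrated on a small example. The sketch below does not implement the linear programming formulation of Burgard et al.; as a simpler stand-in, it assumes that all reactions are irreversible and that the complete set of elementary flux modes is already known, in which case every steady-state flux vector is a nonnegative combination of the modes and the coupling type can be read off from them. The toy modes and function names are hypothetical.

```python
def is_active(mode, index, tol=1e-12):
    """True if the reaction at `index` carries flux in this mode."""
    return abs(mode[index]) > tol


def implies(modes, i, j):
    """True if every mode that uses reaction i also uses reaction j."""
    return all(is_active(m, j) for m in modes if is_active(m, i))


def classify_coupling(modes, i, j, tol=1e-9):
    """Classify the coupling between reactions i and j from elementary modes.

    Assumes irreversible reactions and a complete list of elementary modes.
    """
    if not any(is_active(m, i) for m in modes) or \
            not any(is_active(m, j) for m in modes):
        return "blocked"  # at least one reaction can never carry flux
    i_needs_j = implies(modes, i, j)
    j_needs_i = implies(modes, j, i)
    if i_needs_j and j_needs_i:
        # Both fluxes always occur together; check whether their ratio is fixed.
        ratios = [m[i] / m[j] for m in modes if is_active(m, j)]
        if max(ratios) - min(ratios) <= tol:
            return "fully coupled"
        return "partially coupled"
    if i_needs_j:
        return "directionally coupled (i -> j)"
    if j_needs_i:
        return "directionally coupled (j -> i)"
    return "uncoupled"


# Toy network: R1: -> A, R2: A -> B, R3: B ->, R4: A ->  (columns R1..R4)
modes = [[1, 1, 1, 0], [1, 0, 0, 1]]
print(classify_coupling(modes, 1, 2))  # R2 vs R3
print(classify_coupling(modes, 1, 0))  # R2 vs R1
print(classify_coupling(modes, 1, 3))  # R2 vs R4
```

Enumerating elementary modes is impractical for genome-scale models, which is one reason a linear programming formulation is attractive there.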
17.2.2.3 Network Structure and Regulation

The relevance of graph theoretical analysis to metabolic engineering is perhaps not as directly evident as that of flux coupling analysis (and other stoichiometry-centered steady-state approaches). However, in several metabolic engineering problems in microorganisms, as well as in problems involving mammalian and plant cells, dynamic metabolic operation is of interest. Furthermore, steady-state analysis approaches typically reveal only the boundaries of the operation of metabolic fluxes, and thus the observed solution is not always theoretically deducible from the stoichiometry alone. Cellular metabolism, as reflected in the metabolite levels and fluxes, is an integrated result of mass balance constraints (stoichiometric constraints) and regulation at several different levels. The inherent interdependency between enzymatic regulation, metabolite levels, and fluxes is partially reflected in the high connectivity of metabolic graphs. Both metabolite and enzyme nodes potentially contribute to the regulation of metabolite levels and fluxes. Disturbances at any node (or nodes) of the network can then spread through the highly connected network in terms of changes in metabolite and enzyme levels and in fluxes. Consequently, it can be hypothesized that the topology of the interactions involved in metabolism can be used to understand the underlying regulatory mechanisms (e.g., at the transcriptional level) controlling the flow of mass and energy. This hypothesis was formalized into an algorithm by Patil and Nielsen.4 The algorithm integrates gene-expression data with topological information from genome-scale metabolic models, and thus enables systematic identification of so-called reporter metabolites, which represent hot spots in terms of metabolic regulation. Several metabolites, especially those with high connectivity, usually span many pathways and act as connecting bridges across these pathways.
Consequently, pathways as a whole are not subjected to strict stoichiometric/thermodynamic constraints on their own. Constraints on a pathway can thus only be invoked in connection with other connected pathways, due to the overlap of metabolites across pathways. In contrast, coordinated transcriptional changes around metabolites are indeed necessary for one (or both) of two reasons: either to maintain homeostasis, or to change the enzyme and metabolite levels so as to adjust to the new flux demands placed on the metabolic network by perturbations. Thus, the transcriptional coregulation of the genes surrounding a metabolite is, in part, a stoichiometric and thermodynamic necessity, and reporter metabolites indicate specific parts of metabolism where significant transcriptional regulation is exerted.


In order to identify the reporter metabolites, each metabolite node in a metabolic graph is scored based on the normalized transcriptional response of its k neighboring enzymes:

$$Z_{\text{metabolite}} = \frac{1}{\sqrt{k}} \sum_{i=1}^{k} Z_{ni}$$

where $Z_{ni}$ is the score of the ith neighboring enzyme, typically estimated as the inverse normal cumulative distribution of the p-value indicating the significance of the expression change. The $Z_{\text{metabolite}}$ scores should be corrected for the background distribution by subtracting the mean ($\mu_k$) and dividing by the standard deviation ($\sigma_k$) of the aggregated Z-scores of several sets of k enzymes chosen randomly from the metabolic graph:

$$Z_{\text{metabolite}}^{\text{corrected}} = \frac{Z_{\text{metabolite}} - \mu_k}{\sigma_k}$$
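The scoring and background correction can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the number of random samples, the fixed seed, and the convention of converting a p-value with Z = Φ⁻¹(1 − p) are assumptions made here, as is the normalization of the summed scores by the square root of k.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def enzyme_z(p_value):
    """Z-score of one enzyme from its p-value (assumed: Z = inv_cdf(1 - p)).

    The p-value must lie strictly between 0 and 1.
    """
    return NormalDist().inv_cdf(1.0 - p_value)


def metabolite_z(neighbor_z):
    """Aggregate the Z-scores of the k enzymes neighboring a metabolite."""
    k = len(neighbor_z)
    return sum(neighbor_z) / math.sqrt(k)


def corrected_metabolite_z(neighbor_z, all_enzyme_z, n_samples=10000, seed=0):
    """Correct the metabolite score against random sets of k enzymes."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = len(neighbor_z)
    background = [metabolite_z(rng.sample(all_enzyme_z, k))
                  for _ in range(n_samples)]
    mu_k = mean(background)
    sigma_k = pstdev(background)
    return (metabolite_z(neighbor_z) - mu_k) / sigma_k


# Hypothetical p-values for eight enzymes; the first two neighbor a metabolite.
p_values = [0.01, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 0.95]
all_z = [enzyme_z(p) for p in p_values]
score = corrected_metabolite_z(all_z[:2], all_z, n_samples=2000)
print(score > 0)
```

Metabolites whose corrected score exceeds a chosen significance threshold would then be reported; the threshold itself is a choice left to the analyst.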

The scoring used for identifying reporter metabolites is basically a test of the null hypothesis, “neighbor enzymes of a metabolite in the metabolic graph show the observed normalized transcriptional response by chance.” The metabolites with significant scores are defined as reporter metabolites.

17.2.2.4 Transcriptionally Responsive Subnetworks

Metabolic changes in a metabolic network are characterized by coordinated changes throughout the network. An extension of the reporter algorithm4 searches the enzyme interaction graph to identify a subnetwork with maximum collective transcriptional response. Thus, while reporter metabolites probe local points in the metabolic network for significant changes, subnetworks paint a global picture of the transcriptional regulation. Both reporter metabolites and subnetworks can find small but coordinated changes in a network without a priori assumptions on particular pathway structures. Together, these tools have successfully been employed to correlate transcriptional changes with flux changes in a mutant strain.4 Because of the strong biological hypothesis underlying the reporter algorithm, it can also easily be used to integrate other omics data with metabolic networks. An example is the use of metabolome data together with transcriptome data to predict whether a particular flux is controlled at the hierarchical level or the metabolic level.25 At present, it is not possible to reliably estimate fluxes in many different parts of metabolism, while mRNA expression can be measured for all genes in a sequenced organism. Moreover, metabolome and proteome data are becoming increasingly available for different parts of metabolism. Consequently, the reporter and subnetwork algorithms are valuable tools for obtaining a holistic picture of metabolic changes, even from the flux point of view. In cases where fluxome data are available, they can be used to improve the results/predictions from the reporter/subnetwork algorithm.
Another approach that uses stoichiometric constraints in addition to topology for elucidation of regulatory logic is based on elementary flux modes. Stelling et al.26 introduced the concept of control-effective flux, which accounts for both network efficiency and flexibility at a particular node in the network. Control-effective flux is defined as the average flux through a reaction in all elementary modes, whereby for each mode the actual flux is weighted by the mode's efficiency in terms of supporting cellular growth. Transcriptional changes were found to correlate well with control-effective fluxes for several metabolic genes, results that could not be explained by considering only optimal routes. Accounting for all elementary flux modes thus accounts for network flexibility, an important characteristic that bestows robustness on cellular networks.
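The weighted average in this definition can be made concrete with a small sketch. It is a simplified reading of the verbal definition above, not a reproduction of the published method: the three elementary modes are made up, the function name is hypothetical, and the efficiency of a mode is assumed to be its growth flux divided by the sum of absolute fluxes in that mode, with each mode normalized to unit substrate uptake.

```python
def control_effective_fluxes(modes, growth_index):
    """Efficiency-weighted average of |flux| over elementary modes.

    modes: list of flux vectors, each normalized to unit substrate uptake.
    growth_index: position of the growth (biomass) reaction in each vector.
    """
    # Assumed efficiency: growth output per unit of total flux invested.
    efficiencies = [m[growth_index] / sum(abs(x) for x in m) for m in modes]
    total = sum(efficiencies)
    n_reactions = len(modes[0])
    return [
        sum(e * abs(m[i]) for e, m in zip(efficiencies, modes)) / total
        for i in range(n_reactions)
    ]


# Three hypothetical elementary modes over 4 reactions (last one = growth).
modes = [
    [1.0, 1.0, 0.0, 0.5],  # efficient route through reaction 2
    [1.0, 0.0, 1.0, 0.2],  # less efficient route through reaction 3
    [1.0, 1.0, 1.0, 0.0],  # mode without growth: efficiency (weight) 0
]
cef = control_effective_fluxes(modes, growth_index=3)
print([round(x, 4) for x in cef])
```

Reaction 2 receives a higher control-effective flux than reaction 3 because it participates in the more efficient mode, even though both routes remain available to the network.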

17.3 Network Functionality at Metabolite Level

Due to the high connectivity between and within various metabolic processes, the space of possible flux distributions in a given metabolic network is very large. In other words, substrates consumed by cells can be distributed through metabolic channels in numerous ways. Rechanneling this mass flow toward a desired compound thus demands an understanding of the biological basis of a particular flux distribution under a given condition. This task is challenging owing to the complexity of the factors constraining and regulating fluxes. The flux at any given reaction in the network is an (often unknown) integrated function of enzyme activity, substrate and product concentrations, and the underlying kinetic mechanisms. Enzyme activity in turn is a function of the transcriptional and translational efficiency of the corresponding protein as well as accompanying regulatory mechanisms. Thus, a given flux can be thought of as being regulated at the hierarchical level (from gene to enzyme activity) and the metabolic level (kinetic dependence of flux on metabolite pools).27 Since it is now possible to quantify a large number of intracellular metabolite pools, it is possible to infer whether reactions are hierarchically or metabolically regulated. For example, the principle underlying the reporter metabolite algorithm can be used to map different layers of regulation within metabolic networks through a combination of metabolome and transcriptome data.28 However, such analysis usually reveals the regulatory architecture only in qualitative terms and for a given set of experimental conditions. Indeed, the high connectivity of cellular processes at both the hierarchical and metabolite levels, as well as regulatory interactions, contributes to the complexity of the dependence of flux on genotype under given environmental conditions. On the other hand, this complexity can be conveniently exploited by viewing fluxes as an integrated outcome of all complex cellular processes.29 Full exploitation of this view motivates the development of tools for measuring in vivo fluxes in the system under investigation. One of the useful simplifications that can be applied on both the theoretical and experimental fronts of flux measurement is the assumption of (pseudo) steady state. We briefly discuss both theoretical and experimental flux-estimation tools in the following text.

17.3.1 Experimental Estimation of Fluxes

There are no direct methods available for the analysis of in vivo metabolic fluxes. Intracellular fluxes, or in vivo reaction rates, can be quantified by combining experimental metabolite measurements with mass balances applied around intracellular metabolites. The mass balances are based on the stoichiometry of the intracellular reactions that are included in the metabolic model and are largely based on assumed biochemistry.30 The key assumption mentioned above means that, for a given metabolic network, the balances around each metabolite impose a number of constraints on the system. In general, if there are J fluxes and K metabolites, then the degree of freedom is F = J − K, and through measurements of only F fluxes (biosynthetic requirement (µ), nutrient uptake (−r_s), and product secretion rates (r_p)), the remaining fluxes can be calculated. Although this methodology works well with linear reaction sequences, it often fails in intermediary metabolism. Limited data and stoichiometric constraints often lead to an underdetermined system that does not allow the flux distribution to be resolved uniquely. One approach to overcome this limitation is to combine metabolite balancing with feeding labeled tracers (stable isotopes) to the cells and measuring the distribution of labeling in the different intracellular metabolites. Several experimental techniques for analysis of the enrichment pattern in intracellular metabolites have been developed (for an excellent review, see Ref. 31). All these techniques are currently based on nuclear magnetic resonance (NMR)32 or gas chromatography-mass spectrometry (GC-MS).33 Due to the low intracellular concentration of central metabolites, it is impractical to use these compounds for the analysis of labeling patterns. However, since central metabolites are converted to amino acids, this labeling information is stored in the respective proteins through conserved biosynthetic pathways. The proteins can then be hydrolyzed to release the labeled proteinogenic amino acids, which can be further analyzed using NMR or GC-MS. A consequence of using proteinogenic amino acids for analysis is that steady state cultivation is required for flux quantification through the 13C tracer approach. However, 13C-labeling methods can be applied in batch cultivation for quantitative assessment of the flux distribution if samples are taken in the exponential growth phase after several doublings of the biomass concentration.
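The metabolite-balancing step described earlier in this section (F = J − K degrees of freedom, with the remaining fluxes computed from F measured ones) can be sketched as follows. The five-reaction network, the measured values, and the helper `solve_linear` are hypothetical and chosen only so that the system is exactly determined; real networks are usually underdetermined, as noted above.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination.

    Raises StopIteration if the matrix is singular.
    """
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


# Hypothetical toy network (not from the chapter):
#   v1: -> A      v2: A -> B     v3: A -> C
#   v4: B + C -> biomass         v5: C -> secreted product
S = [
    [1, -1, -1,  0,  0],  # balance around A
    [0,  1,  0, -1,  0],  # balance around B
    [0,  0,  1, -1, -1],  # balance around C
]
K, J = len(S), len(S[0])
F = J - K  # degrees of freedom

measured = {0: 10, 4: 2}  # measured uptake v1 and secretion v5
unknown = [j for j in range(J) if j not in measured]

# Steady state S v = 0, rearranged as S_unknown v_unknown = -S_measured v_measured
A = [[S[i][j] for j in unknown] for i in range(K)]
b = [-sum(S[i][j] * v for j, v in measured.items()) for i in range(K)]
solution = dict(zip(unknown, solve_linear(A, b)))
print(F, {j: float(v) for j, v in solution.items()})
```

Exact fractions are used here only to keep the arithmetic transparent; with measurement noise, a least-squares formulation would be the more natural choice.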

Once NMR or MS spectra are recorded, the next step is quantitative interpretation of the isotopomer data by employing mathematical models that describe the relationship between fluxes and the observed isotopomer abundances. Similar to metabolite balancing, balances can be set up around all isotopomers of a particular metabolite. Schmidt et al.34 described an elegant method for automatically generating the complete set of isotopomer balances using a matrix-based method. Other approaches include cumulative isotopomers (cumomers),35 bondomers,36 and sum fractional labels.37 Such comprehensive accounting of all available physiological and isotopomer data from single experiments retrieves the maximum information through data integration. Although the mathematical framework for metabolic flux analysis (MFA) has emerged as a tool of great significance, an important limitation is the large search space over which the flux distribution must be optimized, which is computationally expensive. Moreover, this cost becomes limiting when multiple isotopic tracers are used to label the system, and it often reduces the ability of MFA to fully utilize the power of multiple tracers in elucidating the physiology of the organism. Recently, Antoniewicz and coworkers38 proposed a mathematical framework based on the elementary metabolite unit (EMU). It relies on a highly efficient decomposition method that identifies the minimum amount of information needed to simulate isotopic labeling within a reaction network, using knowledge of the atomic transitions occurring in the network reactions. This reduced the number of variables from two million isotopomers to 354 EMUs in the gluconeogenesis pathway with 2H, 13C, and 18O.
Apart from this, new flux estimation tools are emerging that use information from direct detection of 13C patterns in pathway intermediates rather than in proteinogenic amino acids or accumulated extracellular metabolites.39 This approach has been demonstrated for a few selected metabolites, and the method is not yet suitable for more global analysis.
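The combinatorial growth that motivates frameworks such as EMU can be illustrated with a short count. The sketch below only counts labeling patterns for single metabolites under the assumption that every tracked atom is either labeled or unlabeled; it does not reproduce the network-level figures quoted above, and the function names are hypothetical.

```python
def positional_isotopomers(n_atoms):
    """Each tracked atom is labeled or unlabeled: 2**n labeling patterns."""
    return 2 ** n_atoms


def mass_isotopomers(n_atoms):
    """Patterns distinguishable by mass alone: 0 to n labeled atoms."""
    return n_atoms + 1


# Carbon-only tracking for two central metabolites.
for name, n_carbon in [("pyruvate", 3), ("glucose", 6)]:
    print(name, positional_isotopomers(n_carbon), mass_isotopomers(n_carbon))

# Tracking all C, H, and O atoms of glucose (C6H12O6) at once: 24 atoms.
print(positional_isotopomers(6 + 12 + 6))
```

Adding tracer elements multiplies the number of state variables per metabolite, which is why a decomposition that simulates only the needed subsets of atoms pays off most when several tracers are combined.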

17.3.2 In Silico Prediction of Fluxes

Available experimental methods for intracellular flux measurements are often limited to only a part of the whole metabolism. This limitation is problematic when studying systems at the global level and in cases where the fluxes of interest lie outside the scope of experimental determination. In these situations, computational methods for predicting fluxes are desirable. More importantly, theoretical flux prediction tools allow fluxes to be predicted in order to design mutants in silico. Because the flux solution space is overwhelmingly large, even under steady state assumptions, it is not computationally feasible to enumerate all possible flux solutions under a given condition. One way to overcome this problem is to simulate fluxes by optimizing a functional property of the network. Such an optimization function can be viewed as a biological objective of cellular metabolism. For bacteria and simple eukaryotes such as Saccharomyces cerevisiae, it has been demonstrated that this objective function can be chosen to be the formation of biomass building blocks and/or maximization of energy production. This objective function can be simply formulated as a linear combination of fluxes in the metabolic network. Under steady state assumptions this results in a linear optimization problem, often referred to as flux balance analysis (FBA). Thus, given a metabolic network (in the form of a stoichiometric matrix) and experimentally measured or hypothesized constraints on the uptake of substrates, FBA yields a metabolic flux distribution that maximizes, e.g., biomass formation. FBA with biomass formation (or growth rate when the substrate uptake rate is fixed) as the objective function has been shown to successfully predict the essentiality of genes in single gene deletion mutants of E. coli40 and S. cerevisiae.41 Moreover, several nonoptimally growing E. coli single gene deletion mutants were observed to evolve toward the FBA-predicted optimal solution.40 Biomass formation thus seems to be a useful objective function for predicting intracellular fluxes in microbial systems with FBA, although notable exceptions exist.42 Another limitation of the FBA approach is the nonuniqueness of the flux solution obtained under many physiological conditions. Additional constraints become necessary to resolve ambiguities, and such constraints can be obtained, e.g., from experimental measurements of some of the fluxes.
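The linear optimization described in this section can be illustrated on a toy network. The sketch below is an illustration under stated assumptions, not a production solver: the five-reaction network, the bounds, and the function names are hypothetical, and the linear program is solved by enumerating basic solutions (vertices) in pure Python, which is only practical for tiny problems with finite bounds and a stoichiometric matrix of full row rank. Real FBA studies would use a dedicated linear programming solver.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_linear(A, b):
    """Solve square A x = b by Gauss-Jordan elimination; None if singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]


def fba(S, lb, ub, objective):
    """Maximize objective . v subject to S v = 0 and lb <= v <= ub."""
    m, n = len(S), len(S[0])
    best_v, best_value = None, None
    # A vertex fixes n - m fluxes at a bound; the balances give the rest.
    for fixed in combinations(range(n), n - m):
        free = [j for j in range(n) if j not in fixed]
        A = [[S[i][j] for j in free] for i in range(m)]
        for at_upper in product((False, True), repeat=len(fixed)):
            v = [Fraction(0)] * n
            for j, up in zip(fixed, at_upper):
                v[j] = Fraction(ub[j] if up else lb[j])
            b = [-sum(S[i][j] * v[j] for j in fixed) for i in range(m)]
            x = solve_linear(A, b)
            if x is None:
                continue
            for j, xj in zip(free, x):
                v[j] = xj
            if all(lb[j] <= v[j] <= ub[j] for j in range(n)):
                value = sum(c * vj for c, vj in zip(objective, v))
                if best_value is None or value > best_value:
                    best_v, best_value = v, value
    return best_v, best_value


# Hypothetical network: v1: -> A, v2: A -> B, v3: A -> C,
# v4: B + C -> biomass (objective), v5: C -> secreted product.
S = [
    [1, -1, -1,  0,  0],  # A
    [0,  1,  0, -1,  0],  # B
    [0,  0,  1, -1, -1],  # C
]
lb = [0, 0, 0, 0, 0]
ub = [10, 1000, 1000, 1000, 1000]  # substrate uptake limited to 10
fluxes, growth = fba(S, lb, ub, objective=[0, 0, 0, 1, 0])
print([float(v) for v in fluxes], float(growth))
```

In this toy case the optimum is unique; in larger networks several flux distributions often give the same objective value, which is the nonuniqueness issue mentioned above.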

The FBA approach basically assumes optimal operation of the metabolic network. This assumption is justified on the grounds that cells have a long evolutionary history of maximizing their growth. Consequently, assumptions of optimality may easily become invalid for mutants, which have not undergone such selection. In an alternative approach to FBA, Segre and coworkers43 proposed that the flux distribution in mutant strains is at minimal distance from the flux map of the reference metabolic network (wild type). The metabolic objective for mutant strains can thus be formulated as minimization of metabolic (flux) adjustment (MOMA). The MOMA approach usually predicts changes in a large number of fluxes, which may represent a high adaptation cost for the perturbed cell. Shlomi and others44 therefore proposed a computational method termed regulatory on/off minimization (ROOM), in which the number of flux changes in a perturbed strain is minimized. Some evidence, although not sufficient, suggests that a genetic perturbation initially leads to a flux distribution predicted by MOMA and then eventually converges to a solution predicted by FBA or ROOM.44 All three strategies (FBA, MOMA, and ROOM) only partially consider thermodynamic constraints, in the form of directionality of fluxes. In a more explicit way, Beard et al.45 impose additional thermodynamic constraints on the system to improve the FBA solution.
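The difference between the MOMA and ROOM objectives can be shown by scoring two candidate mutant flux distributions against a wild-type reference. This sketch only evaluates the two objectives and does not solve either optimization problem: the flux vectors are made-up numbers rather than solutions of a real network, the function names are hypothetical, and the fixed tolerance used to decide whether a flux has changed is a simplifying assumption.

```python
def moma_objective(wild_type, mutant):
    """Squared Euclidean distance between two flux distributions."""
    return sum((m - w) ** 2 for w, m in zip(wild_type, mutant))


def room_objective(wild_type, mutant, tolerance=0.5):
    """Number of fluxes that change by more than a tolerance."""
    return sum(1 for w, m in zip(wild_type, mutant) if abs(m - w) > tolerance)


# Hypothetical flux vectors; reaction 2 (index 1) is knocked out.
wild_type   = [10, 6, 4, 6, 4, 0]
candidate_a = [9, 0, 3, 5, 3, 5]    # many small adjustments
candidate_b = [10, 0, 4, 6, 4, 6]   # few large adjustments

for name, cand in [("a", candidate_a), ("b", candidate_b)]:
    print(name, moma_objective(wild_type, cand), room_objective(wild_type, cand))
```

Candidate a is closer to the wild type in Euclidean distance and would be preferred under MOMA, whereas candidate b changes fewer fluxes and would be preferred under ROOM.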

17.3.3 The Fluxome in Metabolic Engineering: Applications

Genome-scale stoichiometric models represent the integrated metabolic potential of a microorganism by defining flux-balance constraints that characterize all feasible metabolic phenotypes under steady state conditions. Combinatorial complexity prevents calculation of all feasible metabolic phenotypes that a microbial genotype can assume under given environmental conditions. One approach to determining the metabolic phenotype (i.e., the fluxes through all metabolic reactions) is to use FBA/MOMA/ROOM, desirably in combination with experimentally measured fluxes. All these methods provide a basis for using genome-scale metabolic models to predict possible metabolic phenotypes, and hence for in silico metabolic engineering. The algorithm developed by Maranas et al.46 (named OptKnock) represents one of the first rational modeling frameworks for suggesting gene knockouts that lead to overproduction of a desired metabolite. OptKnock searches for a set of gene (reaction) deletions that maximizes the flux toward a desired product, while the internal flux distribution still operates such that growth (or another biological objective) is optimized. Thus, the identified gene deletions force the microorganism to produce the desired product in order to achieve maximal growth. Indeed, the design philosophy underlying the OptKnock approach takes advantage of inherent properties of microbial metabolism to drive the optimization of the desired metabolic phenotype. The relation of OptKnock to the biological objectives of microorganisms makes it an attractive and promising modeling framework for in silico metabolic engineering.
The same modeling framework can be extended to determine the optimal set of new genes to be added to a given host for production of new compounds or for optimizing the production of native molecules of interest.47,48 OptKnock is implemented by formulating a bi-level linear optimization problem using mixed integer linear programming (MILP), which is guaranteed to find the global optimal solution. The applicability of the OptKnock approach can be extended by formulating the in silico design problem using a genetic algorithm (GA), hereafter referred to as OptGene.49 The direct relation of GAs to biological evolution makes them a natural method of choice for identifying suitable genetic modifications for an improved metabolic phenotype. There are two major advantages of the OptGene formulation. First, OptGene demands relatively little computational time and thus makes it possible to solve more complex problems. This is of particular importance because the relation between the size of the problem (as defined by the number of enzymes and the number of deletions desired) and the corresponding search space (combinations of enzymes to be deleted) is combinatorial. Second, the OptGene formulation allows the optimization of nonlinear objective functions, which is of considerable value in several problems of commercial interest. One example of an important nonlinear engineering objective function is productivity (amount of product formed per unit time).
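The combinatorial growth of the knockout search space mentioned above is easy to quantify. The sketch below simply counts the distinct deletion sets for a model with a hypothetical 1000 candidate reactions; the function name and the model size are illustrative assumptions.

```python
import math


def knockout_search_space(n_reactions, n_deletions):
    """Number of distinct sets of n_deletions reactions to delete."""
    return math.comb(n_reactions, n_deletions)


for k in (1, 2, 3, 5):
    print(k, knockout_search_space(1000, k))
```

Each candidate set requires at least one inner optimization (for example, an FBA run) to evaluate, which is why exhaustive evaluation quickly becomes impractical and heuristic searches such as genetic algorithms become attractive.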

17.3.4 Kinetic Models for Flux Simulations

Steady state models of metabolism show good promise for predicting and exploiting the flux phenotype of cells for metabolic engineering. The assumption of steady state, however, is not valid under several conditions of practical importance, e.g., batch and fed-batch cultivations. Furthermore, a solution predicted by a steady state model may not be realizable in light of the kinetic characteristics of the system and the given initial state of the metabolic network. Although a full kinetic model of the system is desirable, present-day experimental techniques are far from able to deduce all the necessary in vivo kinetic parameters and the accurate metabolic state (e.g., concentrations of all metabolites). Nevertheless, several metabolic engineering strategies based on kinetic modeling of metabolism are being proposed.50–53 These modeling frameworks are, in general, limited to small-scale metabolic models, which may still be practically relevant.

17.4 Conclusions and Future Perspective

Understanding the “genome to fluxome” relationship is key to the rational design of microbial cells through metabolic engineering. Unraveling such a relationship (even partially), however, is not easy because of the highly nonlinear and complex nature of cellular organization and operations. This challenging task is to some extent being attempted (and further extended) through (i) simplifying assumptions such as steady state and (ii) deducing general principles of metabolic regulation through hypothesis-driven methods (e.g., FBA, MOMA, and reporter metabolites). Although these methods are successful in expanding our knowledge and our capabilities for developing new rational tools for metabolic engineering, only a small fraction of cellular complexity and nonlinearity is accounted for by current methods. Thus, new tools need to be developed that will allow us to generate quantitative metabolomic and fluxomic data spanning the different species and environmental conditions of interest. Novel model-based and hypothesis-driven computational tools will be necessary to uncover and exploit patterns emerging from these datasets. Such algorithmic tools are bottlenecks even with the datasets available today, such as genome, transcriptome, and (to a limited extent) fluxome and metabolome information. Tools are also needed for using genome-scale metabolic models in combination with experimental flux measurements to obtain global flux maps.

References

1. Crick, F. Central dogma of molecular biology. Nature, 1970, 227 (5258), 561–563.
2. Nielsen, J. and Oliver, S. The next wave in metabolome analysis. Trends Biotechnol., 2005, 23 (11), 544–546.
3. Peregrin-Alvarez, J. M., Tsoka, S., and Ouzounis, C. A. The phylogenetic extent of metabolic enzymes and pathways. Genome Res., 2003, 13 (3), 422–427.
4. Patil, K. R. and Nielsen, J. Uncovering transcriptional regulation of metabolism by using metabolic network topology. PNAS, 2005, 102 (8), 2685–2689.
5. Huynen, M. A., Dandekar, T., and Bork, P. Variation and evolution of the citric-acid cycle: a genomic perspective. Trends Microbiol., 1999, 7 (7), 281–291.
6. Stryer, L. Biochemistry, 4th ed. W.H. Freeman & Company, New York, 2005.
7. Woese, C. The universal ancestor. PNAS, 1998, 95 (12), 6854–6859.
8. Ratcliffe, R. G. and Shachar-Hill, Y. Measuring multiple fluxes through plant metabolic networks. Plant J., 2006, 45 (4), 490–511.
9. Ideker, T., Ozier, O., Schwikowski, B., and Siegel, A. F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics, 2002, 18 (90001), 233S–240.
10. Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. The KEGG resource for deciphering the genome. Nucleic Acids Res., 2004, 32 Database issue, D277–D280.
11. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N., and Barabasi, A. L. The large-scale organization of metabolic networks. Nature, 2000, 407 (6804), 651–654.
12. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N., and Barabasi, A. L. Hierarchical organization of modularity in metabolic networks. Science, 2002, 297 (5586), 1551–1555.
13. Fell, D. A. and Wagner, A. The small world of metabolism. Nat. Biotechnol., 2000, 18 (11), 1121–1122.
14. Malygin, A. G. Structure-chemical approach to organization of information on metabolic charts. Biochemistry (Mosc.), 2004, 69 (12), 1379–1385.
15. Gagneur, J., Jackson, D. B., and Casari, G. Hierarchical analysis of dependency in metabolic networks. Bioinformatics, 2003, 19 (8), 1027–1034.
16. Ma, H. W. and Zeng, A. P. The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics, 2003, 19 (11), 1423–1430.
17. Csete, M. and Doyle, J. Bow ties, metabolism and disease. Trends Biotechnol., 2004, 22 (9), 446–450.
18. Roca, C., Nielsen, J., and Olsson, L. Metabolic engineering of ammonium assimilation in xylose-fermenting Saccharomyces cerevisiae improves ethanol production. Appl. Environ. Microbiol., 2003, 69 (8), 4732–4736.
19. Verho, R., Londesborough, J., Penttila, M., and Richard, P. Engineering redox cofactor regeneration for improved pentose fermentation in Saccharomyces cerevisiae. Appl. Environ. Microbiol., 2003, 69 (10), 5892–5897.
20. Rahman, S. A. and Schomburg, D. Observing local and global properties of metabolic pathways: ‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics, 2006, 22 (14), 1767–1774.
21. Mahadevan, R. and Palsson, B. O. Properties of metabolic networks: structure versus function. Biophys. J., 2005, 88 (1), L7–L9.
22. Arita, M. The metabolic world of Escherichia coli is not small. PNAS, 2004, 101 (6), 1543–1547.
23. Burgard, A. P., Nikolaev, E. V., Schilling, C. H., and Maranas, C. D. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res., 2004, 14 (2), 301–312.
24. Almaas, E., Kovacs, B., Vicsek, T., Oltvai, Z. N., and Barabasi, A. L. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature, 2004, 427 (6977), 839–843.
25. Cakir, T., Patil, K. R., Onsan, Z. I., Ulgen, K. O., Kirdar, B., and Nielsen, J. Integration of metabolome data with metabolic networks reveals reporter reactions. Mol. Syst. Biol., 2006, 2, 50.
26. Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E. D. Metabolic network structure determines key aspects of functionality and regulation. Nature, 2002, 420 (6912), 190–193.
27. ter Kuile, B. H. and Westerhoff, H. V. Transcriptome meets metabolome: hierarchical and metabolic regulation of the glycolytic pathway. FEBS Lett., 2001, 500 (3), 169–171.
28. Nielsen, J. It is all about metabolic fluxes. J. Bacteriol., 2003, 185 (24), 7031–7035.
29. Varma, A. and Palsson, B. O. Metabolic flux balancing: basic concepts, scientific and practical use. Bio-Technology, 1994, 12 (10), 994–998.
30. Sauer, U. Metabolic networks in motion: C-13-based flux analysis. Mol. Syst. Biol., 2006, 2, 62.
31. Szyperski, T. C-13-NMR, MS and metabolic flux balancing in biotechnology research. Quart. Rev. Biophys., 1998, 31 (1), 41–106.
32. Dauner, M. and Sauer, U. GC-MS analysis of amino acids rapidly provides rich information for isotopomer balancing. Biotechnol. Prog., 2000, 16 (4), 642–649.
33. Schmidt, K., Carlsen, M., Nielsen, J., and Villadsen, J. Modeling isotopomer distributions in biochemical networks using isotopomer mapping matrices. Biotechnol. Bioeng., 1997, 55 (6), 831–840.
34. Wiechert, W., Mollney, M., Isermann, N., Wurzel, W., and de Graaf, A. A. Bidirectional reaction steps in metabolic networks: III. Explicit solution and analysis of isotopomer labeling systems. Biotechnol. Bioeng., 1999, 66 (2), 69–85.
35. van Winden, W. A., Heijnen, J. J., and Verheijen, P. J. T. Cumulative bondomers: a new concept in flux analysis from 2D [C-13,H-1] COSY NMR data. Biotechnol. Bioeng., 2002, 80 (7), 731–745.
36. Christensen, B., Gombert, A. K., and Nielsen, J. Analysis of flux estimates based on C-13-labelling experiments. Eur. J. Biochem., 2002, 269 (11), 2795–2800.
37. Antoniewicz, M. R., Kelleher, J. K., and Stephanopoulos, G. Elementary metabolite units (EMU): a novel framework for modeling isotopic distributions. Metabol. Eng., 2007, 9 (1), 68–86.
38. van Winden, W. A., van Dam, J. C., Ras, C., Kleijn, R. J., Vinke, J. L., van Gulik, W. M., and Heijnen, J. J. Metabolic-flux analysis of Saccharomyces cerevisiae CEN.PK113-7D based on mass isotopomer measurements of C-13-labeled primary metabolites. FEMS Yeast Res., 2005, 5 (6–7), 559–568.
39. Ibarra, R. U., Edwards, J. S., and Palsson, B. O. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature, 2002, 420 (6912), 186–189.
40. Forster, J., Famili, I., Palsson, B. O., and Nielsen, J. Large-scale evaluation of in silico gene deletions in Saccharomyces cerevisiae. OMICS: A J. Integrat. Biol., 2003, 7 (2), 193–202.
41. Fischer, E. and Sauer, U. Large-scale in vivo flux analysis shows rigidity and suboptimal performance of Bacillus subtilis metabolism. Nat. Genet., 2005, 37 (6), 636–640.
42. Segre, D., Vitkup, D., and Church, G. M. Analysis of optimality in natural and perturbed metabolic networks. PNAS, 2002, 99 (23), 15112–15117.
43. Shlomi, T., Berkman, O., and Ruppin, E. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. PNAS, 2005, 102 (21), 7695–7700.
44. Beard, D. A., Liang, S. C., and Qian, H. Energy balance for analysis of complex metabolic networks. Biophys. J., 2002, 83 (1), 79–86.
45. Burgard, A. P., Pharkya, P., and Maranas, C. D. OptKnock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 2003, 84 (6), 647–657.
46. Pharkya, P., Burgard, A. P., and Maranas, C. D. OptStrain: a computational framework for redesign of microbial production systems. Genome Res., 2004, 14 (11), 2367–2376.
47. Bro, C., Regenberg, B., Forster, J., and Nielsen, J. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metabol. Eng., 2006, 8 (2), 102–111.
48. Patil, K. R., Rocha, I., Forster, J., and Nielsen, J. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics, 2005, 6, 308.
49. Klipp, E., Nordlander, B., Kruger, R., Gennemark, P., and Hohmann, S. Integrative model of the response of yeast to osmotic shock (vol 23, pg 975, 2005). Nat. Biotechnol., 2005, 23 (8), 975–982.
50. Liebermeister, W. and Klipp, E. Bringing metabolic networks to life: integration of kinetic, metabolic, and proteomic data. Theor. Biol. Med. Model., 2006, 3, 42.
51. Steuer, R., Gross, T., Selbig, J., and Blasius, B. Structural kinetic modeling of metabolic networks. PNAS, 2006, 103 (32), 11868–11873.
52. Wang, L. Q. and Hatzimanikatis, V. Metabolic engineering under uncertainty. I: framework development. Metabol. Eng., 2006, 8 (2), 133–141.
53. Oliveira, A. P., Nielsen, J., and Forster, J. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol., 2005, 5, 39.
54. Forster, J., Famili, I., Fu, P., Palsson, B. O., and Nielsen, J. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res., 2003, 13 (2), 244–253.

18 Constraint-Based Genome-Scale Models of Cellular Metabolism

Radhakrishnan Mahadevan
University of Toronto

18.1 Introduction
18.2 Methods for Model Development
Curated Reaction and Metabolite Database • Metabolic Network Reconstruction Methods • Representation of Biomass Reaction • Determination of Maintenance • Integration with Physiology Data for Validation and Refinement
18.3 Methods for Interrogating Metabolic Networks
Flux-Based Methods • Regulatory and Dynamic Extensions • Thermodynamic and Metabolic Extensions • Optimization of Metabolic Networks
18.4 Software and Databases for Genome-Scale Modeling
18.5 Survey of Genome-Scale Metabolic Models
18.6 Conclusions
References

18.1 Introduction

The explosion in the database of microbial genome sequences has motivated intense efforts in the functional characterization of these genomes. As metabolism is fairly well conserved across organisms, several techniques for metabolic network reconstruction from the genome sequence using bioinformatics algorithms have been developed. The underlying metabolic network represents the metabolic potential of the organism and hence is valuable for interrogating its metabolic capabilities and their relation to physiology. The stoichiometry of the biochemical reactions associated with metabolism is well established; therefore, the stoichiometric matrix associated with the genome-scale network is a concise representation of the highly interconnected metabolic network and is amenable to systematic computational analysis. Unlike other biological networks such as protein interaction networks, the links in the metabolic network are mainly chemical reactions and consequently are time-invariant from the standpoint of connectivity. Hence, these reconstructions, once validated, represent unchanging snapshots of metabolic potential that are augmented, and can only grow in size, as new functional assignments for genes are made. The main advantage of such large-scale descriptions of metabolism is the molecular detail represented in genome-scale metabolic reconstructions. Such molecular detail enables the representation and analysis of genomic events, such as gene expression data analysis, large-scale gene deletions, large-scale growth physiology, and large-scale pathway analysis for metabolic engineering. However, the description of a genome-scale network presents several computational challenges regarding the scalability of algorithms. These challenges are partly reflected in the development of methods that are primarily based on linear or quadratic optimization and mostly deal with representing the flux distribution in the metabolic network. The first genome-scale model was developed in 1999 (H. influenzae, Edwards and Palsson, 1999), and at that time it was derived primarily from the genome sequence and available literature data. Subsequent genome-scale models, however, began to incorporate other forms of high-throughput data and physiology data, including gene and proteome expression sets. Further, the metabolite and reaction databases also became more sophisticated as additional details such as the charge and molecular formula were included and all reactions were charge and elementally balanced. Cheminformatic algorithms have been used to determine the charge from analysis of the acid dissociation constants (pKa). This critical development enhanced the quality of the reaction network and enabled the tracking of protons generated as part of metabolism, which could impact model predictions. An additional development was the consideration of the thermodynamics of chemical reactions and the derivation of constraints from such data. Finally, such models have been integrated with metabolomic and thermodynamic data to identify feasible ranges of metabolite concentrations. Another effort toward enhancing predictive capabilities is the incorporation of nonmetabolic phenomena such as transcriptional regulation and translation. However, such efforts have been attempted only for well-studied organisms such as Escherichia coli and Saccharomyces cerevisiae, as information on regulatory mechanisms is not yet widely available.
In this chapter, a brief introduction to the development and utilization of constraint-based genome-scale modeling of cellular metabolism is presented. The methods for metabolic network reconstruction, the subsequent computational analysis of the reconstructed network, and the software for genome-scale modeling are presented in the first section of this chapter. In the second part, the applications of these models and a brief summary of the status of constraint-based models of metabolism in different organisms are reviewed. Finally, the state of the art in the development of such models and future directions are outlined. Although the scope of this chapter is limited to developments in constraint-based modeling, it is important to note that there are other metabolic modeling approaches that have been used for pathway-scale metabolic networks and are also rapidly evolving in sophistication (Savageau, 1969; Varner, 2000).

18.2 Methods for Model Development

Development and validation of genome-scale models require several levels of biological information, including the genome sequence, physiology, gene essentiality, and literature on biochemical and genetic data. The first step in the construction of a constraint-based model (CBM) (Figure 18.1) is the reconstruction of a highly curated metabolic network, which forms the basis for the numerical computations. This is followed by the determination of biomass composition and the representation of biomass synthesis. Finally, additional physiological data are required for the identification of maintenance requirements to enable quantitative predictions of growth and by-product secretion patterns. It must be noted that modifications to the network can be identified even at steps 2 and 3 if reactions required for the synthesis of key biomass components are missing. Hence, although model development is divided into separate steps in Figure 18.1, the discovery of additional network components and links is a continuous process that necessitates an iterative approach to model development. In the next section, a brief summary of each of the steps in the model development process is provided.

Figure 18.1 Three critical steps in the development of genome-scale metabolic models from biological data. [Schematic: genome sequence, omics data, biochemical databases, literature, and physiology data feed metabolic network reconstruction, determination of biomass composition, and determination of maintenance energy; the resulting genome-scale models are validated against experimental data. The biomass panel lists an example composition in % w/w of biomass: protein 46%, RNA 10%, DNA 4%, carbohydrate 15%, lipid 15%, other 10%. The maintenance panel plots substrate uptake rate against growth rate.]

18.2.1 Curated Reaction and Metabolite Database

An important component in developing genome-scale models is a well-curated database of metabolites and reactions. Often, databases of metabolic reactions contain compounds with inconsistent formulas and structures, or elementally unbalanced reactions, that have to be reconciled before modeling. This is especially critical because one of the features of constraint-based models is the ability to track the flow of elements across the pathways based on the reaction stoichiometry. As an example, proton balancing of all of the chemical reactions was valuable in identifying the physiological basis for the difference in yield during growth of G. sulfurreducens with different electron acceptors, and in predicting the pH changes in the extracellular medium during growth of E. coli on substrates with varying degrees of reduction (Reed et al., 2003; Mahadevan et al., 2006). Hence, it is important to represent the correct charged formula for all the metabolites in the database and to ensure that all the reactions are both elementally and charge balanced before incorporating them into the metabolic model.
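The balance check described above can be illustrated with a small script. This is only a minimal sketch: the metabolite identifiers, the `METABOLITES` table, and the `imbalance` helper are hypothetical names introduced for illustration, and the charged formulas are the commonly used forms near neutral pH rather than entries from any particular database.

```python
from collections import Counter

# Hypothetical metabolite table: element counts and net charge of the charged formula.
METABOLITES = {
    "glc": ({"C": 6, "H": 12, "O": 6}, 0),                      # glucose
    "atp": ({"C": 10, "H": 12, "N": 5, "O": 13, "P": 3}, -4),   # ATP(4-)
    "adp": ({"C": 10, "H": 12, "N": 5, "O": 10, "P": 2}, -3),   # ADP(3-)
    "g6p": ({"C": 6, "H": 11, "O": 9, "P": 1}, -2),             # glucose 6-phosphate(2-)
    "h":   ({"H": 1}, 1),                                       # proton
}


def imbalance(stoichiometry):
    """Return net element and charge totals (products minus reactants).

    `stoichiometry` maps metabolite id -> coefficient (negative for reactants).
    An empty result means the reaction is elementally and charge balanced.
    """
    totals = Counter()
    for met, coeff in stoichiometry.items():
        formula, charge = METABOLITES[met]
        for element, count in formula.items():
            totals[element] += coeff * count
        totals["charge"] += coeff * charge
    return {key: value for key, value in totals.items() if value != 0}


# Hexokinase written with and without the proton it releases.
hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
hexokinase_no_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(hexokinase))            # {}
print(imbalance(hexokinase_no_proton))  # {'H': -1, 'charge': -1}
```

Leaving out the proton produces both a hydrogen and a charge imbalance, which is the kind of inconsistency that would otherwise distort predictions of proton exchange with the medium.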

18.2.2 Metabolic Network Reconstruction Methods

Network reconstruction is primarily performed using tools from bioinformatics and typically involves a sequence-based metabolic network identification step followed by pathway analysis to close network gaps. The steps involved in metabolic network reconstruction have been extensively reviewed before (see reviews by Covert et al., 2001; Francke et al., 2005) and are only summarized here. A notable difference between metabolic networks reconstructed in enzyme databases and those needed for the development of a metabolic model is the requirement of a complete network that can synthesize the components of the biomass from basic substrates typically found in minimal media (assuming the organism can grow in such media). However, as several of the reconstructed networks in databases are obtained automatically, it is critical to manually curate or inspect these automatically generated networks to ensure that any inconsistency in the network is eliminated. Increasingly, automatic algorithms for pathway gap filling and model development are also being developed to further facilitate the process of genome-scale modeling (Karp et al., 2002; Segre et al., 2003; Notebaart et al., 2006; Herrgard et al., 2006a).

The first step in the process of metabolic network reconstruction is the identification of genes with a defined metabolic function in the annotation resulting from a completed genome. These genes are then verified by examining their homologs in other well-characterized organisms and are typically assigned confidence levels according to the degree of sequence similarity. In addition to this step, all of the genes are evaluated through sequence comparison and phylogenetic analysis with up-to-date enzyme databases such as KEGG and BRENDA, and with manually curated databases such as SWISSPROT (Kanehisa et al., 2004; Schomburg et al., 2004; Wu et al., 2006). In cases where only a draft genome is available, metabolic network reconstruction tools such as metaSHARK (Pinney et al., 2005), in combination with manual curation, can be utilized to identify the metabolic network.

The next step in the network reconstruction involves the completion of metabolic pathways by identifying network gaps and analyzing them further. These network gaps typically correspond to metabolites that can only be consumed or only be produced in the network. The missing reactions associated with such metabolites that can close the gaps are identified from reaction databases.
The next step is the identification of genes encoding enzymes that catalyze the missing reactions in other organisms, followed by sequence comparison of these genes with the genome of the modeled organism. Such analysis can lead to the assignment of novel metabolic functions based on comparison of the protein sequence and domains.

Another critical component of the model development process is the representation of the proton translocation stoichiometry of the proteins in the electron transport chain. The proton translocation across the inner membrane for Gram-negative bacteria and the cell membrane for Gram-positive bacteria is directly correlated with ATP synthesis. Hence, variations in the translocation stoichiometry can significantly affect the maximum energy, in terms of ATP, that can be generated from a mole of a substrate such as glucose. An important factor to consider in the determination of the translocation stoichiometry is the total energy that can be generated given the substrate (electron donors such as reduced sugars), the oxidant (electron acceptors such as oxygen), and the efficiency of the cellular machinery. As an example, the efficiency of the electron transport chain in human mitochondria is 57% of the theoretical maximum (1.25 mol ATP/mol electron with oxygen as the acceptor, given a theoretical thermodynamic maximum of 2.2 mol ATP/mol electron) (Kroger et al., 2002). Such thermodynamic constraints are required to ensure that the rate of ATP generation is physiologically realistic. In addition, the standard Gibbs free energy change associated with reactions can also be used to determine whether a reaction is reversible or irreversible, so that the appropriate constraints on the reaction direction can be imposed.

Another critical factor that has to be determined for many of the reactions, and especially for the redox reactions, is the choice of the cofactor (NADH/NADPH) that acts as the donor/acceptor. In some cases, the cofactor specificity can be determined from sequence and phylogenetic analysis (Zhu et al., 2005) and from the biochemical literature (Lehninger et al., 1993).
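The gap identification step described in this section (metabolites that can only be produced or only be consumed) lends itself to a simple screen. The following is a minimal sketch on a hypothetical five-reaction network; the `REACTIONS` table and the `find_gaps` helper are illustrative names, and a real reconstruction would also need to treat exchange reactions and compartments explicitly.

```python
# Hypothetical toy network: reaction id -> (stoichiometry, reversible?).
# Negative coefficients are consumed, positive coefficients are produced.
REACTIONS = {
    "EX_A": ({"A": 1}, False),           # uptake of A from the medium
    "R1":   ({"A": -1, "B": 1}, False),  # A -> B
    "R2":   ({"B": -1, "C": 1}, True),   # B <-> C
    "R3":   ({"B": -1, "D": 1}, False),  # B -> D, but nothing consumes D
    "R4":   ({"E": -1, "C": 1}, False),  # E -> C, but nothing produces E
    "EX_C": ({"C": -1}, False),          # secretion of C
}


def find_gaps(reactions):
    """Flag metabolites that the network can only produce or only consume."""
    producible, consumable = set(), set()
    for stoichiometry, reversible in reactions.values():
        for met, coeff in stoichiometry.items():
            # A reversible reaction can act as both producer and consumer.
            if coeff > 0 or reversible:
                producible.add(met)
            if coeff < 0 or reversible:
                consumable.add(met)
    gaps = {}
    for met in sorted(producible | consumable):
        if met not in consumable:
            gaps[met] = "produced only"
        elif met not in producible:
            gaps[met] = "consumed only"
    return gaps


gaps = find_gaps(REACTIONS)
print(gaps)  # {'D': 'produced only', 'E': 'consumed only'}
```

Each flagged metabolite is then a starting point for searching reaction databases and homologous genes, as outlined above. This screen is purely topological; it does not detect gaps that only appear once steady-state flux constraints are applied.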

18.2.3 Representation of Biomass Reaction

The second step in the development of a genome-scale metabolic model is the representation of the biomass synthesis reactions in the model. The synthesis of one gram of cells requires over 30 metabolites, which include structural components such as the cell wall and proteins, energy metabolites such as ATP, and storage polymers such as glycogen. Experimental protocols for the determination of the composition of macromolecules such as proteins, nucleic acids, carbohydrates, lipids, and other ions are established, and the biomass compositions of several well-studied organisms are available. The ATP requirements for the synthesis of the macromolecular components are also included in the biomass reaction.

The distribution of the metabolites that make up the macromolecules (e.g., the amino acid composition of the proteins) has to be determined to rigorously define the biomass synthesis reaction. In the absence of such data, in specific cases such as the amino acid composition, the distribution can be inferred from the sequence based on the assumption that all of the proteins are expressed. Another alternative employed in the development of genome-scale models is the assumption that the amino acid composition is similar to that of E. coli, for which experimental data are available.

It is important to note that the biomass reaction derived from experimental measurements is expected to be valid during growth simulations corresponding to the environment from which the data on biomass composition were collected. However, unless there are significant changes to the biomass composition in different environments, small variations are unlikely to result in significant changes in the growth rate. As an example, for the case of Geobacter sulfurreducens, even 10% variations in the macromolecular composition resulted in only a 1.5% change in the growth rate, which appears to be consistent with previous studies on the impact of variations in E. coli biomass composition (Mahadevan et al., 2006). The definition of a comprehensive biomass reaction is critical for the accuracy of gene essentiality predictions. The impact of any deletion that disrupts the synthesis of a metabolite required for biomass can be predicted accurately only if the metabolite is incorporated in the biomass reaction.
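As a rough illustration of inferring the amino acid distribution from sequence, the sketch below pools residue counts over a set of protein sequences. It assumes, beyond the text, that every protein is expressed at the same level. The two short sequences, the helper names, the protein fraction of 0.46 g per g dry weight (the example value shown in Figure 18.1), and the mean residue mass of 110 g/mol are illustrative assumptions, not measured values.

```python
from collections import Counter


def amino_acid_fractions(protein_sequences):
    """Mole fraction of each residue, pooling all sequences with equal weight."""
    counts = Counter()
    for sequence in protein_sequences:
        counts.update(sequence)
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}


def biomass_coefficients(fractions, protein_g_per_gdw, mean_residue_mass):
    """Approximate mmol of each amino acid needed per gram dry weight.

    Uses a single mean residue mass (g/mol) for all residues, which is a
    simplification; a careful calculation would use per-residue masses.
    """
    mmol_residues_per_gdw = 1000.0 * protein_g_per_gdw / mean_residue_mass
    return {aa: frac * mmol_residues_per_gdw for aa, frac in fractions.items()}


# Hypothetical two-protein "proteome" in one-letter amino acid code.
proteome = ["MKKA", "MAGA"]
fractions = amino_acid_fractions(proteome)
coefficients = biomass_coefficients(fractions, protein_g_per_gdw=0.46,
                                    mean_residue_mass=110.0)

print(fractions)  # {'A': 0.375, 'G': 0.125, 'K': 0.25, 'M': 0.25}
```

The resulting coefficients would enter the biomass reaction as the stoichiometric demand for each amino acid; where expression data are available, weighting each sequence by abundance would relax the equal-expression assumption.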

18.2.4 Determination of Maintenance

A key component in the representation of the biomass synthesis reaction is the incorporation of ATP requirements for the maintenance of cellular processes not included in the biomass reaction, such as the energy required for the turnover of amino acid pools, maintenance of the membrane potential, and other cellular events that might be proportional to the growth rate of the cells. Methods for calculating the growth and nongrowth associated maintenance are well established (Pirt, 1965; Neijssel et al., 1996) and essentially require physiological data on the substrate uptake rate at different growth rates. A schematic of this procedure is shown in Figure 18.2, where the substrate uptake rate extrapolated to zero growth rate (the y-intercept) is first used to calculate the nongrowth associated ATP maintenance parameter. Here, the uptake rate is imposed as a constraint, the ATP synthesis rate is maximized, and the resulting objective value is set as the nongrowth associated maintenance. The slope of the predicted uptake rate at different growth rates depends on the growth associated ATP maintenance parameter, which is varied to match the experimental observations.

Figure 18.2 Schematic illustrating the determination of the maintenance parameters. [Plot of substrate uptake rate against growth rate showing the experimental data; the slope reflects the growth associated maintenance and the y-intercept gives the substrate requirement for nongrowth maintenance.]

Although in most cases the experimental relation between substrate uptake rate and growth rate is linear, there can be instances where this relation is piecewise linear, indicating that the energetic efficiency and the mode of growth can differ (e.g., the Crabtree effect observed during chemostat growth of S. cerevisiae on glucose at high growth rates; van Hoek et al., 1998). In such instances, it will be necessary to incorporate additional constraints on the regulatory network to obtain an accurate prediction of metabolism at different growth rates.
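The extrapolation step described above amounts to a straight-line fit of substrate uptake rate against growth rate. The sketch below uses invented chemostat data and a hypothetical ATP yield (`ATP_PER_SUBSTRATE`) as a stand-in for the value that would be obtained by maximizing ATP synthesis in the model at the extrapolated uptake rate; none of these numbers come from the chapter.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical chemostat data (not taken from the chapter).
growth_rate = [0.1, 0.2, 0.3, 0.4]   # 1/h
uptake_rate = [1.5, 2.5, 3.5, 4.5]   # mmol substrate / gDW / h

slope, intercept = fit_line(growth_rate, uptake_rate)

# The intercept is the substrate uptake needed at zero growth. In the procedure
# above, this uptake is imposed on the model and ATP synthesis is maximized;
# here a hypothetical ATP yield stands in for that model calculation.
ATP_PER_SUBSTRATE = 2.0              # mmol ATP / mmol substrate (assumed)
nongrowth_maintenance = ATP_PER_SUBSTRATE * intercept

print(round(slope, 6), round(intercept, 6), round(nongrowth_maintenance, 6))
```

The fitted slope is what the growth associated maintenance parameter is then tuned to reproduce. A piecewise linear data set would need separate fits for each growth regime.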

18.2.5 Integration with Physiology Data for Validation and Refinement

The final step in the model development process is validation with experimental data, after which the model can be used to generate experimental hypotheses about cellular functions. Data on growth and by-product secretion patterns in conditions other than those used to calculate the model parameters are useful for validation. Recently, it has become possible to obtain such data at large scale using high-throughput physiology techniques such as phenotype microarrays (Biolog Inc.; Bochner et al., 2001). Phenotype microarrays essentially evaluate the growth and respiration patterns of cells in over 700 environments with varying substrates. The technique relies on a colorimetric assay based on dye reduction that is then linked to growth. This high-throughput assay of growth in a multitude of environments has been obtained for both E. coli and B. subtilis (Covert et al., 2004) and is useful for identifying missing cellular functions such as transporters.

CBMs capture most of the known metabolic pathways and hence can be used to predict cellular fate in the wake of a genetic perturbation. High-throughput gene essentiality data, including genome-scale transposon mutagenesis and single-gene knockouts, are already available for well-studied organisms such as E. coli, B. subtilis, and P. aeruginosa. The comparison of the model predictions to large-scale gene essentiality data can be used to identify novel pathways (in the case of false negatives), inactive enzymes represented in the model (false positives), and additional regulatory features not captured by the model. As an example, analyzing the gene deletion phenotypes of redundant pathways predicted by the model of Geobacter sulfurreducens identified several cases of inactive enzymes in central metabolism, such as pyruvate dehydrogenase and succinyl-CoA synthetase (Segura et al., 2008).

The reconciliation of the model with high-throughput growth physiology, gene essentiality, and by-product secretion patterns is essential in creating a compact and systematic representation of cellular metabolic capabilities for further computational and experimental interrogation.
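The comparison of predicted and observed knockout phenotypes can be organized as a simple tabulation. The sketch below assumes, consistent with the usage above, that a "positive" is a prediction of growth after deletion, so that a false negative (predicted lethal, observed viable) points to a missing pathway and a false positive (predicted viable, observed lethal) points to an inactive enzyme or missing regulation. The gene names and phenotypes are hypothetical.

```python
def compare_essentiality(predicted_growth, observed_growth):
    """Sort genes into agreement classes; True means growth after deletion."""
    classes = {"true positive": [], "true negative": [],
               "false positive": [], "false negative": []}
    for gene in sorted(predicted_growth):
        predicted, observed = predicted_growth[gene], observed_growth[gene]
        if predicted and observed:
            classes["true positive"].append(gene)
        elif not predicted and not observed:
            classes["true negative"].append(gene)
        elif predicted and not observed:
            # Model uses a route the cell apparently cannot use.
            classes["false positive"].append(gene)
        else:
            # Cell has a route that the model lacks.
            classes["false negative"].append(gene)
    return classes


# Hypothetical knockout phenotypes for four genes.
predicted = {"geneA": True, "geneB": False, "geneC": True, "geneD": False}
observed = {"geneA": True, "geneB": False, "geneC": False, "geneD": True}

result = compare_essentiality(predicted, observed)
print(result["false positive"], result["false negative"])  # ['geneC'] ['geneD']
```

Each mismatch list then becomes a worklist for model refinement: false negatives prompt a search for alternative pathways, and false positives prompt a check on whether the redundant enzyme is actually active.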

18.3 Methods for Interrogating Metabolic Networks

Although the first stoichiometric model of metabolism was constructed in 1990 (Majewski and Domach, 1990), the driver for the development of computational tools for the analysis of metabolic networks was the reconstruction of genome-scale models in the late 1990s. Since then, a vast array of methods has been formulated, and these are extensively summarized in several reviews (Covert et al., 2003; Price et al., 2004a; Reed et al., 2006). The methods for metabolic network analysis (Figure 18.3) can be broadly categorized into four classes: (1) methods based solely on the stoichiometric and reaction directionality constraints, (2) extensions that incorporate additional constraints based on thermodynamics, kinetics, and metabolite concentrations, (3) regulatory and dynamic extensions, and (4) optimization methods for design and analysis. Here, these methods are only briefly recounted, and further information is available in other chapters of this book.

18.3.1 Flux-Based Methods

Most of the methods that relied on stoichiometric and reaction directionality constraints were developed to analyze the feasible solution space determined by the imposed constraints. These methods either relied on the selection of one point in the solution space based on an objective function (thereby biasing the selection toward optimizing the objective) or attempted to characterize the solution space in its entirety without any bias toward a particular solution (unbiased methods). The unbiased methods included the definition of extreme pathways, elementary modes, and random sampling of the solution space, whereas the biased methods included flux balance analysis, flux variability analysis, and deletion analysis, as defined below.

Figure 18.3 Methods for computational analysis of genome-scale metabolic models. [Classification tree: the analysis of genome-scale metabolic networks is divided into flux analysis based on 13C isotope distribution (overdetermined systems) and analysis based on physico-chemical constraints (underdetermined systems). The latter comprises stoichiometric and capacity constraints, with biased methods analyzing a subset of the solution space (FBA, sensitivity analysis, MOMA, ROOM, FVA, FCF, α-spectrum) and unbiased methods for uniform analysis of the entire space (ExPa, ElMo, random sampling, volume analysis); thermodynamic and metabolic extensions (energy balance analysis, flux minimization, network embedded thermodynamic analysis, k-cone analysis); regulatory and dynamic extensions (regulatory flux balance analysis, dynamic flux balance analysis); and optimization methods for metabolic networks (OptKnock, ObjFind, OptStrain, OptReg, OMNI, error reconciliation).]

18.3.1.1 Biased Methods

Flux balance analysis (FBA). FBA has been extensively reviewed over the years and is the classical method for predicting the flux distribution in genome-scale metabolic networks based on linear programming, where an objective function corresponding to a cellular goal is defined. Typically, the growth rate maximization objective is used, based on the hypothesis that cellular metabolism is programmed through evolution for optimal resource utilization and growth. The genome-scale metabolic network is used to derive the stoichiometric constraints based on the assumption that metabolite levels are at steady state during balanced growth (Equation 18.1a). The stoichiometric constraints are augmented with the directionality and enzymatic capacity constraints (Equation 18.1b) and with substrate uptake constraints that correspond to the media composition (Equation 18.1c). Hence, the FBA problem is formulated as follows:

\begin{align}
\max_{v}\quad & \mu = f^{T} v \notag\\
\text{s.t.}\quad & S v = 0 \tag{18.1a}\\
& lb \le v \le ub \tag{18.1b}\\
& v_s = q_s \tag{18.1c}
\end{align}

where $v$ is the vector of fluxes ($v \in \mathbb{R}^{n}$), $f$ is the vector of objective coefficients, $S$ is the $m \times n$ stoichiometric matrix, $m$ is the number of metabolites, $n$ is the number of reactions, $lb$ and $ub$ are the lower and upper bounds on the fluxes, $q_s$ are the experimentally measured uptake rates, and $v_s$ are the fluxes corresponding to the substrate (e.g., glucose) uptake rate. It is important to note that, in the case where the substrate uptake rate is fixed, the solution of the FBA problem results in a flux distribution that maximizes the growth yield. If uptake rates of several substrates and their standard deviations are available, Equation 18.1c can be modified to incorporate experimental error. In some cases, where there are variations in the experimental measurements, data reconciliation methods are required to ensure consistency between the experimental measurements and the stoichiometric constraints (van der Heijden et al., 1994).

Flux variability analysis (FVA). FVA is used to evaluate the degree of flexibility in the metabolic network and is based on a series of optimization problems that identify the extremes of the optimal solution space (Mahadevan and Schilling, 2003). The objective is the maximization and minimization of every flux in the network subject to the constraint that the growth rate is optimal. The FVA problem is formulated as below:

\begin{align}
\max_{v}\ (\text{and } \min_{v})\quad & e_i^{T} v \notag\\
\text{s.t.}\quad & S v = 0 \notag\\
& lb \le v \le ub \tag{18.2}\\
& v_s = q_s \notag\\
& f^{T} v = \mu^{*} \notag
\end{align}

where $e_i$ is the $i$th unit vector ($e_i \in \mathbb{R}^{n}$) and $\mu^{*}$ is the optimal growth rate calculated through the solution of the LP described in Equation 18.1. The solution of the resulting $2n$ linear programming problems defines the range of values that each reaction flux can take while still supporting the optimal growth rate. Reactions with a nonzero range represent redundant pathways in the network and can substitute for one another. A variant of this algorithm was used to analyze fermentation data of L. plantarum and to identify the flexibility of the metabolic pathways for a case where the objective was not clearly defined (Teusink et al., 2006).

Deletion analysis methods. Three different methods have been proposed for simulating the effect of gene deletion on the metabolic flux distribution (Edwards and Palsson, 2000a; Segre et al., 2002; Shlomi et al., 2005). The key difference among these methods is the hypothesis underlying their formulation. In the first approach, the cellular goal is assumed to be the maximization of the growth rate even after the loss of enzymatic activity due to gene deletion. Here, an LP problem is formulated by augmenting the FBA algorithm with an additional constraint eliminating flux through the reaction catalyzed by the deleted gene product. Segre et al. proposed another approach, known as MOMA, for minimization of metabolic adjustment. Here, the cellular objective of the mutant strain was assumed to be homeostasis of the metabolic flux distribution rather than growth rate maximization, and the Euclidean distance between the mutant and the wild-type flux distributions was minimized. In the third approach, known as ROOM, the hypothesis is similar to that of MOMA; however, instead of the Euclidean distance, the number of flux changes was minimized. The rationale was that the Euclidean distance minimization approach led to changes in several fluxes and sometimes did not identify short alternative pathways used for rerouting metabolic flux.
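The FBA and FVA formulations in Equations 18.1 and 18.2, together with the first (growth-maximizing) deletion approach, can be sketched on a toy problem. The five-reaction network, its bounds, and the helper names below are hypothetical and chosen only so the result can be checked by hand; the sketch assumes NumPy and SciPy are available and uses `scipy.optimize.linprog`, which minimizes, so the objective is negated.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (columns are fluxes):
#   v0: uptake -> A      v1: A -> B      v2: A -> B (parallel route)
#   v3: B -> biomass     v4: A -> secreted by-product
S = np.array([[1, -1, -1,  0, -1],    # metabolite A
              [0,  1,  1, -1,  0]],   # metabolite B
             dtype=float)
lb = np.array([10.0, 0.0, 0.0, 0.0, 0.0])        # v0 fixed at q_s = 10 (lb = ub)
ub = np.array([10.0, 8.0, 8.0, 1000.0, 1000.0])  # capacity limits on v1 and v2
f = np.array([0.0, 0.0, 0.0, 1.0, 0.0])          # objective selects growth flux v3


def fba(S, lb, ub, f):
    """Solve max f^T v  s.t.  S v = 0, lb <= v <= ub (Equation 18.1)."""
    res = linprog(-f, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=list(zip(lb, ub)), method="highs")
    if res.status != 0:
        raise ValueError(res.message)
    return -res.fun, res.x


def fva(S, lb, ub, f, mu_opt):
    """Min and max of every flux with f^T v held at mu_opt (Equation 18.2)."""
    A_eq = np.vstack([S, f])
    b_eq = np.append(np.zeros(S.shape[0]), mu_opt)
    bounds = list(zip(lb, ub))
    ranges = []
    for i in range(S.shape[1]):
        e_i = np.zeros(S.shape[1])
        e_i[i] = 1.0
        v_min = linprog(e_i, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs").fun
        v_max = -linprog(-e_i, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs").fun
        ranges.append((v_min, v_max))
    return ranges


def delete_reaction(lb, ub, index):
    """Return bounds with the flux through one reaction forced to zero."""
    lb_ko, ub_ko = lb.copy(), ub.copy()
    lb_ko[index] = ub_ko[index] = 0.0
    return lb_ko, ub_ko


mu_opt, v_opt = fba(S, lb, ub, f)                  # optimal growth, about 10
flux_ranges = fva(S, lb, ub, f, mu_opt)            # v1 and v2 each span about 2 to 8
mu_ko, _ = fba(S, *delete_reaction(lb, ub, 1), f)  # deleting v1 limits growth to about 8
```

In this toy case the two parallel routes are the redundant reactions that FVA exposes, and removing one of them forces the surplus substrate into the by-product. MOMA and ROOM would replace the objective in the deletion step with a quadratic or mixed-integer distance to the wild-type solution and are not shown here.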
However, recent studies have shown that FBA is more predictive of the growth rate of the adapted mutant than of the initially generated strain (Fong and Palsson, 2004). Further investigation is required to understand the changes in the flux distribution as the mutant strains evolve to higher growth rates during selection for growth.

Sensitivity analysis methods. The impact of changes in the substrate uptake rates at both local and global scales can be investigated by a variety of methods. At the local scale, the shadow prices and reduced costs obtained during the solution of the linear programming problem can be used to assess the sensitivity of the objective function. The dimension of the shadow price vector corresponds to the number of constraints or metabolites in the problem, and the shadow prices contain information on the potential change in the objective function value when a small change in the availability of the corresponding metabolite (source/sink term) is made. The shadow price reflects the value of a metabolite and is useful in network debugging to identify missing biosynthetic pathways in the network. Further details on the shadow prices can be found in Edwards et al. (2002), Chvatal (1983), and Palsson (2006).

Additional methods to evaluate the sensitivity of the objective function over a broader parametric range are also available. For example, in robustness analysis (Edwards and Palsson, 2000b), one of the substrate uptake rates (or any other flux) is varied over a range and the resulting objective function profile is plotted. Robustness analysis essentially represents a two-dimensional slice of the feasible solution space defined in the flux coordinates by the physico-chemical constraints in Equation 18.1. An extension of robustness analysis is phase plane analysis (Edwards et al., 2002), where the value of the objective function is calculated while two parameters (fluxes) are varied over a range. In phase plane analysis, a region in which the shadow prices remain the same is defined as a metabolic phase, and the boundaries between the phases are also calculated (Bell and Palsson, 2005). These phases can be linked to a particular phenotype, such as acetate overflow in E. coli, and the slope of the phase boundaries can be used to identify regions of single and dual substrate limitation.

18.3.1.2 Unbiased Methods

Although the assumption of growth rate maximization appears to describe metabolism in prokaryotic networks, it is not clear whether metabolism in higher organisms can be represented similarly. However, physicochemical constraints such as mass and energy balances have to be satisfied by the metabolic networks in complex biological systems. Hence, the analysis of the solution space defined by these constraints is valuable for characterizing metabolism in higher organisms, where a clear objective function is absent.
The details of the methods proposed for analyzing the properties of the solution space are discussed below.

Extreme pathway analysis and elementary modes. Extreme pathway analysis and elementary mode analysis are two convex analysis based approaches for analyzing metabolic pathways (Schilling et al., 1999; Stelling et al., 2002; Papin et al., 2003). These two related methods attempt to characterize all feasible metabolic flux distributions and define the metabolic pathways associated with the distributions. Both of these approaches are combinatorial in nature and attempt to characterize the solution space in its entirety rather than pick out a particular solution. These methods have been extensively reviewed and compared in detail elsewhere (Papin et al., 2004). Briefly, the elementary modes (ElMos) are the set of all feasible solutions that are non-decomposable (i.e., an ElMo is not a subset of any other ElMo), whereas the extreme pathways (ExPas) additionally satisfy a condition of systemic independence and are a subset of the ElMos. The number of ElMos and ExPas grows combinatorially with network size (e.g., the number of ElMos for a 110-reaction network was 27,099 during growth with glucose), and genome-scale computation of ElMos and ExPas is still an area of extensive research (Bell and Palsson, 2005).

Random and uniform sampling. The challenges in computing genome-scale ElMos and ExPas led to other approaches, such as random sampling, in an effort to comprehensively analyze the solution space (Wiback et al., 2004; Price et al., 2004b). Here, a Monte Carlo method is used to generate random flux distributions uniformly throughout the constrained space. The physiological properties of the points that are still feasible after an additional constraint (e.g., decreased capacity as in an enzymopathy) is imposed can be used to obtain information on the outcome of perturbations without reliance on biased methods. Almaas et al. (2004) used such sampling methods to analyze the genome-scale metabolic network of E. coli under varying environments and identified a high-flux backbone in the network that was selectively reorganized in response to the environment. A similar sampling algorithm was implemented to analyze the metabolic network of human mitochondria under different pathophysiological conditions (Thiele et al., 2005). In that study, reaction co-sets, which have highly correlated flux values in the sampled distributions, were identified in these different physiological conditions, providing insights into the regulation of the metabolic network.

Flux coupling analysis (FCA). FCA is an optimization-based algorithm for determining the correlations between metabolic fluxes in a genome-scale reaction network (Burgard et al., 2004). Here, the pair-wise ratio of flux values is maximized and minimized to obtain the range of the flux ratio. If two flux values are perfectly correlated, then the flux ratio is a constant. FCA has been used to identify both perfectly correlated and partially correlated sets in genome-scale networks of S. cerevisiae, H. pylori, and E. coli, and represents a powerful method for the topological analysis of genome-scale metabolic networks and for identifying metabolic modules that function together in the network.
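The flux ratio calculation behind FCA can be illustrated in a simplified setting. The sketch below assumes a network of irreversible reactions with only the steady-state and non-negativity constraints, in which case the range of $v_i / v_j$ can be found by fixing $v_j = 1$ and minimizing and maximizing $v_i$; this is a simplification for illustration and not the published algorithm as such. The four-reaction network and the `ratio_range` helper are hypothetical, and SciPy's `linprog` is assumed to be available.

```python
import math

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network of irreversible reactions (columns are fluxes):
#   v0: -> A     v1: A -> B     v2: A -> (secreted)     v3: B -> (secreted)
S = np.array([[1, -1, -1,  0],    # metabolite A
              [0,  1,  0, -1]],   # metabolite B
             dtype=float)


def ratio_range(S, i, j):
    """Range of v_i / v_j over {v >= 0, S v = 0}, computed with v_j fixed at 1."""
    n = S.shape[1]
    e_j = np.zeros(n)
    e_j[j] = 1.0
    A_eq = np.vstack([S, e_j])                      # append the row v_j = 1
    b_eq = np.append(np.zeros(S.shape[0]), 1.0)
    limits = []
    for sign in (1.0, -1.0):                        # +1: minimize, -1: maximize
        c = np.zeros(n)
        c[i] = sign
        res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n,
                      method="highs")
        if res.status == 3:                         # unbounded: ratio has no limit
            limits.append(math.inf)
        elif res.status == 0:
            limits.append(sign * res.fun)
        else:
            raise ValueError(res.message)
    return tuple(limits)


r_31 = ratio_range(S, 3, 1)   # v3/v1 is constant: the two fluxes are fully coupled
r_10 = ratio_range(S, 1, 0)   # v1/v0 spans a range: v1 needs v0, but not vice versa
```

A constant ratio identifies reactions that always operate together, while a ratio bounded on one side only indicates a directional dependence. With reversible reactions or finite capacity bounds, the fixed-denominator shortcut no longer applies directly and a fractional programming transformation is needed.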

18.3.2 Regulatory and Dynamic Extensions

Classical approaches to constraint-based modeling assumed that all of the metabolic pathways could be active at all times, whereas it is clear that some of these pathways are subject to regulatory mechanisms and are active only under specific conditions. In order to account for such mechanisms, a regulatory extension to the classical approach was proposed by Covert and Palsson (2002). In that study, transcriptional regulation was represented in a Boolean formulation, whereby in the presence of an environmental signal, pathways repressed by the signal would be constrained to have zero flux. Since concentrations are not represented in the FBA approach, the environmental signals (e.g., presence of oxygen or of a carbon source) were determined from the FBA solution obtained without any regulatory constraints. The addition of the regulatory constraints reduced the solution space and eliminated solutions inconsistent with such regulatory mechanisms. A genome-scale integrated model of transcriptional and metabolic pathways is available for both E. coli and S. cerevisiae (Covert et al., 2004; Herrgard et al., 2006b). However, such a genome-scale extension to other organisms is possible only if the transcriptional regulatory network in those organisms is well characterized (Tavazoie et al., 1999; Tegner et al., 2003).

Another area where classical FBA has been extended is the dynamic modeling of metabolism. Classical FBA relies on the steady-state assumption that leads to the linear stoichiometric flux balance constraints. The assumption that the intracellular metabolites are at steady-state levels can be reasonable, as the time scales of the enzymatic reaction events are fast (seconds to minutes) relative to the time scale of cellular growth (minutes to hours). However, in several cases, the cellular environment changes during growth, and such changes can impact the metabolic flux distribution and the cellular growth. An example is oxygen limitation due to increased cell density in a batch culture, which leads to the secretion of fermentation products. These processes can be represented using classical FBA by switching the constraints to reflect the changes in the oxygen levels (Varma et al., 1993). However, the switching between different metabolic states is assumed to be instantaneous in these formulations. In order to capture such dynamic effects due to the regulation of pathways, dynamic FBA (dFBA) was proposed, in which the dynamics of the extracellular environment was integrated with metabolic models of cellular growth (Mahadevan et al., 2002). The dFBA formalism has been used to identify optimal genetic and environmental manipulation profiles for maximizing the formation of chemicals such as ethanol and acetate in a fed-batch bioreactor (Gadkar et al., 2005). Hence, this formulation is critical for integrating the detailed molecular representation of metabolism with the macroscopic bioprocess description for the optimization and design of these processes. More recently, this formulation has been used for the analysis of metabolic dynamics in mammalian myocardia under ischemic conditions (Luo et al., 2006).
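One simple way to couple an FBA problem to a changing extracellular environment is to re-solve the LP at each time step with uptake bounds computed from the current concentrations, and then update biomass and substrate with an explicit Euler step. The sketch below follows that step-wise scheme for a hypothetical one-substrate batch culture; the yield, the Michaelis-Menten parameters, and all function names are assumptions made for illustration and are not taken from the cited dFBA work.

```python
import numpy as np
from scipy.optimize import linprog

Y = 0.1                 # biomass yield, gDW per mmol glucose (assumed)
VMAX, KM = 10.0, 0.5    # uptake kinetics: mmol/gDW/h and mmol/L (assumed)

# One internal metabolite, two fluxes: glucose uptake and growth.
# Steady state requires uptake - growth / Y = 0.
S = np.array([[1.0, -1.0 / Y]])


def fba_growth(max_uptake):
    """Maximize growth subject to S v = 0 and 0 <= uptake <= max_uptake."""
    res = linprog(c=[0.0, -1.0], A_eq=S, b_eq=[0.0],
                  bounds=[(0.0, max_uptake), (0.0, None)], method="highs")
    if res.status != 0:
        raise ValueError(res.message)
    return res.x[0], res.x[1]


def simulate(x0=0.01, s0=10.0, dt=0.05, t_end=10.0):
    """Euler integration of biomass (gDW/L) and glucose (mmol/L) in batch."""
    biomass, glucose = [x0], [s0]
    for _ in range(int(round(t_end / dt))):
        x, s = biomass[-1], glucose[-1]
        kinetic_limit = VMAX * s / (KM + s)   # uptake the cells can achieve
        supply_limit = s / (x * dt)           # uptake that would empty the medium
        v_uptake, mu = fba_growth(min(kinetic_limit, supply_limit))
        biomass.append(x + mu * x * dt)
        glucose.append(max(s - v_uptake * x * dt, 0.0))
    return np.array(biomass), np.array(glucose)


biomass, glucose = simulate()
```

Because the toy model has a fixed yield, the quantity biomass plus Y times glucose stays constant over the run, which gives a quick sanity check on the integration. Representing oxygen limitation and the resulting secretion of fermentation products would require adding oxygen uptake and by-product fluxes to the stoichiometric matrix.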

18.3.4 Thermodynamic and Metabolic Extensions

The initial genome-scale models incorporated limited thermodynamic information on the directionality of the reactions based on the standard Gibbs free energy change of metabolic reactions. However, it was recognized that energy balances were required in addition to the stoichiometric constraints in order to enforce the laws of thermodynamics (Beard et al., 2002). Energy balance analysis incorporated explicit constraints to prevent flux through thermodynamically infeasible pathways such as reaction cycles, and obtained the flux distribution through the solution of a quadratic programming problem. Price et al. (2002) presented an alternative approach to enforcing thermodynamic feasibility via the elimination of reaction cycles.


Recently, Henry et al. (2006) calculated the Gibbs free energy change associated with all of the reactions in the genome-scale metabolic model of E. coli and identified thermodynamically unfavorable reactions essential for growth. Further, such thermodynamic information has been used to calculate feasible metabolite concentration ranges based on the measurement of a subset of metabolites using stoichiometric constraints (Mavrovouniotis, 1996; Kummel et al., 2006). An alternative formulation to FBA that incorporates thermodynamic information was proposed by Holzhutter (2006). This involves the calculation of the flux distribution that minimizes the weighted sum of the fluxes when a few measured fluxes are specified. The weights on these fluxes are determined based on the standard Gibbs free energy change associated with the reactions. The prediction of the flux distribution in the red blood cell metabolic network using this flux minimization (FM) approach was found to be consistent with the kinetic model. These results suggest that a combination of stoichiometry and other physicochemical constraints can be used to analyze metabolism in higher organisms even if the cellular goal in such cases is unclear.

Another approach that integrated metabolomic data and stoichiometric constraints was proposed by Famili et al. (2005). Here, the data and the constraints were used to derive constraints on the range of kinetic parameters for dynamic model development. k-cone analysis of S. cerevisiae metabolism was performed to determine the consistency between in vitro enzymatic parameters and in vivo concentrations, and to determine the minimum number of enzymatic parameters that needed to be changed to ensure consistency with the data. This approach was applied to the red blood cell metabolic network to determine the range of the kinetic parameters under different physiological conditions using Monte Carlo sampling.
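The link between free energy changes, metabolite concentration ranges, and reaction directionality can be made concrete with the standard relation $\Delta G' = \Delta G'^{\circ} + RT \ln Q$. The sketch below computes the lowest and highest $\Delta G'$ reachable within given concentration bounds and classifies the reaction accordingly. The reaction, the standard free energy values, and the concentration bounds are hypothetical, and activity coefficients, pH, and ionic strength corrections are ignored.

```python
import math

R = 8.314e-3   # gas constant, kJ/(mol K)
T = 298.15     # temperature, K


def delta_g_range(delta_g0, stoichiometry, conc_bounds):
    """Lowest and highest dG' = dG0' + RT ln Q within the concentration bounds.

    `stoichiometry` maps metabolite -> coefficient (negative for reactants);
    `conc_bounds` maps metabolite -> (min, max) concentration in mol/L.
    """
    rt = R * T
    lowest = highest = delta_g0
    for met, coeff in stoichiometry.items():
        c_min, c_max = conc_bounds[met]
        if coeff > 0:   # products: low concentration favors the forward direction
            lowest += coeff * rt * math.log(c_min)
            highest += coeff * rt * math.log(c_max)
        else:           # reactants: high concentration favors the forward direction
            lowest += coeff * rt * math.log(c_max)
            highest += coeff * rt * math.log(c_min)
    return lowest, highest


def directionality(delta_g0, stoichiometry, conc_bounds):
    lowest, highest = delta_g_range(delta_g0, stoichiometry, conc_bounds)
    if highest < 0:
        return "forward only"
    if lowest > 0:
        return "reverse only"
    return "reversible"


# Hypothetical reaction A -> B with both metabolites between 10 uM and 10 mM.
reaction = {"A": -1, "B": 1}
bounds = {"A": (1e-5, 1e-2), "B": (1e-5, 1e-2)}

print(directionality(5.0, reaction, bounds))    # reversible
print(directionality(-30.0, reaction, bounds))  # forward only
print(directionality(30.0, reaction, bounds))   # reverse only
```

With these bounds the concentration term can shift $\Delta G'$ by roughly 17 kJ/mol in either direction, so only reactions with a standard free energy change larger than that in magnitude are forced into a single direction. Narrowing the bounds with measured concentrations tightens the classification.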

18.3.5 Optimization of Metabolic Networks

A suite of design and analysis approaches for metabolic engineering is now available for the constraint-based representation of metabolic networks. Most of these approaches rely on well-established optimization techniques routinely used in systems engineering and control. They use mathematical programming to optimize an alternative objective function (e.g., the number of active reactions) subject either to the stoichiometric constraints alone or to both the stoichiometric constraints and the objective of growth maximization. These approaches are classified into two categories based on problem formulation and are discussed in further detail below.

18.3.5.1 Integer Programming

In this class of algorithms, binary (Boolean) variables are used to represent the activity state (0 for inactive and 1 for active) of the enzyme catalyzing each reaction, alongside the continuous value of the flux through the reaction. Both the flux and the enzyme activity can then be varied to optimize an objective function in continuous and binary variables. As an example, Burgard et al. (2001) used integer programming to determine the minimum number of reactions required for supporting the synthesis of biomass components. In another study, Pharkya et al. (2004) used this formulation to identify and incorporate metabolic reactions from a database that lead to enhanced yield of specific metabolic products in E. coli. Hence, integer programming representations are valuable for identifying required additional metabolic functions, or eliminating existing ones, for optimization of the metabolic network.

18.3.5.2 Bilevel Programming

Bilevel programming is another class of optimization approach, in which two optimization problems are nested within each other. Such problems naturally arise in the design of a metabolic network, whereby the FBA problem with the cellular objective of growth rate maximization is nested within another problem with a higher-level engineering objective. This formulation was first introduced by Burgard et al. (2003), where the outer-level objective was the maximization of product yield given the cellular objective and constraints on the maximum allowable number of knock-outs. In that study, the nested optimization problem was solved by converting the inner LP problem into linear constraints using duality theory. A binary variable was used to represent the activity of a reaction, and the solution of the resulting mixed-integer linear programming (MILP) problem identified reactions that have to be knocked out to increase the product yield while still maximizing growth. Hence, the OptKnock formulation, discussed in a subsequent chapter, is valuable in coupling product formation to growth. Additionally, bilevel programming has been used to identify objective functions that are most consistent with experimentally measured flux distributions, and to reconcile experimental measurements with the stoichiometric constraints (Burgard and Maranas, 2003; Raghunathan et al., 2003). Recently, Pharkya and Maranas (2006) extended the OptKnock formulation to determine reaction activation/inhibition, rather than just knock-outs (binary state), that can lead to enhanced product yield. Finally, Herrgard et al. (2006a) proposed optimal metabolic network identification for reconciling the predictions of the genome-scale model with experimentally observed flux distributions and identifying potential bottleneck reactions leading to suboptimal growth. Several of these optimization methods for metabolic network analysis and design are discussed in detail in other chapters of this book.
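The nested structure of the bilevel problem can be illustrated on a toy network by brute force. The Python sketch below is a stand-in for, not an implementation of, the OptKnock formulation: instead of converting the inner LP into constraints through duality and solving a single MILP, it simply enumerates knockout sets and solves the inner growth-maximization LP for each one. The network, the names `max_flux` and `best_knockouts`, and the tolerances are hypothetical, and enumeration is practical only for very small candidate sets. NumPy and SciPy are assumed to be available.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network. Columns:
# 0 v_up: -> A, 1 v_ferm: A -> E + P, 2 v_resp: A -> 3 E,
# 3 v_growth: A + 2 E ->, 4 v_pex: P ->
NAMES = ["v_up", "v_ferm", "v_resp", "v_growth", "v_pex"]
S = np.array([
    [1, -1, -1, -1,  0],   # A: substrate
    [0,  1,  3, -2,  0],   # E: energy carrier
    [0,  1,  0,  0, -1],   # P: product
], dtype=float)
UPPER = [10.0, None, None, None, None]   # only substrate uptake is capped
GROWTH, PRODUCT = 3, 4


def max_flux(target, knocked, min_growth=0.0):
    """Maximize one flux at steady state with the knocked-out reactions at zero.
    The growth reaction itself is assumed never to be a knockout candidate."""
    bounds = [(0.0, 0.0) if j in knocked else (0.0, UPPER[j]) for j in range(len(NAMES))]
    bounds[GROWTH] = (min_growth, UPPER[GROWTH])
    c = np.zeros(len(NAMES))
    c[target] = -1.0   # linprog minimizes
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds, method="highs")
    return float(res.x[target]) if res.success else None


def best_knockouts(candidates, max_knockouts, min_viable_growth=1e-6):
    """Outer problem: pick the knockout set with the highest product flux,
    evaluated at the growth optimum of the inner problem."""
    best = None
    for k in range(max_knockouts + 1):
        for knocked in combinations(candidates, k):
            growth = max_flux(GROWTH, knocked)   # inner problem: cellular objective
            if growth is None or growth < min_viable_growth:
                continue                         # discard non-viable designs
            # Optimistic tie-break: best product flux among growth-optimal states.
            product = max_flux(PRODUCT, knocked, min_growth=growth * (1 - 1e-9))
            if product is not None and (best is None or product > best[0] + 1e-9):
                best = (product, tuple(NAMES[j] for j in knocked), growth)
    return best


print(best_knockouts(candidates=[1, 2], max_knockouts=1))
```

In this toy case the wild type grows fastest by respiration alone and secretes no product, whereas removing the respiration step forces the energy demand of growth through the product-forming fermentation step, which is the growth coupling that the bilevel formulation is designed to find.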

18.4 Software and Databases for Genome-Scale Modeling

A number of alternatives, including academic and commercial software, are available for the development of genome-scale metabolic models and the implementation of the computational analysis techniques discussed in the earlier sections. These are briefly summarized in Table 18.1. Most of these packages include a reaction and metabolite database for the construction of the models and a link to an optimization solver for the solution of the underlying linear or quadratic programming problem. One of the critical components in the construction of such models is the representation of the appropriately charged form of each metabolite in solution, as the global proton balance can have a significant impact on the physiology predictions (Reed et al., 2003; Mahadevan et al., 2006). Another often overlooked aspect of the analysis of genome-scale models is the numerical challenges associated with large-scale metabolic model formulations. Although the solution of the underlying linear programming problems is comparatively efficient even at genome scale, the biomass component requirements, whose coefficients vary across two to three orders of magnitude, can cause numerical scaling issues. Hence, the solution returned by the LP solver has to be examined carefully and the optimization parameters changed appropriately to ensure model accuracy. Another feature to consider when evaluating the different software is the ability to import and export models in a standard format such as the Systems Biology Markup Language (SBML), which is emerging as a primary standard for the exchange and archiving of biological models. The available commercial and academic software and some of their features are summarized in Table 18.1.
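One way to examine a solver's output, as recommended above, is to recompute the mass-balance residuals and bound violations independently of the solver. The Python sketch below shows one possible check; the function name `check_solution`, the row-scaling rule, and the default tolerance are assumptions made for illustration rather than a feature of any package in Table 18.1, and NumPy is assumed to be available.

```python
import numpy as np


def check_solution(S, v, lb, ub, tol=1e-6):
    """Report scaled mass-balance residuals and bound violations for a flux vector."""
    S = np.asarray(S, dtype=float)
    v = np.asarray(v, dtype=float)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)

    residual = S @ v
    # Scale each row by its largest single term |S_ij * v_j| so that rows built
    # from very small coefficients are judged on a relative basis.
    row_scale = np.max(np.abs(S * v), axis=1)
    relative = np.abs(residual) / np.maximum(row_scale, 1e-12)

    bound_violation = np.maximum(np.maximum(lb - v, v - ub), 0.0)

    nonzero = np.abs(S[S != 0])
    return {
        "max_relative_residual": float(relative.max()),
        "max_bound_violation": float(bound_violation.max()),
        "coefficient_range": float(nonzero.max() / nonzero.min()),
        "ok": bool(relative.max() <= tol and bound_violation.max() <= tol),
    }


# Hypothetical two-metabolite example with a coefficient 1000-fold smaller than the rest.
S_demo = [[1.0, -1.0, 0.0],
          [0.0, 1.0, -0.001]]
report = check_solution(S_demo, v=[1.0, 1.0, 1000.0], lb=[0, 0, 0], ub=[10, 10, 2000])
print(report)
```

A large `coefficient_range` is a hint that rescaling the model or tightening the solver tolerances may be worthwhile before the predicted fluxes are interpreted.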

18.5 Survey of Genome-Scale Metabolic Models

As of 2006, genome-scale models of 12 bacteria, archaea, and eukaryotes have been developed and utilized for applications ranging from metabolic engineering, recombinant protein production, and bioremediation to antimicrobial development (Table 18.2). Initial genome-scale models were constructed with academic-grade software (FBA) and did not include charge-balanced reactions. However, such models were able to predict the outcome of gene deletions with high accuracy (~70%) (Edwards and Palsson, 2000a). With the availability of more sophisticated tools such as MetaFluxNet, CellNetAnalyzer, and SimPheny, the more recent genome-scale models have all incorporated charge and elemental balancing and are coupled to commercial linear programming solvers. The models range from 373 reactions for M. succiniciproducens to 1220 reactions for the eukaryotic M. musculus cell line, and some of the models have been updated as additional genome annotation information and software became available. There are four versions of the E. coli model, suggesting that these models grow in size as new functions are identified and non-stoichiometric constraints, such as transcriptional regulatory and thermodynamic constraints, are incorporated in the model.

TABLE 18.1 Commercial and Academic Software Available for Development and Analysis of Genome-Scale Metabolic Models

| Software/Vendor | Built-in Charge/Elementally Balanced Database | Linear Optimization Tools | LP Solver | 13C Metabolic Flux Analysis | SBML Import/Export |
|---|---|---|---|---|---|
| SimPheny, Genomatica Inc. | Y | Y | XPRESS | N | N* |
| In Silico Discovery, InSilico Biotechnology Inc. | N | Y | JAVA | Y | Y |
| FBA, UCSD | N | Y | LINDO | N | N |
| CellNetAnalyzer, Garching Innovation GmbH | N | Y | MATLAB/MEX Interface | N | Y |
| MetaFluxNet, KAIST | N | Y | LPSOLVE | N | Y |

* Export available.

TABLE 18.2 List of Genome-Scale Models and Their Features

| Organism | Model Version | Software Platform | Size (Metabolites × Reactions) | Reference |
|---|---|---|---|---|
| Escherichia coli | iJE660a | FBA | 438 × 627 | Edwards and Palsson, 2000c |
| | iJR904 | SimPheny | 625 × 937 | Reed et al., 2003 |
| | iMC1010 | SimPheny | 626 × 939 | Covert et al., 2004 |
| | iHJ873 | | 518 × 873 | Henry et al., 2006 |
| Haemophilus influenzae | iJE295 | FBA | 343 × 488 | Edwards and Palsson, 1999 |
| Helicobacter pylori | iCS291 | FBA | 339 × 388 | Schilling et al., 2002 |
| | iIT341 | SimPheny | 411 × 476 | |
| Saccharomyces cerevisiae | iFF708 | FBA | 584 × 1175 | Famili et al., 2003 |
| | iND750 | SimPheny | 646 × 1149 | Duarte et al., 2004 |
| Staphylococcus aureus | iSB619 | SimPheny | 571 × 640 | Becker and Palsson, 2005 |
| Geobacter sulfurreducens | iRM588 | SimPheny | 541 × 523 | Mahadevan et al., 2006 |
| Mus musculus | iKS1156 | Lindo | 872 × 1220 | Sheikh et al., 2005 |
| Methanosarcina barkeri | iAF692 | SimPheny | 558 × 619 | Feist et al., 2006 |
| Mannheimia succiniciproducens | iHK335 | MetaFluxNet | 352 × 373 | Hong et al., 2004 |
| Streptomyces coelicolor | iIB711 | MATLAB, Lindo | 500 × 700 | Borodina et al., 2005 |
| Bacillus subtilis | iYK850 | SimPheny | 986 × 1033 | Oh et al., 2007 |
| Lactococcus lactis | iAO358 | GNU LP kit | 509 × 621 | Oliveira et al., 2005 |

Genome-scale models of well-studied organisms such as E. coli, S. cerevisiae, and B. subtilis have been extensively validated experimentally, whereas for the other organisms the models have been used primarily to understand the unique features of their metabolic networks. For example, the genome-scale model of H. pylori was used to identify minimal media requirements for that organism, and the G. sulfurreducens model revealed the metabolic challenges associated with metal respiration and extracellular electron transfer. The models of E. coli, S. cerevisiae, and M. succiniciproducens have been used for designing strains with improved lactate, ethanol, and succinate yields, respectively, further highlighting the potential of validated models. Thus far, the genome-scale models have been used in a variety of applications including (1) analysis and refinement of the network through reconciliation with data, (2) organization of high-throughput “omics” data, (3) redesign of cellular metabolism and optimization of bioprocesses, and (4) identification of the network flux distribution through the analysis of 13C isotope label incorporation in biomass. The model predictions of growth and essentiality have been compared with genome-scale data for E. coli, B. subtilis, and S. cerevisiae and have led to significant modifications in the models. Further, the model-based engineering of metabolism in S. cerevisiae and E. coli has led to improved product yields for ethanol, succinate, and lactic acid (Hong et al., 2004; Bro et al., 2006), and flux analysis has been used to identify the experimental flux distributions for several organisms. In summary, these genome-scale models have been used in a variety of applications to characterize and design cellular metabolism across the different kingdoms of life.

18.6 Conclusions

As these recent studies suggest, the availability of additional layers of omics data, along with computational analysis methods, has resulted in an unprecedented opportunity to analyze cellular metabolism and redesign metabolic networks for practical applications ranging from the production of biofuels such as ethanol, commodity chemicals such as succinate and lactate, and nutraceuticals to unconventional products such as electrical current generated by bacterial respiration in microbial fuel cells. While some of these model-based computational approaches are summarized in this chapter, the reader is referred to other chapters in this book for a comprehensive treatment of the applications to metabolic engineering. With the recent advances in bioinformatics enabling the efficient reconstruction of metabolic networks followed by model development, the number of such genome-scale models is expected to increase. We expect that these models will initially be used to improve our understanding of metabolism by iteratively (1) designing and conducting experiments to test model predictions, (2) reconciling the experimental data with computational results to discover novel functional constraints, and (3) refining the model to account for the new constraints. This iterative process will ultimately lead to improved understanding of metabolism across these organisms, and the resulting models will be critical for manipulating metabolism for practical applications in metabolic engineering, bioremediation, recombinant protein production, and antimicrobial discovery.

References

Almaas, E., Kovacs, B., Vicsek, T., Oltvai, Z.N., and Barabasi, A.L. 2004. Global organization of metabolic fluxes in the bacterium Escherichia coli. Nature, 427, 839–843.
Beard, D.A., Liang, S.C., and Qian, H. 2002. Energy balance for analysis of complex metabolic networks. Biophys. J., 83, 79–86.
Becker, S.A. and Palsson, B.O. 2005. Genome-scale reconstruction of the metabolic network in Staphylococcus aureus N315: an initial draft to the two-dimensional annotation. BMC Microbiol., 5, 8.
Bell, S.L. and Palsson, B.O. 2005. Expa: a program for calculating extreme pathways in biochemical reaction networks. Bioinformatics, 21, 1739–1740.
Bochner, B.R., Gadzinski, P., and Panomitros, E. 2001. Phenotype microarrays for high-throughput phenotypic testing and assay of gene function. Genome Res., 11, 1246–1255.
Borodina, I., Krabben, P., and Nielsen, J. 2005. Genome-scale analysis of Streptomyces coelicolor A3(2) metabolism. Genome Res., 15, 820–829.
Bro, C., Regenberg, B., Forster, J., and Nielsen, J. 2006. In silico aided metabolic engineering of Saccharomyces cerevisiae for improved bioethanol production. Metab. Eng., 8, 102–111.
Burgard, A.P. and Maranas, C.D. 2003. Optimization-based framework for inferring and testing hypothesized metabolic objective functions. Biotechnol. Bioeng., 82, 670–677.
Burgard, A.P., Nikolaev, E.V., Schilling, C.H., and Maranas, C.D. 2004. Flux coupling analysis of genome-scale metabolic network reconstructions. Genome Res., 14, 301–312.
Burgard, A.P., Pharkya, P., and Maranas, C.D. 2003. OptKnock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng., 84, 647–657.
Burgard, A.P., Vaidyaraman, S., and Maranas, C.D. 2001. Minimal reaction sets for Escherichia coli metabolism under different growth requirements and uptake environments. Biotechnol. Prog., 17, 791–797.
Chvatal, V. 1983. Linear Programming. W.H. Freeman and Company, New York.
Covert, M.W., Famili, I., and Palsson, B.O. 2003. Identifying constraints that govern cell behavior: a key to converting conceptual to computational models in biology? Biotechnol. Bioeng., 84, 763–772.
Covert, M.W., Knight, E.M., Reed, J.L., Herrgard, M.J., and Palsson, B.O. 2004. Integrating high-throughput and computational data elucidates bacterial networks. Nature, 429, 92–96.
Covert, M.W. and Palsson, B.O. 2002. Transcriptional regulation in constraints-based metabolic models of Escherichia coli. J. Biol. Chem., 277, 28058–28064.
Covert, M.W., Schilling, C.H., Famili, I., Edwards, J.S., Goryanin, I.I., Selkov, E., and Palsson, B.O. 2001. Metabolic modeling of microbial strains in silico. Trends Biochem. Sci., 26, 179–186.
Duarte, N.C., Herrgard, M.J., and Palsson, B. 2004. Reconstruction and validation of Saccharomyces cerevisiae iND750, a fully compartmentalized genome-scale metabolic model. Genome Res., 14, 1298–1309.
Edwards, J.S. and Palsson, B.O. 1999. Systems properties of the Haemophilus influenzae Rd metabolic genotype. J. Biol. Chem., 274, 17410–17416.
Edwards, J.S. and Palsson, B.O. 2000a. Metabolic flux balance analysis and the in silico analysis of Escherichia coli K-12 gene deletions. BMC Bioinformatics, 1, 1.
Edwards, J.S. and Palsson, B.O. 2000b. Robustness analysis of the Escherichia coli metabolic network. Biotechnol. Prog., 16, 927–939.
Edwards, J.S. and Palsson, B.O. 2000c. The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. USA, 97, 5528–5533.
Edwards, J.S., Ramakrishna, R., and Palsson, B.O. 2002. Characterizing the metabolic phenotype: a phenotype phase plane analysis. Biotechnol. Bioeng., 77, 27–36.
Famili, I., Forster, J., Nielsen, J., and Palsson, B.O. 2003. Saccharomyces cerevisiae phenotypes can be predicted by using constraint-based analysis of a genome-scale reconstructed metabolic network. Proc. Natl. Acad. Sci. USA, 100, 13134–13139.
Famili, I., Mahadevan, R., and Palsson, B.O. 2005. k-Cone analysis: determining all candidate values for kinetic parameters on a network scale. Biophys. J., 88, 1616–1625.
Feist, A.M., Scholten, J.C.M., Palsson, B.O., Brockman, F.J., and Ideker, T. 2006. Modeling methanogenesis with a genome-scale metabolic reconstruction of Methanosarcina barkeri. Mol. Syst. Biol., msb4100046, E1–E14.
Fong, S.S. and Palsson, B.O. 2004. Metabolic gene-deletion strains of Escherichia coli evolve to computationally predicted growth phenotypes. Nat. Genet., 36, 1056–1058.
Francke, C., Siezen, R.J., and Teusink, B. 2005. Reconstructing the metabolic network of a bacterium from its genome. Trends Microbiol., 13, 550–558.
Gadkar, K.G., Doyle, F.J., Edwards, J.S., and Mahadevan, R. 2005. Estimating optimal profiles of genetic alterations using constraint-based models. Biotechnol. Bioeng., 89, 243–251.
Henry, C.S., Jankowski, M.D., Broadbelt, L.J., and Hatzimanikatis, V. 2006. Genome-scale thermodynamic analysis of Escherichia coli metabolism. Biophys. J., 90, 1453–1461.
Herrgard, M.J., Fong, S.S., and Palsson, B.O. 2006a. Identification of genome-scale metabolic network models using experimentally measured flux profiles. PLoS Comput. Biol., 2, e72.
Herrgard, M.J., Lee, B.S., Portnoy, V., and Palsson, B.O. 2006b. Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae. Genome Res., 16, 627–635.
Holzhutter, H.G. 2006. The generalized flux-minimization method and its application to metabolic networks affected by enzyme deficiencies. Biosystems, 83, 98–107.
Hong, S.H., Kim, J.S., Lee, S.Y., In, Y.H., Choi, S.S., Rih, J.K., Kim, C.H., Jeong, H., Hur, C.G., and Kim, J.J. 2004. The genome sequence of the capnophilic rumen bacterium Mannheimia succiniciproducens. Nat. Biotechnol., 22, 1275–1281.
Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res., 32, Database issue, D277–D280.
Karp, P.D., Paley, S., and Romero, P. 2002. The Pathway Tools software. Bioinformatics, 18 Suppl 1, S225–S232.
Kroger, A., Biel, S., Simon, J., Gross, R., Unden, G., and Lancaster, C.R. 2002. Fumarate respiration of Wolinella succinogenes: enzymology, energetics and coupling mechanism. Biochim. Biophys. Acta, 1553, 23–38.
Kummel, A., Panke, S., and Heinemann, M. 2006. Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Syst. Biol., 2, 2006.
Lehninger, A.L., Cox, M.M., and Nelson, D.L. 1993. Principles of Biochemistry. Worth Publishers, New York.
Luo, R.Y., Liao, S., Tao, G.Y., Li, Y.Y., Zeng, S., Li, Y.X., and Luo, Q. 2006. Dynamic analysis of optimality in myocardial energy metabolism under normal and ischemic conditions. Mol. Syst. Biol., 2, 2006.
Mahadevan, R., Bond, D.R., Butler, J.E., Esteve-Nunez, A., Coppi, M.V., Palsson, B.O., Schilling, C.H., and Lovley, D.R. 2006. Characterization of metabolism in the Fe(III)-reducing organism Geobacter sulfurreducens by constraint-based modeling. Appl. Environ. Microbiol., 72, 1558–1568.
Mahadevan, R., Edwards, J.S., and Doyle, F.J. 2002. Dynamic flux balance analysis of diauxic growth in Escherichia coli. Biophys. J., 83, 1331–1340.
Mahadevan, R. and Schilling, C.H. 2003. The effects of alternate optimal solutions in constraint-based genome-scale metabolic models. Metab. Eng., 5, 264–276.
Majewski, R. and Domach, M. 1990. Simple constrained-optimization view of acetate overflow in Escherichia coli. Biotechnol. Bioeng., 35, 732–738.
Mavrovouniotis, M.L. 1996. Duality theory for thermodynamic bottlenecks in bioreaction pathways. Chem. Eng. Sci., 51, 1495–1507.
Neijssel, O.M., Teixeira de Mattos, M.J., and Tempest, D.W. 1996. Growth yield and energy distribution. In Neidhardt, F. (Ed.), Escherichia coli and Salmonella: Cellular and Molecular Biology. ASM Press, Washington, DC.
Notebaart, R.A., Van Enckevort, F.H.J., Francke, C., Siezen, R.J., and Teusink, B. 2006. Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics, 7.
Oh, Y.K., Palsson, B.O., Park, S.M., Schilling, C.H., and Mahadevan, R. 2007. Genome-scale reconstruction of metabolic network in Bacillus subtilis based on high-throughput phenotyping and gene essentiality data. J. Biol. Chem., 282, 28791–28799.
Oliveira, A.P., Nielsen, J., and Forster, J. 2005. Modeling Lactococcus lactis using a genome-scale flux model. BMC Microbiol., 5, 39.
Palsson, B. 2006. Systems Biology: Properties of Reconstructed Networks. Cambridge University Press, New York.
Papin, J.A., Price, N.D., Wiback, S.J., Fell, D.A., and Palsson, B. 2003. Metabolic pathways in the post-genome era. Trends Biochem. Sci., 28, 250–258.
Papin, J.A., Stelling, J., Price, N.D., Klamt, S., Schuster, S., and Palsson, B.O. 2004. Comparison of network-based pathway analysis methods. Trends Biotechnol., 22, 400–405.
Pharkya, P., Burgard, A.P., and Maranas, C.D. 2004. OptStrain: a computational framework for redesign of microbial production systems. Genome Res., 14, 2367–2376.
Pharkya, P. and Maranas, C.D. 2006. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng., 8, 1–13.
Pinney, J.W., Shirley, M.W., McConkey, G.A., and Westhead, D.R. 2005. metaSHARK: software for automated metabolic network prediction from DNA sequence and its application to the genomes of Plasmodium falciparum and Eimeria tenella. Nucleic Acids Res., 33, 1399–1409.
Pirt, S.J. 1965. The maintenance energy of bacteria in growing cultures. Proc. R. Soc. London (Biol), 163, 224–231.
Price, N.D., Famili, I., Beard, D.A., and Palsson, B.O. 2002. Extreme pathways and Kirchhoff’s second law. Biophys. J., 83, 2879–2882.
Price, N.D., Reed, J.L., and Palsson, B.O. 2004a. Genome-scale models of microbial cells: evaluating the consequences of constraints. Nat. Rev. Microbiol., 2, 886–897.
Price, N.D., Schellenberger, J., and Palsson, B.O. 2004b. Uniform sampling of steady-state flux spaces: means to design experiments and to interpret enzymopathies. Biophys. J., 87, 2172–2186.
Raghunathan, A.U., Perez-Correa, J.R., and Biegler, L.T. 2003. Data reconciliation and parameter estimation in flux-balance analysis. Biotechnol. Bioeng., 84, 700–708.
Reed, J.L., Famili, I., Thiele, I., and Palsson, B.O. 2006. Towards multidimensional genome annotation. Nat. Rev. Genet., 7, 130–141.
Reed, J.L., Vo, T.D., Schilling, C.H., and Palsson, B. 2003. Escherichia coli iJR904: an expanded genome-scale model of E. coli K-12. Genome Biol., 4, R54.1–R54.12.
Savageau, M.A. 1969. Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions. J. Theor. Biol., 25, 365–369.
Schilling, C.H., Covert, M.W., Famili, I., Church, G.M., Edwards, J.S., and Palsson, B.O. 2002. Genome-scale metabolic model of Helicobacter pylori 26695. J. Bacteriol., 184, 4582–4593.
Schilling, C.H., Schuster, S., Palsson, B.O., and Heinrich, R. 1999. Metabolic pathway analysis: basic concepts and scientific applications in the post-genomic era. Biotechnol. Prog., 15, 296–303.
Schomburg, I., Chang, A., Ebeling, C., Gremse, M., Heldt, C., Huhn, G., and Schomburg, D. 2004. BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res., 32, D431–D433.
Segre, D., Vitkup, D., and Church, G.M. 2002. Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl. Acad. Sci. USA, 99, 15112–15117.
Segre, D., Zucker, J., Katz, J., Lin, X., D’Haeseleer, P., Rindone, W.P., Kharchenko, P., Nguyen, D.H., Wright, M.A., and Church, G.M. 2003. From annotated genomes to metabolic flux models and kinetic parameter fitting. OMICS, 7, 301–316.
Segura, D., Mahadevan, R., Juárez, K., and Lovley, D.R. 2008. Computational and experimental analysis of redundancy in the central metabolism of Geobacter sulfurreducens. PLoS Comput. Biol., 4, e36.
Sheikh, K., Forster, J., and Nielsen, L.K. 2005. Modeling hybridoma cell metabolism using a generic genome-scale metabolic model of Mus musculus. Biotechnol. Prog., 21, 112–121.
Shlomi, T., Berkman, O., and Ruppin, E. 2005. Regulatory on/off minimization of metabolic flux changes after genetic perturbations. Proc. Natl. Acad. Sci. USA, 102, 7695–7700.
Stelling, J., Klamt, S., Bettenbrock, K., Schuster, S., and Gilles, E.D. 2002. Metabolic network structure determines key aspects of functionality and regulation. Nature, 420, 190–193.
Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J., and Church, G.M. 1999. Systematic determination of genetic network architecture. Nat. Genet., 22, 281–285.
Tegner, J., Yeung, M.K., Hasty, J., and Collins, J.J. 2003. Reverse engineering gene networks: integrating genetic perturbations with dynamical modeling. Proc. Natl. Acad. Sci. USA, 100, 5944–5949.
Teusink, B., Wiersma, A., Molenaar, D., Francke, C., de Vos, W.M., Siezen, R.J., and Smid, E.J. 2006. Analysis of growth of Lactobacillus plantarum WCFS1 on a complex medium using a genome-scale metabolic model. J. Biol. Chem., 281, 40041–40048.
Thiele, I., Price, N.D., Vo, T.D., and Palsson, B.O. 2005. Candidate metabolic network states in human mitochondria. Impact of diabetes, ischemia, and diet. J. Biol. Chem., 280, 11683–11695.
van der Heijden, R.T.J.M., Heijnen, J.J., Hellinga, C., Romein, B., and Luyben, K.C.A.M. 1994. Linear constraint relations in biochemical reaction systems: II. Diagnosis and estimation of gross measurement errors. Biotechnol. Bioeng., 43, 11–20.
van Hoek, P., van Dijken, J.P., and Pronk, J.T. 1998. Effect of specific growth rate on fermentative capacity of baker’s yeast. Appl. Environ. Microbiol., 64, 4226–4233.
Varma, A., Boesch, B.W., and Palsson, B.O. 1993. Stoichiometric interpretation of Escherichia coli glucose catabolism under various oxygenation rates. Appl. Environ. Microbiol., 59, 2465–2473.
Varner, J.D. 2000. Large-scale prediction of phenotype: Concept. Biotechnol. Bioeng., 69, 664–678.
Wiback, S.J., Famili, I., Greenberg, H.J., and Palsson, B.O. 2004. Monte Carlo sampling can be used to determine the size and shape of the steady-state flux space. J. Theor. Biol., 228, 437–447.
Wu, C.H., Apweiler, R., Bairoch, A., Natale, D.A., Barker, W.C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez, R., Magrane, M., Martin, M.J., Mazumder, R., O’Donovan, C., Redaschi, N., and Suzek, B. 2006. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res., 34, D187–D191.
Zhu, G., Golding, G.B., and Dean, A.M. 2005. The selective cause of an ancient adaptation. Science, 307, 1279–1282.

19 Multiscale Modeling of Metabolic Regulation

C.A. Leclerc McGill University

Jeffrey D. Varner Cornell University

19.1 Introduction
19.2 Background
19.3 The Multiscale Nature of Metabolism Follows from the Central Dogma of Molecular Biology
19.4 Construction of First-Principle and Reversed Engineered Models of Transcriptional Programs
19.5 Models of the Prokaryotic Translational Program
19.6 Integrating Transcriptional and Translational Programs with Physiology Leads to More Predictive Models
Multiscale Constraints Based Models Have Increased Capabilities • Cybernetic Models Bridge Metabolic Hierarchies
19.7 Summary and Conclusions
References

19.1 Introduction

The capability to gather organism-wide data has far outstripped the ability to understand it. Transforming large-scale data sets into a better cell requires tools that integrate physiology with its environment. One such tool is multiscale mathematical modeling, in which stoichiometry and kinetics are integrated with metabolic regulation and control. Integrated multiscale models could in principle predict physiological shifts resulting from environmental or genetic perturbation, thereby enhancing our ability to engineer metabolism. However, the complexity underlying the formulation and validation of multiscale models of metabolic regulation severely hampers the approach. In this chapter, we review the salient developments in the area of multiscale metabolic models with an emphasis on understanding the evolution of the field. We begin by presenting the general metabolic modeling landscape, reviewing both dynamic and stoichiometric models. We then present a general set of multiscale metabolic mass balances and frame the discussion of the formulation and validation of models of transcriptional and translational programs, from large-scale data sets and first principles, in the context of these balances. We conclude by reviewing two multiscale modeling techniques: the augmented constraints-based models of Covert, Palsson, and coworkers and the cybernetic models of Ramkrishna and coworkers. One important area, namely stochastic models of gene expression, is not considered here; see Ref. [1] for a review of the origin of stochastic fluctuation in gene expression.

19.2 Background he deepest level of metabolic analysis ultimately culminating in the prediction of metabolic dynamics, for example, the metabolic reprogramming observed in the seminal work of Brown and coworkers during the diauxie shit in Saccharomyces cerevisiae [2], requires that stoichiometry and kinetics be married 19-1

19-2

Modeling Tools for Metabolic Engineering

with metabolic regulation and control. Constructing multiscale or hierarchal models of physiology is not new; Shuler and coworkers in the late 1970 and early 1980s formulated dynamic single cell models of Escherichia coli [3–6], Chinese Hamster Ovary (CHO) cells [7,8] and S. cerevisiae [9]. hese models were capable of predicting physiological characteristics ranging from the dependence of cell geometry upon growth rate and the impact of nutrient conditions [6,10,11] to plasmid replication and host-plasmid interactions [12–14]. Many examples of the single cell model paradigm can be found in the literature, see Shuler [15]. While arguably being the best formalism to describe cell growth and physiology, single cell models are computationally expensive, require a large number of kinetic parameters and detailed biological knowledge [15]. Reuss and coworkers have developed structured unsegregated dynamic models (state averaged over the population) of both S. cerevisiae [16,17] and E. coli [18] and have studied the in vivo dynamics of key pathways such as the pentose phosphate pathway (PPP) and sugar transport in S. cerevisiae [19,20]. Dynamic models of varying complexity has also been constructed to study the penicillin biosynthetic pathway [21–23], threonine pathway dynamics [24,25], regulatory architectures in metabolic reaction networks [26,27], red-blood cell metabolic pathways [28–32] and plant metabolic pathways [33–35]. Stoichiometric models, such as those used in lux balance analysis (FBA), have also emerged as powerful analysis tools that couple observed extracellular phenomena (uptake/production rates, growth rate, product and biomass yields, etc.) with the intracellular carbon lux and energy distribution. Constraints based stoichiometric models do away with kinetics in favor of a pseudo-steady-state picture of metabolism. 
FBA and stoichiometric models have been employed to calculate genomic-scale snapshots of several organisms as well as portraits of key subnetworks such as central carbon metabolism. One of the irst examples of what would evolve into FBA was the analysis of butyric-acid bacteria by Papoutsakis [36–38]. Later, Varma, Palsson, and coworkers employed a stoichiometric model of E. coli W3110 to study oxygen limitation and by-product secretion [39,40]. Vallino and Stephanopoulos employed FBA to explore Corynebacterium glutamicum during lysine overproduction [41,42], while Sauer et al., characterized the metabolic capabilities of ribolavin producing B. subtilis [43]. Pramanik and Keasling explored the impact of time varying biomass composition and E. coli metabolism [44,45] while Maranas and coworkers explored the performance limits of E. coli subject to gene additions or deletions [46], the coupling of metabolic luxes in large-scale networks [47], the generation of optimal gene deletion strategies [48], the production of lactic acid in E. coli [49] and the computational identiication of reaction activation/inhibition or elimination candidates in metabolic networks [50]. Edwards, Schilling, Palsson, and coworkers extended FBA to genomic-scale metabolic reconstructions of Helicobacter pylori 26,695 (389 reactions) [51], E. coli MG1655 (740 reactions) [52,53], E. coli K-12 (931 reactions) [54], S. cerevisiae (1173 reactions) [55] and most recently to the human metabolic map with a genome scale reconstruction consisting of 3,311 metabolic and transport reactions and 2,766 metabolites [56]. 
An attractive feature of constraints-based models is the relative ease of computation (solving a linear program or determining a matrix inverse) and the ability to directly incorporate process information, for example on-line CO2, O2, or cell mass measurements, into the constraints (see Savinell and Palsson for a discussion of optimal measurement selection [57] or Becker et al. for FBA software [58]). In addition to physiological measurements, 13C-NMR/GC-MS labeling techniques have been employed by many groups to impose additional constraints on the flux calculation [59–74]. Sauer et al. (and others) have pushed 13C-enhanced metabolic flux estimation beyond serial experiments into the realm of parallel high-throughput data generation; see Ref. [75].
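The "matrix inverse" route can be illustrated with a minimal sketch for a fully determined toy network. Everything in it is a hypothetical assumption made for illustration and is not taken from the chapter: the two-metabolite network, its stoichiometric coefficients, the measured exchange fluxes, and the helper names `solve_linear` and `intracellular_fluxes`. Growth dilution is neglected, so the pseudo-steady-state balance reduces to a square linear system relating the unknown intracellular fluxes to the measured exchange fluxes. An underdetermined network would instead call for a linear program, which is not shown here.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(A)
    # Augmented matrix [A | b] with rational entries to avoid round-off.
    M = [[Fraction(a) for a in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("stoichiometric matrix is singular; system is not determined")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot equals 1.
        p = M[col][col]
        M[col] = [entry / p for entry in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# Hypothetical toy network (illustration only):
#   r1: A -> B        r2: A -> 2 B      (unknown intracellular fluxes)
#   q1: uptake of A   q2: secretion of B (measured exchange fluxes)
ALPHA = [[-1, -1],   # balance on metabolite A
         [1, 2]]     # balance on metabolite B
BETA = [[1, 0],      # q1 supplies A
        [0, -1]]     # q2 removes B


def intracellular_fluxes(alpha, beta, q):
    """Pseudo-steady state without dilution: alpha v + beta q = 0, so alpha v = -(beta q)."""
    rhs = [-sum(b * ql for b, ql in zip(row, q)) for row in beta]
    return solve_linear(alpha, rhs)


q_measured = [10, 14]  # assumed measurements, arbitrary flux units
v = intracellular_fluxes(ALPHA, BETA, q_measured)
print([float(flux) for flux in v])
```

Exact rational arithmetic is used only to keep the toy result free of round-off; a realistic flux calculation would normally rely on a numerical linear algebra library.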

19.3 The Multiscale Nature of Metabolism Follows from the Central Dogma of Molecular Biology

The central dogma of molecular biology (Figure 19.1), i.e., that information stored in DNA is transcribed to an intermediate mRNA message, which is then translated into a working protein machine that carries out a catalytic, regulatory, or structural role in the cell, dictates that metabolism is hierarchical or multiscale. This is true because the different layers of metabolism are integrated via explicit dependencies (e.g., translation of protein $j$ cannot occur without the corresponding mRNA transcript) or via feedback loops, as described by Csete and Doyle [76], which have developed over evolutionary time to ensure robustness in the face of shifting external environments.

[Figure 19.1 image not reproduced. Labels recoverable from the schematic: Transcriptional programs (σ, P, Gene X, mRNA X); Translational programs (mRNA X, pX); Metabolic programs (A, B).]

FIGURE 19.1 (See color insert following page 10-18.) Schematic of the central dogma of molecular biology. Genetic information is transcribed into mRNA, which is then translated into protein machines. The layers and programs of metabolism are coupled and hierarchical; transcriptional programs influence translation, which then drives metabolic programs. Metabolic programs in turn influence transcription, thus forming feedback loops that integrate the metabolic layers.

Traditional dynamic metabolic models or constraints-based stoichiometric models do not, in general, systematically account for metabolic regulation and control. This is not to say that regulation and control are neglected; rather, they are often incorporated into the kinetics for dynamic models or into the constraints for stoichiometric models. The distinction between traditional metabolic modeling approaches and the multiscale paradigm is that the regulation and control programs governing the dynamics of the different metabolic hierarchies are explicitly and systematically incorporated into the model formulation.

A general unsegregated multiscale model of metabolism consists of mass balance equations governing the time rate of change of $Z$ mRNA species, $E$ protein species, and $M$ metabolite species, where each mass balance explicitly, though not necessarily mechanistically, accounts for the output of the control programs managing metabolism. Thus, the mass balance around transcript $j$ under condition $k$, denoted by $z_{jk}$, is given by:

$$\frac{dz_{jk}}{dt} = r_{x,z_{jk}}\, u_{jk} - \left(k_{d,z_{jk}} + \mu_k\right) z_{jk} + \eta_{jk}, \qquad j = 1, 2, \ldots, Z \tag{19.1}$$

where $r_{x,z_{jk}}$ denotes the specific rate of expression of transcript $j$ under condition $k$ and $u_{jk}$ denotes the control or management variable governing expression of transcript $j$. We assume transcript degradation is first-order, where $k_{d,z_{jk}}$ denotes the rate constant governing the degradation of transcript $j$ in condition $k$, and $\eta_{jk}$ denotes the specific rate of constitutive expression of transcript $j$ under condition $k$. The quantity $\mu_k$ denotes the specific growth rate in condition $k$.

The transcript $z_{jk}$ can be translated to form protein $e_{jk}$, where the specific concentration of $e_{jk}$ obeys the mass balance:

$$\frac{de_{jk}}{dt} = r_{T,e_{jk}}(z_{jk}, k)\, w_{jk} - \left(k_{d,e_{jk}} + \mu_k\right) e_{jk}, \qquad j = 1, 2, \ldots, E \tag{19.2}$$

The specific rate of translation of transcript $j$ in condition $k$, denoted by $r_{T,e_{jk}}(z_{jk}, k)$, is a function of the transcript concentration and is modified by the $w_{jk}$ term, which denotes the control or management variable governing the translation of transcript $j$ under condition $k$. We assume protein degradation is non-specific and first-order, where $k_{d,e_{jk}}$ denotes the rate constant governing the degradation of protein $j$ in condition $k$. The mass balance around metabolite $j$ in condition $k$, denoted by $x_{jk}$, is

$$\frac{dx_{jk}}{dt} = \sum_{i=1}^{R} \alpha_{ji}\, r_i(e, x, k)\, v_i + \sum_{l=1}^{Q} \beta_{jl}\, q_l(t) - \mu_k x_{jk}, \qquad j = 1, 2, \ldots, M \tag{19.3}$$

where $\alpha_{ji}$ and $\beta_{jl}$ denote the stoichiometric coefficients relating metabolite $x_{jk}$ to reaction $r_i$ and transport flux $q_l$, respectively. The term $v_i$ denotes the control variable describing enzyme activity regulation, while $R$ denotes the number of intracellular reaction rates or fluxes (unknown) and $Q$ denotes the number of exchange fluxes (measured). The last term in the metabolite mass balances accounts for dilution of the specific metabolite concentration by cell growth.
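To make the structure of Equations 19.1 through 19.3 concrete, the following minimal sketch integrates the three balances for a single transcript, a single protein, and a single metabolite. It rests on several assumptions that are not part of the chapter: all parameter values are arbitrary, the control variables $u$, $w$, and $v$ are held constant rather than computed by control programs, translation is taken as first-order in the transcript ($r_T = k_T z$), the single reaction consumes the metabolite with Michaelis-Menten kinetics, and the exchange flux $q$ is a constant uptake. The names `Params`, `derivatives`, `rk4_step`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Params:
    # All values are arbitrary illustrative assumptions.
    r_x: float = 2.0    # specific transcription rate r_x
    u: float = 0.5      # transcriptional control variable (held constant here)
    eta: float = 0.1    # constitutive expression rate
    kd_z: float = 1.0   # first-order transcript degradation constant
    k_T: float = 5.0    # assumed first-order translation constant, r_T = k_T * z
    w: float = 1.0      # translational control variable (held constant here)
    kd_e: float = 0.2   # first-order protein degradation constant
    k_cat: float = 3.0  # assumed Michaelis-Menten turnover constant
    K_m: float = 0.5    # assumed Michaelis-Menten saturation constant
    v: float = 1.0      # enzyme activity control variable (held constant here)
    q: float = 1.0      # constant exchange (uptake) flux, beta = +1
    mu: float = 0.3     # specific growth rate


def derivatives(state, p):
    """Right-hand sides of Eqs. 19.1-19.3 for one transcript, protein, and metabolite."""
    z, e, x = state
    dz = p.r_x * p.u - (p.kd_z + p.mu) * z + p.eta      # Eq. 19.1
    de = p.k_T * z * p.w - (p.kd_e + p.mu) * e          # Eq. 19.2
    rate = p.k_cat * e * x / (p.K_m + x)                # assumed kinetics r(e, x)
    dx = -rate * p.v + p.q - p.mu * x                   # Eq. 19.3 with alpha = -1
    return (dz, de, dx)


def rk4_step(state, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivatives(state, p)
    k2 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), p)
    k3 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), p)
    k4 = derivatives(tuple(s + dt * k for s, k in zip(state, k3)), p)
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(p, state=(0.0, 0.0, 0.0), t_end=50.0, dt=0.01):
    """Integrate from the given initial state (z, e, x) to t_end with a fixed step."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, p)
    return state


params = Params()
z_final, e_final, x_final = simulate(params)
print(z_final, e_final, x_final)
```

With constant control variables, Equations 19.1 and 19.2 are linear, so the simulated transcript and protein levels can be compared against the closed-form steady states $z^* = (r_x u + \eta)/(k_d + \mu)$ and $e^* = k_T z^* w/(k_{d,e} + \mu)$ as a sanity check.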

19.4 Construction of First-Principle and Reverse-Engineered Models of Transcriptional Programs

The $u_{jk}$ variable modifying the kinetic rate of transcription in Equation 19.1 could be thought of as the output of a transcriptional program governing the expression of gene $j$ in condition $k$. If $u_{jk}$