Springer Optimization and Its Applications 153
Asoke Kumar Bhunia Laxminarayan Sahoo Ali Akbar Shaikh
Advanced Optimization and Operations Research
Springer Optimization and Its Applications, Volume 153

Series Editor: Panos M. Pardalos, University of Florida, Gainesville, FL, USA
Honorary Editor: Ding-Zhu Du, University of Texas at Dallas, Richardson, TX, USA

Advisory Editors: J. Birge, University of Chicago, Chicago, IL, USA; S. Butenko, Texas A&M University, College Station, TX, USA; F. Giannessi, University of Pisa, Pisa, Italy; S. Rebennack, Karlsruhe Institute of Technology, Karlsruhe, Baden-Württemberg, Germany; T. Terlaky, Lehigh University, Bethlehem, PA, USA; Y. Ye, Stanford University, Stanford, CA, USA
Aims and Scope: Optimization has continued to expand in all directions at an astonishing rate. New algorithmic and theoretical techniques are continually being developed, and the diffusion into other disciplines is proceeding at a rapid pace, with a spotlight on machine learning, artificial intelligence, and quantum computing. Our knowledge of all aspects of the field has grown even more profound. At the same time, one of the most striking trends in optimization is the constantly increasing emphasis on the interdisciplinary nature of the field. Optimization has been a basic tool in areas including, but not limited to, applied mathematics, engineering, medicine, economics, computer science, operations research, and other sciences. The series Springer Optimization and Its Applications (SOIA) aims to publish state-of-the-art expository works (monographs, contributed volumes, textbooks, handbooks) that focus on the theory, methods, and applications of optimization. Topics covered include, but are not limited to, non-linear optimization, combinatorial optimization, continuous optimization, stochastic optimization, Bayesian optimization, optimal control, discrete optimization, multi-objective optimization, and more. New additions to the series portfolio include works at the intersection of optimization and machine learning, artificial intelligence, and quantum computing. Volumes in this series are indexed by Web of Science, zbMATH, Mathematical Reviews, and SCOPUS.
More information about this series at http://www.springer.com/series/7393
Asoke Kumar Bhunia, Department of Mathematics, The University of Burdwan, Burdwan, West Bengal, India

Laxminarayan Sahoo, Department of Mathematics, Raniganj Girls’ College, Raniganj, West Bengal, India

Ali Akbar Shaikh, Department of Mathematics, The University of Burdwan, Burdwan, West Bengal, India
ISSN 1931-6828  ISSN 1931-6836 (electronic)
Springer Optimization and Its Applications
ISBN 978-981-32-9966-5  ISBN 978-981-32-9967-2 (eBook)
https://doi.org/10.1007/978-981-32-9967-2
Mathematics Subject Classification (2010): 90Cxx, 90Bxx, 91Axx, 90B05

© Springer Nature Singapore Pte Ltd. 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Dedicated to our respected teachers Prof. M. Maiti and Prof. T. K. Pal Vidyasagar University West Bengal, India and Prof. M. Basu University of Kalyani West Bengal, India
Preface
Due to the growing complexity of day-to-day decision-making problems, engineers, system analysts and managers face challenges in offering the best solutions to these problems. Most of these real-life decision-making problems can be modelled as non-linear constrained or unconstrained optimization problems. Optimization is the art of finding the best alternative(s) among a given set of options: the process of finding the maximum or minimum possible value which a given function can attain in its domain of definition is known as optimization. Operations research, on the other hand, is a relatively new, interdisciplinary discipline. Mainly concerned with different solution techniques, it provides new insights and capabilities for determining better solutions to decision-making problems with higher accuracy, competence and confidence. This book, Advanced Optimization and Operations Research, is intended to introduce the fundamental and advanced concepts of optimization and operations research. It has been written in simple and lucid language, according to the syllabi of all major universities that include optimization and operations research in their graduate and post-graduate courses in science, arts, commerce, engineering and management. Maximum care has been taken in preparing the chapters to include as many different topics as possible. Several textbooks are available on the market on optimization and operations research as separate topics; our primary motivation for writing this book was the absence of a single book covering these two areas together. Existing books have been written in a conventional way, whereas in this book, all 18 chapters are written in detail with equal emphasis. Each chapter states its objective at the beginning and includes a sufficient number of solved examples and exercises, which will be of much help to aspirants preparing for competitive examinations like NET, SET, GATE, CAT and MAT.
Chapter 1 highlights the historical perspective, various definitions, characteristic scope and purpose of operations research. Chapter 2 discusses convex and concave functions and their properties. Chapter 3 introduces the simplex method. In Chaps. 4–6, advanced linear programming techniques, like the revised simplex method, dual simplex method and bounded variable technique, are discussed. Chapter 7
introduces the post-optimality analysis of linear programming problems (LPPs), and Chap. 8 deals with techniques for solving LPPs with integer values. Chapters 9 through 11 deal with the basics of unconstrained and constrained non-linear optimization problems with equality and inequality constraints. In Chap. 12, quadratic programming problems and their solution by Wolfe’s modified simplex method and Beale’s method are explained in detail. Chapter 13 covers game theory, and Chap. 14 discusses project management, which includes the topics of critical path analysis, PERT analysis and time-cost trade-off. Chapter 15 presents the concepts of queueing theory, including Poisson and non-Poisson queues. Chapter 16 presents ideas on flow and potential in networks, including the maximum flow problem and its solution procedure. Chapter 17 introduces deterministic, probabilistic and fuzzy inventory control models. Chapter 18 discusses the preliminaries of several topics, like matrices, determinants, vectors, probability and Markov processes, which are essential for studying the different chapters of this book.

Salient features of this book are as follows:

• It is written in a self-instructional mode so that readers can easily understand the subject matter without any help from other sources.
• Each topic has been discussed in detail with numerous worked-out examples to help students properly understand the underlying theories.
• Advanced-level (research-level) topics have also been included.
• A set of exercises has been included at the end of each chapter.
• It helps students prepare for different competitive examinations.

We would like to express our sincere gratitude to all the faculty members of the Department of Mathematics at the University of Burdwan, who have contributed greatly to the success of this project. We also express our heartfelt thanks to our research scholars, Avijit Duary, Subash Chandra Das, Tanmoy Banerjee, Rajan Mondal, Md.
Akhtar, Md. Sadikur Rahaman and Nirmal Kumar—for their constant help in preparing the manuscript. Finally, we wish to thank Shamim Ahmad, Tooba Shafique and the other publishing teams of Springer Nature for producing the book in its present form. West Bengal, India
Asoke Kumar Bhunia Laxminarayan Sahoo Ali Akbar Shaikh
Contents

1 Introduction to Operations Research  1
   1.1 Objectives  1
   1.2 Introduction  1
   1.3 Definitions of Operations Research  2
   1.4 Applications of OR  3
   1.5 Main Phases of OR  5
   1.6 Scope of OR  6
   1.7 Modelling in OR  7
   1.8 Development of OR in India  10
   1.9 Role of Computers in OR  11

2 Convex and Concave Functions  13
   2.1 Objective  13
   2.2 Introduction  13
   2.3 Convex and Concave Functions  13
   2.4 Geometrical Interpretation  15
   2.5 Properties of Convex and Concave Functions  18
   2.6 Differentiable Convex and Concave Functions  20
   2.7 Differentiable Strictly Convex and Concave Functions  24

3 Simplex Method  29
   3.1 Objective  29
   3.2 General Linear Programming Problem  29
   3.3 Slack and Surplus Variables  30
   3.4 Canonical and Standard Forms of LPP  31
   3.5 Fundamental Theorem of LPP  33
   3.6 Replacement of a Basis Vector  37
   3.7 Improved Basic Feasible Solution  39
   3.8 Optimality Condition  40
   3.9 Unbounded Solution  42
   3.10 Simplex Algorithm  45
   3.11 Simplex Table  46
   3.12 Use of Artificial Variables  53
      3.12.1 Big M Method  54
      3.12.2 Solution of Simultaneous Linear Equations by Simplex Method  55
      3.12.3 Inversion of a Matrix by Simplex Method  56
      3.12.4 Two-Phase Method  66
   3.13 Exercises  76

4 Revised Simplex Method  79
   4.1 Objective  79
   4.2 Introduction  79
   4.3 Standard Forms of Revised Simplex Method  80
   4.4 Standard Form I of Revised Simplex Method  80
   4.5 Computational Procedure of Standard Form I of Revised Simplex Method  83
   4.6 Standard Form II of Revised Simplex Method  89
      4.6.1 Charne’s Big M Revised Simplex Method  90
      4.6.2 Two-Phase Revised Simplex Method  95
   4.7 Advantages of Revised Simplex Method  106
   4.8 Difference Between Revised Simplex Method and Simplex Method  107
   4.9 Exercises  107

5 Dual Simplex Method  109
   5.1 Objective  109
   5.2 Introduction  109
   5.3 Dual Simplex Algorithm  114
   5.4 Difference Between Simplex Method and Dual Simplex Method  120
   5.5 Modified Dual Simplex Method  120
   5.6 Exercises  124

6 Bounded Variable Technique  127
   6.1 Objective  127
   6.2 Introduction  127
   6.3 The Computational Procedure  128
   6.4 Exercises  135

7 Post-optimality Analysis in LPPs  137
   7.1 Objectives  137
   7.2 Introduction  137
   7.3 Discrete Changes in the Profit Vector  138
   7.4 Discrete Changes in the Requirement Vector  145
   7.5 Discrete Changes in the Coefficient Matrix  151
   7.6 Addition of a Variable  156
   7.7 Deletion of a Variable  158
   7.8 Addition of a New Constraint  159
   7.9 Exercises  166

8 Integer Programming  171
   8.1 Objectives  171
   8.2 Introduction  171
   8.3 Gomory’s All-Integer Cutting-Plane Method  172
   8.4 Construction of Gomory’s Constraint  173
   8.5 Gomory’s Mixed-Integer Cutting-Plane Method  182
   8.6 Construction of Additional Constraint for Mixed-Integer Programming Problems  182
   8.7 Branch and Bound Method  187
   8.8 Exercises  193

9 Basics of Unconstrained Optimization  197
   9.1 Objectives  197
   9.2 Optimization of Functions of a Single Variable  197
   9.3 Optimization of Functions of Several Variables  201
   9.4 Exercises  209

10 Constrained Optimization with Equality Constraints  211
   10.1 Objective  211
   10.2 Introduction  211
   10.3 Direct Substitution Method  212
   10.4 Method of Constrained Variation  215
      10.4.1 Geometrical Interpretation of Admissible Variation  217
      10.4.2 Necessary Conditions for Extremum of a General Problem for Constrained Variation Method  217
      10.4.3 Sufficient Conditions for a General Problem in Constrained Variation Method  220
   10.5 Lagrange Multiplier Method  225
      10.5.1 Necessary Conditions for Optimality  226
      10.5.2 Sufficient Conditions for Optimality  228
      10.5.3 Alternative Approach for Testing Optimality  230
   10.6 Interpretation of Lagrange Multipliers  241
   10.7 Applications of Lagrange Multiplier Method  245
   10.8 Exercises  246

11 Constrained Optimization with Inequality Constraints  249
   11.1 Objective  249
   11.2 Introduction  249
   11.3 Feasible Direction  251
   11.4 Kuhn-Tucker Conditions  253
      11.4.1 Kuhn-Tucker Necessary Conditions  253
      11.4.2 Kuhn-Tucker Sufficient Conditions  255
   11.5 Exercises  267

12 Quadratic Programming  269
   12.1 Objective  269
   12.2 Introduction  269
   12.3 Solution Procedure of QPP  271
   12.4 Wolfe’s Modified Simplex Method  275
   12.5 Advantage and Disadvantage of Wolfe’s Method  276
   12.6 Beale’s Method  292
   12.7 Beale’s Algorithm for QPPs  293
   12.8 Exercises  301

13 Game Theory  303
   13.1 Objectives  303
   13.2 Introduction  303
   13.3 Cooperative and Non-cooperative Games  304
   13.4 Antagonistic Game  307
   13.5 Matrix Games or Rectangular Games  308
   13.6 Dominance  324
   13.7 Graphical Solution of 2 × n or m × 2 Games  330
   13.8 Matrix Games and Linear Programming Problems (LPPs)  336
   13.9 Infinite Antagonistic Game  341
   13.10 Continuous Game  347
   13.11 Separable Game  348
   13.12 Exercises  364

14 Project Management  367
   14.1 Objectives  367
   14.2 Introduction  367
   14.3 Phases of Project Management  368
   14.4 Advantages of Network Analysis  369
   14.5 Basic Components  369
   14.6 Common Errors  372
   14.7 Rules for Network Construction  373
   14.8 Numbering of Events  374
   14.9 Critical Path Analysis  375
   14.10 PERT Analysis  380
   14.11 Difference Between PERT and CPM  386
   14.12 Project Time-Cost Trade-Off  387
   14.13 Exercises  399

15 Queueing Theory  403
   15.1 Objective  403
   15.2 Basics of Queueing Theory  403
      15.2.1 Introduction  403
      15.2.2 Reason Behind the Queue  404
      15.2.3 Characteristics of Queueing Systems  405
      15.2.4 Important Definitions  409
      15.2.5 The State of the System  409
      15.2.6 Probability Distribution in a Queueing System  410
      15.2.7 Classification of Queueing Models  415
   15.3 Poisson Queueing Models  416
      15.3.1 Introduction  416
      15.3.2 Model (M/M/1) : (∞/FCFS/∞)  417
      15.3.3 Model (M/M/1) : (N/FCFS/∞)  428
      15.3.4 Model (M/M/C) : (∞/FCFS/∞)  434
      15.3.5 Model (M/M/C) : (N/FCFS/∞)  442
      15.3.6 Model (M/M/R) : (K/GD), K > R  448
      15.3.7 Power Supply Model  456
   15.4 Non-Poisson Queueing Models  459
      15.4.1 Introduction  459
      15.4.2 Model (M/Ek/1) : (∞/FCFS/∞)  460
      15.4.3 Model (M/G/1) : (∞/GD)  468
   15.5 Mixed Queueing Model and Cost  473
      15.5.1 Introduction  473
      15.5.2 Mixed Queueing Model  473
   15.6 Cost Models in Queueing System  477
   15.7 Exercises  479

16 Flow in Networks  487
   16.1 Objectives  487
   16.2 Introduction  487
   16.3 Graphs  488
   16.4 Minimum Path Problem  491
   16.5 Problem of Potential Difference  494
   16.6 Network Flow Problem  495
   16.7 Generalized Problem of Maximum Flow  499
   16.8 Formulation of an LPP of a Flow Network  500
   16.9 Exercises  517

17 Inventory Control Theory  521
   17.1 Objectives  521
   17.2 Introduction  521
      17.2.1 Types of Inventories  523
   17.3 Basic Terminologies in Inventory  524
   17.4 Classification of Inventory Models  526
   17.5 Purchasing Inventory Model with No Shortage (Model 1)  527
   17.6 Manufacturing Model with No Shortage (Model 2)  530
   17.7 Purchasing Inventory Model with Shortage (Model 3)  533
   17.8 Manufacturing Model with Shortage  538
   17.9 Multi-item Inventory Model  543
      17.9.1 Limitation on Inventories  544
      17.9.2 Limitation of Floor Space (or Warehouse Capacity)  546
      17.9.3 Limitation on Investment  547
   17.10 Inventory Models with Price Breaks  552
      17.10.1 Purchasing Inventory Model with Single Price Break  553
      17.10.2 Purchase Inventory Model with Two Price Breaks  555
   17.11 Probabilistic Inventory Model  557
      17.11.1 Single Period Inventory Model with Continuous Demand (No Replenishment Cost Model)  557
      17.11.2 Single Period Inventory Model with Instantaneous Demand (No Replenishment Cost Model)  564
   17.12 Fuzzy Inventory Model  570
   17.13 Exercises  576

18 Mathematical Preliminaries  581
   18.1 Objective  581
   18.2 Matrices  581
   18.3 Determinant  587
   18.4 Rank of a Matrix  589
   18.5 Solution of a System of Linear Equations  590
   18.6 Vectors  591
   18.7 Probability  596
   18.8 Markov Process  610

Bibliography  615
Index  619
About the Authors
Asoke Kumar Bhunia is Professor at the Department of Mathematics, The University of Burdwan, West Bengal, India. He obtained his Ph.D. in Mathematics and M.Sc. in Applied Mathematics from Vidyasagar University, India. His research interests include computational optimization, soft computing, interval mathematics and interval ranking. He has published more than 125 research papers in various national and international journals of repute and is a reviewer for several Science Citation Index (SCI) journals. He has guided 13 Ph.D. and two M.Phil. students. An author of four research monographs and six book chapters, he is an Indian National Science Academy (INSA) Visiting Fellow and a former Associate Editor of the Springer journal OPSEARCH.

Laxminarayan Sahoo is Assistant Professor at the Department of Mathematics, Raniganj Girls’ College, West Bengal, India. He obtained his Ph.D. in Mathematics from the University of Burdwan and his M.Sc. in Applied Mathematics from Vidyasagar University, India. His research interests include reliability optimization, computational optimization, artificial intelligence, soft computing, interval mathematics, interval ranking and fuzzy mathematics. He has published several research papers in various national and international journals of repute. He has received a Ministry of Human Resource Development (MHRD) fellowship from the Government of India and the Professor M. N. Gopalan Award for the best Ph.D. thesis in operations research from the Operational Research Society of India.

Ali Akbar Shaikh is Assistant Professor at the Department of Mathematics, The University of Burdwan, West Bengal, India. Earlier, he was a postdoctoral fellow at the School of Engineering and Sciences, Tecnológico de Monterrey, Mexico. In 2017, he was awarded Level 1 (on a scale of 0–3) by the National System of Researchers (SNI) of the Government of Mexico. He earned his Ph.D. and M.Phil. in Mathematics from the University of Burdwan, and his M.Sc. in Applied Mathematics from the University of Kalyani, India. Dr. Shaikh has published more than 37 research papers in different international journals of repute. His research interests include inventory control, interval optimization and particle swarm optimization.
Chapter 1
Introduction to Operations Research
1.1 Objectives
The objectives of this chapter are to discuss:

• The need for using Operations Research (OR)
• The historical perspective of the approach of OR
• Several definitions of OR in different contexts and various phases of scientific study
• The classification and usage of various models for solving a problem under consideration
• Applications of OR in different areas.
1.2 Introduction
The term ‘operations research (OR)’ was first introduced by McClosky and Trefthen in 1940. The subject came into existence in a military context: the main origin of the development of OR was the Second World War. During that war, the military services of England gathered scientists and researchers from various disciplines and formed a team to assist them in solving strategic and tactical problems relating to the air and land defence of the country. As they had very limited military resources, it was necessary to decide upon their most effective utilization, e.g. efficient ocean transport, effective bombing, etc. The mission of the team was to formulate specific proposals and plans to help the military services make decisions on the optimal utilization of military resources and efforts, and also to implement those decisions effectively. The members of this team were not actually engaged in military operations. However, they worked as advisors in winning the war, and the scientific and systematic approaches involved in OR provided good
intellectual support to the strategic initiatives of the military services. In this context, Arthur C. Clarke gave an appropriate definition of OR, saying ‘OR is an art of winning the war without actually fighting it’.

After seeing the encouraging results of the British OR team, the US military department was quickly motivated to start similar activities in their country. The major developments of the US teams were the invention of new fighting patterns, planning, sea mining and the effective utilization of electronic equipment, together with their successful application in war. In the USA, the work of the OR team was given different names, like Operational Analysis, Operational Evaluations, Operations Research, System Analysis, System Evaluation, etc. Among these, the name Operations Research became widely accepted.

After the Second World War, industrial managers and executives who were seeking better solutions to their complex problems were attracted by the success of the military teams. The most common problem arising in industry was which method(s) to adopt so that the total cost is minimized or the total profit is maximized. The first mathematical technique in this field, called the simplex method for linear programming problems (LPPs), was developed in 1947 by George B. Dantzig of the US Air Force. Since then, new techniques and applications have been developed through the efforts and cooperation of interested researchers in both academic institutions and industry. The impact of OR can now be felt in different areas of government as well as in the private sector. A number of consulting firms are currently engaged in OR activities. Besides military and business applications, OR activities include solving the problems of transportation systems, libraries, hospitals, city planning, financial decision-making, etc.
Many Indian industries, like Indian Railways, Indian Airlines, defence organizations, Hindustan Unilever Ltd., TISCO, FCI, Telecom, Union Carbide and Hindustan Steel, are using OR techniques to obtain their best alternatives.
1.3 Definitions of Operations Research
OR has been defined in various ways. It is not possible to provide a uniformly acceptable definition of OR because of its wide scope of applications. A few opinions about the definition of OR are given as follows:

1. ‘OR is the art of winning the war without actually fighting it’. [Arthur C. Clarke]
2. ‘OR is a scientific method of providing executive departments with a quantitative basis for decisions regarding the operations under their control’. [P. M. Morse and G. E. Kimball (1946)]
3. ‘OR is the scientific method of providing the executive with an analytical and objective basis for decisions’. [P. M. S. Blackett (1948)]
4. ‘OR is the art of giving bad answers to problems to which otherwise worse answers are given’. [T. L. Saaty]
5. ‘OR is a scientific approach to problem solving for executive management’. [H. M. Wagner]
6. ‘OR is an experimental and applied science devoted to observing, understanding and predicting the behaviour of purposeful man-machine systems, and OR workers are actively engaged in applying this knowledge to practical problems in business, government and society’. [OR Society of America]
7. ‘OR is the application of scientific methods, techniques and tools to problems involving the operations of a system so as to provide those in control of the system with optimum solutions to the problem’. [Churchman, Ackoff and Arnoff (1957)]
8. ‘OR is applied decision theory. It uses many scientific, mathematical or logical means to attempt to cope with the problems that confront the executive when he tries to achieve a thoroughgoing rationality in dealing with decision problems’. [D. W. Miller and M. K. Starr]
9. ‘OR has been characterized by the use of scientific knowledge through interdisciplinary team effort for the purpose of determining the best utilization of limited resources’. [H. A. Taha]
10. ‘The term “OR” has hithertofore been used to connote various attempts to study operations of war by scientific methods. From a more general point of view, OR can be considered to be an attempt to study those operations of modern society which involve organizations of men or of men and machines’. [P. M. Morse (1948)]
11. ‘OR is a management activity pursued in two complementary ways—one half by the free and bold exercise of commonsense untrammeled by any routine, and the other half by the application of a repertoire of well-established precreated methods and techniques’. [Jagjit Singh (1968)]
12. ‘OR is the application of the methods of science to complex problems in the direction and management of large systems of men, machines, materials and money in industry, business and defence.
The distinctive approach is to develop a scientific model of the system, incorporating measurements of factors such as chance and risk with which to predict and compare the outcomes of alternative decisions, strategies or controls. The purpose is to help management to determine its policy and actions scientifically’. [Operational Research Society, UK (1971)]
1.4 Applications of OR
OR is mainly concerned with quantitative techniques for decision-making. It provides new insights and capabilities to determine better solutions to decision-making problems with higher speed, competence and confidence. During the last few decades, OR has been applied successfully in various areas such as defence, government sectors, service organizations and industry. Some areas of management decision-making where the tools and techniques of OR are applied are outlined as follows:
1. Budgeting and investments
(i) Cash flow analysis, long-range capital requirements, dividend policies, investment portfolios
(ii) Credit policies, credit risk, etc.

2. Purchasing
(i) Rules for buying supplies with stable or varying prices
(ii) Determination of quantities and timing of purchases
(iii) Replacement policies, etc.

3. Production management
(i) Physical distribution
(a) Location and size of warehouses, distribution centres and retail outlets
(b) Distribution policy
(ii) Facilities planning
(a) Numbers and locations of factories, warehouses, hospitals, etc.
(b) Loading and unloading facilities for railways and trucks; determining the transport schedule
(iii) Manufacturing
(a) Production scheduling and sequencing
(b) Stabilization of production and employment, training, layoffs and optimum product mix
(iv) Maintenance and project scheduling
(a) Maintenance policies and preventive maintenance
(b) Project scheduling and allocation of resources

4. Marketing
(i) Product selection, timing, competitive actions
(ii) Number of sales representatives
(iii) Advertising media with respect to cost and time

5. Personnel management
(i) Selection of suitable personnel on minimum salary
(ii) Mixes of age and skills
(iii) Recruitment policies and assignment of jobs

6. Research and development
(i) Determination of the areas of concentration of research and development
(ii) Project selection
(iii) Reliability and alternative design
(iv) Determination of time-cost trade-off and control of development projects.
1.5 Main Phases of OR
It is difficult to prescribe a single fixed procedure for conducting an OR project; however, an OR study generally involves the following major phases.

Phase I: Formulation of the Problem
To find the solution of a problem, it first has to be formulated. Formulation of a problem for research consists of identifying, defining and specifying the components and measures of a decision-making problem. To formulate a problem, the following information is required:
(i) The objectives
(ii) The controlled variables and their ranges
(iii) The uncontrolled variables that may affect the outcomes of the available choices.
Since it is difficult to extract a right answer from the wrong problem, this phase should be executed with considerable care.

Phase II: Construction of a Mathematical Model
After formulation of the problem, the next phase is to reformulate it into a specific form which is most useful for analysis. The most convenient approach is to construct a mathematical model which represents the system. Three types of models are commonly used in OR and other sciences: (i) iconic, (ii) analogue and (iii) symbolic/mathematical.

Phase III: Deriving a Solution from the Model
After formulating the mathematical model of the problem under consideration, the next phase is to find a solution from this model. In OR, the objective is to search for an optimal solution which optimizes (maximizes or minimizes) the objective function of the model. In addition to solving the model, it is sometimes also essential to perform a sensitivity analysis to study the effect of changes in the system parameters.

Phase IV: Testing the Model and Its Solution (Updating the Model)
After deriving a solution from the model of the problem, the entire solution should be tested to determine whether or not there is any error.
This may be done by re-examining the formulation of the problem and comparing it with the model, which may help to reveal any mistakes.
The solution of an OR problem should yield a better performance than a solution obtained by some alternative procedure. A solution derived from the model of the problem should be tested further for its optimality. Phase V: Implementing the Solution The final phase of an OR study is to implement the optimal solution derived by the OR team. Due to the constantly changing scenarios in the world, the model and its solution may not remain valid for a long period of time. With changes of scenario, the model, its solution and the resulting course of action can be modified accordingly.
1.6 Scope of OR
OR has considerable scope in the different sectors of our society. In general, wherever there is a problem, there is an OR technique for solving it. In addition to military applications, OR is widely used in business and industrial organizations, among others. Here, we shall discuss the scope of OR in some particular areas.

(i) Defence: During World War II, the OR teams of Britain and America developed different techniques and strategies which helped them to win different battles. In modern warfare, military operations are carried out by the air force, army and navy. Therefore, it is necessary to formulate optimum strategies which will give maximum benefits. Because OR helps military personnel to select the best possible strategy to win battles, OR has extensive applications in defence.

(ii) Industry: After observing the successful applications of OR in the military, industry personnel became interested in this field. When a small organization expands, and in larger organizations in general, it is not possible to manage everything within a single department. As a result, the administration of an organization can establish the following departments with different objectives:

(a) Production department
The objective of this department is to produce an item of better quality while minimizing the manufacturing cost/time of the produced item.

(b) Marketing department
The objectives of this department are to:
• Promote the item/product to customers through attractive advertisements in the electronic and print media and also through sales representatives
• Maximize the amount of profit
• Minimize the cost of sales.
(c) Finance department
The objective of this department is to minimize the capital required to maintain any level of business.

These departments can come into conflict with each other as their policies may conflict. This difficulty can be overcome by the application of OR techniques.

(iii) Life Insurance Companies: OR techniques can be applied to determine the premium rates of different policies in the best interest of the organization.

(iv) Agriculture: With an increasing population and consequent shortages of food, there is a need to increase the agricultural output of a country. However, many problems arise for the agricultural department of a country, e.g. climatic conditions and the optimal distribution of water from available resources. Thus, there is a need for the best policies under the given restrictions, and OR techniques may be beneficial in determining these policies.

(v) Planning: Careful planning plays a crucial role in the economic development of a country, and OR techniques may be used in such planning. Thus, OR has extensive applications in the planning of different sectors of a country.
1.7 Modelling in OR
A model in OR may be defined as an idealized representation of a real-life system in which only the basic aspects or the most important features of a typical problem under investigation are considered. The objective of a model is to analyse the behaviour of the system for the purpose of improving its performance.

Advantages of a Model
A model has many advantages over a verbal description of a problem. Some of them are as follows:
(i) It describes a problem much more concisely.
(ii) It provides a logical and systematic approach to the problem.
(iii) It enables the use of high-powered mathematical techniques to analyse the problem.
(iv) It indicates the limitations and scope of the problem.
(v) It tends to make the overall structure of the problem more comprehensible.

Disadvantages of a Model
Models also have a few disadvantages; these are as follows:
1. Models are only an attempt to understand an operation and should never be considered absolute in any sense.
2. The validity of any model with regard to the corresponding operation can only be verified by experimentation and the use of relevant data characteristics.
Types of Models
Three types of models are commonly used in OR: (i) iconic models, (ii) analogue models and (iii) mathematical models.

Iconic Models
These models represent the system as it is, but in a different size. Thus, iconic models are obtained by enlarging or reducing the size of the system. An iconic model is a pictorial representation of the system. Some common examples are photographs, drawings, maps and globes. Iconic models of the Sun and its planets are scaled down, while the model of an atom is scaled up so as to make it visible. These models have some advantages as well as disadvantages. The advantages are that they are specific, concrete and easy to construct. They can be studied more easily than the system itself. The disadvantages are that these models are difficult to manipulate for experimental purposes. To make any modification or improvement in these models is not easy. Adjustments with changing situations cannot be performed with these models.

Analogue Models
These models are more abstract than iconic models; thus, there is no ‘look-alike’ correspondence between these models and real-life items. In analogue models, one set of properties is used to represent another set of properties. For example, graphs are analogues, as distance is used to represent a wide variety of variables such as time, percentage, age, weight, etc. These models are easier to manipulate than iconic models. However, they are less specific and less concrete.

Mathematical (Symbolic) Models
Mathematical models are more abstract and more general in nature. They employ a set of mathematical symbols to represent the components of a real system. Thus, these models are mathematical equations or inequalities reflecting the structure of the system they represent. Inventory models and queueing models are some examples of symbolic models. These models can be classified into two categories: (i) deterministic (crisp) models and (ii) non-deterministic models.
In deterministic models, all the parameters and functional relationships are assumed to be known and precise. Linear programming and classical inventory models are examples of deterministic models. On the other hand, models in which at least one parameter or decision variable is a random variable or a fuzzy number or an interval-valued number are called non-deterministic models. These models reflect the complexity of the real world and the uncertainty surrounding it. In the context of generality, models can also be categorized as (i) specific models and (ii) general models. When a model presents a system at some specific time, it is known as a specific model. In these models, if a time factor is considered, they are termed dynamic models; otherwise, they are called static models. An inventory problem of determining the economic order quantity for the next period with a uniform demand rate is an example of a static model, whereas the same problem with a time-dependent demand is an example of a dynamic model. On the other
hand, simulation and heuristic models are called general models. These models do not yield an optimum solution to the problem, but they give a solution near to the optimal one.

General Solution Methods for OR Models
Generally, OR models are solved by the following three methods:
(i) Analytical methods
(ii) Numerical methods
(iii) The Monte Carlo technique.

Analytical Methods
In these methods, all the tools of basic mathematics (such as differential calculus, integral calculus, differential equations, difference equations, etc.) are used for solving an OR model.

Numerical Methods
These methods are iterative or trial-and-error methods. They are primarily used when analytical methods fail to derive a solution because of the complexity of the constraints and/or the number of variables. In this procedure, the algorithm is started with an initial approximate (trial) solution and continued with a set of given rules for improving it towards optimality. The trial solution is then replaced by the improved one, and the process is repeated for a fixed number of iterations or until no further improvement is possible.

Monte Carlo Technique
This technique is based on the random sampling of a variable’s possible values. It requires random numbers which may be converted into random variates whose behaviour is known from past experience. Donsker and Kac developed this technique, combining probability and a sampling technique to provide the solution of partial differential or integral equations. Hence, this technique is concerned with experiments on random numbers, and it provides solutions to complicated OR problems.
The Monte Carlo technique is useful in the following situations:
(i) When the mathematical and statistical problems are too complicated and alternative methods are needed
(ii) When it is not possible to gain any information from past experience for a particular problem
(iii) To estimate the parameters of a model.

Algorithm for the Monte Carlo Technique
The following steps are associated with the Monte Carlo technique:
Step 1. Draw a flow diagram to illustrate the general idea of the system.
Step 2. Make sample observations to select a suitable model for the system and determine the probability distribution for the variables of interest.
Step 3. Convert the probability distribution to a cumulative distribution.
Step 4. Select a sequence of random numbers with the help of random number tables.
Step 5. Determine the sequence of values of the variables of interest from the sequence of random numbers obtained in Step 4.
Step 6. Fit an appropriate standard mathematical function to the sequence of values obtained in Step 5.

Advantages and Disadvantages of Monte Carlo Techniques
The advantages of Monte Carlo techniques are as follows:
(i) They are very helpful in finding solutions of complicated mathematical problems which cannot be solved otherwise.
(ii) The difficulties of trial-and-error experimentation are avoided.

Monte Carlo techniques have the following disadvantages:
(i) These methods are time-consuming and costly ways of getting a solution to a problem.
(ii) These techniques do not provide an optimal solution to a problem. The solution is nearly optimal only when the size of the sample is sufficiently large.
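Steps 3 to 5 above can be sketched in code. The following minimal Python illustration is not from the text: the discrete demand distribution is hypothetical, and a pseudo-random generator stands in for the random number tables of Step 4. It converts a probability distribution to a cumulative one and maps uniform random numbers onto it.

```python
import random

# Hypothetical discrete probability distribution of a variable of
# interest, e.g. daily demand (Step 2 of the algorithm).
demand = [0, 1, 2, 3]
prob = [0.2, 0.4, 0.3, 0.1]

# Step 3: convert the probability distribution to a cumulative distribution.
cumulative = []
running = 0.0
for p in prob:
    running += p
    cumulative.append(running)

def sample_demand(u):
    """Map a uniform random number u in [0, 1) to a demand value (Step 5)."""
    for d, c in zip(demand, cumulative):
        if u < c:
            return d
    return demand[-1]

# Step 4: generate a sequence of random numbers.
random.seed(1)
samples = [sample_demand(random.random()) for _ in range(10_000)]

# The sample mean should approach the true mean of the distribution.
print(sum(samples) / len(samples))
```

As the sample grows, the empirical mean approaches the distribution's true mean, illustrating why the Monte Carlo solution is only nearly optimal for finite sample sizes.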
1.8 Development of OR in India
In India, OR started to appear when an OR unit was established at the Regional Research Laboratory in Hyderabad in the year 1949. At about the same time, Professor R. S. Verma of Delhi University set up an OR team in the Defence Science Laboratory to solve different problems involving storing, purchasing and planning. In 1953, Professor P. C. Mahalanobis established an OR team at the Indian Statistical Institute in Kolkata for solving the problems of national planning and survey. In 1957, the OR Society of India (ORSI) was formed. India now publishes a number of research journals, namely OPSEARCH (published by ORSI), Industrial Engineering and Management, Journal of Engineering Production, Defence Science Journal, SCIMA, IJOMAS, IAPQR, etc.

Regarding OR education in India, the University of Delhi first introduced a complete M.Sc. course in 1963. Around the same time, the Indian Institutes of Management at Kolkata and Ahmedabad started teaching OR in their MBA courses. Nowadays, OR has become such a popular subject that it has been introduced in almost all institutes and universities in various disciplines, like Mathematics, Statistics, Commerce, Economics, Computer Science, Engineering, Business Administration, etc. Also, because of the importance of this subject, it has been introduced in different competitive examinations, like IAS, CA, ICWA and NET.

Professor P. C. Mahalanobis first applied OR techniques in the formulation of the Second Five-Year Plan. The Planning Commission made use of OR techniques for planning the optimum size of the Caravelle fleet of Indian Airlines. Some industries,
such as HLL, Union Carbide, TELCO, Hindustan Steel, TISCO, FCI, etc., have used these techniques, as well as other Indian organizations, like CSIR, TIFR, the Indian Institute of Science, the State Trading Corporation, and Indian Railways.
1.9 Role of Computers in OR
In the development of OR, computers play a significant role. Without computers it would not be possible to achieve OR’s current position. In most OR techniques, the computations are very complicated and require a large number of iterations. Without a computer, these techniques have no practical use. Many large-scale applications of OR techniques, which require only a few seconds/minutes on the computer, would take a very long time (maybe weeks/months/years) to obtain the same results manually. In this context, the computer has become an essential and integral part of OR.
Chapter 2
Convex and Concave Functions
2.1 Objective
The objective of this chapter is to study convex and concave functions and their properties.
2.2 Introduction
Convex and concave functions are the key concepts of mathematical analysis and have interesting consequences in the areas of optimization theory, statistical estimation, inequalities and applied probability. These functions have some properties which can be used in developing suitable optimality conditions and computational schemes for optimization problems.
2.3 Convex and Concave Functions
A function $f : S \subseteq \mathbb{R}^n \to \mathbb{R}$ (where $S$ is a non-empty convex set) is said to be convex at $\bar{x} \in S$ if

$f(\lambda \bar{x} + (1 - \lambda)x) \le \lambda f(\bar{x}) + (1 - \lambda) f(x)$ for all $x \in S$, $0 \le \lambda \le 1$ and $\lambda \bar{x} + (1 - \lambda)x \in S$.

The function $f(x)$ is said to be convex on $S$ if it is convex at each $\bar{x} \in S$. The general definition of a convex function is as follows: a function $f : S \to \mathbb{R}$ is said to be convex on $S$ if $x_1, x_2 \in S$ and $0 \le \lambda \le 1$ imply

$f(\lambda x_1 + (1 - \lambda)x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)$.

By induction, this inequality can be extended to

$f\left(\sum_{i=1}^{n} \lambda_i x_i\right) \le \sum_{i=1}^{n} \lambda_i f(x_i)$, where $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$,

for any convex combination of points from $S$. If there is a constant $c > 0$ such that for any $x_1, x_2 \in S$

$f(\lambda x_1 + (1 - \lambda)x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2) - \frac{1}{2} c \lambda (1 - \lambda) \lVert x_1 - x_2 \rVert^2$,

then $f$ is called a uniformly convex or strongly convex function on $S$.

A function $f$ defined on a convex set $S \subseteq \mathbb{R}^n$ is said to be strictly convex at $\bar{x} \in S$ if

$f((1 - \lambda)\bar{x} + \lambda x) < (1 - \lambda) f(\bar{x}) + \lambda f(x)$ for $x \in S$, $0 < \lambda < 1$ and $(1 - \lambda)\bar{x} + \lambda x \in S$.

$f(x)$ is said to be strictly convex on $S$ if it is strictly convex at each $\bar{x} \in S$.

A function $f : S \subseteq \mathbb{R}^n \to \mathbb{R}$ (where $S$ is a non-empty convex set) is said to be concave at $\bar{x} \in S$ if

$f(\lambda \bar{x} + (1 - \lambda)x) \ge \lambda f(\bar{x}) + (1 - \lambda) f(x)$ for all $x \in S$, $0 \le \lambda \le 1$ and $\lambda \bar{x} + (1 - \lambda)x \in S$.

The function $f(x)$ is said to be concave on $S$ if it is concave at each $\bar{x} \in S$. The general definition is as follows: a function $f : S \to \mathbb{R}$ is said to be concave on $S$ if $x_1, x_2 \in S$ and $0 \le \lambda \le 1$ imply

$f(\lambda x_1 + (1 - \lambda)x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2)$.

By induction, the preceding inequality can be extended to

$f\left(\sum_{i=1}^{n} \lambda_i x_i\right) \ge \sum_{i=1}^{n} \lambda_i f(x_i)$, where $\lambda_i \ge 0$ and $\sum_{i=1}^{n} \lambda_i = 1$,

for any convex combination of points from $S$. If there is a constant $c > 0$ such that for any $x_1, x_2 \in S$

$f(\lambda x_1 + (1 - \lambda)x_2) \ge \lambda f(x_1) + (1 - \lambda) f(x_2) + \frac{1}{2} c \lambda (1 - \lambda) \lVert x_1 - x_2 \rVert^2$,

then $f$ is called a uniformly concave or strongly concave function on $S$.

A function $f$ defined on a convex set $S \subseteq \mathbb{R}^n$ is said to be strictly concave at $\bar{x} \in S$ if

$f((1 - \lambda)\bar{x} + \lambda x) > (1 - \lambda) f(\bar{x}) + \lambda f(x)$ for $x \in S$, $0 < \lambda < 1$ and $(1 - \lambda)\bar{x} + \lambda x \in S$.

$f(x)$ is said to be strictly concave on $S$ if it is strictly concave at each $\bar{x} \in S$.

Note that a strictly convex (strictly concave) function on a convex set $S \subseteq \mathbb{R}^n$ is convex (concave) on $S$, but the converse is not always true. For example, a constant function on $\mathbb{R}^n$ is both convex and concave on $\mathbb{R}^n$, but neither strictly convex nor strictly concave on $\mathbb{R}^n$.
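The defining inequality can be checked numerically at sample points. The sketch below is illustrative and not from the text: it evaluates the "convexity gap" $\lambda f(x_1) + (1-\lambda)f(x_2) - f(\lambda x_1 + (1-\lambda)x_2)$, which is non-negative everywhere for a convex function such as $x^2$, but goes negative somewhere for $x^3$ on $\mathbb{R}$.

```python
def convexity_gap(f, x1, x2, lam):
    """lam*f(x1) + (1 - lam)*f(x2) - f(lam*x1 + (1 - lam)*x2).

    Non-negative for every x1, x2 and 0 <= lam <= 1 when f is convex.
    """
    return lam * f(x1) + (1 - lam) * f(x2) - f(lam * x1 + (1 - lam) * x2)

points = [-2.0, -0.5, 0.0, 1.0, 3.0]
lams = [0.1, 0.25, 0.5, 0.9]

# f(x) = x^2 is convex on R: the gap is never negative (up to rounding).
assert all(convexity_gap(lambda x: x * x, a, b, t) >= -1e-12
           for a in points for b in points for t in lams)

# f(x) = x^3 is not convex on R: some gap is strictly negative.
assert any(convexity_gap(lambda x: x ** 3, a, b, t) < 0
           for a in points for b in points for t in lams)
```

A finite check like this can only refute convexity, never prove it; it is a numerical illustration of the definition, not a substitute for it.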
2.4 Geometrical Interpretation
The geometrical interpretation of the convexity of a single-variable function is as follows. For a convex function, the function values lie below the corresponding chord (see Fig. 2.1); i.e. the values of a convex function at any point in the interval $x_1 \le x \le x_2$ are less than or equal to the height of the chord joining the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$. Equality holds only for linear functions (see Figs. 2.1 and 2.3a, c).

Fig. 2.1 Convexity of a single-variable function $f(x)$
For a concave function, the function values lie above the corresponding chord; i.e. the values of a concave function at any point in the interval $x_1 \le x \le x_2$ are greater than or equal to the height of the chord joining the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ (see Figs. 2.2 and 2.3b, c).

Fig. 2.2 Concavity of a single-variable function $f(x)$

Fig. 2.3 a Convex function of two variables. b Concave function of two variables. c Function (of a single variable) that is convex over a certain interval and concave over another certain interval

Examples of Convex Functions
The following are some examples of convex functions:
(i) $f(x) = e^{ax}$ is convex on $\mathbb{R}$, for any $a \in \mathbb{R}$.
(ii) $f(x) = x^a$ is convex on $\mathbb{R}_+$, when $a \ge 1$ or $a \le 0$.
(iii) $f(x) = |x|^p$ for $p \ge 1$ is convex on $\mathbb{R}$.
(iv) The function $f$ with domain $[0, 1]$ defined by $f(0) = f(1) = 1$, $f(x) = 0$ for $0 < x < 1$ is convex. It is continuous on the open interval $(0, 1)$ but not continuous at $x = 0$ and $x = 1$.
(v) Every affine function of the form $f(x) = \langle a, x \rangle + b$ on $\mathbb{R}^n$ is convex. Here equality holds instead of inequality, i.e. $f(\alpha x + (1 - \alpha)y) = \alpha f(x) + (1 - \alpha) f(y)$, $\alpha \in (0, 1)$, $x, y \in S$.
(vi) The Euclidean norm $f(x) = \lVert x \rVert = \sqrt{\sum_{i=1}^{n} x_i^2}$ is convex on $\mathbb{R}^n$.
(vii) The distance to a convex set, i.e. the function $d(x, S)$ from points $x \in \mathbb{R}^n$ to a convex set $S$, is convex. For any convex combination $\alpha x + (1 - \alpha)y$, take sequences $u^{(k)}$ and $v^{(k)}$ from $S$ such that

$d(x, S) = \lim_{k \to \infty} \lVert x - u^{(k)} \rVert$, $\quad d(y, S) = \lim_{k \to \infty} \lVert y - v^{(k)} \rVert$.

The points $\alpha u^{(k)} + (1 - \alpha)v^{(k)}$ lie in $S$, and taking limits in the inequality

$d[\alpha x + (1 - \alpha)y, S] \le \lVert \alpha x + (1 - \alpha)y - \alpha u^{(k)} - (1 - \alpha)v^{(k)} \rVert \le \alpha \lVert x - u^{(k)} \rVert + (1 - \alpha) \lVert y - v^{(k)} \rVert$

gives $d[\alpha x + (1 - \alpha)y, S] \le \alpha\, d(x, S) + (1 - \alpha)\, d(y, S)$.
(viii) The function $f(x) = \max\{x_1, x_2, \ldots, x_n\}$ is convex on $\mathbb{R}^n$, since $f(x) = \max_i x_i$ satisfies

$f(\theta x + (1 - \theta)y) = \max_i(\theta x_i + (1 - \theta)y_i) \le \theta \max_i x_i + (1 - \theta)\max_i y_i = \theta f(x) + (1 - \theta)f(y)$, $\quad 0 \le \theta \le 1$.

(ix) The function $f(x) = \log(e^{x_1} + \cdots + e^{x_n})$ is convex on $\mathbb{R}^n$.

Examples of Concave Functions
Some examples of concave functions are as follows:
(i) $f(x) = \log x$ is concave on $\mathbb{R}_+$.
(ii) $f(x) = x^a$ for $0 \le a \le 1$ is concave on $\mathbb{R}_+$.
(iii) The geometric mean $f(x) = \left(\prod_{i=1}^{n} x_i\right)^{1/n}$ on $\mathbb{R}^n_+$ is a concave function.
(iv) $f(x) = -8x^2$ is a strictly concave function.
(v) The function $f(x) = \sin x$ is concave on the interval $[0, \pi]$.
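Two of the examples above can be spot-checked numerically; the helper names below are hypothetical and the sketch is not from the text. The log-sum-exp function of example (ix) satisfies the convexity inequality at sampled points, and $\sin x$ from the concave list stays above its chords on $[0, \pi]$.

```python
import itertools
import math

def lse(v):
    """log(exp(v_1) + ... + exp(v_n)): example (ix), convex on R^n."""
    return math.log(sum(math.exp(t) for t in v))

def blend(u, v, lam):
    """The convex combination lam*u + (1 - lam)*v of two points."""
    return [lam * a + (1 - lam) * b for a, b in zip(u, v)]

pts = [[0.0, 0.0], [1.0, -2.0], [-1.0, 3.0], [2.0, 2.0]]
for u, v in itertools.product(pts, pts):
    for lam in (0.25, 0.5, 0.75):
        gap = lam * lse(u) + (1 - lam) * lse(v) - lse(blend(u, v, lam))
        assert gap >= -1e-12  # convexity of log-sum-exp, up to rounding

# sin is concave on [0, pi]: the graph lies above every chord.
xs = [0.0, 0.5, 1.0, 2.0, 3.0]
for a, b in itertools.product(xs, xs):
    for lam in (0.25, 0.5, 0.75):
        chord = lam * math.sin(a) + (1 - lam) * math.sin(b)
        assert math.sin(lam * a + (1 - lam) * b) >= chord - 1e-12
```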
2.5 Properties of Convex and Concave Functions
The properties of convex and concave functions are described as follows:

(i) A function $f$ is concave if $-f$ is convex.
(ii) Every linear function is both concave and convex on $\mathbb{R}^n$.

Proof Let $f(x) = \langle a, x \rangle + b$ ($a \in \mathbb{R}^n$) be a linear function on $\mathbb{R}^n$. Let $x_1 \in \mathbb{R}^n$, $x_2 \in \mathbb{R}^n$ and $0 \le \lambda \le 1$, so that $\lambda x_1 + (1 - \lambda)x_2 \in \mathbb{R}^n$. Then, for all $\lambda$,

$f(\lambda x_1 + (1 - \lambda)x_2) = \langle a, \lambda x_1 + (1 - \lambda)x_2 \rangle + b = \lambda \langle a, x_1 \rangle + (1 - \lambda)\langle a, x_2 \rangle + b = \lambda\{\langle a, x_1 \rangle + b\} + (1 - \lambda)\{\langle a, x_2 \rangle + b\} = \lambda f(x_1) + (1 - \lambda) f(x_2)$.

Hence, $f(\lambda x_1 + (1 - \lambda)x_2) = \lambda f(x_1) + (1 - \lambda) f(x_2)$ for $0 \le \lambda \le 1$, i.e. $f(x)$ is both convex and concave on $\mathbb{R}^n$.

(iii) If $f : S \subseteq \mathbb{R}^n \to \mathbb{R}$ is concave, then $S$ is a convex set in $\mathbb{R}^n$.
(iv) If $f$ is linear and $g$ is convex, then the composite function $g \circ f$ is convex.
(v) If $f$ is convex and $g$ is convex and non-decreasing, then $g \circ f$ is convex.

Proof Let $S (\subseteq \mathbb{R}^n)$ be a convex set and $0 \le \lambda \le 1$. Since $f(x)$ is a convex function on the convex set $S$,

$f(\lambda x_1 + (1 - \lambda)x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)$.

Since $g$ is non-decreasing, we have

$g(f(\lambda x_1 + (1 - \lambda)x_2)) \le g(\lambda f(x_1) + (1 - \lambda) f(x_2))$.  (2.1)

Again, as $g$ is convex,

$g(\lambda f(x_1) + (1 - \lambda) f(x_2)) \le \lambda g(f(x_1)) + (1 - \lambda) g(f(x_2))$.  (2.2)

From (2.1) and (2.2) we can write

$g(f(\lambda x_1 + (1 - \lambda)x_2)) \le \lambda g(f(x_1)) + (1 - \lambda) g(f(x_2))$,

which shows that $g \circ f$ is convex on $S$.

(vi) The sum of convex (concave) functions is convex (concave). Let $f_1, f_2, \ldots, f_k : \mathbb{R}^n \to \mathbb{R}$ be convex functions. Then we have the following:
(a) $f(x) = \sum_{j=1}^{k} a_j f_j(x)$, where $a_j > 0$ for $j = 1, 2, \ldots, k$, is convex;
(b) $f(x) = \max\{f_1(x), f_2(x), \ldots, f_k(x)\}$ is convex.
(vii) Suppose that $g : \mathbb{R}^n \to \mathbb{R}$ is a concave function. Let $S = \{x : g(x) > 0\}$ and define $f : S \to \mathbb{R}$ as $f(x) = \frac{1}{g(x)}$. Then $f$ is convex over $S$.
(viii) An $n$-dimensional vector function $f$ defined on a set $S$ in $\mathbb{R}^n$ is convex (concave) at $\bar{x} \in S$ [convex (concave) on $S$] if each of its components $f_i$ ($i = 1, 2, \ldots, n$) is convex (concave) at $\bar{x} \in S$ [convex (concave) on $S$].

Theorem Let $f(x)$ be a function defined on a convex set $S \subseteq \mathbb{R}^n$. If $f(x)$ is convex on $S$, then the set $A_\alpha = \{x : x \in S, f(x) \le \alpha\} \subseteq S$ is convex for each real number $\alpha$.

Proof Let $x_1, x_2 \in A_\alpha$, so that

$f(x_1) \le \alpha$ and $f(x_2) \le \alpha$.  (2.3)

Since $f(x)$ is convex on $S$,

$f(\lambda x_1 + (1 - \lambda)x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2) \le \lambda\alpha + (1 - \lambda)\alpha = \alpha$,

i.e. $f(\lambda x_1 + (1 - \lambda)x_2) \le \alpha$. Hence, $\lambda x_1 + (1 - \lambda)x_2 \in A_\alpha$ and $A_\alpha$ is a convex set.

Note that the condition of the preceding theorem is not sufficient; i.e. if $A_\alpha$ is convex for each real number $\alpha$, it does not follow that $f(x)$ is a convex function on $S$. Consider the function $f(x) = x^3$, which is not convex on $\mathbb{R}$. However, the set $A_\alpha = \{x \in \mathbb{R} : x^3 \le \alpha\} = \{x \in \mathbb{R} : x \le \alpha^{1/3}\}$ is obviously convex for any $\alpha$.
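The $x^3$ counterexample can be verified numerically. The following is an illustrative sketch, not from the text: the convexity inequality fails at particular points, yet every sublevel set $A_\alpha$ is an interval and hence convex.

```python
# f(x) = x^3 is not convex on R, yet every sublevel set
# A_alpha = {x : x^3 <= alpha} = (-inf, alpha^(1/3)] is an interval,
# hence convex.

def cube(x):
    return x ** 3

# The convexity inequality fails at x1 = -2, x2 = 0, lam = 0.5:
x1, x2, lam = -2.0, 0.0, 0.5
lhs = cube(lam * x1 + (1 - lam) * x2)        # f(-1) = -1
rhs = lam * cube(x1) + (1 - lam) * cube(x2)  # 0.5*(-8) + 0.5*0 = -4
assert lhs > rhs  # violates lhs <= rhs, so x^3 is not convex

# Yet the sublevel set for alpha = 5 is convex: any point lying
# between two members is again a member.
alpha = 5.0
members = [k * 0.1 for k in range(-50, 50) if cube(k * 0.1) <= alpha]
for i in range(len(members) - 1):
    mid = 0.5 * (members[i] + members[i + 1])
    assert cube(mid) <= alpha
```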
Corollary Let $f(x)$ be a function defined on a convex set $S \subseteq R^n$. If $f(x)$ is concave on $S$, then the set $B_\alpha = \{x : x \in S,\ f(x) \ge \alpha\} \subseteq S$ is a convex set for each real number $\alpha$.

Theorem Let $f(x) = (f_1, f_2, \ldots, f_m)$ be an $m$-dimensional vector function defined on $S \subseteq R^n$. If $f$ is convex (concave) at $\bar{x} \in S$, then each non-negative linear combination of its components $f_i$, namely $\phi(x) = \langle c, f(x)\rangle$, $c = (c_1, c_2, \ldots, c_m) \ge 0$, is convex (concave) at $\bar{x}$.

Proof Here we shall prove the theorem for a convex function. Let $x \in S$, $0 \le \lambda \le 1$ and let $\lambda x + (1-\lambda)\bar{x} \in S$. Then
$\phi[\lambda x + (1-\lambda)\bar{x}] = \langle c, f[\lambda x + (1-\lambda)\bar{x}]\rangle \le \langle c, \lambda f(x) + (1-\lambda)f(\bar{x})\rangle$  [by the convexity of $f(x)$ at $\bar{x}$ and $c \ge 0$]
$= \lambda\langle c, f(x)\rangle + (1-\lambda)\langle c, f(\bar{x})\rangle = \lambda\phi(x) + (1-\lambda)\phi(\bar{x})$.
This proves that $\phi(x)$ is convex at $\bar{x}$.

Example Let the function $f(x)$ be convex on a convex set $S \subseteq R^n$. Show that for every $x_1 \in S$, $x_2 \in S$, the one-dimensional function $g(y) = f(yx_1 + (1-y)x_2)$ is convex for all $0 \le y \le 1$.

Solution Let $y_1, y_2 \in [0, 1]$. Then $\lambda y_1 + (1-\lambda)y_2 \in [0, 1]$ for $0 \le \lambda \le 1$. Now,
$g(\lambda y_1 + (1-\lambda)y_2) = f[\{\lambda y_1 + (1-\lambda)y_2\}x_1 + \{1 - \lambda y_1 - (1-\lambda)y_2\}x_2]$.
As $\{\lambda y_1 + (1-\lambda)y_2\}x_1 + \{1 - \lambda y_1 - (1-\lambda)y_2\}x_2 = \lambda\{y_1x_1 + (1-y_1)x_2\} + (1-\lambda)\{y_2x_1 + (1-y_2)x_2\}$, we get
$g(\lambda y_1 + (1-\lambda)y_2) = f[\lambda\{y_1x_1 + (1-y_1)x_2\} + (1-\lambda)\{y_2x_1 + (1-y_2)x_2\}]$
$\le \lambda f[y_1x_1 + (1-y_1)x_2] + (1-\lambda)f[y_2x_1 + (1-y_2)x_2] = \lambda g(y_1) + (1-\lambda)g(y_2)$,
which implies that $g(y)$ is convex for all $0 \le y \le 1$.
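The restriction-to-a-segment result in the example above is easy to check numerically. The quadratic $f$ and the endpoints below are arbitrary illustrative choices, not from the text.

```python
# Restriction of a convex function to a segment, as in the example above:
# g(y) = f(y*x1 + (1-y)*x2) should again be convex in y on [0, 1].

def restrict(f, x1, x2):
    """Return the one-dimensional function g(y) = f(y*x1 + (1-y)*x2)."""
    return lambda y: f([y * a + (1 - y) * b for a, b in zip(x1, x2)])

f = lambda v: v[0] ** 2 + 3 * v[1] ** 2          # a convex quadratic on R^2
h = restrict(f, [1.0, -2.0], [3.0, 4.0])

# check the one-dimensional convexity inequality on a grid of (y1, y2, lam)
ok = all(
    h(l * y1 + (1 - l) * y2) <= l * h(y1) + (1 - l) * h(y2) + 1e-9
    for y1 in (0, 0.3, 0.7, 1)
    for y2 in (0, 0.5, 1)
    for l in (0.2, 0.5, 0.8)
)
print(ok)   # True for a convex f
```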
2.6 Differentiable Convex and Concave Functions
Here we shall discuss some of the properties of differentiable convex and concave functions. Let $f$ be a function defined on an open set $S$ in $R^n$. If $f$ is differentiable at $\bar{x} \in S$, then for $x \in R^n$ with $\bar{x} + x \in S$,
$f(\bar{x} + x) = f(\bar{x}) + \langle\nabla f(\bar{x}), x\rangle + \alpha(\bar{x}, x)\|x\|$, where $\lim_{x \to 0}\alpha(\bar{x}, x) = 0$,
$\nabla f(\bar{x})$ is the $n$-dimensional gradient vector of $f$ at $\bar{x}$ whose $n$ components are the partial derivatives of $f$ with respect to $x_1, x_2, \ldots, x_n$ evaluated at $\bar{x}$, and $\alpha$ is a function of $\bar{x}$ and $x$.
Theorem Let $f$ be a function defined on an open set $S \subseteq R^n$ and let $f$ be differentiable at $\bar{x} \in S$. If $f$ is convex at $\bar{x} \in S$, then
$f(x) - f(\bar{x}) \ge \langle\nabla f(\bar{x}), x - \bar{x}\rangle$ for each $x \in S$.
If $f$ is concave at $\bar{x} \in S$, then
$f(x) - f(\bar{x}) \le \langle\nabla f(\bar{x}), x - \bar{x}\rangle$ for each $x \in S$.

Proof Let $f$ be convex at $\bar{x} \in S$. Since $S$ is open, there exists an open ball $B_\delta(\bar{x})$ around $\bar{x}$ which is contained in $S$. Let $x \in S$ and $x \ne \bar{x}$. Then for some $\mu$ such that $0 < \mu < 1$ and $\mu < \delta/\|x - \bar{x}\|$, we have
$\hat{x} = \bar{x} + \mu(x - \bar{x}) = (1-\mu)\bar{x} + \mu x \in B_\delta(\bar{x}) \subseteq S$.
Since $f$ is convex at $\bar{x}$, from the convexity of $B_\delta(\bar{x})$ and the fact that $\hat{x} \in B_\delta(\bar{x})$ and $0 < \lambda < 1$, it follows that
$(1-\lambda)f(\bar{x}) + \lambda f(\hat{x}) \ge f[(1-\lambda)\bar{x} + \lambda\hat{x}]$
or, $f(\bar{x}) - \lambda f(\bar{x}) + \lambda f(\hat{x}) \ge f[(1-\lambda)\bar{x} + \lambda\hat{x}]$
or, $\lambda[f(\hat{x}) - f(\bar{x})] \ge f[(1-\lambda)\bar{x} + \lambda\hat{x}] - f(\bar{x})$
or, $f(\hat{x}) - f(\bar{x}) \ge \dfrac{f[\bar{x} + \lambda(\hat{x} - \bar{x})] - f(\bar{x})}{\lambda} = \dfrac{\lambda\langle\nabla f(\bar{x}), \hat{x} - \bar{x}\rangle + \lambda\,\alpha[\bar{x}, \lambda(\hat{x} - \bar{x})]\,\|\hat{x} - \bar{x}\|}{\lambda}$  [as $f$ is differentiable at $\bar{x} \in S$]
$= \langle\nabla f(\bar{x}), \hat{x} - \bar{x}\rangle + \alpha[\bar{x}, \lambda(\hat{x} - \bar{x})]\,\|\hat{x} - \bar{x}\|$.
Now taking the limit of both sides as $\lambda \to 0$, we have
$f(\hat{x}) - f(\bar{x}) \ge \langle\nabla f(\bar{x}), \hat{x} - \bar{x}\rangle$,  (2.4)
since $\lim_{\lambda \to 0}\alpha[\bar{x}, \lambda(\hat{x} - \bar{x})] = 0$.

Since $f$ is convex at $\bar{x}$, $\hat{x} \in S$ and
$\hat{x} = (1-\mu)\bar{x} + \mu x$,  (2.5)
we have $(1-\mu)f(\bar{x}) + \mu f(x) \ge f[(1-\mu)\bar{x} + \mu x]$
or $\mu[f(x) - f(\bar{x})] \ge f(\hat{x}) - f(\bar{x})$  (2.6)
or $\mu[f(x) - f(\bar{x})] \ge \langle\nabla f(\bar{x}), \hat{x} - \bar{x}\rangle$  [since from (2.4), $f(\hat{x}) - f(\bar{x}) \ge \langle\nabla f(\bar{x}), \hat{x} - \bar{x}\rangle$]
or $\mu[f(x) - f(\bar{x})] \ge \langle\nabla f(\bar{x}), \mu(x - \bar{x})\rangle$  [from (2.5), since $\hat{x} = (1-\mu)\bar{x} + \mu x \Rightarrow \hat{x} - \bar{x} = \mu(x - \bar{x})$]
or $f(x) - f(\bar{x}) \ge \langle\nabla f(\bar{x}), x - \bar{x}\rangle$  [$\because \mu > 0$].
The second part of the theorem can be proved similarly.

Theorem Let $f$ be a differentiable function on an open convex set $S \subseteq R^n$. Then $f$ is convex on $S$ if and only if
$f(x_2) - f(x_1) \ge \langle\nabla f(x_1), x_2 - x_1\rangle$ for each $x_1, x_2 \in S$;
$f$ is concave on $S$ if and only if
$f(x_2) - f(x_1) \le \langle\nabla f(x_1), x_2 - x_1\rangle$ for each $x_1, x_2 \in S$.

Proof Necessary part: This part of the proof follows from the previous theorem.
Sufficient part: Let
$f(x_2) - f(x_1) \ge \langle\nabla f(x_1), x_2 - x_1\rangle$ for each $x_1, x_2 \in S$.  (2.7)
Since $S$ is convex and $x_1, x_2 \in S$, we have $(1-\lambda)x_1 + \lambda x_2 \in S$ for $0 \le \lambda \le 1$. Now from (2.7), applied to the points $x_1 \in S$ and $(1-\lambda)x_1 + \lambda x_2 \in S$, we have
$f(x_1) - f[(1-\lambda)x_1 + \lambda x_2] \ge \langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_1 - \{(1-\lambda)x_1 + \lambda x_2\}\rangle$
or $f(x_1) - f[(1-\lambda)x_1 + \lambda x_2] \ge \langle\nabla f[(1-\lambda)x_1 + \lambda x_2], \lambda(x_1 - x_2)\rangle$
or $f(x_1) - f[(1-\lambda)x_1 + \lambda x_2] \ge \lambda\langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_1 - x_2\rangle$.  (2.8)
Again for $x_2 \in S$ and $(1-\lambda)x_1 + \lambda x_2 \in S$, we have from (2.7)
$f(x_2) - f[(1-\lambda)x_1 + \lambda x_2] \ge \langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_2 - \{(1-\lambda)x_1 + \lambda x_2\}\rangle$
or $f(x_2) - f[(1-\lambda)x_1 + \lambda x_2] \ge \langle\nabla f[(1-\lambda)x_1 + \lambda x_2], (1-\lambda)(x_2 - x_1)\rangle$
or $f(x_2) - f[(1-\lambda)x_1 + \lambda x_2] \ge -(1-\lambda)\langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_1 - x_2\rangle$.  (2.9)
Now multiplying (2.8) by $(1-\lambda)$ and (2.9) by $\lambda$ and then adding, we have
$(1-\lambda)f(x_1) + \lambda f(x_2) - (1-\lambda+\lambda)f[(1-\lambda)x_1 + \lambda x_2] \ge \lambda(1-\lambda)\langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_1 - x_2\rangle - \lambda(1-\lambda)\langle\nabla f[(1-\lambda)x_1 + \lambda x_2], x_1 - x_2\rangle = 0$
or $(1-\lambda)f(x_1) + \lambda f(x_2) \ge f[(1-\lambda)x_1 + \lambda x_2]$.

Hence, $f$ is convex on $S$. Similarly, we can prove the result for a concave $f$.
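The first-order characterization above says the tangent plane at $x_1$ lies below the graph of a convex $f$. A quick numerical sanity check, using a convex quadratic chosen for illustration (the function and points are assumptions, not from the text):

```python
# Check f(x2) - f(x1) >= <grad f(x1), x2 - x1> for a convex quadratic.
# Its Hessian [[4, 1], [1, 2]] is positive definite, so f is convex.

def f(x):
    return 2 * x[0] ** 2 + x[1] ** 2 + x[0] * x[1]

def grad_f(x):
    return [4 * x[0] + x[1], 2 * x[1] + x[0]]   # analytic gradient

def supports(x1, x2):
    """Does the tangent plane at x1 under-estimate f at x2?"""
    g = grad_f(x1)
    lin = sum(gi * (b - a) for gi, a, b in zip(g, x1, x2))
    return f(x2) - f(x1) >= lin - 1e-9

pts = [(-2, 1), (0, 0), (1.5, -3), (2, 2)]
print(all(supports(p, q) for p in pts for q in pts))   # True
```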
Theorem Let $f$ be a differentiable function on an open convex set $S \subseteq R^n$. A necessary and sufficient condition that $f$ be convex (concave) on $S$ is that for each $x_1, x_2 \in S$,
$\langle\nabla f(x_2) - \nabla f(x_1), x_2 - x_1\rangle \ge 0 \ (\le 0).$

Proof We shall establish the theorem for the convex case only. The concave case is similar.
The condition is necessary. Let $f$ be convex on $S$ and let $x_1, x_2 \in S$. Then we have
$f(x_2) - f(x_1) \ge \langle\nabla f(x_1), x_2 - x_1\rangle$, i.e.
$f(x_2) - f(x_1) - \langle\nabla f(x_1), x_2 - x_1\rangle \ge 0$  (2.10)
and
$f(x_1) - f(x_2) \ge \langle\nabla f(x_2), x_1 - x_2\rangle$,
i.e. $f(x_1) - f(x_2) \ge -\langle\nabla f(x_2), x_2 - x_1\rangle$ or
$f(x_1) - f(x_2) + \langle\nabla f(x_2), x_2 - x_1\rangle \ge 0$.  (2.11)
Now adding the two inequalities (2.10) and (2.11), we have
$f(x_2) - f(x_1) - \langle\nabla f(x_1), x_2 - x_1\rangle + f(x_1) - f(x_2) + \langle\nabla f(x_2), x_2 - x_1\rangle \ge 0$
or $\langle\nabla f(x_2), x_2 - x_1\rangle - \langle\nabla f(x_1), x_2 - x_1\rangle \ge 0$
or $\langle\nabla f(x_2) - \nabla f(x_1), x_2 - x_1\rangle \ge 0$.
The condition is sufficient. Let
$\langle\nabla f(x_2) - \nabla f(x_1), x_2 - x_1\rangle \ge 0$ for each $x_1, x_2 \in S$.  (2.12)
We shall prove that $f$ is convex on $S$. Since $S$ is a convex set and $x_1, x_2 \in S$, for $0 \le \lambda \le 1$ we have $(1-\lambda)x_1 + \lambda x_2 \in S$. Now by the mean value theorem, for some $\bar\lambda$, $0 < \bar\lambda < 1$,
$f(x_2) - f(x_1) = \langle\nabla f[x_1 + \bar\lambda(x_2 - x_1)], x_2 - x_1\rangle$.
But by inequality (2.12), applied to the points $x_1$ and $x_1 + \bar\lambda(x_2 - x_1)$,
$\langle\nabla f[x_1 + \bar\lambda(x_2 - x_1)] - \nabla f(x_1), \bar\lambda(x_2 - x_1)\rangle \ge 0$
or $\langle\nabla f[x_1 + \bar\lambda(x_2 - x_1)] - \nabla f(x_1), x_2 - x_1\rangle \ge 0$  [$\because \bar\lambda > 0$]
or $\langle\nabla f[x_1 + \bar\lambda(x_2 - x_1)], x_2 - x_1\rangle \ge \langle\nabla f(x_1), x_2 - x_1\rangle$
or $f(x_2) - f(x_1) \ge \langle\nabla f(x_1), x_2 - x_1\rangle$  [using the mean value relation above].  (2.13)
Hence, $f$ is convex on $S$ [by the previous theorem: if $f$ is a differentiable function on an open convex set $S \subseteq R^n$ and $f(x_2) - f(x_1) \ge \langle\nabla f(x_1), x_2 - x_1\rangle$ for each $x_1, x_2 \in S$, then $f$ is convex on $S$].

Note If $f$ is an $n$-dimensional vector function on $S \subseteq R^n$ and $\langle f(x_2) - f(x_1), x_2 - x_1\rangle \ge 0$ for all $x_1, x_2 \in S$, then $f$ is said to be monotonic on $S$. So, by the previous theorem, a differentiable function on an open convex set $S \subseteq R^n$ is convex if and only if $\nabla f$ is monotonic on $S$.
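Monotonicity of the gradient, as in the note above, can also be checked numerically. The quadratic below (again an illustrative assumption, not from the text) has positive definite Hessian $\begin{pmatrix}2 & -1\\ -1 & 4\end{pmatrix}$, so the inner product should be non-negative at every pair of points.

```python
# Check <grad f(x2) - grad f(x1), x2 - x1> >= 0 for a convex f.

def grad_f(x):
    # gradient of f(x) = x1^2 + 2*x2^2 - x1*x2
    return [2 * x[0] - x[1], 4 * x[1] - x[0]]

def monotone(x1, x2):
    g1, g2 = grad_f(x1), grad_f(x2)
    # inner product of the gradient difference with the point difference
    ip = sum((q - p) * (b - a) for p, q, a, b in zip(g1, g2, x1, x2))
    return ip >= -1e-9

pts = [(-1, -1), (0, 2), (3, 1), (2, -2)]
print(all(monotone(p, q) for p in pts for q in pts))   # True
```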
2.7 Differentiable Strictly Convex and Concave Functions
Theorem Let $f$ be a function defined on an open set $S \subseteq R^n$ and let $f$ be differentiable at $\bar{x} \in S$. If $f$ is strictly convex at $\bar{x} \in S$, then
$f(x) - f(\bar{x}) > \langle\nabla f(\bar{x}), x - \bar{x}\rangle$ for each $x \in S$, $x \ne \bar{x}$.
If $f$ is strictly concave at $\bar{x} \in S$, then
$f(x) - f(\bar{x}) < \langle\nabla f(\bar{x}), x - \bar{x}\rangle$ for each $x \in S$, $x \ne \bar{x}$.

Proof Let $f$ be strictly convex at $\bar{x}$. Then
$f[(1-\lambda)\bar{x} + \lambda x] < (1-\lambda)f(\bar{x}) + \lambda f(x)$ for all $x \in S$, $x \ne \bar{x}$,  (2.14)
$0 < \lambda < 1$, and $(1-\lambda)\bar{x} + \lambda x \in S$.
Since $f$ is convex and differentiable at $\bar{x} \in S$,
$f(x) - f(\bar{x}) \ge \langle\nabla f(\bar{x}), x - \bar{x}\rangle$ for all $x \in S$.  (2.15)
We shall now show that if equality holds in (2.15) for some $x$ in $S$ which is distinct from $\bar{x}$, then a contradiction arises. Let equality hold in (2.15) for $x = \tilde{x}$, $\tilde{x} \in S$ and $\tilde{x} \ne \bar{x}$. Then
$f(\tilde{x}) - f(\bar{x}) = \langle\nabla f(\bar{x}), \tilde{x} - \bar{x}\rangle$ or $f(\tilde{x}) = f(\bar{x}) + \langle\nabla f(\bar{x}), \tilde{x} - \bar{x}\rangle$.  (2.16)
Since $\tilde{x} \in S$, from (2.14) we have
$f[(1-\lambda)\bar{x} + \lambda\tilde{x}] < (1-\lambda)f(\bar{x}) + \lambda f(\tilde{x})$  [putting $x = \tilde{x}$ in (2.14)]
or $f[(1-\lambda)\bar{x} + \lambda\tilde{x}] < (1-\lambda)f(\bar{x}) + \lambda[f(\bar{x}) + \langle\nabla f(\bar{x}), \tilde{x} - \bar{x}\rangle]$  [using (2.16)], for $0 < \lambda < 1$ and $(1-\lambda)\bar{x} + \lambda\tilde{x} \in S$,
or $f[(1-\lambda)\bar{x} + \lambda\tilde{x}] < f(\bar{x}) + \lambda\langle\nabla f(\bar{x}), \tilde{x} - \bar{x}\rangle$.  (2.17)
Again, for the points $x = (1-\lambda)\bar{x} + \lambda\tilde{x} \in S$ and $\bar{x}$, from (2.15) we have
$f[(1-\lambda)\bar{x} + \lambda\tilde{x}] - f(\bar{x}) \ge \langle\nabla f(\bar{x}), \lambda(\tilde{x} - \bar{x})\rangle$
or $f[(1-\lambda)\bar{x} + \lambda\tilde{x}] \ge f(\bar{x}) + \lambda\langle\nabla f(\bar{x}), \tilde{x} - \bar{x}\rangle$.  (2.18)
Relation (2.18) contradicts relation (2.17). (Note that for sufficiently small $\lambda$, $(1-\lambda)\bar{x} + \lambda\tilde{x} \in S$, since $S$ is open.) Hence, equality cannot hold in (2.15) for any $x \in S$ distinct from $\bar{x}$.

Theorem Let $f(x)$ be a twice differentiable function on an open convex set $S \subseteq R^n$ and $H(x)$ be the Hessian matrix of $f(x)$. Then $f(x)$ is convex on $S$ if and only if $H(x)$ is positive semi-definite for each $x \in S$.

Proof Let $x \in S$ and $y \in R^n$. Since $S$ is convex and open, there exists a $\bar\lambda > 0$ such that $x + \lambda y \in S$ for $0 < \lambda < \bar\lambda$. Since $f(x)$ is twice differentiable on $S$,
$f(x + \lambda y) = f(x) + \lambda\langle\nabla f(x), y\rangle + \dfrac{\lambda^2}{2}\langle H(x)y, y\rangle + \lambda^2\beta(x, \lambda y)\|y\|^2$ for all $\lambda$ ($0 < \lambda < \bar\lambda$),
where $\beta \to 0$ as $\lambda \to 0$. Hence
$f(x + \lambda y) - f(x) - \lambda\langle\nabla f(x), y\rangle = \dfrac{\lambda^2}{2}\left[\langle H(x)y, y\rangle + 2\beta\|y\|^2\right]$.
For a differentiable convex function, we know that $f(x + \lambda y) - f(x) - \lambda\langle\nabla f(x), y\rangle \ge 0$,
so $\dfrac{\lambda^2}{2}\left[\langle H(x)y, y\rangle + 2\beta\|y\|^2\right] \ge 0$ or $\langle H(x)y, y\rangle + 2\beta\|y\|^2 \ge 0$ for $0 < \lambda < \bar\lambda$.
Letting $\lambda \to 0$, we have $\langle H(x)y, y\rangle \ge 0$ for all $y \in R^n$, which means that $H(x)$ is positive semi-definite for all $x \in S$.
Conversely, let the Hessian matrix $H(x)$ be positive semi-definite for each $x \in S$, i.e. $\langle H(x)y, y\rangle \ge 0$ for all $y \in R^n$. Now by Taylor's theorem, we have for $x_1, x_2 \in S$
$f(x_2) - f(x_1) - \langle\nabla f(x_1), x_2 - x_1\rangle = \dfrac{1}{2}\langle H(x_1 + \theta(x_2 - x_1))(x_2 - x_1), x_2 - x_1\rangle$
for some $0 < \theta < 1$. Since $x_1 + \theta(x_2 - x_1) \in S$, the right-hand side of the preceding equation is non-negative, as $\langle H(x)y, y\rangle \ge 0$ for all $y \in R^n$. Hence, $f(x_2) - f(x_1) - \langle\nabla f(x_1), x_2 - x_1\rangle \ge 0$ for $x_1, x_2 \in S$, i.e. $f(x)$ is convex on $S$.
Theorem Let $f(x)$ be a twice differentiable function on an open convex set $S (\subseteq R^n)$ and $H(x)$ be the Hessian matrix of $f(x)$. Then $f(x)$ is strictly convex on $S$ if $H(x)$ is positive definite for each $x \in S$.

Remark
(i) The converse of the previous theorem is not true. To see this, consider the function $f(x) = x^4$, $x \in R$. Here $H(x) = 12x^2$, which is not positive definite since $H(0) = 0$, but $f(x)$ is strictly convex on $R$ because $f'(x) = 4x^3$ is strictly increasing on $R$.
(ii) Let $f(x)$ be a twice differentiable function of a single variable $x$ on an open set $S (\subseteq R)$. Then $f(x)$ is strictly convex if $f''(x)$ is positive for each $x \in S$.

Example Determine whether the following functions are convex or concave:
(a) $f(x) = e^x$
(b) $f(x) = -8x^2$
(c) $f(x_1, x_2) = 2x_1^3 - 6x_2^2$
(d) $f(x_1, x_2, x_3) = 4x_1^2 + 3x_2^2 + 5x_3^2 + 6x_1x_2 + x_1x_3 - 3x_1 - 2x_2 + 15$

Solution
(a) $f(x) = e^x \Rightarrow f''(x) = e^x > 0$ for all real values of $x$. Hence, $f(x)$ is strictly convex.
(b) $f(x) = -8x^2$, $f''(x) = -16 < 0$ for all real values of $x$. Hence, $f(x)$ is strictly concave.
(c) $f(x_1, x_2) = 2x_1^3 - 6x_2^2$. Here
$H(x) = \begin{pmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1\partial x_2 \\ \partial^2 f/\partial x_1\partial x_2 & \partial^2 f/\partial x_2^2 \end{pmatrix} = \begin{pmatrix} 12x_1 & 0 \\ 0 & -12 \end{pmatrix}.$
Since $\partial^2 f/\partial x_1^2 = 12x_1 \le 0$ for $x_1 \le 0$, and $|H(x)| = -144x_1 \ge 0$ for $x_1 \le 0$, the matrix $H(x)$ is negative semi-definite for $x_1 \le 0$, i.e. $f(x)$ is concave on the region $x_1 \le 0$.
(d) $f(x_1, x_2, x_3) = 4x_1^2 + 3x_2^2 + 5x_3^2 + 6x_1x_2 + x_1x_3 - 3x_1 - 2x_2 + 15$. Here
$H(x) = \begin{pmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1\partial x_2 & \partial^2 f/\partial x_1\partial x_3 \\ \partial^2 f/\partial x_1\partial x_2 & \partial^2 f/\partial x_2^2 & \partial^2 f/\partial x_2\partial x_3 \\ \partial^2 f/\partial x_1\partial x_3 & \partial^2 f/\partial x_2\partial x_3 & \partial^2 f/\partial x_3^2 \end{pmatrix} = \begin{pmatrix} 8 & 6 & 1 \\ 6 & 6 & 0 \\ 1 & 0 & 10 \end{pmatrix}.$
Here the leading principal minors are given by
$|8| = 8 > 0$, $\begin{vmatrix} 8 & 6 \\ 6 & 6 \end{vmatrix} = 12 > 0$ and $\begin{vmatrix} 8 & 6 & 1 \\ 6 & 6 & 0 \\ 1 & 0 & 10 \end{vmatrix} = 114 > 0$.
Hence, the matrix $H(x)$ is positive definite for all real values of $x_1, x_2$ and $x_3$. Therefore, $f(x)$ is a strictly convex function.
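The leading-principal-minor test used in example (d) is mechanical and can be scripted. The determinant helper below is a small cofactor expansion so the sketch stays dependency-free; it is illustrative code, not taken from the text.

```python
# Leading principal minors of the Hessian from example (d).

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minors(H):
    """Determinants of the top-left k x k submatrices, k = 1..n."""
    return [det([row[:k] for row in H[:k]]) for k in range(1, len(H) + 1)]

# Hessian of f = 4x1^2 + 3x2^2 + 5x3^2 + 6x1x2 + x1x3 - 3x1 - 2x2 + 15
H = [[8, 6, 1],
     [6, 6, 0],
     [1, 0, 10]]

print(leading_minors(H))                       # [8, 12, 114]
print(all(m > 0 for m in leading_minors(H)))   # True -> positive definite
```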
Chapter 3
Simplex Method
3.1 Objective

The objective of this chapter is to discuss:
• The concept of a linear programming problem (LPP)
• The concepts of slack/surplus variables and canonical and standard forms of LPPs
• The theories and algorithm of the simplex method
• The Big M and two-phase methods
• Finding the solution of linear simultaneous equations and the inverse of a non-singular matrix by the simplex method.
3.2 General Linear Programming Problem

The general linear programming problem (LPP) is the problem of determining the values of $n$ decision variables $x_1, x_2, \ldots, x_n$ which optimize (maximize or minimize) the linear function
$z = c_1x_1 + c_2x_2 + \cdots + c_nx_n$  (3.1)
subject to the constraints
$a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \ (\le, =, \ge)\ b_1$
$a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \ (\le, =, \ge)\ b_2$
$\cdots$
$a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \ (\le, =, \ge)\ b_m$  (3.2)
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_3
and
$x_j \ge 0,\ j = 1, 2, \ldots, n.$  (3.3)

The linear function $z = c_1x_1 + c_2x_2 + \cdots + c_nx_n$ which is to be maximized (or minimized) is called the objective function of the general LPP. The relations (3.2) are called the constraints of the general LPP, whereas the set of inequalities (3.3) is usually known as the non-negativity conditions of the general LPP. The right-hand quantities $b_i$ ($i = 1, 2, \ldots, m$) of the constraints (3.2) are known as the requirement parameters of the problem.

In compact form, the general LPP can be written as follows:
Optimize $z = \sum_{j=1}^{n} c_jx_j$
subject to the constraints
$\sum_{j=1}^{n} a_{ij}x_j \ (\le, =, \ge)\ b_i,\ i = 1, 2, \ldots, m$
and $x_j \ge 0,\ j = 1, 2, \ldots, n$.

In matrix notation, the general LPP can be written as follows:
Optimize $z = \langle c^T, x\rangle$
subject to the constraints
$Ax \ (\le, =, \ge)\ b$ and $x \ge 0$,
where $c, x \in R^n$, $b \in R^m$, and $A$ is an $m \times n$ real matrix.

Note The notation $\langle c^T, x\rangle$ represents the inner product of the two vectors. In some books, it is denoted by either $c^Tx$ or $cx$. In a strict mathematical sense, $\langle c^T, x\rangle$ is the appropriate representation, as $c^Tx$ (or $cx$) is a $1 \times 1$ matrix.
3.3 Slack and Surplus Variables

In the formulation of a general LPP, if a constraint is an inequality of "$\le$" type, such as
$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \le b_i$,
then to convert this inequality into an equation, a non-negative quantity $x_{n+i}$ is added on the left-hand side of the constraint. The converted equation is as follows:
$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n + x_{n+i} = b_i$.
Here the non-negative quantity $x_{n+i}$ is known as a slack variable.

Similarly, if a constraint is an inequality of "$\ge$" type, such as
$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n \ge b_i$,
then to transform this inequality into an equation, a non-negative quantity $x_{n+i}$ is subtracted from the left-hand side of the constraint. The converted equation is as follows:
$a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n - x_{n+i} = b_i$.
Here the non-negative quantity $x_{n+i}$ is known as a surplus variable.

Note that the coefficients of slack and/or surplus variables in the objective function are always taken to be zero. This means that the conversion of the constraints to a system of simultaneous linear equations does not change the objective function.
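The mechanical part of this conversion is easy to automate. The data layout below (sign strings and tuples) is an illustrative assumption, not from the text:

```python
# Convert <= and >= constraints to equations with slack/surplus variables,
# as described above.  Each row is (coefficients, sense, right-hand side).

def to_equations(rows):
    n_extra = sum(1 for _, s, _ in rows if s in ('<=', '>='))
    out, k = [], 0
    for coeffs, sense, rhs in rows:
        extra = [0] * n_extra
        if sense == '<=':
            extra[k] = 1          # slack: added to the left-hand side
            k += 1
        elif sense == '>=':
            extra[k] = -1         # surplus: subtracted from the left-hand side
            k += 1
        out.append((list(coeffs) + extra, rhs))
    return out

rows = [([3, 4], '<=', 24),
        ([3, 2], '>=', 18),
        ([0, 1], '=', 5)]
eqs = to_equations(rows)
for coeffs, rhs in eqs:
    print(coeffs, '=', rhs)
```

The slack/surplus columns take zero coefficients in the objective, as noted above.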
3.4 Canonical and Standard Forms of LPP

After the formulation of the LPP, the next step is to determine its solution. To solve the LPP, the problem must be expressed in one of the following two forms:
(i) Canonical form
(ii) Standard form.

Canonical form The general LPP can always be expressed in the following form:
Maximize $z = \sum_{j=1}^{n} c_jx_j$
subject to the constraints
$\sum_{j=1}^{n} a_{ij}x_j \le b_i,\ i = 1, 2, \ldots, m$
and $x_j \ge 0,\ j = 1, 2, \ldots, n$.
This form of the LPP is called the canonical form of the LPP. The characteristics of this form are as follows:
1. The objective function is of the maximization type. If the objective function is of the minimization type, it can be expressed as a maximization type by using the mini-max property
Minimize $f(x) = -$Maximize $\{-f(x)\}$.
For example, the objective function Minimize $z = 3x_1 + x_2 - 2x_3$ is equivalent to Maximize $z' = -z = -3x_1 - x_2 + 2x_3$.
2. All the constraints are of "$\le$" type, except the non-negativity restrictions. An inequality of "$\ge$" type can be changed to an inequality of "$\le$" type by multiplying both sides of the inequality by $(-1)$. For example, the constraint $3x_1 + 2x_2 - x_3 \ge 6$ is equivalent to $-3x_1 - 2x_2 + x_3 \le -6$. On the other hand, an equation can be replaced by two weak inequalities. For example, the constraint $5x_1 - x_2 + 7x_3 = 15$ is equivalent to $5x_1 - x_2 + 7x_3 \le 15$ and $5x_1 - x_2 + 7x_3 \ge 15$, i.e.
$5x_1 - x_2 + 7x_3 \le 15$ and $-5x_1 + x_2 - 7x_3 \le -15$.
3. All the variables are non-negative. If a variable is unrestricted in sign (i.e. the value of that variable may be positive, negative or zero), it can be replaced by the difference of two non-negative variables. For example, if a variable $x_1$ is unrestricted in sign, we can replace it by $x_1' - x_1''$, where both $x_1'$ and $x_1''$ are non-negative, i.e. $x_1 = x_1' - x_1''$ where $x_1' \ge 0$ and $x_1'' \ge 0$.
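The three conversions above, applied symbolically to the worked examples in the list (the data layout is an illustrative assumption):

```python
# Canonical-form conversions on coefficient vectors.

# 1. Minimize c.x  ==  -Maximize (-c).x
c_min = [3, 1, -2]                  # Min z = 3x1 + x2 - 2x3
c_max = [-ci for ci in c_min]       # Max z' = -3x1 - x2 + 2x3, with z = -z'

# 2. a.x >= b  ==  (-a).x <= -b
a, b = [3, 2, -1], 6                # 3x1 + 2x2 - x3 >= 6
a2, b2 = [-ai for ai in a], -b      # -3x1 - 2x2 + x3 <= -6

# 3. a.x = b  ==  a.x <= b  and  (-a).x <= -b
a3, b3 = [5, -1, 7], 15             # 5x1 - x2 + 7x3 = 15
pair = [(a3, b3), ([-ai for ai in a3], -b3)]

print(c_max, (a2, b2), pair)
```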
Standard form A general LPP in the form
Maximize (or Minimize) $z = \sum_{j=1}^{n} c_jx_j$
subject to the constraints
$\sum_{j=1}^{n} a_{ij}x_j = b_i,\ i = 1, 2, \ldots, m$
and $x_j \ge 0,\ j = 1, 2, \ldots, n$
is known as the standard form of the LPP. The characteristics of this form are as follows:
1. All the constraints are expressed in the form of equations, except the non-negativity restrictions.
2. The right-hand side of each constraint equation is non-negative.
An inequality constraint can be converted to an equation by introducing a slack/surplus variable on the left-hand side of that constraint.
3.5 Fundamental Theorem of LPP

Theorem If the LPP admits an optimal solution, then the optimal solution coincides with at least one basic feasible solution of the problem.

Proof Here we shall prove the theorem for the maximization problem. Let us assume that $x^*$ is an optimal solution of the following LPP:
Maximize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$.  (3.4)
Here $A$ is an $m \times n$ ($m < n$) matrix given by $A = [a_1, a_2, \ldots, a_n]$, where $a_j$ is the $m$-component column vector $a_j = [a_{1j}, a_{2j}, \ldots, a_{mj}]^T$, $j = 1, 2, \ldots, n$.

Without any loss of generality, let us assume that the first $k$ components of $x^*$ are non-zero and the remaining $(n-k)$ components are zero. Thus, $x^* = [x_1, x_2, \ldots, x_k, 0, 0, \ldots, 0]^T$. Then, from (3.4), $Ax^* = b$ gives
$\sum_{j=1}^{k} a_jx_j = b$, i.e. $a_1x_1 + a_2x_2 + \cdots + a_kx_k = b$.  (3.5)
Also $z^* = z_{\max} = \sum_{j=1}^{k} c_jx_j$.

Now, if the vectors $a_1, a_2, \ldots, a_k$ corresponding to the non-zero components of $x^*$ are linearly independent, then by definition $x^*$ is a basic feasible solution; hence, the theorem holds in this case. If $k = m$, the basic feasible solution is non-degenerate; if $k < m$, it is a degenerate basic feasible solution with $m - k$ of the basic variables equal to zero.

But if the vectors $a_1, a_2, \ldots, a_k$ are not linearly independent, then they must be linearly dependent, and there exist scalars $\lambda_j$, $j = 1, 2, \ldots, k$, with at least one non-zero $\lambda_j$, such that
$\lambda_1a_1 + \lambda_2a_2 + \cdots + \lambda_ka_k = 0.$  (3.6)
Let us suppose at least one $\lambda_j > 0$; if no non-zero $\lambda_j$ is positive, then by multiplying (3.6) by $(-1)$, one can get a positive $\lambda_j$. Let
$\mu = \max_{1 \le j \le k}\dfrac{\lambda_j}{x_j}.$  (3.7)
This $\mu$ is positive, as $x_j > 0$ for all $j = 1, 2, \ldots, k$ and at least one $\lambda_j$ is positive. Dividing (3.6) by $\mu$ and subtracting it from (3.5), we have
$\left(x_1 - \dfrac{\lambda_1}{\mu}\right)a_1 + \left(x_2 - \dfrac{\lambda_2}{\mu}\right)a_2 + \cdots + \left(x_k - \dfrac{\lambda_k}{\mu}\right)a_k = b$,
and hence
$X_1 = \left[x_1 - \dfrac{\lambda_1}{\mu},\ \ldots,\ x_k - \dfrac{\lambda_k}{\mu},\ 0, 0, \ldots, 0\right]^T$  (3.8)
is a solution of the system of equations $Ax = b$.
Again, from (3.7), we have
$\mu \ge \dfrac{\lambda_j}{x_j}$, or $x_j \ge \dfrac{\lambda_j}{\mu}$, or $x_j - \dfrac{\lambda_j}{\mu} \ge 0$, for $j = 1, 2, \ldots, k$.
This means that all the components of $X_1$ are non-negative, and hence $X_1$ is a feasible solution of $Ax = b$, $x \ge 0$. Again, for at least one value of $j$, we have from (3.7) $\mu = \lambda_j/x_j$, or, in other words, $x_j - \lambda_j/\mu = 0$ for at least one value of $j$. Hence, the feasible solution $X_1$ contains at least one more zero; i.e. $X_1$ cannot contain more than $(k-1)$ non-zero variables. Therefore, the number of positive variables in an optimal solution can be reduced.

Now we have to show that $X_1$ is optimal. Let $z'$ be the value of the objective function for $x = X_1$. Then
$z' = \sum_{j=1}^{k} c_j\left(x_j - \dfrac{\lambda_j}{\mu}\right) = \sum_{j=1}^{k} c_jx_j - \dfrac{1}{\mu}\sum_{j=1}^{k} c_j\lambda_j = z^* - \dfrac{1}{\mu}\sum_{j=1}^{k} c_j\lambda_j \quad \left[\because z^* = \sum_{j=1}^{k} c_jx_j\right].$  (3.9)
If possible, let us suppose that
$\sum_{j=1}^{k} c_j\lambda_j \ne 0.$
Then for a suitable real number $\beta$, we can write
$\beta(c_1\lambda_1 + c_2\lambda_2 + \cdots + c_k\lambda_k) > 0,$  (3.10)
that is, $c_1(\beta\lambda_1) + c_2(\beta\lambda_2) + \cdots + c_k(\beta\lambda_k) > 0$. Adding $(c_1x_1 + c_2x_2 + \cdots + c_kx_k)$ to both sides, we get
$c_1(x_1 + \beta\lambda_1) + c_2(x_2 + \beta\lambda_2) + \cdots + c_k(x_k + \beta\lambda_k) > c_1x_1 + c_2x_2 + \cdots + c_kx_k = z^*.$  (3.11)
Again, multiplying (3.6) by $\beta$ and adding it to (3.5), we get
$(x_1 + \beta\lambda_1)a_1 + (x_2 + \beta\lambda_2)a_2 + \cdots + (x_k + \beta\lambda_k)a_k = b.$
This relation implies that
$[(x_1 + \beta\lambda_1), \ldots, (x_k + \beta\lambda_k), 0, 0, \ldots, 0]^T$  (3.12)
is also a solution of the system $Ax = b$. Now we choose $\beta$ in such a way that $x_j + \beta\lambda_j \ge 0$ for all $j = 1, 2, \ldots, k$, i.e.
$\beta \ge -\dfrac{x_j}{\lambda_j}$ if $\lambda_j > 0$, $\quad \beta \le -\dfrac{x_j}{\lambda_j}$ if $\lambda_j < 0$, $\quad \beta$ unrestricted if $\lambda_j = 0$.
Then
$\max_j\left\{-\dfrac{x_j}{\lambda_j} : \lambda_j > 0\right\} \le \beta \le \min_j\left\{-\dfrac{x_j}{\lambda_j} : \lambda_j < 0\right\}.$
For such a $\beta$, (3.12) becomes a feasible solution of $Ax = b$, $x \ge 0$. Hence, from (3.11) it is clear that the feasible solution (3.12) gives a greater value of the objective function than $z^*$ for $x^*$. But this contradicts our assumption that $z^*$ is the optimal value, and thus we must have
$\sum_{j=1}^{k} c_j\lambda_j = 0.$
Hence, $X_1$ is also an optimal solution. Thus, from the given optimal solution, we can construct a new optimal solution whose number of non-zero variables is less than that of the given solution. If the vectors associated with the new non-zero variables are linearly independent, then the new solution is a basic feasible solution, and hence the theorem is proved.
If the new solution is not a basic feasible solution, then we can further diminish the number of non-zero variables as above to get a new set of optimal solutions. We may continue the process until the optimal solution obtained is a basic feasible solution. This proves the theorem. For minimization problems, the theorem can be proved in a similar way.
3.6 Replacement of a Basis Vector

Theorem Let an LPP have a basic feasible solution. If one of the basis vectors is removed from the basis and a non-basic vector is introduced into the basis, then the new solution obtained is also a basic feasible solution.

Proof Let us consider the LPP
Maximize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$,
where $c, x \in R^n$, $b \in R^m$ and $A$ is an $m \times n$ real matrix. Also, let the rank of $A$ be $m$. Let $x_B$ be a basic feasible solution to the LPP, so that
$Bx_B = b,$  (3.13)
where $B$ is the basis matrix. Let $a_j$ be any column vector of $A$. Then
$a_j = y_{1j}B_1 + y_{2j}B_2 + \cdots + y_{rj}B_r + \cdots + y_{mj}B_m,$  (3.14)
where $B_i \in B$ and the $y_{ij}$ are suitable scalars. Now, we know that if a basis vector $B_r$ with non-zero coefficient is replaced by $a_j \in A$, then the new set of vectors also forms a basis. For $y_{rj} \ne 0$, from (3.14) we can write
$B_r = \dfrac{a_j}{y_{rj}} - \sum_{i=1,\,i\ne r}^{m}\dfrac{y_{ij}}{y_{rj}}B_i.$  (3.15)
From (3.13), $Bx_B = b$, which implies
$\sum_{i=1,\,i\ne r}^{m} x_{B_i}B_i + x_{B_r}B_r = b$
or $\sum_{i=1,\,i\ne r}^{m} x_{B_i}B_i + x_{B_r}\left[\dfrac{a_j}{y_{rj}} - \sum_{i=1,\,i\ne r}^{m}\dfrac{y_{ij}}{y_{rj}}B_i\right] = b$
or $\sum_{i=1,\,i\ne r}^{m}\left[x_{B_i} - x_{B_r}\dfrac{y_{ij}}{y_{rj}}\right]B_i + \dfrac{x_{B_r}}{y_{rj}}a_j = b.$
From the preceding relation, it is clear that the components of the new basic solution $\hat{x}_B$ are given by
$\hat{x}_{B_i} = x_{B_i} - x_{B_r}\dfrac{y_{ij}}{y_{rj}},\ i = 1, 2, \ldots, m,\ i \ne r,$ and $\hat{x}_{B_r} = \dfrac{x_{B_r}}{y_{rj}}.$
Now we shall show that $\hat{x}_B$ is also feasible; i.e. all the components of $\hat{x}_B$ are non-negative. According to the value of $x_{B_r}$, two cases may arise.

Case 1: $x_{B_r} = 0$. In this case, $\hat{x}_{B_i} = x_{B_i} \ge 0$, $i = 1, 2, \ldots, m$, $i \ne r$, and $\hat{x}_{B_r} = 0$, i.e. $\hat{x}_B \ge 0$.

Case 2: $x_{B_r} \ne 0$. In this case, we must have $y_{rj} > 0$. For the remaining $y_{ij}$ ($i \ne r$), either $y_{ij} \le 0$, in which case $\hat{x}_{B_i} \ge x_{B_i} \ge 0$ automatically, or $y_{ij} > 0$, in which case $\hat{x}_{B_i} \ge 0$ requires
$\dfrac{x_{B_i}}{y_{ij}} \ge \dfrac{x_{B_r}}{y_{rj}}.$
Therefore, if we select $r$ (with $y_{rj} \ne 0$) in such a way that
$\dfrac{x_{B_r}}{y_{rj}} = \min_i\left\{\dfrac{x_{B_i}}{y_{ij}} : y_{ij} > 0,\ i \ne r\right\},$
then the new set of basic variables is non-negative; hence, the basic solution $\hat{x}_B$ is feasible.
3.7 Improved Basic Feasible Solution
Theorem Let $x_B$ be a basic feasible solution to the LPP
Maximize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$,
where $c, x \in R^n$, $b \in R^m$ and $A$ is an $m \times n$ real matrix. If $\hat{x}_B$ is another basic feasible solution obtained by introducing a non-basic column vector $a_j$ into the basis, for which $z_j - c_j$ is negative, then $\hat{x}_B$ is an improved basic feasible solution to the problem, i.e.
$\langle\hat{c}_B^T, \hat{x}_B\rangle > \langle c_B^T, x_B\rangle.$

Proof It is given that $x_B$ is a basic feasible solution. Let $z_0 = \langle c_B^T, x_B\rangle$. Let $a_j$ be the column vector introduced into the basis such that $z_j - c_j < 0$. Let $B_r$ be the vector removed from the basis and let $\hat{x}_B$ be the new basic feasible solution. Then
$\hat{x}_{B_i} = x_{B_i} - x_{B_r}\dfrac{y_{ij}}{y_{rj}}\ (i \ne r)$ and $\hat{x}_{B_r} = \dfrac{x_{B_r}}{y_{rj}}.$
Hence, the new value of the objective function is given by
$\hat{z} = \sum_{i=1}^{m}\hat{c}_{B_i}\hat{x}_{B_i} = \sum_{i=1,\,i\ne r}^{m} c_{B_i}\left(x_{B_i} - x_{B_r}\dfrac{y_{ij}}{y_{rj}}\right) + c_j\dfrac{x_{B_r}}{y_{rj}} \quad [\because \hat{c}_{B_r} = c_j]$
$= \sum_{i=1}^{m} c_{B_i}x_{B_i} - \dfrac{x_{B_r}}{y_{rj}}\sum_{i=1}^{m} c_{B_i}y_{ij} + c_j\dfrac{x_{B_r}}{y_{rj}}$
$= z_0 - \dfrac{x_{B_r}}{y_{rj}}(z_j - c_j) \quad \left[\because z_j = \sum_{i=1}^{m} c_{B_i}y_{ij}\right]$
$> z_0.$
Hence, the new basic feasible solution $\hat{x}_B$ gives an improved value of the objective function.

Corollary If $z_j - c_j = 0$ for at least one $j$ corresponding to a non-basis vector for which $y_{ij} > 0$, $i = 1, 2, \ldots, m$, then a new basic feasible solution is obtained which gives an unchanged value of the objective function.
Proof From the preceding theorem,
$\hat{z} = z_0 - \dfrac{x_{B_r}}{y_{rj}}(z_j - c_j) = z_0,\quad y_{rj} > 0 \quad [\because z_j - c_j = 0].$

Note According to the preceding theorem, for the maximum improvement, determine the entering vector corresponding to the net evaluation as follows:
$z_k - c_k = \min_j\{z_j - c_j : z_j - c_j < 0\}.$
3.8 Optimality Condition
Theorem For a basic feasible solution $x_B$ of an LPP
Maximize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$,
where $c, x \in R^n$ and $b \in R^m$, if $z_j - c_j \ge 0$ for every column vector $a_j$ of $A$, then $x_B$ is an optimal solution.

Proof Let $B = (B_1, B_2, \ldots, B_m)$ be the basis matrix corresponding to the basic feasible solution $x_B$, so that $Bx_B = b$. Also, let $z_B$ be the value of the objective function corresponding to $x_B$. Hence, $z_B = \langle c_B^T, x_B\rangle = \sum_{i=1}^{m} c_{B_i}x_{B_i}$.

Let $X_1 = [x_1', x_2', \ldots, x_n']^T$ be any other feasible solution of the LPP and let $z'$ be the value of the objective function corresponding to this solution. Then we have $Bx_B = b = AX_1$, so that
$x_B = B^{-1}(AX_1) = (B^{-1}A)X_1 = yX_1,$  (3.16)
where $B^{-1}A = y = [y_1, y_2, \ldots, y_n]$. Now equating the $i$th components of (3.16) on both sides, we have
$x_{B_i} = \sum_{j=1}^{n} y_{ij}x_j'.$  (3.17)
It is given that for every column vector $a_j$ of $A$,
$z_j - c_j \ge 0$, i.e. $z_j \ge c_j$, $j = 1, 2, \ldots, n$.  (3.18)
Hence, for all $j = 1, 2, \ldots, n$, $z_jx_j' \ge c_jx_j'$, as $x_j' \ge 0$,
or $\sum_{j=1}^{n} z_jx_j' \ge \sum_{j=1}^{n} c_jx_j' = z'$
or $\sum_{j=1}^{n} x_j'\langle c_B^T, y_j\rangle \ge z'$
or $\sum_{j=1}^{n} x_j'\sum_{i=1}^{m} c_{B_i}y_{ij} \ge z'$
or $\sum_{i=1}^{m} c_{B_i}\left(\sum_{j=1}^{n} y_{ij}x_j'\right) \ge z'$
or $\sum_{i=1}^{m} c_{B_i}x_{B_i} \ge z'$  $\left[\text{from (3.17)},\ x_{B_i} = \sum_{j=1}^{n} y_{ij}x_j'\right]$
or $z_B \ge z'$.
This shows that the value of $z_B$ is not less than the value of the objective function for any other feasible solution, and hence $z_B$ is the maximum value of the objective function. For a minimization problem, the optimality condition is as follows.
Theorem For a basic feasible solution $x_B$ of an LPP
Minimize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$,
where $c, x \in R^n$, $b \in R^m$ and $A$ is an $m \times n$ matrix, if $z_j - c_j \le 0$ for every column vector $a_j$ of $A$, then $x_B$ is an optimal solution.
3.9 Unbounded Solution
If the objective function has no finite optimum value and can be increased or decreased arbitrarily, then we say that the problem has an unbounded solution.

Theorem If at any iteration of the simplex algorithm, $z_j - c_j < 0$ for at least one $j$, and for this $j$, $y_{ij} \le 0$ for all $i = 1, 2, \ldots, m$, then the (maximization) LPP admits an unbounded solution.

Proof Let $x_B$ be the basic feasible solution of the LPP with basis matrix $B$ at an iteration at which $z_j - c_j < 0$ and $y_{ij} \le 0$ for all $i = 1, 2, \ldots, m$. If $z_B$ is the value of the objective function for this basic feasible solution, then
$z_B = \langle c_B^T, x_B\rangle$ and $Bx_B = b.$  (3.19)
Again, $Bx_B = b$ implies that
$\sum_{i=1}^{m} x_{B_i}B_i = b,$  (3.20)
where $B = (B_1, B_2, \ldots, B_m)$, i.e. the $B_i$'s are the column vectors of $B$. In (3.20), adding and subtracting $\gamma a_j$, where $\gamma$ is a positive scalar, we get
$\sum_{i=1}^{m} x_{B_i}B_i + \gamma a_j - \gamma a_j = b.$  (3.21)
Again, we know that $a_j = \sum_{i=1}^{m} y_{ij}B_i$, i.e. $\gamma a_j = \gamma\sum_{i=1}^{m} y_{ij}B_i$. Using this result in (3.21), we have
$\sum_{i=1}^{m} x_{B_i}B_i + \gamma a_j - \gamma\sum_{i=1}^{m} y_{ij}B_i = b$
or $\sum_{i=1}^{m}(x_{B_i} - \gamma y_{ij})B_i + \gamma a_j = b.$  (3.22)
Since $\gamma$ is positive and $y_{ij} \le 0$ for $i = 1, 2, \ldots, m$, we have
$x_{B_i} - \gamma y_{ij} \ge 0.$  (3.23)
Now, from (3.22) and (3.23), it is seen that we have obtained a new feasible solution in which the $(m+1)$ variables $x_{B_i} - \gamma y_{ij}$, together with $\gamma$, can be different from zero and the remaining components are zero. This solution is in general feasible but not basic. Then
$z' = \sum_{i=1}^{m} c_{B_i}(x_{B_i} - \gamma y_{ij}) + c_j\gamma = \sum_{i=1}^{m} c_{B_i}x_{B_i} - \gamma\sum_{i=1}^{m} c_{B_i}y_{ij} + c_j\gamma = z_B - \gamma(z_j - c_j).$  (3.24)
From (3.24), it is observed that $z'$ is always greater than $z_B$, as $\gamma > 0$ and $z_j - c_j < 0$. As $\gamma$ increases, $z'$ also goes on increasing, and by making $\gamma$ sufficiently large, we can make $z'$ as large as desired. Hence, we conclude that under such circumstances there is no upper bound on the objective function, although the solution is feasible; the problem admits an unbounded solution.
For a minimization problem, the corresponding property is as follows: if at any iteration of the simplex algorithm we get $z_j - c_j > 0$ for at least one $j$, and for this $j$, $y_{ij} \le 0$ for all $i = 1, 2, \ldots, m$, then the (minimization) LPP admits an unbounded solution.

Theorem Any convex combination of $k$ different optimal solutions to an LPP is again an optimal solution to the problem.

Proof Let $X_1, X_2, \ldots, X_k$ be $k$ different optimal solutions of the following LPP:
Maximize $z = \langle c^T, x\rangle$ subject to $Ax = b$ and $x \ge 0$,
where $c, x \in R^n$, $b \in R^m$ and $A$ is an $m \times n$ real matrix.
Let $z_0$ be the optimum value of $z$. Hence,
$z_0 = \langle c, X_1\rangle = \langle c, X_2\rangle = \cdots = \langle c, X_k\rangle$  (3.25)
and
$AX_1 = AX_2 = \cdots = AX_k = b,$  (3.26)
where $X_1 \ge 0, X_2 \ge 0, \ldots, X_k \ge 0$. Now, any convex combination of $X_1, X_2, \ldots, X_k$ can be written as
$x = \sum_{i=1}^{k} a_iX_i$, where the $a_i$'s are non-negative scalars such that $\sum_{i=1}^{k} a_i = 1$.
Now
$Ax = A\left(\sum_{i=1}^{k} a_iX_i\right) = \sum_{i=1}^{k} a_i(AX_i) = \sum_{i=1}^{k} a_ib$  [using (3.26)]
$= b\sum_{i=1}^{k} a_i = b$, since $\sum_{i=1}^{k} a_i = 1$.
Again, since $X_i \ge 0$ and $a_i \ge 0$ for all $i = 1, 2, \ldots, k$, every component of $x$ is non-negative. Thus, $x$ is a feasible solution to the LPP. Again,
$\langle c^T, x\rangle = \left\langle c^T, \sum_{i=1}^{k} a_iX_i\right\rangle = \sum_{i=1}^{k} a_i\langle c^T, X_i\rangle = \sum_{i=1}^{k} a_iz_0 = z_0 \quad \left[\because \sum_{i=1}^{k} a_i = 1\right].$
Hence, a convex combination of $k$ different optimal solutions to an LPP is also an optimal solution to the LPP.
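A quick numerical check of the theorem above, on a hypothetical two-variable LPP (not from the text) that has more than one optimum: max $z = x_1 + x_2$ subject to $x_1 + x_2 \le 1$, $x_1, x_2 \ge 0$, which is optimized by both $X_1 = (1, 0)$ and $X_2 = (0, 1)$ with $z_0 = 1$.

```python
# Every convex combination a*X1 + (1-a)*X2 of two optima stays feasible
# and attains the same optimal objective value z0 = 1.

X1, X2 = (1.0, 0.0), (0.0, 1.0)
combos = [tuple(a * p + (1 - a) * q for p, q in zip(X1, X2))
          for a in (0.0, 0.25, 0.5, 0.75, 1.0)]
objs = [x1 + x2 for x1, x2 in combos]        # objective at each combination
feas = [x1 + x2 <= 1 + 1e-12 and x1 >= 0 and x2 >= 0 for x1, x2 in combos]
print(objs, all(feas))
```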
3.10 Simplex Algorithm
The iterative procedure for finding an optimal basic feasible solution (BFS) to an LPP is as follows:

Step 1: Check whether the given LPP is a maximization or minimization problem. If it is a minimization problem, convert it into a maximization problem using Minimize z = -Maximize(-z), where z is the objective function of the given LPP.

Step 2: Check whether all bi (i = 1, 2, ..., m) are non-negative. If any bi is negative, multiply the corresponding inequality/equation of the constraints by (-1) so that all bi (i = 1, 2, ..., m) become positive.

Step 3: Convert all the inequality constraints into equations by introducing slack and/or surplus variables in the constraints. Put the coefficients of these variables in the objective function as zero.

Step 4: Obtain an initial BFS xB using xB = B^(-1) b, where B is the initial unit basis matrix. Alternatively, obtain the initial BFS by setting the non-basic variables to zero.

Step 5: Construct the simplex table.

Step 6: Compute the net evaluations zj - cj (j = 1, 2, ..., n) by the formula zj - cj = <cB, yj> - cj and examine their signs. Three cases may arise:
(i) If all zj - cj >= 0 and either no artificial variable is in the basis or artificial variables appear in the basis at zero level, then the current BFS is optimal. Stop the process.
(ii) If all zj - cj >= 0 but artificial variable(s) are in the basis at a positive level, then the given LPP has no feasible solution.
(iii) If at least one zj - cj < 0, go to the next step.

Step 7: If at least one zj - cj < 0, choose the most negative of the zj - cj. Let it be zr - cr for some j = r. Two cases may arise:
(i) If all yir <= 0 (i = 1, 2, ..., m), then the given LPP has an unbounded solution. Stop the process.
(ii) If at least one yir > 0, then the corresponding vector yr enters the basis. This vector is known as the incoming or entering vector.

Step 8: To select the departing or outgoing vector, compute Min{xBi/yir : yir > 0}. Let the minimum value be xBk/ykr. Then the vector yk will leave the basis. This vector is known as the departing or outgoing vector. The common element ykr, lying in the kth row and the rth column, is known as the key or leading element.

Step 9: Convert the key element to unity by dividing all the elements of its row by the key element, and reduce all other elements in its column to zero, using the operations

ŷkj = ykj / ykr, j = 1, 2, ..., n
ŷij = yij - (yir / ykr) ykj, i = 1, 2, ..., m, i ≠ k, j = 1, 2, ..., n.

Then modify the components of xB by the relations

x̂Bk = xBk / ykr
x̂Bi = xBi - (yir / ykr) xBk, i = 1, 2, ..., m (i ≠ k).

Using these values, construct the next simplex table and go to Step 6.
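The nine steps above can be sketched in code. The following is a minimal, illustrative Python implementation (the function name, list layout and tolerances are ours, not the text's) for the all-"<=" case with non-negative b, where the slack columns supply the initial unit basis of Step 4; it has no anti-cycling safeguards:

```python
# A minimal tableau version of Steps 1-9 for: maximize <c, x>
# subject to A x <= b, x >= 0 (slack variables give the unit basis).
def simplex_max(c, A, b):
    m, n = len(A), len(c)
    # Each row holds the x-columns, the slack columns, and the RHS (xB).
    rows = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
            for i in range(m)]
    cost = c[:] + [0.0] * m          # profit vector with zero slack costs
    basis = list(range(n, n + m))    # slack variables form the initial basis
    while True:
        # Step 6: net evaluations zj - cj = <cB, yj> - cj
        z_c = [sum(cost[basis[i]] * rows[i][j] for i in range(m)) - cost[j]
               for j in range(n + m)]
        r = min(range(n + m), key=lambda j: z_c[j])
        if z_c[r] >= -1e-9:          # all zj - cj >= 0: current BFS optimal
            break
        # Step 8: minimum-ratio rule picks the departing row k
        ratios = [(rows[i][-1] / rows[i][r], i)
                  for i in range(m) if rows[i][r] > 1e-9]
        if not ratios:               # Step 7(i): unbounded solution
            return None, None
        _, k = min(ratios)
        # Step 9: pivot on the key element
        piv = rows[k][r]
        rows[k] = [v / piv for v in rows[k]]
        for i in range(m):
            if i != k:
                f = rows[i][r]
                rows[i] = [v - f * w for v, w in zip(rows[i], rows[k])]
        basis[k] = r
    x = [0.0] * n
    for i in range(m):
        if basis[i] < n:
            x[basis[i]] = rows[i][-1]
    z = sum(ci * xi for ci, xi in zip(cost[:n], x))
    return x, z
```

Run on the data of Example 1 below (maximize 5x1 + 4x2 subject to 3x1 + 4x2 <= 24, 3x1 + 2x2 <= 18, x2 <= 5), it reproduces x1 = 4, x2 = 3 with z = 32.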
3.11 Simplex Table

Let us consider the following standard LPP:

Maximize z = <c^T, x>
subject to Ax = b and x >= 0,

where
c = (c1, c2, ..., cn) = the profit vector
x = (x1, x2, ..., xn)^T = the decision vector
A = (a1, a2, ..., an) = the coefficient matrix
b = (b1, b2, ..., bm)^T = the requirement vector
cB = (cB1, cB2, ..., cBm)^T = the m-component column vector whose elements correspond to the basic variables
B = (B1, B2, ..., Bm) = the basis matrix
xB = B^(-1) b = the basic feasible solution
yj = B^(-1) aj, j = 1, 2, ..., n.

The simplex table is laid out as follows:

                      cj :    c1      c2     ...   cj      ...   cn
  cB     yB     xB            y1      y2     ...   yj      ...   yn
  cB1    yB1    xB1           y11     y12    ...   y1j     ...   y1n
  ...    ...    ...           ...     ...    ...   ...     ...   ...
  cBm    yBm    xBm           ym1     ym2    ...   ymj     ...   ymn
  zB = <cB, xB>    zj - cj:  z1-c1   z2-c2   ...  zj-cj    ...  zn-cn

Initially, B = Im, a unit matrix of order m.
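In code, these initial-table quantities are immediate (a sketch with illustrative Example-1-style data; with B = Im no inversion is needed, since B^(-1) = Im):

```python
# Quantities of the initial simplex table when B = Im:
# xB = B^(-1) b = b and yj = B^(-1) aj = aj directly.
c = [5, 4, 0, 0, 0]              # profit vector (slack costs are 0)
A = [[3, 4, 1, 0, 0],
     [3, 2, 0, 1, 0],
     [0, 1, 0, 0, 1]]            # columns a1..a5; a3, a4, a5 form B = I3
b = [24, 18, 5]
basis = [2, 3, 4]                # indices of the basic (slack) variables
cB = [c[j] for j in basis]       # = (0, 0, 0)
xB = b[:]                        # xB = B^(-1) b = Im b = b
net = [sum(cB[i] * A[i][j] for i in range(3)) - c[j] for j in range(5)]
print(net)                       # zj - cj row of the starting table
```

Since every basic cost is zero here, the zj - cj row is simply -cj, i.e. (-5, -4, 0, 0, 0).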
For the initial simplex table,

xB = B^(-1) b = Im b = b   [since B = Im implies B^(-1) = Im]
yj = B^(-1) aj = aj.

Example 1 Solve the following LPP:

Maximize z = 5x1 + 4x2
subject to 3x1 + 4x2 <= 24
           3x1 + 2x2 <= 18
           x2 <= 5
and x1, x2 >= 0.

Solution: This is a maximization problem. All the bi's are positive. Now, introducing the slack variables x3, x4, x5, one for each constraint, we get the system of equations

3x1 + 4x2 + x3 = 24
3x1 + 2x2 + x4 = 18
x2 + x5 = 5.                                                (1)

Then the new objective function is z = 5x1 + 4x2 + 0.x3 + 0.x4 + 0.x5. The system of equations (1) can be written as

[ 3  4  1  0  0 ] [x1]   [24]
[ 3  2  0  1  0 ] [x2] = [18]
[ 0  1  0  0  1 ] [x3]   [ 5]
                  [x4]
                  [x5]

or A'x = b, where

A' = [ 3  4  1  0  0 ]
     [ 3  2  0  1  0 ]
     [ 0  1  0  0  1 ].

Hence, the rank of A' is 3. Then (1, 0, 0)^T, (0, 1, 0)^T and (0, 0, 1)^T are three linearly independent column vectors of A'. Therefore, the submatrix

B = [ 1  0  0 ]
    [ 0  1  0 ]
    [ 0  0  1 ]

is a non-singular basis submatrix of A'. The basic variables are therefore x3, x4, x5, and an obvious initial basic feasible solution is

xB = B^(-1) b, i.e. (x3, x4, x5)^T = Im (24, 18, 5)^T = (24, 18, 5)^T
=> x3 = 24, x4 = 18, x5 = 5,

and cB = (c3, c4, c5) = (0, 0, 0).
Now we construct the starting simplex table.

                    cj :    5    4    0    0    0
  cB    yB    xB           y1   y2   y3   y4   y5
  0     y3    x3 = 24       3    4    1    0    0
  0     y4    x4 = 18       3    2    0    1    0
  0     y5    x5 = 5        0    1    0    0    1
  zB = 0         zj - cj:  -5   -4    0    0    0

Here not all zj - cj are greater than or equal to zero. Hence, the current solution is not optimal. Now, Minimum{zj - cj : zj - cj < 0} = Minimum{-5, -4} = -5, which is z1 - c1. Hence, y1 is the incoming vector.

Again, Minimum{xBi/yi1 : yi1 > 0} = Minimum{24/3, 18/3} = 6, which occurs for the element y21 (= 3). Hence, y21 = 3 is the key element and y4 is the outgoing vector. In the key row, we then replace 0, y4, x4 under cB, yB, xB by 5, y1, x1 respectively, the corresponding entries of the key column.
Now converting the key element to unity and all other elements of its column to zero by suitable operations, we get the new simplex table as follows:

                    cj :    5     4     0     0    0
  cB    yB    xB           y1    y2    y3    y4   y5
  0     y3    x3 = 6        0     2     1    -1    0
  5     y1    x1 = 6        1    2/3    0   1/3    0
  0     y5    x5 = 5        0     1     0     0    1
  zB = 30        zj - cj:   0   -2/3    0   5/3    0

Here z2 - c2 < 0. Hence, the current solution is not optimal. Now Minimum{zj - cj : zj - cj < 0} = Minimum{-2/3} = -2/3, which is z2 - c2. Hence, y2 is the incoming vector.

Again, Minimum{xBi/yi2 : yi2 > 0} = Minimum{6/2, 6/(2/3), 5/1} = 3, which occurs for the element y12 (= 2). Hence, y12 = 2 is the key element and y3 is the outgoing vector. In the key row, we then replace 0, y3, x3 under cB, yB, xB by 4, y2, x2 respectively, the corresponding entries of the key column. Now converting the key element to unity and all other elements of its column to zero by suitable operations, we construct the new simplex table as follows:

                    cj :    5    4     0     0    0
  cB    yB    xB           y1   y2    y3    y4   y5
  4     y2    x2 = 3        0    1    1/2  -1/2   0
  5     y1    x1 = 4        1    0   -1/3   2/3   0
  0     y5    x5 = 2        0    0   -1/2   1/2   1
  zB = 32        zj - cj:   0    0    1/3   4/3   0
Here all zj - cj >= 0. Hence, the current BFS is optimal, and the optimal solution is x1 = 4, x2 = 3 and Max z = 32.

Example 2 Solve the following LPP:

Maximize z = 10x1 + x2 + 2x3
subject to x1 + x2 - 2x3 <= 10
           4x1 + x2 + x3 <= 20
and x1, x2, x3 >= 0.
Solution: This is a maximization problem. All the bi's are positive. Now, introducing the slack variables x4, x5, one for each constraint, we get the problem in the standard form as follows:

Maximize z = 10x1 + x2 + 2x3 + 0x4 + 0x5
subject to x1 + x2 - 2x3 + x4 = 10
           4x1 + x2 + x3 + x5 = 20
and x1, x2, x3, x4, x5 >= 0.

Clearly, we may take x4 and x5 as basic variables. Setting the other variables to zero as non-basic, we get x4 = 10 and x5 = 20, which is the initial basic feasible solution. Now we construct the initial simplex table.

                    cj :   10    1    2    0    0
  cB    yB    xB           y1   y2   y3   y4   y5    Min ratio
  0     y4    x4 = 10       1    1   -2    1    0    10/1 = 10
  0     y5    x5 = 20       4    1    1    0    1    20/4 = 5
  zB = 0         zj - cj: -10   -1   -2    0    0

Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y5 is the outgoing vector and y21 = 4 is the key element. Now we construct the next simplex table.

                    cj :   10    1     2     0     0
  cB    yB    xB           y1   y2    y3    y4    y5
  0     y4    x4 = 5        0   3/4  -9/4    1   -1/4
  10    y1    x1 = 5        1   1/4   1/4    0    1/4
  zB = 50        zj - cj:   0   3/2   1/2    0    5/2
Here all zj - cj >= 0. Hence, an optimum solution has been attained, and this solution is given by x1 = 5, x2 = 0, x3 = 0 and Max z = 50.

Example 3 Solve the following LPP:

Maximize z = 3x1 + x2 + 3x3
subject to 2x1 + x2 + x3 <= 2
           x1 + 2x2 + 3x3 <= 5
           2x1 + 2x2 + x3 <= 6
and x1, x2, x3 >= 0.
Solution: This is a maximization problem. All the bi's are positive. Now, introducing the slack variables x4, x5, x6, one for each constraint, we get the problem in the standard form as follows:

Maximize z = 3x1 + x2 + 3x3 + 0x4 + 0x5 + 0x6
subject to 2x1 + x2 + x3 + x4 = 2
           x1 + 2x2 + 3x3 + x5 = 5
           2x1 + 2x2 + x3 + x6 = 6
and xi >= 0, i = 1, 2, ..., 6.

Let us take x4, x5 and x6 as basic variables. Setting the other variables to zero as non-basic, we get x4 = 2, x5 = 5 and x6 = 6, which is the initial basic feasible solution. Now we construct the initial simplex table.

                    cj :    3    1    3    0    0    0
  cB    yB    xB           y1   y2   y3   y4   y5   y6    Min ratio
  0     y4    x4 = 2        2    1    1    1    0    0    2/2 = 1
  0     y5    x5 = 5        1    2    3    0    1    0    5/1 = 5
  0     y6    x6 = 6        2    2    1    0    0    1    6/2 = 3
  zB = 0         zj - cj:  -3   -1   -3    0    0    0

Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y4 is the outgoing vector and y11 = 2 is the key element. Now we construct the next simplex table.

                    cj :    3     1     3     0    0    0
  cB    yB    xB           y1    y2    y3    y4   y5   y6    Min ratio
  3     y1    x1 = 1        1   1/2   1/2   1/2    0    0    1/(1/2) = 2
  0     y5    x5 = 4        0   3/2   5/2  -1/2    1    0    4/(5/2) = 8/5
  0     y6    x6 = 4        0    1     0    -1     0    1    -
  zB = 3         zj - cj:   0   1/2  -3/2   3/2    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y3 is the incoming vector, y5 is the outgoing vector and y23 = 5/2 is the key element. Now we construct the next simplex table.

                    cj :    3     1     3     0     0    0
  cB    yB    xB           y1    y2    y3    y4    y5   y6
  3     y1    x1 = 1/5      1   1/5    0    3/5  -1/5    0
  3     y3    x3 = 8/5      0   3/5    1   -1/5   2/5    0
  0     y6    x6 = 4        0    1     0    -1     0     1
  zB = 27/5      zj - cj:   0   7/5    0    6/5   3/5    0
Here all zj - cj >= 0. Hence, the optimality condition is satisfied, and the optimal solution is x1 = 1/5, x2 = 0, x3 = 8/5 and Max z = 27/5.

Example 4 Use the simplex method to solve the following LPP:

Minimize z = x1 - 3x2 + 2x3
subject to 3x1 - x2 + 2x3 <= 7
           -2x1 + 4x2 <= 12
           -4x1 + 3x2 + 8x3 <= 10
and x1, x2, x3 >= 0.

Solution: This is a minimization problem. All the bi's are positive. Let z' = -z = -x1 + 3x2 - 2x3. Hence, Minimize z = -Maximize z'. Now, introducing the slack variables x4, x5, x6, one for each constraint, we get the reduced problem in the standard form as follows:

Maximize z' = -x1 + 3x2 - 2x3 + 0x4 + 0x5 + 0x6
subject to 3x1 - x2 + 2x3 + x4 = 7
           -2x1 + 4x2 + x5 = 12
           -4x1 + 3x2 + 8x3 + x6 = 10
and x1, x2, x3, ..., x6 >= 0.

Clearly, we can select x4, x5 and x6 as the basic variables. Setting the other variables to zero as non-basic, we get x4 = 7, x5 = 12 and x6 = 10, which is the initial basic feasible solution. Now we construct the initial simplex table.
                    cj :   -1    3    -2    0    0    0
  cB    yB    xB           y1   y2    y3   y4   y5   y6    Min ratio
  0     y4    x4 = 7        3   -1     2    1    0    0    -
  0     y5    x5 = 12      -2    4     0    0    1    0    12/4 = 3
  0     y6    x6 = 10      -4    3     8    0    0    1    10/3
  z'B = 0        zj - cj:   1   -3     2    0    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y2 is the incoming vector, y5 is the outgoing vector and y22 = 4 is the key element. Now we construct the next simplex table.

                    cj :    -1    3    -2    0     0    0
  cB    yB    xB            y1   y2    y3   y4    y5   y6    Min ratio
  0     y4    x4 = 10      5/2    0     2    1   1/4    0    10/(5/2) = 4
  3     y2    x2 = 3      -1/2    1     0    0   1/4    0    -
  0     y6    x6 = 1      -5/2    0     8    0  -3/4    1    -
  z'B = 9        zj - cj: -1/2    0     2    0   3/4    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y4 is the outgoing vector and y11 = 5/2 is the key element. Now we construct the next simplex table.

                    cj :   -1    3    -2     0     0     0
  cB    yB    xB           y1   y2    y3    y4    y5    y6
  -1    y1    x1 = 4        1    0   4/5   2/5   1/10    0
  3     y2    x2 = 5        0    1   2/5   1/5   3/10    0
  0     y6    x6 = 11       0    0    10    1    -1/2    1
  z'B = 11       zj - cj:   0    0  12/5   1/5    4/5    0

Here all zj - cj >= 0. Hence, optimality has been reached, and the optimal solution is x1 = 4, x2 = 5, x3 = 0 and Min z = -Max z' = -11.
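As an independent cross-check of this answer (a duality certificate, not the book's method), the multipliers u below are read off the zj - cj entries under the slack columns y4, y5, y6 of the final table; treating them as an assumption, primal feasibility, dual feasibility and equal objective values together confirm optimality:

```python
# Weak-duality certificate for Example 4 (maximize z' = -x1 + 3x2 - 2x3).
A = [[3, -1, 2], [-2, 4, 0], [-4, 3, 8]]
b = [7, 12, 10]
c = [-1, 3, -2]
x = [4, 5, 0]                     # claimed optimal point
u = [1/5, 4/5, 0]                 # claimed dual multipliers (from zj - cj row)

# Primal feasibility: Ax <= b with x >= 0
assert all(sum(A[i][j] * x[j] for j in range(3)) <= b[i] + 1e-9
           for i in range(3))
# Dual feasibility: u >= 0 and A^T u >= c
assert all(sum(A[i][j] * u[i] for i in range(3)) >= c[j] - 1e-9
           for j in range(3))
# Equal objective values certify that both are optimal.
primal = sum(cj * xj for cj, xj in zip(c, x))   # Max z' = 11, so Min z = -11
dual = sum(bi * ui for bi, ui in zip(b, u))
assert abs(primal - dual) < 1e-9
```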
3.12 Use of Artificial Variables

In the standard form of an LPP, if the coefficient matrix does not contain a unit basis matrix, then a non-negative variable is added to the left side of each equation that lacks a starting basic variable. Such a variable is known as an artificial variable. It plays the same role as a slack variable in providing the initial basic
feasible solution. However, since such artificial variables have no physical meaning in the original problem, the method is valid only if these variables are driven out of the basis, or remain at zero level, when the optimum solution is attained. In other words, to return to the original problem, the artificial variables must be driven out or at zero level in the final solution; otherwise, the resulting solution may be infeasible. For solving an LPP involving artificial variables, the following two methods are generally employed:

(i) Big M method (also known as Charnes' Big M method or the method of penalties)
(ii) Two-phase method.
3.12.1 Big M Method

The Big M method is an alternative method for solving an LPP having artificial variables. This method is applied only when the coefficient matrix of the standard form of an LPP does not contain a unit basis matrix. In this method, a minimum number of artificial variables are introduced, and a very large negative price, say -M (M positive and very large), is assigned in the objective function to each artificial variable. Because of this very large negative price, the objective function cannot reach its optimum while one or more artificial variables remain in the basis at a positive level. The problem is then solved by the usual simplex algorithm, and finally the following cases may arise:

(i) At any iteration, if the optimality conditions are not satisfied, we proceed further to get an optimal solution, omitting from further consideration the columns of artificial variables that have left the basis. Here the artificial variables are used only as agents to obtain a unit basis.
(ii) At any iteration, if the optimality conditions are satisfied and all the vectors corresponding to artificial variables have been driven out of the basis, the solution obtained is an optimal solution of the given LPP.
(iii) At any iteration, if the optimality conditions are satisfied and at least one artificial vector is present in the basis but all the artificial variables are at zero level, the solution obtained is an optimal solution of the given LPP.
(iv) At any iteration, if the optimality conditions are satisfied and some of the artificial vectors are present in the basis at a positive level, then the given LPP has no feasible solution.
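The Big M idea can be sketched numerically (an illustration, not the book's procedure verbatim): for simplicity this sketch attaches an artificial variable to every equality row and uses a large finite M, whereas the text adds artificials only where a unit column is missing. Names and tolerances are ours.

```python
# Big M sketch for: maximize <c, x> subject to A x = b (b >= 0), x >= 0.
def big_m_simplex(c, A, b, M=1e6):
    m, n = len(A), len(c)
    T = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
         for i in range(m)]
    cost = c[:] + [-M] * m           # artificial variables priced at -M
    basis = list(range(n, n + m))    # artificials supply the unit basis
    while True:
        z_c = [sum(cost[basis[i]] * T[i][j] for i in range(m)) - cost[j]
               for j in range(n + m)]
        r = min(range(n + m), key=lambda j: z_c[j])
        if z_c[r] >= -1e-7:
            break
        rows = [(T[i][-1] / T[i][r], i) for i in range(m) if T[i][r] > 1e-9]
        if not rows:
            return None              # unbounded
        _, k = min(rows)
        piv = T[k][r]
        T[k] = [v / piv for v in T[k]]
        for i in range(m):
            if i != k:
                f = T[i][r]
                T[i] = [v - f * w for v, w in zip(T[i], T[k])]
        basis[k] = r
    # Case (iv): artificial variable left at a positive level -> infeasible.
    if any(basis[i] >= n and T[i][-1] > 1e-7 for i in range(m)):
        return None
    x = [0.0] * n
    for i in range(m):
        if basis[i] < n:
            x[basis[i]] = T[i][-1]
    return x
```

On the standard form of Example 5 below (maximize 2x1 - 3x2 - 2x3 with slack x4 and surplus x5 already added), it returns x1 = 3/2 with the remaining decision variables zero, matching the worked table.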
3.12.2 Solution of Simultaneous Linear Equations by the Simplex Method

Let us consider a system of m simultaneous linear equations in n unknowns,

Ax = b,                                                     (3.27)

where A is an m x n real matrix and b is an m x 1 real matrix. The variables to be determined are given by the vector x. We shall solve this system by the simplex method. For this purpose, we form an LPP with a dummy objective function which is to be optimized. We introduce artificial variables into the given system of equations and assign to each a price -M in the dummy objective function z. Thus, the problem of solving the system of linear equations reduces to an LPP whose objective function is

z = -M * sum_{i=1}^{m} xai,

where xai (i = 1, 2, ..., m) are the artificial variables. Since no non-negativity restriction is given for the required variables x, these variables are unrestricted in sign. To introduce non-negativity restrictions, let us write x = x' - x'', where x', x'' >= 0. Therefore, the transformed LPP becomes

Maximize z = -M * sum_{i=1}^{m} xai
subject to A(x' - x'') + xa = b and x', x'', xa >= 0,

where xa is the vector of artificial variables. Solving this LPP by the Big M method:

(i) If we get Max z = 0, so that every xai (i = 1, 2, ..., m) is zero, then a basic feasible solution of the system of equations is obtained.
(ii) At any iteration, if the optimality conditions are satisfied while some artificial vector remains in the basis at a positive level, then the given system of equations has no solution.
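The block construction above can be sketched directly (illustrative data in the style of Example 9 later in this section; the variable ordering is our choice): substituting x = x' - x'' and appending artificials turns Ax = b into [A | -A | I][x'; x''; xa] = b with all variables non-negative.

```python
# Build the augmented matrix [A | -A | I] for the system
# x1 + x2 = 1, 2x1 + x2 = 3 (unrestricted x1, x2).
A = [[1, 1], [2, 1]]
b = [1, 3]
aug = [A[i] + [-v for v in A[i]] + [1 if j == i else 0 for j in range(2)]
       for i in range(2)]            # rows of [A | -A | I]
# The simplex solution found later is x1' = 2, x2'' = 1 with xa = 0,
# i.e. x = (2, -1); verify it satisfies the augmented system:
v = [2, 0, 0, 1, 0, 0]               # (x1', x2', x1'', x2'', xa1, xa2)
res = [sum(a * t for a, t in zip(aug[i], v)) for i in range(2)]
assert res == b                      # reproduces the right-hand side (1, 3)
```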
3.12.3 Inversion of a Matrix by the Simplex Method

To find the inverse of a given non-singular square matrix, an identity matrix of the same order is considered. Performing the same sequence of elementary row transformations on the given matrix and on the identity matrix, so as to reduce the given matrix to an identity matrix, we obtain the inverse matrix.

Let A be a real non-singular square matrix of order n. Let x be an n-component column vector and b an n-component dummy column vector. Then consider the system of linear equations

Ax = b, x >= 0.                                             (3.28)

Now we introduce the artificial vector xa (>= 0) and construct the dummy objective function

z = -M * sum_{i=1}^{n} xai,

where xai (>= 0) is the ith component of the artificial vector xa. Hence, our problem is

Maximize z = -M * sum_{i=1}^{n} xai
subject to the constraints Ax + I xa = b, x, xa >= 0.

Solving this LPP, we get the optimal solution. If the basis of the optimum simplex table contains all the variables of the vector x, then the inverse of A is obtained directly from the optimum simplex table: it consists of those columns in the last iteration of the simplex method that correspond to the vectors of the initial basis. If the optimum (final) simplex table does not contain all the variables of the vector x in the basis, the simplex method is continued until all of them are in the basis. In this situation, the solution may be either feasible or infeasible but optimum.

Example 5 Solve the following LPP by the Big M method:

Maximize z = 2x1 - 3x2 - 2x3
subject to 2x1 + x2 - 2x3 <= 3
           x1 + x2 - x3 >= 1
and x1, x2, x3 >= 0.
Solution: The given problem is a maximization problem. Here all bi's are positive. Now, introducing a slack variable x4 and a surplus variable x5 in the first and second constraints respectively, we get the following converted equations:

2x1 + x2 - 2x3 + x4 = 3
x1 + x2 - x3 - x5 = 1,

where xi >= 0, i = 1, 2, ..., 5. The coefficient matrix of this system of equations does not contain a unit matrix. To get a unit matrix, an artificial variable x6 is added to the second equation; the transformed equations are as follows:

2x1 + x2 - 2x3 + x4 = 3
x1 + x2 - x3 - x5 + x6 = 1,

where xi >= 0, i = 1, 2, ..., 6. Then the new objective function is given by z = 2x1 - 3x2 - 2x3 + 0.x4 + 0.x5 - Mx6 [assigning a very large negative price to the artificial variable].

                    cj :      2       -3      -2     0    0   -M
  cB    yB    xB             y1       y2      y3    y4   y5   y6    Min ratio
  0     y4    x4 = 3          2        1      -2     1    0    0    3/2
  -M    y6    x6 = 1          1        1      -1     0   -1    1    1/1 = 1
  zB = -M        zj - cj:  -M - 2   -M + 3   M + 2   0    M    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y6 is the outgoing vector and y21 = 1 is the key element. Now we construct the next simplex table.

                    cj :    2   -3   -2    0    0     -M
  cB    yB    xB           y1   y2   y3   y4   y5     y6     Min ratio
  0     y4    x4 = 1        0   -1    0    1    2     -2     1/2
  2     y1    x1 = 1        1    1   -1    0   -1      1     -
  zB = 2         zj - cj:   0    5    0    0   -2    M + 2

Since not all zj - cj >= 0, the current BFS is not optimal. Here y5 is the incoming vector, y4 is the outgoing vector and y15 = 2 is the key element.
Now we construct the next simplex table.

                    cj :    2    -3    -2    0    0   -M
  cB    yB    xB           y1    y2    y3   y4   y5   y6
  0     y5    x5 = 1/2      0  -1/2    0   1/2    1   -1
  2     y1    x1 = 3/2      1   1/2   -1   1/2    0    0
  zB = 3         zj - cj:   0    4     0    1     0    M

Here all zj - cj >= 0. Hence, optimality has been reached, and the optimal solution is x1 = 3/2, x2 = 0, x3 = 0 with Max z = 3.

Example 6 Use Charnes' Big M method to solve the following LPP:

Maximize z = x1 + 2x2
subject to x1 - 5x2 <= 10
           2x1 - x2 >= 2
           x1 + x2 = 0
and x1, x2 >= 0.

Solution: The given problem is a maximization problem. Here all bi's are positive. Now, introducing a slack variable x3 and a surplus variable x4 in the first and second constraints respectively, we get the following converted equations:

x1 - 5x2 + x3 = 10
2x1 - x2 - x4 = 2
x1 + x2 = 0,

where xi >= 0, i = 1, 2, ..., 4. The coefficient matrix of this system of equations does not contain a unit matrix. To get a unit matrix, two artificial variables x5 and x6 are added to the second and third equations respectively, and the transformed equations are as follows:
Solution: The given problem is a maximization problem. Here all bi ’s are positive. Now, introducing a slack variable x3 and a surplus variable x4 to the first and second constraints respectively, we get the following converted equations: x1 5x2 þ x3 ¼ 10 2x1 x2 x4 ¼ 2 x1 þ x2 ¼0 where xi 0; i ¼ 1; 2; . . .; 4. The coefficient matrix of this system of equations does not contain a unit matrix. To get a unit matrix, two artificial variables x5 and x6 are added to second and third equations respectively, and the transformed equations are as follows: x1 5x2 þ x3
¼ 10
2x1 x2 x4 þ x5 ¼ 2 x1 þ x2 þ x 6 ¼0 where xi 0; i ¼ 1; 2; . . .; 6. Then the new objective function is given by z ¼ x1 þ 2x2 þ 0x3 þ 0x4 Mx5 Mx6 [assign a very large negative price to the artificial variable].
3.12
Use of Artificial Variables
59
Clearly, the variables x3 ; x5 and x6 are basic. Considering the other variables as non-basic, we get x3 ¼ 10; x5 ¼ 2 and x6 ¼ 0 which is the initial basic feasible solution. Now we construct the initial simplex table. cj cB
                    cj :       1       2     0    0   -M   -M
  cB    yB    xB              y1      y2    y3   y4   y5   y6    Min ratio
  0     y3    x3 = 10          1      -5     1    0    0    0    10/1 = 10
  -M    y5    x5 = 2           2      -1     0   -1    1    0    2/2 = 1
  -M    y6    x6 = 0           1       1     0    0    0    1    0/1 = 0
  zB = -2M       zj - cj:  -3M - 1    -2     0    M    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y6 is the outgoing vector and y31 = 1 is the key element. Now we construct the next simplex table.

                    cj :    1      2       0    0   -M
  cB    yB    xB           y1     y2      y3   y4   y5
  0     y3    x3 = 10       0     -6       1    0    0
  -M    y5    x5 = 2        0     -3       0   -1    1
  1     y1    x1 = 0        1      1       0    0    0
  zB = -2M       zj - cj:   0   3M - 1     0    M    0

Here zj - cj >= 0 for all j; i.e. the optimality condition is satisfied. But the artificial vector y5 appears in the basis at a positive level: the value of the corresponding artificial variable x5 is 2. Hence, there is no feasible solution to the given LPP.

Example 7 Solve the LPP

Maximize z = 3x1 - x2
subject to -x1 + x2 >= 2
           5x1 - 2x2 >= 2
and x1, x2 >= 0.

Solution: The given problem is a maximization problem. Here all bi's are positive. Now, introducing surplus and artificial variables, the given problem can be written in the standard form as follows:
Maximize z = 3x1 - x2 + 0.x3 + 0.x4 - Mx5 - Mx6
subject to -x1 + x2 - x3 + x5 = 2
           5x1 - 2x2 - x4 + x6 = 2
and xi >= 0, i = 1, 2, ..., 6.

Clearly, the variables x5 and x6 are basic. Setting the other variables to zero as non-basic, we get x5 = 2 and x6 = 2, which is the initial basic feasible solution. Now we construct the initial simplex table.

                    cj :       3       -1      0    0   -M   -M
  cB    yB    xB              y1       y2     y3   y4   y5   y6    Min ratio
  -M    y5    x5 = 2          -1        1     -1    0    1    0    -
  -M    y6    x6 = 2           5       -2      0   -1    0    1    2/5
  zB = -4M       zj - cj:  -4M - 3   M + 1     M    M    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y6 is the outgoing vector and y21 = 5 is the key element. Now we construct the next simplex table.

                    cj :     3        -1         0        0       -M
  cB    yB    xB            y1        y2        y3       y4       y5    Min ratio
  -M    y5    x5 = 12/5      0        3/5       -1      -1/5       1    (12/5)/(3/5) = 4
  3     y1    x1 = 2/5       1       -2/5        0      -1/5       0    -
  zB = -12M/5 + 6/5
              zj - cj:       0   -3M/5 - 1/5     M    M/5 - 3/5    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y2 is the incoming vector, y5 is the outgoing vector and y12 = 3/5 is the key element. Now we construct the next simplex table.

                    cj :    3   -1     0     0
  cB    yB    xB           y1   y2    y3    y4
  -1    y2    x2 = 4        0    1   -5/3  -1/3
  3     y1    x1 = 2        1    0   -2/3  -1/3
  zB = 2         zj - cj:   0    0   -1/3  -2/3

Since not all zj - cj >= 0, the current BFS is not optimal. The most negative value is z4 - c4, but yi4 < 0 for i = 1, 2. Hence, the outgoing vector cannot be selected: the given problem has an unbounded solution.
Example 8 Solve the following LPP using the Big M method:

Maximize z = 3x1 - x2
subject to the constraints 2x1 + x2 >= 2, x1 + 3x2 <= 3, x2 <= 4 and x1, x2 >= 0.

Solution: The given problem is a maximization problem. Here all bi's are positive. Now, introducing a surplus variable x3 and slack variables x4 and x5 in the given constraints, we get the following converted equations:

2x1 + x2 - x3 = 2
x1 + 3x2 + x4 = 3
x2 + x5 = 4,

where xi >= 0, i = 1, 2, ..., 5. The coefficient matrix of the preceding system of equations does not contain a unit matrix. To get a unit matrix, an artificial variable x6 is added to the first equation, and the transformed equations are as follows:

2x1 + x2 - x3 + x6 = 2
x1 + 3x2 + x4 = 3
x2 + x5 = 4,

where xi >= 0, i = 1, 2, ..., 6. Then the new objective function is given by z = 3x1 - x2 + 0.x3 + 0.x4 + 0.x5 - Mx6 [assigning a very large negative price to the artificial variable]. Clearly, the variables x4, x5 and x6 are basic. Setting the other variables to zero as non-basic, we get x6 = 2, x4 = 3 and x5 = 4, which is the initial basic feasible solution. Now we construct the initial simplex table.

                    cj :       3        -1      0    0    0   -M
  cB    yB    xB              y1        y2     y3   y4   y5   y6    Min ratio
  -M    y6    x6 = 2           2         1     -1    0    0    1    2/2 = 1
  0     y4    x4 = 3           1         3      0    1    0    0    3/1 = 3
  0     y5    x5 = 4           0         1      0    0    1    0    -
  zB = -2M       zj - cj:  -2M - 3   -M + 1     M    0    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y11 = 2 is the key element and y6 is the outgoing vector. Now we construct the next simplex table.

                    cj :    3    -1     0     0    0
  cB    yB    xB           y1    y2    y3    y4   y5    Min ratio
  3     y1    x1 = 1        1   1/2   -1/2    0    0    -
  0     y4    x4 = 2        0   5/2    1/2    1    0    2/(1/2) = 4
  0     y5    x5 = 4        0    1      0     0    1    -
  zB = 3         zj - cj:   0   5/2   -3/2    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y3 is the incoming vector, y23 = 1/2 is the key element and y4 is the outgoing vector. Now we construct the next simplex table.

                    cj :    3   -1    0    0    0
  cB    yB    xB           y1   y2   y3   y4   y5
  3     y1    x1 = 3        1    3    0    1    0
  0     y3    x3 = 4        0    5    1    2    0
  0     y5    x5 = 4        0    1    0    0    1
  zB = 9         zj - cj:   0   10    0    3    0
Here zj - cj >= 0 for all j. Hence, the current BFS is optimal, and the optimal solution is given by x1 = 3, x2 = 0 with Max z = 9.

Example 9 Using the simplex method, solve the following system of linear equations:

x1 + x2 = 1
2x1 + x2 = 3.

Solution: Since no non-negativity restrictions on x1 and x2 are given, these variables are unrestricted in sign. To introduce non-negativity restrictions, let us write x1 = x1' - x1'' and x2 = x2' - x2'', where x1', x1'', x2', x2'' >= 0.
To solve the given system of equations, we introduce two artificial variables x3 >= 0 and x4 >= 0 in the first and second equations respectively, and a dummy objective function z = -Mx3 - Mx4 (assigning a large negative price -M to each artificial variable), which is to be maximized subject to the constraints

x1' - x1'' + x2' - x2'' + x3 = 1
2x1' - 2x1'' + x2' - x2'' + x4 = 3

and x1', x1'', x2', x2'', x3, x4 >= 0. Let us take x3 and x4 as the basic variables. Setting the other variables to zero as non-basic, we get x3 = 1, x4 = 3, which is the initial BFS. Now we construct the initial simplex table.

                    cj :     0     0     0     0    -M   -M
  cB    yB    xB            y1'   y1''  y2'   y2''  y3   y4    Min ratio
  -M    y3    x3 = 1         1    -1     1    -1     1    0    1/1 = 1
  -M    y4    x4 = 3         2    -2     1    -1     0    1    3/2
  zB = -4M       zj - cj:  -3M    3M   -2M    2M     0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1' is the incoming vector, y3 is the outgoing vector and y11' = 1 is the key element. Now we construct the next simplex table.

                    cj :    0     0     0     0    -M
  cB    yB    xB           y1'   y1''  y2'   y2''  y4    Min ratio
  0     y1'   x1' = 1       1    -1     1    -1     0    -
  -M    y4    x4 = 1        0     0    -1     1     1    1/1 = 1
  zB = -M        zj - cj:   0     0     M    -M     0

Since not all zj - cj >= 0, the current BFS is not optimal. Here y2'' is the incoming vector, y4 is the outgoing vector and y22'' = 1 is the key element. Now we construct the next simplex table.
                    cj :    0     0     0     0
  cB    yB    xB           y1'   y1''  y2'   y2''
  0     y1'   x1' = 2       1    -1     0     0
  0     y2''  x2'' = 1      0     0    -1     1
  zB = 0         zj - cj:   0     0     0     0

Since all zj - cj are non-negative, an optimum solution has been attained. Thus, the solution of the LPP is

x1' = 2, x1'' = 0, x2' = 0, x2'' = 1,

i.e. x1 = x1' - x1'' = 2 - 0 = 2 and x2 = x2' - x2'' = 0 - 1 = -1,

which is the solution of the given system of equations.

Remark Note that the same device can be applied in solving an LPP with one or more unrestricted variables under its given objective function.

Example 10 Using the simplex method, find the inverse of the following matrix:

A = [ 4  2 ]
    [ 1  5 ].
Solution: Taking the given matrix as a coefficient matrix, consider the following system of linear equations:

[ 4  2 ] [x1]   [6]
[ 1  5 ] [x2] = [4],   x1, x2 >= 0,

where (6, 4)^T is a dummy column vector, i.e.

4x1 + 2x2 = 6
x1 + 5x2 = 4, with x1, x2 >= 0.

To solve the system of equations, we introduce two artificial variables x3 >= 0 and x4 >= 0 in the first and second equations respectively and a dummy objective function z = -Mx3 - Mx4 (assigning a large negative price -M to each artificial variable),
which is to be maximized subject to the constraints

4x1 + 2x2 + x3 = 6
x1 + 5x2 + x4 = 4

and x1, x2, x3, x4 >= 0. Let us take x3 and x4 as the basic variables. Setting the other variables to zero as non-basic, we get x3 = 6, x4 = 4, which is the initial BFS. Now we construct the initial simplex table.

                    cj :     0     0    -M   -M
  cB    yB    xB            y1    y2    y3   y4    Min ratio
  -M    y3    x3 = 6         4     2     1    0    6/2 = 3
  -M    y4    x4 = 4         1     5     0    1    4/5
  zB = -10M      zj - cj:  -5M   -7M     0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y2 is the incoming vector, y4 is the outgoing vector and y22 = 5 is the key element. Now we construct the next simplex table.

                    cj :       0      0    -M    -M
  cB    yB    xB              y1     y2    y3    y4     Min ratio
  -M    y3    x3 = 22/5      18/5     0     1   -2/5    (22/5)/(18/5) = 11/9
  0     y2    x2 = 4/5        1/5     1     0    1/5    4
  zB = -22M/5    zj - cj:  -18M/5     0     0   7M/5
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y3 is the outgoing vector and y11 = 18/5 is the key element. Now we construct the next simplex table.

                    cj :    0    0     -M     -M
  cB    yB    xB           y1   y2     y3     y4
  0     y1    x1 = 11/9     1    0    5/18   -1/9
  0     y2    x2 = 5/9      0    1   -1/18    2/9
  zB = 0         zj - cj:   0    0     M      M

Since all zj - cj are non-negative, an optimum solution has been attained. From the preceding simplex table, it is seen that
y3 = (5/18, -1/18)^T and y4 = (-1/9, 2/9)^T.

Since the initial basis consists of the column vectors y3 and y4, and the column vectors y1, y2 corresponding to the columns of matrix A have been reduced to unit vectors, the inverse of matrix A is given by

A^(-1) = [ 5/18   -1/9 ] = (1/18) [  5  -2 ]
         [ -1/18   2/9 ]          [ -1   4 ].
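A quick arithmetic check of this result, using exact rational arithmetic so no rounding is involved:

```python
# Verify A * A^(-1) = I2 for the inverse read off the final simplex table.
from fractions import Fraction as F

A = [[F(4), F(2)], [F(1), F(5)]]
Ainv = [[F(5, 18), F(-1, 9)], [F(-1, 18), F(2, 9)]]  # = (1/18)[[5,-2],[-1,4]]
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
assert prod == [[1, 0], [0, 1]]   # exact identity, confirming the inverse
```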
3.12.4 Two-Phase Method

In an earlier section, we discussed the Big M method for solving a particular type of LPP. To solve the same kind of problem, Dantzig, Orden and Wolfe developed an alternative method known as the two-phase method. In this method, the problem is solved in two phases, Phase I and Phase II.

Phase I The objective of Phase I is either to remove all the artificial variables from the basis or to drive those remaining in the basis to zero level. For this purpose, an auxiliary LPP is constructed with a new objective function

z' = - sum_i xai   [xai is an artificial variable],

which is to be maximized. From the mathematical point of view, all artificial variables must be at zero level at the optimal stage; hence the maximum value of z' should equal zero. However, this will not be true for every problem. Thus, to find the maximum value of the new objective function z', the following auxiliary LPP is solved first:

Maximize z' = - sum_i xai subject to all the constraints.

In solving this LPP, when the optimality conditions are satisfied at some iteration, the following three cases may arise:

Case I: Max z' = 0 and no artificial vector is present in the basis.
Case II: Max z' = 0, but some of the artificial vectors are present in the basis at zero level. In this case, some constraints may be redundant.
Case III: Max z' < 0. This means that one or more artificial variables are present in the basis at a positive level. In this case, the original LPP has no feasible solution.

Phase II If Phase I terminates with either Case I or Case II, we start Phase II to find the optimal solution of the original problem. In this phase, the first simplex table is almost identical with the last table of Phase I, except that the cj row, and hence the zj - cj row, now uses the cj values of the original LPP. In Case II, Max z' = 0 and one or more artificial vectors appear in the basis at zero level; i.e. one or more artificial variables are basic with value zero. In this case, we must take care that these artificial variables never become positive at any iteration of Phase II.

Example 11 Use the two-phase simplex method to solve the following problem:

Maximize z = 3x1 - x2
subject to the constraints 2x1 + x2 >= 2, x1 + 3x2 <= 2, x1 <= 4 and x1, x2 >= 0.

Solution: This is a maximization problem. All bi's are positive. Now, introducing the surplus variable x3 and slack variables x4, x5, the given problem can be rewritten as follows:

Maximize z = 3x1 - x2 + 0.x3 + 0.x4 + 0.x5
subject to the constraints
2x1 + x2 - x3 = 2
x1 + 3x2 + x4 = 2
x1 + x5 = 4
and xj >= 0, j = 1, 2, ..., 5.

The coefficient matrix of the above constraints does not contain a unit matrix. To get a unit matrix, an artificial variable x6 is added to the first equation, and the transformed equations are as follows:
2x1 + x2 - x3 + x6 = 2
x1 + 3x2 + x4 = 2
x1 + x5 = 4,

where xj >= 0, j = 1, 2, ..., 6.

Phase I: Assigning a profit coefficient -1 to the artificial variable and 0 to all other variables, the objective function of the auxiliary LPP to be maximized is given by

z' = 0.x1 + 0.x2 + 0.x3 + 0.x4 + 0.x5 - x6,

subject to the preceding system of equations. Let us take x4, x5 and x6 as the basic variables. Setting the other variables to zero as non-basic, we get x6 = 2, x4 = 2, x5 = 4, which is the initial BFS. Now we construct the initial simplex table.
                    cj :    0    0    0    0    0   -1
  cB    yB    xB           y1   y2   y3   y4   y5   y6    Min ratio
  -1    y6    x6 = 2        2    1   -1    0    0    1    2/2 = 1
  0     y4    x4 = 2        1    3    0    1    0    0    2/1 = 2
  0     y5    x5 = 4        1    0    0    0    1    0    4/1 = 4
  z'B = -2       zj - cj:  -2   -1    1    0    0    0
Since not all zj - cj >= 0, the current BFS is not optimal. Here y1 is the incoming vector, y6 is the outgoing vector and y11 = 2 is the key element. Now we construct the next simplex table.

                    cj :    0     0     0     0    0
  cB    yB    xB           y1    y2    y3    y4   y5
  0     y1    x1 = 1        1   1/2  -1/2     0    0
  0     y4    x4 = 1        0   5/2   1/2     1    0
  0     y5    x5 = 3        0  -1/2   1/2     0    1
  z'B = 0        zj - cj:   0    0     0      0    0

Here all zj - cj are non-negative and Max z' = 0, and no artificial variable appears in the final table of Phase I. We therefore pass on to Phase II, where the objective function of the original problem to be maximized is

z = 3x1 - x2 + 0.x3 + 0.x4 + 0.x5.
Now we construct the initial simplex table of Phase II.

 cj:               3     −1      0      0     0
 cB    yB    xB    y1     y2     y3     y4    y5    Min ratio
  3    y1     1     1    1/2   −1/2     0     0      –
  0    y4     1     0    5/2    1/2     1     0      2
  0    y5     3     0   −1/2    1/2     0     1      6
 zB = 3    zj − cj:  0    5/2   −3/2     0     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y3 is the incoming vector, y4 is the outgoing vector and y32 = 1/2 is the key element. Now we construct the next simplex table.

 cj:               3     −1     0     0     0
 cB    yB    xB    y1     y2    y3    y4    y5
  3    y1     2     1      3     0     1     0
  0    y3     2     0      5     1     2     0
  0    y5     2     0     −3     0    −1     1
 zB = 6    zj − cj:  0     10     0     3     0
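In this final table every zj − cj is non-negative, so the method stops at x1 = 2, x2 = 0 with z = 6. As an independent sanity check, the same problem can be handed to an off-the-shelf LP solver. The sketch below uses scipy.optimize.linprog (assuming SciPy is available); since linprog minimizes, the objective is negated and the ≥ constraint is rewritten in ≤ form:

```python
# Cross-check of Example 11: maximize z = 3*x1 - x2
# s.t. 2*x1 + x2 >= 2, x1 + 3*x2 <= 2, x1 <= 4, x1, x2 >= 0.
from scipy.optimize import linprog

c = [-3, 1]                    # minimize -z
A_ub = [[-2, -1],              # 2*x1 + x2 >= 2  ->  -2*x1 - x2 <= -2
        [1, 3],                # x1 + 3*x2 <= 2
        [1, 0]]                # x1 <= 4
b_ub = [-2, 2, 4]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.x, -res.fun)         # expected: x = (2, 0), z = 6
```

The solver reaches the same vertex as the two-phase method above.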
Here all zj − cj are non-negative. Hence, an optimum solution has been attained, and this solution is given by x1 = 2, x2 = 0 and Max z = 6.

Example 12 Use the two-phase simplex method to solve the following problem:
Maximize z = 3x1 + 2x2
subject to the constraints
2x1 + x2 ≤ 2
3x1 + 4x2 ≥ 12
and x1, x2 ≥ 0.
Solution: This is a maximization problem. All bi's are positive. Now introducing the slack variable x3 and the surplus variable x4, the given problem can be rewritten as follows:
Maximize z = 3x1 + 2x2 + 0·x3 + 0·x4
subject to
2x1 + x2 + x3 = 2
3x1 + 4x2 − x4 = 12
and x1, x2, x3, x4 ≥ 0.
The coefficient matrix of these constraints does not contain a unit matrix. To get a unit matrix, an artificial variable x5 is added to the second equation; the transformed equations are as follows:
2x1 + x2 + x3 = 2
3x1 + 4x2 − x4 + x5 = 12,
where xj ≥ 0, j = 1, 2, 3, 4, 5.

Phase I: Assigning a profit coefficient −1 to the artificial variable and 0 to all other variables, the objective function of the auxiliary LPP to be maximized is given by
z′ = 0·x1 + 0·x2 + 0·x3 + 0·x4 − x5
subject to the preceding system of equations.

 cj:                0     0     0     0    −1
 cB    yB    xB     y1    y2    y3    y4    y5    Min ratio
  0    y3     2      2     1     1     0     0    2/1 = 2
 −1    y5    12      3     4     0    −1     1    12/4 = 3
 z′B = −12  zj − cj: −3    −4     0     1     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y2 is the incoming vector, y3 is the outgoing vector and y21 = 1 is the key element. Now we construct the next simplex table.

 cj:                0     0     0     0    −1
 cB    yB    xB     y1    y2    y3    y4    y5
  0    y2     2      2     1     1     0     0
 −1    y5     4     −5     0    −4    −1     1
 z′B = −4  zj − cj:   5     0     4     1     0
Here all zj − cj ≥ 0. Hence, an optimum basic feasible solution to the auxiliary LPP has been attained. But Max z′ < 0 and the artificial variable x5 is basic at a positive value. Hence, the original LPP does not possess any feasible solution.
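Phase I ending with an artificial variable basic at a positive level signals infeasibility, and this can be confirmed independently. A quick check with scipy.optimize.linprog (assuming SciPy is available); in the HiGHS backend, status code 2 means "problem is infeasible":

```python
# Example 12: maximize 3*x1 + 2*x2
# s.t. 2*x1 + x2 <= 2 and 3*x1 + 4*x2 >= 12, x1, x2 >= 0.
from scipy.optimize import linprog

res = linprog([-3, -2],            # minimize -z
              A_ub=[[2, 1],        # 2*x1 + x2 <= 2
                    [-3, -4]],     # 3*x1 + 4*x2 >= 12
              b_ub=[2, -12],
              bounds=[(0, None)] * 2)
print(res.status)                  # expected: 2 (infeasible)
```

Indeed, the first constraint forces 3x1 + 4x2 ≤ 8 over the feasible box, so the second constraint can never hold.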
Example 13 Use the two-phase simplex method to solve the following problem:
Maximize z = 5x1 − 4x2 + 3x3
subject to the constraints
2x1 + x2 − 6x3 = 20
6x1 + 5x2 + 10x3 ≤ 76
8x1 − 3x2 + 6x3 ≤ 50
and x1, x2, x3 ≥ 0.

Solution: This is a maximization problem. All bi's are positive. Now introducing the slack variables x4 and x5, the given problem can be rewritten as follows:
Maximize z = 5x1 − 4x2 + 3x3 + 0·x4 + 0·x5
subject to the constraints
2x1 + x2 − 6x3 = 20
6x1 + 5x2 + 10x3 + x4 = 76
8x1 − 3x2 + 6x3 + x5 = 50
and xj ≥ 0, j = 1, 2, …, 5.

The coefficient matrix of these constraints does not contain a unit matrix. To get a unit matrix, an artificial variable x6 is added to the first equation, and the transformed equations are as follows:
2x1 + x2 − 6x3 + x6 = 20
6x1 + 5x2 + 10x3 + x4 = 76
8x1 − 3x2 + 6x3 + x5 = 50,
where xj ≥ 0, j = 1, 2, …, 6.

Phase I: Assigning a profit coefficient −1 to the artificial variable and 0 to all other variables, the objective function of the auxiliary LPP to be maximized is given by
z′ = 0·x1 + 0·x2 + 0·x3 + 0·x4 + 0·x5 − x6
subject to the preceding system of equations.

 cj:                0     0     0     0     0    −1
 cB    yB    xB     y1    y2    y3    y4    y5    y6    Min ratio
 −1    y6    20      2     1    −6     0     0     1    20/2 = 10
  0    y4    76      6     5    10     1     0     0    76/6 = 38/3
  0    y5    50      8    −3     6     0     1     0    50/8 = 25/4
 z′B = −20  zj − cj: −2    −1     6     0     0     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y1 is the incoming vector, y5 is the outgoing vector and y31 = 8 is the key element. Now we construct the next simplex table.

 cj:                 0      0       0      0      0     −1
 cB    yB     xB     y1     y2      y3     y4     y5     y6    Min ratio
 −1    y6    15/2     0    7/4    −15/2    0    −1/4     1     (15/2)/(7/4) = 30/7
  0    y4    77/2     0   29/4     11/2    1    −3/4     0     (77/2)/(29/4) = 154/29
  0    y1    25/4     1   −3/8      3/4    0     1/8     0      –
 z′B = −15/2  zj − cj:  0   −7/4    15/2    0     1/4     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y2 is the incoming vector, y6 is the outgoing vector and y12 = 7/4 is the key element. Now we construct the next simplex table.

 cj:                 0      0       0      0      0
 cB    yB     xB     y1     y2      y3     y4     y5
  0    y2    30/7     0      1    −30/7    0    −1/7
  0    y4    52/7     0      0    256/7    1     2/7
  0    y1    55/7     1      0     −6/7    0    1/14
 z′B = 0    zj − cj:   0      0       0     0      0
Here all zj − cj ≥ 0. Hence, an optimum solution to the auxiliary LPP has been attained. Furthermore, from this table it is observed that no artificial vector appears in the basis. We then pass on to Phase II, where the objective function of the original problem to be maximized is z = 5x1 − 4x2 + 3x3 + 0·x4 + 0·x5.
Phase II: Now we construct the initial simplex table of Phase II.
 cj:                 5     −4      3      0      0
 cB    yB     xB     y1     y2     y3     y4     y5
 −4    y2    30/7     0      1   −30/7     0    −1/7
  0    y4    52/7     0      0   256/7     1     2/7
  5    y1    55/7     1      0    −6/7     0    1/14
 zB = 155/7  zj − cj:  0      0    69/7     0   13/14
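The Phase II table prices out with every zj − cj ≥ 0 at x1 = 55/7, x2 = 30/7, x3 = 0, giving z = 155/7. As a numerical cross-check, here is the same LPP posed to scipy.optimize.linprog (assuming SciPy is available), with the equality constraint passed through A_eq:

```python
# Example 13: maximize 5*x1 - 4*x2 + 3*x3
# s.t. 2*x1 + x2 - 6*x3 = 20, 6*x1 + 5*x2 + 10*x3 <= 76,
#      8*x1 - 3*x2 + 6*x3 <= 50, all variables >= 0.
from scipy.optimize import linprog

res = linprog([-5, 4, -3],                     # minimize -z
              A_ub=[[6, 5, 10], [8, -3, 6]],
              b_ub=[76, 50],
              A_eq=[[2, 1, -6]], b_eq=[20],
              bounds=[(0, None)] * 3)
print(res.x, -res.fun)   # expected: (55/7, 30/7, 0), z = 155/7
```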
Here all zj − cj ≥ 0. Hence, an optimum basic feasible solution to the given LPP has been attained, and the optimal solution is given by x1 = 55/7, x2 = 30/7, x3 = 0 with maximum z = 155/7.

Example 14 Using the simplex method, solve the following system of linear simultaneous equations:
x1 + x2 = 1
2x1 + x2 = 3.
Solution: Since the non-negativity restrictions on x1 and x2 are not given, these variables are unrestricted in sign. To introduce non-negativity restrictions, let us assume x1 = x1′ − x1″ and x2 = x2′ − x2″, where x1′, x1″, x2′, x2″ ≥ 0. Now we introduce the artificial variables x3 ≥ 0, x4 ≥ 0, assigning a profit of −1 to each. Then the problem of solving the given system of equations reduces to the following LPP:
Maximize z = −x3 − x4
subject to the constraints
x1′ − x1″ + x2′ − x2″ + x3 = 1
2x1′ − 2x1″ + x2′ − x2″ + x4 = 3
and x1′, x1″, x2′, x2″, x3, x4 ≥ 0.

 cj:                0     0     0     0    −1    −1
 cB    yB    xB     y1′   y1″   y2′   y2″   y3    y4    Min ratio
 −1    y3     1      1    −1     1    −1     1     0    1/1 = 1
 −1    y4     3      2    −2     1    −1     0     1    3/2
 zB = −4   zj − cj:  −3     3    −2     2     0     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y1′ is the incoming vector, y3 is the outgoing vector and y′11 = 1 is the key element. Now we construct the next simplex table.
 cj:                0     0     0     0    −1
 cB    yB    xB     y1′   y1″   y2′   y2″   y4    Min ratio
  0    y1′    1      1    −1     1    −1     0      –
 −1    y4     1      0     0    −1     1     1    1/1 = 1
 zB = −1   zj − cj:   0     0     1    −1     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y2″ is the incoming vector, y4 is the outgoing vector and y″22 = 1 is the key element. Now we construct the next simplex table.

 cj:                0     0     0     0
 cB    yB    xB     y1′   y1″   y2′   y2″
  0    y1′    2      1    −1     0     0
  0    y2″    1      0     0    −1     1
 zB = 0    zj − cj:   0     0     0     0
Since all zj − cj are non-negative, an optimum solution has been attained. Thus, the solution of the LPP is x1′ = 2, x2″ = 1, x1″ = 0, x2′ = 0, i.e.
x1 = x1′ − x1″ = 2 − 0 = 2
x2 = x2′ − x2″ = 0 − 1 = −1,
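The recovered values x1 = 2, x2 = −1 solve the original 2 × 2 linear system, so they can also be confirmed directly. A quick check with NumPy (assuming it is installed):

```python
# Example 14: solve x1 + x2 = 1, 2*x1 + x2 = 3 directly.
import numpy as np

A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([1.0, 3.0])
x = np.linalg.solve(A, b)
print(x)   # expected: [ 2. -1.]
```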
which is the solution of the given system of equations.
Remark: Note that this method can be applied to solve an LPP with one or more unrestricted variables, using the given objective function.

Example 15 Using the simplex method, find the inverse of the following matrix:
A = [ 3   2
      4  −1 ].

Solution: Taking the given matrix as a coefficient matrix, let us consider the following system of linear equations:
[ 3   2 ] [ x1 ]   [ 4 ]
[ 4  −1 ] [ x2 ] = [ 6 ],
where (4, 6)ᵀ is a dummy column vector and x1, x2 ≥ 0,
i.e.
3x1 + 2x2 = 4
4x1 − x2 = 6
and x1, x2 ≥ 0.
To solve the system of equations, we introduce two artificial variables x3 ≥ 0 and x4 ≥ 0 in the first and second equations respectively, and a dummy objective function z = −x3 − x4, which is to be maximized subject to the constraints
3x1 + 2x2 + x3 = 4
4x1 − x2 + x4 = 6
and x1, x2, x3, x4 ≥ 0.

 cj:                0     0    −1    −1
 cB    yB    xB     y1    y2    y3    y4    Min ratio
 −1    y3     4      3     2     1     0    4/3
 −1    y4     6      4    −1     0     1    6/4 = 3/2
 zB = −10  zj − cj:  −7    −1     0     0
Since not all zj − cj ≥ 0, the current BFS is not optimal. Here y1 is the incoming vector, y3 is the outgoing vector and y11 = 3 is the key element. Now we construct the next simplex table.

 cj:                 0      0     −1    −1
 cB    yB    xB      y1     y2     y3    y4
  0    y1    4/3      1    2/3    1/3    0
 −1    y4    2/3      0  −11/3   −4/3    1
 zB = −2/3  zj − cj:   0   11/3    7/3    0
Since all zj − cj are non-negative, an optimal solution has been reached. But to find the inverse of the matrix A, we have to convert the vector y2 into a unit vector. For this purpose, we take y2 as the incoming vector, y4 as the outgoing vector and y22 = −11/3 as the key element. Now we construct the next simplex table.

 cj:                  0     0     −1     −1
 cB    yB     xB      y1    y2     y3     y4
  0    y1    16/11     1     0    1/11   2/11
  0    y2    −2/11     0     1    4/11  −3/11
 zB = 0    zj − cj:     0     0      1      1
From this simplex table it is seen that y3 = (1/11, 4/11)ᵀ and y4 = (2/11, −3/11)ᵀ. Since the initial basis consists of the column vectors y3 and y4, and the column vectors y1, y2 correspond to the columns of the matrix A, the inverse of A is given by

A⁻¹ = [ 1/11    2/11
        4/11   −3/11 ] = (1/11) [ 1    2
                                  4   −3 ].
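The inverse produced by the simplex computation can be verified by direct multiplication. A short NumPy check (assuming NumPy is installed):

```python
# Example 15: verify that the simplex-derived inverse of A is correct.
import numpy as np

A = np.array([[3.0, 2.0],
              [4.0, -1.0]])
A_inv = np.array([[1.0, 2.0],
                  [4.0, -3.0]]) / 11.0   # result read off the final table

print(A @ A_inv)                          # expected: 2x2 identity matrix
assert np.allclose(A @ A_inv, np.eye(2))
assert np.allclose(A_inv, np.linalg.inv(A))
```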
3.13
Exercises
1. Solve the following LPPs:
(i) Maximize z = 10x1 + x2 + 2x3 subject to x1 + x2 − 2x3 ≤ 10, 4x1 + x2 + x3 ≤ 20 and x1, x2 ≥ 0.
(ii) Minimize z = 3x1 − 2x2 subject to 4x1 + x2 ≥ 8, 2x1 + 4x2 ≥ 20 and x1, x2 ≥ 0.
(iii) Maximize z = 2x1 + 5x2 subject to 4x1 − 5x2 ≤ 20, x1 + x2 ≤ 10, x2 ≤ 2 and x1, x2 ≥ 0.
(iv) Maximize z = 5x1 + 11x2 subject to 2x1 + x2 ≤ 4, 3x1 + 4x2 ≤ 24, 2x1 − 3x2 ≤ 6 and x1, x2 ≥ 0.
(v) Maximize z = 2x1 + x2 + 3x3 subject to x1 + x2 + 2x3 ≤ 5, 2x1 + 3x2 + 4x3 = 12 and x1, x2, x3 ≥ 0.
(vi) Maximize z = 4x1 + x2 subject to 3x1 + x2 = 3, 4x1 + 3x2 ≥ 6, x1 + 2x2 ≤ 3 and x1, x2 ≥ 0.
(vii) Maximize z = 2x1 + x2 − x3 + 4x4 subject to 3x1 − 5x2 + x3 − 2x4 = 7, 6x1 − 10x2 − x3 + 5x4 = 11 and xi ≥ 0, i = 1, 2, …, 4.
(viii) Minimize z = 2x1 + x2 subject to 3x1 + x2 ≥ 3, 4x1 + 3x2 ≥ 6, x1 + 2x2 ≤ 2 and x1, x2 ≥ 0.
(ix) Minimize z = 4x1 + 2x2 subject to 3x1 + x2 ≥ 27, x1 + x2 ≥ 21, x1 + 2x2 ≥ 30 and x1, x2 ≥ 0.
2. Solve the following LPPs:
(i) Maximize z = x1 + 5x2 subject to 3x1 + 4x2 ≤ 6, x1 + 3x2 ≥ 3 and x1, x2 ≥ 0.
(ii) Maximize z = 2x1 + x2 + 3x3 subject to x1 + x2 + 2x3 ≤ 5, 2x1 + 3x2 + 4x3 = 12 and x1, x2, x3 ≥ 0.
(iii) Maximize z = 4x1 + x2 subject to 3x1 + x2 = 3, 4x1 + 3x2 ≥ 6, x1 + 2x2 ≤ 3 and x1, x2 ≥ 0.
(iv) Maximize z = 2x1 + x2 subject to 4x1 + 3x2 ≤ 12, 4x1 + x2 ≤ 8, 4x1 − x2 ≤ 8 and x1, x2 ≥ 0.
(v) Maximize z = 2x1 + x2 + 10x3 subject to x1 − 2x3 = 0, x2 + x3 = 1 and x1, x2, x3 ≥ 0.
(vi) Maximize z = 5x1 − 2x2 + 3x3 subject to 2x1 + 2x2 − x3 ≥ 2, 3x1 − 4x2 ≤ 3, x2 + 3x3 ≤ 5 and x1, x2, x3 ≥ 0.
(vii) Minimize z = 4x1 + 8x2 + 3x3 subject to x1 + x2 ≥ 2, 2x1 + x3 ≥ 5 and x1, x2, x3 ≥ 0.
(viii) Maximize z = 5x1 + 3x2 subject to x1 + x2 ≤ 2, 5x1 + 2x2 ≤ 10, 5x1 + 2x2 ≥ 10 and x1, x2 ≥ 0.
3. Solve the following LPPs using the two-phase simplex method:
(i) Minimize z = x1 + x2 subject to 2x1 + x2 ≥ 4, x1 + 7x2 ≥ 7 and x1, x2 ≥ 0.
(ii) Maximize z = 3x1 + 2x2 subject to 2x1 + x2 ≤ 2, 3x1 + 4x2 ≥ 12 and x1, x2 ≥ 0.
(iii) Maximize z = 5x1 + 8x2 subject to 3x1 + 2x2 ≥ 3, x1 + 4x2 ≥ 4, x1 + x2 ≤ 5 and x1, x2 ≥ 0.
(iv) Minimize z = 4x1 + x2 subject to x1 + 2x2 ≤ 3, 4x1 + 3x2 ≥ 6, 3x1 + x2 = 3 and x1, x2 ≥ 0.
(v) Minimize z = 40x1 + 24x2 subject to 2x1 + 25x2 ≥ 240, 8x1 + 5x2 ≥ 720 and x1, x2 ≥ 0.
4. Solve the following systems of linear simultaneous equations using the simplex method:
(i) x1 + x2 = 1, 2x1 + x2 = 4.
(ii) 3x1 + x2 = 7, x1 + x2 = 3.
(iii) 7x1 − x2 = 1, 11x1 + 2x2 = 23.
(iv) x1 + 2x2 + 3x3 = 14, 2x1 − x2 + 5x3 = 15, 3x1 + 2x2 + 4x3 = 13.
5. Using the simplex method, find the inverses of the following matrices:
(i) [ 2 1 ; 4 1 ]  (ii) [ 3 2 ; 1 1 ]  (iii) [ 1 2 3 ; 3 2 1 ; 4 2 1 ]  (iv) [ 1 2 ; 3 4 ]  (v) [ 1 2 ; 1 1 ].
Chapter 4
Revised Simplex Method
4.1
Objective
The objective of this chapter is to discuss: • The revised simplex method and its computational procedures • The relevant information required at each iteration of the revised simplex method • The use of the revised simplex method in comparison with the usual simplex method.
4.2
Introduction
The revised simplex method is a modified form of the ordinary simplex method. It is also an efficient method for solving linear programming problems (LPPs), efficient in the sense that we need not recompute all the values of yj, zj − cj and xB at each iteration. Here, the word 'revised' refers to the procedure of changing or updating the ordinary simplex method. This method is economical on the computer, as it computes and stores only the relevant information required for testing the optimality condition and for updating the current solution. The basic information required at each iteration in the ordinary simplex method includes:
(i) the net evaluations corresponding to the non-basic variables;
(ii) the column vector corresponding to the most negative net evaluation;
(iii) the values of the current basic variables;
(iv) the selection of an outgoing vector from the basis.
© Springer Nature Singapore Pte Ltd. 2019
A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_4

This information can be computed directly from its definitions provided the inverse of the basis matrix is known. To get the inverse of the basis matrix at any
iteration, we need to compute it from the previous inverse matrix. For this purpose, the relevant information we must know at each iteration is:
(i) the coefficients of the non-basic variables in the objective function;
(ii) the components of the vectors corresponding to the non-basic variables.
Another feature of the revised simplex method is to treat the objective function z = ⟨c, x⟩ as a constraint, considering z as an unrestricted variable.
4.3
Standard Forms of Revised Simplex Method
There are two standard forms of the revised simplex method.
1. Standard form I: In this form, an initial basis matrix is obtained by introducing slack variables and/or surplus variables. Basically, an identity matrix is taken as the initial basis matrix.
2. Standard form II: In this form, artificial variables are also required to obtain an identity matrix, which is taken as the initial basis matrix. Thus, a two-phase method is used to handle the artificial variables. The method for solving standard form I may also be applied to solve these types of problems.
4.4
Standard Form I of Revised Simplex Method
Let us consider the following LPP: Maximize z = ⟨c, x⟩ subject to Ax = b, x ≥ 0, where c, x ∈ ℝⁿ, b ∈ ℝᵐ and A is an m × n real matrix of rank m. In the revised simplex method we consider the objective function z = ⟨c, x⟩ as one of the constraints and z as a variable. Thus, the problem is now to solve a new system of (m + 1) simultaneous linear equations in the (n + 1) variables x1, x2, …, xn, z, in which z is unrestricted in sign and is to be made as large as possible. The set of constraints can thus be represented as

Ax + O·z = b
−⟨c, x⟩ + z = 0
x ≥ 0, z unrestricted in sign,

or

[  A    O ] [ x ]   [ b ]
[ −cᵀ   1 ] [ z ] = [ 0 ],   x ≥ 0.        (4.1)
Let B be an initial basis submatrix of A and xB = B⁻¹b an initial basic feasible solution (BFS) of the original LPP. Then the initial BFS of the reformulated problem (4.1) is given by
B xB + O·z = b
−⟨cB, xB⟩ + z = 0,

or

[  B    O ] [ xB ]   [ b ]
[ −cB   1 ] [ z  ] = [ 0 ],

i.e. B̂ x̂B = b̂, where

B̂ = [  B   O
       −cB  1 ],   x̂B = [ xB
                          z  ],   b̂ = [ b
                                        0 ],

so that x̂B = B̂⁻¹ b̂, which is the initial BFS of (4.1). We note that x̂B is called the BFS whenever xB is the BFS of the original LPP, even though z is unrestricted in sign.

Let B̂⁻¹ = [ A1  A2
            A3  A4 ]. Then B̂⁻¹ B̂ = I gives

[ A1  A2 ] [  B   O ]   [ I  O ]
[ A3  A4 ] [ −cB  1 ] = [ O  1 ],

or

[ A1B − A2 cB   A2 ]   [ I  O ]
[ A3B − A4 cB   A4 ] = [ O  1 ],

⇒ A1B − A2 cB = I,  A2 = O        (4.2)
   A3B − A4 cB = O,  A4 = 1.       (4.3)

From (4.2): A1B = I [∵ A2 = O], so A1 = B⁻¹.
From (4.3): A3B = cB [∵ A4 = 1], so A3 = cB B⁻¹.

Hence

B̂⁻¹ = [  B⁻¹     O
         cB B⁻¹   1 ].

Let us define a new (m + 1) × n matrix Â by Â = [ A
                                                 −c ]. Then

ŷ = B̂⁻¹ Â = [  B⁻¹A
               cB B⁻¹A − c ] = [  Y
                                  cB Y − c ]   [∵ B⁻¹A = Y].

Now

cB Y − c = (cB1, cB2, …, cBm)(yij) − (c1, c2, …, cn)
        = ( Σi cBi yi1 − c1,  Σi cBi yi2 − c2,  …,  Σi cBi yin − cn )
        = ( z1 − c1,  z2 − c2,  …,  zn − cn ),

⇒ ŷj = [  yj
          zj − cj ],   j = 1, 2, …, n,        (4.4)

or (ŷ1, ŷ2, …, ŷn) = [  y1       y2      …   yn
                        z1 − c1  z2 − c2 …   zn − cn ].

From these expressions it is clear that the first m components of the jth column constitute the vector yj, and the (m + 1)th component is zj − cj (the net evaluation). Hence the net evaluations are the components of

cB B⁻¹A − c = ( cB B⁻¹   1 ) [  A
                               −c ].

Also, x̂B = [ xB
             z  ] = B̂⁻¹ b̂, where the first m components of x̂B are xB and the (m + 1)th component is the corresponding value of the objective function.
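The algebra above says that carrying B⁻¹ (together with the extra row cB B⁻¹) is enough to price every column and read off the current solution. The following small numerical illustration (NumPy assumed; the data is the small two-constraint problem that reappears as Example 1 below, with the slack columns as the starting basis) evaluates these formulas directly:

```python
import numpy as np

# Data for: maximize <c, x> s.t. Ax = b, x >= 0,
# with the current basis formed by columns 3 and 4 (the slacks).
A = np.array([[2.0, -1.0, 2.0, 1.0, 0.0],
              [1.0, 0.0, 4.0, 0.0, 1.0]])
b = np.array([2.0, 4.0])
c = np.array([6.0, -2.0, 3.0, 0.0, 0.0])

basis = [3, 4]
B_inv = np.linalg.inv(A[:, basis])   # here simply the 2x2 identity
c_B = c[basis]

# Net evaluations z_j - c_j = c_B B^{-1} a_j - c_j for every column j
net = c_B @ B_inv @ A - c
x_B = B_inv @ b                      # current basic solution
z = c_B @ x_B                        # current objective value
print(net, x_B, z)
```

With the slack basis the multiplier row cB B⁻¹ is zero, so the net evaluations reduce to −c, exactly as the formula (cB B⁻¹  1)·Â predicts.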
4.5
Computational Procedure of Standard Form I of Revised Simplex Method
Step 1 Introduce slack and surplus variables, if needed, and reformulate the given LPP in maximization standard form.
Step 2 Obtain an initial BFS with an initial basis B = Im and obtain the inverse of the new basis matrix B̂ = [ B  O ; −cB  1 ] by the formula B̂⁻¹ = [ B⁻¹  O ; cB B⁻¹  1 ].
Step 3 Find Â and b̂ by the formulas Â = [ A ; −c ] and b̂ = [ b ; 0 ].
Step 4 Compute the net evaluations zj − cj (j = 1, 2, …, n) using the formula zj − cj = ( cB B⁻¹  1 ) [ A ; −c ].
(a) If all zj − cj are non-negative, the current BFS is an optimum one.
(b) If at least one zj − cj is negative, determine the most negative of them, say zk − ck; then the corresponding vector ŷk enters the basis. Go to Step 5.
Step 5 Compute ŷk = B̂⁻¹ âk. In that case,
(a) If all ŷik ≤ 0, there exists an unbounded optimum solution to the given problem.
(b) If at least one ŷik > 0, find the minimum ratio of the components of x̂B to the corresponding positive ŷik and determine the outgoing vector and the key (or leading) element.
Step 6 Write down the results obtained in Step 2 through Step 5 in a tabular form known as a revised simplex table.
Step 7 Update the B̂⁻¹ matrix by suitable row operations to convert the key element to unity and all other elements of the entering column to zero, and then update the current BFS using the formula x̂B = B̂⁻¹ b̂.
Step 8 Go to Step 4 and repeat the procedure until an optimum BFS is obtained or there is an indication of an unbounded solution.

Example 1 Using the revised simplex method, solve the following LPP:
Maximize z = 6x1 − 2x2 + 3x3
subject to 2x1 − x2 + 2x3 ≤ 2, x1 + 4x3 ≤ 4 and x1, x2, x3 ≥ 0.
Solution The given LPP is a maximization problem. Here all bi's (i = 1, 2) are positive. Now introducing the slack variables x4 ≥ 0 and x5 ≥ 0 in the first and second constraints respectively, the given LPP is rewritten in standard form as follows:
Maximize z = 6x1 − 2x2 + 3x3 + 0·x4 + 0·x5
subject to
2x1 − x2 + 2x3 + x4 = 2
x1 + 4x3 + x5 = 4
and xj ≥ 0, j = 1, 2, …, 5,

or Maximize z = ⟨c, x⟩ subject to Ax = b, x ≥ 0, where

c = ( 6  −2  3  0  0 ),   x = ( x1 x2 x3 x4 x5 )ᵀ,
A = [ 2  −1  2  1  0
      1   0  4  0  1 ],   b = [ 2
                                4 ].

The initial BFS is x4 = 2, x5 = 4 with B = I2 as the initial basis. Here cB = (0, 0), so cB B⁻¹ = (0, 0) and

B̂⁻¹ = [  B⁻¹     O  ]   [ 1  0  0 ]
       [ cB B⁻¹   1  ] = [ 0  1  0 ]
                         [ 0  0  1 ].
Since the net evaluations zj − cj are the components of ( cB B⁻¹  1 )[ A ; −c ], the net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z2 − c2   z3 − c3 ) = ( cB B⁻¹  1 )( â1  â2  â3 )
                                = ( 0  0  1 ) [  2  −1   2
                                                 1   0   4
                                                −6   2  −3 ]
                                = ( −6   2  −3 ),

since Â = [ A ; −c ] = [  2  −1   2  1  0
                          1   0   4  0  1
                         −6   2  −3  0  0 ].

These calculations indicate that z1 − c1 is the most negative, and hence ŷ1 enters the basis. Now

ŷ1 = B̂⁻¹ â1 = ( 2  1  −6 )ᵀ = [ y1 ; z1 − c1 ].

Also, x̂B = B̂⁻¹ b̂ = ( 2  4  0 )ᵀ  [∵ b̂ = ( 2  4  0 )ᵀ], so xB = (2, 4)ᵀ and zB = 0.

Now we construct the initial revised simplex table.

 ŷB    x̂B        B̂⁻¹         ŷ1    Min ratio
 ŷ4     2     1   0   0       2    2/2 = 1
 ŷ5     4     0   1   0       1    4/1 = 4
         0     0   0   1      −6
The minimum ratio corresponds to the first basic variable, i.e. x4; hence ŷ4 leaves the basis and ŷ11 = 2 is the key element.
First iteration: We introduce ŷ1 and drop ŷ4. In this iteration,

B̂⁻¹ = [  1/2  0  0
         −1/2  1  0
          3    0  1 ]   by R1′ = (1/2)R1, R2′ = R2 − R1′, R3′ = R3 + 6R1′.

Here cB = (6, 0). The net evaluations corresponding to the non-basic variables are given by

( z2 − c2   z3 − c3   z4 − c4 ) = ( cB B⁻¹  1 )( â2  â3  â4 )
                                = ( 3  0  1 ) [ −1   2  1
                                                 0   4  0
                                                 2  −3  0 ]
                                = ( −1   3   3 ).

Clearly z2 − c2 is negative; hence ŷ2 enters the basis. Now

ŷ2 = B̂⁻¹ â2 = ( −1/2  1/2  −1 )ᵀ

and x̂B = B̂⁻¹ b̂ = ( 1  3  6 )ᵀ, so xB = (1, 3)ᵀ and zB = 6.

Now we construct the next revised simplex table.

 ŷB    x̂B         B̂⁻¹           ŷ2     Min ratio
 ŷ1     1     1/2   0   0      −1/2       –
 ŷ5     3    −1/2   1   0       1/2       6
         6      3   0   1       −1
Second iteration: We introduce ŷ2 and drop ŷ5. In this iteration,

B̂⁻¹ = [  0  1  0
         −1  2  0
          2  2  1 ]   by R2′ = 2R2, R1′ = R1 + (1/2)R2′, R3′ = R3 + R2′.

Here cB = (6, −2). The net evaluations corresponding to the non-basic variables are given by

( z3 − c3   z4 − c4   z5 − c5 ) = ( cB B⁻¹  1 )( â3  â4  â5 )
                                = ( 2  2  1 ) [  2  1  0
                                                 4  0  1
                                                −3  0  0 ]
                                = ( 9   2   2 ).

Since all zj − cj > 0 for the non-basic variables, an optimum basic feasible solution is obtained:

x̂B = B̂⁻¹ b̂ = [  0  1  0 ] [ 2 ]   [  4 ]
             [ −1  2  0 ] [ 4 ] = [  6 ]
             [  2  2  1 ] [ 0 ]   [ 12 ],

so xB = (4, 6)ᵀ and zB = 12.
Hence, the optimal solution of the given LPP is x1 = 4, x2 = 6, x3 = 0 and zmax = 12.

Example 2 Using the revised simplex method, solve the following LPP:
Maximize z = x1 + x2 + 3x3
subject to 3x1 + 2x2 + x3 ≤ 3, 2x1 + x2 + 2x3 ≤ 2 and x1, x2, x3 ≥ 0.
Solution This is a maximization problem. Here all bi's are positive. Now introducing the slack variables x4 ≥ 0 and x5 ≥ 0 in the first and second constraints respectively, the given LPP is rewritten in standard form as follows:
Maximize z = x1 + x2 + 3x3 + 0·x4 + 0·x5
subject to 3x1 + 2x2 + x3 + x4 = 3, 2x1 + x2 + 2x3 + x5 = 2 and xj ≥ 0, j = 1, 2, …, 5,

or Maximize z = ⟨c, x⟩ subject to Ax = b, x ≥ 0, where

c = ( 1  1  3  0  0 ),   x = ( x1 x2 x3 x4 x5 )ᵀ,
A = [ 3  2  1  1  0
      2  1  2  0  1 ],   b = [ 3
                               2 ].
Here the initial BFS is x4 = 3, x5 = 2 with B = I2 as the initial basis and cB = (0, 0). Now

Â = [ A ; −c ] = [  3   2   1  1  0
                    2   1   2  0  1
                   −1  −1  −3  0  0 ],   b̂ = ( 3  2  0 )ᵀ,

B̂⁻¹ = [  B⁻¹     O  ]   [ 1  0  0 ]
       [ cB B⁻¹   1  ] = [ 0  1  0 ]      [∵ cB B⁻¹ = (0, 0)].
                         [ 0  0  1 ]

The net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z2 − c2   z3 − c3 ) = ( 0  0  1 )( â1  â2  â3 ) = ( −1  −1  −3 ).

This indicates that z3 − c3 is the most negative; hence ŷ3 enters the basis. Now

ŷ3 = B̂⁻¹ â3 = ( 1  2  −3 )ᵀ   [∵ â3 = ( 1  2  −3 )ᵀ],

and x̂B = B̂⁻¹ b̂ = ( 3  2  0 )ᵀ, so xB = (3, 2)ᵀ and zB = 0.

Now we construct the initial revised simplex table.

 ŷB    x̂B        B̂⁻¹         ŷ3    Min ratio
 ŷ4     3     1   0   0       1    3/1 = 3
 ŷ5     2     0   1   0       2    2/2 = 1
         0     0   0   1      −3

The minimum ratio corresponds to the second basic variable, i.e. x5; hence ŷ5 leaves the basis.
First iteration: In this iteration,

B̂⁻¹ = [ 1  −1/2  0
        0   1/2  0
        0   3/2  1 ]   by R2′ = (1/2)R2, R1′ = R1 − R2′, R3′ = R3 + 3R2′.

The net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z2 − c2   z5 − c5 ) = ( cB B⁻¹  1 )( â1  â2  â5 )
                                = ( 0  3/2  1 ) [  3   2  0
                                                   2   1  1
                                                  −1  −1  0 ]
                                = ( 2   1/2   3/2 ).

Since all zj − cj ≥ 0, an optimum basic feasible solution is obtained:

x̂B = B̂⁻¹ b̂ = [ 1  −1/2  0 ] [ 3 ]   [ 2 ]
             [ 0   1/2  0 ] [ 2 ] = [ 1 ]
             [ 0   3/2  1 ] [ 0 ]   [ 3 ],

so xB = (2, 1)ᵀ (i.e. x4 = 2, x3 = 1) and zB = 3. Hence, the optimal solution is x1 = 0, x2 = 0, x3 = 1, and the maximum value of z is 3.
4.6
Standard Form II of Revised Simplex Method
In this form, if artificial variables appear in an LPP, then the initial basis matrix is obtained by using the vectors corresponding to the slack and artificial variables. In that case, to solve the problem, either Charnes' Big M method or the two-phase method may be used. We denote these methods as Charnes' Big M revised simplex method and the two-phase revised simplex method. We shall discuss both methods separately.
4.6.1
Charnes' Big M Revised Simplex Method
In this case, the auxiliary basis matrix B̂ = [ B  O ; −cB  1 ] is not a unit matrix, and the associated profit vector cB is not a null vector. Thus B̂⁻¹ = [ B⁻¹  O ; cB B⁻¹  1 ] will not be an identity matrix. This is the only difference between the standard forms I and II of the revised simplex method; all other theoretical developments and computational procedures are the same as in standard form I.

Example 3 Using the revised simplex method, solve the following LPP:
Maximize z = x1 + 2x2
subject to 2x1 + 5x2 ≥ 6, x1 + x2 ≥ 2, x1, x2 ≥ 0.

Solution This is a maximization problem. All the bi's (i = 1, 2) are positive. Now introducing the surplus variables x3 ≥ 0 and x4 ≥ 0 in the first and second constraints respectively, the given LPP is written as follows:
Maximize z = x1 + 2x2 + 0·x3 + 0·x4
subject to 2x1 + 5x2 − x3 = 6, x1 + x2 − x4 = 2, xj ≥ 0, j = 1, 2, 3, 4.

The coefficient matrix of the constraints does not contain a unit basis matrix. To obtain a unit matrix, two artificial variables x5, x6 are added to the left-hand sides of the first and second constraints. Then the LPP becomes:
Maximize z = x1 + 2x2 + 0·x3 + 0·x4 − Mx5 − Mx6
subject to 2x1 + 5x2 − x3 + x5 = 6, x1 + x2 − x4 + x6 = 2, xj ≥ 0, j = 1, 2, …, 6,

or Maximize z = ⟨c, x⟩ subject to Ax = b, x ≥ 0, where

c = ( 1  2  0  0  −M  −M ),   x = ( x1 x2 … x6 )ᵀ,
A = [ 2  5  −1   0  1  0
      1  1   0  −1  0  1 ],   b = [ 6
                                    2 ].
The coefficients of the variables constitute the column vectors
y1 = (2, 1)ᵀ, y2 = (5, 1)ᵀ, y3 = (−1, 0)ᵀ, y4 = (0, −1)ᵀ, y5 = (1, 0)ᵀ, y6 = (0, 1)ᵀ,
and the requirement vector is b = (6, 2)ᵀ.

Here the artificial vectors y5 and y6 constitute the initial basis matrix B = I2, and taking this as the initial basis we have the initial BFS xB = B⁻¹b = I2 b = b ≥ 0. Therefore xB = (xB1, xB2)ᵀ = (x5, x6)ᵀ = (6, 2)ᵀ, i.e. x5 = 6, x6 = 2. Here B = I2 and cB = (−M, −M). Now

Â = [ A ; −c ] = [  2   5  −1   0  1  0
                    1   1   0  −1  0  1
                   −1  −2   0   0  M  M ],

B̂⁻¹ = [  B⁻¹     O  ]   [  1   0  0 ]
       [ cB B⁻¹   1  ] = [  0   1  0 ]      [∵ cB B⁻¹ = (−M, −M)].
                         [ −M  −M  1 ]

Now, the net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z2 − c2   z3 − c3   z4 − c4 ) = ( cB B⁻¹  1 )( â1  â2  â3  â4 )
   = ( −M  −M  1 ) [  2   5  −1   0
                      1   1   0  −1
                     −1  −2   0   0 ]
   = ( −3M − 1,  −6M − 2,  M,  M ).

This indicates that z2 − c2 is the most negative, and hence ŷ2 enters the basis. Now

ŷ2 = B̂⁻¹ â2 = ( 5,  1,  −6M − 2 )ᵀ   [∵ â2 = (5, 1, −2)ᵀ].

Also, x̂B = B̂⁻¹ b̂ = ( 6,  2,  −8M )ᵀ, so xB = (6, 2)ᵀ and zB = −8M.

Now we construct the initial revised simplex table.

 ŷB    x̂B        B̂⁻¹              ŷ2        Min ratio
 ŷ5     6      1    0   0          5         6/5
 ŷ6     2      0    1   0          1         2
       −8M    −M   −M   1      −6M − 2

The minimum ratio corresponds to the first basic variable, i.e. x5; hence ŷ5 leaves the basis.
First iteration: We introduce ŷ2 and drop ŷ5. In this case,

B̂⁻¹ = [     1/5      0  0
           −1/5      1  0
        (M + 2)/5   −M  1 ]   by R1′ = (1/5)R1, R2′ = R2 − R1′, R3′ = R3 + (6M + 2)R1′.

Now, the net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z3 − c3   z4 − c4   z5 − c5 ) = ( (M + 2)/5  −M  1 ) [  2  −1   0  1
                                                                    1   0  −1  0
                                                                   −1   0   0  M ]
   = ( −(3M + 1)/5,  −(M + 2)/5,  M,  (6M + 2)/5 ).

Since z1 − c1 is the most negative, ŷ1 enters the basis. Now

ŷ1 = B̂⁻¹ â1 = ( 2/5,  3/5,  −(3M + 1)/5 )ᵀ.

Again, x̂B = B̂⁻¹ b̂ = ( 6/5,  4/5,  (−4M + 12)/5 )ᵀ, so xB = (6/5, 4/5)ᵀ and zB = (−4M + 12)/5.
The next revised simplex table is as follows: by B
bx B
b 1 B
by 2 by 6
6/5 4/5
1/5 1=5
4M þ 12 5
M þ2 5
0 1 M
0 0 1
by 1
Mini ratio
2/5 3/5
3 4/3
3M5þ 1
Second iteration: We introduce by 1 and drop by 6 . In this case, 0
b B
1
1 1=3 2=3 0 ¼ @ 1=3 5=3 0 A by 1=3 1=3 1
R02 ¼ 53 R2 ; R01 ¼ R1 25 R02 ; R03 ¼ R3 þ
3M þ 1 0 5 R2
:
Now, the net evaluations zj cj corresponding to non-basic variables are given by ðz3 c3 z4 c4 z5 c5 z6 c6 Þ ¼ cB B1 1 ð b a4 b a5 b a6 Þ a3 b 0 1 1 0 1 0 1 1 B C ¼ 3 3 1 @ 0 1 0 1 A 0 0 M M ¼ 13 13 M þ 13 M þ 13 : Clearly, z3 c3 and z4 c4 are most negative. Any one of them may enter the basis. Arbitrarily, we take by 3 as the entering vector. 0
b Now; by 3 ¼ B
1
1=3 b a 3 ¼ @ 1=3 1=3
2=3 5=3 1=3
10 1 0 1 0 1 1=3 0 A@ 0 A ¼ @ 1=3 A: 0 1=3 1
Again, x̂B = B̂⁻¹ b̂ = [  1/3  −2/3  0 ] [ 6 ]   [ 2/3 ]
                    [ −1/3   5/3  0 ] [ 2 ] = [ 4/3 ]
                    [  1/3   1/3  1 ] [ 0 ]   [ 8/3 ],

so xB = (2/3, 4/3)ᵀ and zB = 8/3.

The next revised simplex table is as follows:

 ŷB    x̂B          B̂⁻¹            ŷ3      Min ratio
 ŷ2    2/3     1/3  −2/3  0      −1/3       –
 ŷ1    4/3    −1/3   5/3  0       1/3       4
        8/3     1/3   1/3  1      −1/3

The minimum ratio corresponds to the second basic variable, i.e. x1; hence ŷ1 leaves the basis.
Third iteration: In this case,

B̂⁻¹ = [  0  1  0
         −1  5  0
          0  2  1 ]   by R2′ = 3R2, R1′ = R1 + (1/3)R2′, R3′ = R3 + (1/3)R2′.
The net evaluations corresponding to the non-basic variables are given by

( z1 − c1   z4 − c4   z5 − c5   z6 − c6 ) = ( 0  2  1 ) [  2   0  1  0
                                                           1  −1  0  1
                                                          −1   0  M  M ]
   = ( 1,  −2,  M,  M + 2 ).

Clearly z4 − c4 is negative, so ŷ4 enters the basis. Now

ŷ4 = B̂⁻¹ â4 = [  0  1  0 ] [  0 ]   [ −1 ]
             [ −1  5  0 ] [ −1 ] = [ −5 ]
             [  0  2  1 ] [  0 ]   [ −2 ].

Since both components of the entering vector ŷ4 that correspond to the basic variables are negative, there is an indication of an unbounded solution. Hence, the given LPP has an unbounded solution.
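The unboundedness detected by the entering column can be confirmed with a library solver as well. A sketch using scipy.optimize.linprog (assuming SciPy is available); in the HiGHS backend, status code 3 means the objective is unbounded:

```python
# Example 3: maximize x1 + 2*x2
# s.t. 2*x1 + 5*x2 >= 6 and x1 + x2 >= 2, x1, x2 >= 0.
from scipy.optimize import linprog

res = linprog([-1, -2],                    # minimize -z
              A_ub=[[-2, -5],              # 2*x1 + 5*x2 >= 6
                    [-1, -1]],             # x1 + x2 >= 2
              b_ub=[-6, -2],
              bounds=[(0, None)] * 2)
print(res.status)   # expected: 3 (unbounded)
```

Intuitively, increasing x2 (or x1) indefinitely keeps both ≥ constraints satisfied while the objective grows without bound.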
4.6.2
Two-Phase Revised Simplex Method
In this method, besides the original objective function z, an auxiliary objective function z* is formed by taking the coefficients of the artificial variables as −1 and the coefficients of all other variables as zero. Then both objective functions are converted to constraints, considering z and z* as unrestricted variables.

Example 4 Use the two-phase revised simplex method to solve the following LPP:
Maximize z = 3x1 − x2
subject to 2x1 + x2 ≥ 2, x1 + 3x2 ≤ 2 and x1, x2 ≥ 0.
Solution This is a maximization problem. Here all bi ’s are positive. Now introducing the slack variable x3 and surplus variable x4 in the second and first constraints respectively, and then considering the given objective function as a constraint, with z as a variable, we get the reduced set of constraints as follows: 2x1 þ x2 x4
¼2
x1 þ 3x2 þ x3 ¼ 2 3x1 þ x2 þ z ¼ 0 and xj 0; j ¼ 1; 2; 3; 4: To get the initial basis matrix B as an identity matrix, introducing one artificial variable x5 ð 0Þ in the first constraint, we get the reduced constraints as 2x1 þ x2 x4 þ x5 ¼ 2 x1 þ 3x2 þ x3 ¼2 3x1 þ x2 þ z
¼ 0:
We shall solve this problem using the two-phase revised simplex method. In Phase I, the auxiliary objective function is Maximize
z ¼ x5
i.e. Maximize z ¼ 2x1 þ x2 x4 2 [substituting x5 from the0first constraint]. 1 1 0 0 Here the initial BFS is x3 ¼ 2; x5 ¼ 2; z ¼ 0 with I3 ¼ @ 0 1 0 A as the 0 0 1 initial basis,
96
4 Revised Simplex Method
0
1 0 0 1 0 A. Here cB ¼ ð 0 0 0 1 0 1 1 0 0 ¼ @ 0 1 0 A. 0 0 1
1 i.e. B ¼ @ 0 0 Now, B1
0 Þ.
0 1 1 2 1 0 0 0 B C B0 1 0 0C B1 0 C. Here b ¼ B 2 C. ) B1 ¼ ¼B 1 @ @ A 0 A 0 0 1 0 cB B 1 2 0 0 0 1 The net evaluations zj cj corresponding to non-basic variables are given by
ð z 1 c1
z 2 c2
¼ ð0 0
0
¼ ð 2
1
0
z 4 c4 Þ
1 Þð a1
a2
1 Þ:
2
0
a1
6 B 6 B 2 6 B 6 a4 Þ 6 * A ¼ B B 1 6 B 4 @ 3 2
a2
a3
a4
1
0
1
3 1
1 0
0 0
1
0
1
a5
13
C7 1 C7 C7 7 0C C7 C7 0 A5 0
This indicates that z1 − c1 is the most negative. Hence y1 enters the basis. Now

  y1 = B⁻¹a1 = (2, 1, −3, −2)ᵀ

and

  xB = B⁻¹b = (2, 2, 0, −2)ᵀ.

Now we construct the initial revised simplex table.

  yB    xB       B⁻¹           y1    Min ratio
  y5     2    1  0  0  0        2    2/2 = 1
  y3     2    0  1  0  0        1    2/1 = 2
  z      0    0  0  1  0       −3    –
  z*    −2    0  0  0  1       −2    –

According to the minimum ratio, y5 leaves the basis and y11 = 2 is the key element.
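These quantities can be checked mechanically. The sketch below (plain Python with exact rational arithmetic; the variable and function names are ours, not the book's) reproduces the entering column y1 = B⁻¹a1, the basic solution xB = B⁻¹b and the minimum ratio for the starting table above.

```python
from fractions import Fraction as F

# Augmented initial basis inverse for Example 4 (identity: basis x5, x3, z, z*).
B_inv = [[F(1), 0, 0, 0],
         [0, F(1), 0, 0],
         [0, 0, F(1), 0],
         [0, 0, 0, F(1)]]

# Column a1 and requirement vector b, each extended by the z- and z*-rows.
a1 = [F(2), F(1), F(-3), F(-2)]
b  = [F(2), F(2), F(0), F(-2)]

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(r[i] * v[i] for i in range(len(v))) for r in M]

y1 = mat_vec(B_inv, a1)      # entering column y1 = B^{-1} a1
x_B = mat_vec(B_inv, b)      # current basic solution

# Minimum-ratio test over the structural rows only (rows of x5 and x3).
ratios = [x_B[i] / y1[i] for i in range(2) if y1[i] > 0]
print(y1, x_B, min(ratios))  # y1 = (2, 1, -3, -2), minimum ratio 1
```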
First iteration: We introduce y1 and drop y5. The updated basis inverse is

  B⁻¹ =
    [  1/2  0  0  0 ]
    [ −1/2  1  0  0 ]
    [  3/2  0  1  0 ]
    [   1   0  0  1 ]

The net evaluations zj − cj corresponding to the non-basic variables are given by

  (z2 − c2, z4 − c4, z5 − c5) = (1 0 0 1)(a2 a4 a5) = (0, 0, 1).

This indicates that all zj − cj ≥ 0 for the non-basic variables. Hence the current solution is optimal for Phase I. The optimum solution for this phase is

  xB = B⁻¹b = (1, 1, 3, 0)ᵀ,

i.e. x1 = 1, x3 = 1, z = 3 and Max z* = 0. Since there is no artificial variable in the current basis, we shall apply Phase II.

Phase II: From the basis inverse B⁻¹ of the optimum obtained in Phase I, we get

  B̂⁻¹ =
    [  1/2  0  0 ]
    [ −1/2  1  0 ]
    [  3/2  0  1 ]

by deleting the fourth row and the fourth column.
Here

        a1  a2  a3  a4
  Â = [  2   1   0  −1 ]
      [  1   3   1   0 ]
      [ −3   1   0   0 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are given by

  (z2 − c2, z4 − c4) = (3/2, 0, 1)(â2 â4) = (5/2, −3/2).

This indicates that z4 − c4 is negative. Hence ŷ4 enters the basis, with

  ŷ4 = B̂⁻¹â4 = (−1/2, 1/2, −3/2)ᵀ

and

  x̂B = B̂⁻¹b̂ = (1, 1, 3)ᵀ, where b̂ = (2, 2, 0)ᵀ.

Now we construct the revised simplex table.

  ŷB    x̂B       B̂⁻¹           ŷ4     Min ratio
  ŷ1     1     1/2  0  0      −1/2     –
  ŷ3     1    −1/2  1  0       1/2     1/(1/2) = 2
  z      3     3/2  0  1      −3/2     –

According to the minimum ratio, ŷ3 leaves the basis and ŷ24 = 1/2 is the key element.

First iteration: The updated basis inverse is

  B̂⁻¹ =
    [  0  1  0 ]
    [ −1  2  0 ]
    [  0  3  1 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are given by

  (z2 − c2, z3 − c3) = (0, 3, 1)(â2 â3) = (10, 3).

Since all zj − cj > 0 for the non-basic variables, an optimum basic feasible solution has been obtained:

  x̂B = B̂⁻¹b̂ = (2, 2, 6)ᵀ,

i.e. x1 = 2, x4 = 2 and zB = 6, so that x1 = 2, x2 = 0 and zMax = 6.
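As an independent sanity check on Example 4 (a brute-force sketch of ours, not part of the revised simplex method itself), the optimum of this two-variable LPP can be confirmed by enumerating the intersection points of the constraint lines and keeping the feasible ones.

```python
from fractions import Fraction as F
from itertools import combinations

# Constraint lines of Example 4, written as a*x1 + b*x2 = c
# (the two structural constraints plus the axes x1 = 0 and x2 = 0).
lines = [(F(2), F(1), F(2)),   # 2x1 + x2 = 2
         (F(1), F(3), F(2)),   # x1 + 3x2 = 2
         (F(1), F(0), F(0)),   # x1 = 0
         (F(0), F(1), F(0))]   # x2 = 0

def feasible(x1, x2):
    return (x1 >= 0 and x2 >= 0 and
            2 * x1 + x2 >= 2 and x1 + 3 * x2 <= 2)

best = None
for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue                      # parallel lines: no vertex
    x1 = (c1 * b2 - c2 * b1) / det    # Cramer's rule for the 2x2 system
    x2 = (a1 * c2 - a2 * c1) / det
    if feasible(x1, x2):
        z = 3 * x1 - x2
        if best is None or z > best[0]:
            best = (z, x1, x2)

print(best)   # (6, 2, 0)
```

The enumeration confirms zMax = 6 at (x1, x2) = (2, 0), the same point reached in Phase II.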
Example 5 Solve the following LPP by the two-phase revised simplex method:

  Maximize z = x1 + 2x2 + 3x3 + 4x4
  subject to 3x1 + 2x2 + 3x3 − x4 ≤ 25
             −2x1 + x2 − 2x3 + x4 ≥ 5
             2x1 + 2x2 + x3 + x4 = 20
  and x1, x2, x3, x4 ≥ 0.

Solution This is a maximization problem and all bi's are positive. Introducing the slack variable x5 (≥ 0) and the surplus variable x6 (≥ 0) in the first and second constraints respectively, the given LPP in standard form is

  Maximize z = x1 + 2x2 + 3x3 + 4x4 + 0·x5 + 0·x6
  subject to 3x1 + 2x2 + 3x3 − x4 + x5 = 25
             −2x1 + x2 − 2x3 + x4 − x6 = 5
             2x1 + 2x2 + x3 + x4 = 20
  and xi ≥ 0, i = 1, 2, …, 6.

Treating the objective function as a constraint with z as a variable, we get the reduced set of constraints

  3x1 + 2x2 + 3x3 − x4 + x5 = 25
  −2x1 + x2 − 2x3 + x4 − x6 = 5
  2x1 + 2x2 + x3 + x4 = 20
  −x1 − 2x2 − 3x3 − 4x4 + z = 0.

To get the initial unit basis matrix, we introduce two artificial variables x7 (≥ 0) and x8 (≥ 0) in the second and third constraints respectively:

  3x1 + 2x2 + 3x3 − x4 + x5 = 25
  −2x1 + x2 − 2x3 + x4 − x6 + x7 = 5
  2x1 + 2x2 + x3 + x4 + x8 = 20
  −x1 − 2x2 − 3x3 − 4x4 + z = 0.

We shall solve this problem by the two-phase method. In Phase I, the auxiliary objective function is Maximize z* = −x7 − x8, i.e.

  Maximize z* = 3x2 − x3 + 2x4 − x6 − 25

[substituting x7 and x8 from the second and third constraints; the x1-terms cancel]. Written as a constraint, this is −3x2 + x3 − 2x4 + x6 + z* = −25.
Here the initial BFS is x5 = 25, x7 = 5, x8 = 20, z = 0, z* = −25, with the augmented basis B = I5, so that B⁻¹ = I5 and cB = (0, 0, 0, 0, 0). The net evaluations zj − cj corresponding to the non-basic variables are

  (z1 − c1, z2 − c2, z3 − c3, z4 − c4, z6 − c6) = (0 0 0 0 1)(a1 a2 a3 a4 a6) = (0, −3, 1, −2, 1),

since

        a1  a2  a3  a4  a6
  A = [  3   2   3  −1   0 ]
      [ −2   1  −2   1  −1 ]
      [  2   2   1   1   0 ]
      [ −1  −2  −3  −4   0 ]   (z-row)
      [  0  −3   1  −2   1 ]   (z*-row).

This indicates that z2 − c2 is the most negative. Hence y2 enters the basis. Now

  y2 = B⁻¹a2 = (2, 1, 2, −2, −3)ᵀ

and

  xB = B⁻¹b = (25, 5, 20, 0, −25)ᵀ.

Now we construct the initial revised simplex table.

  yB    xB         B⁻¹              y2    Min ratio
  y5    25    1  0  0  0  0         2     25/2
  y7     5    0  1  0  0  0         1     5/1 = 5
  y8    20    0  0  1  0  0         2     20/2 = 10
  z      0    0  0  0  1  0        −2     –
  z*   −25    0  0  0  0  1        −3     –
According to the minimum ratio, y7 leaves the basis and y22 = 1 is the key element.

First iteration: We introduce y2 and drop y7. The updated basis inverse is

  B⁻¹ =
    [ 1  −2  0  0  0 ]     R1′ = R1 − 2R2
    [ 0   1  0  0  0 ]
    [ 0  −2  1  0  0 ]     R3′ = R3 − 2R2
    [ 0   2  0  1  0 ]     R4′ = R4 + 2R2
    [ 0   3  0  0  1 ]     R5′ = R5 + 3R2.

Again, the net evaluations zj − cj corresponding to the non-basic variables are given by

  (z1 − c1, z3 − c3, z4 − c4, z6 − c6) = (0 3 0 0 1)(a1 a3 a4 a6) = (−6, −5, 1, −2).

This indicates that z1 − c1 is the most negative. Hence y1 enters the basis, with

  y1 = B⁻¹a1 = (7, −2, 6, −5, −6)ᵀ

and

  xB = B⁻¹b = (15, 5, 10, 10, −10)ᵀ.

Hence the next revised simplex table is as follows:

  yB    xB          B⁻¹               y1    Min ratio
  y5    15    1  −2  0  0  0          7     15/7
  y2     5    0   1  0  0  0         −2     –
  y8    10    0  −2  1  0  0          6     10/6 = 5/3
  z     10    0   2  0  1  0         −5     –
  z*   −10    0   3  0  0  1         −6     –
According to the minimum ratio, y8 leaves the basis and y31 = 6 is the key element.

Second iteration: We introduce y1 and drop y8. The updated basis inverse is

  B⁻¹ =
    [ 1   1/3  −7/6  0  0 ]
    [ 0   1/3   1/3  0  0 ]
    [ 0  −1/3   1/6  0  0 ]
    [ 0   1/3   5/6  1  0 ]
    [ 0    1     1   0  1 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are

  (z3 − c3, z4 − c4, z6 − c6) = (0 1 1 0 1)(a3 a4 a6) = (0, 0, 0).

Since all zj − cj corresponding to the non-basic variables are zero, the current solution is optimal for Phase I. Therefore the optimum solution for this phase is

  xB = B⁻¹b = (10/3, 25/3, 5/3, 55/3, 0)ᵀ,

i.e. x1 = 5/3, x2 = 25/3, x3 = 0, x4 = 0 and Max z* = 0. Since there is no artificial variable in the current basis, we shall apply Phase II.
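Before moving on, the Phase I optimum can be double-checked with exact arithmetic. The snippet below (our notation, not the book's) simply multiplies the final Phase I basis inverse by the extended requirement vector and reproduces xB = (10/3, 25/3, 5/3, 55/3, 0)ᵀ.

```python
from fractions import Fraction as F

# Final Phase I basis inverse of Example 5 (rows: x5, x2, x1, z, z*).
B_inv = [[1, F(1, 3), F(-7, 6), 0, 0],
         [0, F(1, 3), F(1, 3), 0, 0],
         [0, F(-1, 3), F(1, 6), 0, 0],
         [0, F(1, 3), F(5, 6), 1, 0],
         [0, 1, 1, 0, 1]]

# Requirement vector, extended by the z- and z*-rows.
b = [25, 5, 20, 0, -25]

x_B = [sum(F(row[i]) * b[i] for i in range(5)) for row in B_inv]
print(x_B)   # [10/3, 25/3, 5/3, 55/3, 0] -> x5, x2, x1, z, z*
```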
Phase II: From the basis inverse B⁻¹ of the optimal solution obtained in Phase I, deleting the fifth row and the fifth column, we have

  B̂⁻¹ =
    [ 1   1/3  −7/6  0 ]
    [ 0   1/3   1/3  0 ]
    [ 0  −1/3   1/6  0 ]
    [ 0   1/3   5/6  1 ]

Here

        a1  a2  a3  a4  a6
  Â = [  3   2   3  −1   0 ]
      [ −2   1  −2   1  −1 ]
      [  2   2   1   1   0 ]
      [ −1  −2  −3  −4   0 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are

  (z3 − c3, z4 − c4, z6 − c6) = (0, 1/3, 5/6, 1)(â3 â4 â6) = (−17/6, −17/6, −1/3).

This indicates that either ŷ3 or ŷ4 can enter the basis. Arbitrarily, we choose ŷ3 as the entering vector. Then

  ŷ3 = B̂⁻¹â3 = (7/6, −1/3, 5/6, −17/6)ᵀ

and

  x̂B = B̂⁻¹b̂ = (10/3, 25/3, 5/3, 55/3)ᵀ.

Hence the next revised simplex table is as follows:

  ŷB     x̂B          B̂⁻¹               ŷ3       Min ratio
  ŷ5    10/3    1   1/3  −7/6  0       7/6      (10/3)/(7/6) = 20/7
  ŷ2    25/3    0   1/3   1/3  0      −1/3      –
  ŷ1     5/3    0  −1/3   1/6  0       5/6*     (5/3)/(5/6) = 2
  z     55/3    0   1/3   5/6  1     −17/6      –
According to the minimum ratio, ŷ1 leaves the basis and ŷ33 = 5/6 is the key element.

First iteration: The updated basis inverse is

  B̂⁻¹ =
    [ 1   4/5  −7/5  0 ]
    [ 0   1/5   2/5  0 ]
    [ 0  −2/5   1/5  0 ]
    [ 0  −4/5   7/5  1 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are

  (z1 − c1, z4 − c4, z6 − c6) = (0, −4/5, 7/5, 1)(â1 â4 â6) = (17/5, −17/5, 4/5).

This indicates that ŷ4 enters the basis, with

  ŷ4 = B̂⁻¹â4 = (−8/5, 3/5, −1/5, −17/5)ᵀ

and

  x̂B = B̂⁻¹b̂ = (1, 9, 2, 24)ᵀ.

Hence the next revised simplex table is as follows:

  ŷB    x̂B          B̂⁻¹               ŷ4      Min ratio
  ŷ5     1    1   4/5  −7/5  0       −8/5      –
  ŷ2     9    0   1/5   2/5  0        3/5*     9/(3/5) = 15
  ŷ3     2    0  −2/5   1/5  0       −1/5      –
  z     24    0  −4/5   7/5  1      −17/5      –
According to the minimum ratio, ŷ2 leaves the basis and ŷ24 = 3/5 is the key element.

Second iteration: The updated basis inverse is

  B̂⁻¹ =
    [ 1   4/3  −1/3  0 ]
    [ 0   1/3   2/3  0 ]
    [ 0  −1/3   1/3  0 ]
    [ 0   1/3  11/3  1 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are

  (z1 − c1, z2 − c2, z6 − c6) = (0, 1/3, 11/3, 1)(â1 â2 â6) = (17/3, 17/3, −1/3).

This indicates that ŷ6 enters the basis, with

  ŷ6 = B̂⁻¹â6 = (−4/3, −1/3, 1/3, −1/3)ᵀ

and

  x̂B = B̂⁻¹b̂ = (25, 15, 5, 75)ᵀ.

The next revised simplex table is as follows:

  ŷB    x̂B          B̂⁻¹               ŷ6      Min ratio
  ŷ5    25    1   4/3  −1/3  0       −4/3      –
  ŷ4    15    0   1/3   2/3  0       −1/3      –
  ŷ3     5    0  −1/3   1/3  0        1/3*     5/(1/3) = 15
  z     75    0   1/3  11/3  1       −1/3      –
According to the minimum ratio, ŷ3 leaves the basis and ŷ36 = 1/3 is the key element.

Third iteration: The updated basis inverse is

  B̂⁻¹ =
    [ 1   0  1  0 ]
    [ 0   0  1  0 ]
    [ 0  −1  1  0 ]
    [ 0   0  4  1 ]

Now the net evaluations zj − cj corresponding to the non-basic variables are

  (z1 − c1, z2 − c2, z3 − c3) = (0, 0, 4, 1)(â1 â2 â3) = (7, 6, 1).

Since the net evaluations corresponding to the non-basic variables are all greater than zero, an optimum basic feasible solution has been obtained:

  x̂B = B̂⁻¹b̂ = (45, 20, 15, 80)ᵀ,

i.e. x5 = 45, x4 = 20, x6 = 15 and zB = 80, so that x1 = 0, x2 = 0, x3 = 0, x4 = 20 and zMax = 80.
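Before leaving Example 5, the reported optimum can be verified directly against the original data (a quick sketch in plain Python; the names are ours): the point (0, 0, 0, 20) satisfies all three constraints and gives z = 80, and the final net evaluations (7, 6, 1) are recovered from the last row of B̂⁻¹.

```python
# Optimal point reported for Example 5.
x1, x2, x3, x4 = 0, 0, 0, 20
z = x1 + 2 * x2 + 3 * x3 + 4 * x4

# The three original constraints of Example 5.
ok = (3 * x1 + 2 * x2 + 3 * x3 - x4 <= 25 and
      -2 * x1 + x2 - 2 * x3 + x4 >= 5 and
      2 * x1 + 2 * x2 + x3 + x4 == 20)

# Net evaluations of the final table: last row of B-hat-inverse times the
# columns a1, a2, a3 of A-hat.
last_row = [0, 0, 4, 1]
cols = {'a1': [3, -2, 2, -1], 'a2': [2, 1, 2, -2], 'a3': [3, -2, 1, -3]}
net = [sum(last_row[i] * a[i] for i in range(4)) for a in cols.values()]

print(z, ok, net)   # 80 True [7, 6, 1]
```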
4.7 Advantages of Revised Simplex Method

The revised simplex method has the following advantages:

(i) It automatically generates the inverse of the current basis matrix and the new basic feasible solution as well.
(ii) It provides more information with less computational effort.
(iii) It requires fewer computations than the ordinary simplex method.
(iv) Fewer entries need to be made in each table of the revised simplex method.
4.8 Difference Between Revised Simplex Method and Simplex Method

Although the simplex method is a straightforward algebraic procedure, it is not an efficient computational procedure for solving LPPs on a digital computer. The ordinary simplex method requires storing all the elements of the entire table in the memory of the computer, which may not be feasible for a very large problem. Also, at each iteration the ordinary simplex method performs many computations that are not needed in the subsequent iteration. In fact, it is not necessary to compute the entire table during each iteration. The only pieces of information needed at each iteration are: (i) the net evaluations zj − cj, (ii) the entering vector, and (iii) the current basic variables and their values. In the revised simplex method this information is obtained directly and very efficiently, avoiding the unnecessary calculations of the ordinary simplex method. Thus, the revised simplex method computes and stores only those pieces of information which are needed in the current iteration.
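The bookkeeping described above can be condensed into a short routine. The sketch below (exact rational arithmetic via Python's fractions module; function and variable names are ours, and the starting basis is assumed to be an identity formed by slack columns) carries only B⁻¹ and xB between iterations, which is precisely the point of the revised simplex method. It is a teaching sketch for small dense problems, not production code.

```python
from fractions import Fraction as F

def revised_simplex(A, b, c, basis):
    """Maximize <c, x> subject to A x = b, x >= 0.

    `basis` must index columns of A that form an identity matrix (e.g. slack
    columns), so the initial B^{-1} is the identity.  Only B^{-1} and x_B are
    updated between iterations; the full table is never stored."""
    m, n = len(A), len(A[0])
    B_inv = [[F(int(i == j)) for j in range(m)] for i in range(m)]
    x_B = [F(v) for v in b]
    while True:
        # Simplex multipliers w = c_B B^{-1}, then net evaluations z_j - c_j.
        w = [sum(c[basis[i]] * B_inv[i][j] for i in range(m)) for j in range(m)]
        net = [sum(w[i] * A[i][j] for i in range(m)) - c[j] for j in range(n)]
        k = min(range(n), key=lambda j: net[j])
        if net[k] >= 0:                       # all z_j - c_j >= 0: optimal
            x = [F(0)] * n
            for i in range(m):
                x[basis[i]] = x_B[i]
            return x, sum(c[j] * x[j] for j in range(n))
        # Entering column y_k = B^{-1} a_k and the minimum-ratio test.
        y = [sum(B_inv[i][j] * A[j][k] for j in range(m)) for i in range(m)]
        rows = [i for i in range(m) if y[i] > 0]
        if not rows:
            raise ValueError("unbounded objective")
        r = min(rows, key=lambda i: x_B[i] / y[i])
        # Pivot: update x_B and B^{-1} by elementary row operations only.
        piv = y[r]
        x_B[r] /= piv
        B_inv[r] = [v / piv for v in B_inv[r]]
        for i in range(m):
            if i != r:
                x_B[i] -= y[i] * x_B[r]
                B_inv[i] = [B_inv[i][j] - y[i] * B_inv[r][j] for j in range(m)]
        basis[r] = k

# A small test LPP: maximize 5x1 + 3x2 subject to 3x1 + 5x2 <= 15,
# 5x1 + 2x2 <= 10 (slacks x3, x4 give the identity starting basis).
A = [[3, 5, 1, 0], [5, 2, 0, 1]]
x, z = revised_simplex(A, [15, 10], [5, 3, 0, 0], [2, 3])
print(x, z)   # x1 = 20/19, x2 = 45/19, zMax = 235/19
```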
4.9 Exercises

Solve the following LPPs using the revised simplex method:

1. Maximize z = x1 + x2 subject to x1 + 2x2 ≤ 2, 4x1 + x2 ≤ 4 and x1, x2 ≥ 0.
2. Maximize z = 5x1 + 3x2 subject to 3x1 + 5x2 ≤ 15, 3x1 + 2x2 ≤ 10 and x1, x2 ≥ 0.
3. Maximize z = x1 + x2 + 3x3 subject to 3x1 + 2x2 + x3 ≤ 3, 2x1 + x2 + 2x3 ≤ 2 and x1, x2, x3 ≥ 0.
4. Maximize z = x1 + x2 subject to 3x1 + 3x2 ≤ 6, x1 + 4x2 ≤ 4 and x1, x2 ≥ 0.
5. Maximize z = 3x1 + 2x2 + 5x3 subject to x1 + 2x2 + x3 ≤ 430, 3x1 + 2x3 ≤ 460, x1 + 4x2 ≤ 420 and x1, x2, x3 ≥ 0.
6. Maximize z = x1 + 2x2 subject to x1 + 2x2 ≤ 3, x1 + 3x2 ≤ 1 and x1, x2 ≥ 0.
7. Maximize z = x1 + 2x2 subject to x1 + x2 ≤ 3, x1 + 2x2 ≤ 5, 3x1 + x2 ≤ 6 and x1, x2 ≥ 0.
8. Maximize z = 5x1 + 3x2 subject to 4x1 + 5x2 ≤ 10, 5x1 + 2x2 ≤ 10, 3x1 + 8x2 ≤ 12 and x1, x2 ≥ 0.
9. Minimize z = 3x1 + x2 subject to x1 + x2 ≥ 1, 2x1 + x2 ≥ 0 and x1, x2 ≥ 0.
10. Minimize z = 4x1 + 2x2 + 3x3 subject to 2x1 + 4x3 ≥ 5, 2x1 + 4x2 + x3 ≥ 4 and x1, x2, x3 ≥ 0.
11. Maximize z = 4x1 + 3x2 subject to 3x1 + 4x2 ≤ 12, 3x1 + 3x2 ≤ 10, 2x1 + x2 ≤ 4 and x1, x2 ≥ 0.
12. Maximize z = 2x1 + 3x2 subject to x1 + x2 ≥ 0, x1 ≤ 4 and x1, x2 ≥ 0.
13. Minimize z = 2x1 + x2 subject to 3x1 + x2 ≥ 3, 4x1 + 3x2 ≥ 6, x1 + 2x2 ≥ 3 and x1, x2 ≥ 0.
14. Maximize z = x1 + x2 subject to 2x1 + 5x2 ≤ 6, x1 + x2 ≤ 2 and x1, x2 ≥ 0.
Chapter 5
Dual Simplex Method

5.1 Objective

The objective of this chapter is to discuss an advanced technique, called the dual simplex method, for solving linear programming problems with ≥-type constraints.

5.2 Introduction

In Chap. 3, the simplex method was discussed for solving linear programming problems (LPPs). In this method, a negative net evaluation (for a maximization primal problem) indicates that the current solution is not optimum. Consider the jth dual constraint of a primal maximization problem, namely

  Σ_{i=1}^{m} a_ij w_i ≥ c_j,  j = 1, 2, …, n,

where a_ij is the constraint coefficient of the jth primal variable xj in the ith primal constraint, and cj is the profit/cost associated with xj in the primal objective function. Then clearly the net evaluation zj − cj corresponding to xj is Σ_{i=1}^{m} a_ij w_i − c_j. Thus, a negative value of zj − cj indicates that the corresponding dual constraint is not satisfied. This leads to the conclusion that the dual is infeasible when the primal is non-optimum, and vice versa. This gives the following result. While the primal problem starts as feasible but non-optimum and continues to be feasible until the optimum is reached, the dual problem starts as infeasible but better than optimum and continues to be infeasible until the true optimal solution is reached. In other words, while the primal problem is seeking optimality, the dual problem is automatically seeking feasibility.

© Springer Nature Singapore Pte Ltd. 2019
A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_5
Hence the solution will start optimum (or actually better than optimum) and infeasible, and will remain infeasible until the true optimum is reached, at which point the solution becomes feasible. Such a procedure is called the dual simplex method.

The dual simplex method is very similar to the ordinary simplex method; the only difference is in the criterion used for selecting the entering and outgoing vectors. In this context, we note that in the dual simplex method the outgoing vector is determined first and then the entering vector. This is just the reverse of what is done in the ordinary simplex method.

The dual simplex method gives an algorithm in which we start with a basic optimal solution of the primal problem with all zj − cj ≥ 0, but one that is not feasible, as the values of some basic variables are negative. In each iteration, the number of negative basic variables is decreased while optimality is maintained, and an optimal solution is reached in a finite number of steps. In this method, artificial variables are not required; hence this method greatly reduces the amount of computation. The dual simplex method has a wide range of applications, some of which are as follows:

(i) parametric programming problems;
(ii) integer linear programming problems;
(iii) post-optimality analysis when the requirement vector is changed or when new constraints are added.

Theorem For a basic solution (not necessarily feasible) with all zj − cj ≥ 0 for all j, the value of the objective function corresponding to this basic solution is optimal or better than optimal.

Proof Let the primal LPP be as follows:

  Maximize z = ⟨c, x⟩
  subject to Ax = b, x ≥ 0,

where c, x ∈ Rⁿ, b ∈ Rᵐ and A is an m × n real matrix. Let B be the current basis of the primal LPP and xB the corresponding basic solution, so that BxB = b, i.e. xB = B⁻¹b. Let zj − cj ≥ 0 for all j = 1, 2, …, n, while one or more basic variables have negative values. The dual of the preceding LPP is given by:

  Minimize z′ = ⟨b, w⟩
  subject to Aᵀw ≥ c, w unrestricted.

The value of z corresponding to the basic solution xB is given by

  zB = cB xB = (cB xB)ᵀ    [cB xB is a 1 × 1 matrix, cB being 1 × m and xB being m × 1]
     = xBᵀ cBᵀ = (B⁻¹b)ᵀ cBᵀ    [since xB = B⁻¹b]
     = bᵀ (B⁻¹)ᵀ cBᵀ = bᵀ (cB B⁻¹)ᵀ = ⟨b, w₀⟩, where w₀ = (cB B⁻¹)ᵀ ∈ Rᵐ.

Since all zj − cj ≥ 0, we have cB B⁻¹A − c ≥ 0, i.e. cB B⁻¹A ≥ c, i.e. Aᵀ(cB B⁻¹)ᵀ ≥ c, i.e. Aᵀw₀ ≥ c. This implies that w₀ is a feasible solution of the dual LPP. Now ⟨b, w₀⟩ is the value of the dual objective function at w₀, so z′Min ≤ ⟨b, w₀⟩ = zB. According to the duality theorem, z′Min = zMax. Hence zMax ≤ zB, i.e. the value of the objective function corresponding to the basic solution xB = B⁻¹b is equal to or greater than the optimal value of the objective function. This proves the theorem.

Theorem Let xBr be a negative basic variable in a dual simplex table, let all net evaluations zj − cj be non-negative, and let the primal LPP be of maximization type. If yrj ≥ 0 for all non-basic variables xj, then there does not exist any feasible solution to the primal LPP.

Proof Let the primal LPP again be:

  Maximize z = ⟨c, x⟩
  subject to Ax = b, x ≥ 0,

where c, x ∈ Rⁿ, b ∈ Rᵐ and A is an m × n real matrix.
112
5 Dual Simplex Method
Let xB ¼ B1 b be a basic solution of this primal LPP. The dual of this LPP is given by
Minimize z0 ¼ bT ; w subject to AT w cT ; where w is unrestricted and wT 2 Rm :
Since all zj cj 0; we have cB B1
1
A c
! 0
T
or AT w0 cT , where w0 ¼ ðcB B1 Þ . Hence, w0 is a feasible solution of the dual LPP. Let g be any positive real number and w1 ¼ w0 þ gbr , where br is the rth column of T ðB1 Þ : Now AT w1 ¼ AT ðw0 þ gbr Þ ¼ AT w0 þ gAT br : ð5:1Þ We know that yj ¼ B1 aj for all j. T
Hence, Y ¼ B1 A or Y T ¼ ðB1 AÞ ¼ AT ðB1 Þ
T
) Y T ¼ AT ½b1 b2 . . . bm ¼ AT b1 AT b2 . . . AT bm 2
y11 6 y21 6 . 6 or6 .. 6 . 4 .. ym1 2
y11 6 y12 6 . 6 or6 .. 6 . 4 .. y1n
3 y1n y2n 7 7 7 7 ¼ AT b1 AT b2 . . . AT bm 7 5
y12 y22
... ...
ym2
. . . ymn
y21 y22
y2n
3 . . . ym1 . . . ym2 7 7 7 7 ¼ AT b1 AT b2 . . . AT bm 7 5 . . . ymn
3 yr1 6 yr2 7 6 . 7 6 7 ) 6 .. 7 ¼ AT br for r ¼ 1; 2; . . .; m 6 . 7 4 .. 5 yrn 2
5.2 Introduction
113
Since all yrj 0; we have AT br 0: As g 0 we have gAT br 0. Hence from (5.1), we have AT w0 AT w1 or AT w0 cT as AT w1 cT : Hence, w0 is a feasible solution to the dual LPP for all a: The value of the dual objective function z0 for the feasible solution w0 is given by z0 jw0 ¼ bT ; w0 ¼ bT ; w1 þ gbr ¼ bT ; w1 þ g bT ; br : T
ð5:2Þ
T
As, xB ¼ B1 b; xTB ¼ ðB1 bÞ or xTB ¼ bT ðB1 Þ or ½ xB1 xB2 . . . xBr . . . xBm ¼ bT ½ b1 b2
. . . br
. . . bm :
Hence xBr = ⟨b, βr⟩ for all r. Thus, from (5.2),

  z′|w₁ = ⟨b, w₀⟩ + η xBr.

Letting η → ∞, we get z′|w₁ → −∞ [since xBr < 0 and η > 0]. This means that the dual objective function is unbounded below on the dual feasible region, i.e. the dual LPP has an unbounded solution. Therefore, the primal LPP has no feasible solution. This proves the theorem.

Criteria for incoming and outgoing vectors in the dual simplex method

In the simplex method, if the basic variable xBr is replaced by the non-basic variable xk, then we have

  ẑ = z − (xBr / yrk)(zk − ck)   (5.3)

and

  ẑj − ĉj = (zj − cj) − (yrj / yrk)(zk − ck).   (5.4)

Now, to remove the negative basic variables, first the most negative basic variable xBr is selected as the outgoing basic variable; the corresponding vector yBr is the outgoing vector. We know that if for some basic solution (not necessarily feasible) all net evaluations are non-negative, the value of the objective function at this basic solution is either optimal or better than optimal. So we have to decrease the value of the objective function towards zMax. Since xBr < 0 and zk − ck ≥ 0, in order to have ẑ ≤ z we must choose an entering variable xk such that yrk < 0. In this case, the new net evaluations are given by (5.4), and we require ẑj − ĉj ≥ 0 for all j. Hence we can write

  zj − cj ≥ (yrj / yrk)(zk − ck),
  i.e. (zj − cj)/yrj ≤ (zk − ck)/yrk for yrj < 0,

so that

  (zk − ck)/yrk = Max_j { (zj − cj)/yrj : yrj < 0 }.

From this criterion, the incoming vector and the key element can be found easily: the incoming vector is yk and the key element is yrk.
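In code, the two selection rules read as follows (a sketch with our own names: x_B holds the basic values, net the row of zj − cj, and rows the current table). Applied to the starting table of Example 2 later in this section, it selects y6 as the outgoing vector and y2 as the incoming vector.

```python
from fractions import Fraction as F

def dual_simplex_pivot(x_B, net, rows):
    """Select the outgoing row r (most negative basic value) and the
    incoming column k by max{(z_j - c_j)/y_rj : y_rj < 0}.  `rows[i]` is
    row i of the current table; returns (r, k) or raises if no pivot."""
    r = min(range(len(x_B)), key=lambda i: x_B[i])
    if x_B[r] >= 0:
        raise ValueError("already primal feasible")
    cand = [j for j in range(len(net)) if rows[r][j] < 0]
    if not cand:
        raise ValueError("no feasible solution exists")
    k = max(cand, key=lambda j: F(net[j]) / rows[r][j])
    return r, k

# Starting table of Example 2 below: rows for x4, x5, x6.
rows = [[4, 6, 3, 1, 0, 0], [1, -9, 1, 0, 1, 0], [-2, -3, 5, 0, 0, 1]]
r, k = dual_simplex_pivot([8, -3, -4], [2, 1, 1, 0, 0, 0], rows)
print(r, k)   # r = 2 (y6 leaves), k = 1 (y2 enters)
```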
5.3 Dual Simplex Algorithm

The iterative procedure for the dual simplex method is as follows:

Step 1 If the given LPP is in the minimization form, convert it into a maximization problem using the mini-max principle.
Step 2 Convert the ≥-type constraints of the given LPP, if any, into ≤-type by multiplying both sides of the corresponding constraints by (−1).
Step 3 Introduce slack variables in the constraints of the given LPP and obtain an initial basic solution. Put this solution in the starting dual simplex table.
Step 4 Test the nature of the net evaluations zj − cj in the starting simplex table.
  (i) If all zj − cj and all xBi are non-negative, then an optimum basic feasible solution has been obtained.
  (ii) If all zj − cj are non-negative and at least one basic variable, say xBi, is negative, then go to Step 5.
  (iii) If at least one zj − cj is negative, then the dual simplex method is not applicable to the given problem.
Step 5 Select the most negative of the xBi, say xBr. The corresponding basis vector yBr leaves the basis.
Step 6 Test the nature of all yrj, j = 1, 2, …, n, i.e. the elements of the rth row.
  (i) If all yrj are non-negative, there does not exist any feasible solution to the given LPP.
  (ii) If at least one yrj is negative, then compute
        Max_j { (zj − cj)/yrj : yrj < 0 }, j = 1, 2, …, n.
      If this maximum is (zk − ck)/yrk, then yk enters the basis in place of yBr and yrk is the key element.
Step 7 With yrk as the key element, form the next table. Using elementary row operations, convert the key element to unity and all other elements of the key column to zero to get the improved solution.
Step 8 Repeat Steps 4 through 7 until either an optimum basic feasible solution is obtained or there is an indication that no feasible solution exists.

Example 1 Using the dual simplex method, solve the following LPP:

  Minimize z = x1 + x2
  subject to 2x1 + x2 ≥ 2
             −x1 − x2 ≥ 1
  and x1, x2 ≥ 0.

Solution The given LPP can be written as follows:

  Minimize z = x1 + x2
  subject to −2x1 − x2 ≤ −2
             x1 + x2 ≤ −1
  and x1, x2 ≥ 0.

Here the problem is a minimization problem. Let z′ = −z. Then Min z = −Max(−z) = −Max z′ by the mini-max principle. Hence the problem is to maximize z′ = −x1 − x2. Now introducing two slack variables x3 and x4 in the first and second constraints respectively, we get the following converted equations:

  −2x1 − x2 + x3 = −2
  x1 + x2 + x4 = −1
  and x1, x2, x3, x4 ≥ 0.

Then the readjusted objective function z′ is given by z′ = −x1 − x2 + 0·x3 + 0·x4. Taking x1 = x2 = 0, we get x3 = −2, x4 = −1, which is an initial basic but infeasible solution to the given problem. Now we construct the starting dual simplex table as follows:

            cj     −1   −1    0    0
  cB   yB   xB     y1   y2   y3   y4
   0   y3   −2     −2   −1    1    0
   0   y4   −1      1    1    0    1
  z′B = 0           1    1    0    0   (zj − cj)

Since all zj − cj ≥ 0 and xB1 = −2, xB2 = −1, the current basic solution is optimum or more than optimum but infeasible. Now

  xBr = Min{xBi : xBi < 0} = Min{−2, −1} = −2 = xB1, so r = 1.

Hence the vector yB1, i.e. y3, leaves the basis. Again,

  Max{(zj − cj)/y1j : y1j < 0} = Max{(z1 − c1)/y11, (z2 − c2)/y12} = Max{1/(−2), 1/(−1)} = −1/2,

which occurs for j = 1. Then the vector y1 enters the basis, and y11 = −2 is the key element. Now we construct the next table.
            cj     −1    −1     0     0
  cB   yB   xB     y1    y2    y3    y4
  −1   y1    1      1    1/2  −1/2    0
   0   y4   −2      0    1/2   1/2    1
  z′B = −1          0    1/2   1/2    0   (zj − cj)

Since all zj − cj ≥ 0 and xB2 = −2 is negative, the current basic solution is optimum or more than optimum but infeasible. Since only xB2 = −2 is negative, r = 2 and the corresponding basis vector yB2, i.e. y4, leaves the basis. But since all y2j (j = 1, 2, 3, 4) are non-negative, no vector can enter the basis. Thus, there does not exist any feasible solution to the given LPP.

Example 2 Using the dual simplex method, solve the following LPP:
  Minimize z = 2x1 + x2 + x3
  subject to 4x1 + 6x2 + 3x3 ≤ 8
             x1 − 9x2 + x3 ≤ −3
             −2x1 − 3x2 + 5x3 ≤ −4
  and x1, x2, x3 ≥ 0.

Solution Here the problem is a minimization problem. Let z′ = −z. Then Min z = −Max(−z) = −Max z′ by the mini-max principle. Hence the problem is to maximize z′ = −2x1 − x2 − x3. Now introducing three slack variables x4 ≥ 0, x5 ≥ 0, x6 ≥ 0, one in each constraint respectively, we get the following converted equations:

  4x1 + 6x2 + 3x3 + x4 = 8
  x1 − 9x2 + x3 + x5 = −3
  −2x1 − 3x2 + 5x3 + x6 = −4
  and xj ≥ 0, j = 1, 2, …, 6.

Then the readjusted objective function z′ is given by z′ = −2x1 − x2 − x3 + 0·x4 + 0·x5 + 0·x6. Taking x1 = x2 = x3 = 0, we get x4 = 8, x5 = −3, x6 = −4, which is an initial basic but infeasible solution to the given problem. Now we construct the starting dual simplex table.
            cj     −2   −1   −1    0    0    0
  cB   yB   xB     y1   y2   y3   y4   y5   y6
   0   y4    8      4    6    3    1    0    0
   0   y5   −3      1   −9    1    0    1    0
   0   y6   −4     −2   −3    5    0    0    1
  z′B = 0           2    1    1    0    0    0   (zj − cj)

Since all zj − cj ≥ 0 and xB2 = −3, xB3 = −4, the current solution is optimum or more than optimum but infeasible. Since

  xBr = Min{xBi : xBi < 0} = Min{−3, −4} = −4 = xB3,

we have r = 3. Hence the vector yB3, i.e. y6, leaves the basis. Again,

  Max{(zj − cj)/y3j : y3j < 0} = Max{(z1 − c1)/y31, (z2 − c2)/y32} = Max{2/(−2), 1/(−3)} = −1/3,

which occurs for j = 2. Then the vector y2 enters the basis, and y32 = −3 is the key element. Now we construct the next table.
            cj     −2    −1    −1     0    0     0
  cB   yB   xB     y1    y2    y3    y4   y5    y6
   0   y4    0      0     0    13     1    0     2
   0   y5    9      7     0   −14     0    1    −3
  −1   y2   4/3    2/3    1   −5/3    0    0   −1/3
  z′B = −4/3       4/3    0    8/3    0    0    1/3   (zj − cj)

Since all zj − cj ≥ 0 for all j = 1, 2, …, 6 and all xBi ≥ 0, an optimal basic feasible solution has been obtained. Hence the required optimal basic feasible solution to the given LPP is x1 = 0, x2 = 4/3, x3 = 0 with Min z = −Max z′ = −(−4/3) = 4/3.

Example 3 Use the dual simplex method to solve the following LPP:

  Maximize z = −3x1 − x2
  subject to x1 + x2 ≥ 1
             2x1 + 3x2 ≥ 2
  and x1, x2 ≥ 0.

Solution The given LPP can be written as follows:

  Maximize z = −3x1 − x2
  subject to −x1 − x2 ≤ −1
             −2x1 − 3x2 ≤ −2
  and x1, x2 ≥ 0.

Now introducing the slack variables x3 ≥ 0 and x4 ≥ 0 in the constraints of the given LPP, we get the following reduced LPP:
  Maximize z = −3x1 − x2 + 0·x3 + 0·x4
  subject to −x1 − x2 + x3 = −1
             −2x1 − 3x2 + x4 = −2
  and x1, x2, x3, x4 ≥ 0.

Now setting x1 = x2 = 0, we get x3 = −1, x4 = −2, which is an initial basic but non-feasible solution to the given problem. The starting dual simplex table is as follows:

            cj     −3   −1    0    0
  cB   yB   xB     y1   y2   y3   y4
   0   y3   −1     −1   −1    1    0
   0   y4   −2     −2   −3    0    1
  zB = 0            3    1    0    0   (zj − cj)

Since all zj − cj ≥ 0 and xB1 = −1, xB2 = −2, the current basic solution is optimum or more than optimum but infeasible. Here the most negative basic variable is xB2 = −2. Hence r = 2, and the corresponding basis vector y4 leaves the basis. For the entering vector we select

  Max{(zj − cj)/y2j : y2j < 0} = Max{3/(−2), 1/(−3)} = −1/3,

which occurs for j = 2. Then y2 enters the basis, and y22 = −3 is the key element. Now we construct the next table.
            cj      −3    −1     0      0
  cB   yB   xB      y1    y2    y3     y4
   0   y3  −1/3   −1/3     0     1   −1/3
  −1   y2   2/3    2/3     1     0   −1/3
  zB = −2/3        7/3     0     0    1/3   (zj − cj)

Since all zj − cj ≥ 0 and xB1 = −1/3, the current basic solution is optimum or more than optimum but infeasible. Since xB1 = −1/3 is negative, r = 1 and the corresponding basis vector y3 leaves the basis. Now

  Max{(zj − cj)/y1j : y1j < 0} = Max{(z1 − c1)/y11, (z4 − c4)/y14} = Max{(7/3)/(−1/3), (1/3)/(−1/3)} = Max{−7, −1} = −1,

which occurs for j = 4. Then y4 enters the basis and y14 = −1/3 is the key element. Now we construct the third table.
            cj     −3   −1    0    0
  cB   yB   xB     y1   y2   y3   y4
   0   y4    1      1    0   −3    1
  −1   y2    1      1    1   −1    0
  zB = −1           2    0    1    0   (zj − cj)

Here all zj − cj ≥ 0 for j = 1, 2, 3, 4 and all xBi's are non-negative. This shows that an optimal basic feasible solution has been obtained. Thus, the required optimal basic feasible solution is x1 = 0, x2 = 1 with zMax = −1.
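Steps 1 through 8 of Sect. 5.3 can be collected into one short routine. The sketch below (exact rational arithmetic; the names are ours, and a starting slack basis with all zj − cj ≥ 0 is assumed, as the method requires) reproduces the results of this section: the optimum of Example 3 and the infeasibility verdict of Example 1.

```python
from fractions import Fraction as F

def dual_simplex(rows, x_B, c, basis):
    """Maximize <c, x> by the dual simplex method.  `rows` is the constraint
    table with slack columns included, `x_B` the (possibly negative) basic
    values, and `basis` the indices of the starting slack basis; every
    z_j - c_j must start non-negative.  Returns (x, z), or None if no
    feasible solution exists.  A teaching sketch, not production code."""
    m, n = len(rows), len(rows[0])
    rows = [[F(v) for v in row] for row in rows]
    x_B = [F(v) for v in x_B]
    while True:
        r = min(range(m), key=lambda i: x_B[i])
        if x_B[r] >= 0:                       # feasible, hence optimal
            x = [F(0)] * n
            for i in range(m):
                x[basis[i]] = x_B[i]
            return x, sum(c[j] * x[j] for j in range(n))
        net = [sum(c[basis[i]] * rows[i][j] for i in range(m)) - c[j]
               for j in range(n)]
        cand = [j for j in range(n) if rows[r][j] < 0]
        if not cand:
            return None                       # no feasible solution
        k = max(cand, key=lambda j: net[j] / rows[r][j])
        piv = rows[r][k]
        rows[r] = [v / piv for v in rows[r]]
        x_B[r] /= piv
        for i in range(m):
            if i != r and rows[i][k] != 0:
                f = rows[i][k]
                rows[i] = [rows[i][j] - f * rows[r][j] for j in range(n)]
                x_B[i] -= f * x_B[r]
        basis[r] = k

# Example 3: maximize -3x1 - x2 with x1 + x2 >= 1, 2x1 + 3x2 >= 2 (in <= form).
res = dual_simplex([[-1, -1, 1, 0], [-2, -3, 0, 1]],
                   [-1, -2], [-3, -1, 0, 0], [2, 3])
print(res)   # ([0, 1, 0, 1], -1): x2 = 1 and zMax = -1, as in Example 3

# Example 1: the same routine detects infeasibility (returns None).
assert dual_simplex([[-2, -1, 1, 0], [1, 1, 0, 1]],
                    [-2, -1], [-1, -1, 0, 0], [2, 3]) is None
```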
5.4 Difference Between Simplex Method and Dual Simplex Method

The dual simplex method is similar to the regular simplex method, except that in the latter the starting basic solution is feasible but not optimum, while in the former it is infeasible but optimum or better than optimum. The simplex method works towards optimality, while the dual simplex method works towards feasibility. In the consecutive iterations of the simplex method for solving a maximization problem, the value of the objective function gradually increases and finally reaches its optimal value; in each iteration the solution is basic feasible but non-optimal. On the other hand, in the consecutive iterations of the dual simplex method for solving the same problem, the value of the objective function gradually decreases and finally reaches its optimal value; in each iteration the solution is basic infeasible but optimal (or better than optimal).
5.5 Modified Dual Simplex Method

If the initial table of the dual simplex method contains some negative basic variables while some of the net evaluations are also negative, then the dual simplex method is not applicable. In such situations, the dual simplex method is modified so as to form an equivalent LPP in which some basic variables are negative but all net evaluations are non-negative; the standard dual simplex method can then be applied to that equivalent LPP. One such modification is the artificial constraint method. In this method, the variables corresponding to the negative net evaluations are considered, together with the variable corresponding to the most negative net evaluation. Let zp − cp be the most negative net evaluation, corresponding to the variable xp. We then add the artificial constraint

  Σ xj ≤ M,

where the sum Σ extends over all j for which zj − cj < 0, and M is a sufficiently large positive number. Adding the slack variable xM to this constraint, we have

  Σ xj + xM = M,

from which we obtain xp = M − (xM + Σ_{j≠p} xj). Substituting this xp into the original objective function and all the constraints, we get a reduced problem. This reduced problem together with the artificial constraint is equivalent to the given problem, and for this equivalent LPP all zj − cj ≥ 0. Thus, to solve this problem, the dual simplex method can now be applied.

Example 4 Solve the following problem by the modified dual simplex method:

  Minimize z = −2x1 − x2 − x3
  subject to 4x1 + 6x2 + 3x3 ≤ 8
             −x1 + 9x2 − x3 ≥ 3
             2x1 + 3x2 − 5x3 ≥ 4
  and x1, x2, x3 ≥ 0.

Solution The given minimization problem is written as a maximization problem as follows:

  Maximize z′ = 2x1 + x2 + x3 + 0·x4 + 0·x5 + 0·x6
  subject to 4x1 + 6x2 + 3x3 + x4 = 8
             x1 − 9x2 + x3 + x5 = −3
             −2x1 − 3x2 + 5x3 + x6 = −4
  and x1, x2, x3, x4, x5, x6 ≥ 0.

Now we construct the initial dual simplex table.
          cj         2     1     1     0     0     0
cB    yB     xB           y1    y2    y3    y4    y5    y6
0     y4     x4 = 8        4     6     3     1     0     0
0     y5     x5 = −3       1    −9     1     0     1     0
0     y6     x6 = −4      −2    −3     5     0     0     1
z′B = 0      zj − cj      −2    −1    −1     0     0     0
5 Dual Simplex Method
Here some of the net evaluations are negative, so the standard dual simplex method is not applicable. From this simplex table, the most negative net evaluation is z1 − c1 = −2, corresponding to the variable x1. Hence the artificial constraint is x1 + x2 + x3 ≤ M, where M is a very large positive number. Adding the slack variable xM, we have x1 + x2 + x3 + xM = M, from which x1 = M − x2 − x3 − xM. Substituting this expression for x1 in the given LPP, we have:

Maximize z″ = 2(M − x2 − x3 − xM) + x2 + x3 + 0·x4 + 0·x5 + 0·x6
subject to 4(M − x2 − x3 − xM) + 6x2 + 3x3 + x4 = 8
           (M − x2 − x3 − xM) − 9x2 + x3 + x5 = −3
           −2(M − x2 − x3 − xM) − 3x2 + 5x3 + x6 = −4
           x1 + x2 + x3 + xM = M
and x1, x2, x3, x4, x5, x6, xM ≥ 0,

or

Maximize z″ = −2xM − x2 − x3 + 0·x4 + 0·x5 + 0·x6 + 2M
subject to −4xM + 2x2 − x3 + x4 = 8 − 4M
           −xM − 10x2 + x5 = −3 − M
           2xM − x2 + 7x3 + x6 = −4 + 2M
           xM + x1 + x2 + x3 = M
and x1, x2, x3, x4, x5, x6, xM ≥ 0.
Now we construct the dual simplex table.
          cj          −2     0    −1    −1     0     0     0
cB    yB     xB             yM    y1    y2     y3    y4    y5    y6
0     y4     x4 = 8 − 4M    −4     0     2     −1     1     0     0
0     y5     x5 = −3 − M    −1     0   −10      0     0     1     0
0     y6     x6 = −4 + 2M    2     0    −1      7     0     0     1
0     y1     x1 = M          1     1     1      1     0     0     0
z″B = 2M     zj − cj         2     0     1      1     0     0     0

Now, xBr = Min{xBi : xBi < 0} = Min{8 − 4M, −3 − M} = 8 − 4M = xB1.

Hence r = 1, and the corresponding vector yB1, i.e. y4, leaves the basis.
Again, Max{(zj − cj)/yrj : yrj < 0} = Max{(zj − cj)/y1j : y1j < 0} = Max{2/(−4), 1/(−1)} = −1/2, which occurs for the vector yM. Then the vector yM enters the basis, and y1M = −4 is the key element. Now we construct the next dual simplex table.
          cj          −2     0     −1     −1     0     0     0
cB    yB     xB             yM    y1     y2     y3     y4    y5    y6
−2    yM     xM = M − 2      1     0    −1/2    1/4   −1/4    0     0
0     y5     x5 = −5         0     0   −21/2    1/4   −1/4    1     0
0     y6     x6 = 0          0     0      0    13/2    1/2    0     1
0     y1     x1 = 2          0     1     3/2    3/4    1/4    0     0
z″B = 4      zj − cj         0     0      2     1/2    1/2    0     0

Since only xB2 = −5 is negative, r = 2 and the corresponding basis vector y5 leaves the basis.

Again, Max{(zj − cj)/y2j : y2j < 0} = Max{2/(−21/2), (1/2)/(−1/4)} = Max{−4/21, −2} = −4/21, which occurs for the vector y2. Then the vector y2 enters the basis, and y22 = −21/2 is the key element. Now we construct the next dual simplex table.
          cj           −2     0     −1     −1      0      0     0
cB    yB     xB              yM    y1     y2     y3     y4     y5     y6
−2    yM     xM = M − 37/21   1     0      0    5/21  −5/21  −1/21     0
−1    y2     x2 = 10/21       0     0      1   −1/42   1/42  −2/21     0
0     y6     x6 = 0           0     0      0   13/2    1/2      0      1
0     y1     x1 = 9/7         0     1      0   11/14   3/14    1/7     0
z″B = 64/21  zj − cj          0     0      0   23/42  19/42   4/21     0

Here all the basic variables are non-negative (M being sufficiently large) and all zj − cj ≥ 0, so an optimal basic feasible solution has been obtained. Therefore, the required optimal basic feasible solution is given by x1 = 9/7, x2 = 10/21, x3 = 0, and z′Max = (−2M + 64/21) + 2M = 64/21. Therefore, zMin = −z′Max = −64/21.
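The optimum just obtained can be cross-checked with a general-purpose LP solver. The sketch below assumes SciPy is available; it simply restates the problem in the ≤ form that `linprog` expects and is not part of the modified dual simplex method itself.

```python
# Cross-check of Example 4 with SciPy's LP solver (numerical sketch only).
from scipy.optimize import linprog

c = [-2, -1, -1]                 # minimize z = -2x1 - x2 - x3
A_ub = [[4, 6, 3],               # 4x1 + 6x2 + 3x3 <= 8
        [1, -9, 1],              # -x1 + 9x2 - x3 >= 3, rewritten as <=
        [-2, -3, 5]]             # 2x1 + 3x2 - 5x3 >= 4, rewritten as <=
b_ub = [8, -3, -4]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3, method="highs")
print(res.x)    # approximately [9/7, 10/21, 0]
print(res.fun)  # approximately -64/21
```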
5.6 Exercises
Use the dual simplex method to solve the following LPPs:

1. Minimize z = 3x1 − x2
   subject to x1 + x2 ≥ 1, 2x1 + 3x2 ≥ 2, and x1, x2 ≥ 0.

2. Minimize z = 2x1 + x2
   subject to 3x1 + x2 ≥ 3, 4x1 + 3x2 ≥ 6, x1 + 2x2 ≥ 3, and x1, x2 ≥ 0.

3. Maximize z = −2x1 − 3x2 − x3
   subject to 2x1 + x2 + 2x3 ≥ 3, 3x1 + 2x2 + x3 ≥ 4, and x1, x2, x3 ≥ 0.

4. Minimize z = x1 + 2x2 + 3x3
   subject to 2x1 − x2 + x3 ≥ 4, x1 + x2 + 2x3 ≤ 8, x2 − x3 ≥ 2, and x1, x2, x3 ≥ 0.

5. Minimize z = −2x1 − x2 − x3
   subject to 4x1 + 6x2 + 3x3 ≤ 8, x1 − 9x2 + x3 ≤ −3, 2x1 − 3x2 + 5x3 ≤ −4, and x1, x2, x3 ≥ 0.

6. Minimize z = 10x1 + 6x2 + 2x3
   subject to −x1 + x2 + x3 ≥ 1, 3x1 + x2 − x3 ≥ 2, and x1, x2, x3 ≥ 0.

7. Minimize z = x1 + x2
   subject to 2x1 + x2 ≥ 2, −x1 − x2 ≥ 1, and x1, x2 ≥ 0.
8. Maximize z = −3x1 − 2x2
   subject to x1 + x2 ≥ 1, x1 + x2 ≤ 7, x1 + 2x2 ≥ 10, x2 ≤ 3, and x1, x2 ≥ 0.

9. Maximize z = 3x1 + 2x2
   subject to 2x1 + x2 ≤ 5, x1 + x2 ≥ 3, and x1, x2 ≥ 0.

10. Maximize z = −2x1 − 3x2 − 2x3
    subject to x1 − 2x2 − 3x3 = −8, 2x2 + x3 ≥ 10, x2 − 2x3 ≥ 4, and x1, x2, x3 ≥ 0.

11. Minimize z = 3x1 + x2
    subject to x1 + x2 ≥ 1, 2x1 + 3x2 ≥ 2, and x1, x2 ≥ 0.

12. Minimize z = 6x1 + 7x2 + 3x3 + 5x4
    subject to 5x1 + 6x2 − 3x3 + 4x4 ≥ 12, x2 + 5x3 − 6x4 ≥ 10, 2x1 + 5x2 + x3 + x4 ≥ 8, and x1, x2, x3, x4 ≥ 0.

13. Minimize z = 5x1 + 6x2
    subject to x1 + x2 ≥ 2, 4x1 + x2 ≥ 4, and x1, x2 ≥ 0.

14. Minimize z = 4x1 + 2x2
    subject to 3x1 + x2 ≥ 27, x1 + x2 ≥ 21, x1 + 2x2 ≥ 30, and x1, x2 ≥ 0.
Chapter 6
Bounded Variable Technique
6.1 Objective
The objective of this chapter is to discuss the modified simplex method to solve LPPs whose variables are restricted with lower and upper bounds.
6.2 Introduction
In the theory of linear programming, it is well known that the decision variables of a linear programming problem (LPP) satisfy non-negativity conditions/restrictions only. However, in real-life situations, some (or all) of the variables are bounded by lower and upper limits. These bounds are expressed as bound constraints. This means that, in an LPP, the bound constraints of some (or all) variables are given in addition to the general constraints. In such cases, the standard form of the LPP will be as follows:

Maximize z = Σ(j=1..n) cj xj
subject to the constraints Ax = b, l ≤ x ≤ u,

where

x = [x1, x2, …, xn]ᵀ, b = [b1, b2, …, bm]ᵀ, l = [l1, l2, …, ln]ᵀ, u = [u1, u2, …, un]ᵀ,
A = [aij], i = 1, 2, …, m, j = 1, 2, …, n.

Here A is an m × n real matrix, and l and u denote the lower and upper bounds of x respectively. For unbounded variables, these limits are taken as 0 and ∞ respectively.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_6
The bound constraints l ≤ x ≤ u can be written separately as x ≥ l and x ≤ u. Now, introducing surplus variables s′ and slack variables s″ in these bound constraints, we get the equality constraints

x − s′ = l and x + s″ = u, where x, s′, s″ ≥ 0.

The lower bound constraints can be written as x = l + s′; thus x can be eliminated from all the remaining constraints by substituting x = l + s′. On the other hand, the upper bound constraints can be written as x = u − s″. This does not serve the purpose, since there is no guarantee that x will be non-negative. This difficulty is overcome by a special technique, the bounded variable simplex method. In this method, the optimality condition for a solution is the same as in the usual simplex method, but the constraints x + s″ = u require a modification of the feasibility condition of the decision variables, for the following reasons:

(i) A basic variable may become non-basic at its upper bound (in the usual simplex method, all non-basic variables are at zero level).
(ii) When a non-basic variable becomes basic, its value should not exceed its upper bound, and the non-negativity and upper bound conditions of all existing basic variables must be preserved.
6.3 The Computational Procedure
The iterative procedure for finding an optimum basic feasible solution (BFS) of an LPP with bounded variables is as follows:

Step 1: Check whether the objective function of the given LPP is to be maximized or minimized. If it is to be minimized, convert it into a maximization problem by using Minimum z = −Maximum(−z).
Step 2: Check whether all bi (i = 1, 2, …, m) are non-negative. If any bi is negative, multiply the corresponding inequality constraint by (−1) so as to make all bi non-negative.
Step 3: Convert all the inequality constraints (except bound constraints) into equations by introducing slack and/or surplus variables. Set the coefficients of these variables in the objective function equal to zero. The lower and upper bounds of the slack and surplus variables are taken as 0 and ∞ respectively.
Step 4: If the lower bound lj (j = 1, 2, …, n) of any variable xj is positive (i.e. lj ≤ xj with lj > 0), substitute xj = lj + x′j, x′j ≥ 0, in the reduced problem obtained from Step 3.
Step 5: Obtain an initial BFS.
Step 6: Compute the net evaluations zj − cj (j = 1, 2, …, n) and examine their signs. Two cases may arise:
(i) If all zj − cj ≥ 0, then an optimum BFS has been attained.
(ii) If at least one zj − cj < 0, choose the most negative of them, say zr − cr for some j = r.
Step 7: Compute θ = Min{θ1, θ2, ur}, where

θ1 = Min over i of { xBi/yir : yir > 0 },
θ2 = Min over i of { (u(xBi) − xBi)/(−yir) : yir < 0 },

the xBi correspond to the current BFS, u(xBi) is the upper bound of the ith basic variable, and ur is the upper bound of the entering variable xr. Note that θ1 = ∞ if all yir ≤ 0, and θ2 = ∞ if all yir ≥ 0. In finding the value of θ, three cases may arise:

(i) If θ = θ1, which corresponds to xBk, then yr enters the basis and yk leaves it.
(ii) If θ = θ2, which corresponds to xBk, then yr enters the basis and yk leaves it. Due to this, the value of one or more basic variables would become negative in the next iteration, so the values of the basic variables have to be updated. This is done by putting the variable that has reached its upper bound, say with upper bound u, at zero level at that bound, substituting it as u minus a new variable lying between 0 and u; the basic variables are then updated by (xBi)new = (xBi)old − yir·ur.
(iii) If θ = ur, xr is substituted at its upper bound and remains non-basic; i.e. it is put at zero level by using xr = ur − x′r, 0 ≤ x′r ≤ ur. In this situation the basic variables are updated by the formula (xBi)new = (xBi)old − yir·ur.

Then go to Step 6.
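The θ-computation in Step 7 can be sketched as a small helper. The function below and its names are our own, written only to illustrate the three cases; it is not code from the book.

```python
# Illustrative sketch of the Step 7 ratio test (helper names are our own).
import math

def bounded_ratio_test(x_B, u_B, y_r, u_r):
    """Return (theta, case) for an entering variable with column y_r.

    x_B : values of the current basic variables
    u_B : upper bounds u(x_Bi) of the basic variables
    y_r : entries y_ir of the entering column in the current table
    u_r : upper bound of the entering variable x_r
    """
    theta1 = min((x / y for x, y in zip(x_B, y_r) if y > 0), default=math.inf)
    theta2 = min(((u - x) / -y for x, u, y in zip(x_B, u_B, y_r) if y < 0),
                 default=math.inf)
    theta = min(theta1, theta2, u_r)
    if theta == theta1:
        return theta, "theta1"       # case (i): usual leaving variable
    if theta == theta2:
        return theta, "theta2"       # case (ii): a basic variable hits its bound
    return theta, "upper_bound"      # case (iii): x_r is fixed at its own bound

# First iteration of Example 1 below: entering column y'2 = (2, 4),
# x_B = (10, 15), u4 = u5 = infinity, and the entering variable's bound is 3.
print(bounded_ratio_test([10, 15], [math.inf, math.inf], [2, 4], 3))
```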
Example 1 Using the bounded variable technique, solve the following LPP:

Maximize z = 3x1 + 5x2 + 2x3
subject to x1 + 2x2 + 2x3 ≤ 14, 2x1 + 4x2 + 3x3 ≤ 23,
and 0 ≤ x1 ≤ 4, 2 ≤ x2 ≤ 5, 0 ≤ x3 ≤ 3.
Solution By introducing the slack variables x4 (≥ 0) and x5 (≥ 0) in the first and second constraints respectively, we get the reduced problem

Maximize z = 3x1 + 5x2 + 2x3 + 0·x4 + 0·x5
subject to x1 + 2x2 + 2x3 + x4 = 14
           2x1 + 4x2 + 3x3 + x5 = 23
and 0 ≤ x1 ≤ 4, 2 ≤ x2 ≤ 5, 0 ≤ x3 ≤ 3, x4, x5 ≥ 0.

Since there are no upper bounds specified for the variables x4 and x5, we take u4 = u5 = ∞. Since x2 has a positive lower bound, let us take x2 = x′2 + 2; then 2 ≤ x2 ≤ 5 implies 0 ≤ x′2 ≤ 3. The given LPP reduces to

Maximize z = 3x1 + 5(x′2 + 2) + 2x3 + 0·x4 + 0·x5
subject to x1 + 2x′2 + 2x3 + x4 = 10
           2x1 + 4x′2 + 3x3 + x5 = 15
and 0 ≤ x1 ≤ 4, 0 ≤ x′2 ≤ 3, 0 ≤ x3 ≤ 3, x4, x5 ≥ 0.

Here the initial BFS is x4 = 10, x5 = 15 with B = I2 as the initial basis. Now we construct the starting simplex table:
          cj         3     5     2     0     0
cB    yB     xB           y1    y′2   y3    y4    y5
0     y4     x4 = 10       1     2     2     1     0
0     y5     x5 = 15       2     4     3     0     1
zB = 10      Δj           −3    −5    −2     0     0
      uj                   4     3     3     ∞     ∞

Since Δ2 is the most negative, y′2 enters the basis. Now, θ1 = Min{xBi/y′i2 : y′i2 > 0} = Min{10/2, 15/4} = 15/4 (corresponding to x5),
θ2 = Min{(u(xBi) − xBi)/(−y′i2) : y′i2 < 0} = ∞ [as y′i2 > 0 for i = 1, 2],

and u′2 = 3. Hence θ = Min{θ1, θ2, u′2} = Min{15/4, ∞, 3} = 3, which implies θ = u′2. Therefore x′2 is substituted at its upper bound and remains non-basic. Thus x′2 = u′2 − x″2 = 3 − x″2, where 0 ≤ x″2 ≤ 3. Due to this substitution, the values of the basic variables are updated:

(xB1)new = (xB1)old − y′12·u′2 = 10 − 2·3 = 4
(xB2)new = (xB2)old − y′22·u′2 = 15 − 4·3 = 3.

Therefore, using the updated values of the basic and non-basic variables, the initial simplex table becomes

          cj         3    −5     2     0     0
cB    yB     xB           y1    y″2   y3    y4    y5
0     y4     x4 = 4        1    −2     2     1     0
0     y5     x5 = 3        2    −4     3     0     1
zB = 25      Δj           −3     5    −2     0     0
      uj                   4     3     3     ∞     ∞
Since Δ1, i.e. z1 − c1, is the most negative, y1 enters the basis. Now θ1 = Min{4/1, 3/2} = 3/2 (corresponding to x5); θ2 = ∞, since all the elements of y1 are positive; and u1 = 4.

∴ θ = Min{θ1, θ2, u1} = Min{3/2, ∞, 4} = 3/2, which implies θ = θ1.

Hence y21 = 2 is the key element and y5 leaves the basis. Now we construct the next table:

          cj         3    −5     2     0     0
cB    yB     xB           y1    y″2   y3    y4    y5
0     y4     x4 = 5/2      0     0    1/2    1   −1/2
3     y1     x1 = 3/2      1    −2    3/2    0    1/2
zB = 59/2    Δj            0    −1    5/2    0    3/2
      uj                   4     3     3     ∞     ∞

(zB = 25 + 3·3/2 = 59/2.)
Since the Δj corresponding to y″2 is negative, y″2 enters the basis. Now θ1 = ∞, since all the elements of y″2 are zero or negative;

θ2 = Min{(4 − 3/2)/(−(−2))} = 5/4, which corresponds to x1;

and u″2 = 3. Hence θ = Min{θ1, θ2, u″2} = 5/4, which implies θ = θ2. Hence y″22 = −2 is the key element and y1 leaves the basis. Now we construct the next table:

          cj         3    −5     2     0     0
cB    yB     xB            y1    y″2   y3    y4    y5
0     y4     x4 = 5/2       0     0    1/2    1   −1/2
−5    y″2    x″2 = −3/4   −1/2    1   −3/4    0   −1/4
zB = 115/4   Δj           −1/2    0    7/4    0    5/4
      uj                    4     3     3     ∞     ∞
Since Δ1 is the only negative net evaluation, y1 would enter the basis. But the preceding table shows that the value of one basic variable (x″2 = −3/4) is negative, because the leaving variable x1 left the basis at its upper bound (the case θ = θ2). So we first update the basic variables by putting the now non-basic variable x1 at zero level at its upper bound, substituting x1 = 4 − x′1 (0 ≤ x′1 ≤ 4), as the upper bound of x1 is 4. Due to this substitution, the updated values of the basic variables are

(xB1)new = (xB1)old − y11·u1 = 5/2 − 0·4 = 5/2
(xB2)new = (xB2)old − y21·u1 = −3/4 − (−1/2)·4 = 5/4.
Thus, using the updated values of the basic variables, we construct the simplex table:

          cj        −3    −5     2     0     0
cB    yB     xB           y′1   y″2   y3    y4    y5
0     y4     x4 = 5/2      0     0    1/2    1   −1/2
−5    y″2    x″2 = 5/4    1/2    1   −3/4    0   −1/4
zB = 123/4   Δj           1/2    0    7/4    0    5/4

Since all Δj ≥ 0, an optimal solution has been attained. The solution is given by x′1 = 0, x″2 = 5/4, x3 = 0 with Maximum z = 123/4.
Now, since x1 = 4 − x′1 and x′1 = 0, we get x1 = 4. Again, since x′2 = 3 − x″2 and x″2 = 5/4, we get x′2 = 3 − 5/4 = 7/4. Also, x2 = x′2 + 2 = 7/4 + 2 = 15/4.
Hence, the optimal BFS is x1 = 4, x2 = 15/4, x3 = 0 and Maximum z = 123/4.
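As a numerical cross-check of Example 1 (assuming SciPy is available), note that the bound constraints can be passed directly through `linprog`'s `bounds` argument instead of being handled by substitutions:

```python
# Cross-check of Example 1: the variable bounds go straight into the solver.
from scipy.optimize import linprog

c = [-3, -5, -2]                       # maximize 3x1 + 5x2 + 2x3
A_ub = [[1, 2, 2], [2, 4, 3]]
b_ub = [14, 23]
bounds = [(0, 4), (2, 5), (0, 3)]      # 0<=x1<=4, 2<=x2<=5, 0<=x3<=3

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x)     # approximately [4, 15/4, 0]
print(-res.fun)  # approximately 123/4 = 30.75
```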
Example 2 Solve the following LPP by using the bounded variable method:

Minimize z = 6x1 − 2x2 − 3x3
subject to 2x1 + 4x2 + 2x3 ≤ 8
           x1 − 2x2 + 3x3 ≤ 7
and 0 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 2, 0 ≤ x3 ≤ 1.

Solution The given LPP is a minimization problem. Since Min z = −Max(−z) = −Max z*, the problem is to Maximize z* = −6x1 + 2x2 + 3x3 subject to the given constraints. Introducing slack variables x4 and x5, the given LPP reduces to

Maximize z* = −6x1 + 2x2 + 3x3
subject to 2x1 + 4x2 + 2x3 + x4 = 8
           x1 − 2x2 + 3x3 + x5 = 7
and 0 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 2, 0 ≤ x3 ≤ 1, x4, x5 ≥ 0.

Since there are no specified upper bounds for the variables x4, x5, we take u4 = u5 = ∞. Here the initial BFS is x4 = 8, x5 = 7, with B = I2 as the initial basis. Now we construct the initial simplex table as follows:
          cj        −6     2     3     0     0
cB    yB     xB           y1    y2    y3    y4    y5
0     y4     x4 = 8        2     4     2     1     0
0     y5     x5 = 7        1    −2     3     0     1
zB = 0       Δj            6    −2    −3     0     0
      uj                   2     2     1     ∞     ∞
Since Δ3 is the most negative, y3 enters the basis. Now θ1 = Min{8/2, 7/3} = 7/3, which corresponds to y5; θ2 = ∞, since yi3 > 0 for i = 1, 2; and u3 = 1. Hence θ = Min{θ1, θ2, u3} = Min{7/3, ∞, 1} = 1, which corresponds to θ = u3. Therefore x3 is substituted at its upper bound and remains non-basic; thus

x3 = u3 − x′3 = 1 − x′3, where 0 ≤ x′3 ≤ 1.

Due to this substitution, the basic variables are updated:

(xB1)new = (xB1)old − y13·u3 = 8 − 2·1 = 6
(xB2)new = (xB2)old − y23·u3 = 7 − 3·1 = 4.

The updated initial table is as follows:
          cj        −6     2    −3     0     0
cB    yB     xB           y1    y2    y′3   y4    y5
0     y4     x4 = 6        2     4    −2     1     0
0     y5     x5 = 4        1    −2    −3     0     1
zB = 3       Δj            6    −2     3     0     0
      uj                   2     2     1     ∞     ∞
Since Δ2 is the only negative net evaluation, y2 enters the basis. Now θ1 = Min{6/4} = 3/2, which corresponds to y4; θ2 = Min{∞} = ∞; and u2 = 2.

∴ θ = Min{3/2, ∞, 2} = 3/2, which corresponds to θ = θ1.

Hence y12 = 4 is the key element, and y4 leaves the basis. Now we construct the next table:

          cj        −6     2    −3     0     0
cB    yB     xB           y1    y2    y′3   y4    y5
2     y2     x2 = 3/2     1/2    1   −1/2   1/4    0
0     y5     x5 = 7        2     0    −4    1/2    1
zB = 6       Δj            7     0     2    1/2    0
      uj                   2     2     1     ∞     ∞
Since all Δj ≥ 0, the optimal solution has been reached, and it is given by

x1 = 0, x2 = 3/2 and x′3 = 0,
i.e. x1 = 0, x2 = 3/2 and x3 = 1 (as x3 = 1 − x′3),
and Minimum z = 6·0 − 2·(3/2) − 3·1 = −6.
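The result can again be cross-checked numerically (assuming SciPy is available); this sketch is not part of the bounded variable method itself.

```python
# Cross-check of Example 2 with SciPy's LP solver.
from scipy.optimize import linprog

c = [6, -2, -3]                        # minimize z = 6x1 - 2x2 - 3x3
A_ub = [[2, 4, 2], [1, -2, 3]]
b_ub = [8, 7]
bounds = [(0, 2), (0, 2), (0, 1)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x)    # approximately [0, 3/2, 1]
print(res.fun)  # approximately -6
```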
6.4 Exercises
Solve the following LPPs:

1. Maximize z = 3x1 + 2x2
   subject to the constraints x1 − 3x2 ≤ 3, x1 − 2x2 ≤ 4, 2x1 + x2 ≤ 20, x1 + 3x2 ≤ 30, −x1 + x2 ≤ 6, and 0 ≤ x1 ≤ 8, 0 ≤ x2 ≤ 6.

2. Maximize z = 3x1 + 5x2 + 2x3
   subject to the constraints x1 + 2x2 + 2x3 ≤ 14, 2x1 + 4x2 + 3x3 ≤ 23, and 0 ≤ x1 ≤ 4, 2 ≤ x2 ≤ 5, 0 ≤ x3 ≤ 3.

3. Maximize z = x2 + 3x3
   subject to the constraints x1 + x2 + x3 ≤ 10, x1 − 2x3 ≤ 0, 2x2 − x3 ≤ 10, and 0 ≤ x1 ≤ 8, 0 ≤ x2 ≤ 4, x3 ≥ 0.

4. Maximize z = 4x1 + 4x2 + 3x3
   subject to the constraints x1 + 2x2 + 3x3 ≤ 15, x2 + x3 ≤ 4, 2x1 + x2 − x3 ≤ 6, x1 − x2 + 2x3 ≤ 10, and 0 ≤ x1 ≤ 8, x2 ≥ 0, 0 ≤ x3 ≤ 4.

5. Maximize z = 3x1 + x2 + x3 + 7x4
   subject to the constraints 2x1 + 3x2 − x3 + 4x4 ≤ 40, −2x1 + 2x2 + 5x3 − x4 ≤ 35, x1 + x2 − 2x3 + 3x4 ≤ 100, and 0 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 1, 0 ≤ x3 ≤ 3, 0 ≤ x4 ≤ 4.

6. Maximize z = 4x1 + 10x2 + 9x3 + 11x4
   subject to the constraints 2x1 + 2x2 + 2x3 + 2x4 ≤ 5, 48x1 + 80x2 + 160x3 + 240x4 ≤ 257, and 0 ≤ xj ≤ 1, j = 1, 2, 3, 4.

7. Maximize z = 6x1 − 2x2 − 3x3
   subject to the constraints 2x1 + 4x2 + 2x3 ≤ 8, x1 − 2x2 + 3x3 ≤ 7, and 0 ≤ x1 ≤ 2, 0 ≤ x2 ≤ 2, 0 ≤ x3 ≤ 1.
Chapter 7
Post-optimality Analysis in LPPs
7.1 Objectives
The objectives of this chapter are: • To discuss the effect of discrete changes of the cost/profit vector and requirement vector on the optimal solution of a linear programming problem (LPP) • To find the feasibility and optimality conditions for the discrete changes of the elements of the coefficient matrix of the LPP • To study the effect of structural changes (i.e. inclusion and deletion of a decision variable and constraint) in an LPP.
7.2 Introduction
When solving a real-life linear programming problem (LPP), after formulation and obtaining an optimal solution of the said LPP, sometimes we are interested in studying the effect of changes (discrete as well as continuous) of different parameters of the problem on the current optimal solution. This situation arises in the following two cases: Case I: An error (wrong value used for cost/profit coefficient, omission of variables or constraints, etc.) has been observed in the problem formulation after obtaining the optimal solution. Case II: The system is analysed to study the effect of changes in different parameters. For Case I, the error can be removed in two different ways: (i) Solve the new LPP after replacing the correct values of the components which are used wrongly. © Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_7
(ii) Develop some procedure so that the current optimum solution can be used to find the correct solution of the problem. For Case I, to reduce the computational effort, one should choose option (ii). On the other hand, for Case II, option (ii) of Case I is used, as there is no other option. An analysis of such a post-optimal problem can thus be termed a post-optimality analysis. The changing parameters may be either discrete or continuous. The study of the effect of discrete parameter changes on the optimal solution is called sensitivity analysis, and that of continuous changes is termed parametric programming. The changes or variations in an LPP which are studied in the post-optimality analysis are: (i) Changes in the cost coefficients cj of the objective function, i.e. a change in the cost or profit vector (ii) Changes in the requirement vector b (iii) Changes in the coefficient of the constraints (iv) Structural changes due to addition or deletion of some variables (v) Structural changes due to addition or deletion of some linear constraints. After these changes have been effected in an LPP, we may have any of the following cases: (i) The optimum solution remains unaltered; i.e. the basic variables and their values are essentially unchanged. (ii) The basic variables remain the same, but their values may change. (iii) The basic solution is changed entirely.
7.3 Discrete Changes in the Profit Vector
Let xB be the optimal basic feasible solution of the following LPP: Maximize z = cᵀx subject to Ax = b and x ≥ 0, where c, x ∈ Rⁿ, b ∈ Rᵐ and A is an m × n real matrix of rank m. Let Δck be the amount which is added to the kth component ck of c = (c1, c2, …, cn), so that the new value of the kth component becomes c̄k = ck + Δck. Since xB = B⁻¹b is independent of c, the value of xB is not changed by any change of ck, k = 1, 2, …, n. Hence the solution xB remains basic feasible for the changed problem. Here two cases may arise:

Case I: ck is not in cB. Case II: ck is in cB.
Case I: When ck is not in cB, ck is not the coefficient of a basic variable in the objective function. In that case the net evaluation corresponding to the kth profit coefficient is given by

zk − c̄k = zk − (ck + Δck), for all ck ∉ cB,

since zk is not affected by any change in ck. Also, the net evaluations corresponding to the profit coefficients other than ck remain unchanged. Thus the current solution xB remains optimum for the new problem if

zk − ck − Δck ≥ 0, i.e. if Δck ≤ zk − ck, for all ck ∉ cB.
Case II: When ck is in cB, ck is the coefficient of the basic variable in the objective function. In this case, the net evaluations are given by z j cj ¼
m X
¼
cBi yij cj
i¼1 m X
cBi yij þ cBk ykj cj ; where ck ¼ cBk :
i¼1
i 6¼ k If ck is replaced by ck þ Dck ¼ ck , then zj cj ¼
m X
cBi yij þ ðcBk þ DcBk Þykj cj ðj 6¼ kÞ
i¼1
i 6¼ k m X ¼ cBi yij þ DcBk ykj cj i¼1
¼ zj cj þ DcBk ykj : Thus, the new basic feasible solution xB will remain optimum so long as Dck satisfies or When ykj [ 0, DcBk
zj cj þ DcBkykj 0 DcBk ykj zj cj :
zj cj ykj ,
i.e. ΔcBk ≥ Max{−(zj − cj)/ykj : ykj > 0}. Again, when ykj < 0, ΔcBk ≤ −(zj − cj)/ykj, i.e. ΔcBk ≤ Min{−(zj − cj)/ykj : ykj < 0}. Hence

Max{−(zj − cj)/ykj : ykj > 0} ≤ ΔcBk = Δck ≤ Min{−(zj − cj)/ykj : ykj < 0}  (j ≠ k).

Example 1 We are given the following LPP:

Maximize z = −x1 + 2x2 − x3
subject to 3x1 + x2 − x3 ≤ 10
           −x1 + 4x2 + x3 ≥ 6
           x2 + x3 ≤ 4
and xj ≥ 0, j = 1, 2, 3.

(a) Determine the effect of discrete changes in those components of the profit vector which correspond to the basic variables.
(b) Discuss the effect of discrete changes in those components of the profit vector which do not correspond to basic variables.

Solution Introducing the slack variables x4 ≥ 0, x6 ≥ 0, surplus variable x5 ≥ 0 and an artificial variable x7 ≥ 0 in the constraints of the given LPP and then solving the resulting problem by the simplex method, the following optimum simplex table is obtained:
          cj        −1     2    −1     0     0     0    −M
cB    yB     xB           y1    y2    y3    y4    y5    y6    y7
0     y4     x4 = 6        3     0    −2     1     0    −1     0
2     y2     x2 = 4        0     1     1     0     0     1     0
0     y5     x5 = 10       1     0     3     0     1     4    −1
zB = 8       Δj            1     0     3     0     0     2     M
(a) Here the basic variables are x4, x2, x5. The corresponding profit coefficients are c4, c2, c5. We know that an optimum basic feasible solution will maintain its optimality if a change Δck of ck ∈ cB satisfies the following inequalities:
Max{−(zj − cj)/ykj : ykj > 0} ≤ ΔcBk = Δck ≤ Min{−(zj − cj)/ykj : ykj < 0}  (j ≠ k).

For k = 2 (the basic variable x2), we have

Max{−(zj − cj)/y2j : y2j > 0} ≤ Δc2 ≤ Min{−(zj − cj)/y2j : y2j < 0},

i.e. Max{−3/1, −2/1} ≤ Δc2 (there is no y2j < 0 among the non-basic columns), i.e. Δc2 ≥ −2.

(b) Here the non-basic variables are x1, x3, x6. Among these variables, only x1 and x3 are decision variables of the given LPP, with profit coefficients c1 and c3. We know that the change Δck in ck ∉ cB must satisfy the upper limit Δck ≤ zk − ck in order to maintain the optimality of the optimum basic feasible solution. Thus, for k = 1, Δc1 ≤ z1 − c1, i.e. Δc1 ≤ 1; for k = 3, Δc3 ≤ z3 − c3, i.e. Δc3 ≤ 3.

Alternative method
(a) Here the basic variables are x4, x2, x5. Among these variables, only x2 is a decision variable of the given LPP, with profit coefficient c2 = 2. Let Δc2 be the amount by which c2 is changed; the new value of c2 is c̄2 = 2 + Δc2. For the change of c2 = 2 to c̄2, we have the following changed simplex table:

          cj        −1   2+Δc2   −1      0     0      0      −M
cB      yB     xB          y1    y2     y3     y4    y5     y6      y7
0       y4     x4 = 6       3     0     −2      1     0     −1       0
2+Δc2   y2     x2 = 4       0     1      1      0     0      1       0
0       y5     x5 = 10      1     0      3      0     1      4      −1
        Δj                  1     0   3+Δc2     0     0    2+Δc2     M

The current solution xB remains optimum for the new problem if 3 + Δc2 ≥ 0 and 2 + Δc2 ≥ 0, i.e. if Δc2 ≥ Max{−3, −2}, i.e. if Δc2 ≥ −2.
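The Max/Min range formula used in part (a) can also be written as a small helper; the function below is our own illustrative sketch, applied to the numbers of this example, and is not code from the book.

```python
# Sketch of the allowable-change range for a basic cost coefficient.
import math

def basic_cost_range(deltas, row):
    """Range (lo, hi) for the change in a basic variable's cost.

    deltas : net evaluations z_j - c_j of the non-basic columns
    row    : entries y_kj of that basic variable's row, in the same order
    """
    lo = max((-d / y for d, y in zip(deltas, row) if y > 0), default=-math.inf)
    hi = min((-d / y for d, y in zip(deltas, row) if y < 0), default=math.inf)
    return lo, hi

# Example 1(a): non-basic columns y1, y3, y6 have z_j - c_j = (1, 3, 2);
# x2's row entries in those columns are (0, 1, 1).
print(basic_cost_range([1, 3, 2], [0, 1, 1]))  # (-2.0, inf): Δc2 >= -2
```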
(b) Here the non-basic variables are x1, x3, x6. Among these, only x1 and x3 are the given decision variables, with profit coefficients c1 and c3. Let Δc1 be the amount by which c1 is changed; the new value of c1 is c̄1 = −1 + Δc1. For the change of c1 = −1 to c̄1, we have the following changed simplex table:

          cj      −1+Δc1     2    −1     0     0     0    −M
cB    yB     xB            y1     y2    y3    y4    y5    y6    y7
0     y4     x4 = 6         3      0    −2     1     0    −1     0
2     y2     x2 = 4         0      1     1     0     0     1     0
0     y5     x5 = 10        1      0     3     0     1     4    −1
      Δj                 1−Δc1     0     3     0     0     2     M

The current solution remains optimum for the new problem if 1 − Δc1 ≥ 0, i.e. if Δc1 ≤ 1. Similarly, the range of Δc3 can be obtained.

Example 2 Given the LPP

Maximize z = 15x1 + 45x2
subject to x1 + 16x2 ≤ 240
           5x1 + 2x2 ≤ 162
           x2 ≤ 50
and x1, x2 ≥ 0,

if Maximize z = Σ cj xj, j = 1, 2, and c2 is kept fixed at 45, determine how much c1 can be changed without affecting the optimal solution.

Solution By introducing the slack variables x3 ≥ 0, x4 ≥ 0 and x5 ≥ 0 in the constraints of the given LPP and then solving the resulting problem by the simplex method, the following optimum simplex table is obtained:
          cj        15    45     0      0      0
cB    yB     xB           y1    y2     y3     y4     y5
45    y2     x2 = 173/13   0     1    5/78  −1/78     0
15    y1     x1 = 352/13   1     0   −1/39   8/39     0
0     y5     x5 = 477/13   0     0   −5/78   1/78     1
zB = 1005    Δj            0     0    5/2    5/2      0

From this optimum table, it is seen that y1, y2, y5 are the basic vectors. Let c1 be changed to c1 + Δc1.
Since c2 is kept fixed and c1 can vary, an optimum basic feasible solution will maintain its optimality if Δc1 (here k = 2, since x1 occupies the second row of the basis) satisfies

Max{−(zj − cj)/y2j : y2j > 0} ≤ Δc1 ≤ Min{−(zj − cj)/y2j : y2j < 0},
or Max{−(z4 − c4)/y24} ≤ Δc1 ≤ Min{−(z3 − c3)/y23},
or Max{−(5/2)/(8/39)} ≤ Δc1 ≤ Min{−(5/2)/(−1/39)},
or −195/16 ≤ Δc1 ≤ 195/2.
Example 3 We are given the following LPP:

Max z = 3x1 + 5x2
subject to 3x1 + 2x2 ≤ 18
           2x1 ≤ 4
           x2 ≤ 6
and x1, x2 ≥ 0.

Discuss the effect on the optimality of the solution when the objective function is changed to 3x1 + x2.

Solution After introducing the slack variables x3 ≥ 0, x4 ≥ 0 and x5 ≥ 0 in the constraints of the preceding LPP and then solving the reformulated LPP by the simplex method, the optimum simplex table is as follows:
yB
3 y1 0 y4 5 y2 zB ¼ 36
xB x1 ¼ 2 x4 ¼ 0 x2 ¼ 6
3 y1
5 y2
0 y3
0 y4
0 y5
1 0 0 0
0 0 1 0
1/3 −2/3 0 1
0 1 0 0
−2/3 4/3 1 3
Dj
Since the objective function is now changed to z = 3x1 + x2, the new cB becomes (3, 0, 1). Hence the new net evaluations corresponding to the variables x3, x4, x5 become

Δ3 = 3·(1/3) + 0·(−2/3) + 1·0 − 0 = 1
Δ4 = 3·0 + 0·1 + 1·0 − 0 = 0
Δ5 = 3·(−2/3) + 0·(4/3) + 1·1 − 0 = −1.
This indicates that y5 enters the basis in the next iteration. The new simplex table is the following:

          cj         3     1     0     0     0
cB    yB     xB           y1    y2    y3    y4    y5      Min ratio
3     y1     x1 = 2        1     0    1/3    0   −2/3         –
0     y4     x4 = 0        0     0   −2/3    1    4/3     0/(4/3) = 0
1     y2     x2 = 6        0     1     0     0     1        6/1 = 6
zB = 12      Δj            0     0     1     0    −1

Here y4 is the outgoing vector, and y25 = 4/3 is the key element. Now we construct the next table:

          cj         3     1     0     0     0
cB    yB     xB           y1    y2    y3    y4    y5
3     y1     x1 = 2        1     0     0    1/2     0
0     y5     x5 = 0        0     0   −1/2   3/4     1
1     y2     x2 = 6        0     1    1/2  −3/4     0
zB = 12      Δj            0     0    1/2   3/4     0
Here all Δj ≥ 0. Hence the solution is optimal, and it is x1 = 2, x2 = 6 and Max z = 12.

Example 4 We are given the LPP

Maximize z = 6x1 − 2x2 + 3x3
subject to 2x1 − x2 + 2x3 ≤ 2
           x1 + 4x3 ≤ 4
and x1, x2, x3 ≥ 0.

Find the ranges of the profit components when (i) c1 and c2 are the two profit components changed at the same time, and (ii) all three profit components c1, c2 and c3 are changed at a time, so as to keep the optimal solution the same.

Solution Introducing the slack variables x4 ≥ 0 and x5 ≥ 0 in the constraints of the given LPP and then solving the resulting problem by the simplex method, the following optimal simplex table is obtained:
          cj         6    −2     3     0     0
cB    yB     xB           y1    y2    y3    y4    y5
6     y1     x1 = 4        1     0     4     0     1
−2    y2     x2 = 6        0     1     6    −1     2
zB = 12      Δj            0     0     9     2     2
(i) When the two profit components c1 and c2 are changed at a time, for the change of c1 = 6 and c2 = −2 to c̄1 and c̄2, the modified simplex table for the new problem is as follows:

          cj        c̄1    c̄2     3       0       0
cB    yB     xB           y1    y2     y3          y4       y5
c̄1    y1     x1 = 4        1     0      4           0        1
c̄2    y2     x2 = 6        0     1      6          −1        2
      Δj                   0     0   4c̄1+6c̄2−3    −c̄2    c̄1+2c̄2

The current solution remains optimum for the new problem if zj − cj ≥ 0 for all j, i.e. if 4c̄1 + 6c̄2 − 3 ≥ 0, −c̄2 ≥ 0 and c̄1 + 2c̄2 ≥ 0, i.e. if c̄2 ≥ (3 − 4c̄1)/6, c̄2 ≤ 0 and c̄2 ≥ −c̄1/2, i.e. if

Max{(3 − 4c̄1)/6, −c̄1/2} ≤ c̄2 ≤ 0, c̄1 being any real number.
(ii) When all three components c1 = 6, c2 = −2 and c3 = 3 are changed at a time to c̄1, c̄2 and c̄3, the modified simplex table for the new problem is as follows:

          cj        c̄1    c̄2    c̄3       0       0
cB    yB     xB           y1    y2     y3           y4       y5
c̄1    y1     x1 = 4        1     0      4            0        1
c̄2    y2     x2 = 6        0     1      6           −1        2
      Δj                   0     0   4c̄1+6c̄2−c̄3    −c̄2    c̄1+2c̄2

The current solution remains optimum for the new problem if zj − cj ≥ 0 for all j, i.e. if 4c̄1 + 6c̄2 − c̄3 ≥ 0, −c̄2 ≥ 0 and c̄1 + 2c̄2 ≥ 0, i.e. if c̄2 ≥ (c̄3 − 4c̄1)/6, c̄2 ≤ 0 and c̄2 ≥ −c̄1/2, i.e. if

Max{(c̄3 − 4c̄1)/6, −c̄1/2} ≤ c̄2 ≤ 0, c̄1 and c̄3 being any real numbers.
7.4 Discrete Changes in the Requirement Vector
Let xB be the optimum basic feasible solution of the following LPP: Maximize z = cᵀx subject to Ax = b and x ≥ 0,
where c, x ∈ Rⁿ, b ∈ Rᵐ and A is an m × n real matrix of rank m. Let Δbk be the amount which is added to bk, so that the new value of bk is b′k = bk + Δbk, k ∈ {1, 2, …, m}, and let the new basic solution be x′B. Then

x′B = B⁻¹b′, where b′ = b + Δbk·ek and ek is the kth unit vector.

Writing B⁻¹ = (βij), we get

x′B = B⁻¹b + Δbk·B⁻¹ek = xB + Δbk·(β1k, β2k, …, βmk)ᵀ,

i.e.

x′Bi = xBi + βik·Δbk, where βik is the (i, k)th element of B⁻¹.
For maintaining the feasibility of the basic solution, we must have x′B ≥ 0, i.e. x′Bi ≥ 0 for all i = 1, 2, …, m, i.e. xBi + βik·Δbk ≥ 0, i.e. βik·Δbk ≥ −xBi.

When βik > 0, Δbk ≥ −xBi/βik for all such i, i.e. Max{−xBi/βik : βik > 0} ≤ Δbk. Similarly, for βik < 0, Δbk ≤ Min{−xBi/βik : βik < 0}. Hence

Max{−xBi/βik : βik > 0} ≤ Δbk ≤ Min{−xBi/βik : βik < 0}.

We observe that x′B remains optimal, since the optimality condition Δj ≥ 0 is unaffected by the change of bk.

Example 5 We are given the following LPP:

Maximize z = −x1 + 2x2 − x3
subject to 3x1 + x2 − x3 ≤ 10
           −x1 + 4x2 + x3 ≥ 6
           x2 + x3 ≤ 4
and xj ≥ 0, j = 1, 2, 3.
(a) Determine the ranges for discrete changes in the components b2 and b3 of the requirement vector so as to maintain the feasibility (and hence optimality) of the current optimum solution.

Solution Introducing slack variables x4 and x6, surplus variable x5 and artificial variable x7, the given LPP can be written as

Max z = −x1 + 2x2 − x3 + 0·x4 + 0·x5 + 0·x6 − M·x7
subject to 3x1 + x2 − x3 + x4 = 10
           −x1 + 4x2 + x3 − x5 + x7 = 6
           x2 + x3 + x6 = 4.

The initial simplex table is the following:
yB
xB
−1 y1
2 y2
−1 y3
0 y4
0 y5
0 y6
−M y7
0 −M
y4 y7
x4 = 10 x7 = 6
3 −1
1 4
-1 1
1 0
0 −1
0 0
0 1 (continued)
148
7 Post-optimality Analysis in LPPs
(continued) cj cB
yB
0 y6 zB ¼ 6M
xB x6 = 4
−1 y1
2 y2
−1 y3
0 y4
0 y5
0 y6
−M y7
0 M−1
1 −4 M−2
1 −M + 1
0 0
0 M
1 0
0 0
Dj
The optimum table is as follows:

          cj        −1     2    −1     0     0     0    −M
cB    yB     xB           y1    y2    y3    y4    y5    y6    y7
0     y4     x4 = 6        3     0    −2     1     0    −1     0
2     y2     x2 = 4        0     1     1     0     0     1     0
0     y5     x5 = 10       1     0     3     0     1     4    −1
zB = 8       Δj            1     0     3     0     0     2     M
From the optimum simplex table, we have x_B = [6, 4, 10]ᵀ, b = [10, 6, 4]ᵀ and

B⁻¹ = (y4, y7, y6) = ⎡1   0  -1⎤
                     ⎢0   0   1⎥
                     ⎣0  -1   4⎦

The individual effects of changes in b2 and b3, where b = [b1, b2, b3]ᵀ, without violating the feasibility (and hence the optimality) of the basic feasible solution are given by

Max{−x_Bi/β_ik : β_ik > 0} ≤ Δb_k ≤ Min{−x_Bi/β_ik : β_ik < 0}, k = 2, 3.

When k = 2, the column (β_12, β_22, β_32)ᵀ = (0, 0, −1)ᵀ of B⁻¹ has no positive entry, so there is no lower bound, and

Δb2 ≤ Min{−x_B3/β_32} = −10/(−1), or Δb2 ≤ 10.
Again, when k = 3, then

Max{−x_Bi/β_i3 : β_i3 > 0} ≤ Δb3 ≤ Min{−x_Bi/β_i3 : β_i3 < 0}
or Max{−x_B2/β_23, −x_B3/β_33} ≤ Δb3 ≤ Min{−x_B1/β_13}
or Max{−4/1, −10/4} ≤ Δb3 ≤ −6/(−1)
or Max{−4, −5/2} ≤ Δb3 ≤ 6
or −5/2 ≤ Δb3 ≤ 6.
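The two ranges just obtained can be reproduced mechanically from B⁻¹ and x_B. The helper below is a sketch (the function name is ours); it applies Max{−x_Bi/β_ik : β_ik > 0} ≤ Δb_k ≤ Min{−x_Bi/β_ik : β_ik < 0} to the data of Example 5:

```python
import numpy as np

def delta_b_range(B_inv, x_B, k):
    """Feasibility range for a discrete change Δb_k (k is 0-based)."""
    lo, hi = -np.inf, np.inf
    for i in range(len(x_B)):
        beta = B_inv[i, k]
        if beta > 0:
            lo = max(lo, -x_B[i] / beta)   # β_ik > 0 gives a lower bound
        elif beta < 0:
            hi = min(hi, -x_B[i] / beta)   # β_ik < 0 gives an upper bound
    return lo, hi

# Data of Example 5: B^(-1) = (y4, y7, y6), x_B = (6, 4, 10).
B_inv = np.array([[1.0,  0.0, -1.0],
                  [0.0,  0.0,  1.0],
                  [0.0, -1.0,  4.0]])
x_B = np.array([6.0, 4.0, 10.0])

print(delta_b_range(B_inv, x_B, 1))   # Δb2: (-inf, 10.0)
print(delta_b_range(B_inv, x_B, 2))   # Δb3: (-2.5, 6.0)
```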
Example 6 Consider the LPP

Maximize z = 2x1 + x2 + 4x3 − x4
subject to x1 + 2x2 + x3 − 3x4 ≤ 8
−x2 + x3 + 2x4 ≤ 0
2x1 + 7x2 − 5x3 − 10x4 ≤ 21
x1, x2, x3, x4 ≥ 0.

The optimal simplex table of this LPP is
cj              2   1    4   -1   0   0   0
cB  yB  xB      y1  y2   y3   y4  y5  y6  y7
2   y1   8       1   0    3    1   1   2   0
1   y2   0       0   1   -1   -2   0  -1   0
0   y7   5       0   0   -4    2  -2   3   1
zB = 16  Δj      0   0    1    1   2   3   0
When b is changed to [3, −2, 4]ᵀ, make the necessary corrections in the optimal simplex table and solve the resulting problem.

Solution From the given optimal simplex table, we have x_B = [8, 0, 5]ᵀ, b = [8, 0, 21]ᵀ and

B⁻¹ = (y5, y6, y7) = ⎡ 1   2  0⎤
                     ⎢ 0  -1  0⎥
                     ⎣-2   3  1⎦

When b is changed to [3, −2, 4]ᵀ, the new values of the current basic variables are
⎡x1⎤   ⎡ 1   2  0⎤ ⎡ 3⎤   ⎡-1⎤
⎢x2⎥ = ⎢ 0  -1  0⎥ ⎢-2⎥ = ⎢ 2⎥
⎣x7⎦   ⎣-2   3  1⎦ ⎣ 4⎦   ⎣-8⎦
Since the values of x1 and x7 are negative, the current basic solution becomes infeasible. Hence, we have to apply the dual simplex method to obtain the basic feasible solution. Now we construct the next dual simplex table.
cj              2   1    4   -1   0   0   0
cB  yB  xB      y1  y2   y3   y4  y5  y6  y7
2   y1  -1       1   0    3    1   1   2   0
1   y2   2       0   1   -1   -2   0  -1   0
0   y7  -8       0   0   -4    2  -2   3   1
zB = 0   Δj      0   0    1    1   2   3   0
Now x_Br = Min{x_Bi : x_Bi < 0} = Min{−1, −8} = −8 = x_B3. ∴ r = 3. Hence, the vector y_B3, i.e. y7, leaves the basis.

Again, Max{(z_j − c_j)/y_rj : y_rj < 0} = Max{(z3 − c3)/y33, (z5 − c5)/y35} = Max{1/(−4), 2/(−2)} = Max{−1/4, −1} = −1/4, which occurs for j = 3.

Hence, the vector y3 enters the basis, and y_rj, i.e. y33 = −4, is the key element. Now we construct the next dual simplex table.
cj               2   1   4     -1     0      0      0
cB  yB  xB       y1  y2  y3    y4     y5     y6     y7
2   y1  -7        1   0   0    5/2   -1/2   17/4    3/4
1   y2   4        0   1   0   -5/2    1/2   -7/4   -1/4
4   y3   2        0   0   1   -1/2    1/2   -3/4   -1/4
zB = -2  Δj       0   0   0    3/2    3/2   15/4    1/4
Here x_B1 = −7 < 0, so y1 leaves the basis. The only negative entry in the departing row is y15 = −1/2, so y5 enters the basis, with key element −1/2. The next dual simplex table is:

cj                2   1   4   -1   0     0      0
cB  yB  xB        y1  y2  y3   y4  y5    y6     y7
0   y5  14        -2   0   0   -5   1  -17/2   -3/2
1   y2  -3         1   1   0    0   0    5/2    1/2
4   y3  -5         1   0   1    2   0    7/2    1/2
zB = -23  Δj       3   0   0    9   0   33/2    5/2
This table shows that y3 leaves the basis, but since all the elements of the departing row are non-negative, by the dual simplex method, the given problem does not possess any feasible solution.
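The conclusion can be cross-checked with an off-the-shelf LP solver. The sketch below restates Example 6 with the changed requirement vector (3, −2, 4)ᵀ, taking the constraint directions as stated above, and asks SciPy's linprog to solve it:

```python
from scipy.optimize import linprog

# Example 6 with b changed to (3, -2, 4); maximize 2x1 + x2 + 4x3 - x4.
c = [-2.0, -1.0, -4.0, 1.0]             # linprog minimizes, so negate the profits
A_ub = [[1.0,  2.0,  1.0,  -3.0],
        [0.0, -1.0,  1.0,   2.0],
        [2.0,  7.0, -5.0, -10.0]]
b_ub = [3.0, -2.0, 4.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, method="highs")
print(res.status)   # 2 -> infeasible, matching the dual simplex conclusion
```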
7.5
Discrete Changes in the Coefficient Matrix
Here, we shall discuss the post-optimality problems which involve discrete changes in the elements of the coefficient matrix A. Let us consider the following LPP:

Maximize z = ⟨c, x⟩ subject to Ax = b and x ≥ 0,

where c, x ∈ ℝⁿ, b ∈ ℝᵐ and A is an m × n real matrix of rank m. Let x_B be the optimum basic feasible solution of that LPP. Again, let us assume that the element a_rk in the rth row and kth column of matrix A is changed to a′_rk (= a_rk + Δa_rk). Two possibilities may arise:

Case I: a_k ∉ B
Case II: a_k ∈ B, where B is the basis matrix.

Case I: In this case, the post-optimal change does not have any effect on the feasibility of the current optimal solution x_B = B⁻¹b. Thus, the only effect, if any, may be on the optimality of the current solution. Let B⁻¹ = (β1, β2, …, β_m) and a_k = (a_1k, a_2k, …, a_mk)ᵀ. Therefore,

a′_k = (a_1k, a_2k, …, a_rk + Δa_rk, …, a_mk)ᵀ
     = (a_1k, a_2k, …, a_rk, …, a_mk)ᵀ + (0, 0, …, Δa_rk, …, 0)ᵀ
     = a_k + (0, 0, …, Δa_rk, …, 0)ᵀ.

Now, to preserve the optimality of the optimal solution of the original LPP, we must have z′_k − c_k ≥ 0,
where

z′_k = ⟨c_B, y′_k⟩ = ⟨c_B, B⁻¹a′_k⟩
     = ⟨c_B, B⁻¹[a_k + (0, 0, …, Δa_rk, …, 0)ᵀ]⟩
     = ⟨c_B, B⁻¹a_k⟩ + ⟨c_B, B⁻¹(0, 0, …, Δa_rk, …, 0)ᵀ⟩
     = z_k + ⟨c_B, (β1, β2, …, β_r, …, β_m)(0, 0, …, Δa_rk, …, 0)ᵀ⟩
     = z_k + ⟨c_B, β_r⟩Δa_rk

⇒ z′_k − c_k = z_k − c_k + ⟨c_B, β_r⟩Δa_rk.

Now, from the optimality condition, we have z′_k − c_k ≥ 0

⇒ z_k − c_k + ⟨c_B, β_r⟩Δa_rk ≥ 0
⇒ ⟨c_B, β_r⟩Δa_rk ≥ −(z_k − c_k)

⇒ Δa_rk ≥ −(z_k − c_k)/⟨c_B, β_r⟩ for ⟨c_B, β_r⟩ > 0,
  Δa_rk ≤ −(z_k − c_k)/⟨c_B, β_r⟩ for ⟨c_B, β_r⟩ < 0,
  Δa_rk is unrestricted for ⟨c_B, β_r⟩ = 0.
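For a non-basic column this test is easy to automate. The helper below is a sketch (the function name is ours); it computes ⟨c_B, β_r⟩ and returns the admissible interval for Δa_rk, illustrated with the data of Example 9 later in this chapter (r = 1, k = 2):

```python
import numpy as np

def delta_ark_range_nonbasic(c_B, B_inv, z_minus_c_k, r):
    """Range of Δa_rk keeping optimality when column a_k is NOT in the basis.

    r is the (0-based) row index of the changed element; β_r is the rth
    column of B^(-1)."""
    s = float(c_B @ B_inv[:, r])          # <c_B, β_r>
    if s > 0:
        return (-z_minus_c_k / s, np.inf)
    if s < 0:
        return (-np.inf, -z_minus_c_k / s)
    return (-np.inf, np.inf)              # unrestricted

# Data of Example 9 (r = 1, k = 2): c_B = (3, 7, 0), z2 - c2 = 169/38.
c_B = np.array([3.0, 7.0, 0.0])
B_inv = np.array([[5/38,  -1/38, 0.0],
                  [-1/19,  8/38, 0.0],
                  [-1/38, -15/38, 1.0]])
print(delta_ark_range_nonbasic(c_B, B_inv, 169/38, 0))   # approx (-169, inf)
```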
Case II: In this case, a_k ∈ B, i.e. a_k is a basis vector. So, the feasibility of the current optimum solution may also be destroyed. Therefore, in this case, we have to find a range for discrete changes Δa_rk in a_rk so as to maintain the feasibility as well as the optimality of the solution.

Now, we shall find the conditions for maintaining feasibility of the solution. Let B = (B1, B2, …, B_m), B_j ∈ A, j = 1, 2, …, m, and B⁻¹ = (β1, β2, …, β_m), where β_j is the jth column vector of B⁻¹. Since a_k ∈ B, the change in a_rk will affect only one column of B, say the pth. Then the basis matrix becomes B* = (B1, B2, …, B*_p, …, B_m),

where B*_p = (B_1p, B_2p, …, B_rp + Δa_rk, …, B_mp)ᵀ. Since B*_p ∈ B*, it can be expressed as a linear combination of the vectors B1, B2, …, B_p, …, B_m in B; therefore,

B*_p = λ1B1 + λ2B2 + ⋯ + λ_mB_m = Bλ, where λ = (λ1, λ2, …, λ_m)ᵀ.
∴ λ = B⁻¹B*_p = B⁻¹[B_p + (0, 0, …, Δa_rk, …, 0)ᵀ] = B⁻¹B_p + β_r Δa_rk = e_p + β_r Δa_rk

[since B⁻¹B_p = e_p, as Be_p = (B1, B2, …, B_p, …, B_m)(0, 0, …, 1, …, 0)ᵀ = B_p],

i.e. (λ1, λ2, …, λ_p, …, λ_m)ᵀ = (0, 0, …, 1, …, 0)ᵀ + (β_1r, β_2r, …, β_pr, …, β_mr)ᵀ Δa_rk.

Now, equating the pth element and the ith (i ≠ p) element of both sides, we have λ_p = 1 + β_pr Δa_rk and λ_i = β_ir Δa_rk for i ≠ p. Now, B*⁻¹ exists only if B* is non-singular (i.e. |B*| ≠ 0). Therefore, we must have λ_p ≠ 0, i.e. 1 + β_pr Δa_rk ≠ 0.

Now, we shall introduce a new matrix E which differs from the identity matrix I_m in the pth column only. The pth column of matrix E can be obtained as follows:

B*_p = λ1B1 + λ2B2 + ⋯ + λ_pB_p + ⋯ + λ_mB_m

or  B_p = −(λ1/λ_p)B1 − (λ2/λ_p)B2 − ⋯ + (1/λ_p)B*_p − ⋯ − (λ_m/λ_p)B_m
        = (B1, B2, …, B*_p, …, B_m)(−λ1/λ_p, −λ2/λ_p, …, 1/λ_p, …, −λ_m/λ_p)ᵀ = B*η,

where η = (−λ1/λ_p, −λ2/λ_p, …, 1/λ_p, …, −λ_m/λ_p)ᵀ is the pth column of E; hence E = (e1, e2, …, η, …, e_m), i.e. the identity matrix with its pth column replaced by η.
Therefore, B*E = B*(e1, e2, …, η, …, e_m) = (B*e1, B*e2, …, B*η, …, B*e_m) = (B1, B2, …, B_p, …, B_m) = B

[as B*e1 = (B1, B2, …, B*_p, …, B_m)(1, 0, …, 0)ᵀ = B1, and B*η = B_p].
∴ B*E = B ⇒ (B*E)B⁻¹ = BB⁻¹ ⇒ B*(EB⁻¹) = I ⇒ (B*)⁻¹ = EB⁻¹.

Hence, the new solution becomes x*_B = (B*)⁻¹b = (EB⁻¹)b = E(B⁻¹b) = Ex_B [∵ x_B = B⁻¹b]

⇒ (x′_B1, x′_B2, …, x′_Bp, …, x′_Bm)ᵀ = E(x_B1, x_B2, …, x_Bp, …, x_Bm)ᵀ

⇒ x′_Bi = x_Bi − (λ_i/λ_p)x_Bp for i ≠ p,  x′_Bp = x_Bp/λ_p

or x′_Bi = x_Bi − [β_ir Δa_rk/(1 + β_pr Δa_rk)]x_Bp for i ≠ p, i = 1, 2, …, m,
   x′_Bp = x_Bp/(1 + β_pr Δa_rk).

Thus, to maintain the feasibility of x′_B, we must have x′_Bi ≥ 0 for i ≠ p and x′_Bp ≥ 0.

Now, x′_Bp ≥ 0 ⇒ x_Bp/(1 + β_pr Δa_rk) ≥ 0 ⇒ 1 + β_pr Δa_rk > 0 if x_Bp ≠ 0.

Again, x′_Bi ≥ 0 (i ≠ p) ⇒ x_Bi − [β_ir Δa_rk/(1 + β_pr Δa_rk)]x_Bp ≥ 0
⇒ x_Bi + Δa_rk(β_pr x_Bi − β_ir x_Bp) ≥ 0
⇒ Δa_rk(β_pr x_Bi − β_ir x_Bp) ≥ −x_Bi

⇒ Δa_rk ≥ −x_Bi/(β_pr x_Bi − β_ir x_Bp) for β_pr x_Bi − β_ir x_Bp > 0,
  Δa_rk ≤ −x_Bi/(β_pr x_Bi − β_ir x_Bp) for β_pr x_Bi − β_ir x_Bp < 0.
Hence, in order to maintain the feasibility, combining these two inequalities, we have

Max{−x_Bi/Q_i : Q_i > 0} ≤ Δa_rk ≤ Min{−x_Bi/Q_i : Q_i < 0},

where Q_i = β_pr x_Bi − β_ir x_Bp. The preceding inequalities give a range for discrete changes Δa_rk in the coefficient matrix A that maintains the feasibility of the solution.

Condition for maintaining the optimality of the solution
To maintain the optimality of the new solution, we must have z′_j − c_j ≥ 0
for all j = 1, 2, …, n. Now,

z′_j − c_j = ⟨c_B, (B*)⁻¹a_j⟩ − c_j
           = ⟨c_B, (EB⁻¹)a_j⟩ − c_j   [∵ (B*)⁻¹ = EB⁻¹]
           = ⟨c_B, Ey_j⟩ − c_j        [∵ y_j = B⁻¹a_j]
           = Σ_{i=1}^{m} c_Bi y′_ij − c_j,

where

y′_ij = y_ij − [Δa_rk β_ir/(1 + β_pr Δa_rk)] y_pj for i ≠ p,
y′_pj = y_pj/(1 + β_pr Δa_rk).

Carrying out the summation,

z′_j − c_j = (z_j − c_j) − y_pj Δa_rk ⟨c_B, β_r⟩/(1 + β_pr Δa_rk).

Thus, since 1 + β_pr Δa_rk > 0, the condition of optimality z′_j − c_j ≥ 0 requires

(z_j − c_j) + Δa_rk[(z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩] ≥ 0.

Hence

Δa_rk ≥ −(z_j − c_j)/[(z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩], if (z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩ > 0, and
Δa_rk ≤ −(z_j − c_j)/[(z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩], if (z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩ < 0.
Hence, in order to maintain the optimality of the new solution, Δa_rk must satisfy the following inequalities (which are the combination of the preceding two inequalities):

Max{−(z_j − c_j)/P_j : P_j > 0} ≤ Δa_rk ≤ Min{−(z_j − c_j)/P_j : P_j < 0},

where P_j = (z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩.
7.6
Addition of a Variable
Addition of a new variable in an LPP plays an important role in post-optimality analysis. Due to this change, structural variation in the simplex table takes place. Let a new variable x_{n+1} be introduced in the following LPP whose optimal solution is known:

Maximize z = ⟨c, x⟩ subject to Ax = b and x ≥ 0.

Also, let a_{n+1} be the coefficient vector associated with x_{n+1} and let the profit coefficient for x_{n+1} be c_{n+1}. Since the requirement vector b is not changed, the old optimal solution will be a feasible solution of the new LPP, but it may not be optimal. Now, we have to compute

y_{n+1} = B⁻¹a_{n+1} and z_{n+1} − c_{n+1} = ⟨c_B, y_{n+1}⟩ − c_{n+1}.

If z_{n+1} − c_{n+1} ≥ 0, then x_{n+1} = 0, and the current optimal solution remains optimal for the new problem. If z_{n+1} − c_{n+1} < 0, y_{n+1} enters the basis, and the simplex method is applied to obtain the optimal solution of the resulting new problem.

Example 7 Consider the LPP

Maximize z = x1 + 2x2 + x3
subject to 2x1 + x2 − x3 ≤ 2
2x1 − x2 + 5x3 ≤ 6
4x1 + x2 + x3 ≤ 6
and x1, x2, x3 ≥ 0.
The optimal simplex table is as follows:

cj               1   2   1    0     0    0
cB  yB  xB       y1  y2  y3   y4    y5   y6
2   y2  x2 = 4    3   1   0   5/4   1/4   0
1   y3  x3 = 2    1   0   1   1/4   1/4   0
0   y6  x6 = 0    0   0   0  -3/2  -1/2   1
zB = 10  Δj       6   0   0  11/4   3/4   0
If a new variable x7 (≥ 0) is introduced with profit (i) 4, (ii) 6 and a7 = [2, −1, 4]ᵀ, discuss the effect of the introduction of the new variable x7.

Solution From the given optimal simplex table, we get the inverse of the basis matrix as

B⁻¹ = (y4, y5, y6) = ⎡ 5/4   1/4  0⎤
                     ⎢ 1/4   1/4  0⎥
                     ⎣-3/2  -1/2  1⎦

Since a7 = [2, −1, 4]ᵀ, the corresponding column vector to be introduced in the optimal simplex table is given by

y7 = B⁻¹a7 = ⎡ 5/4   1/4  0⎤ ⎡ 2⎤   ⎡9/4⎤
             ⎢ 1/4   1/4  0⎥ ⎢-1⎥ = ⎢1/4⎥
             ⎣-3/2  -1/2  1⎦ ⎣ 4⎦   ⎣3/2⎦
(i) When c7 = 4,

z7 − c7 = 2·(9/4) + 1·(1/4) + 0·(3/2) − 4 = 3/4 > 0.

Hence, the optimality condition is satisfied for the modified problem also. Therefore, the optimal solution for the modified problem is given by x1 = 0, x2 = 4, x3 = 2, x7 = 0 with z_Max = 10.

(ii) When c7 = 6,

z7 − c7 = 2·(9/4) + 1·(1/4) + 0·(3/2) − 6 = −5/4 < 0.
Hence, the optimality condition is not satisfied. We have to modify the optimal simplex table of the old problem with the added column vector y7 = [9/4, 1/4, 3/2]ᵀ, z7 − c7 = −5/4 and c7 = 6. Now, to get the optimal solution, we shall apply the simplex method. The consecutive tables obtained are as follows:

cj               1   2   1    0     0    0    6
cB  yB  xB       y1  y2  y3   y4    y5   y6   y7    Min ratio
2   y2  x2 = 4    3   1   0   5/4   1/4   0   9/4    16/9
1   y3  x3 = 2    1   0   1   1/4   1/4   0   1/4    8
0   y6  x6 = 0    0   0   0  -3/2  -1/2   1   3/2    0 ←
zB = 10  Δj       6   0   0  11/4   3/4   0  -5/4

Here y7 enters the basis, y6 leaves (minimum ratio 0) and y37 = 3/2 is the key element.

cj               1   2   1    0     0     0    6
cB  yB  xB       y1  y2  y3   y4    y5    y6   y7
2   y2  x2 = 4    3   1   0   7/2    1   -3/2   0
1   y3  x3 = 2    1   0   1   1/2   1/3  -1/6   0
6   y7  x7 = 0    0   0   0   -1   -1/3   2/3   1
zB = 10  Δj       6   0   0   3/2   1/3   5/6   0
From this table, it is seen that all x_Bi ≥ 0 and all Δ_j ≥ 0, and so the optimality conditions are satisfied. Hence, the optimal solution is given by x1 = 0, x2 = 4, x3 = 2, x7 = 0 and z_Max = 10.
7.7
Deletion of a Variable
Like addition of a new variable, deletion of a variable (basic or non-basic) also changes the structure of the problem. According to the type of deleted variables, two cases may arise. Case 1: If the variable to be deleted is non-basic, then the feasibility and optimality conditions are not affected. This means that the current optimal solution is the optimal solution of the new problem. Case 2: If the deleted variable is basic, then the conditions of optimality may be affected, and a new solution must be obtained. To find this new optimal solution, we have to assign a profit −M corresponding to the basic variable to be deleted and apply the simplex method, taking the modified optimal simplex table as the starting simplex table.
7.8 Addition of a New Constraint
Let us suppose that a new constraint, i.e. the (m + 1)th constraint

a_{m+1,1}x1 + a_{m+1,2}x2 + ⋯ + a_{m+1,n}x_n ≤ b_{m+1}, i.e. ux ≤ b_{m+1},

is added to the system of original constraints Ax = b, x ≥ 0, where u = (a_{m+1,1}, a_{m+1,2}, …, a_{m+1,n}) and b_{m+1} is positive or 0 or negative. Then two cases may arise.

(i) The current optimal solution x_B of the original problem satisfies the new constraint. If so, the solution remains feasible as well as optimal.
(ii) If the current optimal solution x_B does not satisfy the new constraint, then the new basic feasible solution will be x*_B = (x_B, x_S)ᵀ, where x_S is the slack variable introduced in the additional constraint. In that case the new basis matrix B* of order m + 1 is given by

B* = ⎡B  O⎤     ⇒ (B*)⁻¹ = ⎡ B⁻¹   O⎤
     ⎣a  1⎦                ⎣-aB⁻¹  1⎦

where a is a submatrix of u [the components are the coefficients of the basic variables]. Now,

x*_B = (B*)⁻¹ ⎡b      ⎤ = ⎡x_B           ⎤
              ⎣b_{m+1}⎦   ⎣b_{m+1} − ax_B⎦

If x*_B ≥ 0, the current solution remains basic feasible for the new problem also; otherwise, we apply the dual simplex method to obtain the basic feasible solution. In this case, note that the following three situations may arise depending on the nature of the solution to the original LPP.
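The bordered inverse (B*)⁻¹ need not be recomputed from scratch; it can be assembled blockwise. A small NumPy sketch (with an illustrative 2 × 2 basis and hypothetical coefficients a) confirming the formula:

```python
import numpy as np

# Illustrative data: a 2x2 basis B and the row a of coefficients of the basic
# variables in the added constraint (hypothetical numbers).
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])
a = np.array([[4.0, 5.0]])
B_inv = np.linalg.inv(B)

# B* = [[B, O], [a, 1]] and its inverse assembled blockwise:
B_star = np.block([[B, np.zeros((2, 1))],
                   [a, np.ones((1, 1))]])
B_star_inv = np.block([[B_inv, np.zeros((2, 1))],
                       [-a @ B_inv, np.ones((1, 1))]])

print(np.allclose(B_star @ B_star_inv, np.eye(3)))   # True
```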
(i) If the original LPP has an optimal solution, then the modified LPP may have an optimal solution or it will have no feasible solution. (ii) If the original LPP has an unbounded solution, then the modified LPP may have an optimal solution or it will have no feasible solution or it will have an unbounded solution. (iii) If the original LPP has no feasible solution, then the modified LPP will also have no feasible solution.
Example 8 We are given the following LPP:

Maximize z = 3x1 + 4x2 + x3 + 7x4
subject to the constraints
8x1 + 3x2 + 4x3 + x4 ≤ 7
2x1 + 6x2 + x3 + 5x4 ≤ 3
x1 + 4x2 + 5x3 + 2x4 ≤ 8
and x1, x2, x3, x4 ≥ 0.

(i) Discuss the effect of discrete changes in b1.
(ii) Let the variable x4 be deleted from the LPP part of the problem. Obtain an optimum solution to the resulting LPP.
(iii) Let a linear constraint 2x1 + 3x2 + x3 + 5x4 ≤ 4 be added to the constraints of the problem. Check whether there is any change in the optimum solution of the original problem.
(iv) Also discuss the case where the upper limit of the preceding constraint is reduced to 2.

Solution For the given LPP, the optimum table is the following:
cj                   3    4       1    7    0       0        0
cB  yB  xB           y1   y2      y3   y4   y5      y6       y7
3   y1  x1 = 16/19    1   9/38   1/2    0   5/38   -1/38     0
7   y4  x4 = 5/19     0   21/19    0    1  -1/19    8/38     0
0   y7  x7 = 126/19   0   59/38  9/2    0  -1/38  -15/38     1
zB = 83/19  Δj        0  169/38  1/2    0   1/38   53/38     0
(i) From this optimum table, we have x_B = [16/19, 5/19, 126/19]ᵀ, b = [7, 3, 8]ᵀ and

B⁻¹ = (y5, y6, y7) = ⎡ 5/38   -1/38  0⎤
                     ⎢-1/19    8/38  0⎥
                     ⎣-1/38  -15/38  1⎦
Let the requirement vector component b1 be changed to b1 + Δb1. Then the range of Δb1 for such a change, maintaining the feasibility (and hence the optimality) of the basic feasible solution, is given by
Max{−x_Bi/β_i1 : β_i1 > 0} ≤ Δb1 ≤ Min{−x_Bi/β_i1 : β_i1 < 0}
or Max{−(16/19)/(5/38)} ≤ Δb1 ≤ Min{(5/19)/(1/19), (126/19)/(1/38)}
or −32/5 ≤ Δb1 ≤ Min{5, 252}, i.e. −32/5 ≤ Δb1 ≤ 5.
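The tabulated optimum can be checked with SciPy's linprog; under the constraint directions stated above, the original LPP should give z = 83/19. A sketch:

```python
from scipy.optimize import linprog

# Example 8: maximize 3x1 + 4x2 + x3 + 7x4 (linprog minimizes, so negate).
A_ub = [[8.0, 3.0, 4.0, 1.0],
        [2.0, 6.0, 1.0, 5.0],
        [1.0, 4.0, 5.0, 2.0]]
res = linprog([-3.0, -4.0, -1.0, -7.0], A_ub=A_ub, b_ub=[7.0, 3.0, 8.0],
              method="highs")
print(-res.fun)   # 83/19, approximately 4.3684
```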
(ii) Since the variable x4 to be deleted is a basic variable, we assign a cost −M to x4 and take the current optimum simplex table as the initial simplex table for the new LPP.

cj               3    4                  1   -M    0             0             0
cB  yB  xB       y1   y2                 y3   y4   y5            y6            y7   Min ratio
3   y1  16/19     1   9/38              1/2    0   5/38         -1/38           0    32/9
-M  y4  5/19      0   21/19               0    1  -1/19          4/19           0    5/21 ←
0   y7  126/19    0   59/38             9/2    0  -1/38        -15/38           1    252/59
zB = 48/19 − 5M/19   Δj:  0,  −125/38 − 21M/19,  1/2,  0,  15/38 + M/19,  −3/38 − 4M/19,  0

Here y2 is the incoming vector, y4 is the outgoing vector and y22 = 21/19 is the key element. Now introducing y2 and dropping y4, we develop the next table.
cj                     3    4    1     0       0      0
cB  yB  xB             y1   y2   y3    y5      y6     y7
3   y1  x1 = 209/266    1    0   1/2   1/7    -1/14    0
4   y2  x2 = 5/21       0    1    0   -1/21    4/21    0
0   y7  x7 = 4997/798   0    0   9/2   1/21   -29/42   1
zB = 139/42  Δj         0    0   1/2   5/21    23/42   0
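Deleting the column of x4 and re-solving with SciPy confirms the value in the table above (a sketch):

```python
from scipy.optimize import linprog

# Example 8 with the variable x4 deleted: maximize 3x1 + 4x2 + x3.
A_ub = [[8.0, 3.0, 4.0],
        [2.0, 6.0, 1.0],
        [1.0, 4.0, 5.0]]
res = linprog([-3.0, -4.0, -1.0], A_ub=A_ub, b_ub=[7.0, 3.0, 8.0],
              method="highs")
print(-res.fun)   # 139/42, approximately 3.3095
```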
This shows that the new optimum solution to the LPP when the variable x4 has been deleted is given by x1 = 209/266, x2 = 5/21, x3 = 0, with Max z = 139/42.

(iii) From the optimum simplex table of the given LPP, we have x1 = 16/19, x2 = 0, x3 = 0, x4 = 5/19. For this solution, 2x1 + 3x2 + x3 + 5x4 = 2·(16/19) + 5·(5/19) = (32 + 25)/19 = 3 < 4. Hence, the current optimal solution also satisfies the additional constraint. Thus, the additional constraint is redundant, and the optimum solution of the given LPP is also the optimum solution of the modified LPP (the original LPP including the additional constraint).
(iv) When the upper limit of the additional constraint is reduced to 2, the additional constraint will be 2x1 + 3x2 + x3 + 5x4 ≤ 2. The current optimum solution does not satisfy the additional constraint, so by introducing a slack variable x8 (≥ 0) in the additional constraint, we have

2x1 + 3x2 + x3 + 5x4 + x8 = 2.

Now appending this constraint to the optimum table, we have the following:

cj                3    4       1    7    0       0       0    0
cB  yB  xB        y1   y2      y3   y4   y5      y6      y7   y8
3   y1  16/19      1   9/38   1/2    0   5/38   -1/38     0    0
7   y4  5/19       0   21/19    0    1  -1/19    4/19     0    0
0   y7  126/19     0   59/38  9/2    0  -1/38  -15/38     1    0
0   y8  2          2   3        1    5   0       0        0    1
The basis B = (y1, y4, y7, y8) is restored by performing the operations R4′ = R4 − 2R1 and then R4′ = R4′ − 5R2, to obtain the following simplex table:
cj                3    4       1    7    0       0       0    0
cB  yB  xB        y1   y2      y3   y4   y5      y6      y7   y8
3   y1  16/19      1   9/38   1/2    0   5/38   -1/38     0    0
7   y4  5/19       0   21/19    0    1  -1/19    4/19     0    0
0   y7  126/19     0   59/38  9/2    0  -1/38  -15/38     1    0
0   y8  -1         0   -3       0    0   0      -1        0    1
zB = 83/19  Δj     0  169/38  1/2    0   1/38   53/38     0    0
Since all z_j − c_j ≥ 0 and x_B4 = −1, an optimum (or better than optimum) but infeasible solution has been obtained. Hence, we have to apply the dual simplex method to solve the problem. As x_B4 = −1, which is negative, the corresponding basis vector y8 leaves the basis. Now

Max{(z_j − c_j)/y_4j : y_4j < 0}, j = 1, 2, …, 8
= Max{(169/38)/(−3), (53/38)/(−1)} = Max{−169/114, −53/38} = −53/38,

which occurs for j = 6.
Therefore, y6 enters the basis, and y46 = −1 is the key element. Now we construct the next simplex table.

cj                3    4      1    7    0      0    0    0
cB  yB  xB        y1   y2     y3   y4   y5     y6   y7   y8
3   y1  33/38      1   6/19   1/2   0   5/38    0    0   -1/38
7   y4  1/19       0   9/19    0    1  -1/19    0    0    4/19
0   y7  267/38     0   52/19  9/2   0  -1/38    0    1  -15/38
0   y6  1          0   3       0    0   0       1    0   -1
zB = 113/38  Δj    0   5/19   1/2   0   1/38    0    0   53/38
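Re-solving the LPP with the tightened constraint appended confirms the drop in the optimum from 83/19 ≈ 4.368 to 113/38 ≈ 2.974 (a sketch):

```python
from scipy.optimize import linprog

A_ub = [[8.0, 3.0, 4.0, 1.0],
        [2.0, 6.0, 1.0, 5.0],
        [1.0, 4.0, 5.0, 2.0],
        [2.0, 3.0, 1.0, 5.0]]    # the added constraint, RHS reduced to 2
res = linprog([-3.0, -4.0, -1.0, -7.0], A_ub=A_ub,
              b_ub=[7.0, 3.0, 8.0, 2.0], method="highs")
print(-res.fun)   # 113/38, approximately 2.9737
```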
Since all z_j − c_j ≥ 0, an optimum solution has been attained. From the preceding simplex table, we see that the additional constraint has decreased the optimum value of the objective function from 83/19 to 113/38.

Example 9 We are given the following LPP:

Maximize Z = 3x1 + 4x2 + x3 + 7x4
subject to the constraints
8x1 + 3x2 + 4x3 + x4 ≤ 7
2x1 + 6x2 + x3 + 5x4 ≤ 3
x1 + 4x2 + 5x3 + 2x4 ≤ 8
and x1, x2, x3, x4 ≥ 0.
The following table gives the optimum solution:

cj                3    4       1    7    0       0        0
cB  yB  xB        y1   y2      y3   y4   y5      y6       y7
3   y1  16/19      1   9/38   1/2    0   5/38   -1/38     0
7   y4  5/19       0   21/19    0    1  -1/19    8/38     0
0   y7  126/19     0   59/38  9/2    0  -1/38  -15/38     1
zB = 83/19  Δj     0  169/38  1/2    0   1/38   53/38     0
Discuss the effect of discrete changes in the activity coefficients a12 and a24 of the coefficient matrix A on the current optimum basic feasible solution to the given LPP.
Solution From the given optimum simplex table, we have

B⁻¹ = (β1, β2, β3) = ⎡ 5/38   -1/38  0⎤
                     ⎢-1/19    8/38  0⎥
                     ⎣-1/38  -15/38  1⎦

where B is the basis for the given problem. Let the element a12 of A be changed to a′12 = a12 + Δa12. In the optimum simplex table, the corresponding column vector y2 is not in the basis. We know that the range for a discrete change Δa_rk in the coefficient matrix A for maintaining the optimality of the solution is given by

Δa_rk ≥ −(z_k − c_k)/⟨c_B, β_r⟩ for ⟨c_B, β_r⟩ > 0,
Δa_rk ≤ −(z_k − c_k)/⟨c_B, β_r⟩ for ⟨c_B, β_r⟩ < 0,
Δa_rk is unrestricted for ⟨c_B, β_r⟩ = 0.
Here r = 1, k = 2. Now ⟨c_B, β1⟩ = 3·(5/38) + 7·(−1/19) + 0·(−1/38) = 1/38 > 0. Therefore, the range for the discrete change Δa12 maintaining the optimality of the solution is given by

Δa12 ≥ −(z2 − c2)/⟨c_B, β1⟩  [∵ ⟨c_B, β1⟩ > 0]
or Δa12 ≥ −(169/38)/(1/38)
or Δa12 ≥ −169.
Let the element a24 of A be changed to a′24 = a24 + Δa24. In the optimum simplex table, the column vector y4 corresponding to a24 is in the basis. Hence, a discrete change in a24 may affect the feasibility as well as the optimality of the original optimum basic feasible solution. We know that the range for the discrete change in the element a24 of the coefficient matrix in order to maintain the optimality of the new solution is given by
Max{−(z_j − c_j)/P_j : P_j > 0} ≤ Δa_rk ≤ Min{−(z_j − c_j)/P_j : P_j < 0},

where P_j = (z_j − c_j)β_pr − y_pj⟨c_B, β_r⟩. Here r = 2, k = 4 and p = 2. Now

⟨c_B, β2⟩ = 3·(−1/38) + 7·(8/38) + 0·(−15/38) = 53/38.

Hence,
P2 = (z2 − c2)β22 − y22⟨c_B, β2⟩ = (169/38)(8/38) − (21/19)(53/38) = −23/38
P3 = (z3 − c3)β22 − y23⟨c_B, β2⟩ = (1/2)(8/38) − 0·(53/38) = 2/19
P5 = (z5 − c5)β22 − y25⟨c_B, β2⟩ = (1/38)(8/38) − (−1/19)(53/38) = 3/38
P6 = (z6 − c6)β22 − y26⟨c_B, β2⟩ = (53/38)(8/38) − (8/38)(53/38) = 0.

Therefore, the range for the discrete change in a24 in order to maintain the optimality of the new solution is given by

Max{−(z3 − c3)/P3, −(z5 − c5)/P5} ≤ Δa24 ≤ Min{−(z2 − c2)/P2}
or Max{−19/4, −1/3} ≤ Δa24 ≤ 169/23
or −1/3 ≤ Δa24 ≤ 169/23.

Again, we know that the range for the discrete change in a24 in order to maintain the feasibility of the new solution is given by

Max{−x_Bi/Q_i : Q_i > 0} ≤ Δa_rk ≤ Min{−x_Bi/Q_i : Q_i < 0},

where Q_i = β_pr x_Bi − β_ir x_Bp. Here r = 2, k = 4, p = 2. Now

Q1 = β22 x_B1 − β12 x_B2 = (8/38)(16/19) − (−1/38)(5/19) = 7/38
Q3 = β22 x_B3 − β32 x_B2 = (8/38)(126/19) − (−15/38)(5/19) = 3/2.

Therefore, the range for the discrete change in a24 in order to maintain the feasibility of the new solution is given by

Max{−x_B1/Q1, −x_B3/Q3} ≤ Δa24 < ∞
or Max{−(16/19)/(7/38), −(126/19)/(3/2)} ≤ Δa24
or −84/19 ≤ Δa24 < ∞.
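These P_j bounds follow mechanically from the table data. The sketch below recomputes the P_j values and the optimality range with Python's exact Fraction arithmetic:

```python
from fractions import Fraction as F

beta2 = [F(-1, 38), F(8, 38), F(-15, 38)]          # β2 = 2nd column of B^(-1)
c_B = [F(3), F(7), F(0)]
cb_beta2 = sum(c * b for c, b in zip(c_B, beta2))  # <c_B, β2> = 53/38

# (z_j - c_j, y_2j) for the non-basic columns j = 2, 3, 5, 6:
data = {2: (F(169, 38), F(21, 19)), 3: (F(1, 2), F(0)),
        5: (F(1, 38), F(-1, 19)), 6: (F(53, 38), F(8, 38))}
beta22 = beta2[1]

P = {j: d * beta22 - y * cb_beta2 for j, (d, y) in data.items()}
# P[2] = -23/38, P[3] = 2/19, P[5] = 3/38, P[6] = 0

lo = max(-d / P[j] for j, (d, y) in data.items() if P[j] > 0)
hi = min(-d / P[j] for j, (d, y) in data.items() if P[j] < 0)
print(lo, hi)   # -1/3 169/23
```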
7.9 Exercises
1. Solve the following LPP:
Maximize z = −x1 + 3x2 − 2x3
subject to the constraints
3x1 − x2 + 2x3 ≤ 7
−2x1 + 4x2 ≤ 12
−4x1 + 3x2 + 8x3 ≤ 10
and x1, x2, x3 ≥ 0.

Determine the separate ranges for discrete changes in a13, a23, a21 consistent with the optimal solution of the given LPP.
2. The optimal solution to the problem
Maximize Z = −x1 + 2x2 − x3
subject to the constraints
3x1 + x2 − x3 ≤ 10, −x1 + 4x2 + x3 ≥ 6, x2 + x3 ≤ 4
and x1, x2, x3 ≥ 0
is given in the following table:

cj              -1   2   -1   0   0   0   -M
cB   yB   xB    y1   y2   y3   y4   y5   y6   y7
0    y4    6     3    0   -2    1    0   -1    0
2    y2    4     0    1    1    0    0    1    0
0    y5   10     1    0    3    0    1    4   -1
zB = 8    Δj     1    0    3    0    0    2    M
Determine the ranges for discrete changes in the components b2 and b3 of the requirement vector so as to maintain the optimality of the current optimal solution.
3. Will the optimal solution to the following LPP remain optimal if the requirement vector (3, 5, 6)ᵀ changes to (3, 10, 6)ᵀ?
Minimize z = 8x1 + 3x2 + 6x3 + 3x4
subject to
x2 + 4x3 + 5x4 ≥ 3
3x1 + 2x3 ≥ 5
x1 + 2x2 + x4 ≥ 6
x1, x2, x3, x4 ≥ 0.
4. Find the optimal solution of the LPP
Maximize z = 4x1 + 3x2
subject to
x1 + x2 ≤ 5
3x1 + x2 ≤ 7
x1 + 2x2 ≤ 10, x1, x2 ≥ 0.

Show how to find the optimal solution of the problem if:
(i) The first component of the original requirement vector is increased by one unit and the third component is decreased by one unit.
(ii) The second component of the original requirement vector is decreased by two units.
5. For the LPP
Maximize z = x1 + 2x2 + x3
subject to
2x1 + x2 − x3 ≤ 2
2x1 − x2 + 5x3 ≤ 6
4x1 + x2 + x3 ≤ 6
x1, x2, x3 ≥ 0,

the optimal table is as follows:

cj              1   2   1    0     0    0
cB  yB  xB      y1  y2  y3   y4    y5   y6
2   y2   4       3   1   0   5/4   1/4   0
1   y3   2       1   0   1   1/4   1/4   0
0   y6   0       0   0   0  -3/2  -1/2   1
zB = 10  Δj      6   0   0  11/4   3/4   0

Discuss the effect of deletion of the variable (i) x1, (ii) x2, (iii) x3.
6. Let us consider the final table of an LPP:

cj              2   4   1   8   2    0      0       0
cB  yB  xB      y1  y2  y3  y4  y5   y6     y7      y8
2   y1   3       1   0   0  -1   0   1/2    1/5    -1
4   y2   1       0   1   0   2   1  -1      0       1/2
1   y3   7       0   0   1   1  -2   5     -3/10    2
        Δj       0   0   0  -1   0   2      1/10    2
Here y6, y7 and y8 are slack variables. If the constraints
(i) 2x1 + 3x2 − x3 + 2x4 + 4x5 ≤ 5
(ii) 2x1 + 3x2 − x3 + 2x4 + 4x5 ≥ 1
are added, then find the solution of the changed LPP.
7. For the LPP
Maximize z = 15x1 + 45x2
subject to
5x1 + 2x2 ≤ 162
x1 + 16x2 ≤ 240
x2 ≤ 50
and x1, x2 ≥ 0,

find the optimal solution. Find the range of each cost coefficient (changed one at a time) for which the current solution will remain optimal.
8. Find the optimal solution of the LPP and the separate ranges of variations of b2 and b3 consistent with the optimality of the solution
Minimize z = x1 + 2x2 − x3
subject to
3x1 + x2 − x3 ≤ 10
−x1 + 4x2 + x3 ≥ 6
x2 + x3 ≤ 4
and x1, x2, x3 ≥ 0.

Determine also the efficient discrete changes in the components of the cost vector which correspond to the basic variables.
9. Following is the optimal table for an LPP:

cj              2   1    1    2    0
cB  yB  xB      y1  y2   y3   y4   y5
2   y1   3       1   0   -1    3    2
1   y2   4       0   1    4   -1   -2
        Δj       0   0    1    3    2
(i) Find the limitations of the values of c3 ; c4 ; c5 (taking one at a time) for which the current solution will remain optimal. (ii) Find the optimal solution to the problem if c3 is changed to 3. (iii) Find the limitations of the values of c1 for which the current solution remains optimal. (iv) Find the optimal solution to this problem if c1 is changed to 5. 10. Find the optimal solution of the LPP
Maximize z = 4x1 + 3x2
subject to
x1 + x2 ≤ 5
3x1 + x2 ≤ 7
x1 + 2x2 ≤ 10
and x1, x2 ≥ 0.

Then find the optimal solution of the problem, if:
(i) The first component of the original requirement vector is increased by one unit and the third component is decreased by one unit.
(ii) The second component of the original requirement vector is decreased by two units.
Chapter 8
Integer Programming
8.1
Objectives
After studying this chapter, the reader will be able to learn: • The limitations of the simplex method in deriving integer solutions to LPPs • Gomory's all-integer and mixed-integer programming techniques • The branch and bound method to solve integer programming problems.
8.2
Introduction
Integer programming problems (IPPs) are a special class of linear programming problems (LPPs) in which all or some of the decision variables are restricted to integer values. If all the variables are restricted to take integer values, the corresponding IPP is called a pure IPP. On the other hand, if only some of the variables are restricted to take integer values, then the problem is called a mixed-integer linear programming problem (mixed ILPP). In the year 1956, R. E. Gomory first developed a method to solve pure IPPs. Later, he extended the method to solve mixed ILPPs. Another important approach, called the branch and bound method, was developed for solving both all-integer and mixed-integer programming problems. An integer solution of a given ILPP can be obtained by finding the non-integer optimal solution through the simplex method or the graphical method and then rounding off the fractional values of the variables. But, in some cases, the solution obtained by rounding may be infeasible, or its deviation from the exact optimal integer solution may become large. Hence, it is necessary to develop a systematic procedure to determine the optimal integer solution to such problems. The following example will make the concept more understandable. © Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_8
172
8 Integer Programming
Let us consider a simple problem: Maximize z = 10x1 + 4x2 subject to 3x1 + 4x2 ≤ 8, x1 ≥ 0, x2 ≥ 0, where x1, x2 are integers. First ignoring the integer-valued restrictions, we get the optimal solution x1 = 8/3, x2 = 0 with Max z = 80/3 = 26⅔ using the graphical method. Then by rounding off the fractional value x1 = 8/3, the solution becomes x1 = 3, x2 = 0 with z = 30. But this solution does not satisfy the constraint 3x1 + 4x2 ≤ 8; therefore, this solution is not feasible. Now, let us take x1 = 2, x2 = 0, which is feasible and integer-valued. But this solution gives z = 20, which is far away from the value z = 26⅔. So this is another disadvantage of getting an integer-valued solution by rounding: the result may be far away from the optimum solution. There are two methods used to solve IPPs:

(i) The Gomory method or the Gomory cutting-plane method
(ii) The branch and bound method.
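The relaxation step of this example can be reproduced with SciPy: the LP optimum is fractional, and rounding x1 up is visibly infeasible. A sketch:

```python
from scipy.optimize import linprog

# LP relaxation of: max 10x1 + 4x2  s.t.  3x1 + 4x2 <= 8, x1, x2 >= 0 integer.
res = linprog([-10.0, -4.0], A_ub=[[3.0, 4.0]], b_ub=[8.0], method="highs")
print(res.x)          # x1 = 8/3 is fractional
print(-res.fun)       # 80/3, approximately 26.67

rounded = [3, 0]                        # naive rounding of x1
print(3 * rounded[0] + 4 * rounded[1])  # 9 > 8: the rounded point is infeasible
```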
8.3
Gomory’s All-Integer Cutting-Plane Method
A systematic procedure for solving all-integer programming problems was first developed by R. E. Gomory in 1956. Later, he extended the method to deal with the more complicated case of mixed-integer programming problems. In this method, first of all, we have to find the optimal solution of the given IPP by the simplex method, ignoring the restriction of integer values of the variables.

(i) If all the variables in the optimal solution obtained have integer values, the current solution will be the required optimal integer solution.
(ii) If not, the considered LPP requires modification by introducing a secondary constraint (also called the Gomory constraint) that cuts off part of the non-integer feasible region but does not eliminate any feasible integer point.
(iii) Then, an optimal solution to this modified LPP is obtained by using the dual simplex method. If all the variables in this solution are integers, then the optimal integer solution is obtained. Otherwise, another constraint is added to the LPP, and the entire procedure is repeated.

All-Integer Programming Algorithm
Gomory's fractional cut method may be summarized in the following steps:

Step 1 Express the given LPP in its standard form by converting the minimization problem to a maximization problem (if necessary) and introducing slack and/or surplus variables with the readjustment of the objective function.
Step 2 Determine the optimal solution by using the simplex method, ignoring the integer value restriction.
Step 3 Test the integrality of the optimum solution. (a) If the optimum solution contains all integer values, an optimal integer basic feasible solution (BFS) has been achieved; stop the process. (b) If not, go to Step 4.
Step 4 Examine the basic variables that have fractional values. Express each as the sum of an integer and a proper fraction. Then select the basic variable with the largest fractional part, i.e. find Max_i {f_Bi}. Let it be f_Bk, attained for i = k.
Step 5 Express each of the negative fractions, if any, in the kth row of the optimal simplex table as the sum of a negative integer and a non-negative proper fraction.
Step 6 Construct the Gomorian constraint

    Σ_{j=1}^{n} f_kj x_j ≥ f_Bk,   or   −Σ_{j=1}^{n} f_kj x_j + G1 = −f_Bk,

where 0 ≤ f_kj < 1, 0 < f_Bk < 1 and G1 is the Gomorian slack variable.
Step 7 Add the Gomorian constraint constructed in Step 6 at the bottom of the optimum simplex table. Use the dual simplex method to find an improved optimal solution. Then go to Step 3.
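The cut construction of Steps 4-6 can be sketched in a few lines. Note that taking the non-negative fractional part maps a negative entry such as −2/3 to 1/3, which is exactly the Step 5 decomposition −2/3 = −1 + 1/3. A minimal sketch (function names are illustrative, not from the text):

```python
# Build the fractional (Gomory) cut from the source row of the optimal
# tableau.  frac() returns the non-negative fractional part, so negative
# entries such as -2/3 give -2/3 - (-1) = 1/3, as required by Step 5.
from fractions import Fraction
import math

def frac(v):
    """Non-negative fractional part: v = floor(v) + frac(v), 0 <= frac(v) < 1."""
    return v - math.floor(v)

def gomory_fractional_cut(row_coeffs, rhs):
    """From the source row  sum_j a_kj x_j = x_Bk  (non-basic columns only),
    return the coefficients f_kj and right-hand side f_Bk of the cut
    sum_j f_kj x_j >= f_Bk."""
    return [frac(a) for a in row_coeffs], frac(rhs)

# Source row of Example 1 below: x1 + (1/3)x3 - (2/3)x4 = 1/3
coeffs, fBk = gomory_fractional_cut([Fraction(1, 3), Fraction(-2, 3)],
                                    Fraction(1, 3))
print(coeffs, fBk)  # cut: (1/3)x3 + (1/3)x4 >= 1/3
```

The same routine reproduces the cuts derived by hand in Examples 1-3 below.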
8.4
Construction of Gomory’s Constraint
Let the optimal non-integer solution to the maximization LPP be obtained. In our usual notation, this solution is displayed in the following simplex table:

 cB     B    xB       y1(β1)  y2(β2)  …  yi(βi)  …  ym(βm)   y(m+1)   …  yn
 cB1    β1   xB1      1       0       …  0       …  0        y1,m+1   …  y1,n
 cB2    β2   xB2      0       1       …  0       …  0        y2,m+1   …  y2,n
 …      …    …        …       …          …          …        …           …
 cBi    βi   xBi      0       0       …  1       …  0        yi,m+1   …  yi,n
 …      …    …        …       …          …          …        …           …
 cBm    βm   xBm      0       0       …  0       …  1        ym,m+1   …  ym,n
        z = ⟨cB, xB⟩  0       0       …  0       …  0        Δm+1     …  Δn    ← Δj
8 Integer Programming
In this table, the variables xBi (i = 1, 2, …, m) are taken as basic variables and the remaining x(m+1), x(m+2), …, xn as non-basic variables. (For simplicity, the basic variables are taken consecutively.) Let the ith basic variable xBi possess a non-integer value. The ith row of the table gives

    0·x1 + 0·x2 + … + xi + … + 0·xm + y(i,m+1) x(m+1) + … + y(i,n) xn = xBi

or

    xi + Σ_{j=m+1}^{n} y_ij x_j = xBi.        (8.1)

Now, let us write xBi = IBi + fBi and y_ij = I_ij + f_ij, where IBi and I_ij are the largest integers such that IBi < xBi and I_ij ≤ y_ij. Hence 0 < fBi < 1 and 0 ≤ f_ij < 1; i.e. fBi is a strictly positive proper fraction, while f_ij is a non-negative proper fraction or zero. Substituting xBi = IBi + fBi and y_ij = I_ij + f_ij in (8.1), we have

    xi + Σ_{j=m+1}^{n} (I_ij + f_ij) x_j = IBi + fBi

or

    xi + Σ_{j=m+1}^{n} I_ij x_j + Σ_{j=m+1}^{n} f_ij x_j = IBi + fBi.        (8.2)

Now, for all the variables xi (i = 1, 2, …, m) and x_j (j = m + 1, m + 2, …, n) to be integer-valued, the sum of the terms with fractional coefficients on the left-hand side must be greater than or equal to the fraction on the right-hand side, i.e.

    Σ_{j=m+1}^{n} f_ij x_j ≥ fBi

or

    Σ_{j=m+1}^{n} f_ij x_j − fBi ≥ 0

or

    −Σ_{j=m+1}^{n} f_ij x_j + fBi ≤ 0.        (8.3)

This constraint is known as the Gomorian constraint.

Example 1 Find the optimal integer solution to the following LPP:

    Maximize z = x1 + x2
    subject to 3x1 + 2x2 ≤ 5
               x2 ≤ 2,
    x1, x2 ≥ 0 and integers.
Solution: The given problem is a maximization problem. Introducing two slack variables x3 (≥ 0) and x4 (≥ 0), the standard form of the given LPP becomes

    Maximize z = x1 + x2 + 0·x3 + 0·x4
    subject to 3x1 + 2x2 + x3 = 5
               x2 + x4 = 2,
    x1, x2, x3, x4 ≥ 0 and integers.

Ignoring the integer condition, we solve the problem by the simplex method. The initial BFS is x3 = 5, x4 = 2 with the initial basis B = [[1, 0], [0, 1]]. The initial simplex table is:

            cj          1     1     0     0
 cB   yB    xB          y1    y2    y3    y4    Min ratio
 0    y3    x3 = 5      3     2     1     0     5/3
 0    y4    x4 = 2      0     1     0     1     –
            zB = 0     −1    −1     0     0     ← Δj

Choosing y1 as the entering vector, y3 leaves the basis:

            cj          1      1      0     0
 cB   yB    xB          y1     y2     y3    y4    Min ratio
 1    y1    x1 = 5/3    1      2/3    1/3   0     (5/3)/(2/3) = 5/2
 0    y4    x4 = 2      0      1      0     1     2/1 = 2
            zB = 5/3    0     −1/3    1/3   0     ← Δj

Now y2 enters and y4 leaves the basis:

            cj          1     1     0      0
 cB   yB    xB          y1    y2    y3     y4
 1    y1    x1 = 1/3    1     0     1/3   −2/3
 1    y2    x2 = 2      0     1     0      1
            zB = 7/3    0     0     1/3    1/3   ← Δj

Since all Δj ≥ 0, an optimal solution of the relaxed problem is obtained: x1 = 1/3, x2 = 2. This is not an integer solution, because x1 = 1/3, so we must add a fractional cut constraint, i.e. a Gomorian constraint, to the optimal simplex table.
Since xB1 = 1/3 = 0 + 1/3, we have fB1 = 1/3. Hence Max{fBi} = 1/3 = fB1, which corresponds to the first row, so the first row is the source row. From the first row of the optimal simplex table we get

    x1 + (1/3)x3 − (2/3)x4 = 1/3

or

    x1 + (0 + 1/3)x3 + (−1 + 1/3)x4 = 0 + 1/3.

Hence, the Gomorian constraint is

    (1/3)x3 + (1/3)x4 ≥ 1/3
or  −(1/3)x3 − (1/3)x4 ≤ −1/3
or  −(1/3)x3 − (1/3)x4 + G1 = −1/3,

where G1 is the Gomorian slack variable. Adding this new constraint at the bottom of the preceding optimal simplex table, we use the dual simplex method to solve the problem.
            cj          1     1     0      0      0
 cB   yB    xB          y1    y2    y3     y4     g1
 1    y1    x1 = 1/3    1     0     1/3   −2/3    0
 1    y2    x2 = 2      0     1     0      1      0
 0    g1    G1 = −1/3   0     0    −1/3   −1/3    1
            zB = 7/3    0     0     1/3    1/3    0   ← Δj

We have xBr = Min{xBi : xBi < 0} = −1/3 = xB3. Hence r = 3, so yB3, i.e. g1, leaves the basis. Now

    Max{Δj/y_rj : y_rj < 0} = Max{(1/3)/(−1/3), (1/3)/(−1/3)} = Max{−1, −1} = −1,

which corresponds to both y3 and y4. Arbitrarily, we choose y3 as the entering vector, and y33 = −1/3 is the key element. Now we construct the next simplex table.
            cj        1     1     0     0     0
 cB   yB    xB        y1    y2    y3    y4    g1
 1    y1    x1 = 0    1     0     0    −1     1
 1    y2    x2 = 2    0     1     0     1     0
 0    y3    x3 = 1    0     0     1     1    −3
            zB = 2    0     0     0     0     1   ← Δj
Since all Δj ≥ 0 and all xBi ≥ 0, we obtain an optimal feasible integer solution. The optimal integer solution is x1 = 0, x2 = 2 and Max z = 0 + 2 = 2.

Example 2 Find the optimum integer solution to the following LPP:

    Maximize z = 2x1 + 1.7x2
    subject to 4x1 + 3x2 ≤ 7
               x1 + x2 ≤ 4,
    x1, x2 ≥ 0 and integers.

Solution: The given problem is a maximization problem. Introducing two slack variables x3 (≥ 0) and x4 (≥ 0), the standard form of the given LPP becomes

    Maximize z = 2x1 + 1.7x2 + 0·x3 + 0·x4
    subject to 4x1 + 3x2 + x3 = 7
               x1 + x2 + x4 = 4,
    x1, x2, x3, x4 ≥ 0 and integers.
Ignoring the integer condition, we solve the problem by the simplex method. The initial BFS is x3 = 7, x4 = 4 with the initial basis B = [[1, 0], [0, 1]]. The initial simplex table is:

            cj          2     1.7    0     0
 cB   yB    xB          y1    y2     y3    y4    Min ratio
 0    y3    x3 = 7      4     3      1     0     7/4
 0    y4    x4 = 4      1     1      0     1     4/1 = 4
            zB = 0     −2    −1.7    0     0     ← Δj

Here y1 enters and y3 leaves the basis.
            cj          2     1.7     0      0
 cB   yB    xB          y1    y2      y3     y4    Min ratio
 2    y1    x1 = 7/4    1     3/4     1/4    0     (7/4)/(3/4) = 7/3
 0    y4    x4 = 9/4    0     1/4    −1/4    1     (9/4)/(1/4) = 9
            zB = 7/2    0    −1/5     1/2    0     ← Δj

Now y2 enters and y1 leaves the basis.
            cj            2       1.7    0       0
 cB   yB    xB            y1      y2     y3      y4
 1.7  y2    x2 = 7/3      4/3     1      1/3     0
 0    y4    x4 = 5/3     −1/3     0     −1/3     1
            zB = 11.9/3   0.267   0      1.7/3   0   ← Δj
Since all Δj ≥ 0, an optimum solution of the relaxed problem is obtained: x2 = 7/3, x4 = 5/3. This is not an integer solution, so we must add a fractional cut (Gomorian constraint) to the optimal simplex table. Since xB1 = 7/3 = 2 + 1/3 and xB2 = 5/3 = 1 + 2/3, we have fB1 = 1/3 and fB2 = 2/3. Hence Max{fB1, fB2} = 2/3 = fB2, which corresponds to the second row; the second row is the source row. From the second row of the optimal simplex table we get

    −(1/3)x1 − (1/3)x3 + x4 = 5/3

or

    (−1 + 2/3)x1 + (−1 + 2/3)x3 + x4 = 1 + 2/3.

Hence, the Gomorian constraint is

    (2/3)x1 + (2/3)x3 ≥ 2/3
or  −(2/3)x1 − (2/3)x3 ≤ −2/3
or  −(2/3)x1 − (2/3)x3 + G1 = −2/3,

where G1 is the Gomorian slack variable. Adding this new constraint at the bottom of the preceding optimal simplex table, we use the dual simplex method to solve the problem.
            cj            2       1.7    0       0     0
 cB   yB    xB            y1      y2     y3      y4    g1
 1.7  y2    x2 = 7/3      4/3     1      1/3     0     0
 0    y4    x4 = 5/3     −1/3     0     −1/3     1     0
 0    g1    G1 = −2/3    −2/3     0     −2/3     0     1
            zB = 11.9/3   0.267   0      1.7/3   0     0   ← Δj

We have xBr = Min{xBi : xBi < 0} = −2/3 = xB3. Hence r = 3, so yB3, i.e. g1, leaves the basis. Now

    Max{Δj/y_rj : y_rj < 0} = Max{0.267/(−2/3), (1.7/3)/(−2/3)} = Max{−0.801/2, −1.7/2} = −0.801/2,

which corresponds to y1. Hence y1 is the entering vector, and y31 = −2/3 is the key element. Now we construct the next simplex table.
            cj          2     1.7    0      0     0
 cB   yB    xB          y1    y2     y3     y4    g1
 1.7  y2    x2 = 1      0     1     −1      0     2
 0    y4    x4 = 2      0     0      0      1    −1/2
 2    y1    x1 = 1      1     0      1      0    −3/2
            zB = 3.7    0     0      0.3    0     0.4   ← Δj

Since all Δj ≥ 0 and all xBi ≥ 0, we obtain an optimal feasible integer solution. The optimal integer solution is x1 = 1, x2 = 1 and Max z = 2 + 1.7 = 3.7.

Example 3 Find the optimum integer solution to the following LPP:

    Maximize z = 7x1 + 6x2
    subject to 2x1 + 3x2 ≤ 12
               6x1 + 5x2 ≤ 30,
    x1, x2 ≥ 0 and integers.

Solution: The given problem is a maximization problem. Introducing two slack variables x3 (≥ 0) and x4 (≥ 0), the standard form of the given LPP becomes

    Maximize z = 7x1 + 6x2 + 0·x3 + 0·x4
    subject to 2x1 + 3x2 + x3 = 12
               6x1 + 5x2 + x4 = 30,
    x1, x2, x3, x4 ≥ 0 and integers.

Ignoring the integer condition, we solve the problem by the simplex method. The initial BFS is x3 = 12, x4 = 30 with the initial basis B = [[1, 0], [0, 1]].
The initial simplex table is:

            cj           7     6     0     0
 cB   yB    xB           y1    y2    y3    y4    Min ratio
 0    y3    x3 = 12      2     3     1     0     12/2 = 6
 0    y4    x4 = 30      6     5     0     1     30/6 = 5
            zB = 0      −7    −6     0     0     ← Δj

Here y1 enters and y4 leaves the basis:

            cj           7     6      0     0
 cB   yB    xB           y1    y2     y3    y4     Min ratio
 0    y3    x3 = 2       0     4/3    1    −1/3    2/(4/3) = 3/2
 7    y1    x1 = 5       1     5/6    0     1/6    5/(5/6) = 6
            zB = 35      0    −1/6    0     7/6    ← Δj

Now y2 enters and y3 leaves the basis:

            cj            7     6     0      0
 cB   yB    xB            y1    y2    y3     y4
 6    y2    x2 = 3/2      0     1     3/4   −1/4
 7    y1    x1 = 15/4     1     0    −5/8    3/8
            zB = 141/4    0     0     1/8    9/8   ← Δj

Since all Δj ≥ 0, an optimum solution of the relaxed problem is obtained: x2 = 3/2, x1 = 15/4. This is not an integer solution, so we must add a fractional cut (Gomorian constraint) to the optimal simplex table. Since xB1 = 3/2 = 1 + 1/2 and xB2 = 15/4 = 3 + 3/4, we have fB1 = 1/2 and fB2 = 3/4. Hence Max{fB1, fB2} = 3/4 = fB2, which corresponds to the second row; the second row is the source row.
Now, from the second row of the optimal simplex table, we get

    x1 − (5/8)x3 + (3/8)x4 = 15/4

or

    x1 + (−1 + 3/8)x3 + (0 + 3/8)x4 = 3 + 3/4.

Hence, the Gomorian constraint is

    (3/8)x3 + (3/8)x4 ≥ 3/4
or  −(3/8)x3 − (3/8)x4 ≤ −3/4
or  −(3/8)x3 − (3/8)x4 + G1 = −3/4,

where G1 is the Gomorian slack variable. Adding this new constraint at the bottom of the preceding optimal simplex table, we use the dual simplex method to solve the problem.
            cj            7     6     0      0      0
 cB   yB    xB            y1    y2    y3     y4     g1
 6    y2    x2 = 3/2      0     1     3/4   −1/4    0
 7    y1    x1 = 15/4     1     0    −5/8    3/8    0
 0    g1    G1 = −3/4     0     0    −3/8   −3/8    1
            zB = 141/4    0     0     1/8    9/8    0   ← Δj

We have xBr = Min{xBi : xBi < 0} = −3/4 = xB3. Hence r = 3, so yB3, i.e. g1, leaves the basis. Now

    Max{Δj/y_rj : y_rj < 0} = Max{(1/8)/(−3/8), (9/8)/(−3/8)} = Max{−1/3, −3} = −1/3,

which corresponds to y3. Hence y3 is the entering vector, and y33 = −3/8 is the key element. Now we construct the next simplex table.
            cj          7     6     0     0      0
 cB   yB    xB          y1    y2    y3    y4     g1
 6    y2    x2 = 0      0     1     0    −1      2
 7    y1    x1 = 5      1     0     0     1     −5/3
 0    y3    x3 = 2      0     0     1     1     −8/3
            zB = 35     0     0     0     1      1/3   ← Δj
Since all Δj ≥ 0 and all xBi ≥ 0, we obtain an optimal feasible integer solution. The optimal integer solution is x1 = 5, x2 = 0 and Max z = 5·7 + 6·0 = 35.
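The integer optima found in Examples 1-3 can be cross-checked by direct enumeration, since the feasible integer sets here are tiny. This is a sanity check only, not a substitute for the cutting-plane method; the helper name is illustrative:

```python
# Brute-force verification of the integer optima of Examples 1-3.
def best_integer_point(objective, constraints, bound=20):
    """Maximize objective(x1, x2) over non-negative integer points up to
    `bound` that satisfy every constraint predicate."""
    best_val, best_pt = None, None
    for x1 in range(bound + 1):
        for x2 in range(bound + 1):
            if all(g(x1, x2) for g in constraints):
                v = objective(x1, x2)
                if best_val is None or v > best_val:
                    best_val, best_pt = v, (x1, x2)
    return best_val, best_pt

# Example 1: max x1 + x2,  3x1 + 2x2 <= 5, x2 <= 2
z1, _ = best_integer_point(lambda a, b: a + b,
                           [lambda a, b: 3*a + 2*b <= 5,
                            lambda a, b: b <= 2])
# Example 2: max 2x1 + 1.7x2,  4x1 + 3x2 <= 7, x1 + x2 <= 4
z2, p2 = best_integer_point(lambda a, b: 2*a + 1.7*b,
                            [lambda a, b: 4*a + 3*b <= 7,
                             lambda a, b: a + b <= 4])
# Example 3: max 7x1 + 6x2,  2x1 + 3x2 <= 12, 6x1 + 5x2 <= 30
z3, p3 = best_integer_point(lambda a, b: 7*a + 6*b,
                            [lambda a, b: 2*a + 3*b <= 12,
                             lambda a, b: 6*a + 5*b <= 30])
print(z1, z2, p2, z3, p3)  # 2, 3.7 at (1, 1), 35 at (5, 0)
```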
8.5
Gomory’s Mixed-Integer Cutting-Plane Method
In all-integer programming problems, the fractional cut (Gomory's constraint) is formed under the assumption that all the variables, including slack and surplus variables, are integers. This means that the application of the fractional cut will not yield a feasible integer solution unless all variables assume integer values. The importance of this assumption is illustrated by the following constraint:

    x1 + (1/3)x2 ≤ 13/2,  x1, x2 ≥ 0 and integers.

From the standpoint of solving the associated integer linear program, the constraint is converted to an equation by introducing a non-negative slack variable G1, i.e.

    x1 + (1/3)x2 + G1 = 13/2.

From this equation, it is clear that it can have a feasible integer solution in x1 and x2 only if G1 is non-integer. Thus, the application of the fractional cut will yield no feasible integer solution, because the variables x1, x2 and G1 cannot all be integers simultaneously. To avoid this situation, a special cut, called Gomory's mixed-integer cut, is developed. For this cut, only a subset of the variables is required to take integer values; the remaining variables, including slack and surplus variables, may be non-integer.
8.6
Construction of Additional Constraint for Mixed-Integer Programming Problems
Let us consider the following mixed-integer programming problem:

    Maximize z = Σ_{j=1}^{n} c_j x_j
    subject to Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, …, m,
    and x_j ≥ 0,  j = 1, 2, …, n,

where the x_j are integers for j = 1, 2, …, k (k < n).
Let us suppose that the basic variable x_r has the largest fractional value among the basic variables that are restricted to take integer values in the optimal simplex table (developed ignoring the integer restrictions). Then, from the optimal simplex table, the rth constraint can be written as

    x_Br = x_r + Σ_{j∈R} a_rj x_j,        (8.4)

where R is the set of indices corresponding to non-basic variables. Let

    x_Br = [x_Br] + f_r,  0 < f_r < 1,

and J+ = {j : a_rj ≥ 0}, J− = {j : a_rj < 0}. Then Eq. (8.4) can be rewritten as

    [x_Br] + f_r = x_r + Σ_{j∈J+} a_rj x_j + Σ_{j∈J−} a_rj x_j

or

    Σ_{j∈J+} a_rj x_j + Σ_{j∈J−} a_rj x_j = [x_Br] − x_r + f_r = I + f_r,        (8.5)

where 0 < f_r < 1 and I is an integer. Clearly, the left-hand side of (8.5) is positive or negative according to whether I + f_r is positive or negative.

Case I: Let I + f_r be positive. In this case, I + f_r must be greater than or equal to f_r, i.e. I + f_r ≥ f_r. Hence, from (8.5), we have

    Σ_{j∈J+} a_rj x_j + Σ_{j∈J−} a_rj x_j ≥ f_r.        (8.6)

Since a_rj is non-positive for j ∈ J− and x_j ≥ 0,

    Σ_{j∈J+} a_rj x_j ≥ Σ_{j∈J+} a_rj x_j + Σ_{j∈J−} a_rj x_j,

i.e.

    Σ_{j∈J+} a_rj x_j ≥ f_r  [by (8.6)].        (8.7)
Case II: Let I + f_r be negative. In this case, the value of I + f_r is −1 + f_r or −2 + f_r or −3 + f_r or …. Therefore,

    Σ_{j∈J−} a_rj x_j ≤ Σ_{j∈J+} a_rj x_j + Σ_{j∈J−} a_rj x_j ≤ −1 + f_r

or

    Σ_{j∈J−} a_rj x_j ≤ −1 + f_r

or

    (1/(−1 + f_r)) Σ_{j∈J−} a_rj x_j ≥ 1   [dividing both sides by −1 + f_r < 0]

or

    (f_r/(−1 + f_r)) Σ_{j∈J−} a_rj x_j ≥ f_r   [multiplying both sides by f_r > 0].        (8.8)

In both cases, the left-hand sides of inequalities (8.7) and (8.8) are non-negative and at least f_r. Thus, any feasible solution of the mixed-integer program must satisfy the inequality

    Σ_{j∈J+} a_rj x_j + (f_r/(f_r − 1)) Σ_{j∈J−} a_rj x_j ≥ f_r.        (8.9)
This inequality is not satisfied by the optimal solution of the LPP without the integer requirement, because setting x_j = 0 for all non-basic j makes the left-hand side zero while the right-hand side is positive. Now, introducing a non-negative slack variable G1 in (8.9), we get the equation

    −Σ_{j∈J+} a_rj x_j − (f_r/(f_r − 1)) Σ_{j∈J−} a_rj x_j + G1 = −f_r.        (8.10)

This equation represents Gomory's mixed-integer constraint, or Gomory's cut.

Example 4 Solve the following mixed-integer programming problem:

    Maximize z = 7x1 + 9x2
    subject to −x1 + 3x2 ≤ 6
               7x1 + x2 ≤ 35,
    x1, x2 ≥ 0 and x1 an integer.

Solution: Ignoring the integer restriction, we get the optimal simplex table as follows:
            cj          7     9     0       0
 cB   yB    xB          y1    y2    y3      y4
 9    y2    x2 = 7/2    0     1     7/22    1/22
 7    y1    x1 = 9/2    1     0    −1/22    3/22
            zB = 63     0     0     28/11   15/11   ← Δj

From the preceding table, it is observed that the optimal value of x1 is not an integer. From the second row, we have

    x1 − (1/22)x3 + (3/22)x4 = 9/2

or

    x1 − (1/22)x3 + (3/22)x4 = 4 + 1/2.

Here r = 2, f_Br = 1/2. Hence, the corresponding fractional cut is

    −Σ_{j∈J+} a_rj x_j − (f_Br/(f_Br − 1)) Σ_{j∈J−} a_rj x_j + G1 = −f_Br,

where J+ = {j : a_rj ≥ 0}, J− = {j : a_rj < 0} and the x_j are the non-basic variables of the optimal simplex table:

    −(3/22)x4 − ((1/2)/(1/2 − 1))·(−1/22)x3 + G1 = −1/2

or

    −(1/22)x3 − (3/22)x4 + G1 = −1/2.

Adding this new constraint at the bottom of the preceding simplex table, we use the dual simplex method to solve the problem.
            cj          7     9     0        0       0
 cB   yB    xB          y1    y2    y3       y4      g1
 9    y2    x2 = 7/2    0     1     7/22     1/22    0
 7    y1    x1 = 9/2    1     0    −1/22     3/22    0
 0    g1    G1 = −1/2   0     0    −1/22    −3/22    1
            zB = 63     0     0     28/11    15/11   0   ← Δj

Since xBr = Min{xBi : xBi < 0} = −1/2 = xB3, the basis vector yB3, i.e. g1, leaves the basis. Now

    Max{Δj/y_3j : y_3j < 0} = Max{(28/11)/(−1/22), (15/11)/(−3/22)} = Max{−56, −10} = −10,

which corresponds to y4.
Hence, y4 enters the basis and g1 leaves the basis.

            cj           7     9      0       0     0
 cB   yB    xB           y1    y2     y3      y4    g1
 9    y2    x2 = 10/3    0     1     10/33    0     1/3
 7    y1    x1 = 4       1     0     −1/11    0     1
 0    y4    x4 = 11/3    0     0      1/3     1    −22/3
            zB = 58      0     0      23/11   0     10   ← Δj

The required optimum solution is x1 = 4, x2 = 10/3 and Max z = 58, with x1 an integer.

Example 5 Solve the following mixed-integer programming problem:

    Maximize z = x1 + x2
    subject to 2x1 + 5x2 ≤ 16
               6x1 + 5x2 ≤ 30,
    x2 ≥ 0 and x1 a non-negative integer.

Solution: Ignoring the integer restriction, we get the optimal simplex table as follows:

            cj            1     1     0      0
 cB   yB    xB            y1    y2    y3     y4
 1    y2    x2 = 9/5      0     1     3/10  −1/10
 1    y1    x1 = 7/2      1     0    −1/4    1/4
            zB = 53/10    0     0     1/20   3/20   ← Δj
From the preceding table, it is observed that the optimal value of x1 is not an integer. From the second row, we have

    x1 + 0·x2 − (1/4)x3 + (1/4)x4 = 7/2

or

    x1 − (1/4)x3 + (1/4)x4 = 3 + 1/2.

Here r = 2, f_Br = 1/2. Hence, the corresponding fractional cut is

    −Σ_{j∈J+} a_rj x_j − (f_Br/(f_Br − 1)) Σ_{j∈J−} a_rj x_j + G1 = −f_Br,

where J+ = {j : a_rj ≥ 0}, J− = {j : a_rj < 0} and the x_j are the non-basic variables of the optimal simplex table:

    −(1/4)x4 − ((1/2)/(1/2 − 1))·(−1/4)x3 + G1 = −1/2

or

    −(1/4)x3 − (1/4)x4 + G1 = −1/2.

Adding this new constraint at the bottom of the preceding simplex table, we use the dual simplex method to solve the problem.
            cj            1     1     0      0      0
 cB   yB    xB            y1    y2    y3     y4     g1
 1    y2    x2 = 9/5      0     1     3/10  −1/10   0
 1    y1    x1 = 7/2      1     0    −1/4    1/4    0
 0    g1    G1 = −1/2     0     0    −1/4   −1/4    1
            zB = 53/10    0     0     1/20   3/20   0   ← Δj

Since xBr = Min{xBi : xBi < 0} = −1/2 = xB3, the basis vector yB3, i.e. g1, leaves the basis. Now

    Max{Δj/y_3j : y_3j < 0} = Max{(1/20)/(−1/4), (3/20)/(−1/4)} = Max{−1/5, −3/5} = −1/5,

which corresponds to y3. Hence y3 enters the basis and g1 leaves the basis.
            cj           1     1     0     0      0
 cB   yB    xB           y1    y2    y3    y4     g1
 1    y2    x2 = 6/5     0     1     0    −2/5    6/5
 1    y1    x1 = 4       1     0     0     1/2   −1
 0    y3    x3 = 2       0     0     1     1     −4
            zB = 26/5    0     0     0     1/10   1/5   ← Δj

The required optimum solution is x1 = 4, x2 = 6/5 and Max z = 26/5, with x1 an integer.
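The coefficient rule of the mixed-integer cut (8.9) is easy to encode: entries a_rj ≥ 0 are kept, and negative entries are scaled by f_r/(f_r − 1), a negative factor that makes them non-negative. A sketch with illustrative names, checked against the source row of Example 4:

```python
# Mixed-integer Gomory cut coefficients from the non-basic entries a_rj
# of the source row; the cut reads  sum_j c_j x_j >= f_r.
from fractions import Fraction

def mixed_gomory_cut(row_coeffs, f_r):
    """Return cut coefficients per (8.9): keep a_rj >= 0, scale a_rj < 0
    by f_r/(f_r - 1)."""
    scale = f_r / (f_r - 1)            # negative, so it flips the sign
    return [a if a >= 0 else scale * a for a in row_coeffs]

# Source row of Example 4: a_r3 = -1/22, a_r4 = 3/22 and f_r = 1/2.
cut = mixed_gomory_cut([Fraction(-1, 22), Fraction(3, 22)], Fraction(1, 2))
print(cut)  # coefficients 1/22 and 3/22, i.e. (1/22)x3 + (3/22)x4 >= 1/2
```

The same call with the Example 5 row (−1/4, 1/4) and f_r = 1/2 reproduces the cut (1/4)x3 + (1/4)x4 ≥ 1/2 derived above.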
8.7
Branch and Bound Method
This method was first developed by A. H. Land and A. G. Doig, and it was further studied by J. D. C. Little and other researchers. It can be used to solve all-integer, mixed-integer and zero-one LPPs. The concept behind this method is to divide the entire feasible region of the LPP into smaller parts by forming (branching into) subproblems and then to examine each of them successively until a feasible solution that gives the optimal solution is obtained. Here, we shall discuss its application to all-integer programming problems.

Branch and Bound (B & B) Algorithm

The iterative procedure of this method is summarized here for an integer programming problem (maximization case). Let us consider the following all-integer programming problem (AIPP):

    Maximize z = Σ_{j=1}^{n} c_j x_j
    subject to Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, …, m,
    x_j ≥ 0 (j = 1, 2, …, n) and integers.

Step 1 Ignoring the integer restrictions on the variables, obtain the optimal solution of the given LPP (denote this LPP by LP-1).
Step 2 Test the integrality of the optimum solution obtained in Step 1. There are several possibilities.
(i) If the solution of LP-1 is infeasible or unbounded, then the given AIPP is also infeasible or unbounded respectively. Stop the process.
(ii) If the optimum solution of LP-1 satisfies the integer restrictions, then the current solution is optimal for the given AIPP. Stop the process.
(iii) If one or more basic variables of LP-1 do not satisfy the integer restrictions, go to Step 3. Let the optimal value of the objective function of LP-1 be z1; this value provides an initial upper bound.
Step 3 Branching step
(i) The value z1 is taken as the initial upper bound, denoted zU, of the objective function value for the given AIPP. A lower bound of the objective function can be obtained by rounding all variable values to integers (when the rounded point is feasible).
(ii) Let the optimum value x̄k of the basic variable xk not be an integer. This variable can be selected as the one with the largest fractional part among the non-integer basic variables, or arbitrarily.
(iii) Branch (partition) the problem LP-1 into two new subproblems (called nodes) based on the integer part of x̄k, by adding the two mutually exclusive constraints
    xk ≤ [x̄k]  and  xk ≥ [x̄k] + 1

to the LPP LP-1, where [x̄k] is the largest integer not greater than x̄k. In this case, the two new LPPs (LP-2 and LP-3) are as follows:

LP-2: Maximize z = Σ_{j=1}^{n} c_j x_j
      subject to Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, …, m,
      xk ≥ [x̄k] + 1 and x_j ≥ 0, j = 1, 2, …, n.

LP-3: Maximize z = Σ_{j=1}^{n} c_j x_j
      subject to Σ_{j=1}^{n} a_ij x_j = b_i,  i = 1, 2, …, m,
      xk ≤ [x̄k] and x_j ≥ 0, j = 1, 2, …, n.
Step 4 Bound step
Solve the two subproblems LP-2 and LP-3 obtained in Step 3. Let the optimal objective value of LP-2 be z2 and that of LP-3 be z3. The objective value of the best integer solution found so far (if one exists) becomes the lower bound, denoted zL, of the IPP objective value (initially, the rounded-off value).
Step 5 Fathoming step
Examine the solutions of LP-2 and LP-3. Several cases may arise:
(i) If the optimal solutions of both subproblems are integral, the required solution is the one that gives the larger value of z.
(ii) If the optimal solution of one subproblem is integral and the other subproblem has no feasible solution, the required solution is that of the subproblem with the integer-valued solution.
(iii) If the optimal solution of one subproblem is integral while that of the other is not, repeat Steps 3 and 4 for the non-integer-valued subproblem.
When the solution of a subproblem is either integer-valued or infeasible, the subproblem (node) is said to be fathomed. If the solution of a subproblem is infeasible, it is said to be fathomed by infeasibility. If the optimal objective value of a subproblem does not exceed the current lower bound zL, it is said to be fathomed by bounding.
If an integer feasible solution with value zL < zU is obtained in a subproblem, the node is said to be fathomed by integrality.
Step 6 Repeat Steps 3 through 5 until all nodes are fathomed and all integer-valued solutions are recorded.
Step 7 Choose, among the recorded integer-valued solutions, the one that yields the optimum value of z.
Step 8 Stop.

Now, we shall illustrate the B & B algorithm with the following numerical example.

Example 6 Solve the following IPP by the branch and bound method:

    Maximize z = 5x1 + 4x2
    subject to x1 + x2 ≤ 5
               10x1 + 6x2 ≤ 45,
    x1, x2 ≥ 0 and integers.
Solution: Ignoring the integer restrictions on the variables x1 and x2, the associated relaxation of the given IPP is denoted by LP-1:

    (LP-1) Maximize z = 5x1 + 4x2
           subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1, x2 ≥ 0.

Its optimum solution is x1 = 3.75, x2 = 1.25 and z = 23.75 (see Fig. 8.1). This solution does not satisfy the integer requirements, so it is not optimal for the given IPP. Hence, the initial upper bound of the objective function is zU = 23.75. An initial lower bound could be obtained by rounding x1 and x2 to the nearest integers, i.e. x1 = 4, x2 = 1; however, this point does not satisfy the second constraint. Now, we first select one of the integer variables whose value in LP-1 is not an integer.
We select x1 (= 3.75), as it has the largest fractional part. The region 3 < x1 < 4 of the LP-1 solution space contains no integer value of x1, so it can be eliminated by the two mutually exclusive restrictions x1 ≤ 3 and x1 ≥ 4. As a result, we get two new LPPs, LP-2 and LP-3:

LP-2: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≤ 3 and x1, x2 ≥ 0.
      [LP-2 solution space = LP-1 solution space + (x1 ≤ 3)]
The optimal solution of LP-2 is x1 = 3, x2 = 2 and z = 23 (Fig. 8.2).

LP-3: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≥ 4 and x1, x2 ≥ 0.

[Fig. 8.1 Graphical representation of LP-1: feasible region with vertices O, A(4.5, 0), B(3.75, 1.25) and C(0, 5)]
[LP-3 solution space = LP-1 solution space + (x1 ≥ 4)]
The optimal solution of LP-3 is x1 = 4, x2 = 0.83 and z = 23.33. Since this solution does not satisfy the integer restriction, we add the two mutually exclusive constraints x2 ≤ [0.83] and x2 ≥ [0.83] + 1, i.e. x2 ≤ 0 and x2 ≥ 1. As a result, we get two new LPPs, LP-4 and LP-5:

LP-4: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≥ 4, x2 ≤ 0.
      [LP-4 solution space = LP-3 solution space + (x2 ≤ 0) = LP-1 solution space + (x1 ≥ 4) + (x2 ≤ 0)]
The optimal solution of LP-4 is x1 = 4.5, x2 = 0 and z = 22.5.

LP-5: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≥ 4, x2 ≥ 1.
[Fig. 8.2 Graphical representation of LP-2 and LP-3]
[LP-5 solution space = LP-3 solution space + (x2 ≥ 1) = LP-1 solution space + (x1 ≥ 4) + (x2 ≥ 1)]
LP-5 has no feasible solution.
Since the solution of LP-4 does not satisfy the integer restriction, we add the two mutually exclusive constraints x1 ≤ 4 and x1 ≥ 5. As a result, we get two new LPPs, LP-6 and LP-7:

LP-6: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≥ 4, x2 ≤ 0, x1 ≤ 4.
      [LP-6 solution space = LP-4 solution space + (x1 ≤ 4) = LP-1 solution space + (x1 ≥ 4) + (x2 ≤ 0) + (x1 ≤ 4)]
The optimal solution of LP-6 is x1 = 4, x2 = 0 and z = 20.

LP-7: Maximize z = 5x1 + 4x2
      subject to x1 + x2 ≤ 5, 10x1 + 6x2 ≤ 45, x1 ≥ 5, x2 ≤ 0.
      [LP-7 solution space = LP-4 solution space + (x1 ≥ 5)]
LP-7 has no feasible solution.

Hence, all nodes are fathomed, and the optimal solution of the given AIPP is x1 = 3, x2 = 2 with Max z = 23.
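The branch-and-bound run above can be reproduced programmatically. The sketch below (all names are illustrative) solves each LP relaxation by enumerating intersections of constraint pairs, which is adequate only for bounded two-variable problems; a real implementation would call an LP solver instead:

```python
# Minimal branch-and-bound sketch for Example 6 (two variables only).
import math
from itertools import combinations

def lp_max_2d(c, cons, tol=1e-9):
    """Maximize c.x over {x : a.x <= b for (a, b) in cons}.
    cons must contain x1 >= 0 and x2 >= 0 as ((-1, 0), 0) and ((0, -1), 0).
    Returns (x, value) at the best vertex, or None if infeasible."""
    best = None
    for (a1, b1), (a2, b2) in combinations(cons, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:                     # parallel pair, no vertex
            continue
        x = ((b1 * a2[1] - b2 * a1[1]) / det,
             (a1[0] * b2 - a2[0] * b1) / det)
        if all(a[0] * x[0] + a[1] * x[1] <= b + tol for a, b in cons):
            val = c[0] * x[0] + c[1] * x[1]
            if best is None or val > best[1]:
                best = (x, val)
    return best

def branch_and_bound(c, cons):
    incumbent = None                           # best integer (point, value)
    stack = [cons]
    while stack:
        node = stack.pop()
        sol = lp_max_2d(c, node)
        if sol is None:
            continue                           # fathomed by infeasibility
        x, val = sol
        if incumbent and val <= incumbent[1] + 1e-9:
            continue                           # fathomed by bounding
        fracs = [abs(v - round(v)) for v in x]
        if max(fracs) < 1e-6:                  # fathomed by integrality
            incumbent = (tuple(int(round(v)) for v in x), val)
            continue
        k = fracs.index(max(fracs))            # branch on most fractional xk
        lo, hi = math.floor(x[k]), math.ceil(x[k])
        e = (1.0, 0.0) if k == 0 else (0.0, 1.0)
        stack.append(node + [(e, float(lo))])                # xk <= floor
        stack.append(node + [((-e[0], -e[1]), float(-hi))])  # xk >= ceil
    return incumbent

cons = [((1.0, 1.0), 5.0), ((10.0, 6.0), 45.0),
        ((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]
print(branch_and_bound((5.0, 4.0), cons))  # ((3, 2), 23.0)
```

Tracing the stack reproduces the nodes LP-1 through LP-7 of Example 6, ending with the incumbent x1 = 3, x2 = 2, z = 23.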
8.8
Exercises

1. Find the optimum integer solution to the following all-integer programming problems:
(i) Maximize z = x1 + 2x2
    subject to 2x2 ≤ 7, x1 + x2 ≤ 7, 2x1 ≤ 11; x1, x2 ≥ 0 and integers.
(ii) Maximize z = x1 + 5x2
    subject to x1 + 5x2 ≤ 20, x1 ≤ 2; x1, x2 ≥ 0 and integers.
(iii) Maximize z = x1 + x2
    subject to 3x1 − 2x2 ≤ 5, x1 ≤ 2; x1, x2 ≥ 0 and integers.
(iv) Maximize z = 3x1 + 4x2
    subject to 3x1 + 2x2 ≤ 8, x1 + 4x2 ≤ 10; x1, x2 ≥ 0 and integers.
    [Answer: x1 = 0, x2 = 4; Max z = 16]
(v) Maximize z = 2x1 + 2x2
    subject to 5x1 + 3x2 ≤ 8, x1 + 2x2 ≤ 4; x1, x2 ≥ 0 and integers.
(vi) Maximize z = x1 − x2
    subject to x1 + 2x2 ≤ 4, 6x1 + 2x2 ≤ 9; x1, x2 ≥ 0 and integers.

2. Find the optimum integer solution to the following mixed-integer programming problems:
(i) Maximize z = 1.5x1 + 3x2 + 4x3
    subject to 2.5x1 + 2x2 + 4x3 ≤ 12, 2x1 + 4x2 − x3 ≤ 7; x1, x2, x3 ≥ 0 and x3 an integer.
(ii) Maximize z = 2x1 + 6x2
    subject to x1 + x2 ≤ 5, x1 + 2x2 ≤ 6; x1, x2 ≥ 0 and x2 an integer.
(iii) Maximize z = x1 + 6x2
    subject to 3x1 + 2x2 ≤ 5, x2 ≤ 2; x1, x2 ≥ 0 and x1 an integer.

3. Use the branch and bound method to solve the following all-integer programming problems:
(i) Maximize z = x1 + x2
    subject to 4x1 − x2 ≤ 10, 2x1 + 5x2 ≤ 10; x1, x2 ≥ 0 and integers.
(ii) Maximize z = 7x1 + 9x2
    subject to −x1 + 3x2 ≤ 6, 7x1 + x2 ≤ 35, 0 ≤ x1, x2 ≤ 7 and integers.
(iii) Maximize z = 21x1 + 11x2
    subject to 7x1 + 4x2 + x3 = 13, x2 ≤ 5; x1, x2, x3 ≥ 0 and integers.
(iv) Maximize z = 4x1 + 3x2
    subject to x1 + 2x2 ≤ 4, 2x1 + x2 ≤ 6; x1, x2 ≥ 0 and integers.
Chapter 9
Basics of Unconstrained Optimization
9.1
Objectives
The objectives of this chapter are to:
• Discuss different techniques for solving unconstrained optimization problems of a single variable
• Study the necessary and sufficient conditions for optimizing a function of several variables.
9.2
Optimization of Functions of a Single Variable
Let I ⊆ R be any interval, finite or infinite, and let f : I → R be a real-valued function.

Definition
(i) x* ∈ I is said to be a local minimum (or local minimizer) of f if there exists a δ > 0 such that f(x*) ≤ f(x) for all x ∈ (x* − δ, x* + δ) ∩ I.
(ii) x* ∈ I is said to be a global minimum (or global minimizer) of f if f(x*) ≤ f(x) for all x ∈ I.

Necessary condition for optimality
A necessary condition for optimality of f(x) at the point x = x* is f′(x*) = 0. This condition is necessary but not sufficient.
(i) For example, if f(x) = x³, then at x = 0 we have f′(0) = 0. But f(x) > f(0) when x > 0 and f(x) < f(0) when x < 0.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_9
Here f(x) has no extreme value at x = 0.
(ii) Even when f′(x*) does not exist, f(x*) may be an optimum. For example, if f(x) = |x|, then f(0) is obviously a minimum value of f(x), but f′(0) does not exist.

Sufficient condition for optimality
The sufficient condition for optimality of f(x) is as follows. If at x = x* the function f(x) satisfies
(i) f′(x*) = f″(x*) = … = f^(n−1)(x*) = 0,
(ii) f^(n)(x*) exists and is ≠ 0,
then at x = x*, f(x) has
(a) no extreme value if n is odd;
(b) an extreme value if n is even, namely a maximum if f^(n)(x*) < 0 and a minimum if f^(n)(x*) > 0.

Alternative approach
Theorem If
(i) f(x) is defined at x = x*;
(ii) f′(x) exists in a suitably small deleted neighbourhood 0 < |x − x*| < δ [f′(x*) may or may not exist];
(iii) f′(x) has a fixed sign for x < x* and a fixed sign for x > x*;
then, as x increases through x*,
(a) f(x) has no extreme value at x = x* if the sign of f′(x) does not change;
(b) f(x) has a maximum value at x = x* if the sign of f′(x) changes from positive to negative;
(c) f(x) has a minimum value at x = x* if the sign of f′(x) changes from negative to positive.
Note: This method requires neither higher-order derivative information nor derivative information at the point where optimality is tested.

Example 1 Test for optimality at x = 0 for

    f(x) = sin x − x + x³/3! − x⁵/5!.

Solution Here

    f(x)      = sin x − x + x³/3! − x⁵/5!
    f′(x)     = cos x − 1 + x²/2! − x⁴/4!,   so f′(0) = 0
    f″(x)     = −sin x + x − x³/3!,          so f″(0) = 0
    f‴(x)     = −cos x + 1 − x²/2!,          so f‴(0) = 0
    f^(iv)(x) = sin x − x,                   so f^(iv)(0) = 0
    f^(v)(x)  = cos x − 1,                   so f^(v)(0) = 0
    f^(vi)(x) = −sin x,                      so f^(vi)(0) = 0
    f^(vii)(x) = −cos x,                     so f^(vii)(0) = −1 ≠ 0.
Since the order of the first non-zero derivative at 0 is odd, f(x) has neither a maximum nor a minimum at x = 0.

Example 2 If f′(x) = (x − a)^(2m) (x − b)^(2n+1), where m and n are positive integers, show that x = a gives neither a maximum nor a minimum value of f(x), while x = b gives a minimum.

Solution Let h be any positive number, however small. Now

    f′(a − h) = (a − h − a)^(2m) (a − h − b)^(2n+1) = h^(2m) (a − b − h)^(2n+1)

and

    f′(a + h) = (a + h − a)^(2m) (a + h − b)^(2n+1) = h^(2m) (a − b + h)^(2n+1).

Since h is very small, (a − b − h)^(2n+1) and (a − b + h)^(2n+1) have the same sign. Therefore f′(a − h) and f′(a + h) have the same sign, and so f(x) has neither a maximum nor a minimum at x = a.

Again,

    f′(b − h) = (b − h − a)^(2m) (b − h − b)^(2n+1) = −h^(2n+1) (b − a − h)^(2m),
    f′(b + h) = (b + h − a)^(2m) (b + h − b)^(2n+1) = h^(2n+1) (b − a + h)^(2m).

Since h is very small, (b − a − h)^(2m) and (b − a + h)^(2m) have the same (positive) sign. Therefore f′(b − h) < 0 and f′(b + h) > 0; i.e. f′(x) changes sign from negative to positive, so f(x) has a minimum at x = b.

Example 3 If f(x) = |x|, show that f(x) is minimum at x = 0.

Solution Since f(x) = |x|, we have f(0) = 0. Let h be any positive number, however small.
Now f(0 − h) = f(−h) = |−h| = h > 0 = f(0), so f(0 − h) > f(0). Again, f(0 + h) = f(h) = |h| = h > 0 = f(0), so f(0 + h) > f(0). Hence, f(x) has a minimum at x = 0.

Example 4 Show that (x − a)^(1/3) (2x − a)^(2/3) is maximum for x = a/2 and neither maximum nor minimum for x = a (a > 0).

Solution Let f(x) = (x − a)^(1/3) (2x − a)^(2/3). Then f(a/2) = 0 and f(a) = 0. Again, let h be any positive number, however small. Then

    f(a − h) = (a − h − a)^(1/3) (2a − 2h − a)^(2/3) = (−h)^(1/3) (a − 2h)^(2/3) = −h^(1/3) (a − 2h)^(2/3) < 0 = f(a),
    f(a + h) = (a + h − a)^(1/3) (2a + 2h − a)^(2/3) = h^(1/3) (a + 2h)^(2/3) > 0 = f(a).

Therefore f(a − h) < f(a) < f(a + h), and f(x) is neither maximum nor minimum at x = a (a > 0).

Again,

    f(a/2 + h) = (a/2 + h − a)^(1/3) (a + 2h − a)^(2/3) = (h − a/2)^(1/3) (2h)^(2/3) = −(a/2 − h)^(1/3) (2h)^(2/3) < 0 = f(a/2),
    f(a/2 − h) = (a/2 − h − a)^(1/3) (a − 2h − a)^(2/3) = (−1)^(1/3) (a/2 + h)^(1/3) (−2h)^(2/3) = −(a/2 + h)^(1/3) (2h)^(2/3) < 0 = f(a/2),

since (a/2 − h)^(1/3) > 0 and (2h)^(2/3) > 0 for small h > 0. Hence f(a/2 ± h) < 0 = f(a/2), so f(x) is maximum at x = a/2.
9.3 Optimization of Functions of Several Variables
Here we consider the optimization problem

Minimize $f(x)$ subject to $x \in \Omega$,  (9.1)

where the function $f: \mathbb{R}^n \to \mathbb{R}$ is a real-valued function, called the objective function; the vector x is an n-tuple of independent variables $x_1, x_2, \ldots, x_n$; and the set $\Omega$ is a subset of $\mathbb{R}^n$, called the feasible set. There are also optimization problems that require maximization of the objective function, in which case we seek maximizers. Maximization problems, however, can be represented equivalently in the minimization form above, because maximizing f is equivalent to minimizing $-f$. Therefore, we can confine our attention to minimization problems without loss of generality.

Definition Let $f: \Omega \subseteq \mathbb{R}^n \to \mathbb{R}$ be a real-valued function. A point $x^* \in \Omega$ is said to be a:

(i) Local minimizer of f if there exists a $\delta > 0$ such that $f(x^*) \le f(x)$ for all $x \in \Omega \cap B(x^*, \delta)$, where $B(x^*, \delta)$ is the open ball with centre $x^*$ and radius $\delta$;
(ii) Global minimizer of f if $f(x^*) \le f(x)$ for all $x \in \Omega$.

If strict inequality holds in the preceding definition, $x^*$ is said to be a strict local (global) minimizer.

Necessary condition for optimality

Theorem The necessary condition for the continuously differentiable function f(x) to have an extreme point at $x = x^*$ is that the gradient $\nabla f(x^*) = 0$, i.e.

$\frac{\partial f(x^*)}{\partial x_1} = \frac{\partial f(x^*)}{\partial x_2} = \cdots = \frac{\partial f(x^*)}{\partial x_n} = 0.$  (9.2)

Proof From the Taylor series expansion, we have

$f(x+h) = f(x) + \langle h, \nabla f(x) \rangle + \frac{1}{2!}\langle H(x+\theta h)h, h \rangle, \quad 0 < \theta < 1,$

where H denotes the Hessian matrix. Setting $x = x^*$ in this expression, we have

$f(x^*+h) = f(x^*) + \langle h, \nabla f(x^*) \rangle + \frac{1}{2!}\langle H(x^*+\theta h)h, h \rangle,$

$\Rightarrow \quad f(x^*+h) - f(x^*) = \sum_{i=1}^{n} h_i \frac{\partial f(x^*)}{\partial x_i} + \frac{1}{2!}\langle H(x^*+\theta h)h, h \rangle.$
Since the term $\langle H(x^*+\theta h)h, h \rangle$ is of second order in the components $h_i$, the first-order term dominates it for sufficiently small h. Thus, the sign of $f(x^*+h) - f(x^*)$ is decided by the sign of $\sum_i h_i\, \partial f(x^*)/\partial x_i$.

Let us suppose that $\partial f(x^*)/\partial x_i > 0$. Then $f(x^*+h) - f(x^*)$ will be positive for $h_i > 0$ (i = 1, 2, …, n), i.e. $f(x^*+h) > f(x^*)$, and negative for $h_i < 0$ (i = 1, 2, …, n), i.e. $f(x^*+h) < f(x^*)$.

This means that $x^*$ cannot be an extreme point. The same conclusion is obtained if we assume that $\partial f(x^*)/\partial x_i < 0$. These two conclusions contradict our assumption that $x^*$ is an extreme point. Hence, we conclude that

$\frac{\partial f(x^*)}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n, \quad \text{i.e. } \nabla f(x^*) = 0.$
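The necessary condition $\nabla f(x^*) = 0$ can be generated mechanically with a computer algebra system. The sketch below (an illustrative objective, not from the text; assumes SymPy is installed) solves the gradient system symbolically:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2 - 2*x - 4*y              # illustrative objective, not from the text
grad = [sp.diff(f, v) for v in (x, y)]   # components of the gradient
stationary = sp.solve(grad, (x, y), dict=True)
print(stationary)                        # [{x: 1, y: 2}]
```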
Sufficient condition for optimality

Theorem A sufficient condition for a stationary point $x^*$ to be an extreme point is that the Hessian matrix of f(x) evaluated at $x^*$ is:

(i) Positive definite, in which case $x^*$ is a local minimizer;
(ii) Negative definite, in which case $x^*$ is a local maximizer.

Proof From Taylor's theorem we can write

$f(x^*+h) = f(x^*) + \sum_{i=1}^{n} h_i \frac{\partial f(x^*)}{\partial x_i} + \frac{1}{2!}\sum_{i=1}^{n}\sum_{j=1}^{n} h_i h_j \frac{\partial^2 f(x^*+\theta h)}{\partial x_i \partial x_j}, \quad 0 < \theta < 1.$  (9.3)

Since $x^*$ is a stationary point, the necessary conditions give $\partial f(x^*)/\partial x_i = 0$, $i = 1, 2, \ldots, n$. Then Eq. (9.3) reduces to

$f(x^*+h) - f(x^*) = \frac{1}{2!}\sum_{i=1}^{n}\sum_{j=1}^{n} h_i h_j \left.\frac{\partial^2 f(x)}{\partial x_i \partial x_j}\right|_{x = x^*+\theta h}.$
Therefore the sign of $f(x^*+h) - f(x^*)$ is the same as that of

$\sum_i \sum_j h_i h_j \left.\frac{\partial^2 f(x)}{\partial x_i \partial x_j}\right|_{x = x^*+\theta h}.$

Since the second-order partial derivatives $\partial^2 f(x)/\partial x_i \partial x_j$ are continuous in the neighbourhood of $x^*$, $\left.\partial^2 f/\partial x_i \partial x_j\right|_{x = x^*+\theta h}$ has the same sign as $\left.\partial^2 f/\partial x_i \partial x_j\right|_{x = x^*}$ for sufficiently small components of h. This implies that $f(x^*+h) - f(x^*)$ will be positive, and $x^*$ a local minimum, if

$Q = \sum_{i=1}^{n}\sum_{j=1}^{n} h_i h_j \left.\frac{\partial^2 f(x)}{\partial x_i \partial x_j}\right|_{x = x^*}$

is positive.
This Q is a quadratic form and can be written as $Q = \langle Hh, h \rangle|_{x=x^*}$, where $H|_{x=x^*}$ is the Hessian matrix of f(x) at $x^*$. From linear algebra, we know that the quadratic form is positive for all $h \ne 0$ if and only if H is positive definite at $x = x^*$. This means that a sufficient condition for the stationary point $x^*$ to be a local minimum is that the Hessian matrix of f(x) evaluated at $x = x^*$ is positive definite. This completes the proof for the minimization case. Proceeding in a similar manner, it can be proved that $x^*$ is a local maximum point if the Hessian matrix is negative definite.

Remark For the semi-definite case of the Hessian matrix of f(x), no general conclusion can be drawn about the optimality of f(x).

Example 5 Examine whether $f(x_1, x_2) = \frac{1}{2}k_2 x_1^2 + \frac{1}{2}k_3(x_2 - x_1)^2 + \frac{1}{2}k_1 x_2^2 - P x_2$ (P being a constant) has an optimum or not.

Solution It is given that $f(x_1, x_2) = \frac{1}{2}k_2 x_1^2 + \frac{1}{2}k_3(x_2 - x_1)^2 + \frac{1}{2}k_1 x_2^2 - P x_2$. For an extremum of f, we have

$\frac{\partial f}{\partial x_1} = 0, \quad \text{i.e. } k_2 x_1 - k_3(x_2 - x_1) = 0,$  (9.4)
and

$\frac{\partial f}{\partial x_2} = 0, \quad \text{i.e. } k_3(x_2 - x_1) + k_1 x_2 - P = 0.$  (9.5)

From (9.4), we have $k_2 x_1 - k_3 x_2 + k_3 x_1 = 0$, or

$(k_2 + k_3)x_1 - k_3 x_2 = 0.$  (9.6)

From (9.5), we have $k_3 x_2 - k_3 x_1 + k_1 x_2 = P$, or

$-k_3 x_1 + (k_1 + k_3)x_2 = P.$  (9.7)

Multiplying (9.6) by $k_3$ and (9.7) by $(k_2 + k_3)$ and adding, we have

$\left[(k_1 + k_3)(k_2 + k_3) - k_3^2\right]x_2 = P(k_2 + k_3), \quad \text{or } x_2 = \frac{P(k_2 + k_3)}{k_1 k_2 + k_1 k_3 + k_2 k_3}.$

Multiplying (9.6) by $(k_1 + k_3)$ and (9.7) by $k_3$ and adding, we have

$(k_1 k_2 + k_1 k_3 + k_2 k_3)x_1 = P k_3, \quad \text{or } x_1 = \frac{P k_3}{k_1 k_2 + k_1 k_3 + k_2 k_3}.$

Now the Hessian matrix of f evaluated at $(x_1^*, x_2^*)$ is

$H|_{(x_1^*, x_2^*)} = \begin{pmatrix} \partial^2 f/\partial x_1^2 & \partial^2 f/\partial x_1 \partial x_2 \\ \partial^2 f/\partial x_1 \partial x_2 & \partial^2 f/\partial x_2^2 \end{pmatrix}_{(x_1^*, x_2^*)} = \begin{pmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{pmatrix}.$

The leading principal minors of the Hessian matrix are

$A_1 = |k_2 + k_3| = k_2 + k_3 > 0 \quad [\because k_2, k_3 > 0]$

and

$A_2 = \begin{vmatrix} k_2 + k_3 & -k_3 \\ -k_3 & k_1 + k_3 \end{vmatrix} = (k_2 + k_3)(k_1 + k_3) - k_3^2 = k_1 k_2 + k_2 k_3 + k_3 k_1 > 0 \quad [\because k_1, k_2, k_3 > 0].$
Therefore, the Hessian matrix of f evaluated at $(x_1^*, x_2^*)$, i.e. $H|_{(x_1^*, x_2^*)}$, is positive definite. Hence, f is minimum at $(x_1^*, x_2^*)$, given by

$x_1^* = \frac{P k_3}{k_1 k_2 + k_2 k_3 + k_3 k_1}, \quad x_2^* = \frac{P(k_2 + k_3)}{k_1 k_2 + k_2 k_3 + k_3 k_1}.$
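The leading-principal-minor test used above is easy to automate. The sketch below (not from the text; assumes NumPy is installed) checks the Hessian of Example 5 for the illustrative choice $k_1 = k_2 = k_3 = 1$:

```python
import numpy as np

def leading_principal_minors(H):
    """Determinants of the top-left 1x1, 2x2, ... submatrices of H."""
    return [np.linalg.det(H[:k + 1, :k + 1]) for k in range(H.shape[0])]

# Hessian of Example 5 for the illustrative choice k1 = k2 = k3 = 1
k1 = k2 = k3 = 1.0
H = np.array([[k2 + k3, -k3],
              [-k3, k1 + k3]])
minors = leading_principal_minors(H)
print(minors)   # both minors positive -> H is positive definite
```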
Example 6 Find the extreme points of $f(x_1, x_2) = x_1^3 + x_2^3 + 2x_1^2 + 4x_2^2 + 6$.

Solution Here $f(x_1, x_2) = x_1^3 + x_2^3 + 2x_1^2 + 4x_2^2 + 6$, so

$\frac{\partial f}{\partial x_1} = 3x_1^2 + 4x_1 = x_1(3x_1 + 4), \quad \frac{\partial f}{\partial x_2} = 3x_2^2 + 8x_2 = x_2(3x_2 + 8).$

The necessary conditions for an optimum of $f(x_1, x_2)$ are $\partial f/\partial x_1 = 0$ and $\partial f/\partial x_2 = 0$, i.e.

$x_1(3x_1 + 4) = 0$  (9.8)

and

$x_2(3x_2 + 8) = 0.$  (9.9)

From (9.8), $x_1 = 0$ or $x_1 = -4/3$; from (9.9), $x_2 = 0$ or $x_2 = -8/3$. Hence, Eqs. (9.8) and (9.9) are satisfied at the points

$(0, 0), \quad (0, -8/3), \quad (-4/3, 0), \quad (-4/3, -8/3).$

Since $f(x_1, x_2) = x_1^3 + x_2^3 + 2x_1^2 + 4x_2^2 + 6$,

$\frac{\partial^2 f}{\partial x_1^2} = 6x_1 + 4, \quad \frac{\partial^2 f}{\partial x_2^2} = 6x_2 + 8, \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 0.$
The Hessian matrix of $f(x_1, x_2)$ is given by

$H = \begin{pmatrix} 6x_1 + 4 & 0 \\ 0 & 6x_2 + 8 \end{pmatrix}.$

The leading principal minors of the Hessian matrix are

$H_1 = |6x_1 + 4| \quad \text{and} \quad H_2 = \begin{vmatrix} 6x_1 + 4 & 0 \\ 0 & 6x_2 + 8 \end{vmatrix}.$

Now, we construct the following table:

Point (x1, x2)     H1    H2    Nature of H         Nature of x      Value of f(x)
(0, 0)             4     32    Positive definite   Local minimum    6
(0, -8/3)          4     -32   Indefinite          Saddle point     418/27
(-4/3, 0)          -4    -32   Indefinite          Saddle point     194/27
(-4/3, -8/3)       -4    32    Negative definite   Local maximum    50/3
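The table above can be reproduced by inspecting the eigenvalues of the Hessian at each stationary point. The sketch below (assuming NumPy is installed) classifies all four points of Example 6:

```python
import numpy as np

f = lambda x1, x2: x1**3 + x2**3 + 2*x1**2 + 4*x2**2 + 6
hessian = lambda x1, x2: np.array([[6*x1 + 4, 0.0],
                                   [0.0, 6*x2 + 8]])

for p in [(0, 0), (0, -8/3), (-4/3, 0), (-4/3, -8/3)]:
    eigs = np.linalg.eigvalsh(hessian(*p))
    if np.all(eigs > 0):
        kind = "local minimum"
    elif np.all(eigs < 0):
        kind = "local maximum"
    else:
        kind = "saddle point"
    print(p, kind, round(f(*p), 4))
```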
Example 7 Examine the extreme values of

$f(x, y, z) = 2xyz - 4zx - 2yz + x^2 + y^2 + z^2 - 2x - 4y + 4z.$

Solution Here

$f_x = 2yz - 4z + 2x - 2, \quad f_y = 2xz - 2z + 2y - 4, \quad f_z = 2xy - 4x - 2y + 2z + 4.$

For extreme values of f(x, y, z), the necessary conditions are $f_x = 0$, $f_y = 0$, $f_z = 0$, i.e.

$yz - 2z + x - 1 = 0$  (9.10)

$xz - z + y - 2 = 0$  (9.11)

$xy - 2x - y + z + 2 = 0.$  (9.12)
From (9.10), we have

$x = 1 + 2z - yz.$  (9.13)

Substituting $x = 1 + 2z - yz$ in (9.11) and simplifying, we have

$(2 - y)z^2 + y - 2 = 0.$  (9.14)

Again, substituting $x = 1 + 2z - yz$ in (9.12) and simplifying, we have

$z(y^2 - 4y + 3) = 0, \quad \text{i.e. } z = 0 \text{ or } y = 3, 1.$

When z = 0, then from (9.13) and (9.14) we have x = 1, y = 2. Again, for y = 3, (9.14) gives $z = \pm 1$ and (9.13) gives x = 0, 2; and for y = 1, $z = \pm 1$ and x = 2, 0. Therefore, the solutions of (9.10) through (9.12) are

$(1, 2, 0), \quad (0, 3, 1), \quad (2, 3, -1), \quad (2, 1, 1), \quad (0, 1, -1).$

Now,

$f_{xx} = f_{yy} = f_{zz} = 2, \quad f_{xy} = f_{yx} = 2z, \quad f_{yz} = f_{zy} = 2x - 2, \quad f_{zx} = f_{xz} = 2y - 4.$

Therefore the Hessian matrix of f(x, y, z) is

$H = \begin{pmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{pmatrix} = \begin{pmatrix} 2 & 2z & 2y-4 \\ 2z & 2 & 2x-2 \\ 2y-4 & 2x-2 & 2 \end{pmatrix}.$

Now at (1, 2, 0),

$H = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$

The leading principal minors of H are

$H_1 = |2| = 2, \quad H_2 = \begin{vmatrix} 2 & 0 \\ 0 & 2 \end{vmatrix} = 4, \quad H_3 = |H| = 8.$

As all the leading principal minors are positive, the Hessian matrix at (1, 2, 0) is positive definite.
Hence, the given function f(x, y, z) has a minimum at (1, 2, 0), and the minimum value of f(x, y, z) is −5. In a similar way, test the optimality at the other points.

Example 8 Show that $f(x, y, z) = (x+y+z)^3 - 3(x+y+z) - 24xyz + a^3$ has a minimum at (1, 1, 1) and a maximum at (−1, −1, −1).
Solution Here $f(x, y, z) = (x+y+z)^3 - 3(x+y+z) - 24xyz + a^3$. Hence,

$f_x = 3(x+y+z)^2 - 3 - 24yz, \quad f_y = 3(x+y+z)^2 - 3 - 24zx, \quad f_z = 3(x+y+z)^2 - 3 - 24xy.$

Again,

$f_{xx} = f_{yy} = f_{zz} = 6(x+y+z),$

$f_{xy} = f_{yx} = 6(x+y+z) - 24z, \quad f_{yz} = f_{zy} = 6(x+y+z) - 24x, \quad f_{xz} = f_{zx} = 6(x+y+z) - 24y.$

At (1, 1, 1),

$f_x = f_y = f_z = 3 \cdot 3^2 - 3 - 24 = 0,$

and at (−1, −1, −1),

$f_x = f_y = f_z = 3(-3)^2 - 3 - 24 = 0.$

Hence, (1, 1, 1) and (−1, −1, −1) are extreme points of f(x, y, z). The Hessian matrix of f(x, y, z) is

$H = \begin{pmatrix} f_{xx} & f_{xy} & f_{xz} \\ f_{yx} & f_{yy} & f_{yz} \\ f_{zx} & f_{zy} & f_{zz} \end{pmatrix} = \begin{pmatrix} 6(x+y+z) & 6(x+y+z)-24z & 6(x+y+z)-24y \\ 6(x+y+z)-24z & 6(x+y+z) & 6(x+y+z)-24x \\ 6(x+y+z)-24y & 6(x+y+z)-24x & 6(x+y+z) \end{pmatrix}.$
At (1, 1, 1),

$H = \begin{pmatrix} 18 & -6 & -6 \\ -6 & 18 & -6 \\ -6 & -6 & 18 \end{pmatrix}.$

The leading principal minors are

$H_1 = |18| = 18, \quad H_2 = \begin{vmatrix} 18 & -6 \\ -6 & 18 \end{vmatrix} = 324 - 36 = 288,$

and

$H_3 = |H| = 18(324 - 36) + 6(-108 - 36) - 6(36 + 108) = 5184 - 864 - 864 = 3456.$

As all the leading principal minors are positive, the Hessian matrix H is positive definite at (1, 1, 1). Therefore, f(x, y, z) has a minimum at (1, 1, 1), and the minimum value of f(x, y, z) is $a^3 - 6$.

At (−1, −1, −1),

$H = \begin{pmatrix} -18 & 6 & 6 \\ 6 & -18 & 6 \\ 6 & 6 & -18 \end{pmatrix}.$

The leading principal minors are

$H_1 = |-18| = -18, \quad H_2 = \begin{vmatrix} -18 & 6 \\ 6 & -18 \end{vmatrix} = (-18)^2 - 6^2 = 288,$

and

$H_3 = |H| = -18(324 - 36) - 6(-108 - 36) + 6(36 + 108) = -5184 + 864 + 864 = -3456.$

Since the leading principal minors alternate in sign, starting negative, the Hessian matrix is negative definite at (−1, −1, −1). Hence, f(x, y, z) has a maximum at (−1, −1, −1).
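The definiteness claims of Example 8 can also be confirmed by computing the eigenvalues of H directly (a sketch assuming NumPy is installed):

```python
import numpy as np

def hessian(x, y, z):
    s = 6 * (x + y + z)
    return np.array([[s,        s - 24*z, s - 24*y],
                     [s - 24*z, s,        s - 24*x],
                     [s - 24*y, s - 24*x, s       ]])

for p in [(1, 1, 1), (-1, -1, -1)]:
    eigs = np.linalg.eigvalsh(hessian(*p))
    print(p, np.round(eigs, 6))
# At (1, 1, 1) all eigenvalues are positive (6, 24, 24): positive definite;
# at (-1, -1, -1) all are negative: negative definite.
```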
9.4 Exercises

1. Show that $(x-a)^{1/3}(2x-a)^{2/3}$ is maximum for $x = a/2$ and neither maximum nor minimum for x = a (a > 0).
2. Find the extreme points of the function $f(x_1, x_2) = 2x_1^3 + 3x_2^3 + 5x_1^2 + 4x_2^2 + 6$.
3. Find the extreme points of the function $f(x_1, x_2) = 8x_1 + 4x_2 + x_1 x_2 - x_1^2 - x_2^2$.
4. Examine the extreme values of $f(x, y, z) = 2xyz - 4zx - 2yz + x^2 + y^2 + z^2 - 2x - 4y + 4z$.
Chapter 10
Constrained Optimization with Equality Constraints

10.1 Objective

The objective of this chapter is to introduce the most important methods, namely direct substitution, constrained variation and the Lagrange multiplier method, for solving non-linear constrained optimization problems with equality constraints.

10.2 Introduction

The general form of a non-linear constrained optimization problem with equality constraints can be written as follows:

Optimize $z = f(x)$ subject to $g_i(x) = 0$, $i = 1, 2, \ldots, m$,

where $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, $f: \mathbb{R}^n \to \mathbb{R}$, $g_i: \mathbb{R}^n \to \mathbb{R}$ and $m \le n$. If m > n, the problem becomes overdetermined and in general has no solution. For solving this problem, there are several methods, e.g.

(i) Direct substitution method
(ii) Constrained variation method
(iii) Lagrange multiplier method.

We shall discuss these methods in detail.

© Springer Nature Singapore Pte Ltd. 2019
A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_10
10.3 Direct Substitution Method
For an optimization problem with n variables and m equality constraints, it is possible (theoretically) to express some set of m variables in terms of the remaining (n − m) variables. When these expressions are substituted into the original objective function, the reduced objective function involves only n − m variables. This reduced objective function is not subject to any constraint, so its optimum can be found by the unconstrained optimization techniques discussed in Chap. 9. Theoretically, the method of direct substitution is very simple. From a practical point of view, however, it is often inconvenient: in most practical problems the constraint equations are highly non-linear, and it becomes impossible to solve them so as to express m variables in terms of the remaining (n − m) variables.

Example 1 Minimize

$z = 9 - 8x_1 - 6x_2 - 4x_3 + 2x_1^2 + 2x_2^2 + x_3^2 + 2x_1 x_2 + 2x_1 x_3$

subject to

$x_1 + x_2 + 2x_3 = 3.$

Solution From the given constraint, we have $x_2 = 3 - x_1 - 2x_3$, and substituting this into the objective function, we have

$z = 9 - 8x_1 - 6(3 - x_1 - 2x_3) - 4x_3 + 2x_1^2 + 2(3 - x_1 - 2x_3)^2 + x_3^2 + 2x_1(3 - x_1 - 2x_3) + 2x_1 x_3$
$\phantom{z} = 2x_1^2 + 9x_3^2 + 6x_1 x_3 - 8x_1 - 16x_3 + 9.$

Now we optimize z with respect to the decision variables $x_1$ and $x_3$. The necessary conditions for optimality of z are $\partial z/\partial x_1 = 0$ and $\partial z/\partial x_3 = 0$, i.e.

$2x_1 + 3x_3 = 4$  (10.1)

$3x_1 + 9x_3 = 8.$  (10.2)

Solving (10.1) and (10.2), we have $x_1 = 4/3$, $x_3 = 4/9$, and then $x_2 = 3 - x_1 - 2x_3 = 3 - 4/3 - 8/9 = 7/9$. We have

$\frac{\partial^2 z}{\partial x_1^2} = 4, \quad \frac{\partial^2 z}{\partial x_3^2} = 18, \quad \frac{\partial^2 z}{\partial x_1 \partial x_3} = 6.$
Hence, the Hessian matrix is given by

$H = \begin{pmatrix} \partial^2 z/\partial x_1^2 & \partial^2 z/\partial x_1 \partial x_3 \\ \partial^2 z/\partial x_3 \partial x_1 & \partial^2 z/\partial x_3^2 \end{pmatrix} = \begin{pmatrix} 4 & 6 \\ 6 & 18 \end{pmatrix}.$

The leading principal minors are $H_1 = |4| = 4$ and $H_2 = \begin{vmatrix} 4 & 6 \\ 6 & 18 \end{vmatrix} = 36$. Since $H_1 > 0$ and $H_2 > 0$, z is minimum for $x_1 = 4/3$, $x_2 = 7/9$, $x_3 = 4/9$, and the minimum value of z is 1/9.

Example 2 Find the dimensions of the box of largest volume that can be inscribed in a sphere of unit radius.

Solution Let the origin O of the Cartesian coordinate system $OX_1X_2X_3$ be the centre of the sphere and the sides of the box be $2x_1$, $2x_2$ and $2x_3$. Obviously $x_1, x_2, x_3 > 0$. The volume of the box is given by

$f(x_1, x_2, x_3) = 8x_1 x_2 x_3.$

Since the box is inscribed in the sphere of unit radius, its corners lie on the sphere's surface, so $x_1, x_2, x_3$ have to satisfy the constraint $x_1^2 + x_2^2 + x_3^2 = 1$, the equation of the unit sphere centred at the origin. Hence, our problem is

Maximize $f(x_1, x_2, x_3) = 8x_1 x_2 x_3$ subject to $x_1^2 + x_2^2 + x_3^2 = 1$ and $x_1, x_2, x_3 > 0$.

This problem has three variables and one equality constraint, so one variable can be eliminated using the constraint: $x_3 = (1 - x_1^2 - x_2^2)^{1/2}$, as $x_3 > 0$. Substituting this expression for $x_3$ into $f(x_1, x_2, x_3)$, we get the reduced objective function
$f = 8x_1 x_2 (1 - x_1^2 - x_2^2)^{1/2},$

which can be maximized as an unconstrained function of two variables. The necessary conditions for optimality of f are $\partial f/\partial x_1 = 0$ and $\partial f/\partial x_2 = 0$. Now,

$\frac{\partial f}{\partial x_1} = 0 \text{ implies } 8x_2\left[(1 - x_1^2 - x_2^2)^{1/2} - \frac{x_1^2}{(1 - x_1^2 - x_2^2)^{1/2}}\right] = 0$  (10.3)

and

$\frac{\partial f}{\partial x_2} = 0 \text{ implies } 8x_1\left[(1 - x_1^2 - x_2^2)^{1/2} - \frac{x_2^2}{(1 - x_1^2 - x_2^2)^{1/2}}\right] = 0.$  (10.4)

After simplifying (10.3) and (10.4), we have, as $x_1, x_2 > 0$,

$1 - 2x_1^2 - x_2^2 = 0, \quad 1 - x_1^2 - 2x_2^2 = 0.$

Solving these equations, we have

$x_1 = x_2 = \frac{1}{\sqrt{3}}, \quad \text{hence } x_3 = (1 - x_1^2 - x_2^2)^{1/2} = \frac{1}{\sqrt{3}}.$

Computing the second-order partial derivatives of f and evaluating them at $x_1 = x_2 = 1/\sqrt{3}$ gives

$\frac{\partial^2 f}{\partial x_1^2} = -\frac{32}{\sqrt{3}}, \quad \frac{\partial^2 f}{\partial x_2^2} = -\frac{32}{\sqrt{3}}, \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = -\frac{16}{\sqrt{3}}.$
Hence, at $x_1 = x_2 = 1/\sqrt{3}$, the Hessian matrix H of f is given by

$H = \begin{pmatrix} -32/\sqrt{3} & -16/\sqrt{3} \\ -16/\sqrt{3} & -32/\sqrt{3} \end{pmatrix}.$

The leading principal minors are

$H_1 = -\frac{32}{\sqrt{3}} < 0 \quad \text{and} \quad H_2 = \frac{1024}{3} - \frac{256}{3} = \frac{768}{3} > 0.$

Since the leading principal minors alternate in sign, H is negative definite. Hence, f is maximum at $x_1 = x_2 = 1/\sqrt{3}$, and the maximum value of f is $8\left(\tfrac{1}{\sqrt{3}}\right)^3 = \frac{8}{3\sqrt{3}}$.
Hence, the maximum volume of the box is $\frac{8}{3\sqrt{3}}$, attained with dimensions $\frac{2}{\sqrt{3}} \times \frac{2}{\sqrt{3}} \times \frac{2}{\sqrt{3}}$.
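Example 2 can also be solved numerically with a general-purpose constrained solver. The sketch below (assuming SciPy is installed; the starting point is arbitrary) maximizes the volume by minimizing its negative under the equality constraint:

```python
import numpy as np
from scipy.optimize import minimize

volume = lambda x: 8 * x[0] * x[1] * x[2]
res = minimize(lambda x: -volume(x),            # maximize by minimizing -f
               x0=[0.5, 0.5, 0.5],              # arbitrary starting point
               bounds=[(0, 1)] * 3,
               constraints=[{'type': 'eq',
                             'fun': lambda x: np.dot(x, x) - 1}],
               method='SLSQP')
print(np.round(res.x, 5))       # each coordinate close to 1/sqrt(3) ~ 0.57735
print(round(volume(res.x), 5))  # close to 8/(3*sqrt(3)) ~ 1.5396
```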
10.4 Method of Constrained Variation
The basic idea of the method of constrained variation is to find a closed-form expression for the first-order differential of f(x) (i.e. df) at all points at which the constraints $g_j(x) = 0$, $j = 1, 2, \ldots, m$, are satisfied. Then, by setting df = 0, the desired optimum points are obtained. Before discussing the general method, we indicate its important features through the following simple problem with n = 2 and m = 1:

Optimize $f(x_1, x_2)$ subject to $g(x_1, x_2) = 0$.

Theorem 1 The necessary condition for the preceding problem to have an extremum at $(x_1^*, x_2^*)$ is

$\left(\frac{\partial f}{\partial x_1}\frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2}\frac{\partial g}{\partial x_1}\right)\bigg|_{(x_1^*, x_2^*)} = 0.$
Proof The necessary condition for f to have an extremum at some point $(x_1^*, x_2^*)$ is that the total differential of $f(x_1, x_2)$ must vanish at $(x_1^*, x_2^*)$,
i.e. df = 0, or

$\frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 = 0.$  (10.5)

Since $g(x_1^*, x_2^*) = 0$ at the extreme point, infinitesimal variations $dx_1$ and $dx_2$ about the point $(x_1^*, x_2^*)$ are called admissible variations provided that the new point lies on the constraint, i.e.

$g(x_1^* + dx_1, x_2^* + dx_2) = 0.$  (10.6)

The Taylor series expansion of $g(x_1^* + dx_1, x_2^* + dx_2)$ about the point $(x_1^*, x_2^*)$ gives

$g(x_1^*, x_2^*) + \frac{\partial g}{\partial x_1}\bigg|_{(x_1^*, x_2^*)}dx_1 + \frac{\partial g}{\partial x_2}\bigg|_{(x_1^*, x_2^*)}dx_2 = 0.$  (10.7)

Since $g(x_1^*, x_2^*) = 0$, Eq. (10.7) reduces to

$\frac{\partial g}{\partial x_1}dx_1 + \frac{\partial g}{\partial x_2}dx_2 = 0 \quad \text{at } (x_1^*, x_2^*).$  (10.8)

Thus, Eq. (10.8) has to be satisfied by all admissible variations. Now, assuming that $\partial g/\partial x_2 \ne 0$, Eq. (10.8) can be rewritten as

$dx_2 = -\frac{\partial g/\partial x_1}{\partial g/\partial x_2}dx_1 \quad \text{at } (x_1^*, x_2^*).$  (10.9)

This relation indicates that once the variation in $x_1$ (i.e. $dx_1$) is chosen arbitrarily, the variation in $x_2$ (i.e. $dx_2$) is determined automatically so that $dx_1$ and $dx_2$ form a set of admissible variations. Now, substituting the preceding expression for $dx_2$ into (10.5), we have

$\left(\frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2}\frac{\partial g/\partial x_1}{\partial g/\partial x_2}\right)\bigg|_{(x_1^*, x_2^*)}dx_1 = 0.$  (10.10)

The expression on the left-hand side of Eq. (10.10) is called the constrained variation of f. Since $dx_1$ is chosen arbitrarily, the coefficient of $dx_1$ in (10.10) must be zero,
i.e.

$\left(\frac{\partial f}{\partial x_1} - \frac{\partial f}{\partial x_2}\frac{\partial g/\partial x_1}{\partial g/\partial x_2}\right)\bigg|_{(x_1^*, x_2^*)} = 0, \quad \text{i.e. } \left(\frac{\partial f}{\partial x_1}\frac{\partial g}{\partial x_2} - \frac{\partial f}{\partial x_2}\frac{\partial g}{\partial x_1}\right)\bigg|_{(x_1^*, x_2^*)} = 0,$  (10.11)

which is the necessary condition for the given problem to have an extremum at $(x_1^*, x_2^*)$.
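Condition (10.11), together with the constraint itself, determines the candidate points. The sketch below (an illustrative problem, not from the text; assumes SymPy is installed) applies it to minimizing $x_1^2 + x_2^2$ on the line $x_1 + x_2 = 2$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
f = x1**2 + x2**2          # illustrative objective, not from the text
g = x1 + x2 - 2            # illustrative constraint g = 0

# Condition (10.11): f_x1 * g_x2 - f_x2 * g_x1 = 0, solved together with g = 0
cond = sp.diff(f, x1) * sp.diff(g, x2) - sp.diff(f, x2) * sp.diff(g, x1)
sol = sp.solve([cond, g], (x1, x2), dict=True)
print(sol)   # [{x1: 1, x2: 1}]
```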
10.4.1 Geometrical Interpretation of Admissible Variation

We now illustrate admissible variations geometrically in Fig. 10.1, where PQ indicates the curve at each point of which $g(x_1, x_2) = 0$ is satisfied. If A is taken as the point $(x_1^*, x_2^*)$, then the infinitesimal variations in $x_1$ and $x_2$ leading to the points B and C are called admissible variations: B and C lie on the same curve, so their coordinates satisfy $g(x_1, x_2) = 0$. On the other hand, the infinitesimal variations in $x_1$ and $x_2$ leading to D are not admissible, since the point D does not lie on the constraint curve $g(x_1, x_2) = 0$. Thus, any set of variations $(dx_1, dx_2)$ that does not satisfy Eq. (10.8) leads to points which do not satisfy the constraint $g(x_1, x_2) = 0$.
Fig. 10.1 Variation about A (the constraint curve PQ with points A, B, C on it and point D off it)

10.4.2 Necessary Conditions for an Extremum of a General Problem in the Constrained Variation Method

Let the function to be optimized be $f(x_1, x_2, \ldots, x_n)$ subject to the equality constraints
$g_i(x_1, x_2, \ldots, x_n) = 0, \quad i = 1, 2, \ldots, m.$  (10.12)

Let $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ be an extreme point and $(dx_1, dx_2, \ldots, dx_n)$ be infinitesimal admissible variations about $x^*$. Then $g_i(x_1^*, \ldots, x_n^*) = 0$, $i = 1, 2, \ldots, m$, and

$g_i(x_1^* + dx_1, \ldots, x_n^* + dx_n) \simeq g_i(x_1^*, \ldots, x_n^*) + \sum_{j=1}^{n} \frac{\partial g_i}{\partial x_j}\bigg|_{x^*} dx_j.$

Since the variations are admissible, $g_i(x_1^* + dx_1, \ldots, x_n^* + dx_n) = 0$, and since $g_i(x^*) = 0$, it follows that

$\sum_{j=1}^{n} \frac{\partial g_i}{\partial x_j}\bigg|_{x^*} dx_j = 0, \quad i = 1, 2, \ldots, m.$  (10.13)

There are m equations in (10.13). These can be solved to express any m variations, say the first m, in terms of the remaining variations. Writing $\partial g_i/\partial x_j|_{x^*}$ simply as $\partial g_i/\partial x_j$, we get

$\sum_{j=1}^{m} \frac{\partial g_i}{\partial x_j} dx_j = -\sum_{k=m+1}^{n} \frac{\partial g_i}{\partial x_k} dx_k = h_i \text{ (say)}, \quad i = 1, 2, \ldots, m.$

By Cramer's rule, $dx_1 = \Delta_1/\Delta$, where

$\Delta_1 = \begin{vmatrix} h_1 & \partial g_1/\partial x_2 & \cdots & \partial g_1/\partial x_m \\ h_2 & \partial g_2/\partial x_2 & \cdots & \partial g_2/\partial x_m \\ \vdots & & & \vdots \\ h_m & \partial g_m/\partial x_2 & \cdots & \partial g_m/\partial x_m \end{vmatrix} \quad \text{and} \quad \Delta = \begin{vmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_2 & \cdots & \partial g_1/\partial x_m \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_2 & \cdots & \partial g_2/\partial x_m \\ \vdots & & & \vdots \\ \partial g_m/\partial x_1 & \partial g_m/\partial x_2 & \cdots & \partial g_m/\partial x_m \end{vmatrix}.$
Writing $J\!\left(\frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m}\right)$ for the Jacobian determinant $\Delta$, and assuming $J\!\left(\frac{g_1, \ldots, g_m}{x_1, \ldots, x_m}\right) \ne 0$, expansion of $\Delta_1$ along its first column gives

$dx_1 = -\frac{\displaystyle\sum_{k=m+1}^{n} J\!\left(\frac{g_1, g_2, \ldots, g_m}{x_k, x_2, \ldots, x_m}\right) dx_k}{J\!\left(\frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m}\right)}.$

In general,

$dx_a = -\frac{\displaystyle\sum_{k=m+1}^{n} J\!\left(\frac{g_1, \ldots, g_{a-1}, g_a, g_{a+1}, \ldots, g_m}{x_1, \ldots, x_{a-1}, x_k, x_{a+1}, \ldots, x_m}\right) dx_k}{J\!\left(\frac{g_1, g_2, \ldots, g_m}{x_1, x_2, \ldots, x_m}\right)}, \quad a = 1, 2, \ldots, m.$

At the extreme point $x^*$ we have df = 0, i.e.

$\frac{\partial f}{\partial x_1}dx_1 + \frac{\partial f}{\partial x_2}dx_2 + \cdots + \frac{\partial f}{\partial x_n}dx_n = 0, \quad \text{or} \quad \sum_{a=1}^{m} \frac{\partial f}{\partial x_a}dx_a + \sum_{k=m+1}^{n} \frac{\partial f}{\partial x_k}dx_k = 0,$

where $\partial f/\partial x_j$ denotes $\partial f/\partial x_j|_{x^*}$. Substituting the expressions for $dx_a$, $a = 1, 2, \ldots, m$, we obtain

$\sum_{k=m+1}^{n} \left[ -\sum_{a=1}^{m} \frac{\partial f}{\partial x_a} \frac{J\!\left(\frac{g_1, \ldots, g_{a-1}, g_a, g_{a+1}, \ldots, g_m}{x_1, \ldots, x_{a-1}, x_k, x_{a+1}, \ldots, x_m}\right)}{J\!\left(\frac{g_1, \ldots, g_m}{x_1, \ldots, x_m}\right)} + \frac{\partial f}{\partial x_k} \right] dx_k = 0.$

Since $dx_{m+1}, dx_{m+2}, \ldots, dx_n$ are all independent variations, the coefficient of each variation must be zero. Thus, we have
$-\sum_{a=1}^{m} \frac{\partial f}{\partial x_a} J\!\left(\frac{g_1, \ldots, g_{a-1}, g_a, g_{a+1}, \ldots, g_m}{x_1, \ldots, x_{a-1}, x_k, x_{a+1}, \ldots, x_m}\right) + \frac{\partial f}{\partial x_k} J\!\left(\frac{g_1, \ldots, g_m}{x_1, \ldots, x_m}\right) = 0, \quad k = m+1, \ldots, n.$  (10.14)

By moving the column $\left[\partial g_1/\partial x_k \;\; \partial g_2/\partial x_k \;\; \cdots \;\; \partial g_m/\partial x_k\right]^T$ in each determinant of Eq. (10.14) to the first column and absorbing the resulting signs, condition (10.14) can be written compactly as the single bordered determinant

$\begin{vmatrix} \partial f/\partial x_k & \partial f/\partial x_1 & \cdots & \partial f/\partial x_m \\ \partial g_1/\partial x_k & \partial g_1/\partial x_1 & \cdots & \partial g_1/\partial x_m \\ \vdots & & & \vdots \\ \partial g_m/\partial x_k & \partial g_m/\partial x_1 & \cdots & \partial g_m/\partial x_m \end{vmatrix} = 0, \quad \text{i.e. } J\!\left(\frac{f, g_1, g_2, \ldots, g_m}{x_k, x_1, x_2, \ldots, x_m}\right) = 0,$  (10.15)

where $k = m+1, m+2, \ldots, n$. Thus, the (n − m) equations represented by (10.15) give the necessary conditions for an extremum of $f(x_1, x_2, \ldots, x_n)$ under the m equality constraints $g_i(x_1, x_2, \ldots, x_n) = 0$, $i = 1, 2, \ldots, m$.
10.4.3 Sufficient Conditions for a General Problem in the Constrained Variation Method

For an optimization problem with n variables and m equality constraints, m variables are dependent and the other n − m variables are independent. Let us assume that $x_1, x_2, \ldots, x_m$ are the dependent variables and $x_{m+1}, x_{m+2}, \ldots, x_n$ are independent.
Then the Taylor series expansion of f, in terms of the independent variables, about the extreme point $x^*$ gives

$f(x^* + dx) = f(x^*) + \sum_{j=m+1}^{n} \left(\frac{\partial f}{\partial x_j}\right)_g dx_j + \frac{1}{2!}\sum_{j=m+1}^{n}\sum_{k=m+1}^{n} \left(\frac{\partial^2 f}{\partial x_j \partial x_k}\right)_g dx_j dx_k,$  (10.16)

where $\left(\partial f/\partial x_j\right)_g$ denotes the partial derivative of f with respect to $x_j$ (holding the other independent variables $x_{m+1}, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n$ constant) when $x_1, x_2, \ldots, x_m$ are allowed to change so that the constraints $g_i(x^* + dx) = 0$, $i = 1, 2, \ldots, m$, remain satisfied; the second-order derivative $\left(\partial^2 f/\partial x_j \partial x_k\right)_g$ has the analogous meaning.

Since all the variations $dx_j$ appearing in (10.16) are independent, $\left(\partial f/\partial x_j\right)_g$ has to be zero for j = m + 1, m + 2, …, n. Thus, the necessary conditions for the existence of a constrained optimum at $x^*$ can also be expressed as

$\left(\frac{\partial f}{\partial x_j}\right)_g = 0, \quad j = m+1, m+2, \ldots, n.$  (10.17)

Hence, from (10.16), the sufficient condition for $x^*$ to be a local optimum is that the quadratic form Q defined by

$Q = \sum_{j=m+1}^{n}\sum_{k=m+1}^{n} \left(\frac{\partial^2 f}{\partial x_j \partial x_k}\right)_g dx_j dx_k$

be positive (respectively negative) for all non-vanishing variations $dx_j$; that is, the matrix

$\begin{pmatrix} \left(\partial^2 f/\partial x_{m+1}^2\right)_g & \cdots & \left(\partial^2 f/\partial x_{m+1}\partial x_n\right)_g \\ \vdots & & \vdots \\ \left(\partial^2 f/\partial x_n \partial x_{m+1}\right)_g & \cdots & \left(\partial^2 f/\partial x_n^2\right)_g \end{pmatrix}$

is positive (respectively negative) definite.

Remark After a little manipulation, it can be shown that Eq. (10.17) and $J\!\left(\frac{f, g_1, g_2, \ldots, g_m}{x_k, x_1, x_2, \ldots, x_m}\right) = 0$ are the same.
Disadvantages of the method In this method, the computation of the constrained second-order partial derivatives $\left(\partial^2 f/\partial x_j \partial x_k\right)_g$ is a formidable task for a problem with more than three constraints. Moreover, the necessary conditions require the evaluation of (n − m) determinants of order m + 1, which is also a complicated task.

Example 3 Using the method of constrained variation,

Minimize $f(x_1, x_2, x_3) = x_1^2 + 2x_2^2 + x_3^2$

subject to

$2x_1 + 4x_2 + 3x_3 = 9$
$4x_1 + 8x_2 + 5x_3 = 17.$

Solution The given problem is

$f = x_1^2 + 2x_2^2 + x_3^2$  (10.18)

subject to

$g_1 = 2x_1 + 4x_2 + 3x_3 - 9 = 0$  (10.19)

$g_2 = 4x_1 + 8x_2 + 5x_3 - 17 = 0.$  (10.20)

In this problem, m = number of constraints = 2 and n = number of variables = 3. Hence (n − m) = 1 variable is independent and the other two are dependent. Let us first choose $x_1, x_2$ as dependent variables. Then

$J\!\left(\frac{g_1, g_2}{x_1, x_2}\right) = \begin{vmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_2 \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_2 \end{vmatrix} = \begin{vmatrix} 2 & 4 \\ 4 & 8 \end{vmatrix} = 0,$

so $x_1$ and $x_2$ cannot be chosen as dependent variables. Now consider $x_1$ and $x_3$ as dependent variables:

$J\!\left(\frac{g_1, g_2}{x_1, x_3}\right) = \begin{vmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_3 \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_3 \end{vmatrix} = \begin{vmatrix} 2 & 3 \\ 4 & 5 \end{vmatrix} = -2 \ne 0.$

Thus $x_1, x_3$ can be chosen as dependent variables. The necessary condition for optimality of f is given by

$J\!\left(\frac{f, g_1, g_2}{x_2, x_1, x_3}\right) = 0,$
i.e.

$\begin{vmatrix} \partial f/\partial x_2 & \partial f/\partial x_1 & \partial f/\partial x_3 \\ \partial g_1/\partial x_2 & \partial g_1/\partial x_1 & \partial g_1/\partial x_3 \\ \partial g_2/\partial x_2 & \partial g_2/\partial x_1 & \partial g_2/\partial x_3 \end{vmatrix} = \begin{vmatrix} 4x_2 & 2x_1 & 2x_3 \\ 4 & 2 & 3 \\ 8 & 4 & 5 \end{vmatrix} = 0,$

or $-8x_2 + 8x_1 = 0$, i.e.

$x_1 = x_2.$  (10.21)

Using (10.21) in (10.19) and (10.20), we have

$6x_2 + 3x_3 = 9, \quad 12x_2 + 5x_3 = 17.$

Solving, we get $x_2 = 1$, $x_3 = 1$, and hence $x_1 = x_2 = x_3 = 1$.
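The Jacobian condition and the two constraints of Example 3 can be assembled and solved symbolically (a sketch assuming SymPy is installed):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
f = x1**2 + 2*x2**2 + x3**2
g1 = 2*x1 + 4*x2 + 3*x3 - 9
g2 = 4*x1 + 8*x2 + 5*x3 - 17

# Necessary condition: J(f, g1, g2 / x2, x1, x3) = 0, together with g1 = g2 = 0
J = sp.Matrix([[sp.diff(h, v) for v in (x2, x1, x3)] for h in (f, g1, g2)])
sol = sp.solve([J.det(), g1, g2], (x1, x2, x3), dict=True)
print(sol)   # [{x1: 1, x2: 1, x3: 1}]
```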
So the required optimum solution is $x_1 = x_2 = x_3 = 1$ with $f_{\min} = 4$.

Example 4 Find the point at which $f(x) = 2x_1^2 - x_2^2 + 4x_3^2 + x_4^2$ is optimum subject to

$g_1(x) = x_1 + x_2 - x_3 + x_4 - 10 = 0$  (10.22)

$g_2(x) = x_1 + x_2 + 2x_3 + 3x_4 - 12 = 0.$  (10.23)

Solution Here m = number of constraints = 2 and n = number of variables = 4, so (n − m) = 2 variables are independent and the other two are dependent. We first take $x_1$ and $x_2$ as dependent variables, depending on $x_3, x_4$:

$J\!\left(\frac{g_1, g_2}{x_1, x_2}\right) = \begin{vmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_2 \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_2 \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix} = 0,$

so $x_1, x_2$ cannot be chosen as dependent variables. Again,

$J\!\left(\frac{g_1, g_2}{x_1, x_3}\right) = \begin{vmatrix} \partial g_1/\partial x_1 & \partial g_1/\partial x_3 \\ \partial g_2/\partial x_1 & \partial g_2/\partial x_3 \end{vmatrix} = \begin{vmatrix} 1 & -1 \\ 1 & 2 \end{vmatrix} = 3 \ne 0,$
so $x_1, x_3$ can be chosen as dependent variables. The necessary conditions for optimality are

$J\!\left(\frac{f, g_1, g_2}{x_2, x_1, x_3}\right) = 0 \quad \text{and} \quad J\!\left(\frac{f, g_1, g_2}{x_4, x_1, x_3}\right) = 0,$

i.e.

$\begin{vmatrix} -2x_2 & 4x_1 & 8x_3 \\ 1 & 1 & -1 \\ 1 & 1 & 2 \end{vmatrix} = 0 \quad \text{and} \quad \begin{vmatrix} 2x_4 & 4x_1 & 8x_3 \\ 1 & 1 & -1 \\ 3 & 1 & 2 \end{vmatrix} = 0.$

Hence

$-6x_2 - 12x_1 = 0, \quad \text{which implies } x_2 = -2x_1,$  (10.24)

and

$6x_4 - 20x_1 - 16x_3 = 0, \quad \text{which implies } 3x_4 - 10x_1 - 8x_3 = 0.$  (10.25)

Now from (10.22) and (10.23), using (10.24), we have

$-x_1 - x_3 + x_4 = 10$  (10.26)

and

$-x_1 + 2x_3 + 3x_4 = 12.$  (10.27)

From (10.26) and (10.27), we have

$3x_3 + 2x_4 = 2,$  (10.28)

and from (10.25) and (10.26), we have

$2x_3 - 7x_4 = -100.$  (10.29)

Solving (10.28) and (10.29), we have $x_3 = -\frac{186}{25}$, $x_4 = \frac{304}{25}$. Then

$x_1 = x_4 - x_3 - 10 = \frac{48}{5} \quad \text{and} \quad x_2 = -2x_1 = -\frac{96}{5}.$

So at the point $x_1 = \frac{48}{5}$, $x_2 = -\frac{96}{5}$, $x_3 = -\frac{186}{25}$, $x_4 = \frac{304}{25}$, the given function f(x) is optimum.
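The same symbolic check works for Example 4 with its two Jacobian conditions (a sketch assuming SymPy is installed):

```python
import sympy as sp

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4', real=True)
f = 2*x1**2 - x2**2 + 4*x3**2 + x4**2
g1 = x1 + x2 - x3 + x4 - 10
g2 = x1 + x2 + 2*x3 + 3*x4 - 12

# Conditions J(f, g1, g2 / x2, x1, x3) = 0 and J(f, g1, g2 / x4, x1, x3) = 0
J = lambda vs: sp.Matrix([[sp.diff(h, v) for v in vs] for h in (f, g1, g2)])
eqs = [J((x2, x1, x3)).det(), J((x4, x1, x3)).det(), g1, g2]
sol = sp.solve(eqs, (x1, x2, x3, x4), dict=True)
print(sol)   # [{x1: 48/5, x2: -96/5, x3: -186/25, x4: 304/25}]
```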
10.5 Lagrange Multiplier Method
In the area of optimization, the method of Lagrange multipliers (named after Joseph-Louis Lagrange) provides a strategy for finding the optimum of a function subject to given equality constraints. It is a very useful technique in multivariate calculus, where finding the optimum of a function is one of the most common problems and a closed-form solution of a constrained problem is often difficult to obtain. The method of Lagrange multipliers is a well-known, systematic process for generating the necessary conditions for a stationary point of a non-linear constrained optimization problem. Let us consider the following optimization problem:

Optimize $z = f(x)$ subject to $g_i(x) = 0$, $i = 1, 2, \ldots, m$,

where $x \in \mathbb{R}^n$, $f: \mathbb{R}^n \to \mathbb{R}$, $g_i: \mathbb{R}^n \to \mathbb{R}$ and $m \le n$. Now we introduce a new variable for each of the given constraints. These new variables $\lambda_1, \lambda_2, \ldots, \lambda_m$ are known as Lagrange multipliers. Using these multipliers, Lagrange defined a function, now called the Lagrange function, denoted by $L(x_1, \ldots, x_n; \lambda_1, \ldots, \lambda_m)$ and defined by

$L(x_1, \ldots, x_n; \lambda_1, \ldots, \lambda_m) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)$

(each $\lambda_i$ may be positive or negative). Since the problem has n variables and m equality constraints, m additional variables are added, so that the Lagrange function has n + m variables.

Stationary point The points where the partial derivatives of the Lagrange function L are zero are called stationary points. If $(x_1^*, \ldots, x_n^*)$ is an optimum point of the original constrained optimization problem, then there exists a set of real numbers $(\lambda_1^*, \ldots, \lambda_m^*)$ such that $(x_1^*, \ldots, x_n^*; \lambda_1^*, \ldots, \lambda_m^*)$ is a stationary point of the Lagrange function. However, not all stationary points yield a solution of the original problem. Thus, the method of Lagrange multipliers yields necessary conditions for optimality in constrained optimization problems. We shall now prove the theorem regarding the necessary conditions for optimality.
10.5.1 Necessary Conditions for Optimality

Theorem 2 The necessary condition for a function f(x) subject to the constraints $g_i(x) = 0$, $i = 1, 2, \ldots, m$, to have a local optimum at a point $x^*$ is that the partial derivatives of the Lagrange function $L = L(x_1, \ldots, x_n; \lambda_1, \ldots, \lambda_m) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x)$ with respect to each of its arguments vanish at $x^*$.

Proof Since $x^*$ is a local extreme point, at $x^*$ we have df = 0, i.e.

$\sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\bigg|_{x=x^*} dx_j = 0,$  (10.30)

where $dx_j$ (j = 1, 2, …, n) are admissible variations about the point $x^*$. Since $x^*$ is an extreme point and the $dx_j$ are admissible variations about it,

$g_i(x^*) = 0 \quad \text{and} \quad g_i(x^* + dx) = 0.$  (10.31)

From (10.31), we have

$g_i(x^* + dx) = 0, \quad \text{or} \quad g_i(x^*) + \sum_{j=1}^{n} \frac{\partial g_i}{\partial x_j}\bigg|_{x=x^*} dx_j = 0, \quad \text{or} \quad \sum_{j=1}^{n} \frac{\partial g_i}{\partial x_j}\bigg|_{x=x^*} dx_j = 0, \quad i = 1, 2, \ldots, m.$  (10.32)

Multiplying each equation in (10.32) by a constant $\lambda_i$ (i = 1, 2, …, m), we have

$\lambda_i \sum_{j=1}^{n} \frac{\partial g_i}{\partial x_j}\bigg|_{x=x^*} dx_j = 0, \quad i = 1, 2, \ldots, m.$  (10.33)
From (10.30) and (10.33), we have
$$\sum_{j=1}^{n} \left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} dx_j + \sum_{i=1}^{m} \lambda_i \sum_{j=1}^{n} \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*} dx_j = 0,$$
i.e.
$$\sum_{j=1}^{n} \left[\left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*}\right] dx_j = 0. \qquad (10.34)$$
Since there are m constraints, only n − m of the variations can be chosen independently. Let $x_1, \ldots, x_m$ be the set of dependent variables and $x_{m+1}, \ldots, x_n$ the set of independent variables. We choose the values of $\lambda_i$ ($i = 1, \ldots, m$) in such a way that the coefficients of the first m variations $dx_j$ ($j = 1, \ldots, m$) become zero. Thus, the $\lambda_i$ are defined by
$$\left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*} = 0 \quad \text{for } j = 1, 2, \ldots, m. \qquad (10.35)$$
Using (10.35), from (10.34) we have
$$\sum_{j=m+1}^{n} \left[\left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*}\right] dx_j = 0.$$
Since $dx_{m+1}, dx_{m+2}, \ldots, dx_n$ are all independent, each coefficient must vanish:
$$\left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*} = 0 \quad \text{for } j = m+1, m+2, \ldots, n. \qquad (10.36)$$
From (10.35) and (10.36), we have
$$\left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*} = 0 \quad \text{for } j = 1, 2, \ldots, n. \qquad (10.37)$$
Also,
$$g_i(x^*) = 0 \quad \text{for } i = 1, 2, \ldots, m. \qquad (10.38)$$
Now
$$L = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x).$$
Therefore,
$$\left.\frac{\partial L}{\partial x_j}\right|_{x=x^*} = \left.\frac{\partial f}{\partial x_j}\right|_{x=x^*} + \sum_{i=1}^{m} \lambda_i \left.\frac{\partial g_i}{\partial x_j}\right|_{x=x^*} = 0 \quad \text{for } j = 1, 2, \ldots, n \quad [\text{using } (10.37)]$$
and
$$\left.\frac{\partial L}{\partial \lambda_k}\right|_{x=x^*} = g_k(x^*) = 0 \quad \text{for } k = 1, 2, \ldots, m \quad [\text{using } (10.38)].$$
These are the necessary conditions.
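These stationarity conditions are easy to check numerically. The following Python sketch (an illustration, not part of the formal development) verifies $\partial L/\partial x_j = 0$ and $\partial L/\partial \lambda = 0$ by central finite differences on the problem worked out as Example 5 later in this section; the function names and step size h are choices of this illustration.

```python
# Numerical check of the necessary conditions dL/dx_j = 0 and dL/dlambda = 0,
# illustrated on the problem of Example 5 below:
#   minimize f(x) = x1^2 - 10*x1 + x2^2 - 6*x2 + x3^2 - 4*x3
#   subject to g(x) = x1 + x2 + x3 - 7 = 0,
# whose stationary point is x* = (4, 2, 1) with multiplier lambda* = 2.

def f(x):
    x1, x2, x3 = x
    return x1**2 - 10*x1 + x2**2 - 6*x2 + x3**2 - 4*x3

def g(x):
    return x[0] + x[1] + x[2] - 7

def lagrangian(x, lam):
    return f(x) + lam * g(x)

def grad_L(x, lam, h=1e-6):
    """Central-difference gradient of L with respect to (x1, x2, x3, lambda)."""
    grads = []
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        grads.append((lagrangian(xp, lam) - lagrangian(xm, lam)) / (2 * h))
    grads.append((lagrangian(x, lam + h) - lagrangian(x, lam - h)) / (2 * h))
    return grads

x_star, lam_star = (4.0, 2.0, 1.0), 2.0
print([round(gj, 6) for gj in grad_L(x_star, lam_star)])  # all ~0
print(f(x_star))  # -35.0
```

All four components of the gradient vanish (up to rounding), confirming that $(x^*, \lambda^*)$ is a stationary point of L.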
10.5.2 Sufficient Conditions for Optimality

Theorem 3 A sufficient condition for f(x) to have a local minimum (maximum) subject to the constraints $g_i(x) = 0$, $i = 1, 2, \ldots, m$, at x* is that the partial derivatives of the Lagrange function $L(x_1, \ldots, x_n; \lambda_1, \ldots, \lambda_m)$ with respect to each of its arguments vanish at x*, and that the quadratic form
$$Q = \sum_{j=1}^{n}\sum_{k=1}^{n} \left.\frac{\partial^2 L}{\partial x_j \partial x_k}\right|_{x=x^*} dx_j\, dx_k$$
is positive (negative) for all choices of the admissible variations $dx_j$ ($j = 1, 2, \ldots, n$).

Proof The Lagrange function is defined as
$$L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x).$$
Let dx be the vector of admissible variations about the point x*. Then the point x* + dx must satisfy the given constraints:
$$g_i(x^* + dx) = 0. \qquad (10.39)$$
Also,
$$\left.\frac{\partial L}{\partial \lambda_i}\right|_{x^*} = g_i(x^*) = 0. \qquad (10.40)$$
Therefore,
$$L(x^* + dx, \lambda) - L(x^*, \lambda) = f(x^* + dx) + \sum_{i=1}^{m} \lambda_i g_i(x^* + dx) - f(x^*) - \sum_{i=1}^{m} \lambda_i g_i(x^*),$$
so, using (10.39) and (10.40),
$$L(x^* + dx, \lambda) - L(x^*, \lambda) = f(x^* + dx) - f(x^*). \qquad (10.41)$$
The Taylor series expansion of $L(x^* + dx, \lambda)$ about the point x* (keeping all the components of $\lambda$ fixed) gives
$$L(x^* + dx, \lambda) = L(x^*, \lambda) + \sum_{j=1}^{n} \left.\frac{\partial L}{\partial x_j}\right|_{x=x^*} dx_j + \frac{1}{2!}\sum_{j=1}^{n}\sum_{k=1}^{n} \left.\frac{\partial^2 L}{\partial x_j \partial x_k}\right|_{x=x^* + \theta\, dx} dx_j\, dx_k, \quad 0 < \theta < 1.$$
Since $\left.\partial L/\partial x_j\right|_{x=x^*} = 0$, this becomes
$$L(x^* + dx, \lambda) - L(x^*, \lambda) = \frac{1}{2!}\sum_{j=1}^{n}\sum_{k=1}^{n} \left.\frac{\partial^2 L}{\partial x_j \partial x_k}\right|_{x=x^* + \theta\, dx} dx_j\, dx_k. \qquad (10.42)$$
If it is assumed that the second-order partial derivatives $\partial^2 L/\partial x_i \partial x_j$ are continuous in a neighbourhood of x*, then the sign of $L(x^* + dx, \lambda) - L(x^*, \lambda)$ is the same as that of
$$\sum_{j=1}^{n}\sum_{k=1}^{n} \left.\frac{\partial^2 L}{\partial x_j \partial x_k}\right|_{x=x^*} dx_j\, dx_k.$$
Therefore, from (10.41) and (10.42), it is clear that the signs of $f(x^* + dx) - f(x^*)$ and $Q = \sum_{j=1}^{n}\sum_{k=1}^{n} \left.\frac{\partial^2 L}{\partial x_j \partial x_k}\right|_{x=x^*} dx_j\, dx_k$ are the same for all admissible variations.
If Q > 0 for all admissible variations, then $f(x^* + dx) - f(x^*) > 0$, i.e. $f(x^* + dx) > f(x^*)$; hence x* is a local minimum. If Q < 0 for all admissible variations, then $f(x^* + dx) < f(x^*)$ for all admissible variations; hence x* is a local maximum.
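The practical test derived from this result, the bordered Hessian of the next subsection, is straightforward to automate. The following Python sketch (illustrative only; the hand-rolled determinant and the sample data, taken from Example 5 below, are assumptions of this illustration) evaluates the required leading principal minors.

```python
# Sketch of the bordered-Hessian test of Sect. 10.5.3 on an assumed sample
# problem (Example 5 below): n = 3 variables, m = 1 constraint.
# H_B = [[O, U], [U^T, V]] with U = [dg_i/dx_j] (m x n) and
# V = [d^2L/dx_i dx_j] (n x n).

def det(M):
    """Determinant by Laplace expansion along the first row (small matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

# Bordered Hessian for f = x1^2-10x1 + x2^2-6x2 + x3^2-4x3, g = x1+x2+x3-7:
# V = 2*I (second derivatives of L), U = [1, 1, 1].
HB = [
    [0, 1, 1, 1],
    [1, 2, 0, 0],
    [1, 0, 2, 0],
    [1, 0, 0, 2],
]

m, n = 1, 3
# Evaluate the last n - m leading principal minors, starting at order 2m + 1 = 3.
minors = [det([row[:k] for row in HB[:k]]) for k in range(2*m + 1, m + n + 1)]
print(minors)  # [-4.0, -12.0]
```

Both minors carry the sign of $(-1)^m = -1$, so the stationary point is a local minimum, matching the hand computation in Example 5.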
10.5.3 Alternative Approach for Testing Optimality

Testing the optimality of a constrained optimization problem by the above-described sufficient conditions is a formidable task. Therefore, the following two methods are also used to test optimality.

Method-1: In this method, a square matrix $H_B$ of order $(m+n) \times (m+n)$, called a bordered Hessian matrix, is evaluated at the stationary points. This matrix is arranged as
$$H_B = \begin{bmatrix} O & U \\ U^T & V \end{bmatrix},$$
where O is an $m \times m$ null matrix, $U = \left[\dfrac{\partial g_i}{\partial x_j}\right]_{m \times n}$ and $V = \left[\dfrac{\partial^2 L}{\partial x_i \partial x_j}\right]_{n \times n}$.

Then the sufficient conditions for maximum and minimum stationary points are given as follows. Let $(x^*, \lambda^*)$ be a stationary point of the Lagrangian function and $H_B^*$ the value of the bordered Hessian matrix computed at this stationary point. Then:
(i) x* is a local maximum if, starting with the leading principal minor of order (2m + 1), the last (n − m) leading principal minors of $H_B^*$ alternate in sign, the first having the sign of $(-1)^{m+1}$.
(ii) x* is a local minimum if, starting with the leading principal minor of order (2m + 1), the last (n − m) leading principal minors of $H_B^*$ all have the sign of $(-1)^m$.

Method-2: In this method, optimality is tested by evaluating the roots of the polynomial in z defined by the following Hancock determinantal equation:
$$\begin{vmatrix}
L_{11}-z & L_{12} & \cdots & L_{1n} & g_{11} & g_{21} & \cdots & g_{m1} \\
L_{21} & L_{22}-z & \cdots & L_{2n} & g_{12} & g_{22} & \cdots & g_{m2} \\
\vdots & & & \vdots & \vdots & & & \vdots \\
L_{n1} & L_{n2} & \cdots & L_{nn}-z & g_{1n} & g_{2n} & \cdots & g_{mn} \\
g_{11} & g_{12} & \cdots & g_{1n} & 0 & 0 & \cdots & 0 \\
\vdots & & & \vdots & \vdots & & & \vdots \\
g_{m1} & g_{m2} & \cdots & g_{mn} & 0 & 0 & \cdots & 0
\end{vmatrix} = 0 \quad \text{at } x = x^*,\ \lambda = \lambda^*,$$
where $L_{jk} = \dfrac{\partial^2 L}{\partial x_j \partial x_k}(x^*, \lambda^*)$ and $g_{ij} = \dfrac{\partial g_i(x^*)}{\partial x_j}$.
This equation leads to an (n − m)th-order polynomial in z. Then the sufficient conditions for a minimum and a maximum are, respectively:
(i) all roots of the polynomial are positive;
(ii) all roots of the polynomial are negative.
If some of the roots of this polynomial are positive while the others are negative, the point x* is not an extreme point.

Remark It may be observed that the conditions in both methods are only sufficient for identifying an extreme point, not necessary. That is, a stationary point may be an extreme point without satisfying the conditions described in Methods 1 and 2.

Example 5 Solve the following problem by the Lagrange multiplier method:

Minimize $z = x_1^2 - 10x_1 + x_2^2 - 6x_2 + x_3^2 - 4x_3$
subject to $x_1 + x_2 + x_3 = 7$.

Solution The given problem is

Minimize $z = x_1^2 - 10x_1 + x_2^2 - 6x_2 + x_3^2 - 4x_3$
subject to $g(x) = x_1 + x_2 + x_3 - 7 = 0$.
Now the Lagrange function is
$$L(x_1, x_2, x_3, \lambda) = x_1^2 - 10x_1 + x_2^2 - 6x_2 + x_3^2 - 4x_3 + \lambda(x_1 + x_2 + x_3 - 7).$$
The necessary conditions for the minimum of z are given by
$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial x_3} = 0, \quad \frac{\partial L}{\partial \lambda} = 0.$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ gives } 2x_1 - 10 + \lambda = 0, \text{ i.e. } x_1 = \frac{10 - \lambda}{2} \qquad (10.43)$$
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } 2x_2 - 6 + \lambda = 0, \text{ i.e. } x_2 = \frac{6 - \lambda}{2} \qquad (10.44)$$
$$\frac{\partial L}{\partial x_3} = 0 \text{ gives } 2x_3 - 4 + \lambda = 0, \text{ i.e. } x_3 = \frac{4 - \lambda}{2} \qquad (10.45)$$
$$\frac{\partial L}{\partial \lambda} = 0 \text{ gives } x_1 + x_2 + x_3 - 7 = 0, \text{ i.e. } x_1 + x_2 + x_3 = 7. \qquad (10.46)$$
From (10.43), (10.44), (10.45) and (10.46), we have
$$\frac{20 - 3\lambda}{2} = 7, \text{ i.e. } \lambda = 2.$$
Therefore $x_1 = 4$, $x_2 = 2$, $x_3 = 1$. Now
$$\frac{\partial^2 L}{\partial x_i^2} = 2 \text{ for } i = 1, 2, 3, \qquad \frac{\partial^2 L}{\partial x_i \partial x_j} = 0 \text{ for } i \ne j, \qquad \frac{\partial g}{\partial x_i} = 1 \text{ for } i = 1, 2, 3.$$
Hence, the bordered Hessian matrix at $x_1 = 4$, $x_2 = 2$, $x_3 = 1$ is given by
$$H_B = \begin{bmatrix} O & U \\ U^T & V \end{bmatrix}, \text{ where } O \text{ is a } 1 \times 1 \text{ null matrix}, \; U = \left[\frac{\partial g}{\partial x_j}\right]_{1 \times 3}, \; V = \left[\frac{\partial^2 L}{\partial x_i \partial x_j}\right]_{3 \times 3},$$
i.e.
$$H_B = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 2 & 0 & 0 \\ 1 & 0 & 2 & 0 \\ 1 & 0 & 0 & 2 \end{bmatrix}.$$
Since m = 1 and n = 3, we have n − m = 2 and 2m + 1 = 3. Hence, only the two leading principal minors of $H_B$ of orders 3 and 4 (i.e. $\Delta_3$ and $\Delta_4$) are to be evaluated, and both minors must be negative, since $(-1)^1 < 0$. Now,
$$\Delta_3 = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 2 & 0 \\ 1 & 0 & 2 \end{vmatrix} = -2 - 2 = -4 < 0$$
and
$$\Delta_4 = \begin{vmatrix} 0 & 1 & 1 & 1 \\ 1 & 2 & 0 & 0 \\ 1 & 0 & 2 & 0 \\ 1 & 0 & 0 & 2 \end{vmatrix} = -4 - 4 - 4 = -12 < 0.$$
Hence, z is minimum subject to g(x) = 0 at $x_1 = 4$, $x_2 = 2$, $x_3 = 1$, and $z_{\min} = 16 - 40 + 4 - 12 + 1 - 4 = -35$.

Alternative test for optimality Again, the Hancock determinantal equation is
$$\begin{vmatrix} L_{11}-z & L_{12} & L_{13} & g_{11} \\ L_{21} & L_{22}-z & L_{23} & g_{12} \\ L_{31} & L_{32} & L_{33}-z & g_{13} \\ g_{11} & g_{12} & g_{13} & 0 \end{vmatrix} = 0,$$
where $L_{ij} = \dfrac{\partial^2 L}{\partial x_i \partial x_j}$ and $g_{1j} = \dfrac{\partial g}{\partial x_j}$ at $x = x^*$, $\lambda = \lambda^*$, i.e.
$$\begin{vmatrix} 2-z & 0 & 0 & 1 \\ 0 & 2-z & 0 & 1 \\ 0 & 0 & 2-z & 1 \\ 1 & 1 & 1 & 0 \end{vmatrix} = 0,$$
which reduces to
$$-3(2-z)^2 = 0, \text{ i.e. } z = 2, 2.$$
Since both roots are positive, the extreme point (4, 2, 1) minimizes the objective function, and $z_{\min} = -35$.

Example 6 Solve the following problem by the Lagrange multiplier method:

Minimize $z = 4x_1^2 + 2x_2^2 + x_3^2 - 4x_1x_2$
subject to the constraints
$$x_1 + x_2 + x_3 = 15, \qquad 2x_1 - x_2 + 2x_3 = 20.$$

Solution The given problem is

Minimize $z = f(x) = 4x_1^2 + 2x_2^2 + x_3^2 - 4x_1x_2$
subject to $g_1(x) = x_1 + x_2 + x_3 - 15 = 0$ and $g_2(x) = 2x_1 - x_2 + 2x_3 - 20 = 0$.
Now the Lagrange function is given by
$$L = f(x) + \lambda_1 g_1(x) + \lambda_2 g_2(x) = 4x_1^2 + 2x_2^2 + x_3^2 - 4x_1x_2 + \lambda_1(x_1 + x_2 + x_3 - 15) + \lambda_2(2x_1 - x_2 + 2x_3 - 20).$$
The necessary conditions for the minimum of f(x) are given by
$$\frac{\partial L}{\partial x_1} = 0, \; \frac{\partial L}{\partial x_2} = 0, \; \frac{\partial L}{\partial x_3} = 0, \; \frac{\partial L}{\partial \lambda_1} = 0, \; \frac{\partial L}{\partial \lambda_2} = 0.$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ gives } 8x_1 - 4x_2 + \lambda_1 + 2\lambda_2 = 0 \qquad (10.47)$$
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } 4x_2 - 4x_1 + \lambda_1 - \lambda_2 = 0 \qquad (10.48)$$
$$\frac{\partial L}{\partial x_3} = 0 \text{ gives } 2x_3 + \lambda_1 + 2\lambda_2 = 0 \qquad (10.49)$$
$$\frac{\partial L}{\partial \lambda_1} = 0 \text{ gives } x_1 + x_2 + x_3 - 15 = 0 \qquad (10.50)$$
$$\frac{\partial L}{\partial \lambda_2} = 0 \text{ gives } 2x_1 - x_2 + 2x_3 - 20 = 0. \qquad (10.51)$$
Adding (10.47) and (10.48), we have
$$4x_1 + 2\lambda_1 + \lambda_2 = 0, \text{ i.e. } x_1 = -\frac{2\lambda_1 + \lambda_2}{4}.$$
From (10.47), we have
$$4x_2 = 8x_1 + \lambda_1 + 2\lambda_2 = -2(2\lambda_1 + \lambda_2) + \lambda_1 + 2\lambda_2 = -3\lambda_1, \text{ i.e. } x_2 = -\frac{3}{4}\lambda_1.$$
From (10.49), we have
$$x_3 = -\frac{\lambda_1 + 2\lambda_2}{2}.$$
Now, putting the values of $x_1, x_2, x_3$ in (10.50), we get
$$-\frac{2\lambda_1 + \lambda_2}{4} - \frac{3}{4}\lambda_1 - \frac{\lambda_1 + 2\lambda_2}{2} = 15, \text{ i.e. } 7\lambda_1 + 5\lambda_2 + 60 = 0. \qquad (10.52)$$
Putting the values of $x_1, x_2, x_3$ in (10.51), we get
$$-\frac{2\lambda_1 + \lambda_2}{2} + \frac{3}{4}\lambda_1 - (\lambda_1 + 2\lambda_2) = 20, \text{ i.e. } \lambda_1 + 2\lambda_2 + 16 = 0. \qquad (10.53)$$
Now, solving (10.52) and (10.53), we obtain
$$\lambda_1 = -\frac{40}{9}, \qquad \lambda_2 = -\frac{52}{9},$$
and therefore
$$x_1 = -\frac{2\lambda_1 + \lambda_2}{4} = \frac{11}{3}, \qquad x_2 = -\frac{3}{4}\lambda_1 = \frac{10}{3}, \qquad x_3 = -\frac{\lambda_1 + 2\lambda_2}{2} = 8.$$
Now
$$\frac{\partial^2 L}{\partial x_1^2} = 8, \quad \frac{\partial^2 L}{\partial x_1 \partial x_2} = -4, \quad \frac{\partial^2 L}{\partial x_1 \partial x_3} = 0, \quad \frac{\partial^2 L}{\partial x_2^2} = 4, \quad \frac{\partial^2 L}{\partial x_2 \partial x_3} = 0, \quad \frac{\partial^2 L}{\partial x_3^2} = 2.$$
Again,
$$\frac{\partial g_1}{\partial x_1} = 1, \; \frac{\partial g_1}{\partial x_2} = 1, \; \frac{\partial g_1}{\partial x_3} = 1, \qquad \frac{\partial g_2}{\partial x_1} = 2, \; \frac{\partial g_2}{\partial x_2} = -1, \; \frac{\partial g_2}{\partial x_3} = 2.$$
Hence, the bordered Hessian matrix at $x_1 = \frac{11}{3}$, $x_2 = \frac{10}{3}$, $x_3 = 8$ is given by
$$H_B = \begin{bmatrix} O & U \\ U^T & V \end{bmatrix},$$
where O is an $m \times m$ null matrix, $U = \left[\dfrac{\partial g_i}{\partial x_j}\right]_{m \times n}$ and $V = \left[\dfrac{\partial^2 L}{\partial x_i \partial x_j}\right]_{n \times n}$, i.e.
$$H_B = \begin{bmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 2 & -1 & 2 \\ 1 & 2 & 8 & -4 & 0 \\ 1 & -1 & -4 & 4 & 0 \\ 1 & 2 & 0 & 0 & 2 \end{bmatrix}.$$
Since m = 2 and n = 3, we have n − m = 1 and 2m + 1 = 5. Thus, only one leading principal minor of $H_B$, of order 5 (i.e. $\Delta_5$), is to be evaluated, and it must be positive, since $(-1)^m = (-1)^2 > 0$. Here $\Delta_5 = |H_B| = 90 > 0$. Hence, f(x) is minimum subject to the constraints $g_1(x) = 0$ and $g_2(x) = 0$ at $x_1 = \frac{11}{3}$, $x_2 = \frac{10}{3}$, $x_3 = 8$, and
$$f_{\min} = 4\left(\frac{11}{3}\right)^2 + 2\left(\frac{10}{3}\right)^2 + 8^2 - 4 \cdot \frac{11}{3} \cdot \frac{10}{3} = \frac{820}{9}.$$

Example 7 Solve the following problem by using the Lagrange multiplier method:

Minimize $z = 2x_1^2 - 24x_1 + 2x_2^2 - 8x_2 + 2x_3^2 - 12x_3 + 200$
subject to the constraints $x_1 + x_2 + x_3 = 11$ and $x_1, x_2, x_3 \ge 0$.

Solution The given problem can be written as

Minimize $z = 2x_1^2 - 24x_1 + 2x_2^2 - 8x_2 + 2x_3^2 - 12x_3 + 200$
subject to $g(x) = x_1 + x_2 + x_3 - 11 = 0$.

Now the Lagrange function is
$$L(x_1, x_2, x_3, \lambda) = 2x_1^2 - 24x_1 + 2x_2^2 - 8x_2 + 2x_3^2 - 12x_3 + 200 + \lambda(x_1 + x_2 + x_3 - 11).$$
The necessary conditions for the minimum of z are given by
$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial x_3} = 0, \quad \frac{\partial L}{\partial \lambda} = 0.$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ gives } 4x_1 - 24 + \lambda = 0, \text{ i.e. } x_1 = \frac{24 - \lambda}{4} \qquad (10.54)$$
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } 4x_2 - 8 + \lambda = 0, \text{ i.e. } x_2 = \frac{8 - \lambda}{4} \qquad (10.55)$$
$$\frac{\partial L}{\partial x_3} = 0 \text{ gives } 4x_3 - 12 + \lambda = 0, \text{ i.e. } x_3 = \frac{12 - \lambda}{4} \qquad (10.56)$$
$$\frac{\partial L}{\partial \lambda} = 0 \text{ gives } x_1 + x_2 + x_3 - 11 = 0. \qquad (10.57)$$
Substituting the values of $x_1, x_2, x_3$ [from (10.54), (10.55), (10.56)] in (10.57), we have
$$\frac{44 - 3\lambda}{4} = 11, \text{ i.e. } \lambda = 0,$$
so $x_1 = 6$, $x_2 = 2$, $x_3 = 3$.
Now,
$$\frac{\partial^2 L}{\partial x_i^2} = 4 \text{ for } i = 1, 2, 3, \qquad \frac{\partial^2 L}{\partial x_i \partial x_j} = 0 \text{ for } i \ne j.$$
Again,
$$\frac{\partial g}{\partial x_1} = \frac{\partial g}{\partial x_2} = \frac{\partial g}{\partial x_3} = 1.$$
Now we shall test the optimality by the bordered Hessian matrix method. The bordered Hessian matrix at $x_1 = 6$, $x_2 = 2$, $x_3 = 3$ is given by
$$H_B = \begin{bmatrix} O & U \\ U^T & V \end{bmatrix},$$
where O is an $m \times m$ null matrix, $U = \left[\dfrac{\partial g_i}{\partial x_j}\right]_{m \times n}$ and $V = \left[\dfrac{\partial^2 L}{\partial x_i \partial x_j}\right]_{n \times n}$, i.e.
$$H_B = \begin{bmatrix} 0 & 1 & 1 & 1 \\ 1 & 4 & 0 & 0 \\ 1 & 0 & 4 & 0 \\ 1 & 0 & 0 & 4 \end{bmatrix}.$$
Since m = 1, n = 3, n − m = 3 − 1 = 2 and 2m + 1 = 3, only the two leading principal minors of $H_B$ of orders 3 and 4 (i.e. $\Delta_3$ and $\Delta_4$) are to be evaluated, and both minors must be negative, since $(-1)^1 = -1 < 0$. Now,
$$\Delta_3 = \begin{vmatrix} 0 & 1 & 1 \\ 1 & 4 & 0 \\ 1 & 0 & 4 \end{vmatrix} = -4 - 4 = -8 < 0$$
and
$$\Delta_4 = |H_B| = \begin{vmatrix} 0 & 1 & 1 & 1 \\ 1 & 4 & 0 & 0 \\ 1 & 0 & 4 & 0 \\ 1 & 0 & 0 & 4 \end{vmatrix}$$
$$= -16 - 16 - 16 = -48 < 0.$$
Hence, z is minimum subject to the given constraint at $x_1 = 6$, $x_2 = 2$, $x_3 = 3$, and $z_{\min} = 102$.

Alternative test for optimality For the given problem, the Hancock determinantal equation at $x_1 = 6$, $x_2 = 2$, $x_3 = 3$ is given by
$$\begin{vmatrix} L_{11}-z & L_{12} & L_{13} & g_{11} \\ L_{21} & L_{22}-z & L_{23} & g_{12} \\ L_{31} & L_{32} & L_{33}-z & g_{13} \\ g_{11} & g_{12} & g_{13} & 0 \end{vmatrix} = 0,$$
where $L_{ij} = \left.\dfrac{\partial^2 L}{\partial x_i \partial x_j}\right|_{x=x^*,\, \lambda=\lambda^*}$ and $g_{1j} = \left.\dfrac{\partial g}{\partial x_j}\right|_{x=x^*}$, i.e.
$$\begin{vmatrix} 4-z & 0 & 0 & 1 \\ 0 & 4-z & 0 & 1 \\ 0 & 0 & 4-z & 1 \\ 1 & 1 & 1 & 0 \end{vmatrix} = 0,$$
which, as in Example 5, reduces to
$$-3(4-z)^2 = 0, \text{ i.e. } z = 4, 4.$$
Since all the values of z are positive, z is minimum subject to the given constraint at $x_1 = 6$, $x_2 = 2$, $x_3 = 3$, and $z_{\min} = 102$.

Example 8 Find the dimensions of a cylindrical tin (with top and bottom) made of sheet metal that maximize its volume, given that the total surface area equals $24\pi$.
Solution Let $x_1$ be the radius of the top (or bottom) and $x_2$ the height of the tin. Then the volume of the tin is $\pi x_1^2 x_2$ and the total surface area is $2\pi x_1^2 + 2\pi x_1 x_2$. Therefore, the problem is

Maximize $f(x_1, x_2) = \pi x_1^2 x_2$
subject to $2\pi x_1^2 + 2\pi x_1 x_2 = 24\pi$, i.e. $g_1(x_1, x_2) = \pi x_1^2 + \pi x_1 x_2 - 12\pi = 0$.

Now, the Lagrange function is
$$L(x_1, x_2, \lambda) = \pi x_1^2 x_2 + \lambda\left(\pi x_1^2 + \pi x_1 x_2 - 12\pi\right).$$
The necessary conditions for the maximum of f are given by
$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial \lambda} = 0.$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ implies } 2\pi x_1 x_2 + \lambda\pi(2x_1 + x_2) = 0. \qquad (10.58)$$
Again,
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } \pi x_1^2 + \lambda\pi x_1 = 0, \text{ i.e. } x_1 + \lambda = 0 \; [\because x_1 \ne 0], \text{ i.e. } x_1 = -\lambda. \qquad (10.59)$$
Again,
$$\frac{\partial L}{\partial \lambda} = 0 \text{ gives } x_1^2 + x_1 x_2 - 12 = 0. \qquad (10.60)$$
Now, setting $x_1 = -\lambda$ in (10.58), we have
$$2\pi(-\lambda)x_2 + \lambda\pi(-2\lambda + x_2) = 0, \text{ i.e. } -\pi\lambda x_2 = 2\lambda^2\pi, \text{ i.e. } x_2 = -2\lambda \; [\because \lambda \ne 0],$$
i.e. $x_2 = 2x_1$ [from (10.59), $x_1 = -\lambda$].
Setting $x_2 = 2x_1$ in (10.60), we have
$$x_1^2 + 2x_1^2 - 12 = 0, \text{ i.e. } x_1^2 = 4, \text{ i.e. } x_1 = 2$$
[taking the positive root, as $x_1$ is the radius of the tin]. Hence $x_2 = 2x_1 = 4$, so $x^* = (2, 4)$. From (10.59), $\lambda = -x_1 = -2$.

Now the Hancock determinantal equation is
$$\begin{vmatrix} L_{11}-z & L_{12} & g_{11} \\ L_{21} & L_{22}-z & g_{12} \\ g_{11} & g_{12} & 0 \end{vmatrix} = 0. \qquad (10.61)$$
We have
$$L_{11} = \left.\frac{\partial^2 L}{\partial x_1^2}\right|_{x=x^*,\, \lambda=-2} = 2\pi x_2 + 2\pi\lambda = 8\pi - 4\pi = 4\pi,$$
$$L_{12} = L_{21} = \left.\frac{\partial^2 L}{\partial x_1 \partial x_2}\right|_{x=x^*,\, \lambda=-2} = 2\pi x_1 + \pi\lambda = 4\pi - 2\pi = 2\pi,$$
$$L_{22} = \left.\frac{\partial^2 L}{\partial x_2^2}\right|_{x=x^*,\, \lambda=-2} = 0,$$
and $g_{11} = 2\pi x_1 + \pi x_2 = 8\pi$, $g_{12} = \pi x_1 = 2\pi$. Therefore, (10.61) becomes
$$\begin{vmatrix} 4\pi - z & 2\pi & 8\pi \\ 2\pi & -z & 2\pi \\ 8\pi & 2\pi & 0 \end{vmatrix} = 0,$$
i.e.
$$68\pi^2 z + 48\pi^3 = 0, \text{ i.e. } z = -\frac{48}{68}\pi = -\frac{12}{17}\pi.$$
Since the value of z is negative, f is maximum for $x^* = (2, 4)$.
10.6 Interpretation of Lagrange Multipliers
To find the physical meaning of the Lagrange multipliers, let us consider the following optimization problem involving only one constraint:

Minimize f(x) subject to h(x) = b, i.e.

Minimize f(x)   (10.62)
subject to $g(x) = b - h(x) = 0$,   (10.63)

where b is a constant. The necessary conditions for the minimum of f(x) are given by
$$\frac{\partial L}{\partial x_j} = \frac{\partial f}{\partial x_j} + \lambda\frac{\partial g}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n \qquad (10.64)$$
and
$$\frac{\partial L}{\partial \lambda} = g = 0, \qquad (10.65)$$
where $L(x, \lambda) = f(x) + \lambda g(x)$.

Let the solution of (10.64) and (10.65) be given by $x^*$, $\lambda^*$ and $f^* = f(x^*)$. Now, we want to find the effect of a small relaxation or tightening of the constraint on the optimum value of the objective function (i.e. the effect of a small change in b on f*). From (10.63), g(x) = 0, where g(x) = b − h(x). Hence dg = 0, which implies db − dh = 0, i.e.
$$db = dh = \sum_{j=1}^{n} \frac{\partial h}{\partial x_j}\, dx_j. \qquad (10.66)$$
Since $\partial g/\partial x_j = -\partial h/\partial x_j$, Eq. (10.64) can be rewritten as
$$\frac{\partial f}{\partial x_j} - \lambda\frac{\partial h}{\partial x_j} = 0, \text{ i.e. } \frac{\partial h}{\partial x_j} = \frac{1}{\lambda}\frac{\partial f}{\partial x_j}, \quad j = 1, 2, \ldots, n. \qquad (10.67)$$
From (10.66) and (10.67), we have
$$db = \sum_{j=1}^{n} \frac{1}{\lambda}\frac{\partial f}{\partial x_j}\, dx_j = \frac{df}{\lambda} \quad \left[\text{since } df = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\, dx_j\right],$$
i.e.
$$\lambda^* = \frac{df^*}{db} \qquad (10.68)$$
or
$$df^* = \lambda^*\, db. \qquad (10.69)$$

Thus, $\lambda^*$ denotes the sensitivity (or rate of change) of f* with respect to b, i.e. the marginal or incremental change in f* with respect to b at x*. In other words, $\lambda^*$ indicates how tightly the constraint is binding at the optimum point. Depending on the value of $\lambda^*$, three cases may arise.

Case I: $\lambda^* > 0$. For a unit decrease in b, the decrease in f* is exactly $\lambda^*$, since $df^* = \lambda^*(-1) = -\lambda^*$. Hence, $\lambda^*$ may be interpreted as the marginal gain (reduction) in f* due to the tightening of the constraint. On the other hand, if b is increased by one unit, f* increases to a new optimum level by the amount $df^* = \lambda^*(+1) = \lambda^*$; in this case, $\lambda^*$ may be interpreted as the marginal cost (increase) in f* due to relaxation of the constraint.

Case II: $\lambda^* < 0$. In this case, the optimum value of f decreases as b increases. Here, the marginal gain (reduction) in f* due to a relaxation of the constraint by one unit is $df^* = \lambda^*(+1) < 0$. If b is decreased by one unit, the marginal cost (increase) in f* due to the tightening of the constraint is $df^* = \lambda^*(-1) > 0$, since in this case the minimum value of the objective function increases.

Case III: $\lambda^* = 0$. In this case, any incremental change in b has no effect on the optimum value of f, and hence the constraint is not binding. This means that the optimization of f subject to g = 0 leads to the same optimum point x* as the unconstrained optimization of f.

In economics and operations research, Lagrange multipliers are known as shadow prices of the constraints, since they indicate the change in the optimal value of the objective function per unit change in the right-hand side of the equality constraints.
Procedure for finding the effect of changes of b on f*. This can be done in two ways:
(i) Solve the problem afresh with the new value of b.
(ii) Use the relation $df^* = \lambda^*\, db$, where $\lambda^*$ is the optimal value of the multiplier.
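The shadow-price relation $df^*/db = \lambda^*$ is easy to verify numerically. The following Python sketch (an illustration only) uses the convention of this section, $g(x) = b - h(x)$, on the quadratic objective of Example 5 with a variable right-hand side b; the closed-form solve() below follows from the stationarity conditions of that assumed instance.

```python
# Shadow-price check with the convention g(x) = b - h(x), h = x1 + x2 + x3,
# so that df*/db = lambda*. Objective (assumed sample instance, cf. Example 5):
#   f = x1^2 - 10*x1 + x2^2 - 6*x2 + x3^2 - 4*x3, constraint h(x) = b.

def solve(b):
    # Stationarity of L = f + lam*(b - x1 - x2 - x3):
    #   2*x1 - 10 - lam = 0, 2*x2 - 6 - lam = 0, 2*x3 - 4 - lam = 0,
    # together with x1 + x2 + x3 = b  =>  lam = (2*b - 20)/3.
    lam = (2*b - 20) / 3
    x = [(10 + lam) / 2, (6 + lam) / 2, (4 + lam) / 2]
    fstar = (x[0]**2 - 10*x[0]) + (x[1]**2 - 6*x[1]) + (x[2]**2 - 4*x[2])
    return fstar, lam

f_b, lam_b = solve(7.0)
db = 1e-5
df = solve(7.0 + db)[0] - f_b
print(round(lam_b, 3), round(df / db, 3))  # -2.0 -2.0
```

The finite-difference slope $df^*/db$ agrees with $\lambda^*$, as (10.68) predicts.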
Example 9 Find the maximum of the function f(x) = 2x1 + x2 + 10 subject to $x_1 + 2x_2^2 = 3$ using the Lagrange multiplier method. Also, find the effect of changing the right-hand side of the constraint on the optimum value of f.

Solution Here the Lagrange function is given by
$$L = f(x) + \lambda g(x) = 2x_1 + x_2 + 10 + \lambda\left(3 - x_1 - 2x_2^2\right). \qquad (10.70)$$
The necessary conditions for the maximum of f(x) are given by
$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial \lambda} = 0. \qquad (10.71)$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ gives } 2 - \lambda = 0, \text{ i.e. } \lambda = 2;$$
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } 1 - 4\lambda x_2 = 0, \text{ i.e. } x_2 = \frac{1}{4\lambda} = \frac{1}{8} \approx 0.13;$$
$$\frac{\partial L}{\partial \lambda} = 0 \text{ gives } x_1 + 2x_2^2 - 3 = 0, \text{ i.e. } x_1 = 3 - 2x_2^2 = 3 - \frac{1}{32} = \frac{95}{32} \approx 2.97.$$
Hence, the solution of (10.71) is given by $x_1^* = 2.97$, $x_2^* = 0.13$, $\lambda^* = 2$.

To test the sufficient condition, solve the Hancock determinantal equation for the value of z; the value of z turns out to be negative. Hence, f is maximum for $x_1^* = 2.97$, $x_2^* = 0.13$, and the maximum value of f is $f^* = f(x^*) = 16.07$.

Second part: When the original constraint is tightened by one unit, i.e. db = −1,
$$df^* = \lambda^*\, db = 2(-1) = -2.$$
Due to this change, the new value of f* is $f^*_{\text{old}} + df^* = 16.07 - 2 = 14.07$.

On the other hand, if we relax the original constraint by one unit, i.e. db = +1,
$$df^* = \lambda^*\, db = 2(+1) = 2.$$
Due to this change, the new value of f* is $f^*_{\text{old}} + df^* = 16.07 + 2 = 18.07$.
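Example 9 can also be checked by re-solving with the perturbed right-hand sides, which is procedure (i) above. The Python sketch below (illustrative; it reuses the closed-form stationarity conditions of this example, written for a general right-hand side b) does exactly that. Note that the exact optimum is 514/32 = 16.0625; the 16.07 quoted above comes from rounding $x_1, x_2$ to two decimals first.

```python
# Cross-check of Example 9: maximize f = 2*x1 + x2 + 10 subject to
# x1 + 2*x2^2 = b (b = 3 in the example). From the necessary conditions:
#   lambda = 2, x2 = 1/(4*lambda), x1 = b - 2*x2^2.

def solve(b):
    lam = 2.0              # from dL/dx1 = 2 - lam = 0
    x2 = 1 / (4 * lam)     # from dL/dx2 = 1 - 4*lam*x2 = 0
    x1 = b - 2 * x2**2     # from the constraint
    return 2*x1 + x2 + 10, lam

f3, lam = solve(3)
f2, _ = solve(2)   # constraint tightened by one unit (db = -1)
f4, _ = solve(4)   # constraint relaxed by one unit (db = +1)
print(f3, f2 - f3, f4 - f3)  # 16.0625 -2.0 2.0
```

Since f* is linear in b here, the relation $df^* = \lambda^*\, db$ holds exactly, not just to first order.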
Example 10 Find the maximum of the function f(x) = 4x1 + x2 + 15 subject to $g(x) = 5 - x_1 - 4x_2^2 = 0$, using the Lagrange multiplier method. Also, find the effect of changing the right-hand side of the constraint on the optimum value of f.

Solution The Lagrange function is given by
$$L = f(x) + \lambda g(x) = 4x_1 + x_2 + 15 + \lambda\left(5 - x_1 - 4x_2^2\right). \qquad (10.72)$$
The necessary conditions for the maximum of f(x) are given by
$$\frac{\partial L}{\partial x_1} = 0, \quad \frac{\partial L}{\partial x_2} = 0, \quad \frac{\partial L}{\partial \lambda} = 0. \qquad (10.73)$$
Now,
$$\frac{\partial L}{\partial x_1} = 0 \text{ gives } 4 - \lambda = 0, \text{ i.e. } \lambda = 4;$$
$$\frac{\partial L}{\partial x_2} = 0 \text{ gives } 1 - 8\lambda x_2 = 0, \text{ i.e. } x_2 = \frac{1}{8\lambda} = \frac{1}{32} \approx 0.03;$$
$$\frac{\partial L}{\partial \lambda} = 0 \text{ gives } 5 - x_1 - 4x_2^2 = 0, \text{ i.e. } x_1 = 5 - 4x_2^2 = 5 - \frac{4}{32^2} \approx 4.99.$$
Hence, the solution of (10.73) is given by $x_1^* = 4.99$, $x_2^* = 0.03$, $\lambda^* = 4$.

To test the sufficient condition, solve the Hancock determinantal equation for the value of z. Now,
$$\frac{\partial^2 L}{\partial x_1^2} = 0, \quad \frac{\partial^2 L}{\partial x_1 \partial x_2} = 0, \quad \frac{\partial^2 L}{\partial x_2^2} = -8\lambda = -32,$$
and
$$\frac{\partial g}{\partial x_1} = -1, \quad \frac{\partial g}{\partial x_2} = -8x_2 = -0.24 \; [\text{using the rounded } x_2 = 0.03].$$
Now the Hancock determinantal equation at $x_1^* = 4.99$, $x_2^* = 0.03$ and $\lambda^* = 4$ is given by
$$\begin{vmatrix} 0 - z & 0 & -1 \\ 0 & -32 - z & -0.24 \\ -1 & -0.24 & 0 \end{vmatrix} = 0,$$
i.e.
$$z = -\frac{32}{1.0576} = -30.26,$$
which is negative. Hence, f is maximum for $x_1^* = 4.99$, $x_2^* = 0.03$, and the maximum value of f is $f^* = f(x^*) = 34.99$.

Second part: When the original constraint is tightened by one unit, i.e. db = −1,
$$df^* = \lambda^*\, db = 4(-1) = -4.$$
Due to this change, the new value of f* is $f^*_{\text{old}} + df^* = 34.99 - 4 = 30.99$.

On the other hand, if we relax the original constraint by one unit, i.e. db = +1,
$$df^* = \lambda^*\, db = 4(+1) = 4.$$
Due to this change, the new value of f* is $f^*_{\text{old}} + df^* = 34.99 + 4 = 38.99$.
10.7 Applications of Lagrange Multiplier Method
Financial mathematics: Constrained optimization plays an important role in financial mathematics; e.g. the choice problem for a consumer is represented as one of maximizing a utility function subject to a budget constraint. The Lagrange multiplier has an economic interpretation as the shadow price associated with the constraint, since the Lagrange multipliers indicate the changes in the optimal value of the objective function per unit change on the right-hand side of the equality constraint. Other examples include profit maximization for a firm, along with various applications in the branch of financial mathematics dealing with performance, structure, behaviour and decision-making of the whole economy. Control theory: The Lagrange multiplier method is used in optimal control theory to find the best possible control for taking a dynamical system from one state to another, especially in the presence of constraints for the state or input controls.
10.8 Exercises
1. Solve the following problems by the Lagrange multiplier method:
   (i) Optimize $z = x_1^2 - 10x_1 + x_2^2 - 6x_2 + x_3^2 - 4x_3$ subject to $x_1 + x_2 + x_3 = 7$.
   (ii) Optimize $z = 3x_1^2 + x_2^2 + x_3^2$ subject to $x_1 + x_2 + x_3 = 2$.
2. Solve the problem by the Lagrange multiplier method:
   Minimize $z = 4x_1^2 + 2x_2^2 + x_3^2 - 4x_1x_2$ subject to the constraints $x_1 + x_2 + x_3 = 15$ and $2x_1 - x_2 + 2x_3 = 20$.
3. Solve the following problem by the Lagrange multiplier method:
   Minimize $z = x_1^2 + x_2^2 + x_3^2$ subject to the constraints $x_1 + x_2 + 3x_3 = 2$ and $5x_1 + 2x_2 + x_3 = 5$.
4. Solve the problem by the Lagrange multiplier method:
   Maximize $z = 6x_1 + 8x_2 - x_1^2 - x_2^2$ subject to the constraints $4x_1 + 3x_2 = 16$ and $3x_1 + 5x_2 = 15$.
5. Solve the following optimization problems:
   (i) Minimize $z = x_1 x_2 x_3$ subject to $x_1 + x_2 + x_3 = 1$ and $x_1, x_2, x_3 > 0$.
   (ii) Minimize $z = x_1^2 + x_2^2$ subject to the constraint $x_1 + x_2 = 2$.
   (iii) Maximize $z = x_1 + x_2$ subject to the constraint $x_1^2 + x_2^2 = 1$.
   (iv) Minimize $z = \frac{1}{2}\left(x_1^2 + x_2^2 + x_3^2\right)$ subject to the constraint $x_1 + x_2 + x_3 = 1$.
   (v) Minimize $z = 2x_1^2 + 2x_2^2 + x_3^2 + 2x_1x_2 + 2x_1x_3 - 8x_1 - 6x_2 - 4x_3 + 9$ subject to the constraint $x_1 + x_2 + 2x_3 = 3$.
Chapter 11
Constrained Optimization with Inequality Constraints
11.1 Objective
The objective of this chapter is to derive the Kuhn-Tucker necessary and sufficient conditions to solve multivariate constrained optimization problems with inequality constraints.
11.2 Introduction
Let us consider a multivariate minimization problem as follows:

Minimize f(x)
subject to $g_i(x) \le 0$, $i = 1, 2, \ldots, m$,

where $x \in \mathbb{R}^n$ and $f, g_i : \mathbb{R}^n \to \mathbb{R}$. Now, adding the non-negative slack variables $y_i^2$ to the given inequality constraints, we get the equality constraints
$$g_i(x) + y_i^2 = 0, \quad i = 1, 2, \ldots, m.$$
Let $G_i(x, y) = g_i(x) + y_i^2 = 0$, $i = 1, 2, \ldots, m$, where $y = (y_1, y_2, \ldots, y_m)^T$ is the vector of slack variables. Therefore, the reduced problem is

Minimize f(x)
subject to $G_i(x, y) = g_i(x) + y_i^2 = 0$, $i = 1, 2, \ldots, m$.

This problem can be solved by the method of Lagrange multipliers.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_11
In this case, the Lagrange function is
$$L(x, y, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i G_i(x, y),$$
where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)^T$ is the vector of Lagrange multipliers. Now, the necessary conditions for a stationary point to be a local minimum are given by
$$\frac{\partial L}{\partial x_j} = 0, \; j = 1, \ldots, n; \quad \frac{\partial L}{\partial \lambda_i} = 0, \; i = 1, \ldots, m; \quad \frac{\partial L}{\partial y_i} = 0, \; i = 1, \ldots, m,$$
i.e.
$$\frac{\partial f}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n \qquad (11.1)$$
$$g_i(x) + y_i^2 = 0, \quad i = 1, 2, \ldots, m \qquad (11.2)$$
$$2\lambda_i y_i = 0, \quad i = 1, 2, \ldots, m. \qquad (11.3)$$
From (11.3), we have $\lambda_i y_i = 0$; hence either $\lambda_i = 0$ or $y_i = 0$ for each $i = 1, 2, \ldots, m$. For any i, if $\lambda_i = 0$, the corresponding constraint is inactive and hence can be ignored. On the other hand, if $y_i = 0$, the constraint is active at the optimum point. Now, we divide the indices of the constraints into two subsets $J_1$ and $J_2$. Let $J_1$ indicate the indices of those constraints which are active at the optimum point, and let $J_2$ include the indices of all the inactive constraints, i.e.
$$J_1 = \{i : y_i = 0,\ i = 1, \ldots, m\}, \qquad J_2 = \{i : \lambda_i = 0,\ i = 1, \ldots, m\},$$
so $i \in J_1$ when $y_i = 0$ and $i \in J_2$ when $\lambda_i = 0$. Hence, (11.1) becomes
$$\frac{\partial f}{\partial x_j} + \sum_{i \in J_1} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n. \qquad (11.4)$$
Similarly, (11.2) becomes
$$g_i(x) = 0 \quad \text{for } i \in J_1 \qquad (11.5)$$
and
$$g_i(x) + y_i^2 = 0 \quad \text{for } i \in J_2. \qquad (11.6)$$
If the number of active constraints is p, then the number of inactive constraints is m − p. Therefore, in (11.4), (11.5) and (11.6) the number of equations is n + p + (m − p) = m + n, and the unknowns are $x_1, x_2, \ldots, x_n$, $\lambda_i$ ($i \in J_1$) and $y_i$ ($i \in J_2$), also m + n in number. Hence, the system (11.4)–(11.6) can be solved. Let
$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right)^T \quad \text{and} \quad \nabla g_i = \left(\frac{\partial g_i}{\partial x_1}, \frac{\partial g_i}{\partial x_2}, \ldots, \frac{\partial g_i}{\partial x_n}\right)^T$$
be the gradients of the objective function and the ith constraint, respectively. Then Eq. (11.4) may be written as
$$-\nabla f = \sum_{i \in J_1} \lambda_i \nabla g_i. \qquad (11.7)$$
Thus, the negative of the gradient of the objective function can be expressed as a linear combination of the gradients of the active constraints at the optimum point. Now, it can be proved that:
(a) For a minimization problem with $g_i(x) \le 0$, we have $\lambda_i \ge 0$.
(b) For a maximization problem with $g_i(x) \le 0$, we have $\lambda_i \le 0$.
(c) For a minimization problem with $g_i(x) \ge 0$, we have $\lambda_i \le 0$.
(d) For a maximization problem with $g_i(x) \ge 0$, we have $\lambda_i \ge 0$.

11.3 Feasible Direction
The direction of a vector S is called a feasible direction from a point $x = (x_1, x_2, \ldots, x_n)$ if at least a small step can be taken along S that does not immediately leave the feasible region. Thus, for problems with sufficiently smooth constraint surfaces, any vector S satisfying the relation
$$\langle S, \nabla g_i \rangle < 0$$
Fig. 11.1 Feasible direction S when both constraint functions are convex
Fig. 11.2 Feasible direction S when one constraint function is linear
Fig. 11.3 Feasible direction S when one constraint function is concave and the other one is convex
can be called a feasible direction (see Fig. 11.1). On the other hand, if the constraint is either linear or concave (see Figs. 11.2 and 11.3), any vector satisfying the relation $\langle S, \nabla g_i \rangle \le 0$ is called a feasible direction.

Let us suppose that only two constraints are active at the optimum point, i.e. p = 2. Then Eq. (11.7) reduces to
$$-\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2. \qquad (11.8)$$
Let S be a feasible direction at the optimum point. Now, premultiplying both sides of (11.8) by $S^T$, we have
$$-\langle S, \nabla f \rangle = \lambda_1 \langle S, \nabla g_1 \rangle + \lambda_2 \langle S, \nabla g_2 \rangle.$$
Since S is a feasible direction, we have
$$\langle S, \nabla g_1 \rangle < 0 \quad \text{and} \quad \langle S, \nabla g_2 \rangle < 0. \qquad (11.9)$$
Thus, if $\lambda_1 > 0$ and $\lambda_2 > 0$, the quantity $\langle S, \nabla f \rangle$ is always positive. As $\nabla f$ indicates the gradient direction, along which the value of f increases at the maximum rate, $\langle S, \nabla f \rangle$ represents the component of the increment of f along the direction S. If $\langle S, \nabla f \rangle > 0$, the function value increases as we move along the direction S. Hence, if $\lambda_1, \lambda_2$ are positive, we cannot find any direction in the feasible region along which the function value can be decreased further. Since the point at which Eq. (11.9) is valid is assumed to be optimum, $\lambda_1$ and $\lambda_2$ have to be positive. This reasoning extends to cases with more than two active constraints.
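The inner-product test for feasible directions is mechanical to apply. The Python sketch below (illustrative; the corner point and constraint data are toy assumptions of this sketch, and the strict inequality matches the smooth convex case of Fig. 11.1) classifies candidate directions at a point where two constraints are active.

```python
# Sketch of the feasible-direction test: a direction S is feasible at a
# boundary point if <S, grad g_i> < 0 for each active constraint g_i <= 0
# (strict test, for smooth convex constraint surfaces).

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_feasible_direction(S, active_grads):
    return all(inner(S, g) < 0 for g in active_grads)

# Toy data: constraints g1 = x1 - 1 <= 0 and g2 = x2 - 1 <= 0 are both
# active at the corner (1, 1); their gradients are the unit vectors.
grads = [(1.0, 0.0), (0.0, 1.0)]
print(is_feasible_direction((-1.0, -1.0), grads))  # True: points back inside
print(is_feasible_direction((1.0, 0.0), grads))    # False: leaves the region
```

A direction with a zero inner product lies along the constraint surface; the strict test above deliberately excludes such tangential directions.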
11.4 Kuhn-Tucker Conditions
In this section, we shall derive the necessary and sufficient conditions for a local optimum of the general non-linear programming problem.
11.4.1 Kuhn-Tucker Necessary Conditions

Let us consider a minimization problem as follows:

Minimize f(x)
subject to $g_i(x) \le 0$, $i = 1, 2, \ldots, m$,

where $x \in \mathbb{R}^n$ and $f, g_i : \mathbb{R}^n \to \mathbb{R}$. Now, adding the non-negative slack variables $y_i^2$ to the given inequality constraints, we get the equality constraints
$$g_i(x) + y_i^2 = 0, \quad i = 1, 2, \ldots, m.$$
Therefore, the reduced problem is

Minimize f(x)
subject to $g_i(x) + y_i^2 = 0$, $i = 1, 2, \ldots, m$.

This problem can be solved by the Lagrange multiplier method. In this case, the Lagrange function is
$$L(x, y, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i \left(g_i(x) + y_i^2\right),$$
where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)^T$ is the vector of Lagrange multipliers. Now, the necessary conditions for a stationary point to be a local minimum are given by
$$\frac{\partial L}{\partial x_j} = 0, \; j = 1, \ldots, n; \quad \frac{\partial L}{\partial \lambda_i} = 0, \; i = 1, \ldots, m; \quad \frac{\partial L}{\partial y_i} = 0, \; i = 1, \ldots, m,$$
i.e.
$$\frac{\partial f}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n \qquad (11.10)$$
$$g_i(x) + y_i^2 = 0, \quad i = 1, 2, \ldots, m \qquad (11.11)$$
$$2\lambda_i y_i = 0, \quad i = 1, 2, \ldots, m. \qquad (11.12)$$
If a constraint is satisfied with an equality sign, $g_i(x) = 0$, at the optimum point, it is called an active constraint; otherwise, it is called inactive. Equation (11.12) implies that either $\lambda_i = 0$ or $y_i = 0$. Now, if $\lambda_i = 0$ and $y_i^2 \ne 0$, then the ith constraint is inactive and can be discarded. If $y_i = 0$ and $\lambda_i > 0$, then Eq. (11.11) implies $g_i(x) = 0$ at the optimum point; this means that the constraint is active. Therefore, Eqs. (11.11) and (11.12) imply
$$\lambda_i g_i(x) = 0. \qquad (11.13)$$
Again, $y_i^2 \ge 0$; then from (11.11) we have
$$g_i(x) \le 0. \qquad (11.14)$$
Thus, (11.13) implies that if $g_i(x) < 0$ then $\lambda_i = 0$, and when $g_i(x) = 0$, $\lambda_i > 0$. Hence, the necessary conditions to be satisfied at a local minimum point are as follows:
$$\frac{\partial f}{\partial x_j} + \sum_{i=1}^{m} \lambda_i \frac{\partial g_i}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n$$
$$\lambda_i g_i(x) = 0, \quad g_i(x) \le 0, \quad \lambda_i \ge 0, \quad i = 1, 2, \ldots, m.$$
This set of necessary conditions is known as the Kuhn-Tucker necessary conditions.

Note: The first two conditions are the same for both minimization and maximization problems. However, the last two conditions differ, depending on the type of problem, as given in the following table.

Type of problem | Type of constraint | Form of last two necessary conditions
Minimization | $g_i(x) \le 0$ | $g_i(x) \le 0$, $\lambda_i \ge 0$
Minimization | $g_i(x) \ge 0$ | $g_i(x) \ge 0$, $\lambda_i \le 0$
Maximization | $g_i(x) \le 0$ | $g_i(x) \le 0$, $\lambda_i \le 0$
Maximization | $g_i(x) \ge 0$ | $g_i(x) \ge 0$, $\lambda_i \ge 0$
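Verifying a candidate point against the four Kuhn-Tucker requirements (stationarity, primal feasibility, dual feasibility, complementarity) is mechanical. The Python sketch below (an illustration only; the toy problem, candidate point and tolerance are assumptions of this sketch) checks the "minimization, $g_i(x) \le 0$" row of the table.

```python
# A small checker for the Kuhn-Tucker conditions in the
# "minimization, g_i(x) <= 0" case, on an assumed toy problem:
#   minimize (x1-2)^2 + (x2-3)^2  subject to  x1 + x2 - 2 <= 0,
# whose optimum is x* = (0.5, 1.5) with multiplier lambda* = 3.

def kt_holds(grad_f, grads_g, g_vals, lams, tol=1e-8):
    """Stationarity, primal feasibility, dual feasibility, complementarity."""
    n = len(grad_f)
    stationary = all(
        abs(grad_f[j] + sum(l * gg[j] for l, gg in zip(lams, grads_g))) < tol
        for j in range(n)
    )
    feasible = all(g <= tol for g in g_vals)
    dual = all(l >= -tol for l in lams)
    comp = all(abs(l * g) < tol for l, g in zip(lams, g_vals))
    return stationary and feasible and dual and comp

x = (0.5, 1.5)                       # candidate optimum
lam = [3.0]                          # candidate multiplier
grad_f = (2*(x[0]-2), 2*(x[1]-3))    # gradient of the objective: (-3, -3)
grads_g = [(1.0, 1.0)]               # gradient of g(x) = x1 + x2 - 2
g_vals = [x[0] + x[1] - 2]           # [0.0] -> the constraint is active
print(kt_holds(grad_f, grads_g, g_vals, lam))  # True
```

The same routine, with the multiplier sign tests flipped per the table, covers the other three problem/constraint combinations.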
11.4.2 Kuhn-Tucker Sufficient Conditions Theorem The Kuhn-Tucker necessary conditions for the optimization problem Minimize f ð xÞ subject to the constraints gi ð xÞ 0; i ¼ 1; 2; . . .; m are also sufficient conditions if f(x) is convex and all gi(x) are convex functions of x. Proof The Kuhn-Tucker necessary conditions for the given problem
Minimize f(x) subject to gi(x) ≤ 0, i = 1, 2, …, m

are

∂f/∂xj + Σᵢ₌₁ᵐ λi ∂gi/∂xj = 0, j = 1, 2, …, n
λi gi(x) = 0, i = 1, 2, …, m
gi(x) ≤ 0, i = 1, 2, …, m
λi ≥ 0, i = 1, 2, …, m.

Now we have to prove that these conditions are sufficient if f(x) is convex and all gi(x) are convex functions of x. The Lagrangian function of the given problem is

L(x, y, λ) = f(x) + Σᵢ₌₁ᵐ λi [gi(x) + yi²].

If λi ≥ 0, then λi gi(x) is convex, and since λi yi = 0, it follows that L(x, y, λ) is a convex function. Now, a necessary condition for f(x) to be a minimum at x* is that L(x, y, λ) has a stationary point at x*. Since L(x, y, λ) is convex, its partial derivatives vanish at only one point, and by convexity this point must be a global minimum point. Hence, the Kuhn-Tucker necessary conditions are also sufficient conditions for a global minimum of f(x) if f(x) and all gi(x) are convex.

Example 1 Determine x1, x2, x3 so as to

Maximize z = −x1² − x2² − x3² + 4x1 + 6x2

subject to the constraints

x1 + x2 ≤ 2, 2x1 + 3x2 ≤ 12 and x1, x2, x3 ≥ 0.

Solution Let f = −x1² − x2² − x3² + 4x1 + 6x2 with

g1 = x1 + x2 − 2 ≤ 0, g2 = 2x1 + 3x2 − 12 ≤ 0.

Hence, the Kuhn-Tucker necessary conditions are
∂f/∂xj + Σᵢ₌₁² λi ∂gi/∂xj = 0, j = 1, 2, 3   (11.15)
λi gi = 0, i = 1, 2   (11.16)
gi ≤ 0, i = 1, 2   (11.17)
λ1 ≤ 0, λ2 ≤ 0.   (11.18)

(Since this is a maximization problem with constraints of the form gi ≤ 0, the multipliers must satisfy λi ≤ 0.)

From (11.15), we have

−2x1 + 4 + λ1 + 2λ2 = 0   (11.19a)
−2x2 + 6 + λ1 + 3λ2 = 0   (11.19b)
−2x3 = 0.   (11.19c)

From (11.16), we have λi gi = 0, i = 1, 2, i.e.

λ1 (x1 + x2 − 2) = 0   (11.20a)
λ2 (2x1 + 3x2 − 12) = 0.   (11.20b)

From (11.17), we have gi ≤ 0, i = 1, 2, i.e.

x1 + x2 − 2 ≤ 0   (11.21a)
2x1 + 3x2 − 12 ≤ 0.   (11.21b)

Four different cases may arise:

Case 1: λ1 = 0, λ2 = 0
Case 2: λ1 = 0, λ2 ≠ 0
Case 3: λ1 ≠ 0, λ2 = 0
Case 4: λ1 ≠ 0, λ2 ≠ 0.

Now we shall examine all the cases.

Case 1: λ1 = 0, λ2 = 0. In this case, from (11.19a)–(11.19c), we have −2x1 + 4 = 0, −2x2 + 6 = 0 and x3 = 0, i.e. x1 = 2, x2 = 3, x3 = 0. However, this solution violates both constraints (11.21a) and (11.21b), so this solution is rejected.

Case 2: λ1 = 0, λ2 ≠ 0.
In this case, (11.20b) gives 2x1 + 3x2 − 12 = 0, and (11.19a) and (11.19b) give −2x1 + 4 + 2λ2 = 0 and −2x2 + 6 + 3λ2 = 0, i.e. x1 = 2 + λ2 and x2 = 3 + (3/2)λ2. Substituting these results in 2x1 + 3x2 = 12, we have

2(2 + λ2) + 3(3 + (3/2)λ2) = 12, or 13 + (13/2)λ2 = 12, or λ2 = −2/13 < 0.

∴ x1 = 2 − 2/13 = 24/13, x2 = 3 − 3/13 = 36/13.

Again, from (11.19c), x3 = 0.

∴ x1 = 24/13, x2 = 36/13, x3 = 0.

This solution violates the constraint x1 + x2 − 2 ≤ 0 (since 24/13 + 36/13 = 60/13 > 2), so this solution is rejected.

Case 3: λ1 ≠ 0, λ2 = 0. In this case, (11.20a) gives x1 + x2 = 2, and (11.19a) and (11.19b) give x1 = 2 + λ1/2, x2 = 3 + λ1/2.

∴ 2 + λ1/2 + 3 + λ1/2 = 2, or λ1 = −3 ≤ 0
∴ x1 = 2 − 3/2 = 1/2, x2 = 3 − 3/2 = 3/2.

Again, from (11.19c), we have x3 = 0. Hence, the solution is x1 = 1/2, x2 = 3/2, x3 = 0. This solution satisfies the constraint 2x1 + 3x2 ≤ 12 and the sign conditions (11.18).

Case 4: λ1 ≠ 0, λ2 ≠ 0. In this case, (11.20a) and (11.20b) give x1 + x2 = 2 and 2x1 + 3x2 = 12. Solving these equations, we have x1 = −6, x2 = 8. Now, from (11.19a) and (11.19b) we have λ1 + 2λ2 = −16 and λ1 + 3λ2 = 10. Solving these equations, we have λ2 = 26, λ1 = −68. Since λ2 = 26 violates the condition λ2 ≤ 0, this solution is also rejected.

Hence, the optimum (maximum) solution of the given problem is x1 = 1/2, x2 = 3/2, x3 = 0 with λ1 = −3, λ2 = 0 and Max z = 17/2.
Example 2

Optimize z = 2x1 + 3x2 − x1² − x2² − x3²

subject to the constraints

x1 + x2 ≤ 1
2x1 + 3x2 ≤ 6.

Solution Here we have

f(x) = 2x1 + 3x2 − x1² − x2² − x3²
g1(x) = x1 + x2 − 1 ≤ 0
g2(x) = 2x1 + 3x2 − 6 ≤ 0.

First of all, we have to determine whether the stationary point of the given problem is of minimization or maximization type. For this purpose, we construct the bordered Hessian matrix

H_B = | O   U |
      | Uᵀ  V |

where O is an m × m null matrix, U = (∂gi/∂xj) is of order m × n and V = (∂²L/∂xi∂xj) is of order n × n. Here m = 2, n = 3. Hence,

|H_B| = | 0  0  1  1  0 |
        | 0  0  2  3  0 |
        | 1  2 −2  0  0 | = −2.
        | 1  3  0 −2  0 |
        | 0  0  0  0 −2 |

Since n − m = 1 and 2m + 1 = 5, the sign of (−1)^(m+n) = (−1)⁵ is negative, and |H_B| < 0 has this sign; hence the solution point should maximize the objective function. The Kuhn-Tucker conditions for the given problem are given by

∂f/∂xj + Σᵢ₌₁² λi ∂gi/∂xj = 0, j = 1, 2, 3   (11.22)
λi gi = 0, i = 1, 2   (11.23)
gi ≤ 0, i = 1, 2   (11.24)
λ1 ≤ 0, λ2 ≤ 0.   (11.25)

From (11.22), we have

2 − 2x1 + λ1 + 2λ2 = 0   (11.26a)
3 − 2x2 + λ1 + 3λ2 = 0   (11.26b)
−2x3 = 0.   (11.26c)

From (11.23), we have

λ1 (x1 + x2 − 1) = 0,  λ2 (2x1 + 3x2 − 6) = 0.   (11.27a, b)

From (11.24), we have

x1 + x2 − 1 ≤ 0   (11.28a)
2x1 + 3x2 − 6 ≤ 0.   (11.28b)

Four different cases may arise:

Case 1: λ1 = 0, λ2 = 0
Case 2: λ1 = 0, λ2 ≠ 0
Case 3: λ1 ≠ 0, λ2 = 0
Case 4: λ1 ≠ 0, λ2 ≠ 0.

Now we shall examine the different cases.

Case 1: λ1 = 0, λ2 = 0. In this case, from (11.26a)–(11.26c), we have x1 = 1, x2 = 3/2, x3 = 0. This solution does not satisfy the inequalities (11.28a) and (11.28b). So this solution is rejected.

Case 2: λ1 = 0, λ2 ≠ 0. In this case, solving (11.26a)–(11.26c), (11.27a) and (11.27b), we have
x1 = 12/13, x2 = 18/13, x3 = 0 and λ2 = −1/13.

This solution does not satisfy inequality (11.28a) and is therefore rejected.

Case 3: λ1 ≠ 0, λ2 = 0. In this case, the solution of Eqs. (11.26a)–(11.26c) and (11.27a) yields x1 = 1/4, x2 = 3/4, x3 = 0 and λ1 = −3/2. This solution satisfies all the conditions and is therefore accepted.

Case 4: λ1 ≠ 0, λ2 ≠ 0. In this case, the solution of Eqs. (11.26a)–(11.26c), (11.27a) and (11.27b) gives x1 = −3, x2 = 4, x3 = 0, λ1 = −34, λ2 = 13. This solution violates the conditions (11.25) and is therefore discarded.

Since only one solution satisfies all the conditions, it is the optimal solution, and it is given by x1 = 1/4, x2 = 3/4, x3 = 0 and Max z = 17/8.

Example 3 Solve the following problem:

Minimize f = x1² + x2² + x3² + 40x1 + 20x2 − 3000

subject to

g1 = x1 − 50 ≥ 0
g2 = x1 + x2 − 100 ≥ 0
g3 = x1 + x2 + x3 − 150 ≥ 0.

Solution The Kuhn-Tucker necessary conditions are

∂f/∂xj + Σᵢ₌₁³ λi ∂gi/∂xj = 0, j = 1, 2, 3   (11.29)
λi gi = 0, i = 1, 2, 3   (11.30)
gi ≥ 0, i = 1, 2, 3   (11.31)
λi ≤ 0, i = 1, 2, 3.   (11.32)

(This is a minimization problem with constraints of the form gi ≥ 0; hence, by the table given earlier, λi ≤ 0.)

From (11.29), we have
∂f/∂xj + λ1 ∂g1/∂xj + λ2 ∂g2/∂xj + λ3 ∂g3/∂xj = 0, j = 1, 2, 3,

i.e.

2x1 + 40 + λ1 + λ2 + λ3 = 0   (11.33a)
2x2 + 20 + λ2 + λ3 = 0   (11.33b)
2x3 + λ3 = 0.   (11.33c)
From (11.30), we have λi gi = 0, i = 1, 2, 3, i.e.

λ1 (x1 − 50) = 0   (11.34a)
λ2 (x1 + x2 − 100) = 0   (11.34b)
λ3 (x1 + x2 + x3 − 150) = 0.   (11.34c)

From (11.31), we have gi ≥ 0, i = 1, 2, 3, i.e.

x1 − 50 ≥ 0   (11.35a)
x1 + x2 − 100 ≥ 0   (11.35b)
x1 + x2 + x3 − 150 ≥ 0.   (11.35c)

From (11.32), we have λi ≤ 0, i = 1, 2, 3, i.e.

λ1 ≤ 0   (11.36a)
λ2 ≤ 0   (11.36b)
λ3 ≤ 0.   (11.36c)
Now from (11.34a), we have λ1 (x1 − 50) = 0. Hence, either λ1 = 0 or x1 = 50.

Case 1: When λ1 = 0. From Eqs. (11.33a)–(11.33c), we have

2x1 = −40 − λ2 − λ3, or x1 = −20 − λ2/2 − λ3/2
2x2 = −20 − λ2 − λ3, or x2 = −10 − λ2/2 − λ3/2   [∵ λ1 = 0]   (11.37)
2x3 = −λ3, or x3 = −λ3/2.

Substituting these values of x1, x2, x3 in Eqs. (11.34b) and (11.34c), we have

λ2 (−20 − λ2/2 − λ3/2 − 10 − λ2/2 − λ3/2 − 100) = 0
and
λ3 (−20 − λ2/2 − λ3/2 − 10 − λ2/2 − λ3/2 − λ3/2 − 150) = 0,

i.e.

λ2 (130 + λ2 + λ3) = 0 and λ3 (180 + λ2 + (3/2)λ3) = 0.   (11.38)

From the first equation of (11.38), we have either λ2 = 0 or 130 + λ2 + λ3 = 0. From the second equation of (11.38), we have either λ3 = 0 or 180 + λ2 + (3/2)λ3 = 0. Hence, the four possible solutions of (11.38) are given by

λ2 = 0, 180 + λ2 + (3/2)λ3 = 0   (A)
λ3 = 0, 130 + λ2 + λ3 = 0   (B)
λ2 = 0, λ3 = 0   (C)
130 + λ2 + λ3 = 0, 180 + λ2 + (3/2)λ3 = 0.   (D)

From (A), λ2 = 0 and (3/2)λ3 = −180, i.e. λ2 = 0 and λ3 = −120. Now substituting λ2 = 0 and λ3 = −120 in (11.37), we have
x1 = −20 − 0/2 − (−120)/2 = 40
x2 = −10 − 0/2 − (−120)/2 = 50
x3 = −(−120)/2 = 60.

Hence, (A) gives the solution x1 = 40, x2 = 50, x3 = 60. This solution satisfies (11.35c), but it does not satisfy (11.35a) and (11.35b). So, we reject this solution.

Now from (B), we have λ3 = 0 and λ2 = −130. Hence, from (11.37), we have

x1 = −20 + 65 = 45, x2 = −10 + 65 = 55, x3 = 0.

This solution satisfies (11.36), but it does not satisfy (11.35a) and (11.35c). Hence, we reject this solution.

Now from (C), we have λ2 = 0, λ3 = 0. From (11.37), we have x1 = −20, x2 = −10, x3 = 0. This solution satisfies (11.36), but it does not satisfy (11.35a)–(11.35c). Hence, we reject this solution.

Now from (D), we have

130 + λ2 + λ3 = 0, 180 + λ2 + (3/2)λ3 = 0,
i.e. λ2 + λ3 = −130 and λ2 + (3/2)λ3 = −180.

Subtracting the first equation from the second, we have (1/2)λ3 = −50, or λ3 = −100. From the first, λ2 − 100 = −130, or λ2 = −30.
Hence, from (11.37), we have

x1 = −20 − (−30)/2 − (−100)/2 = −20 + 15 + 50 = 45
x2 = −10 + 15 + 50 = 55
x3 = −(−100)/2 = 50.

This solution satisfies (11.35b) and (11.35c) but does not satisfy (11.35a). Hence, we reject this solution.

Case 2: When x1 = 50. From Eqs. (11.33a)–(11.33c), we have

λ3 = −2x3
λ2 = −20 − 2x2 − λ3 = −20 − 2x2 + 2x3
and
λ1 = −40 − 2x1 − λ2 − λ3 = −40 − 100 + 20 + 2x2 − 2x3 + 2x3 = −120 + 2x2.

∴ λ1 = −120 + 2x2, λ2 = −20 − 2x2 + 2x3, λ3 = −2x3.   (11.39)

Now substituting these values of λ1, λ2, λ3 in (11.34b) and (11.34c), we have

(−20 − 2x2 + 2x3)(x1 + x2 − 100) = 0   (11.40a)
and
−2x3 (x1 + x2 + x3 − 150) = 0.   (11.40b)

From (11.40a), we have either −20 − 2x2 + 2x3 = 0 or x1 + x2 − 100 = 0. From (11.40b), we have either x3 = 0 or x1 + x2 + x3 − 150 = 0. Therefore, the four possible solutions of (11.40a) and (11.40b) are given by

−20 − 2x2 + 2x3 = 0, x1 + x2 + x3 − 150 = 0   (11.41a)
−20 − 2x2 + 2x3 = 0, x3 = 0   (11.41b)
x1 + x2 − 100 = 0, x3 = 0   (11.41c)
x1 + x2 − 100 = 0, x1 + x2 + x3 − 150 = 0.   (11.41d)

From (11.41a), we have −20 − 2x2 + 2x3 = 0 and x1 + x2 + x3 − 150 = 0, i.e.

−10 − x2 + x3 = 0 and 50 + x2 + x3 − 150 = 0   [∵ x1 = 50].

Adding the preceding two equations, we have 2x3 = 110, or x3 = 55. Now substituting x3 = 55 in −10 − x2 + x3 = 0, we have x2 = 45.

∴ x1 = 50, x2 = 45, x3 = 55.

This solution does not satisfy (11.35b); hence, we reject it.

From (11.41b), we have −20 − 2x2 + 2x3 = 0 and x3 = 0, i.e. −10 − x2 + x3 = 0 and x3 = 0.

∴ x1 = 50, x2 = −10, x3 = 0.

This solution does not satisfy the constraints (11.35b) and (11.35c).

From (11.41c), we have x1 + x2 − 100 = 0 and x3 = 0, or 50 + x2 − 100 = 0 [∵ x1 = 50], i.e. x2 = 50, x3 = 0.

∴ x1 = 50, x2 = 50, x3 = 0.

This solution does not satisfy the constraint (11.35c).

From (11.41d), we have x1 + x2 − 100 = 0 and x1 + x2 + x3 − 150 = 0. From x1 + x2 − 100 = 0 we get 50 + x2 − 100 = 0, or x2 = 50. From x1 + x2 + x3 − 150 = 0 we get 50 + 50 + x3 − 150 = 0, or x3 = 50.

∴ x1 = 50, x2 = 50, x3 = 50.

This solution satisfies all the constraints (11.35a)–(11.35c).
Now substituting x1 = 50, x2 = 50, x3 = 50 in (11.39), we have

λ1 = −120 + 2·50 = −20
λ2 = −20 − 2·50 + 2·50 = −20
λ3 = −2·50 = −100.

Since all these values of λi satisfy the requirement λi ≤ 0, i = 1, 2, 3, the optimum solution is x1 = 50, x2 = 50, x3 = 50, and the minimum value of f is

(50)² + (50)² + (50)² + 40·50 + 20·50 − 3000 = 7500.
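As a quick numerical sanity check of Example 3, the snippet below confirms that (50, 50, 50) is feasible, gives f = 7500, and is not improved by any small feasible perturbation on a coarse grid (a probe consistent with the Kuhn-Tucker analysis above, not a proof of optimality):

```python
# Probe Example 3 around the candidate optimum (50, 50, 50).

def f(x1, x2, x3):
    return x1**2 + x2**2 + x3**2 + 40*x1 + 20*x2 - 3000

def feasible(x1, x2, x3):
    # g1 >= 0, g2 >= 0, g3 >= 0
    return x1 >= 50 and x1 + x2 >= 100 and x1 + x2 + x3 >= 150

best = f(50, 50, 50)
improved = False
step = 0.5
for dx1 in (-step, 0, step):
    for dx2 in (-step, 0, step):
        for dx3 in (-step, 0, step):
            p = (50 + dx1, 50 + dx2, 50 + dx3)
            if feasible(*p) and f(*p) < best - 1e-9:
                improved = True
```

Every feasible neighbour on the grid increases f, in agreement with the multipliers λ1 = λ2 = −20, λ3 = −100 all being strictly negative (all three constraints active).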
11.5 Exercises

1. Maximize f(x1, x2) = 2x1 + bx2 subject to

g1(x1, x2) = x1² + x2² ≤ 5
g2(x1, x2) = x1 − x2 ≤ 2.

Using the Kuhn-Tucker conditions, find the values of b for which x1 = 1, x2 = 2 will be the optimal solution.

2. Solve the following problems using the Kuhn-Tucker conditions:

(i) Minimize f(x1, x2) = (x1 − 1)² + (x2 − 5)² subject to −x1² + x2 ≤ 4, −(x1 − 2)² + x2 ≤ 3.
(ii) Maximize Z = 10x1 − x1² + 10x2 − x2² subject to x1 + x2 ≤ 14, x1 + x2 ≥ 6.
(iii) Maximize z = 12x1 + 21x2 + 2x1x2 − 2x1² − 2x2² subject to x2 ≤ 8, x1 + x2 ≤ 10.
(iv) Maximize Z = 2x1 + 3x2 − 2x1² subject to x1 + 4x2 ≤ 4, x1 + x2 ≤ 2.
(v) Minimize Z = 6 − 6x1 + 2x1² − 2x1x2 + 2x2² subject to x1 + x2 ≤ 2.
(vi) Maximize z = 10x1 + 25x2 − 10x1² − x2² − 4x1x2 subject to x1 + 2x2 ≤ 10, x1 + x2 ≤ 9.
Chapter 12
Quadratic Programming
12.1
Objective
The main objective of this chapter is to discuss: • Solution procedures of quadratic programming problems • Wolfe’s modified simplex method • Beale’s method for solving quadratic programming problems.
12.2
Introduction
Quadratic programming deals with non-linear programming problems in which a quadratic objective function is optimized (maximized or minimized) subject to a set of linear inequality constraints. The general form of a quadratic programming problem (QPP) may be written as follows:

Optimize f(x) = Σⱼ₌₁ⁿ cj xj + (1/2) Σⱼ₌₁ⁿ Σₖ₌₁ⁿ djk xj xk

subject to the constraints

Σⱼ₌₁ⁿ aij xj ≤ bi, i = 1, 2, …, m

and xj ≥ 0, j = 1, 2, …, n.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_12
In matrix notation, it is written as

Optimize f(x) = ⟨c, x⟩ + (1/2)⟨Qx, x⟩

subject to the constraints

Ax ≤ b and x ≥ 0,

where x = (x1, x2, …, xn)ᵀ, c = (c1, c2, …, cn)ᵀ, b = (b1, b2, …, bm)ᵀ, A = (aij) is an m × n real matrix and Q = (djk) is an n × n real symmetric matrix, i.e. djk = dkj for all j and k. The matrix Q is taken to be non-null; otherwise, the QPP is a linear programming problem.

The term ⟨Qx, x⟩ defines a quadratic form, Q being a symmetric matrix. The quadratic form ⟨Qx, x⟩ is said to be positive definite (negative definite) if ⟨Qx, x⟩ > 0 (< 0) for all x ≠ 0, and positive semi-definite (negative semi-definite) if ⟨Qx, x⟩ ≥ 0 (≤ 0) for all x, with at least one x ≠ 0 satisfying ⟨Qx, x⟩ = 0. It may easily be verified that:

(i) If ⟨Qx, x⟩ is positive semi-definite (negative semi-definite), then it is convex (concave) in x over all of Rⁿ.
(ii) If ⟨Qx, x⟩ is positive definite (negative definite), then it is strictly convex (strictly concave) in x over all of Rⁿ.

These results help in determining whether the quadratic objective function f(x) is concave (convex), which in turn bears on the sufficiency of the Kuhn-Tucker conditions for constrained maxima (minima) of f(x). The feasibility of the problem can be tested by Phase I of the simplex method. The solution of a general constrained optimization problem falls into one of three classes: (i) no feasible solution, (ii) an unbounded solution, (iii) an optimal solution.

Theorem 1 The quadratic programming problem cannot have an unbounded solution when the matrix Q is positive definite.

Proof Let x ≠ 0. The objective function can be written as

f(x) = (1/2)⟨Qx, x⟩ [1 + 2⟨c, x⟩/⟨Qx, x⟩].   (12.1)

Let x be any point on the hypersphere |x| = r, where |x|² = ⟨x, x⟩. Then x = r x̂, where |x̂| = 1, and ⟨Qx, x⟩ = r²⟨Qx̂, x̂⟩.
Let m₀ be the minimum value of ⟨Qx̂, x̂⟩ over |x̂| = 1. Since Q is positive definite, m₀ > 0, and

⟨Qx, x⟩ ≥ r² m₀ > 0, so ⟨Qx, x⟩ → +∞ as |x| = r → ∞.   (12.2)

Now, let M be the maximum value of |⟨c, x̂⟩|/⟨Qx̂, x̂⟩ over |x̂| = 1. Then

|⟨c, x⟩/⟨Qx, x⟩| = (1/r) |⟨c, x̂⟩|/⟨Qx̂, x̂⟩ ≤ M/r,

and therefore

⟨c, x⟩/⟨Qx, x⟩ → 0 as r → ∞.   (12.3)

From (12.1), (12.2) and (12.3), we get f(x) → +∞ as |x| → ∞. Hence, for finite x, f(x) is finite, and the objective cannot decrease without bound; this means that the QPP cannot have an unbounded solution.

It can further be shown that the QPP has a unique optimal solution provided the problem has a feasible solution and the matrix Q is positive definite. If Q is only positive semi-definite, the solution could be an unbounded one. The difficulties in this case can be avoided by considering the perturbed matrix Q + εI, where ε > 0 is substantially less than the magnitude of any element of Q, since this matrix is positive definite whenever Q is positive semi-definite.
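Theorem 1 hinges on Q being positive definite, which can be tested by Sylvester's criterion: a symmetric matrix is positive definite if and only if all of its leading principal minors are positive. A minimal determinant-based sketch (adequate for the small matrices in this chapter; a Cholesky factorization would be preferable for large ones):

```python
# Sylvester's criterion for positive definiteness of a symmetric matrix,
# given as nested lists. det() is a naive cofactor expansion, fine for
# the 2x2 / 3x3 matrices used in this chapter.

def det(m):
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += ((-1) ** j) * m[0][j] * det(minor)
    return total

def is_positive_definite(q):
    n = len(q)
    # all leading principal minors must be strictly positive
    return all(det([row[:k] for row in q[:k]]) > 0 for k in range(1, n + 1))
```

Negative definiteness of a quadratic form with matrix Q can be tested by applying the same check to −Q; e.g. the form −2x1² − 2x1x2 − 2x2² appearing later in this chapter has matrix [[−2, −1], [−1, −2]], whose negation passes the test.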
12.3
Solution Procedure of QPP
Let us consider a QPP in the following form:

Maximize z = f(x) = Σⱼ₌₁ⁿ cj xj + (1/2) Σⱼ₌₁ⁿ Σₖ₌₁ⁿ djk xj xk

subject to the constraints

Σⱼ₌₁ⁿ aij xj ≤ bi, i = 1, 2, …, m, and xj ≥ 0, j = 1, 2, …, n,

where djk = dkj for all j, k, bi ≥ 0 for i = 1, 2, …, m, and the matrix Q = (djk)ₙₓₙ is symmetric negative definite.
Now, introducing slack variables qi² in the ith constraint (i = 1, 2, …, m) and rj² in the jth non-negativity constraint (j = 1, 2, …, n), the problem becomes

Maximize z = f(x) = Σⱼ₌₁ⁿ cj xj + (1/2) Σⱼ₌₁ⁿ Σₖ₌₁ⁿ djk xj xk

subject to the constraints

Σⱼ₌₁ⁿ aij xj + qi² = bi, i = 1, 2, …, m
−xj + rj² = 0, j = 1, 2, …, n.

Now the Lagrangian function is given by

L(x1, …, xn, q1, …, qm, r1, …, rn, λ1, …, λm, μ1, …, μn)
= f(x) − Σᵢ₌₁ᵐ λi (Σⱼ₌₁ⁿ aij xj + qi² − bi) − Σⱼ₌₁ⁿ μj (−xj + rj²),

where λi ≥ 0, i = 1, 2, …, m and μj ≥ 0, j = 1, 2, …, n. The necessary conditions for the maximum of L are given by

∂L/∂xj = 0 ⇒ ∂f/∂xj − Σᵢ₌₁ᵐ λi aij + μj = 0, j = 1, 2, …, n   (12.4)
∂L/∂qi = 0 ⇒ −2λi qi = 0, i = 1, 2, …, m   (12.5)
∂L/∂rj = 0 ⇒ −2μj rj = 0, j = 1, 2, …, n   (12.6)
∂L/∂λi = 0 ⇒ Σⱼ₌₁ⁿ aij xj + qi² − bi = 0, i = 1, 2, …, m   (12.7)
∂L/∂μj = 0 ⇒ −xj + rj² = 0, j = 1, 2, …, n.   (12.8)

These conditions are known as the Kuhn-Tucker conditions.
By defining a set of new variables si as si = qi² ≥ 0, i = 1, 2, …, m, the equations in (12.5) become

λi si = 0, i = 1, 2, …, m,

and the equations in (12.7) reduce to

Σⱼ₌₁ⁿ aij xj + si − bi = 0, i = 1, 2, …, m.

Now, from (12.6) and (12.8), we have

μj xj = 0 and xj ≥ 0, j = 1, 2, …, n.

The equations in (12.4) can be written as

cj + Σₖ₌₁ⁿ djk xk − Σᵢ₌₁ᵐ λi aij + μj = 0, j = 1, 2, …, n.

Thus, the necessary conditions can be summarized as follows:

cj + Σₖ₌₁ⁿ djk xk − Σᵢ₌₁ᵐ λi aij + μj = 0, j = 1, 2, …, n   (12.9)
Σⱼ₌₁ⁿ aij xj + si = bi, i = 1, 2, …, m   (12.10)
λi si = 0, i = 1, 2, …, m   (12.11)
μj xj = 0, j = 1, 2, …, n   (12.12)
xj, μj ≥ 0, j = 1, 2, …, n   (12.13)
si, λi ≥ 0, i = 1, 2, …, m.   (12.14)

These conditions, except λi si = 0 and μj xj = 0, are linear constraints involving 2(n + m) variables. Thus, the solution of the original QPP can be obtained by finding a non-negative solution to the set of (n + m) linear equations (12.9) and (12.10) which also satisfies the (n + m) complementarity conditions (12.11) and (12.12).
Since Q is a negative definite matrix, f(x) is a strictly concave function, and the feasible space (being defined by linear constraints) is convex; hence any local maximum of the problem is the global maximum. Again, from (12.9)–(12.14) it is observed that there are 2(n + m) relations in the 2(n + m) variables of the necessary conditions, so a feasible solution of the system (12.9)–(12.14), if it exists, gives the optimum solution of the QPP directly.

The solution of the system (12.9)–(12.14) can be obtained by using Phase I of the two-phase simplex method. The only restriction here is that the satisfaction of the non-linear relations (12.11) and (12.12) has to be maintained all the time. Since our objective is just to find a feasible solution to the system, there is no necessity for Phase II computations.

For the application of Phase I of the two-phase method, to get the initial basis as an identity matrix, we introduce artificial variables Aj into Eq. (12.9) so that

cj + Σₖ₌₁ⁿ djk xk − Σᵢ₌₁ᵐ λi aij + μj + Aj = 0, j = 1, 2, …, n.

Hence, the equivalent optimization problem becomes

Maximize z′ = −Σⱼ₌₁ⁿ Aj

subject to the constraints

cj + Σₖ₌₁ⁿ djk xk − Σᵢ₌₁ᵐ λi aij + μj + Aj = 0, j = 1, 2, …, n
Σⱼ₌₁ⁿ aij xj + si = bi, i = 1, 2, …, m
xj, μj, Aj ≥ 0, j = 1, 2, …, n; si, λi ≥ 0, i = 1, 2, …, m.

While solving this problem, we have to consider the following additional conditions, known as complementary slackness conditions:

λi si = 0, i = 1, 2, …, m
μj xj = 0, j = 1, 2, …, n.

Thus, when deciding whether to introduce si into the basic solution, we have to ensure first that either λi is not in the basic solution or λi will be removed when si enters the basic solution. Similar care has to be taken regarding the variables μj and xj.
12.4 Wolfe's Modified Simplex Method
Let the QPP be the following:

Maximize z = f(x) = Σⱼ₌₁ⁿ cj xj + (1/2) Σⱼ₌₁ⁿ Σₖ₌₁ⁿ djk xj xk

subject to the constraints

Σⱼ₌₁ⁿ aij xj ≤ bi, i = 1, 2, …, m, and xj ≥ 0, j = 1, 2, …, n,

where djk = dkj for all j and k, and bi ≥ 0 for all i. Also, assume that the quadratic form Σⱼ₌₁ⁿ Σₖ₌₁ⁿ djk xj xk is negative semi-definite. Then the iterative procedure of the modified simplex method is as follows:

Step 1. Check whether the given QPP is in the maximization form. If it is in the minimization form, convert it into the maximization form by the appropriate adjustment of f(x).

Step 2. Convert the inequality constraints into equations by introducing the slack variables qi² in the ith constraint (i = 1, 2, …, m) and the slack variables rj² in the jth non-negativity constraint (j = 1, 2, …, n).

Step 3. Construct the Lagrangian function

L(x1, …, xn, q1, …, qm, r1, …, rn, λ1, …, λm, μ1, …, μn)
= f(x) − Σᵢ₌₁ᵐ λi (Σⱼ₌₁ⁿ aij xj + qi² − bi) − Σⱼ₌₁ⁿ μj (−xj + rj²),

where λi ≥ 0, i = 1, 2, …, m and μj ≥ 0, j = 1, 2, …, n.

Step 4. Derive the Kuhn-Tucker conditions by differentiating L partially with respect to x1, …, xn, q1, …, qm, r1, …, rn, λ1, …, λm, μ1, …, μn and equating the derivatives to zero.

Step 5. Introduce the non-negative artificial variables Aj, j = 1, 2, …, n, in the first n equations obtained from ∂L/∂xj = 0, j = 1, 2, …, n, construct the new objective function z′ = −Σⱼ₌₁ⁿ Aj, and form an LPP with this new objective function.

Step 6. Use the two-phase simplex method to obtain an optimum solution of the LPP formed in Step 5, satisfying the complementary slackness conditions.
Step 7. The optimal solution obtained in Step 6 is an optimal solution to the given QPP also. Step 8. Stop.
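The restricted basis entry rule enforced in Step 6 reduces to one test at each pivot: a variable may enter the basis only if its complementary partner is not in the basis (or leaves in the same pivot). A minimal helper — the variable names and pair table below are illustrative, for a QPP with m = 1, n = 2:

```python
# Restricted basis entry test for Wolfe's method.
# partner_of maps each variable to its complementary partner
# (lambda_i <-> s_i, mu_j <-> x_j); artificials have no partner.

def may_enter(candidate, basis, partner_of):
    """True if the candidate's complementary partner is nonbasic."""
    partner = partner_of.get(candidate)
    return partner is None or partner not in basis

partner_of = {"l1": "s1", "s1": "l1",
              "m1": "x1", "x1": "m1",
              "m2": "x2", "x2": "m2"}
```

With the initial basis {A1, A2, s1} of the example below, `may_enter("l1", ...)` is False (s1 is basic) while `may_enter("x2", ...)` is True — exactly the choices made in the Phase I tables.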
12.5
Advantage and Disadvantage of Wolfe’s Method
The advantage of Wolfe's method is that only a slight modification of the simplex method is needed to take care of the restricted basis entry rule. The disadvantage is that the number of rows of the coefficient matrix increases from m to m + n.

Example 1 Solve the following problem by Wolfe's method:

Minimize z = x1² − x1x2 + 2x2² − x1 − x2

subject to

2x1 + x2 ≤ 1, x1 ≥ 0, x2 ≥ 0.

Solution The problem is equivalent to

Maximize z′ = −x1² + x1x2 − 2x2² + x1 + x2

subject to 2x1 + x2 ≤ 1, x1 ≥ 0, x2 ≥ 0, where z = −z′.

Now, introducing slack variables q1², r1² and r2², we have the reduced problem

Maximize z′ = −x1² + x1x2 − 2x2² + x1 + x2

subject to

2x1 + x2 + q1² = 1
−x1 + r1² = 0, −x2 + r2² = 0.

Now the Lagrangian function L is given by

L(x1, x2, q1, r1, r2, λ1, μ1, μ2) = −x1² + x1x2 − 2x2² + x1 + x2 − λ1(2x1 + x2 + q1² − 1) − μ1(−x1 + r1²) − μ2(−x2 + r2²).
Clearly,

−x1² + x1x2 − 2x2² = −(x1 − x2/2)² − (7/4)x2²,

which is a negative definite quadratic form. Hence, z′ = −x1² + x1x2 − 2x2² + x1 + x2 is a concave function in x1, x2, so the maximum of L is the maximum of z′ and vice versa. Now the necessary conditions for the maxima of L (and hence of z′) are given by

∂L/∂x1 = 0 ⇒ −2x1 + x2 + 1 − 2λ1 + μ1 = 0   (12.15)
∂L/∂x2 = 0 ⇒ x1 − 4x2 + 1 − λ1 + μ2 = 0   (12.16)
∂L/∂λ1 = 0 ⇒ 2x1 + x2 + q1² − 1 = 0   (12.17)
∂L/∂μ1 = 0 ⇒ −x1 + r1² = 0   (12.18)
∂L/∂μ2 = 0 ⇒ −x2 + r2² = 0   (12.19)
∂L/∂q1 = 0 ⇒ λ1 q1 = 0   (12.20)
∂L/∂r1 = 0 ⇒ μ1 r1 = 0, i.e. μ1 x1 = 0 since x1 = r1²   (12.21)
∂L/∂r2 = 0 ⇒ μ2 r2 = 0, i.e. μ2 x2 = 0 since x2 = r2².   (12.22)

Now, letting q1² = s1 ≥ 0, from (12.17) we get

2x1 + x2 + s1 − 1 = 0,   (12.23)

and from (12.20), we get

λ1 s1 = 0,   (12.24)

where x1, x2, s1, μ1, μ2, λ1 ≥ 0.
Now the equivalent problem of the given QPP is as follows:

2x1 − x2 + 2λ1 − μ1 = 1
−x1 + 4x2 + λ1 − μ2 = 1   (12.25)
2x1 + x2 + s1 = 1

x1, x2, s1, λ1, μ1, μ2 ≥ 0   (12.26)

together with λ1 s1 = 0, μ1 x1 = 0, μ2 x2 = 0.

A solution xj, j = 1, 2, of (12.25) satisfying (12.26) shall necessarily be an optimal one for maximizing L (i.e. maximizing z′). Now, to determine the solution of the simultaneous equations (12.25), we introduce two artificial variables A1 ≥ 0 and A2 ≥ 0 in the first two equations of (12.25) and construct the dummy objective function z″ = −A1 − A2. Then the equivalent problem of the given QPP becomes

Maximize z″ = −A1 − A2

subject to

2x1 − x2 + 2λ1 − μ1 + A1 = 1
−x1 + 4x2 + λ1 − μ2 + A2 = 1
2x1 + x2 + s1 = 1
x1, x2, λ1, μ1, μ2, s1, A1, A2 ≥ 0,

satisfying the complementary slackness conditions λ1 s1 = 0, μ1 x1 = 0, μ2 x2 = 0.

To solve the problem, we shall apply the two-phase method. Here the initial basic feasible solution (BFS) is A1 = 1, A2 = 1, s1 = 1. Now we construct the initial table of Phase I.
             cj →    0     0     0     0     0    −1    −1     0
cB  Basis   xB  |   x1    x2    λ1    μ1    μ2    A1    A2    s1 | Min ratio
−1  A1       1  |    2    −1     2    −1     0     1     0     0 |   —
−1  A2       1  |   −1     4     1     0    −1     0     1     0 |  1/4
 0  s1       1  |    2     1     0     0     0     0     0     1 |   1
z″B = −2    Δj  |   −1    −3    −3     1     1     0     0     0 |

(Δj = zj − cj.)
Here not all Δj ≥ 0, so the current BFS is not optimal. Either x2 or λ1 (Δ2 = Δ3 = −3) could be introduced into the basis; however, λ1 may not enter, since by the complementary condition λ1 s1 = 0 its partner s1 is already in the basis. So we choose x2 as the entering vector, and the minimum ratio 1/4 makes A2 the leaving vector.

             cj →    0     0     0     0     0    −1    −1     0
cB  Basis   xB  |   x1    x2    λ1    μ1    μ2    A1    A2    s1 | Min ratio
−1  A1      5/4 |  7/4     0   9/4    −1  −1/4     1   1/4     0 |  5/7
 0  x2      1/4 | −1/4     1   1/4     0  −1/4     0   1/4     0 |   —
 0  s1      3/4 |  9/4     0  −1/4     0   1/4     0  −1/4     1 |  1/3
z″B = −5/4  Δj  | −7/4     0  −9/4     1   1/4     0   3/4     0 |

Again not all Δj ≥ 0, so the current BFS is not optimal. The Δj corresponding to λ1 is the most negative, but λ1 still cannot enter the basis, since s1 is basic. So x1 (Δ1 = −7/4) enters, and the minimum ratio 1/3 makes s1 the leaving vector.
             cj →    0     0     0     0     0    −1    −1     0
cB  Basis   xB  |   x1    x2    λ1    μ1    μ2    A1    A2    s1 | Min ratio
−1  A1      2/3 |    0     0  22/9    −1  −4/9     1   4/9  −7/9 |  3/11
 0  x2      1/3 |    0     1   2/9     0  −2/9     0   2/9   1/9 |  3/2
 0  x1      1/3 |    1     0  −1/9     0   1/9     0  −1/9   4/9 |   —
z″B = −2/3  Δj  |    0     0 −22/9     1   4/9     0   5/9   7/9 |

Now s1 has left the basis, so λ1 is free to enter; the minimum ratio 3/11 occurs in the A1-row, so A1 leaves.

             cj →    0     0     0      0      0     −1    −1      0
cB  Basis   xB   |  x1    x2    λ1     μ1     μ2     A1    A2     s1 | Min ratio
 0  λ1      3/11 |   0     0     1  −9/22  −2/11   9/22  2/11  −7/22 |
 0  x2      3/11 |   0     1     0   1/11  −2/11  −1/11  2/11   2/11 |
 0  x1      4/11 |   1     0     0  −1/22   1/11   1/22 −1/11   9/22 |
z″B = 0     Δj   |   0     0     0      0      0      1     1      0 |

Since all Δj ≥ 0 and z″B = 0, both artificial variables have left the basis and the current BFS is feasible and optimal. The optimal solution is x1 = 4/11, x2 = 3/11 with λ1 = 3/11, s1 = μ1 = μ2 = 0, and Max z′ = 5/11, i.e. Min z = −5/11.

Example 2 Solve the following problem by Wolfe's method:

Minimize z = −10x1 − 25x2 + 10x1² + x2² + 4x1x2

subject to

x1 + 2x2 ≤ 10
x1 + x2 ≤ 9
and x1 ≥ 0, x2 ≥ 0.
Solution This is a minimization problem. Converting the problem into a maximization problem, we get

Maximize z′ = 10x1 + 25x2 − 10x1² − x2² − 4x1x2

subject to

x1 + 2x2 ≤ 10
x1 + x2 ≤ 9
x1 ≥ 0, x2 ≥ 0,

where z = −z′. Now, introducing slack variables q1², q2², r1² and r2², we have the reduced problem

Maximize z′ = 10x1 + 25x2 − 10x1² − x2² − 4x1x2

subject to

x1 + 2x2 + q1² = 10
x1 + x2 + q2² = 9
−x1 + r1² = 0, −x2 + r2² = 0.

Now the Lagrangian function L is given by

L(x1, x2, q1, q2, r1, r2, λ1, λ2, μ1, μ2) = 10x1 + 25x2 − 10x1² − x2² − 4x1x2 − λ1(x1 + 2x2 + q1² − 10) − λ2(x1 + x2 + q2² − 9) − μ1(−x1 + r1²) − μ2(−x2 + r2²).

Clearly, −10x1² − x2² − 4x1x2 is a negative definite quadratic form. Hence, z′ = 10x1 + 25x2 − 10x1² − x2² − 4x1x2 is a concave function in x1, x2, so the maximum of L is the maximum of z′ and vice versa. Now the necessary conditions for the maxima of L (and hence of z′) are given by

∂L/∂x1 = 0 ⇒ 10 − 20x1 − 4x2 − λ1 − λ2 + μ1 = 0   (12.27)
∂L/∂x2 = 0 ⇒ 25 − 2x2 − 4x1 − 2λ1 − λ2 + μ2 = 0   (12.28)
∂L/∂λ1 = 0 ⇒ x1 + 2x2 + q1² − 10 = 0   (12.29)
∂L/∂λ2 = 0 ⇒ x1 + x2 + q2² − 9 = 0   (12.30)
∂L/∂μ1 = 0 ⇒ −x1 + r1² = 0   (12.31)
∂L/∂μ2 = 0 ⇒ −x2 + r2² = 0   (12.32)
∂L/∂q1 = 0 ⇒ λ1 q1 = 0   (12.33)
∂L/∂q2 = 0 ⇒ λ2 q2 = 0   (12.34)
∂L/∂r1 = 0 ⇒ μ1 r1 = 0, i.e. μ1 x1 = 0 since x1 = r1²   (12.35)
∂L/∂r2 = 0 ⇒ μ2 r2 = 0, i.e. μ2 x2 = 0 since x2 = r2².   (12.36)

Now, letting q1² = s1 ≥ 0, we have from (12.29)

x1 + 2x2 + s1 − 10 = 0.   (12.37)

From (12.33),

λ1 s1 = 0.   (12.38)

Similarly, letting q2² = s2 ≥ 0, we have from (12.30) and (12.34)

x1 + x2 + s2 − 9 = 0   (12.39)
λ2 s2 = 0,   (12.40)

where x1, x2, s1, s2, μ1, μ2, λ1, λ2 ≥ 0. Now the equivalent problem of the given QPP is as follows:

20x1 + 4x2 + λ1 + λ2 − μ1 = 10
4x1 + 2x2 + 2λ1 + λ2 − μ2 = 25   (12.41)
x1 + 2x2 + s1 = 10
x1 + x2 + s2 = 9
x1, x2 ≥ 0, s1, s2 ≥ 0, λ1, λ2 ≥ 0, μ1, μ2 ≥ 0   (12.42)

together with λ1 s1 = 0, λ2 s2 = 0, μ1 x1 = 0, μ2 x2 = 0.

A solution xj, j = 1, 2, of (12.41) satisfying (12.42) shall necessarily be an optimal one for maximizing L (i.e. maximizing z′). Now, to determine the solution of the simultaneous equations (12.41), we introduce two artificial variables A1 ≥ 0 and A2 ≥ 0 in the first two equations of (12.41) and construct the dummy objective function z″ = −A1 − A2. Then the equivalent problem of the given QPP becomes

Maximize z″ = −A1 − A2

subject to

20x1 + 4x2 + λ1 + λ2 − μ1 + A1 = 10
4x1 + 2x2 + 2λ1 + λ2 − μ2 + A2 = 25
x1 + 2x2 + s1 = 10
x1 + x2 + s2 = 9
x1, x2, λ1, λ2, μ1, μ2, s1, s2, A1, A2 ≥ 0,

satisfying the complementary slackness conditions λ1 s1 = 0, λ2 s2 = 0, μ1 x1 = 0, μ2 x2 = 0.

To solve the problem we shall apply the two-phase method. Here the initial BFS is A1 = 10, A2 = 25, s1 = 10, s2 = 9. Now we construct the initial table of Phase I.

             cj →   0    0    0    0    0    0   −1   −1    0    0
cB  Basis   xB  |  x1   x2   λ1   λ2   μ1   μ2   A1   A2   s1   s2 | Min ratio
−1  A1      10  |  20    4    1    1   −1    0    1    0    0    0 |  1/2
−1  A2      25  |   4    2    2    1    0   −1    0    1    0    0 |  25/4
 0  s1      10  |   1    2    0    0    0    0    0    0    1    0 |  10
 0  s2       9  |   1    1    0    0    0    0    0    0    0    1 |   9
z″B = −35   Δj  | −24   −6   −3   −2    1    1    0    0    0    0 |

Here not all Δj ≥ 0, so the current BFS is not optimal. The variable x1 may enter the basis, since its complementary partner μ1 is not in the basis; the minimum ratio 1/2 makes A1 the outgoing (leaving) vector. Now we construct the next table.
             cj →    0     0      0      0      0     0    −1    −1    0    0
cB  Basis   xB   |  x1    x2     λ1     λ2     μ1    μ2    A1    A2   s1   s2 | Min ratio
 0  x1      1/2  |   1   1/5   1/20   1/20  −1/20     0   1/20    0    0    0 |  5/2
−1  A2      23   |   0   6/5    9/5    4/5    1/5    −1  −1/5     1    0    0 |  115/6
 0  s1      19/2 |   0   9/5  −1/20  −1/20   1/20     0  −1/20    0    1    0 |  95/18
 0  s2      17/2 |   0   4/5  −1/20  −1/20   1/20     0  −1/20    0    0    1 |  85/8
z″B = −23   Δj   |   0  −6/5   −9/5   −4/5   −1/5     1   6/5     0    0    0 |

Here not all Δj ≥ 0, so the current BFS is not optimal. The Δj corresponding to λ1 is the most negative, but λ1 does not enter the basis, since by the complementary condition its partner s1 is already in the basis. So the entering vector is x2, as its partner μ2 is not in the basis; the minimum ratio 5/2 makes x1 the leaving vector.
             cj →    0     0     0     0     0     0    −1    −1    0    0
cB  Basis   xB   |  x1    x2    λ1    λ2    μ1    μ2    A1    A2   s1   s2 | Min ratio
 0  x2      5/2  |   5     1   1/4   1/4  −1/4     0   1/4     0    0    0 |   —
−1  A2      20   |  −6     0   3/2   1/2   1/2    −1  −1/2     1    0    0 |  40
 0  s1      5    |  −9     0  −1/2  −1/2   1/2     0  −1/2     0    1    0 |  10
 0  s2      13/2 |  −4     0  −1/4  −1/4   1/4     0  −1/4     0    0    1 |  26
z″B = −20   Δj   |   6     0  −3/2  −1/2  −1/2     1   3/2     0    0    0 |

Here not all Δj ≥ 0, so the current BFS is not optimal. The Δj corresponding to λ1 is the most negative, but λ1 does not enter the basis, as s1 is already in the basis; likewise λ2 does not enter, as s2 is in the basis. Now μ1 is the entering vector: by the complementary condition μ1 x1 = 0, and x1 has just left the basis. The minimum ratio 10 makes s1 the leaving vector.
             cj →    0    0    0    0    0    0   −1   −1     0    0
cB  Basis   xB  |   x1   x2   λ1   λ2   μ1   μ2   A1   A2    s1   s2 | Min ratio
 0  x2       5  |  1/2    1    0    0    0    0    0    0   1/2    0 |   —
−1  A2      15  |    3    0    2    1    0   −1    0    1    −1    0 |  15/2
 0  μ1      10  |  −18    0   −1   −1    1    0   −1    0     2    0 |   —
 0  s2       4  |  1/2    0    0    0    0    0    0    0  −1/2    1 |   —
z″B = −15   Δj  |   −3    0   −2   −1    0    1    1    0     1    0 |

Here not all Δj ≥ 0, so the BFS is not optimal. The Δj corresponding to x1 is the most negative, but x1 does not enter the basis, since by the complementary condition its partner μ1 is now in the basis. By the complementary condition, λ1 may enter (s1 has left the basis), and the minimum ratio 15/2 makes A2 the leaving vector.
          cj →          0     0     0     0     0     0    0    0
cB   Basis    xB       x1    x2    λ1    λ2    μ1    μ2   s1   s2
 0    x2       5      1/2     1     0     0     0     0   1/2   0
 0    λ1    15/2      3/2     0     1   1/2     0  −1/2  −1/2   0
 0    μ1    35/2   −33/2      0     0  −1/2     1  −1/2   3/2   0
 0    s2       4      1/2     0     0     0     0     0  −1/2   1
z″B = 0      Δj         0     0     0     0     0     0    0    0
Since all Δj ≥ 0, the current BFS is optimal. The optimal solution is

x1 = 0, x2 = 5, λ1 = 15/2, λ2 = 0, μ1 = 35/2, μ2 = 0, s1 = 0, s2 = 4.

This solution satisfies the complementary slackness conditions λ1s1 = 0 = λ2s2 and μ1x1 = 0 = μ2x2. Since z″B = 0, the current solution is also feasible, and

Minimize z = −10x1 − 25x2 + 10x1² + x2² + 4x1x2 = −10(0) − 25(5) + 10(0) + 5² + 4(0) = −125 + 25 = −100.
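The tableau arithmetic above can be cross-checked independently. The following sketch (ours, not part of the book; it uses Python's `fractions` module for exact rational arithmetic) substitutes the solution just read off the final table into the Kuhn–Tucker system and the complementary slackness conditions:

```python
from fractions import Fraction as F

# Solution read off the final tableau
x1, x2 = F(0), F(5)
l1, l2 = F(15, 2), F(0)          # lambda_1, lambda_2
m1, m2 = F(35, 2), F(0)          # mu_1, mu_2
s1, s2 = F(0), F(4)

# Stationarity rows of the KKT system for
# min z = -10x1 - 25x2 + 10x1^2 + x2^2 + 4x1x2
assert 20*x1 + 4*x2 + l1 + l2 - m1 == 10
assert 4*x1 + 2*x2 + 2*l1 + l2 - m2 == 25
# Primal feasibility with slacks
assert x1 + 2*x2 + s1 == 10
assert x1 + x2 + s2 == 9
# Complementary slackness
assert l1*s1 == l2*s2 == m1*x1 == m2*x2 == 0

z = -10*x1 - 25*x2 + 10*x1**2 + x2**2 + 4*x1*x2
print(z)   # -100
```

All conditions hold exactly, confirming the tableau computation.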
Example 3 Apply Wolfe’s method for solving the following QPP:
Maximize z = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2²
subject to x1 + 2x2 ≤ 2 and x1, x2 ≥ 0.

Solution The given problem is a maximization problem. Converting all the inequality constraints, including the non-negativity constraints, into ≤ type, we have

x1 + 2x2 ≤ 2, −x1 ≤ 0 and −x2 ≤ 0.

Now introducing slack variables q1², r1² and r2², we have the reduced problem as

Maximize z = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2²
subject to x1 + 2x2 + q1² = 2
−x1 + r1² = 0
−x2 + r2² = 0.

Now, the Lagrangian function L is given by

L = L(x1, x2, q1, μ1, μ2, λ1, r1, r2) = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2² − λ1(x1 + 2x2 + q1² − 2) − μ1(−x1 + r1²) − μ2(−x2 + r2²).

As −2x1² − 2x1x2 − 2x2² represents a negative semi-definite quadratic form, z = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2² is concave in x1, x2. Therefore, the maxima of L will be the maxima of z and vice versa. Now, the necessary conditions for the maxima of L (and hence of z) are

∂L/∂x1 = 0, or 4 − 4x1 − 2x2 − λ1 + μ1 = 0
∂L/∂x2 = 0, or 6 − 2x1 − 4x2 − 2λ1 + μ2 = 0
∂L/∂λ1 = 0, or x1 + 2x2 + q1² = 2
∂L/∂μ1 = 0, or −x1 + r1² = 0
286
12 Quadratic Programming
∂L/∂μ2 = 0, or −x2 + r2² = 0
∂L/∂q1 = 0, or −2λ1q1 = 0 ⇒ λ1q1² = 0
∂L/∂r1 = 0, or −2μ1r1 = 0, or μ1r1² = 0, or μ1x1 = 0 (as −x1 + r1² = 0)
∂L/∂r2 = 0, or −2μ2r2 = 0, or μ2r2² = 0, or μ2x2 = 0 (as −x2 + r2² = 0).

Now, letting q1² = s1 ≥ 0, we have λ1s1 = 0 (as λ1q1² = 0). Also, x1 + 2x2 + s1 = 2 and finally, x1, x2, s1, λ1, μ1, μ2 ≥ 0. Hence, to determine all the variables, the equivalent problem of the given QPP is as follows:

4x1 + 2x2 + λ1 − μ1 = 4
2x1 + 4x2 + 2λ1 − μ2 = 6        (12.43)
x1 + 2x2 + s1 = 2,
where all variables are non-negative and

λ1s1 = 0, μ1x1 = 0, μ2x2 = 0.        (12.44)
A solution xj (j = 1, 2) of (12.43) satisfying (12.44) will necessarily be an optimal one for maximizing L (i.e. maximizing z). Now, to determine the solution of the simultaneous equations (12.43), we introduce two artificial variables A1 (≥ 0) and A2 (≥ 0) in the first two equations of (12.43) and construct the dummy objective function z′ = −A1 − A2. Then the equivalent problem of the given QPP becomes

Maximize z′ = −A1 − A2
subject to 4x1 + 2x2 + λ1 − μ1 + A1 = 4
2x1 + 4x2 + 2λ1 − μ2 + A2 = 6
x1 + 2x2 + s1 = 2
x1, x2 ≥ 0; λ1 ≥ 0; μ1, μ2, s1, A1, A2 ≥ 0,
satisfying the complementary slackness conditions λ1s1 = 0, μ1x1 = μ2x2 = 0.

To solve the problem, we shall apply the two-phase method. Here the initial BFS is A1 = 4, A2 = 6, s1 = 2. Now we construct the initial table of Phase I.
          cj →         0    0    0    0    0   −1   −1    0
cB   Basis    xB      x1   x2   λ1   μ1   μ2   A1   A2   s1   Min ratio
−1    A1       4       4    2    1   −1    0    1    0    0   4/4 = 1
−1    A2       6       2    4    2    0   −1    0    1    0   6/2 = 3
 0    s1       2       1    2    0    0    0    0    0    1   2/1 = 2
z′B = −10    Δj       −6   −6   −3    1    1    0    0    0
Here not all Δj ≥ 0; hence, the current BFS is not optimal. From the preceding table, we observe that either x1 or x2 (as Δ1 = −6, Δ2 = −6) can be introduced into the basis. Arbitrarily, we choose x1 as the entering basic variable, as μ1 is non-basic (so μ1x1 = 0 remains satisfied). Therefore, A1 is the leaving basic variable. Now we construct the next table, dropping the A1 column.
          cj →         0     0     0     0     0    −1    0
cB   Basis    xB      x1    x2    λ1    μ1    μ2    A2   s1   Min ratio
 0    x1       1       1   1/2   1/4  −1/4     0     0    0      2
−1    A2       4       0     3   3/2   1/2    −1     1    0    4/3
 0    s1       1       0   3/2  −1/4   1/4     0     0    1    2/3
z′B = −4     Δj        0    −3  −3/2  −1/2     1     0    0
Since μ2 = 0 (non-basic), x2 is introduced as the basic variable, with s1 as the leaving basic variable. Now, we construct the next table.

          cj →         0     0     0     0     0    −1    0
cB   Basis    xB      x1    x2    λ1    μ1    μ2    A2   s1   Min ratio
 0    x1     2/3       1     0   1/3  −1/3     0     0  −1/3     2
−1    A2       2       0     0     2     0    −1     1   −2      1
 0    x2     2/3       0     1  −1/6   1/6     0     0   2/3     —
z′B = −2     Δj        0     0    −2     0     1     0    2
Since s1 = 0 (non-basic), λ1 is introduced as the basic variable; A2 is the leaving basic variable. Now we construct the next table, dropping the A2 column.
          cj →         0     0     0     0     0     0
cB   Basis    xB      x1    x2    λ1    μ1    μ2    s1
 0    x1     1/3       1     0     0  −1/3   1/6     0
 0    λ1       1       0     0     1     0  −1/2    −1
 0    x2     5/6       0     1     0   1/6 −1/12   1/2
z′B = 0      Δj        0     0     0     0     0     0
Here all Δj ≥ 0. Hence, the optimal solution for Phase I has been obtained, and the solution is x1 = 1/3, x2 = 5/6, λ1 = 1, μ1 = 0, μ2 = 0, s1 = 0. Thus, the optimal solution of the original QPP is given by

x1 = 1/3, x2 = 5/6 and zmax = 4(1/3) + 6(5/6) − 2(1/3)² − 2(1/3)(5/6) − 2(5/6)² = 25/6.

Example 4 Apply Wolfe's method for solving the following problem:

Maximize z = 2x1 + x2 − x1²
subject to 2x1 + 3x2 ≤ 6
2x1 + x2 ≤ 4
and x1, x2 ≥ 0.

Solution The given problem is a maximization problem. Now, converting all the inequality constraints, including the non-negativity constraints, into ≤ type, we have

2x1 + 3x2 ≤ 6, 2x1 + x2 ≤ 4, −x1 ≤ 0 and −x2 ≤ 0.

Now introducing slack variables q1², q2², r1² and r2², we have the reduced problem as
Maximize z = 2x1 + x2 − x1²
subject to 2x1 + 3x2 + q1² = 6
2x1 + x2 + q2² = 4
−x1 + r1² = 0
−x2 + r2² = 0.

The Lagrangian function L is given by

L(x1, x2, q1, q2, r1, r2, λ1, λ2, μ1, μ2) = 2x1 + x2 − x1² − λ1(2x1 + 3x2 + q1² − 6) − λ2(2x1 + x2 + q2² − 4) − μ1(−x1 + r1²) − μ2(−x2 + r2²).

As −x1² represents a negative semi-definite quadratic form, z = 2x1 + x2 − x1² is concave in x1, x2. Hence, the maxima of L will be the maxima of z and vice versa. Now, the necessary conditions for maxima of L (and hence of z) are

∂L/∂x1 = 2 − 2x1 − 2λ1 − 2λ2 + μ1 = 0, or 2x1 + 2λ1 + 2λ2 − μ1 = 2
∂L/∂x2 = 1 − 3λ1 − λ2 + μ2 = 0, or 3λ1 + λ2 − μ2 = 1
∂L/∂λ1 = −(2x1 + 3x2 + q1² − 6) = 0, or 2x1 + 3x2 + s1 = 6 (putting q1² = s1)
∂L/∂λ2 = −(2x1 + x2 + q2² − 4) = 0, or 2x1 + x2 + s2 = 4 (putting q2² = s2)
∂L/∂q1 = −2λ1q1 = 0, or λ1q1² = 0, or λ1s1 = 0
∂L/∂q2 = −2λ2q2 = 0, or λ2q2² = 0, or λ2s2 = 0
∂L/∂μ1 = −(−x1 + r1²) = 0, or x1 = r1²
∂L/∂μ2 = −(−x2 + r2²) = 0, or x2 = r2²
∂L/∂r1 = −2μ1r1 = 0, or μ1r1² = 0, or μ1x1 = 0
∂L/∂r2 = −2μ2r2 = 0, or μ2r2² = 0, or μ2x2 = 0
and x1, x2 ≥ 0; λ1, λ2 ≥ 0; μ1, μ2, s1, s2 ≥ 0.

Hence, to determine all the variables, the equivalent problem of the given QPP is as follows:

2x1 + 2λ1 + 2λ2 − μ1 = 2
3λ1 + λ2 − μ2 = 1
2x1 + 3x2 + s1 = 6        (12.45)
2x1 + x2 + s2 = 4,

where all variables are non-negative and

λ1s1 = 0, λ2s2 = 0, μ1x1 = 0, μ2x2 = 0.        (12.46)

A solution xj (j = 1, 2) of (12.45) satisfying (12.46) will necessarily be an optimal one for maximizing L (i.e. maximizing z). Now, to determine the solution of the simultaneous equations (12.45), we introduce two artificial variables A1 (≥ 0) and A2 (≥ 0) in the first two equations of (12.45) and construct the dummy objective function z′ = −A1 − A2. Then the equivalent problem of the given QPP becomes

Maximize z′ = −A1 − A2
subject to 2x1 + 2λ1 + 2λ2 − μ1 + A1 = 2
3λ1 + λ2 − μ2 + A2 = 1
2x1 + 3x2 + s1 = 6
2x1 + x2 + s2 = 4
and x1, x2 ≥ 0; λ1, λ2 ≥ 0; μ1, μ2, s1, s2, A1, A2 ≥ 0,

satisfying the complementary slackness conditions λ1s1 = 0, λ2s2 = 0, μ1x1 = 0, μ2x2 = 0.

To solve the problem, we shall apply the two-phase method. Here the initial BFS is A1 = 2, A2 = 1, s1 = 6, s2 = 4.
Now we construct the initial table of Phase I.

          cj →         0    0    0    0    0    0    0    0   −1   −1
cB   Basis    xB      x1   x2   λ1   λ2   μ1   μ2   s1   s2   A1   A2   Min ratio
−1    A1       2       2    0    2    2   −1    0    0    0    1    0      1
−1    A2       1       0    0    3    1    0   −1    0    0    0    1      —
 0    s1       6       2    3    0    0    0    0    1    0    0    0      3
 0    s2       4       2    1    0    0    0    0    0    1    0    0      2
z′B = −3     Δj       −2    0   −5   −3    1    1    0    0    0    0
Here not all Δj ≥ 0; hence, the current BFS is not optimal. The Δj corresponding to λ1 is the most negative and that corresponding to λ2 is the second most negative, but λ1 and λ2 will not be considered as basic variables, since s1 and s2 are basic variables. So the entering basic variable will be x1, as μ1 is not a basic variable, and A1 is the leaving basic variable. Now we construct the next table, dropping the A1 column.
          cj →         0    0    0    0     0    0    0    0   −1
cB   Basis    xB      x1   x2   λ1   λ2    μ1   μ2   s1   s2   A2   Min ratio
 0    x1       1       1    0    1    1  −1/2    0    0    0    0      —
−1    A2       1       0    0    3    1     0   −1    0    0    1      —
 0    s1       4       0    3   −2   −2     1    0    1    0    0    4/3
 0    s2       2       0    1   −2   −2     1    0    0    1    0      2
z′B = −1     Δj        0    0   −3   −1     0    1    0    0    0
Here not all Δj ≥ 0; hence, the current BFS is not optimal. From this table, we observe that the Δj corresponding to λ1 is the most negative and that corresponding to λ2 is the second most negative. However, λ1 and λ2 will not be considered as basic variables, as s1 and s2 are already basic variables. So the entering basic variable will be x2 (as μ2 is not a basic variable) and s1 is the leaving basic variable. Now, we construct the next table.
          cj →         0    0     0     0     0    0     0    0   −1
cB   Basis    xB      x1   x2    λ1    λ2    μ1   μ2    s1   s2   A2   Min ratio
 0    x1       1       1    0     1     1  −1/2    0     0    0    0      1
−1    A2       1       0    0     3     1     0   −1     0    0    1    1/3
 0    x2     4/3       0    1  −2/3  −2/3   1/3    0   1/3    0    0      —
 0    s2     2/3       0    0  −4/3  −4/3   2/3    0  −1/3    1    0      —
z′B = −1     Δj        0    0    −3    −1     0    1     0    0    0
Here not all Δj ≥ 0; hence, the current solution is not optimal. From the preceding table, it is observed that the Δj corresponding to λ1 is the most negative. Hence, λ1 is the new basic variable (since s1 is not in the basis) and A2 is the leaving basic variable.
          cj →         0    0    0     0     0     0     0    0
cB   Basis    xB      x1   x2   λ1    λ2    μ1    μ2    s1   s2
 0    x1     2/3       1    0    0   2/3  −1/2   1/3     0    0
 0    λ1     1/3       0    0    1   1/3     0  −1/3     0    0
 0    x2    14/9       0    1    0  −4/9   1/3  −2/9   1/3    0
 0    s2    10/9       0    0    0  −8/9   2/3  −4/9  −1/3    1
z′B = 0      Δj        0    0    0     0     0     0     0    0
From the table, it is clear that all Δj ≥ 0. Hence, the current solution is optimal. The optimal solution is x1 = 2/3, x2 = 14/9, λ1 = 1/3, λ2 = 0, μ1 = 0, μ2 = 0, s1 = 0, s2 = 10/9. This solution also satisfies the complementary slackness conditions λ1s1 = 0 = λ2s2 and μ1x1 = 0 = μ2x2. Again, since z′B = 0, the current solution is also feasible. Now, Max z = 2(2/3) + 14/9 − (2/3)² = 12/9 + 14/9 − 4/9 = 22/9. Therefore, the optimal solution of the original QPP is given by x1 = 2/3, x2 = 14/9 and zmax = 22/9.
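Wolfe's restricted-entry rule is one systematic way of searching the complementarity conditions; for a system as small as (12.45)–(12.46) they can also be enumerated by brute force. The following sketch (our own check, not the book's method, assuming NumPy is available) fixes one variable of each complementary pair at zero, solves the resulting square linear system, and keeps the non-negative candidates:

```python
from itertools import product
import numpy as np

# System (12.45); variable order v = [x1, x2, lam1, lam2, mu1, mu2, s1, s2]
M = np.array([[2, 0, 2, 2, -1,  0, 0, 0],
              [0, 0, 3, 1,  0, -1, 0, 0],
              [2, 3, 0, 0,  0,  0, 1, 0],
              [2, 1, 0, 0,  0,  0, 0, 1]], float)
b = np.array([2., 1., 6., 4.])
pairs = [(2, 6), (3, 7), (4, 0), (5, 1)]   # (lam1,s1), (lam2,s2), (mu1,x1), (mu2,x2)

best = None
for zeros in product(*pairs):              # one member of each pair is set to 0
    free = [j for j in range(8) if j not in zeros]
    try:
        vf = np.linalg.solve(M[:, free], b)
    except np.linalg.LinAlgError:
        continue                           # singular sub-system: no candidate here
    if not np.allclose(M[:, free] @ vf, b):
        continue
    v = np.zeros(8)
    v[free] = vf
    if np.all(v >= -1e-9):                 # keep non-negative KKT candidates only
        zval = 2*v[0] + v[1] - v[0]**2
        if best is None or zval > best[0]:
            best = (zval, v)

z, v = best
print(round(z, 6))   # 2.444444, i.e. 22/9
```

The single surviving candidate is x1 = 2/3, x2 = 14/9, matching the tableau result.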
12.6 Beale's Method

In the year 1959, Beale developed an alternative technique for solving QPPs without using the Kuhn–Tucker conditions. In this technique, classical calculus results are used to partition the variables into basic and non-basic ones. In each iteration, all the basic variables and the objective function are expressed in terms of the non-basic variables.
12.7 Beale's Algorithm for QPPs

Let the general QPP be the following:

Maximize z = ⟨c, x⟩ + (1/2)⟨Qx, x⟩
subject to the constraints Ax ≤ (or ≥ or =) b and x ≥ 0,

where x ∈ Rⁿ, A is an m × n matrix, b is an m × 1 matrix and Q is an n × n symmetric matrix. Beale's iterative procedure for solving this problem can be summarized as follows:

Step 1. Convert the minimization problem into a maximization problem, if required. Introduce slack and/or surplus variables in the inequality constraints, if any, and put the QPP in the standard form.
Step 2. Select arbitrarily m of the variables as basic variables; the remaining variables are considered non-basic. Denote the basic variables by xBi (i = 1, 2, …, m) and the non-basic variables by xNBk (k = 1, 2, …, n − m).
Step 3. Express each basic variable xBi entirely in terms of the non-basic variables (and the ui's, if any), using the given constraints (as well as any additional ones).
Step 4. Express the objective function z also in terms of the non-basic variables (and the ui's, if any).
Step 5. Examine the partial derivatives of z with respect to the non-basic variables at the point xNB = 0 (and u = 0):
(i) If ∂z/∂xNBk ≤ 0 at xNB = 0, u = 0 for each k = 1, 2, …, n − m, and ∂z/∂ui = 0 at xNB = 0, u = 0 for each i, the current basic solution is optimal. Go to Step 9.
(ii) If ∂z/∂xNBk > 0 at xNB = 0, u = 0 for at least one k, choose the most positive one. The corresponding non-basic variable will enter the basis. Go to Step 6.
(iii) If ∂z/∂xNBk ≤ 0 at xNB = 0, u = 0 for each k = 1, 2, …, n − m, but ∂z/∂ui ≠ 0 at xNB = 0, u = 0 for some i = r, introduce a new non-basic variable uj defined by uj = ∂z/∂ur, treat ur as a basic variable, and go to Step 3.
Step 6. Let xNBk = xk be the entering variable identified in Step 5(ii). Compute the largest non-negative value of xk permitted by the non-negativity condition of each basic variable, and the value of xk given by ∂z/∂xk = 0, treating the other non-basic variables as zero.
Step 7. Find the minimum of these values of xk.
(i) If the minimum value of xk occurs from the non-negativity condition of a basic variable, then that variable leaves the basis and xk becomes the new basic variable.
(ii) If the minimum value of xk occurs from ∂z/∂xk = 0, introduce an additional non-basic variable, called a free variable, defined by ui = ∂z/∂xk (ui is unrestricted). This relation becomes an additional constraint equation.
Step 8. Go to Step 3 and repeat the procedure until an optimal basic solution is reached.
Step 9. Determine the optimal values of xB and z by setting xNB = 0 in the expressions obtained in Step 3 and Step 4.
Step 10. Stop.

Example 5 Use Beale's method to solve the following QPP:

Maximize z = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2²
subject to x1 + 2x2 ≤ 2 and x1, x2 ≥ 0.

Solution The given problem is a maximization problem.
Now introducing slack variable x3 ≥ 0 in the given constraint, the given problem reduces to

Maximize z = 4x1 + 6x2 − 2x1² − 2x1x2 − 2x2²
subject to x1 + 2x2 + x3 = 2 and x1, x2, x3 ≥ 0.

Let us consider x2 as the basic variable xB and the remaining two variables x1, x3 as non-basic variables xNB. Now expressing xB and z in terms of xNB, we have

x2 = 1 − (1/2)(x1 + x3)

and z = 4x1 + 6[1 − (1/2)(x1 + x3)] − 2x1² − 2x1[1 − (1/2)(x1 + x3)] − 2[1 − (1/2)(x1 + x3)]²,

which simplifies to z = 4 + x1 − x3 − (3/2)x1² − (1/2)x3². Hence

∂z/∂x1 = 1 − 3x1 and ∂z/∂x3 = −1 − x3,

so that ∂z/∂x1 at xNB = 0 is 1 and ∂z/∂x3 at xNB = 0 is −1.

The values of the partial derivatives of z with respect to x1 and x3 at xNB = 0 indicate that z would be increased if x1 is increased. Therefore, x1 will be the new basic variable. Now we have to find the value of x1. Since x2 = 1 − (1/2)(x1 + x3) ≥ 0 and x3 = 0, we get x1 = 2 if x2 is selected as the non-basic variable. Again, ∂z/∂x1 = 0 with x3 = 0 implies 1 − 3x1 = 0, or x1 = 1/3.

Hence, the new value of x1 is given by x1 = Min{2, 1/3} = 1/3, which corresponds to ∂z/∂x1 = 0 with x3 = 0. Hence, x2 cannot be removed from the basis.

Now, we introduce a non-basic free variable u1 and the new constraint

u1 = ∂z/∂x1, i.e. u1 = 1 − 3x1.

Hence, the current basic variables will be x2, x1, and the remaining variables x3 and u1 will be non-basic:

xB = (x2, x1) and xNB = (x3, u1).

Now expressing xB and z in terms of xNB, we have

x1 = (1 − u1)/3
x2 = 1 − (1/2)(x1 + x3) = 5/6 + u1/6 − x3/2

and, substituting x1 = (1 − u1)/3 into the simplified form of z,

z = 4 + (1 − u1)/3 − x3 − (1 − u1)²/6 − x3²/2,

so that ∂z/∂x3 = −1 − x3 and ∂z/∂u1 = −1/3 + (1 − u1)/3.

Now evaluating these partial derivatives at xNB = 0, i.e. at x3 = 0 and u1 = 0, we have

∂z/∂x3 = −1 (negative) and ∂z/∂u1 = 0,

which indicate that the current xB provides the optimal solution. As x3 = 0, u1 = 0, x1 = 1/3, x2 = 5/6 and

Max z = 4(1/3) + 6(5/6) − 2(1/3)² − 2(1/3)(5/6) − 2(5/6)² = 25/6.
Example 6 Use Beale's method to solve the following QPP:

Maximize z = 2x1 + 3x2 − 2x2²
subject to the constraints x1 + 4x2 ≤ 4, x1 + x2 ≤ 2 and x1, x2 ≥ 0.

Solution Introducing slack variables x3 (≥ 0) and x4 (≥ 0) into the given constraints, we get the reduced problem as

Maximize z = 2x1 + 3x2 − 2x2²
subject to x1 + 4x2 + x3 = 4
x1 + x2 + x4 = 2
and x1, x2, x3, x4 ≥ 0.

Let us take x1 and x2 as basic variables and the remaining variables x3 and x4 as non-basic variables:

xB = (x1, x2) and xNB = (x3, x4).

Now expressing xB and z in terms of xNB, we have

x2 = (2 − x3 + x4)/3
x1 = (4 + x3 − 4x4)/3

and z = (2/3)(4 + x3 − 4x4) + (2 − x3 + x4) − (2/9)(2 − x3 + x4)², so that

∂z/∂x3 = 2/3 − 1 + (4/9)(2 − x3 + x4) = −1/3 + (4/9)(2 − x3 + x4)
∂z/∂x4 = −8/3 + 1 − (4/9)(2 − x3 + x4) = −5/3 − (4/9)(2 − x3 + x4).

Now evaluating these partial derivatives at xNB = 0, i.e. at x3 = 0, x4 = 0, we have

∂z/∂x3 = −1/3 + 8/9 = 5/9 and ∂z/∂x4 = −5/3 − 8/9 = −23/9.

The values of the partial derivatives of z with respect to x3 and x4 at xNB = 0 indicate that z would increase if x3 is increased. Therefore, x3 will be the new basic variable. Now we have to find the value of x3. Since x1 = (4 + x3 − 4x4)/3 and x4 = 0, we get x3 = −4 if x1 is selected as the non-basic variable. This is impossible, as x3 ≥ 0. Again, since x2 = (2 − x3 + x4)/3 with x4 = 0, we get x3 = 2 if x2 is selected as the non-basic variable. On the other hand, ∂z/∂x3 = 0 with x4 = 0 gives

−1/3 + (4/9)(2 − x3) = 0 ⇒ 2 − x3 = 3/4 ⇒ x3 = 5/4.

Hence, the new value of x3 is given by x3 = Min{2, 5/4} = 5/4, which corresponds to ∂z/∂x3 = 0 with x4 = 0. Hence, we cannot remove x1 or x2 from the basis. Now, we introduce a non-basic free variable u1 and the new constraint

u1 = ∂z/∂x3, i.e. u1 = −1/3 + (4/9)(2 − x3 + x4).

Hence, the current basic variables will be x1, x2, x3, and the remaining variables x4 and u1 will be non-basic:

xB = (x1, x2, x3) and xNB = (x4, u1).

Now expressing xB and z in terms of xNB, we have

x3 = 5/4 − (9/4)u1 + x4
x1 = (4 + x3 − 4x4)/3 = (1/3)(21/4 − (9/4)u1 − 3x4) = 7/4 − (3/4)u1 − x4
x2 = (2 − x3 + x4)/3 = (1/3)(3/4 + (9/4)u1) = (1/4)(1 + 3u1)

and z = 2[7/4 − (3/4)u1 − x4] + (3/4)(1 + 3u1) − (1/8)(1 + 3u1)².

Now, ∂z/∂x4 = −2 and ∂z/∂u1 = −3/2 + 9/4 − (3/4)(1 + 3u1).

Evaluating these partial derivatives at xNB = 0, i.e. at x4 = 0, u1 = 0, we have

∂z/∂x4 = −2 and ∂z/∂u1 = −3/2 + 9/4 − 3/4 = 0.

These partial derivatives indicate that the current xB provides the optimal solution. Since x4 = 0, u1 = 0, then x1 = 7/4, x2 = 1/4 and z = 7/2 + 3/4 − 1/8 = 33/8.
Hence, the optimal solution is x1 = 7/4, x2 = 1/4 and max z = 33/8.

Example 7 Solve the following QPP by Beale's method:

Maximize z = 10x1 + 25x2 − 10x1² − x2² − 4x1x2
subject to x1 + 2x2 ≤ 10
x1 + x2 ≤ 9
and x1, x2 ≥ 0.
Solution Introducing slack variables x3 (≥ 0) and x4 (≥ 0) into the given problem, we convert it into

Maximize z = 10x1 + 25x2 − 10x1² − x2² − 4x1x2
subject to x1 + 2x2 + x3 = 10
x1 + x2 + x4 = 9
and x1, x2, x3, x4 ≥ 0.

Let us consider x1, x2 as basic variables and x3, x4 as non-basic variables. Now expressing xB and z in terms of xNB, we have

x1 = 8 + x3 − 2x4 and x2 = 1 − x3 + x4

and z = 10(8 + x3 − 2x4) + 25(1 − x3 + x4) − 10(8 + x3 − 2x4)² − (1 − x3 + x4)² − 4(8 + x3 − 2x4)(1 − x3 + x4).

Now

∂z/∂x3 = 10 − 25 − 20(8 + x3 − 2x4) + 2(1 − x3 + x4) + 4(8 + x3 − 2x4) − 4(1 − x3 + x4)

and

∂z/∂x4 = −20 + 25 + 40(8 + x3 − 2x4) − 2(1 − x3 + x4) + 8(1 − x3 + x4) − 4(8 + x3 − 2x4).
Hence

∂z/∂x3 at xNB = 0 is −145 and ∂z/∂x4 at xNB = 0 is 299.
Setting x3 ¼ 0; we get 5 þ 40ð8 2x4 Þ 2ð1 þ x4 Þ þ 8ð1 þ x4 Þ 4ð8 2x4 Þ ¼ 0 or; 5 þ 320 80x4 2 2x4 þ 8 þ 8x4 32 þ 8x4 ¼ 0 or; 66x4 ¼ 299 299 or; x4 ¼ 66 Hence, the new value of x4 is given by @z Min 4; 299 66 ¼ 4 which corresponds to @x4 ¼ 0 with x3 ¼ 0. So, the new basic variables are x4 and x2 . Now, expressing xB and z in terms of xNB , we have 1 x2 ¼ 5 ðx1 þ x3 Þ 2 1 x4 ¼ 4 þ ð x 3 x 1 Þ 2 2 1 1 1 z ¼ 10x1 þ 25 5 ðx1 þ x3 Þ 10x21 5 ðx1 þ x3 Þ 4x1 5 ðx1 þ x3 Þ ; 2 2 2
where xB ¼ ðx2 ; x4 Þ; xNB ¼ ðx1 ; x3 Þ. Now,
∂z/∂x1 = 10 − 25/2 − 20x1 + [5 − (1/2)(x1 + x3)] − 4[5 − (1/2)(x1 + x3)] + 2x1

and

∂z/∂x3 = −25/2 + [5 − (1/2)(x1 + x3)] + 2x1.

Hence ∂z/∂x1 at xNB = 0 is −35/2 and ∂z/∂x3 at xNB = 0 is −15/2.

Both partial derivatives are negative at xNB = 0; hence, neither x1 nor x3 can be introduced to increase z, and the current xB provides the optimal solution. Setting x1 = 0, x3 = 0, we have x2 = 5 and x4 = 4. Hence, the optimal solution is given by x2 = 5 and x4 = 4, and Max z = 0 + 25(5) − 0 − 25 − 0 = 100.

Note In this problem, if we initially consider x2, x4 as basic variables and x1, x3 as non-basic variables, then the problem can be solved very easily in one iteration.
12.8
Exercises
1. Use Wolfe's method for solving the following problem:
Maximize z = 10x1 − x1² + 10x2 − x2², where x1 + x2 ≤ 14, x1 + x2 ≥ 6, x1, x2 ≥ 0.
2. Use the Kuhn–Tucker conditions to solve the problem:
Maximize z = 12x1 + 21x2 + 2x1x2 − 2x1² − 2x2² subject to x2 ≤ 8, x1 + x2 ≤ 10.
3. Use Wolfe's method for solving the following problem:
Maximize Z = 2x1 + 3x2 − 2x1² subject to the constraints x1 + 4x2 ≤ 4, x1 + x2 ≤ 2, x1, x2 ≥ 0.
4. Use Beale's method to solve the following problem:
Minimize Z = 6 − 6x1 + 2x1² − 2x1x2 + 2x2² such that x1 + x2 ≤ 2 and x1, x2 ≥ 0.
5. Use Wolfe's method for solving the following problem:
Maximize z = 10x1 + 25x2 − 10x1² − x2² − 4x1x2 subject to x1 + 2x2 ≤ 10, x1 + x2 ≤ 9 and x1, x2 ≥ 0.
6. Use Beale's method to solve the following problem:
Minimize z = x1² − x1x2 + 2x2² − x1 − x2 subject to 2x1 + x2 ≤ 1 and x1 ≥ 0, x2 ≥ 0.
Chapter 13
Game Theory
13.1
Objectives
The objectives of this chapter are to discuss: • The process of finding the optimal strategies in conflict and competitive situations. • Different types of finite and infinite games and their solution procedures.
13.2
Introduction
In most practical problems, there arises a situation in which two or more opposing parties (players) have conflicting interests and actions. This type of situation is called a competitive situation. A large variety of competitive situations are commonly observed in daily life, e.g. in political campaigns, elections, advertisements, military battles, etc. A competitive situation is called a game if the following properties are satisfied:

(i) There is a conflict of interests between the participants.
(ii) The number of competitors (participants), called players, is finite.
(iii) Each of the participants has a finite/infinite set of possible courses of action.
(iv) The rules governing these choices are specified and known to the players; a play of the game results when each of the players chooses a single course of action from the list of courses available to him.
(v) The outcome of the game is affected by the choices made by all the players.
(vi) The outcome for each specific set of choices by all the players is known in advance and well defined.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_13
(vii) Every play, i.e. combination of courses of action, determines an outcome (which may be money or points) which determines a set of payments (positive, negative or zero), one to each player.
13.3
Cooperative and Non-cooperative Games
Games in which the objective of each participant (player) is to achieve the largest possible individual gain (the 'payoff') are called non-cooperative games. Games in which the actions of the players are directed to maximizing the gain of the 'collective' without subsequent subdivision of the gain among the players are called cooperative games. Here we shall consider only non-cooperative games.

Let I denote the set of all players. We shall assume that I is finite, I = {1, 2, …, n}, where n denotes the number of players in the game. Let each player i ∈ I have at his disposal a certain set Si of available actions, which are called strategies. Each player has at least two distinct strategies. A round of the game consists of each one of the players choosing a certain strategy si ∈ Si; as a result, a system of strategies s = (s1, …, sn) is put together. This system is called a situation. The set of all situations is S = S1 × S2 × ⋯ × Sn.

In each situation the players attain certain gains (payoffs). The payoff of player i in situation s is denoted by Hi(s). The function Hi defined on the set of all situations is called the payoff function of player i. Thus, Hi : S → R, where R is the set of real numbers.

Definition A system Γ = ⟨I, {Si}i∈I, {Hi}i∈I⟩, where I and Si (i ∈ I) are sets and the Hi are real-valued functions defined on S = ∏ Si, is called a non-cooperative game.

Definition A non-cooperative game Γ = ⟨I, {Si}i∈I, {Hi}i∈I⟩ is called a constant-sum game if there exists a constant C such that Σi∈I Hi(s) = C for each and every situation s ∈ S.

Admissible situation and equilibrium situation Let s = (s1, …, si−1, si, si+1, …, sn) be an arbitrary situation in the game Γ, and let s′i be a strategy of player i. We form a new situation that differs from s only in that the strategy si of player i is replaced by s′i. Denote this as s‖s′i = (s1, …, si−1, s′i, si+1, …, sn). If si and s′i coincide, then obviously s‖s′i = s.
Admissible situation A situation s = (s1, …, sn) in a game Γ = ⟨I, {Si}, {Hi}⟩ is called an admissible situation for player i if for any other strategy s′i of this player we have

Hi(s‖s′i) ≤ Hi(s).

The term 'admissible' can be justified by the fact that if in situation s there exists a strategy s′i for player i such that Hi(s‖s′i) > Hi(s), then player i, knowing that the situation s will materialize, may choose the strategy s′i at the last moment and as a result of this choice end up with a larger payoff. In this sense the situation may be viewed as admissible for player i.

Equilibrium situation A situation admissible for all players is called an equilibrium situation; i.e. Hi(s‖s′i) ≤ Hi(s) is satisfied for every player i and every strategy s′i ∈ Si.

Equilibrium strategy An equilibrium strategy is one that appears in at least one equilibrium situation of the game.

Solution of the game The process of determining an equilibrium situation in a non-cooperative game is often referred to as the solution of the game.

Strategic equivalence of games Let two non-cooperative games with the same set of players and the same sets of strategies be given (i.e. the games differ only in their payoff functions):

Γ1 = ⟨I, {Si}i∈I, {Hi(1)}i∈I⟩ and Γ2 = ⟨I, {Si}i∈I, {Hi(2)}i∈I⟩.

Definition The games Γ1 and Γ2 are called strategically equivalent if a positive number k and numbers Ci (one for each player i ∈ I) exist such that for every situation s

Hi(1)(s) = k·Hi(2)(s) + Ci.

We denote this fact by the symbol Γ1 ~ Γ2.
Properties of strategic equivalence

(i) Reflexivity: Γ ~ Γ; this is easily proved by choosing k = 1 and Ci = 0.
(ii) Symmetry: Γ1 ~ Γ2 ⇒ Γ2 ~ Γ1.

Proof Since Γ1 ~ Γ2, we can write Hi(1)(s) = k·Hi(2)(s) + Ci, k > 0, for all i ∈ I. Hence

Hi(2)(s) = (1/k)·Hi(1)(s) − Ci/k = k′·Hi(1)(s) + C′i, where k′ = 1/k > 0 and C′i = −Ci/k.

Hence, by definition, Γ2 ~ Γ1.

(iii) Transitivity: If Γ1 ~ Γ2 and Γ2 ~ Γ3, then Γ1 ~ Γ3. This can easily be proved using the definition.

Note The basic difference between two strategically equivalent games lies in the amount of initial capital possessed by the players and in the relative units in which the payoffs are measured; this is indicated by the coefficient k. It is therefore natural to suppose that the rational behaviour of players in distinct strategically equivalent games should be the same.

Theorem 13.1 Strategically equivalent games possess the same equilibrium situations.

Proof Let Γ1 ~ Γ2, where Γ1 = ⟨I, {Si}i∈I, {Hi(1)}i∈I⟩ and Γ2 = ⟨I, {Si}i∈I, {Hi(2)}i∈I⟩, and let s be an equilibrium situation in Γ1. This means that for every i ∈ I and s′i ∈ Si we have the inequality

Hi(1)(s‖s′i) ≤ Hi(1)(s), where s = (s1, …, si−1, si, si+1, …, sn) and s‖s′i = (s1, …, si−1, s′i, si+1, …, sn).

Now since Γ1 ~ Γ2, Hi(1) = k·Hi(2) + Ci (k > 0) for all i ∈ I, so

k·Hi(2)(s‖s′i) + Ci ≤ k·Hi(2)(s) + Ci, i.e. Hi(2)(s‖s′i) ≤ Hi(2)(s) for all i ∈ I and s′i ∈ Si, since k > 0,

which implies that s is an equilibrium situation in Γ2 as well.
Zero-sum game A non-cooperative game Γ is called a zero-sum game if for each situation s, Σi∈I Hi(s) = 0.

Theorem 13.2 Any non-cooperative constant-sum game is strategically equivalent to a certain zero-sum game.

Proof Let Γ = ⟨I, {Si}i∈I, {Hi}i∈I⟩ be a constant-sum game. Then Σi∈I Hi(s) = C, C being a constant. Now let us consider the game Γ′ = ⟨I, {Si}i∈I, {H′i}i∈I⟩, where H′i = Hi − Ci and Σi∈I Ci = C. Then obviously Γ ~ Γ′ with k = 1. But

Σi H′i(s) = Σi Hi(s) − Σi Ci = C − C = 0,

so Γ′ is a zero-sum game.
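The construction in the proof is easy to carry out numerically. A brief sketch (the payoffs are hypothetical; any split of C into C1 + C2 works):

```python
import numpy as np

# Hypothetical 2x2 constant-sum game: H1(s) + H2(s) = 10 for every situation s
H1 = np.array([[6, 3],
               [8, 5]])
H2 = 10 - H1
C1, C2 = 4, 6                      # any numbers with C1 + C2 = C = 10
G1, G2 = H1 - C1, H2 - C2          # H'_i = H_i - C_i, with k = 1
assert (G1 + G2 == 0).all()        # the strategically equivalent game is zero-sum
print(G1 + G2)
```

By Theorem 13.1, the new zero-sum game has exactly the same equilibrium situations as the original constant-sum game.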
13.4
Antagonistic Game
The game Γ = ⟨I, {Si}i∈I, {Hi}i∈I⟩ is called antagonistic if there are only two players in the game (i.e. I = {1, 2}), and the values of the payoff functions of the two players in each situation are equal in absolute value but opposite in sign. Thus, for an antagonistic game, I = {1, 2} and H2(s) = −H1(s). Hence, H1(s) + H2(s) = 0 for every s = (s1, s2) ∈ S1 × S2. Thus, an antagonistic game is also a two-person zero-sum game, and to define an antagonistic game, it is sufficient to stipulate the payoff function of one of the players only. It is usually denoted by the triplet ⟨X, Y, M⟩, where X and Y are the sets of strategies of players 1 and 2 respectively and M is the payoff function of player 1, a real-valued function M : X × Y → R.

Note It is obvious that each non-cooperative two-person constant-sum game is strategically equivalent to a certain antagonistic game.

Equilibrium situation for an antagonistic game Let ⟨X, Y, M⟩ be an antagonistic game, H1 = M, H2 = −M. Hence s1 ≡ x ∈ X, s2 ≡ y ∈ Y, s ≡ (x, y) ∈ X × Y. If (x, y) is an equilibrium situation, then

H1(x, y) ≥ H1(x′, y) for all x′ ∈ X and H2(x, y) ≥ H2(x, y′) for all y′ ∈ Y,
i.e. M(x, y) ≥ M(x′, y) for all x′ ∈ X and M(x, y) ≤ M(x, y′) for all y′ ∈ Y,
i.e. M(x′, y) ≤ M(x, y) ≤ M(x, y′) for all x′ ∈ X, y′ ∈ Y.
Changing the notation (x, y) to (x*, y*) and (x′, y′) to (x, y), we see that (x*, y*) is an equilibrium situation of the antagonistic game ⟨X, Y, M⟩ if

M(x, y*) ≤ M(x*, y*) ≤ M(x*, y) for all x ∈ X, y ∈ Y.

The point (x*, y*) is called a saddle point of M(x, y).
13.5
Matrix Games or Rectangular Games
Antagonistic games in which each player possesses a finite number of strategies are called matrix games or rectangular games. When two competitors play a game, it is called a two-person game; if the number of competitors is more than two, say n, the game is referred to as an n-person game. If, in a game, the algebraic sum of the payments to all the competitors is zero for every possible outcome of the game, then the game is said to be a zero-sum game; otherwise, it is said to be a non-zero-sum game. A game with only two players, in which the gains of one player are exactly equal to the losses of the other, is called a two-person zero-sum game. It is also called a rectangular game because its payoff matrix is rectangular.

Some basic terms

Player: The competitors in the game are known as players. A player may be an individual, a group of individuals or an organization.

Strategy: A strategy for a player is defined as a set of rules or alternative courses of action available to him in advance, by which the player decides the course of action that he should adopt. A strategy may be of two types: (i) pure strategy, (ii) mixed strategy.

Pure strategy: If a player selects the same strategy each time, then it is referred to as a pure strategy. In this case, each player knows exactly what the other player is going to do, and the objective of the players is to maximize the gains or to minimize the losses.

Payoff matrix: The payoff is the outcome of playing the game. When the players select their particular strategies, the payoffs (gains or losses) can be represented in the form of a matrix called a payoff matrix. Since the game is zero-sum, the gain of one player is exactly equal to the loss of the other and vice versa; one player's payoff table contains the same amounts as the other's with the sign changed. Thus, it is sufficient to construct the payoff matrix for one of the players only.
Let player A have m strategies $A_1, A_2, \ldots, A_m$ and player B have n strategies $B_1, B_2, \ldots, B_n$. Here, it is assumed that each player makes his choices from among the pure strategies. Also, it is assumed that player A is always the gainer and player B is always the loser; that is, all payoffs are expressed in terms of player A. Let $a_{ij}$ be the payoff (the gain of player A from player B) if player A chooses strategy $A_i$ while player B chooses $B_j$. Then the payoff matrix of player A is

$$
\begin{array}{c|cccc}
 & B_1 & B_2 & \cdots & B_n \\ \hline
A_1 & a_{11} & a_{12} & \cdots & a_{1n} \\
A_2 & a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
A_m & a_{m1} & a_{m2} & \cdots & a_{mn}
\end{array}
\qquad \text{i.e. } [a_{ij}]_{m\times n}.
$$
The payoff matrix of player B is $[-a_{ij}]_{m\times n}$.

Example 1 Consider a two-person coin-tossing game. Each player tosses an unbiased coin simultaneously. Player B pays Rs. 7.00 to player A if both coins show heads and Rs. 4.00 if both show tails; otherwise, player A pays Rs. 3.00 to player B. This two-person game is a zero-sum game, since the winnings of one player are the losses of the other. Each player chooses from the two pure strategies H and T. A's payoff matrix is

$$
\begin{array}{c|cc}
 & \multicolumn{2}{c}{\text{Player B}} \\
\text{Player A} & H & T \\ \hline
H & 7 & -3 \\
T & -3 & 4
\end{array}
$$
Maximin-minimax principle (maximin-minimax criterion of optimality) This principle is used for the selection of optimal strategies by the two players. It states that if a player lists the worst possible outcomes of all his potential strategies, then he will choose the strategy which corresponds to the best of these worst outcomes.
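The principle can be applied mechanically to a payoff matrix; a small Python sketch (illustrative only, on a hypothetical 3×3 matrix): the row player takes the best of his row minima, the column player the best (smallest) of his column maxima.

```python
# Payoff matrix for the row player A (entries are A's gains).
payoff = [[1, 3, 3],
          [-4, 0, -3],
          [1, 5, -1]]

row_minima = [min(row) for row in payoff]               # A's worst outcomes
maximin = max(row_minima)                               # best of A's worst

col_maxima = [max(col) for col in zip(*payoff)]         # B's worst losses
minimax = min(col_maxima)                               # best of B's worst

print(row_minima, maximin)   # [1, -4, -1] 1
print(col_maxima, minimax)   # [1, 5, 3] 1
```

When, as here, maximin equals minimax, the game has a saddle point and both players can safely play pure strategies.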
Let player A's payoff matrix be

$$
\begin{array}{c|cccccc}
 & B_1 & B_2 & \cdots & B_j & \cdots & B_n \\ \hline
A_1 & a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\
A_2 & a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
A_i & a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
A_m & a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn}
\end{array}
$$
If player A takes the strategy $A_i$, then, whatever strategy the opponent B adopts, A is sure to get at least $\min_j a_{ij}$ ($i = 1, 2, \ldots, m$). Thus, by the maximin-minimax criterion of optimality, player A will choose the strategy corresponding to the best of these worst outcomes

$$\min_j a_{1j},\ \min_j a_{2j},\ \ldots,\ \min_j a_{mj}.$$

Thus, the maximin value for player A is $\max_i \min_j a_{ij}$. Similarly, player B will choose the strategy corresponding to the best (minimum) of his worst outcomes (maximum losses)

$$\max_i a_{i1},\ \max_i a_{i2},\ \ldots,\ \max_i a_{in}.$$

Thus, the minimax value for player B is $\min_j \max_i a_{ij}$.
Let

$$\max_i \min_j a_{ij} = a_{pq} \tag{13.1}$$

and

$$\min_j \max_i a_{ij} = a_{rs}. \tag{13.2}$$

From (13.1), it follows that $a_{pq}$ is the minimum element of the pth row, i.e.

$$a_{pq} \le a_{ps}, \tag{13.3}$$

where $a_{ps}$ is any other element of the pth row. From (13.2), it follows that $a_{rs}$ is the maximum element of the sth column, i.e.

$$a_{ps} \le a_{rs}, \tag{13.4}$$

where $a_{ps}$ is an element of the sth column.
From (13.3) and (13.4), we have $a_{pq} \le a_{rs}$, i.e.

$$\max_i \min_j a_{ij} \le \min_j \max_i a_{ij},$$

i.e. maximin for A $\le$ minimax for B.

The maximin for A, $\max_i \min_j a_{ij}$, is called the lower value of the game and is denoted by $\underline{v}$; the minimax for B, $\min_j \max_i a_{ij}$, is called the upper value of the game and is denoted by $\bar{v}$. If v is the value of the game, then it always satisfies

$$\underline{v} \le v \le \bar{v}.$$

If for a game $\underline{v} = \bar{v} = a_{lk}$, then the game possesses a solution given by:

(i) The optimal strategy for player A is the strategy $A_l$.
(ii) The optimal strategy for player B is the strategy $B_k$.
(iii) The value of the game is $v = a_{lk}$.

In the case of pure strategies, such a game is called a game with a saddle point.

Saddle point: A saddle point is a position in the payoff matrix where the maximum of the row minima coincides with the minimum of the column maxima. The cell entry (payoff) at the saddle point position is called the value of the game. A game for which maximin for A = minimax for B is called a game with a saddle point. Thus, in a game with a saddle point, the players use pure strategies; i.e. they choose the same course of action throughout the game.

Note
(i) A game is said to be fair if $\underline{v} = \bar{v} = 0$.
(ii) A game is said to be strictly determinable if maximin value = minimax value = value of the game $\ne 0$, i.e. $\underline{v} = \bar{v} = v\ (\ne 0)$.
(iii) Not every payoff matrix possesses a saddle point, and in that case the determination of the optimal solution is not an easy task.

Example 2 Solve the game whose payoff matrix is given by

$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_1 & 1 & 3 & 3 \\
A_2 & -4 & 0 & -3 \\
A_3 & 1 & 5 & -1
\end{array}
$$
Solution To find the saddle point, the row minima and column maxima are computed and displayed to the right of the corresponding rows and below the corresponding columns respectively.

$$
\begin{array}{c|ccc|c}
 & B_1 & B_2 & B_3 & \text{Row minima} \\ \hline
A_1 & 1 & 3 & 3 & 1 \\
A_2 & -4 & 0 & -3 & -4 \\
A_3 & 1 & 5 & -1 & -1 \\ \hline
\text{Column maxima} & 1 & 5 & 3 &
\end{array}
$$
Hence, the maximin value of the game is $\underline{v} = \max\{1, -4, -1\} = 1$, attained at the cell (1, 1), and the minimax value is $\bar{v} = \min\{1, 5, 3\} = 1$, attained in the cells (1, 1) and (3, 1). Hence, the payoff matrix has a saddle point at position (1, 1). The solution of the game is given by:

(i) The optimal strategy for player A is $A_1$.
(ii) The optimal strategy for player B is $B_1$.
(iii) The value of the game is 1.

Example 3 For what value of k is the game with the following payoff matrix strictly determinable?

$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_1 & k & 6 & 2 \\
A_2 & -1 & k & -7 \\
A_3 & -2 & 4 & k
\end{array}
$$
Solution Ignoring the value of k, we find the maximin and minimax values of the payoff matrix. For this purpose, we have

$$
\begin{array}{c|ccc|c}
 & B_1 & B_2 & B_3 & \text{Row minima} \\ \hline
A_1 & k & 6 & 2 & 2 \\
A_2 & -1 & k & -7 & -7 \\
A_3 & -2 & 4 & k & -2 \\ \hline
\text{Column maxima} & -1 & 6 & 2 &
\end{array}
$$
Therefore, ignoring k, $\underline{v} = \max\{2, -7, -2\} = 2$ and $\bar{v} = \min\{-1, 6, 2\} = -1$. We know that the game is strictly determinable if $\underline{v} = v = \bar{v} \ne 0$. For $-1 \le k \le 2$ the row minima become $(k, -7, -2)$ and the column maxima become $(k, 6, 2)$, so that $\underline{v} = \bar{v} = v = k$. Hence, the game is strictly determinable for $-1 \le k \le 2$ $(k \ne 0)$.

Mixed strategy If the payoff matrix does not have a saddle point, then there exists no equilibrium situation of the form (i*, j*); i.e. an optimal strategy of the players cannot be a single (pure) strategy.
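A quick sketch of the situation that forces mixed strategies (illustrative only; the 2×2 matrix is hypothetical): when the maximin is strictly below the minimax, no saddle point exists.

```python
# A payoff matrix with no saddle point: maximin < minimax, so no single
# pure strategy is optimal for either player.
payoff = [[3, 1],
          [-1, 3]]

maximin = max(min(row) for row in payoff)          # best of A's worst
minimax = min(max(col) for col in zip(*payoff))    # best of B's worst

assert maximin != minimax    # no saddle point: mixed strategies are needed
print(maximin, minimax)      # 1 3
```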
Let us consider the matrix game whose payoff matrix is $A = [a_{ij}]_{m\times n}$. By a mixed strategy for player 1 we mean an ordered m-tuple $(x_1, \ldots, x_m)$, or the vector $\xi = (x_1, \ldots, x_m)'$, where $x_i \ge 0$, $i = 1, 2, \ldots, m$, and $\sum_{i=1}^m x_i = 1$. The components $x_1, x_2, \ldots, x_m$ may be thought of as the relative frequencies (or probabilities) with which player 1 chooses the pure strategies $1, 2, \ldots, m$. It should be noted that $e_i^m = (0, \ldots, 0, 1, 0, \ldots, 0)'$, $i = 1, 2, \ldots, m$ (with the 1 in the ith place), represents the ith pure strategy of player 1, and $\xi = \sum_{i=1}^m e_i^m x_i$.

Similarly, a mixed strategy for player 2 is denoted by $\eta = (y_1, y_2, \ldots, y_n)'$, where $y_j \ge 0$, $j = 1, 2, \ldots, n$, and $\sum_{j=1}^n y_j = 1$. Note that $e_j^n = (0, 0, \ldots, 0, 1, 0, \ldots, 0)'$ represents the jth pure strategy of player 2, and $\eta = \sum_{j=1}^n e_j^n y_j$.

Let $S_m = \{\xi : x_i \ge 0,\ \sum_{i=1}^m x_i = 1\}$; then $S_m \subset E_m$. This $S_m$ is the space of mixed strategies for player 1. Note that a pure strategy is a particular case of a mixed strategy. Similarly, let $S_n = \{\eta : y_j \ge 0,\ \sum_{j=1}^n y_j = 1\}$; then $S_n \subset E_n$, and $S_n$ is the space of mixed strategies for player 2.

Mixed extension of a matrix game without a saddle point Let players 1 and 2 choose their mixed strategies $\xi = (x_1, x_2, \ldots, x_m)'$ and $\eta = (y_1, y_2, \ldots, y_n)'$ independently of one another in a matrix game with payoff matrix $A = [a_{ij}]_{m\times n}$.

Definition A pair $(\xi, \eta)$ of mixed strategies for the players in a matrix game is called a situation in mixed strategies. In a situation $(\xi, \eta)$ of mixed strategies, each usual situation (i, j) in pure strategies becomes a random event occurring with probability $x_i y_j$. Since in the situation (i, j) player 1 receives the payoff $a_{ij}$, the expected value of his payoff under $(\xi, \eta)$ is equal to

$$E(\xi, \eta) = \sum_{i=1}^m \sum_{j=1}^n a_{ij}\, x_i\, y_j.$$
We thus arrive at a new game which can be described as follows.
Definition A mixed extension of a matrix game is an antagonistic game, denoted by $\langle S_m, S_n, E \rangle$, in which the set of strategies of each player is the set of his mixed strategies in the original game, and the payoff function for player 1 is defined by

$$E(\xi, \eta) = \sum_{i=1}^m \sum_{j=1}^n a_{ij}\, x_i\, y_j.$$

Recalling the definition of a saddle point of an antagonistic game, we observe that a situation $(\xi^*, \eta^*)$ in the mixed extension of a matrix game is a saddle point (i.e. an equilibrium situation) provided that the double inequality

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta) \qquad \forall\, \xi \in S_m,\ \eta \in S_n$$

is satisfied.
Theorem 13.3 Let $E(\xi, \eta)$ be such that both $\min_{\eta \in S_n} \max_{\xi \in S_m} E(\xi, \eta)$ and $\max_{\xi \in S_m} \min_{\eta \in S_n} E(\xi, \eta)$ exist; then

$$\min_{\eta \in S_n} \max_{\xi \in S_m} E(\xi, \eta) \ge \max_{\xi \in S_m} \min_{\eta \in S_n} E(\xi, \eta).$$

Proof Let $\xi^*$ and $\eta^*$ be two arbitrarily chosen mixed strategies of players A and B respectively. Then

$$\max_{\xi \in S_m} E(\xi, \eta^*) \ge E(\xi^*, \eta^*) \tag{13.5}$$

and

$$\min_{\eta \in S_n} E(\xi^*, \eta) \le E(\xi^*, \eta^*). \tag{13.6}$$

Hence, from (13.5) and (13.6), the inequality

$$\max_{\xi \in S_m} E(\xi, \eta^*) \ge \min_{\eta \in S_n} E(\xi^*, \eta)$$

holds for all $\xi^*$ and $\eta^*$. Since $\eta^*$ is arbitrarily chosen, the inequality holds in particular for the $\eta^*$ at which $\max_\xi E(\xi, \eta)$ attains its minimum value. Therefore,

$$\min_\eta \max_\xi E(\xi, \eta) \ge \min_\eta E(\xi^*, \eta).$$

Again, since $\xi^*$ is an arbitrary strategy, the preceding inequality holds even if we select the $\xi^*$ which gives the maximum value of $\min_\eta E(\xi, \eta)$. Therefore,

$$\min_\eta \max_\xi E(\xi, \eta) \ge \max_\xi \min_\eta E(\xi, \eta).$$

This proves the theorem.
Saddle point of a function Let $E(\xi, \eta)$ be a function of two (vector) variables $\xi \in S_m$ and $\eta \in S_n$. The point $(\xi^*, \eta^*)$, $\xi^* \in S_m$, $\eta^* \in S_n$, is said to be a saddle point of the function $E(\xi, \eta)$ if

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta).$$

Now we shall discuss a theorem regarding the existence of a saddle point of a function.

Theorem 13.4 Let $E(\xi, \eta)$ be a function of two variables $\xi \in S_m$ and $\eta \in S_n$ such that $\max_\xi \min_\eta E(\xi, \eta)$ and $\min_\eta \max_\xi E(\xi, \eta)$ exist. Then the necessary and sufficient condition for the existence of a saddle point $(\xi^*, \eta^*)$ of $E(\xi, \eta)$ is that

$$E(\xi^*, \eta^*) = \max_\xi \min_\eta E(\xi, \eta) = \min_\eta \max_\xi E(\xi, \eta).$$

Proof The condition is necessary. Let the point $(\xi^*, \eta^*)$ be a saddle point of $E(\xi, \eta)$. Then, from the definition of a saddle point, we have

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta) \tag{13.7}$$

for all $\xi \in S_m$ and $\eta \in S_n$. From (13.7), clearly, $\max_\xi E(\xi, \eta^*) \le E(\xi^*, \eta^*)$ holds. Hence,

$$\min_\eta \max_\xi E(\xi, \eta) \le E(\xi^*, \eta^*). \tag{13.8}$$

Similarly, from (13.7), we have $E(\xi^*, \eta^*) \le \min_\eta E(\xi^*, \eta)$, i.e.

$$E(\xi^*, \eta^*) \le \max_\xi \min_\eta E(\xi, \eta). \tag{13.9}$$

From (13.8) and (13.9), we have

$$\min_\eta \max_\xi E(\xi, \eta) \le \max_\xi \min_\eta E(\xi, \eta). \tag{13.10}$$

But we know (Theorem 13.3) that

$$\min_\eta \max_\xi E(\xi, \eta) \ge \max_\xi \min_\eta E(\xi, \eta). \tag{13.11}$$

Hence, from (13.8)–(13.11), we have

$$E(\xi^*, \eta^*) = \max_\xi \min_\eta E(\xi, \eta) = \min_\eta \max_\xi E(\xi, \eta).$$

The condition is sufficient. Let the point $(\xi^*, \eta^*)$ be such that

$$\max_\xi \min_\eta E(\xi, \eta) = \min_\eta \max_\xi E(\xi, \eta). \tag{13.12}$$

Also, let $\max_\xi \min_\eta E(\xi, \eta) = \min_\eta E(\xi^*, \eta)$ and $\min_\eta \max_\xi E(\xi, \eta) = \max_\xi E(\xi, \eta^*)$. Hence, from (13.12), we have

$$\min_\eta E(\xi^*, \eta) = \max_\xi E(\xi, \eta^*). \tag{13.13}$$

Now, from the definitions of maximum and minimum, we have

$$\min_\eta E(\xi^*, \eta) \le E(\xi^*, \eta^*) \tag{13.14}$$

and

$$E(\xi^*, \eta^*) \le \max_\xi E(\xi, \eta^*). \tag{13.15}$$

From (13.13) and (13.14), we have $\max_\xi E(\xi, \eta^*) \le E(\xi^*, \eta^*)$, which implies

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \quad \text{for all } \xi \in S_m. \tag{13.16}$$

From (13.13) and (13.15), we have $E(\xi^*, \eta^*) \le \min_\eta E(\xi^*, \eta)$, which implies

$$E(\xi^*, \eta^*) \le E(\xi^*, \eta) \quad \text{for all } \eta \in S_n. \tag{13.17}$$

Combining (13.16) and (13.17), we have
$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta),$$

which implies that $(\xi^*, \eta^*)$ is a saddle point of $E(\xi, \eta)$.

Fundamental theorem of a matrix game

Theorem 13.5 The quantities $\max_{\xi \in S_m} \min_{\eta \in S_n} E(\xi, \eta)$ and $\min_{\eta \in S_n} \max_{\xi \in S_m} E(\xi, \eta)$ exist and are equal.

Proof This theorem can easily be proved from Theorem 13.4.

Value of a matrix game The common value of $\max_\xi \min_\eta E(\xi, \eta)$ and $\min_\eta \max_\xi E(\xi, \eta)$ is called the value of the matrix game with payoff matrix $A = [a_{ij}]$, denoted by v(A) or simply v.

Definition Equilibrium strategies of the players in a matrix game are called their optimal strategies.

Definition The solution of a matrix game is the process of determining the value of the game and the pair of optimal strategies.

Note: Thus, if $(\xi^*, \eta^*)$ is an equilibrium situation in mixed strategies of the game $\langle S_m, S_n, E \rangle$, then $\xi^*, \eta^*$ are the optimal strategies of players 1 and 2 respectively in the matrix game with payoff matrix $A = [a_{ij}]_{m\times n}$. Hence, $\xi^*, \eta^*$ are optimal strategies for players 1 and 2 respectively iff

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta) \qquad \forall\, \xi \in S_m,\ \eta \in S_n.$$

Note:

$$\min_\eta E(\xi^*, \eta) = E(\xi^*, \eta^*) \;\Rightarrow\; \max_\xi \min_\eta E(\xi, \eta) = \max_\xi E(\xi, \eta^*) = E(\xi^*, \eta^*),$$

and

$$\max_\xi E(\xi, \eta^*) = E(\xi^*, \eta^*) \;\Rightarrow\; \min_\eta \max_\xi E(\xi, \eta) = \min_\eta E(\xi^*, \eta) = E(\xi^*, \eta^*).$$

Notation The expected payoff to player 1 under $(\xi, \eta)$ is $E(\xi, \eta) = \sum_{i,j} a_{ij} x_i y_j$.

The expected payoff to player 1 under $(\xi, e_j^n)$, or simply $(\xi, j)$, is

$$E(\xi, e_j^n) = E(\xi, j) = \sum_i a_{ij}\, x_i.$$

The expected payoff to player 1 under $(e_i^m, \eta)$, or simply $(i, \eta)$, is

$$E(e_i^m, \eta) = E(i, \eta) = \sum_j a_{ij}\, y_j.$$

Thus,

$$E(\xi, \eta) = \sum_i x_i\, E(i, \eta) = \sum_j y_j\, E(\xi, j).$$

Also, $E(e_i^m, e_j^n) = E(i, j) = a_{ij}$.

Some properties of the value of a matrix game
Throughout, the matrix game is characterized by the payoff matrix $A = [a_{ij}]_{m\times n}$, and v is its value.

Theorem 13.6 $v = \max_\xi \min_j E(\xi, j) = \min_\eta \max_i E(i, \eta)$, and the outer extrema are attained at optimal strategies of the players.

Proof By definition, $v = \max_\xi \min_\eta E(\xi, \eta)$. For a fixed $\xi$, let $E(\xi, j_0) = \min_j E(\xi, j)$. Then

$$E(\xi, j_0) \le E(\xi, j), \quad j = 1, 2, \ldots, n, \quad\text{so}\quad \sum_j E(\xi, j_0)\, y_j \le \sum_j E(\xi, j)\, y_j,$$

i.e. $E(\xi, j_0) \le E(\xi, \eta)$ for the fixed $\xi$ and all $\eta \in S_n$. Hence,

$$E(\xi, j_0) \le \min_\eta E(\xi, \eta). \tag{13.18}$$

Again, since $\min_\eta E(\xi, \eta) \le E(\xi, \eta)$ for all $\eta \in S_n$, setting $\eta = e_{j_0}^n$ we have

$$\min_\eta E(\xi, \eta) \le E(\xi, e_{j_0}^n) = E(\xi, j_0). \tag{13.19}$$

By (13.18) and (13.19), we have

$$\min_\eta E(\xi, \eta) = E(\xi, j_0) = \min_j E(\xi, j).$$

This is valid for any $\xi \in S_m$; hence, taking the maximum of both sides with respect to $\xi$, we have

$$\max_\xi \min_\eta E(\xi, \eta) = \max_\xi \min_j E(\xi, j),$$

where the outer maxima on both sides are attained for the same value of $\xi$. It is known, however, that on the left-hand side the outer extremum is attained at the optimal strategies of player 1; therefore, the same strategies yield the maximum on the right-hand side. This proves the first part of the theorem; the proof of the second part is similar.

Theorem 13.7

$$\max_i \min_j a_{ij} \le v \le \min_j \max_i a_{ij}.$$
Proof By the preceding theorem, $v = \max_\xi \min_j E(\xi, j)$. But $\max_\xi \min_j E(\xi, j) \ge \min_j E(\xi, j)$ for all $\xi \in S_m$, so

$$v \ge \min_j E(\xi, j) \qquad \forall\, \xi \in S_m.$$

Setting $\xi = e_i^m$, we have

$$v \ge \min_j E(e_i^m, j) = \min_j E(i, j) = \min_j a_{ij},$$

i.e. $v \ge \min_j a_{ij}$. The left side v is independent of i, so taking the maximum with respect to i, we obtain

$$v \ge \max_i \min_j a_{ij}.$$

The proof of the second part is similar.
The proof of the second part is similar. Theorem 13.8 (i) If player 1 possesses a pure optimal strategy i*, then v ¼ Max Min aij ¼ Min ai j : i
j
j
(ii) If player 2 possesses a pure optimal strategy j*, then v ¼ Min Max aij ¼ Max aij j
Proof We know that
i
i
320
13
Game Theory
v ¼ Max Min E ðn; jÞ j n m ¼ Min E ei ; j as n ¼ em i is optimal: j
The proof of the second part is similar.
Solution of a matrix game Let us consider a 2×2 matrix game whose payoff matrix A is given by

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

If A has a saddle point, the solution is obvious; so let A have no saddle point. Let player 1 have the strategy $\xi = (x_1, x_2)' = (x, 1-x)'$ $(0 \le x \le 1)$ and player 2 have the strategy $\eta = (y, 1-y)'$ $(0 \le y \le 1)$. Then

$$E(\xi, \eta) = \sum_{i=1}^2 \sum_{j=1}^2 a_{ij}\, x_i\, y_j = a_{11}xy + a_{12}x(1-y) + a_{21}(1-x)y + a_{22}(1-x)(1-y) = f(x, y), \text{ say}.$$

If $\xi^* = (x^*, 1-x^*)'$ and $\eta^* = (y^*, 1-y^*)'$ are optimal strategies, then from

$$E(\xi, \eta^*) \le E(\xi^*, \eta^*) \le E(\xi^*, \eta) \qquad \text{for all mixed strategies } \xi, \eta,$$

we have $f(x, y^*) \le f(x^*, y^*) \le f(x^*, y)$ for all $x, y \in (0, 1)$. From the first part of the inequality, we see that $f(x, y^*)$, regarded as a function of x, has a maximum at $x^*$, i.e.

$$\left.\frac{\partial f}{\partial x}\right|_{(x^*,\, y^*)} = 0.$$

This gives

$$a_{11}y^* + a_{12}(1-y^*) - a_{21}y^* - a_{22}(1-y^*) = 0,$$

or $(a_{11} - a_{12} - a_{21} + a_{22})\, y^* = a_{22} - a_{12}$, i.e.

$$y^* = \frac{a_{22} - a_{12}}{a_{11} + a_{22} - (a_{12} + a_{21})}, \quad\text{provided } a_{11} + a_{22} - (a_{12} + a_{21}) \ne 0.$$

Similarly, from the second part of the inequality, it is seen that $f(x^*, y)$, regarded as a function of y, has a minimum at $y^*$, i.e.

$$\left.\frac{\partial f}{\partial y}\right|_{(x^*,\, y^*)} = 0.$$

This gives

$$a_{11}x^* - a_{12}x^* + a_{21}(1-x^*) - a_{22}(1-x^*) = 0,$$

i.e.

$$x^* = \frac{a_{22} - a_{21}}{a_{11} + a_{22} - (a_{12} + a_{21})}, \quad\text{provided } a_{11} + a_{22} - (a_{12} + a_{21}) \ne 0.$$

Now

$$v = f(x^*, y^*) = a_{11}x^*y^* + a_{12}x^*(1-y^*) + a_{21}(1-x^*)y^* + a_{22}(1-x^*)(1-y^*) = \frac{a_{11}a_{22} - a_{12}a_{21}}{a_{11} + a_{22} - (a_{12} + a_{21})}.$$

It can be proved that $a_{11} + a_{22} - (a_{12} + a_{21}) = 0$ implies that A has a saddle point.

Some properties of optimal strategies in a matrix game
Let $A = [a_{ij}]_{m\times n}$ be the payoff matrix of a game whose value is v.

Theorem 13.9 Let u be an arbitrary real number. Then
(i) $u \le E(\xi, j)$, $j = 1, 2, \ldots, n$, for some fixed $\xi$ implies $u \le v$;
(ii) $u \ge E(i, \eta)$, $i = 1, 2, \ldots, m$, for some fixed $\eta$ implies $u \ge v$.

Proof Let $\eta^*$ be an optimal strategy for player 2. We have $u \le E(\xi, j)$, $j = 1, \ldots, n$, for some fixed $\xi$. Therefore,

$$u \sum_{j=1}^n y_j^* \le \sum_{j=1}^n E(\xi, j)\, y_j^*,$$

i.e. $u \le E(\xi, \eta^*) \le E(\xi^*, \eta^*) = v$, where $\xi^*$ is an optimal strategy for player 1; i.e. $u \le v$. The proof of the second part is similar.

Theorem 13.10 Let u be a real number such that $E(i, \eta_0) \le u \le E(\xi_0, j)$ for $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$ and some particular $\xi_0, \eta_0$. Then u is the value of the game, and $\xi_0, \eta_0$ are the optimal strategies of players 1 and 2 respectively.

Proof Given that $E(i, \eta_0) \le u$ $(i = 1, 2, \ldots, m)$ for some particular $\eta_0$, multiplying by the components $x_i^0$ of $\xi_0$ and summing gives

$$\sum_i E(i, \eta_0)\, x_i^0 \le u \sum_i x_i^0, \quad\text{or}\quad E(\xi_0, \eta_0) \le u. \tag{13.20}$$

Again, $u \le E(\xi_0, j)$, $j = 1, 2, \ldots, n$, similarly gives

$$u \le E(\xi_0, \eta_0). \tag{13.21}$$

Hence, (13.20) and (13.21) imply that $u = E(\xi_0, \eta_0)$. Again, we have $E(i, \eta_0) \le u = E(\xi_0, \eta_0)$, $i = 1, 2, \ldots, m$, so that

$$\sum_i E(i, \eta_0)\, x_i \le E(\xi_0, \eta_0) \sum_i x_i, \quad\text{or}\quad E(\xi, \eta_0) \le E(\xi_0, \eta_0) \quad \forall\, \xi \in S_m. \tag{13.22}$$

Again, $u = E(\xi_0, \eta_0) \le E(\xi_0, j)$, $j = 1, 2, \ldots, n$, so that

$$E(\xi_0, \eta_0) \sum_j y_j \le \sum_j E(\xi_0, j)\, y_j, \quad\text{or}\quad E(\xi_0, \eta_0) \le E(\xi_0, \eta) \quad \forall\, \eta \in S_n. \tag{13.23}$$

Hence, from (13.22) and (13.23), we have

$$E(\xi, \eta_0) \le E(\xi_0, \eta_0) \le E(\xi_0, \eta) \qquad \forall\, \xi \in S_m,\ \eta \in S_n.$$

Hence, by the definition of optimal strategies, $\xi_0, \eta_0$ are optimal strategies, and $E(\xi_0, \eta_0) = u$ is the value of the game.

Theorem 13.11 If $\max_i E(i, \eta_0) \le \min_j E(\xi_0, j)$ is valid for some fixed $\xi_0$ and $\eta_0$, then $\xi_0, \eta_0$ are optimal strategies and the inequality becomes an equality.

Proof We have

$$E(i, \eta_0) \le \max_i E(i, \eta_0) \le \min_j E(\xi_0, j) \le E(\xi_0, j)$$

for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$ (the first step by the definition of maximum, the second by the given condition, the third by the definition of minimum). Let $u_1 = \max_i E(i, \eta_0)$. Then $E(i, \eta_0) \le u_1 \le E(\xi_0, j)$ for all i and j; hence, by Theorem 13.10, $u_1$ is the value of the game and $\xi_0, \eta_0$ are the optimal strategies. Again, let $u_2 = \min_j E(\xi_0, j)$. By the definition of minimum, the given condition, and the definition of maximum,

$$E(\xi_0, j) \ge \min_j E(\xi_0, j) \ge \max_i E(i, \eta_0) \ge E(i, \eta_0)$$

for all i and j, i.e. $E(i, \eta_0) \le u_2 \le E(\xi_0, j)$. Hence, by Theorem 13.10, $u_2$ is also the value of the game and $\xi_0, \eta_0$ are the optimal strategies. But the value of the game is unique; hence $u_1 = u_2$, i.e. the inequality becomes an equality.

Theorem 13.12 If $E(i, \eta_0) \le E(\xi_0, j)$ for all $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, then $\xi_0, \eta_0$ are the optimal strategies.

Proof The hypothesis gives $\max_i E(i, \eta_0) \le \min_j E(\xi_0, j)$, so the theorem follows from Theorem 13.11.

Theorem 13.13
(i) If for a strategy $\xi_0$ of player 1, $v \le E(\xi_0, j)$ for all $j = 1, 2, \ldots, n$, then $\xi_0$ is an optimal strategy for player 1.
(ii) If for a strategy $\eta_0$ of player 2, $v \ge E(i, \eta_0)$ for all $i = 1, 2, \ldots, m$, then $\eta_0$ is an optimal strategy for player 2.

Proof Let $\eta^*$ be an optimal strategy for player 2; then $E(\xi, \eta^*) \le v$ for all $\xi \in S_m$. Setting $\xi = e_i^m$, we have $E(i, \eta^*) \le v$ for all $i = 1, 2, \ldots, m$. But $v \le E(\xi_0, j)$ (given). Hence, $E(i, \eta^*) \le v \le E(\xi_0, j)$ for all i and j, and by Theorem 13.10 it follows that $\xi_0$ is an optimal strategy for player 1. The proof of the second part is similar.

Theorem 13.14
(i) If $v \le \min_j E(\xi_0, j)$ for a fixed $\xi_0$, then $\xi_0$ is an optimal strategy for player 1.
(ii) If $v \ge \max_i E(i, \eta_0)$ for a fixed $\eta_0$, then $\eta_0$ is an optimal strategy for player 2.

Proof (i) $v \le \min_j E(\xi_0, j) \le E(\xi_0, j)$ for all $j = 1, 2, \ldots, n$; the result then follows from Theorem 13.13.

Corollary
(i) A necessary and sufficient condition for the optimality of a strategy $\xi_0$ of player 1 is $v = \min_j E(\xi_0, j)$.
(ii) A similar condition for player 2 is $v = \max_i E(i, \eta_0)$.
13.6 Dominance
In a game, if one course of action (pure strategy) of one player is better than or as good as another, for all possible courses of action of the opponent, then the first course of action is said to dominate the second. In this situation, the dominated (or inferior) course of action can simply be discarded from the payoff matrix, and a reduced matrix is obtained. In this case, the optimal strategies for the reduced matrix are also optimal for the original matrix with zero probability for discarded strategies. When there is no saddle point in a payoff matrix, then the size of the game can be reduced by this process, which is known as dominance. For example, let us consider a game with the following payoff matrix:
$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_1 & 4 & 9 & 4 \\
A_2 & 7 & 4 & 8 \\
A_3 & 8 & 5 & 9
\end{array}
$$

From this payoff matrix, it is clear that the second row is inferior to the third row for the maximizing player A, as every element of the third row is greater than or equal to the corresponding element of the second row. Deleting the second row, we get the reduced payoff matrix

$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_1 & 4 & 9 & 4 \\
A_3 & 8 & 5 & 9
\end{array}
$$

Again, from player B's point of view, the third strategy $B_3$ is dominated by $B_1$, as every element of the third column is greater than or equal to the corresponding element of the first column. Deleting the third column, we get the reduced payoff matrix

$$
\begin{array}{c|cc}
 & B_1 & B_2 \\ \hline
A_1 & 4 & 9 \\
A_3 & 8 & 5
\end{array}
$$

Now, if $(x_1^*, x_3^*)$ and $(y_1^*, y_2^*)$ are the optimal strategies for this reduced matrix, then $(x_1^*, 0, x_3^*)$ and $(y_1^*, y_2^*, 0)$ are the optimal strategies for the original payoff matrix.
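The row/column dominance checks used above can be mechanized; a rough sketch (a hypothetical helper, single pass only, and it assumes no two identical rows or columns):

```python
def reduce_by_dominance(matrix):
    """Delete rows dominated by another row (for the maximizer) and columns
    that dominate another column (for the minimizer). One pass only; repeat
    until no change for a full reduction. Assumes no duplicate rows/columns."""
    rows = range(len(matrix))
    n = len(matrix[0])
    # Row i is dominated by row r if a_ij <= a_rj for every j.
    dominated = {i for i in rows for r in rows if i != r
                 and all(matrix[i][j] <= matrix[r][j] for j in range(n))}
    kept = [matrix[i] for i in rows if i not in dominated]

    cols = range(len(kept[0]))
    # Column j is deleted if it dominates some other column k (a_ij >= a_ik for all i).
    deleted = {j for j in cols for k in cols if j != k
               and all(row[j] >= row[k] for row in kept)}
    return [[row[j] for j in cols if j not in deleted] for row in kept]

payoff = [[4, 9, 4],
          [7, 4, 8],
          [8, 5, 9]]
print(reduce_by_dominance(payoff))   # [[4, 9], [8, 5]]
```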
Theorem 13.15 If the ith row of the payoff matrix of an m×n rectangular game is dominated by its rth row, then the deletion of the ith row from the matrix does not change the optimal strategies of the player whose objective is to maximize the gain.

Proof Let A and B be the players of a rectangular game whose payoff matrix is $[a_{ij}]_{m\times n}$. According to the given condition of the theorem,

$$a_{ij} \le a_{rj} \quad\text{for } j = 1, 2, \ldots, n,\ i \ne r, \tag{13.24}$$

with $a_{ij} \ne a_{rj}$ for at least one j. Let $\eta^* = (y_1^*, y_2^*, \ldots, y_n^*)$ be an optimal mixed strategy for player B. Then from (13.24) we have

$$\sum_{j=1}^n a_{ij}\, y_j^* < \sum_{j=1}^n a_{rj}\, y_j^*, \tag{13.25}$$

i.e. $E(A_i, \eta^*) < E(A_r, \eta^*)$, where $A_i$ and $A_r$ are the ith and rth pure strategies of A. Let v be the value of the game. Hence, from (13.25), we have

$$v \ge E(A_r, \eta^*) > E(A_i, \eta^*). \tag{13.26}$$

Let $\xi^* = (x_1^*, x_2^*, \ldots, x_m^*)$ be an optimal strategy for player A. If possible, let $x_i^* > 0$. Then, from (13.26), we have $x_i^* v > x_i^* E(A_i, \eta^*)$. Since $(\xi^*, \eta^*)$ is an optimal solution of the game, and $E(A_k, \eta^*) \le v$ for every k, we have

$$v = E(\xi^*, \eta^*) = \sum_{k=1}^m x_k^*\, E(A_k, \eta^*) = x_i^*\, E(A_i, \eta^*) + \sum_{\substack{k=1 \\ k \ne i}}^m x_k^*\, E(A_k, \eta^*) < x_i^* v + v \sum_{\substack{k=1 \\ k \ne i}}^m x_k^* = v \sum_{k=1}^m x_k^* = v,$$

i.e. $v < v$. This contradicts our assumption that $x_i^* > 0$; hence $x_i^* = 0$. Hence, the deletion of the ith row does not alter the optimal solution.
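As a numerical illustration of this theorem (not from the text; it assumes NumPy and SciPy are installed), the value of a matrix game can be computed by linear programming, and deleting a dominated row leaves that value unchanged:

```python
# Value of the game for the row player by LP: variables (x_1..x_m, v),
# maximize v subject to v - sum_i a_ij x_i <= 0 for each column j,
# sum_i x_i = 1, x >= 0, v free.
import numpy as np
from scipy.optimize import linprog

def game_value(A):
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    c = np.zeros(m + 1); c[-1] = -1.0                 # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])         # v - (A^T x)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]         # x >= 0, v unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1]

full = [[4, 9, 4], [7, 4, 8], [8, 5, 9]]   # second row dominated by the third
reduced = [full[0], full[2]]                # delete the dominated row
print(game_value(full), game_value(reduced))   # both values agree (about 6.5)
```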
Theorem 13.16 If the jth column of the payoff matrix of an m×n rectangular game dominates its kth column, then the deletion of the jth column from the matrix does not change the optimal strategy of the player whose objective is to minimize the loss. [The proof of this theorem is similar to that of the preceding theorem.]

Theorem 13.17 If the ith row of the payoff matrix of an m×n rectangular game is strictly dominated by a convex combination of the other rows of the matrix, then the deletion of the ith row from the matrix does not affect the optimal strategy of the player whose objective is to maximize the gain.

Proof Let A and B be the players of a rectangular game whose payoff matrix is $[a_{ij}]_{m\times n}$. Let $(x_1, x_2, \ldots, x_m)$, with $x_k \ge 0$, $\sum_k x_k = 1$ and $x_i = 0$, be the weights of a convex combination of the rows other than the ith. According to the given condition,

$$a_{ij} \le \sum_{\substack{k=1 \\ k \ne i}}^m a_{kj}\, x_k, \quad j = 1, 2, \ldots, n, \tag{13.27}$$

in which strict inequality holds for at least one j. Let $\eta^* = (y_1^*, y_2^*, \ldots, y_n^*)$ be the optimal strategy for player B, whose objective is to minimize the loss. Hence, from (13.27), we have

$$\sum_{j=1}^n a_{ij}\, y_j^* < \sum_{j=1}^n \sum_{\substack{k=1 \\ k \ne i}}^m a_{kj}\, x_k\, y_j^*, \tag{13.28}$$

or

$$E(A_i, \eta^*) < \sum_{j=1}^n \sum_{\substack{k=1 \\ k \ne i}}^m a_{kj}\, x_k\, y_j^* \le v,$$

where $A_i$ is the ith pure strategy of A and v is the value of the game. Let $\xi^* = (x_1^*, x_2^*, \ldots, x_m^*)$ be the optimal strategy of player A, whose objective is to maximize the gain. If possible, let $x_i^* \ne 0$. From (13.28), we have

$$x_i^*\, E(A_i, \eta^*) < x_i^* v, \quad\text{since } x_i^* \ne 0.$$
Hence, we have

$$v = E(\xi^*, \eta^*) = \sum_{j=1}^n \sum_{k=1}^m a_{kj}\, x_k^*\, y_j^* = x_i^*\, E(A_i, \eta^*) + \sum_{\substack{k=1 \\ k \ne i}}^m x_k^*\, E(A_k, \eta^*) < x_i^* v + v \sum_{\substack{k=1 \\ k \ne i}}^m x_k^* = v \sum_{k=1}^m x_k^* = v.$$
This contradicts our assumption that $x_i^* \ne 0$; thus $x_i^* = 0$. Hence, the deletion of the ith row does not affect the optimal solution.

Theorem 13.18 If the jth column of the matrix strictly dominates a convex combination of the other columns, then the deletion of the jth column from the matrix does not affect the optimal strategy of the player whose objective is to minimize the loss. [The proof of this theorem is similar to that of Theorem 13.17.]

Theorem 13.19 Let $R_1$ and $R_2$ be subsets of the rows, and $C_1$ and $C_2$ subsets of the columns, of the payoff matrix $[a_{ij}]_{m\times n}$ of a matrix game.

(i) If a convex combination of the rows in $R_1$ strictly dominates a convex combination of the rows in $R_2$, then there exists a row in $R_2$ which, if discarded, does not change the optimal strategy of the player whose objective is to maximize the gain.
(ii) If a convex combination of the columns in $C_1$ dominates a convex combination of the columns in $C_2$, then there exists a column in $C_1$ which, if discarded, does not change the optimal strategy of the player whose objective is to minimize the loss.

The rules of dominance Sometimes, in a rectangular game, one or more pure strategies of a player are inferior to at least one of the remaining strategies. Such an inferior strategy is never used; we say it is dominated by a superior pure strategy. In such cases of dominance, we can reduce the size of the payoff matrix by removing the dominated pure strategies. The rules of dominance are especially useful for solving a two-person zero-sum game without a saddle point.

Rule 1: If each element in a row (say, the ith row $R_i$) of a payoff matrix is less than or equal to the corresponding element in another row (say, the jth row $R_j$), then the ith strategy is dominated by the jth strategy, and that row (the ith row) can be deleted from the payoff matrix.
In other words, player A will never use that strategy: if he chooses it, he gains a smaller payoff.

Rule 2: If each element in a column (say, the ith column $C_i$) of a payoff matrix is greater than or equal to the corresponding element in another column (say, the jth column $C_j$), then the ith strategy is dominated by the jth strategy, and that column (the ith column) can be deleted from the payoff matrix.

Rule 3: If the ith row is dominated by a convex combination of some or all of the other rows, then the ith row is deleted from the payoff matrix. If the jth column dominates a convex combination of some or all of the other columns, then the jth column is deleted from the payoff matrix.

Rule 4: If $R_1, R_2$ are subsets of the rows of the payoff matrix and a convex combination of the rows in $R_1$ dominates a convex combination of the rows in $R_2$, then some row in $R_2$ can be deleted. If $C_1, C_2$ are subsets of the columns and a convex combination of the columns in $C_1$ dominates a convex combination of the columns in $C_2$, then some column in $C_1$ can be deleted.

Rules 3 and 4 are called the modified dominance property. We note that the probability assigned to a strategy deleted by the rules of dominance is taken as zero.

Example 4 Solve the game whose payoff matrix is given by
$$
\begin{array}{c|cccc}
 & B_1 & B_2 & B_3 & B_4 \\ \hline
A_1 & 1 & 2 & -2 & 2 \\
A_2 & 3 & 1 & 2 & 3 \\
A_3 & -1 & 3 & 2 & 1 \\
A_4 & -2 & 2 & 0 & -3
\end{array}
$$
Solution First of all, we compute the maximin and minimax values for pure strategies.

$$
\begin{array}{c|cccc|c}
 & B_1 & B_2 & B_3 & B_4 & \text{Row minima} \\ \hline
A_1 & 1 & 2 & -2 & 2 & -2 \\
A_2 & 3 & 1 & 2 & 3 & 1 \\
A_3 & -1 & 3 & 2 & 1 & -1 \\
A_4 & -2 & 2 & 0 & -3 & -3 \\ \hline
\text{Column maxima} & 3 & 3 & 2 & 3 &
\end{array}
$$

For pure strategies, the maximin value = 1 and the minimax value = 2. As the maximin value $\ne$ the minimax value, this game has no saddle point in pure strategies.
Hence, the game must be solved with mixed strategies. For this purpose, we first reduce the size of the payoff matrix using the dominance rules. Since every element of the fourth row is less than the corresponding element of the third row, from player A's point of view the fourth strategy $A_4$ is dominated by the third strategy $A_3$, so the fourth row can be deleted without affecting the optimal strategies. Deleting the fourth row gives the reduced payoff matrix

$$
\begin{array}{c|cccc}
 & B_1 & B_2 & B_3 & B_4 \\ \hline
A_1 & 1 & 2 & -2 & 2 \\
A_2 & 3 & 1 & 2 & 3 \\
A_3 & -1 & 3 & 2 & 1
\end{array}
$$

Again, from player B's point of view, the fourth strategy $B_4$ is dominated by $B_1$, as every element of the fourth column is greater than or equal to the corresponding element of the first column. Deleting the fourth column gives

$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_1 & 1 & 2 & -2 \\
A_2 & 3 & 1 & 2 \\
A_3 & -1 & 3 & 2
\end{array}
$$

In this reduced payoff matrix, no single pure strategy of player A or B is inferior to another. However, the convex combination of strategies $A_2$ and $A_3$ with equal weights (i.e. the average of the second and third rows, $(1, 2, 2)$) dominates the payoff row of strategy $A_1$. Thus, strategy $A_1$ may be deleted, giving

$$
\begin{array}{c|ccc}
 & B_1 & B_2 & B_3 \\ \hline
A_2 & 3 & 1 & 2 \\
A_3 & -1 & 3 & 2
\end{array}
$$

In this matrix, column $B_3$ dominates the equal-weight convex combination of columns $B_1$ and $B_2$ (their average is $(2, 1)$, and each element of $B_3$ is at least as large). Thus, strategy $B_3$ may be deleted, and we get the reduced 2×2 payoff matrix

$$
\begin{array}{c|cc}
 & B_1 & B_2 \\ \hline
A_2 & 3 & 1 \\
A_3 & -1 & 3
\end{array}
$$

Now we have to solve this reduced game, whose payoff matrix is of order 2×2.
Clearly, this game has no saddle point and cannot be reduced further, so the optimal strategies are mixed. Let player A choose his strategies $A_2$ and $A_3$ with probabilities $x_2$ and $x_3$, and let B choose $B_1$ and $B_2$ with probabilities $y_1$ and $y_2$. Then $x_2 + x_3 = 1$, $y_1 + y_2 = 1$, $x_2, x_3, y_1, y_2 > 0$, and v is the value of the game.

To determine the optimal values of $x_2$ and $x_3$, we require

$$3x_2 - x_3 = x_2 + 3x_3 = v,$$

which gives $x_2 = \tfrac{2}{3}$ and hence $x_3 = 1 - x_2 = \tfrac{1}{3}$.

To determine the optimal values of $y_1$ and $y_2$, we require

$$3y_1 + y_2 = -y_1 + 3y_2,$$

which gives $y_1 = \tfrac{1}{3}$ and hence $y_2 = \tfrac{2}{3}$.

The value of the game is $v = 3x_2 - x_3 = 3 \cdot \tfrac{2}{3} - \tfrac{1}{3} = \tfrac{5}{3}$.

Hence, the solution of the game is as follows: the optimal strategies are $\xi^* = (0, \tfrac{2}{3}, \tfrac{1}{3}, 0)$ and $\eta^* = (\tfrac{1}{3}, \tfrac{2}{3}, 0, 0)$, and the value of the game is $\tfrac{5}{3}$.

An important result If a constant is added to every element of a matrix game, the optimal strategies remain unaltered and the value of the game is increased by that constant. This result can easily be proved.
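As a check (illustrative only), the closed-form 2×2 formulas derived earlier reproduce this example's solution, and also exhibit the constant-addition result:

```python
from fractions import Fraction

def solve_2x2(a11, a12, a21, a22):
    """Closed-form mixed-strategy solution of a 2x2 game with no saddle point:
    x* = (a22-a21)/D, y* = (a22-a12)/D, v = (a11*a22 - a12*a21)/D,
    where D = a11 + a22 - (a12 + a21)."""
    D = a11 + a22 - (a12 + a21)
    x = Fraction(a22 - a21, D)              # probability of the first row
    y = Fraction(a22 - a12, D)              # probability of the first column
    v = Fraction(a11 * a22 - a12 * a21, D)  # value of the game
    return x, y, v

x, y, v = solve_2x2(3, 1, -1, 3)
print(x, y, v)          # 2/3 1/3 5/3

# Adding a constant c shifts the value by c but leaves the strategies unchanged:
c = 10
xc, yc, vc = solve_2x2(3 + c, 1 + c, -1 + c, 3 + c)
assert (xc, yc) == (x, y) and vc == v + c
```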
13.7
Graphical Solution of 2 n or m 2 Games
Any rectangular game of order 2 × n or m × 2 can be reduced to a 2 × 2 game by a graphical method, and the reduced game of order 2 × 2 can be solved by an algebraic method. Let us consider a 2 × n game without a saddle point. Its payoff matrix is as follows:

                 Player B
                 B1    B2   ...   Bn
Player A   A1    a11   a12  ...   a1n
           A2    a21   a22  ...   a2n
Let ξ = (x1, x2) and η = (y1, y2, …, yn) be the mixed strategies for players A and B respectively. When player B uses his pure strategy Bj, the expected gain of player A is given by

E(ξ, Bj) = a1j x1 + a2j x2 = a1j x1 + a2j (1 - x1), j = 1, 2, …, n,   (13.29)

since x1 + x2 = 1. Clearly, both x1 and x2 must lie in the open interval (0, 1) [because if either x1 = 1 or x2 = 1, the game reduces to a game of pure strategy, which is against our assumption]. Hence E(ξ, Bj) is a linear function of x1 (or, equivalently, of x2). Considering it as a linear function of x1, we have

E(ξ, Bj) = a2j for x1 = 0 and E(ξ, Bj) = a1j for x1 = 1.

Hence, E(ξ, Bj) represents a line segment joining the points (0, a2j) and (1, a1j). Now player A expects at least the gain v; therefore E(ξ, Bj) ≥ v for all j. Treating these inequalities as strict equations and using a graphical method, we shall find two particular moves of B that together determine the maximum of the minimum gain of A. Let us draw two parallel lines x1 = 0 and x1 = 1 at unit distance apart. Now draw the n line segments joining the points (0, a2j) and (1, a1j), j = 1, 2, …, n. The lower envelope (or lower boundary) of these line segments (indicated by a thick line) gives the minimum expected gain of A as a function of x1, and the highest point of the lower envelope gives the maximum of the minimum gain of A. The line segments passing through this point correspond to two pure moves of B, say Bk and Bl; these are the critical moves for B. The 2 × 2 payoff matrix corresponding to A's moves A1, A2 and B's moves Bk, Bl then produces the required result. Thus, solving this 2 × 2 game algebraically, we can find the value of the game. If more than two line segments pass through the highest (maximin) point, there are ties for the optimum mixed strategies of player B; any two such lines with slopes of opposite sign determine an alternative optimum for player B. Again, if there is more than one maximin point, alternative optima exist corresponding to these points.

Example 5 Solve the following 2 × 4 game graphically:

                 Player B
                 B1   B2   B3   B4
Player A   A1     1    3   -3    7
           A2     2    5    4   -6
Solution Clearly, the given problem does not possess a saddle point in pure strategies. Hence, this problem is solved with mixed strategies. Let player A play the mixed strategy ξ = (x1, x2), where x1 + x2 = 1 and both x1 and x2 lie in the open interval (0, 1). Then player A's expected gains against B's pure moves are given by

B's pure strategy     A's expected gain
B1                    E(x1, B1) = x1 + 2(1 - x1)
B2                    E(x1, B2) = 3x1 + 5(1 - x1)
B3                    E(x1, B3) = -3x1 + 4(1 - x1)
B4                    E(x1, B4) = 7x1 - 6(1 - x1)
Draw two vertical lines x1 = 0 and x1 = 1 at unit distance apart, marked with the same scale as in Fig. 13.1. Between them, draw the line segments of the expected-gain equations; these represent A's expected gain against each of B's pure moves, and we label them B1, B2, B3, B4. Since player A wishes to maximize his minimum expected gain, the highest point of the lower envelope, namely the intersection H of the line segments B3 and B4, represents the maximin expected value of the game for player A. Hence, the solution of the original game reduces to that of a simpler game with the following 2 × 2 payoff matrix:

                 Player B
                 B3   B4
Player A   A1    -3    7
           A2     4   -6

Fig. 13.1 Graphical representation of the 2 × 4 game: the line segments B1-B4 are drawn between the verticals x1 = 0 and x1 = 1, and the maximin point H lies on the lower envelope at the intersection of B3 and B4
Now, if ξ = (x1, x2) and η = (y3, y4) are the optimum strategies for players A and B, then we have

x1 = (a22 - a21) / [(a11 + a22) - (a12 + a21)] = (-6 - 4) / [(-3 - 6) - (7 + 4)] = -10/-20 = 1/2
⟹ x2 = 1 - x1 = 1/2.

Again, y3 = (a22 - a12) / [(a11 + a22) - (a12 + a21)] = (-6 - 7)/(-20) = 13/20, y4 = 1 - y3 = 7/20, and the value of the game is given by

v = (a11 a22 - a12 a21) / [(a11 + a22) - (a12 + a21)] = (18 - 28)/(-20) = 1/2.
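The graphical argument can also be imitated numerically: evaluate each line E(x1, Bj) on a fine grid of x1 values and take, at each grid point, the minimum over j (the lower envelope); the maximin point is then the grid maximum of that envelope. A small illustrative sketch (not from the book):

```python
# Columns of the 2x4 payoff matrix of Example 5: entries (a1j, a2j) for B1..B4.
cols = [(1, 2), (3, 5), (-3, 4), (7, -6)]

def envelope(x1):
    # A's expected gain against each pure move of B, then the worst case.
    return min(a1 * x1 + a2 * (1 - x1) for a1, a2 in cols)

# Scan the unit interval for the highest point of the lower envelope.
grid = [i / 1000 for i in range(1001)]
best_x1 = max(grid, key=envelope)
print(best_x1, envelope(best_x1))  # maximin point x1 = 0.5, value 0.5
```

The scan recovers x1 = 1/2 and v = 1/2, agreeing with the algebraic solution of the reduced 2 × 2 game.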
Hence, the solution of the game is as follows:

(i) Optimal strategies ξ = (1/2, 1/2), η = (0, 0, 13/20, 7/20).
(ii) The value of the game is 1/2.

Example 6 By the graphical method, solve the game whose payoff matrix is given by

                 Player B
                 B1   B2   B3   B4
Player A   A1     2    2    3   -1
           A2     4    3    2    6

Solution In a similar way, draw the graph. Here three line segments pass through the highest point H of the lower envelope. Accordingly we get three square matrices of order 2, and hence three candidate solutions. However, we must select a pair of lines whose slopes are opposite in sign. Thus, we get two reduced games with payoff matrices of order 2 × 2 (see Fig. 13.2).

Fig. 13.2 Graphical representation of the 2 × 4 game: the line segments B1-B4 with the lower envelope and the maximin point H
The two reduced games are

                 Player B                         Player B
                 B2   B3                          B3   B4
Player A   A1     2    3    and   Player A   A1    3   -1
           A2     3    2                     A2    2    6

Solving these, we get the value of the game as 5/2 in both cases, with optimal strategies ξ = (1/2, 1/2), η = (0, 1/2, 1/2, 0) for the first case and ξ = (1/2, 1/2), η = (0, 0, 7/8, 1/8) for the second case.

Any m × 2 game can be solved by using B's mixed strategy η = (y1, y2), where y1 + y2 = 1 and both y1 and y2 lie in the open interval (0, 1); A's particular critical moves can then be determined graphically. In this case, the problem is to determine the minimum of the maximum expected loss of B. Thus, we select the lowest (minimax) point of the upper envelope, and A's critical moves are those whose corresponding line segments pass through the minimax point. Selecting the corresponding 2 × 2 payoff matrix, the value of the game can then be determined easily.

Example 7 Solve the game whose payoff matrix is as follows:

                 Player B
                 B1   B2
           A1     2   -3
Player A   A2    -2    5
           A3    -1    0
Here H is the lowest point of the upper envelope; it is the point of intersection of the two line segments A1 and A2 (see Fig. 13.3). Hence, the payoff matrix of the reduced game will be

           B1   B2
     A1     2   -3
     A2    -2    5

Now, solving the reduced game by the algebraic method, we easily find the optimal strategies and the value of the game. For this problem, the optimal strategies are

ξ = (7/12, 5/12, 0), η = (2/3, 1/3),

and the value of the game is 1/3.
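For the m × 2 case the same numerical imitation works with the roles reversed: B's expected loss against each pure move of A is a line in y1, the upper envelope is the pointwise maximum, and B seeks its lowest point. A sketch for Example 7 (illustrative, not from the book):

```python
# Rows of the 3x2 payoff matrix of Example 7: entries (a_i1, a_i2) for A1..A3.
rows = [(2, -3), (-2, 5), (-1, 0)]

def upper(y1):
    # B's worst-case expected loss over A's pure moves.
    return max(b1 * y1 + b2 * (1 - y1) for b1, b2 in rows)

# Scan the unit interval for the lowest point of the upper envelope.
grid = [i / 3000 for i in range(3001)]
best_y1 = min(grid, key=upper)
print(best_y1, upper(best_y1))  # minimax point near y1 = 2/3, value near 1/3
```

The scan recovers y1 ≈ 2/3 and v ≈ 1/3, agreeing with the algebraic solution above.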
Fig. 13.3 Graphical representation of the 3 × 2 game: the line segments A1, A2, A3 are drawn between the verticals y1 = 0 and y1 = 1, and the minimax point H lies on the upper envelope at the intersection of A1 and A2
Symmetric game A game is said to be symmetric if the payoff matrix remains unchanged on interchanging the positions of the two players. Hence, the payoff matrix of a symmetric game is skew-symmetric; i.e. it is a square matrix in which a_ij = -a_ji and a_ii = 0, since the gain of one player is equal to the loss of the other.

Theorem 13.20 In a symmetric game, the sets of all optimal strategies of players 1 and 2 are the same, and the value of the game is zero.

Proof For a matrix game, the expected payoff for player 1 is given by

E(ξ, η) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij x_i y_j,

where ξ = (x1, x2, …, xm) and η = (y1, y2, …, yn) are the mixed strategies of players 1 and 2 respectively and (a_ij)_{m×n} is the payoff matrix. In a symmetric game, the payoff matrix is skew-symmetric, so E(ξ, ξ) = 0 and E(η, η) = 0 for every mixed strategy. Let ξ* and η* be the optimal strategies for players 1 and 2 respectively. Then the value of the game is given by v = E(ξ*, η*). Again, we know that

v = Max_ξ Min_η E(ξ, η) = Min_η E(ξ*, η)

⟹ E(ξ*, η) ≥ v for every η. Setting η = ξ* above, we have E(ξ*, ξ*) ≥ v.
But E(ξ*, ξ*) = 0; therefore

v ≤ 0.   (13.30)

Again,

v = Min_η Max_ξ E(ξ, η) = Max_ξ E(ξ, η*)

⟹ E(ξ, η*) ≤ v for every ξ. Setting ξ = η*, we have E(η*, η*) ≤ v. But E(η*, η*) = 0; therefore

v ≥ 0.   (13.31)

Hence, from (13.30) and (13.31), we have v = 0, and therefore ξ* = η*. This completes the proof.
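The key fact used in the proof, E(ξ, ξ) = 0 whenever the payoff matrix is skew-symmetric, is easy to confirm numerically. The sketch below (illustrative; the matrix and mixture are sample values, not from the book) checks it:

```python
# A sample skew-symmetric payoff matrix (a_ij = -a_ji, a_ii = 0).
A = [[0, 2, -1],
     [-2, 0, 3],
     [1, -3, 0]]

def expected_payoff(A, x, y):
    # E(x, y) = sum_i sum_j a_ij * x_i * y_j
    return sum(A[i][j] * x[i] * y[j]
               for i in range(len(A)) for j in range(len(A[0])))

x = [0.5, 0.3, 0.2]              # any mixed strategy
print(expected_payoff(A, x, x))  # vanishes for skew-symmetric A
```

The terms a_ij x_i x_j and a_ji x_j x_i cancel in pairs, which is exactly why v must be zero in a symmetric game.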
13.8
Matrix Games and Linear Programming Problems (LPPs)
Theorem 13.21 A matrix game is equivalent to a certain LPP.

Proof Let A = (a_ij)_{m×n} be the payoff matrix and v be the value of the game. It is assumed that v > 0. If not, a suitable constant may be added to each element of A so as to make v positive; this change does not affect the optimal strategies of the two players, and only the value of the game is increased by this constant. Let ξ*, η* be the optimal strategies and v (> 0) be the value of the game. Then

v ≤ E(ξ*, η) ∀η ∈ S_n, i.e. v ≤ E(ξ*, j) ∀j = 1, 2, …, n.

We can say that v is the largest number u (v ≥ u) such that for some ξ ∈ S_m

u ≤ E(ξ, j), j = 1, 2, …, n.   (13.32)

ξ is an optimal strategy for player 1 iff ξ satisfies (13.32) with u replaced by v. Now (13.32) can also be written as

Σ_{i=1}^{m} a_ij x_i ≥ u, j = 1, 2, …, n.   (13.33)
Also, we have

Σ_{i=1}^{m} x_i = 1, x_i ≥ 0, i = 1, 2, …, m.   (13.34)
Since v is assumed to be positive, u is also positive, so that from (13.33) and (13.34) we have

Σ_{i=1}^{m} a_ij (x_i/u) ≥ 1, j = 1, 2, …, n

and

Σ_{i=1}^{m} x_i/u = 1/u, x_i/u ≥ 0, i = 1, 2, …, m.

Now, writing X_i = x_i/u, we get

Σ_{i=1}^{m} a_ij X_i ≥ 1, j = 1, 2, …, n

and

Σ_{i=1}^{m} X_i = 1/u, X_i ≥ 0, i = 1, 2, …, m.

For the optimal strategy, v = Max u (we can regard u as a function of ξ), or

1/v = Min_ξ (1/u) = Min_ξ Σ_{i=1}^{m} X_i.

Thus, finding v and ξ* is equivalent to solving the following LPP:

Minimize z = Σ_{i=1}^{m} X_i

subject to

Σ_{i=1}^{m} a_ij X_i ≥ 1, j = 1, 2, …, n,
X_i ≥ 0, i = 1, 2, …, m.   (13.35)

If z_Min is the optimal value and (X1*, …, Xm*) is an optimal solution of (13.35), then the value of the game is v = 1/z_Min, and an optimal strategy for player 1 is ξ* = v (X1*, …, Xm*) = (X1*, …, Xm*)/z_Min.
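For a 2 × 2 game the LPP (13.35) can be solved exactly by hand, since at the optimum both constraints are tight when the game has no saddle point. The sketch below (illustrative Python under that assumption, not the book's procedure) does this for the matrix [[5, 3], [1, 5]], which is the earlier 2 × 2 game [[3, 1], [-1, 3]] shifted by the constant 2 to make its value positive:

```python
from fractions import Fraction as F

# Payoff matrix with a positive value (original [[3,1],[-1,3]] plus 2).
a11, a12, a21, a22 = F(5), F(3), F(1), F(5)

# (13.35) for m = n = 2 with both constraints tight:
#   a11*X1 + a21*X2 = 1   (column B1)
#   a12*X1 + a22*X2 = 1   (column B2)
det = a11 * a22 - a21 * a12
X1 = (a22 - a21) / det
X2 = (a11 - a12) / det

z = X1 + X2             # minimized objective z_Min
v = 1 / z               # value of the shifted game
xi = (v * X1, v * X2)   # optimal strategy for player 1
print(v, xi)            # value 11/3, strategy (2/3, 1/3)
```

Subtracting the shift constant 2 gives back the original value 11/3 - 2 = 5/3, and the optimal strategy (2/3, 1/3) is unchanged, in line with the "important result" stated earlier.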
Again, we have v ≥ E(ξ, η*) ∀ξ ∈ S_m. Hence, v ≥ E(i, η*), i = 1, 2, …, m. Thus, v is the smallest number w (v ≤ w) such that for some η ∈ S_n

w ≥ E(i, η), i = 1, …, m.   (13.36)

η is an optimal strategy for player 2 iff η satisfies (13.36) with w replaced by v. Now (13.36) can be written as

Σ_{j=1}^{n} a_ij y_j ≤ w, i = 1, 2, …, m.   (13.37)

Also, we have

Σ_{j=1}^{n} y_j = 1, y_j ≥ 0, j = 1, 2, …, n.   (13.38)

Since v is assumed to be positive, we can assume w > 0, so that from (13.37) and (13.38) we have

Σ_{j=1}^{n} a_ij Y_j ≤ 1, i = 1, 2, …, m

and

Σ_{j=1}^{n} Y_j = 1/w, Y_j ≥ 0, j = 1, 2, …, n,

where Y_j = y_j/w. For the optimal strategy, v = Min w (w can be regarded as a function of η), or

1/v = Max_η (1/w) = Max_η Σ_{j=1}^{n} Y_j.

Thus, finding v and η* is equivalent to solving the following LPP:

Maximize z′ = Σ_{j=1}^{n} Y_j

subject to

Σ_{j=1}^{n} a_ij Y_j ≤ 1, i = 1, 2, …, m,
Y_j ≥ 0, j = 1, 2, …, n.   (13.39)

If z′_Max is the optimal value and (Y1*, …, Yn*) is an optimal solution of (13.39), then v = 1/z′_Max and η* = v (Y1*, …, Yn*) = (Y1*, …, Yn*)/z′_Max.
Note that (13.39) is the dual of (13.35) and vice versa.

Example 8 Solve the following game by reducing it to an LPP:

                 B1   B2   B3
           A1     1   -1    3
           A2     3    5   -3
           A3     6    2   -2
Solution First we construct the table of row minima and column maxima as follows:

                    B1   B2   B3   Row minima
           A1        1   -1    3       -1  ← maximin
           A2        3    5   -3       -3
           A3        6    2   -2       -2
Column maxima        6    5    3
                               ↑ minimax

Hence, maximin value = -1 and minimax value = 3, so the value of the game lies between -1 and 3, i.e. -1 ≤ v ≤ 3. This means that the value of the game may be negative or zero. Thus, we add the constant k = 4 to all the elements of the matrix so that every element of the payoff matrix becomes positive, which ensures that the value of the transformed game is positive. The transformed matrix is as follows:

                 B1   B2   B3
           A1     5    3    7
           A2     7    9    1
           A3    10    6    2

Let p = (p1, p2, p3) and q = (q1, q2, q3) be the mixed strategies of the two players respectively. The problem of player B is then as follows:

Maximize z′ = x1 + x2 + x3

subject to

5x1 + 3x2 + 7x3 ≤ 1
7x1 + 9x2 + x3 ≤ 1
10x1 + 6x2 + 2x3 ≤ 1

and x1, x2, x3 ≥ 0.

Now, if Max z′ = 1/v̄, where v̄ is the value of the transformed game, then the value of the original game is v = v̄ - 4, and the optimal solution for B is q_j = x_j v̄, j = 1, 2, 3.
Introducing the slack variables x4, x5, x6, we get

5x1 + 3x2 + 7x3 + x4 = 1
7x1 + 9x2 + x3 + x5 = 1
10x1 + 6x2 + 2x3 + x6 = 1.

First tableau (basis y4, y5, y6):

 cB   Basis   xB     y1    y2    y3    y4   y5   y6   Min ratio
  0    y4      1      5     3     7     1    0    0     1/7 ←
  0    y5      1      7     9     1     0    1    0     1/1
  0    y6      1     10     6     2     0    0    1     1/2
 zB = 0       Δj     -1    -1    -1     0    0    0
                                  ↑

y3 enters the basis and y4 leaves. Second tableau (basis y3, y5, y6):

 cB   Basis   xB     y1     y2     y3    y4     y5   y6   Min ratio
  1    y3     1/7    5/7    3/7    1     1/7    0    0      1/3
  0    y5     6/7   44/7   60/7    0    -1/7    1    0      1/10 ←
  0    y6     5/7   60/7   36/7    0    -2/7    0    1      5/36
 zB = 1/7     Δj    -2/7   -4/7    0     1/7    0    0
                            ↑

y2 enters the basis and y5 leaves. Third tableau (basis y3, y2, y6):

 cB   Basis   xB      y1     y2   y3    y4      y5     y6
  1    y3     1/10    2/5    0    1     3/20   -1/20   0
  1    y2     1/10   11/15   1    0    -1/60    7/60   0
  0    y6     1/5    24/5    0    0    -1/5    -3/5    1
 zB = 1/5     Δj      2/15   0    0     2/15    1/15   0

Since all Δj = zj - cj ≥ 0, the optimal solution is attained: x1* = 0, x2* = 1/10, x3* = 1/10 and Max z′ = 1/5. From Max z′ = 1/v̄, we have v̄ = 5, so

q1 = x1* v̄ = 0, q2 = x2* v̄ = (1/10)·5 = 1/2, q3 = x3* v̄ = (1/10)·5 = 1/2,

and the value of the game is v = v̄ - 4 = 5 - 4 = 1.
Since A's problem is the dual of B's problem, duality theory lets us read off the optimal values of A's variables from the Δj entries of the slack columns in the final tableau (w_i = Δ_{3+i}, where Δj = zj - cj):

w1 = 2/15, w2 = 1/15, w3 = 0

⟹ p1 = w1 v̄ = (2/15)·5 = 2/3, p2 = w2 v̄ = (1/15)·5 = 1/3, p3 = 0.

Hence, the optimal solution is the following:

(i) The best strategy for A is p = (2/3, 1/3, 0).
(ii) The best strategy for B is q = (0, 1/2, 1/2).
(iii) The value of the game = 1.
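The solution of Example 8 can be verified directly from the equilibrium inequalities E(i, q) ≤ v̄ ≤ E(p, j) on the transformed matrix, without re-running the simplex method. The check below is an illustrative sketch using exact rational arithmetic:

```python
from fractions import Fraction as F

# Transformed matrix (original plus k = 4) and the computed solution.
A = [[5, 3, 7],
     [7, 9, 1],
     [10, 6, 2]]
p = [F(2, 3), F(1, 3), F(0)]   # row player's optimal strategy
q = [F(0), F(1, 2), F(1, 2)]   # column player's optimal strategy

# A's mixture guarantees at least 5 against every column of B ...
row_vals = [sum(p[i] * A[i][j] for i in range(3)) for j in range(3)]
# ... and B's mixture concedes at most 5 against every row of A.
col_vals = [sum(A[i][j] * q[j] for j in range(3)) for i in range(3)]

print(min(row_vals), max(col_vals))  # both equal v_bar = 5
print(F(5) - 4)                      # value of the original game: 1
```

Since the guaranteed gain and conceded loss coincide at v̄ = 5, the pair (p, q) is optimal, and subtracting k = 4 recovers the original value 1.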
13.9
Infinite Antagonistic Game
Definition An infinite antagonistic game is a game in which at least one of the players possesses an infinite number of strategies. Let X and Y be the sets of strategies for players 1 and 2 respectively, and let M(x, y) (x ∈ X, y ∈ Y) be the payoff function of the antagonistic game ⟨X, Y, M⟩. In the case of the matrix game, recall that X, Y are discrete sets of (pure) strategies for players 1 and 2 respectively. In that case, we defined a mixed strategy for player 1 as a vector ξ having m components (m being the number of pure strategies), each component denoting the probability of choosing the corresponding pure strategy. A mixed strategy η for player 2 was defined similarly. We can extend this idea of mixed strategy to the case where X and Y are infinite. In this case, a mixed strategy for player 1 will be described by a probability distribution function F(x), where F(x′_i) - F(x″_i) denotes the probability of choosing x with x″_i < x < x′_i. Similarly, a probability distribution function G(y) will describe a mixed strategy for player 2. Note that F(x) is non-negative and non-decreasing and ∫_X dF(x) = 1; G(y) is likewise non-negative and non-decreasing with ∫_Y dG(y) = 1. Here the integration is in the Stieltjes sense. Let

D_X = { F(x) : 0 ≤ F(x) ≤ 1, ∫_X dF(x) = 1 },
D_Y = { G(y) : 0 ≤ G(y) ≤ 1, ∫_Y dG(y) = 1 }.
For a fixed y, let player 1 choose x ∈ X with probability distribution F(x); i.e. the probability of choosing x with x″_i < x < x′_i is F(x′_i) - F(x″_i), and the corresponding expected payoff to player 1 is M(x, y)[F(x′_i) - F(x″_i)]. Hence, for a fixed y, the expected payoff to player 1 is approximately

Σ_{i=1}^{n} M(x_i, y)[F(x_i) - F(x_{i-1})],

where X is divided into n intervals by the points x_i, i = 0, 1, …, n. If we now let n → ∞ in such a way that Max_i |x_i - x_{i-1}| → 0, this becomes the Stieltjes integral of M with respect to F(x) over X:

E(F, y) = J(y) = ∫_X M(x, y) dF(x).
In a similar way, when player 2 chooses y with probability distribution G(y), the expected payoff to player 1 is Σ_{j=1}^{n′} J(y_j)[G(y_j) - G(y_{j-1})], where Y is divided into n′ intervals by the points y_0, …, y_{n′}. Now letting n′ → ∞ in such a way that Max_j |y_j - y_{j-1}| → 0, the expected payoff to player 1, when his strategy is F(x) and player 2's strategy is G(y), is

E(F, G) = ∫_Y ( ∫_X M(x, y) dF(x) ) dG(y).
Note

(i) If M(x, y) is a continuous function of x, y for x ∈ X, y ∈ Y, then the order of integration may be interchanged:

E(F, G) = ∫_Y ( ∫_X M(x, y) dF(x) ) dG(y) = ∫_X ( ∫_Y M(x, y) dG(y) ) dF(x) = ∫_Y ∫_X M(x, y) dF(x) dG(y).

(ii) When F(x) is the Heaviside function

H(x - x0) = { 0 for x < x0; 1 for x > x0 },

E(F, G) becomes ∫_Y M(x0, y) dG(y), and we denote it by E(x0, G). Thus, E(x0, G) is the expected payoff to player 1 when he chooses the simple strategy H(x - x0) and player 2 chooses the strategy G. Similarly, E(F, y0) = ∫_X M(x, y0) dF(x) is the expected payoff to player 1 when he chooses the strategy F(x) and player 2 chooses the simple strategy H(y - y0).

(iii) Let ⟨X, Y, M⟩ be an infinite antagonistic game. Then it is strategically equivalent to the game ⟨D_X, D_Y, E⟩, where E = E(F, G), F ∈ D_X, G ∈ D_Y.
Equilibrium situation

Definition Let (F*, G*) be a situation in mixed strategies for the infinite antagonistic game ⟨X, Y, M⟩ or ⟨D_X, D_Y, E⟩. This situation is called an equilibrium situation in mixed strategies, or a saddle point in mixed strategies, if for any strategies F, G of players 1 and 2 respectively the following inequalities are satisfied:

E(F, G*) ≤ E(F*, G*) ≤ E(F*, G) ∀F ∈ D_X, G ∈ D_Y.

For mixed strategies in infinite antagonistic games, we shall prove the following theorems.

Theorem 13.22 If E(x, G) ≤ u ∀x ∈ X for a fixed G, then E(F, G) ≤ u ∀F ∈ D_X [for a matrix game: E(i, η) ≤ u, i = 1, 2, …, m for a fixed η ∈ S_n implies that E(ξ, η) ≤ u ∀ξ ∈ S_m].

Proof We integrate both sides of E(x, G) ≤ u with respect to F(x) over X so as to obtain

∫_X E(x, G) dF(x) ≤ u ∫_X dF(x), i.e. E(F, G) ≤ u [since ∫_X dF(x) = 1].

Similarly, E(x, G) ≥ u ∀x ∈ X implies E(F, G) ≥ u ∀F ∈ D_X; E(F, y) > u ∀y ∈ Y implies E(F, G) > u ∀G ∈ D_Y; and E(F, y) < u ∀y ∈ Y implies E(F, G) < u ∀G ∈ D_Y.

Theorem 13.23 In order for the situation (F*, G*) to be an equilibrium situation in an infinite antagonistic game Γ = ⟨X, Y, M⟩, it is necessary and sufficient that for all x ∈ X and y ∈ Y the following inequalities be satisfied:

E(x, G*) ≤ E(F*, G*) ≤ E(F*, y).

[For a matrix game: E(i, η*) ≤ E(ξ*, η*) ≤ E(ξ*, j), i = 1, 2, …, m; j = 1, 2, …, n.]

This proof is left to the reader.

Theorem 13.24 For any mixed strategy G for player 2,

sup_{x∈X} E(x, G) = sup_{F∈D_X} E(F, G),

where the supremum on the left side is taken over all the pure strategies of player 1, and on the right side over all the mixed strategies.
Similarly, for a fixed strategy F of player 1,

inf_{y∈Y} E(F, y) = inf_{G∈D_Y} E(F, G).

[For a matrix game: Max_i E(i, η) = Max_ξ E(ξ, η) for some fixed η, and Min_j E(ξ, j) = Min_η E(ξ, η) for some fixed ξ.]

Proof Since the set of mixed strategies contains all the pure strategies (e.g. F(x) = H(x - x0) = { 0 for x < x0; 1 for x > x0 }), we have

sup_{x∈X} E(x, G) ≤ sup_{F∈D_X} E(F, G).

Now let, if possible, sup_{x∈X} E(x, G) < sup_{F∈D_X} E(F, G). This implies that there exists F1 ∈ D_X such that for some ε > 0

sup_{x∈X} E(x, G) < E(F1, G) - ε,
i.e. E(x, G) < E(F1, G) - ε ∀x ∈ X. Now integrating both sides with respect to F1(x), we have

∫_X E(x, G) dF1(x) < {E(F1, G) - ε} ∫_X dF1(x),

i.e. E(F1, G) < E(F1, G) - ε [since ∫_X dF1(x) = 1],

which is a contradiction. Hence sup_{x∈X} E(x, G) = sup_{F∈D_X} E(F, G). The proof of the second result is similar.

Value of an infinite antagonistic game

Definition If sup_F inf_G E(F, G) = inf_G sup_F E(F, G), then the common value of these mixed extrema is called the value of the game ⟨X, Y, M⟩.
Some properties of the value of the game and optimal strategies

Let v be the value of the game ⟨X, Y, M⟩.

Theorem 13.25

sup_x inf_y M(x, y) ≤ v ≤ inf_y sup_x M(x, y).

[For a matrix game: Max_i Min_j a_ij ≤ v ≤ Min_j Max_i a_ij.]

Proof We know that inf_{y∈Y} E(F, y) = inf_{G∈D_Y} E(F, G) for any arbitrary mixed strategy F of player 1. Let us choose a simple strategy F = H(x - x0) = { 0 for x < x0; 1 for x > x0 }. Then

inf_{y∈Y} M(x0, y) = inf_{G∈D_Y} E(x0, G) ∀x0 ∈ X,

i.e., changing x0 to x, inf_{y∈Y} M(x, y) = inf_{G∈D_Y} E(x, G). But

sup_x inf_y M(x, y) ≤ sup_F inf_y E(F, y) [as the set of mixed strategies D_X contains all pure strategies]
= sup_F inf_G E(F, G) = v,

i.e.

sup_x inf_y M(x, y) ≤ v.   (13.40)

Similarly, we can prove that

v ≤ inf_y sup_x M(x, y).   (13.41)

Hence, from (13.40) and (13.41), we have

sup_x inf_y M(x, y) ≤ v ≤ inf_y sup_x M(x, y).
Theorem 13.26 If player 1 possesses a pure optimal strategy x0 and player 2 possesses an arbitrary strategy (in general mixed), then

v = Max_x Min_y M(x, y) = Min_y M(x0, y).

Similarly, if player 2 possesses a pure optimal strategy y0 and player 1 possesses an arbitrary strategy, then

v = Min_y Max_x M(x, y) = Max_x M(x, y0).

The reader may easily prove this theorem.

Theorem 13.27 Let u be a real number.

(i) If F0 is a mixed strategy for player 1, then u ≤ E(F0, y) ∀y ∈ Y ⟹ u ≤ v.
(ii) If G0 is a mixed strategy for player 2, then u ≥ E(x, G0) ∀x ∈ X ⟹ u ≥ v.
Proof (i) We have u ≤ E(F0, y) ∀y ∈ Y. Now integrating both sides with respect to G(y) over Y, we have

u ∫_Y dG(y) ≤ ∫_Y E(F0, y) dG(y), or u ≤ E(F0, G) ∀G ∈ D_Y.

Hence

u ≤ inf_G E(F0, G) ≤ sup_F inf_G E(F, G) = v ⟹ u ≤ v.

The proof of (ii) is similar.

Theorem 13.28

(i) In order for the strategy F0 of player 1 to be optimal, it is necessary and sufficient that

E(F0, y) ≥ v ∀y ∈ Y.
(ii) In order for the strategy G0 of player 2 to be optimal, it is necessary and sufficient that

E(x, G0) ≤ v ∀x ∈ X.

Proof of (i) Necessity: Let F0 be an optimal strategy. This implies that

inf_G E(F0, G) = sup_F inf_G E(F, G) = v,

so that E(F0, G) ≥ v for all G ∈ D_Y. Choosing G to be a pure strategy H(y - y0), we obtain E(F0, y0) ≥ v for all y0 ∈ Y, i.e. E(F0, y) ≥ v for all y ∈ Y.

Sufficiency: Let

E(F0, y) ≥ v for all y ∈ Y.   (13.42)

We have to show that F0 is an optimal strategy for player 1. Integrating both sides of (13.42) with respect to G(y) over Y, we have

∫_Y E(F0, y) dG(y) ≥ v ∫_Y dG(y), or E(F0, G) ≥ v ∀G ∈ D_Y.

Hence inf_G E(F0, G) ≥ v; since also inf_G E(F0, G) ≤ sup_F inf_G E(F, G) = v, we get

inf_G E(F0, G) = v = sup_F inf_G E(F, G),

i.e. the supremum of the function inf_G E(F, G) is attained at F0. This implies that F0 is optimal.
13.10
Continuous Game
Definition An antagonistic game Γ = ⟨X, Y, M⟩ is called a game on the unit square if the sets of pure strategies of both players are the unit segment, i.e. X = Y = [0, 1]. Hence, the mixed strategies of the players in a game on the unit square are probability distributions on the segment [0, 1] such that ∫_0^1 dF(x) = 1 and ∫_0^1 dG(y) = 1. Here D_X = D_Y = D.

Definition A game on the unit square is called continuous if the payoff function is continuous.

Fundamental theorem for continuous games

Theorem 13.29 For any continuous game on the unit square with payoff function M(x, y), both players possess optimal strategies, and

Max_F Min_G E(F, G) and Min_G Max_F E(F, G)

exist and are equal. The proof of the theorem is left to the reader.
13.11
Separable Game
Definition A function M(x, y) of two variables x, y is called separable if there exist m continuous functions r_i(x), i = 1, 2, …, m, n continuous functions s_j(y), j = 1, 2, …, n and mn constants a_ij, i = 1, 2, …, m, j = 1, 2, …, n, such that

M(x, y) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij r_i(x) s_j(y).

Example 9 M(x, y) = x sin y + x cos y + 2x² is separable. We can take

r1(x) = x,   s1(y) = sin y,   a11 = 1, a12 = 1, a13 = 0,
r2(x) = x²,  s2(y) = cos y,   a21 = 0, a22 = 0, a23 = 2.
             s3(y) = 1,

We can also take

r1(x) = x,    s1(y) = sin y + cos y,   a11 = 1, a12 = 0,
r2(x) = 2x²,  s2(y) = 1,               a21 = 0, a22 = 1.
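That the two representations in Example 9 describe the same function is immediate algebraically; the quick check below (illustrative, not from the book) compares them at a few sample points:

```python
import math

def M(x, y):
    return x * math.sin(y) + x * math.cos(y) + 2 * x ** 2

def rep1(x, y):
    # r = (x, x^2), s = (sin y, cos y, 1), a = [[1, 1, 0], [0, 0, 2]]
    r, s = (x, x ** 2), (math.sin(y), math.cos(y), 1.0)
    a = [[1, 1, 0], [0, 0, 2]]
    return sum(a[i][j] * r[i] * s[j] for i in range(2) for j in range(3))

def rep2(x, y):
    # r = (x, 2x^2), s = (sin y + cos y, 1), a = identity
    return x * (math.sin(y) + math.cos(y)) + 2 * x ** 2 * 1.0

pts = [(0.1, 0.7), (0.5, 0.2), (0.9, 0.9)]
ok = all(abs(M(x, y) - rep1(x, y)) < 1e-9 and
         abs(M(x, y) - rep2(x, y)) < 1e-9 for x, y in pts)
print(ok)  # True
```

This illustrates the non-uniqueness noted below: the same separable function admits many different triples (r, s, a).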
Note

(i) The representation M(x, y) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij r_i(x) s_j(y) of a separable function is not unique.

(ii) If M(x, y) is separable, we may also write

M(x, y) = Σ_{j=1}^{n} ( Σ_{i=1}^{m} a_ij r_i(x) ) s_j(y) = Σ_{j=1}^{n} t_j(x) s_j(y),

where t_j(x) and s_j(y) are continuous functions of x and y respectively.

Thus, a continuous game on the unit square whose payoff function is separable is called a separable game.

Expectation function in a separable game

A separable game is ⟨X, Y, M⟩, where X = Y = [0, 1] and M(x, y) is separable. Hence, a mixed strategy for player 1 is a probability distribution function F(x), and a mixed strategy for player 2 is a probability distribution function G(y), where F, G ∈ D and

D = { F : 0 ≤ F ≤ 1, ∫_0^1 dF(x) = 1 } = { G : 0 ≤ G ≤ 1, ∫_0^1 dG(y) = 1 }.
If player 1 chooses a mixed strategy F(x) and player 2 a mixed strategy G(y) (F, G ∈ D), then the expectation of player 1 is

E(F, G) = ∫_0^1 ∫_0^1 M(x, y) dF(x) dG(y)
        = ∫_0^1 ∫_0^1 { Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij r_i(x) s_j(y) } dF(x) dG(y)
        = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij ( ∫_0^1 r_i(x) dF(x) ) ( ∫_0^1 s_j(y) dG(y) )
        = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij u_i v_j,

where

u_i = ∫_0^1 r_i(x) dF(x), i = 1, 2, …, m, and v_j = ∫_0^1 s_j(y) dG(y), j = 1, 2, …, n.
Note 1 There may exist two different strategies F1(x), F2(x) such that

∫_0^1 r_i(x) dF1(x) = ∫_0^1 r_i(x) dF2(x), i = 1, 2, …, m.

But then

E(F1, G) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij u_i v_j = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij ( ∫_0^1 r_i(x) dF2(x) ) v_j = E(F2, G) ∀G ∈ D.

Thus, it is immaterial whether player 1 uses strategy F1(x) or F2(x); in this case we say that the two strategies F1 and F2 are equivalent (with respect to the game).

P-space and Q-space

Let a = (u1, …, um)′ and b = (v1, …, vn)′; then a ∈ E^m, b ∈ E^n. Let

P = { a = (u1, u2, …, um)′ : u_i = ∫_0^1 r_i(x) dF(x), i = 1, 2, …, m, F ∈ D }.

Then obviously P ⊆ E^m. We also say that a and F correspond. Note that a given point a of the P-space, in general, corresponds to many different distribution functions. Similarly, let

Q = { b = (v1, v2, …, vn)′ : v_j = ∫_0^1 s_j(y) dG(y), j = 1, 2, …, n, G ∈ D }.

Then obviously Q ⊆ E^n. We also say that b and G correspond; again, a given point b of the Q-space, in general, corresponds to many different distribution functions.

Note If u corresponds to both F(x) and F1(x), and v corresponds to both G(y) and G1(y), then

E(F, G) = E(F1, G) = E(F, G1) = E(F1, G1) = Σ_{i,j} a_ij u_i v_j,

and we denote this common value by E(a, b).
Geometrical properties of P- and Q-spaces

Theorem 13.30 Let F_k(x) (k = 1, …, p) be distribution functions (i.e. strategies for player 1), and let F(x) = Σ_{k=1}^{p} c_k F_k(x), where c_k ≥ 0, k = 1, 2, …, p and Σ_{k=1}^{p} c_k = 1. Then F ∈ D. Let a_k correspond to F_k (k = 1, …, p). Then a = Σ_{k=1}^{p} c_k a_k corresponds to F, so that a ∈ P. A similar result holds for the Q-space.

Proof We have M(x, y) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij r_i(x) s_j(y). Let a_k = (u1^(k), …, um^(k))′, k = 1, …, p. Since a_k corresponds to F_k(x), we have

u_i^(k) = ∫_0^1 r_i(x) dF_k(x), i = 1, …, m.

Let a = (u1, …, um)′. Since a = Σ_{k=1}^{p} c_k a_k, we have

u_i = Σ_{k=1}^{p} c_k u_i^(k), i = 1, 2, …, m
    = Σ_{k=1}^{p} c_k ∫_0^1 r_i(x) dF_k(x)
    = ∫_0^1 r_i(x) d( Σ_{k=1}^{p} c_k F_k(x) )
    = ∫_0^1 r_i(x) dF(x).

Since F ∈ D, a corresponds to F, and hence a ∈ P. The proof for the Q-space is similar.

P*- and Q*-spaces
Let us define

P* = { ρ = (r1(t), …, rm(t))′ : 0 ≤ t ≤ 1 }, so that P* ⊆ E^m,
Q* = { σ = (s1(t), …, sn(t))′ : 0 ≤ t ≤ 1 }, so that Q* ⊆ E^n.

Properties of P*- and Q*-spaces

Property-1: P* ⊆ P and Q* ⊆ Q.
Property-2: Since r_i(x) and s_j(y) are continuous functions defined on a closed set, P* and Q* are bounded, closed and connected sets.

Property-3: The P-space is the convex hull of P*. Similarly, the Q-space is the convex hull of Q*.

Example 10 Let the payoff function be M(x, y) = cos 2πx cos 2πy + x + y, 0 ≤ x, y ≤ 1. Here we can write M(x, y) = Σ_{i=1}^{3} Σ_{j=1}^{3} a_ij r_i(x) s_j(y), where

r1(x) = x,        s1(y) = y,        a11 = 0, a12 = 0, a13 = 1,
r2(x) = cos 2πx,  s2(y) = cos 2πy,  a21 = 0, a22 = 1, a23 = 0,
r3(x) = 1,        s3(y) = 1,        a31 = 1, a32 = 0, a33 = 0.

Thus,

P* = { ρ = (r1, r2, r3)′ : r1(t) = t, r2(t) = cos 2πt, r3(t) = 1, 0 ≤ t ≤ 1 },
Q* = { σ = (s1, s2, s3)′ : s1(t) = t, s2(t) = cos 2πt, s3(t) = 1, 0 ≤ t ≤ 1 }.
Image of a ∈ P

Definition The image of a ∈ P is defined as the set of points b ∈ Q such that E(a, b) = Min_{g∈Q} E(a, g); we denote this image of a by Q(a), with Q(a) ⊆ Q. Note that Q(a) is, in general, a set of points of Q.

Image of b ∈ Q

Definition Similarly, the image of b ∈ Q is defined as the set of points a ∈ P such that E(a, b) = Max_{ξ∈P} E(ξ, b); we denote this image of b by P(b), with P(b) ⊆ P. Note that P(b) is, in general, a set of points of P.

Fixed point

Definition If b ∈ Q is an image point of a ∈ P and a ∈ P is an image point of b ∈ Q, then a is called a fixed point of the P-space and b a fixed point of the Q-space.

Theorem 13.31 If F1 is any distribution function and a ∈ P corresponds to F1, then F1 is an optimal strategy for player 1 iff a is a fixed point of P. Similarly, G is an optimal strategy for player 2 iff the corresponding point b ∈ Q is a fixed point of Q.

Proof Let F1 be a mixed strategy for player 1 and let it correspond to a. Now, a being a fixed point of P means that for some point b ∈ Q we have b ∈ Q(a) and a ∈ P(b); that is, b is an image point of a and a is an image point of b, i.e.
E(a, b) = Min_{g∈Q} E(a, g) and E(a, b) = Max_{ξ∈P} E(ξ, b).

Let G1 be a (mixed) strategy for player 2, and let it correspond to b ∈ Q. Then

E(F1, G1) = E(a, b) = Min_{g∈Q} E(a, g) = Min_{G∈D} E(F1, G)
⟹ E(F1, G1) ≤ E(F1, G) ∀G ∈ D.

Again,

E(F1, G1) = E(a, b) = Max_{ξ∈P} E(ξ, b) = Max_{F∈D} E(F, G1) ≥ E(F, G1) ∀F ∈ D,
i.e. E(F, G1) ≤ E(F1, G1) ∀F ∈ D.

Hence, we obtain

E(F, G1) ≤ E(F1, G1) ≤ E(F1, G) ∀F, G ∈ D.

Thus, F1 and G1 are optimal strategies.

Theorem 13.32 If a is any fixed point of P and b is any fixed point of Q, then a is an image point of b, and b is an image point of a.

Proof Since a is a fixed point of P, there exists a point b1 of Q such that b1 is an image point of a and a is an image point of b1, i.e. a ∈ P(b1) and b1 ∈ Q(a), i.e. such that
ð13:43Þ
and E ða; b1 Þ ¼ Min Eða; gÞ:
ð13:44Þ
n2P
g2Q
Similarly, since b is a fixed point of Q, then there exists a point a1 2 P such that a1 is an image point of b and b is an image point of a1 , i:e: a1 2 PðbÞ and b 2 Qða1 Þ i:e: E ða1 ; bÞ ¼ Max E ðn; bÞ
ð13:45Þ
and Eða1 ; bÞ ¼ Min E ða1 ; gÞ:
ð13:46Þ
n2P
g2Q
Thus, we have
E(a, b) ≤ Max_{ξ∈P} E(ξ, b)   [by the definition of maxima]
        = E(a1, b)            [by (13.45)]
        = Min_{g∈Q} E(a1, g)  [by (13.46)]
        ≤ E(a1, b1)           [by the definition of minima]
        ≤ Max_{ξ∈P} E(ξ, b1)  [by the definition of maxima]
        = E(a, b1)            [by (13.43)]
        = Min_{g∈Q} E(a, g)   [by (13.44)]
        ≤ E(a, b)             [by the definition of minima].

Since the first and last terms of this chain are equal, all the quantities involved are equal; thus, in particular,

E(a, b) = Max_{ξ∈P} E(ξ, b) = Min_{g∈Q} E(a, g),

which means that a ∈ P(b) and b ∈ Q(a), i.e. a is an image point of b and b is an image point of a.

General procedure to solve a separable game

Step-1: Plot the curves P* and Q* and determine their convex hulls to give the P-space and Q-space.
Step-2: Find Q(a) for every a ∈ P and P(b) for every b ∈ Q, i.e. find all the image points of the P-space and Q-space.
Step-3: Using the results of Step 2, find the fixed points of P and Q.
Step-4: Express the fixed points as convex linear combinations of points of P* and Q* respectively, and find the corresponding distribution functions; these distribution functions are the optimal strategies.

Example 11 Given that

M(x, y) = 3 cos 4x cos 5y + 5 cos 4x sin 5y + sin 4x cos 5y + sin 4x sin 5y,

find P* and Q*.

Solution We can write M(x, y) = Σ_{i=1}^{2} Σ_{j=1}^{2} a_ij r_i(x) s_j(y), where

r1(x) = cos 4x,  s1(y) = cos 5y,  0 ≤ x ≤ 1,
r2(x) = sin 4x,  s2(y) = sin 5y,  0 ≤ y ≤ 1,

and a11 = 3, a12 = 5, a21 = 1, a22 = 1.
Thus, P ¼ q ¼ ðr1 ; r2 Þ0 r1 ¼ cos 4t; r2 ¼ sin 4t; 0 t 1 and Q ¼ r ¼ ðs1 ; s2 Þ0 ; s1 ¼ cos 5t; s2 ¼ sin 5t; 0 t 1 . Canonical representation of the payoff function in a separable game For a separable game, let the payoff function M(x, y) be represented by M ðx; yÞ ¼
n X n X
aij ri ð xÞ sj ð yÞ þ
i¼1 j¼1
n X
bi r i ð x Þ þ
i¼1
n X
cj sj ð yÞ þ d;
j¼1
where (i) r1 ð x Þ; r 2 ð xÞ; . . . rn ð xÞ and s1 ð yÞ; . . .; sn ð yÞ are continuous functions over [0, 1]. (ii) det aij 6¼ 0. Any representation of a separable function in this form is called a canonical representation of the function M(x, y). We note that every separable function has a canonical Pn form. For if M(x, y) is separable, then it can always be written in the form j¼1 rj ð xÞsj ð yÞ so that det
(a_ij) ≠ 0 (here (a_ij) is the identity matrix, whose determinant is 1).

Example 12 Examine whether the representation

M(x, y) = xy − x e^y + 2x cos y + 2 e^x y + 3 e^x e^y + e^x cos y + 5 cos x e^y − 3 cos x cos y

is canonical or not.

Solution Let

r1(x) = x,      s1(y) = y,
r2(x) = e^x,    s2(y) = e^y,
r3(x) = cos x,  s3(y) = cos y.

Then

a11 = 1,  a12 = −1,  a13 = 2,   b1 = b2 = b3 = 0,
a21 = 2,  a22 = 3,   a23 = 1,   c1 = c2 = c3 = 0,
a31 = 0,  a32 = 5,   a33 = −3,  d = 0.

Now,

det(a_ij) = | 1  −1   2 |
            | 2   3   1 |  = 0,
            | 0   5  −3 |

so this representation is not canonical. However, from M(x, y) = xy − x e^y + 2x cos y + 2 e^x y + 3 e^x e^y + e^x cos y + 5 cos x e^y − 3 cos x cos y, we have
M(x, y) = (x − 2 cos x)[y + (7/5) cos y] − (x − 2 cos x)[e^y − (3/5) cos y]
        + 2(e^x + cos x)[y + (7/5) cos y] + 3(e^x + cos x)[e^y − (3/5) cos y],

i.e. M(x, y) = r1(x) s1(y) − r1(x) s2(y) + 2 r2(x) s1(y) + 3 r2(x) s2(y),

where

r1(x) = x − 2 cos x,    s1(y) = y + (7/5) cos y,
r2(x) = e^x + cos x,    s2(y) = e^y − (3/5) cos y.

Hence,

a11 = 1,  a12 = −1,  b1 = b2 = 0,
a21 = 2,  a22 = 3,   c1 = c2 = 0,  d = 0.

Therefore,

det(a_ij) = | 1  −1 |
            | 2   3 |  = 3 + 2 = 5 ≠ 0.

Thus, this representation is canonical.
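The regrouping above can be spot-checked numerically. The following sketch (an illustration only; the function names are ours) evaluates both forms of M(x, y) at a few sample points and confirms they agree, and that the determinant of the new 2 × 2 coefficient matrix is 5:

```python
import math

def M_original(x, y):
    """M(x, y) as first given: xy - x e^y + 2x cos y + 2 e^x y
    + 3 e^x e^y + e^x cos y + 5 cos x e^y - 3 cos x cos y."""
    return (x*y - x*math.exp(y) + 2*x*math.cos(y) + 2*math.exp(x)*y
            + 3*math.exp(x)*math.exp(y) + math.exp(x)*math.cos(y)
            + 5*math.cos(x)*math.exp(y) - 3*math.cos(x)*math.cos(y))

def M_canonical(x, y):
    """The canonical regrouping with r1 = x - 2 cos x, r2 = e^x + cos x,
    s1 = y + (7/5) cos y, s2 = e^y - (3/5) cos y."""
    r1, r2 = x - 2*math.cos(x), math.exp(x) + math.cos(x)
    s1, s2 = y + 1.4*math.cos(y), math.exp(y) - 0.6*math.cos(y)
    return r1*s1 - r1*s2 + 2*r2*s1 + 3*r2*s2

# determinant of the coefficient matrix [[1, -1], [2, 3]] of the new form
det = 1*3 - (-1)*2          # = 5, nonzero, so the form is canonical

for x, y in [(0.0, 0.0), (0.3, 0.7), (1.0, 0.25)]:
    assert abs(M_original(x, y) - M_canonical(x, y)) < 1e-9
```

Such a spot check is a useful safeguard when regrouping a separable payoff by hand, since a single wrong coefficient breaks the identity at almost every sample point.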
First critical point and second critical point
Let a separable function M(x, y) have the following canonical representation:

M(x, y) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij r_i(x) s_j(y) + Σ_{i=1}^{n} b_i r_i(x) + Σ_{j=1}^{n} c_j s_j(y) + d,

where r_i(x), s_j(y) (i, j = 1, 2, …, n) are continuous functions on [0, 1] and det(a_ij) ≠ 0. In this case, let F, a mixed strategy for player 1, correspond to a = (u1, …, un)′ ∈ P, and let G, a mixed strategy for player 2, correspond to b = (v1, …, vn)′ ∈ Q (note that both P, Q ⊆ E^n). Then

u_i = ∫₀¹ r_i(x) dF(x),   i = 1, 2, …, n,

v_j = ∫₀¹ s_j(y) dG(y),   j = 1, 2, …, n,
and

E(F, G) ≡ E(a, b) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij u_i v_j + Σ_{i=1}^{n} b_i u_i + Σ_{j=1}^{n} c_j v_j + d
                 = Σ_{j=1}^{n} ( Σ_{i=1}^{n} a_ij u_i + c_j ) v_j + Σ_{i=1}^{n} b_i u_i + d.

Alternatively,

E(F, G) = Σ_{i=1}^{n} ( Σ_{j=1}^{n} a_ij v_j + b_i ) u_i + Σ_{j=1}^{n} c_j v_j + d.

Since det(a_ij) ≠ 0, the system of equations

Σ_{i=1}^{n} a_ij u_i + c_j = 0,   j = 1, 2, …, n,        (13.47)

has a unique solution, as does the system of equations

Σ_{j=1}^{n} a_ij v_j + b_i = 0,   i = 1, 2, …, n.        (13.48)

The unique solution of (13.47), say π = (p1, p2, …, pn)′, is called the first critical point of the game (with respect to the given representation), and the unique solution of (13.48), say ν = (q1, q2, …, qn)′, is called the second critical point of the game (with respect to the given representation).

Note The point π may not belong to the P-space, and the point ν may not belong to the Q-space. However, if π ∈ P and ν ∈ Q, then the corresponding game problem can be solved easily by using the following two theorems.

Theorem 13.33 Let the payoff function of a separable game have a given canonical form

M(x, y) = Σ_{i=1}^{n} Σ_{j=1}^{n} a_ij r_i(x) s_j(y) + Σ_{i=1}^{n} b_i r_i(x) + Σ_{j=1}^{n} c_j s_j(y) + d,   det(a_ij) ≠ 0,

and also let π = (p1, p2, …, pn)′ and ν = (q1, q2, …, qn)′ be respectively the first and second critical points, and suppose π ∈ P, ν ∈ Q. Then π and ν are the fixed points, and the value of the game is given by

E(π, ν) = Σ_{i=1}^{n} b_i p_i + d = Σ_{j=1}^{n} c_j q_j + d.

Proof By the definition of critical points,
Σ_{i=1}^{n} a_ij p_i + c_j = 0,   j = 1, 2, …, n,

and

Σ_{j=1}^{n} a_ij q_j + b_i = 0,   i = 1, 2, …, n.

Now, for any point b ∈ Q, b = (v1, v2, …, vn)′,

E(π, b) = Σ_{j=1}^{n} ( Σ_{i=1}^{n} a_ij p_i + c_j ) v_j + Σ_{i=1}^{n} b_i p_i + d
        = 0 + Σ_{i=1}^{n} b_i p_i + d,

which is independent of b. Hence, Q(π) = Q. Similarly, P(ν) = P. Hence, π ∈ P(ν) and ν ∈ Q(π), so that π and ν are fixed points, and the value of the game is

E(π, ν) = Σ_{i=1}^{n} b_i p_i + d = Σ_{j=1}^{n} c_j q_j + d.
Theorem 13.34 Let the interior of Q contain a fixed point; then the first critical point belongs to P and is the only fixed point of P. Similarly, if the interior of P contains a fixed point, then the second critical point belongs to Q and is the only fixed point of Q.

Proof Let b = (v1, …, vn)′ be a fixed point of Q which lies in the interior of Q. Let a = (u1, …, un)′ be a fixed point of P, and let π = (p1, …, pn)′ be the first critical point. We wish to show that a = π. Suppose, if possible, a ≠ π. Since π is the unique solution of Σ_{i=1}^{n} a_ij p_i + c_j = 0, j = 1, 2, …, n, we have for some k ≤ n,

Σ_{i=1}^{n} a_ik u_i + c_k = g ≠ 0.

Let h be a real number of sign opposite to that of g, small enough in magnitude to ensure that the point

b̄ = (v1, …, v_{k−1}, v_k + h, v_{k+1}, …, vn)′ ∈ Q.

Now,

E(a, b̄) = Σ_{j=1}^{n} ( Σ_{i=1}^{n} a_ij u_i + c_j ) v̄_j + Σ_{i=1}^{n} b_i u_i + d
        = E(a, b) + h ( Σ_{i=1}^{n} a_ik u_i + c_k )
        = E(a, b) + hg < E(a, b), since hg < 0.
This means that a ∉ P(b), which contradicts the assumption that a and b are fixed points of P and Q respectively. Hence, a must coincide with π.

Corollary 1 Let the payoff of a separable game be given in a canonical form and let the first critical point π ∉ P. Then every fixed point of Q is in the boundary of Q. Similarly, if the second critical point ν ∉ Q, then every fixed point of P is in the boundary of P.

Corollary 2 Let the payoff function of a separable game be given in a canonical form and let π and ν be the first and second critical points respectively. Let π belong to the interior of P and ν belong to the interior of Q; then π is the only fixed point of P and ν is the only fixed point of Q.

Example 13 Solve the separable game whose payoff function is

M(x, y) = 3 cos 7x cos 8y + 5 cos 7x sin 8y + 2 sin 7x cos 8y + sin 7x sin 8y.

Solution Let

r1(x) = cos 7x,   s1(y) = cos 8y,
r2(x) = sin 7x,   s2(y) = sin 8y.

Then M(x, y) = 3 r1(x) s1(y) + 5 r1(x) s2(y) + 2 r2(x) s1(y) + r2(x) s2(y). Hence,

a11 = 3,  a12 = 5,   b1 = b2 = 0,
a21 = 2,  a22 = 1,   c1 = c2 = 0,

so det(a_ij) = | 3  5 |
               | 2  1 |  = 3 − 10 = −7 ≠ 0.

Hence, this representation is canonical.

P* and Q* spaces
Here

P* = {ρ = (r1, r2)′ : r1 = cos 7t, r2 = sin 7t, 0 ≤ t ≤ 1},
Q* = {σ = (s1, s2)′ : s1 = cos 8t, s2 = sin 8t, 0 ≤ t ≤ 1}.

The P-space is the circle P* together with its interior, and the Q-space is the circle Q* together with its interior. See Fig. 13.4a, b.

First and second critical points
π = (p1, p2)′ is the unique solution of
Fig. 13.4 a Graphical representation of P* space. b Graphical representation of Q* space
Σ_{i=1}^{2} a_ij p_i + c_j = 0,   j = 1, 2,

i.e.

3p1 + 2p2 = 0,
5p1 + p2 = 0,

so that p1 = 0, p2 = 0, and π = (0, 0)′ ∈ interior of the P-space. Now ν = (q1, q2)′ is the unique solution of

Σ_{j=1}^{2} a_ij q_j + b_i = 0,   i = 1, 2,

i.e.

3q1 + 5q2 = 0,
2q1 + q2 = 0,

so that q1 = 0, q2 = 0, and ν = (0, 0)′ ∈ interior of the Q-space. Hence, π is the only fixed point of P and ν is the only fixed point of Q.

Value of the game
The value of the game is

E(π, ν) = Σ_{i=1}^{2} b_i p_i + d = Σ_{j=1}^{2} c_j q_j + d = 0.
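For this example the two linear systems are tiny, and the critical points can be recovered in a few lines of code. The sketch below (the helper name `solve2` is ours) solves both 2 × 2 systems by Cramer's rule and confirms π = ν = (0, 0)′ with game value 0:

```python
def solve2(m, rhs):
    """Solve a 2x2 linear system m x = rhs by Cramer's rule."""
    (a, b), (c, d) = m
    det = a*d - b*c
    assert det != 0          # guaranteed here since the form is canonical
    return ((rhs[0]*d - rhs[1]*b) / det, (a*rhs[1] - c*rhs[0]) / det)

# Example 13 data: a11 = 3, a12 = 5, a21 = 2, a22 = 1, b = c = 0, d = 0
a = [[3, 5], [2, 1]]
b = [0, 0]
c = [0, 0]

# First critical point: sum_i a_ij p_i + c_j = 0 for j = 1, 2
pi_point = solve2([[a[0][0], a[1][0]],    # 3 p1 + 2 p2 = -c1
                   [a[0][1], a[1][1]]],   # 5 p1 + 1 p2 = -c2
                  [-c[0], -c[1]])
# Second critical point: sum_j a_ij q_j + b_i = 0 for i = 1, 2
nu_point = solve2(a, [-b[0], -b[1]])      # 3 q1 + 5 q2 = -b1 ; 2 q1 + q2 = -b2

value = b[0]*pi_point[0] + b[1]*pi_point[1]   # + d, with d = 0
# pi_point == (0.0, 0.0), nu_point == (0.0, 0.0), value == 0
```

Since b = c = 0 the systems are homogeneous, so the zero solution is immediate; the code is still a useful template for separable games with nonzero b, c.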
Fig. 13.5 Graphical representation of P* space, showing the fixed point (0, 0)′ and the boundary points A = (−1, 0)′ and B = (1, 0)′
Optimal strategies
We write the fixed point of P, i.e. π = (0, 0)′, as a convex combination of points on P*. See Fig. 13.5. We can obviously write

(0, 0)′ = ½ (−1, 0)′ + ½ (1, 0)′.

Hence, an optimal strategy for player 1 is

F(x) = ½ F_A(x) + ½ F_B(x),

where F_A(x) is a strategy corresponding to A = (−1, 0)′ and F_B(x) is a strategy corresponding to B = (1, 0)′. Now to find F_A(x), we note that it satisfies

−1 = ∫₀¹ r1(t) dF_A(t) = ∫₀¹ cos 7t dF_A(t),
 0 = ∫₀¹ r2(t) dF_A(t) = ∫₀¹ sin 7t dF_A(t).

These expressions are satisfied by F_A(x) = H(x − π/7), where H is the unit step function. Similarly, F_B(x) satisfies
1 = ∫₀¹ r1(t) dF_B(t) = ∫₀¹ cos 7t dF_B(t),
0 = ∫₀¹ r2(t) dF_B(t) = ∫₀¹ sin 7t dF_B(t).

These are satisfied by F_B(x) = H(x) or F_B(x) = H(x − 2π/7). Hence, an optimal strategy for player 1 is

either F(x) = ½ H(x − π/7) + ½ H(x)
or     F(x) = ½ H(x − π/7) + ½ H(x − 2π/7).

Hence, we can also write

(0, 0)′ = ½ (cos 7t1, sin 7t1)′ + ½ (cos 7t2, sin 7t2)′,  where t2 − t1 = π/7, 0 ≤ t1, t2 ≤ 1.

Hence, a most general optimal strategy for player 1 is

F(x) = ½ H(x − x1) + ½ H(x − x2),  where x2 − x1 = π/7, 0 ≤ x1, x2 ≤ 1.

Similarly, a most general optimal strategy for player 2 is
G(y) = ½ H(y − y1) + ½ H(y − y2),  where y2 − y1 = π/8, 0 ≤ y1, y2 ≤ 1.

Also, the conditions

0 = ∫₀¹ cos 7x dF(x),
0 = ∫₀¹ sin 7x dF(x)

are satisfied by choosing the continuous distribution

F*(x) = 7x/(2π),  0 ≤ x ≤ 2π/7,
F*(x) = 1,        2π/7 ≤ x ≤ 1.

In fact,

∫₀¹ cos 7x dF*(x) = ∫₀^{2π/7} cos 7x d(7x/2π) + ∫_{2π/7}¹ cos 7x d(1)
                 = (7/2π) [sin 7x / 7]₀^{2π/7} = 0,

and

∫₀¹ sin 7x dF*(x) = ∫₀^{2π/7} sin 7x d(7x/2π) + ∫_{2π/7}¹ sin 7x d(1)
                 = (7/2π) [−cos 7x / 7]₀^{2π/7} = (1/2π)(1 − 1) = 0.

Similarly, it can easily be verified that

G*(y) = 8y/(2π),  0 ≤ y ≤ 2π/8,
G*(y) = 1,        2π/8 ≤ y ≤ 1

is an optimal strategy for player 2.

Note This is an example of a game with an infinite number of optimal strategies.
13.12 Exercises
1. Solve the game whose payoff matrix is given below:

   2  0  0  5  3
   3  2  1  2  2
   4  3  0  2  6
   5  3  4  2  6
2. Solve the following 4 × 2 game graphically:

             Player B
             2  0
   Player A  3  1
             3  2
             2  5
3. Determine the optimum minimax strategies for each player in the following games:

   (i)       B1  B2  B3  B4
        A1    5   2   0   7
        A2    5   6   4   8
        A3    4   0   2   3

   (ii)      B1  B2  B3
        A1    2  15   2
        A2    5   6   4
        A3    5  20   8
4. Use dominance to solve the following game:

   1  2  3  1
   2  2  1  5
   3  1  0  2
   4  3  2  6
5. Solve the following two-person zero-sum game to find the value of the game:

              Company B
              2   4   6   1
   Company A  1  12   3   3
              2   0   2   6
              3   7   2   7
6. Use the notion of dominance to simplify the rectangular game with the following payoff, and then solve it graphically:

                Player K
               I   II  III  IV
           1   18   4    6   4
   Player  2    6   2   13   7
     L     3   11   5   17   3
           4    7   6   12   2
7. Convert the following game problem into a linear programming problem and solve:

              Player B
               5   7   2
   Player A   10   4   9
               6   2   0
8. For the following two-person zero-sum game, find the strategy of each player and the value of the game:

        B
   A  3  7  5  5
      3  3  4  2
      8  4  6  2
9. Solve the following 3 × 3 game:

   7  1  7
   9  1  1
   5  7  6
10. Consider the following 2 × 2 game:

    4  6
    7  5

(i) Does it have a saddle point?
(ii) Is it correct to state that the value of the game G will satisfy 5 < G < 6?
(iii) Determine the frequencies of the optimum strategies by the arithmetic method and find the value of the game.
11. Solve the following game:

          I  II  III  IV
     I    3   2   4    0
    II    3   4   2    4
   III    4   2   4    0
    IV    0   4   0    8
12. Solve the following game, whose payoff matrix is given below, by the graphical method:

         B1  B2  B3  B4
    A1    2   1   2   1
    A2    3   2   1   1
    A3    4   0   2   0
13. Obtain the optimal strategies for both players and the value of the game for a two-person zero-sum game whose payoff matrix is as follows:

              Player B
               1  −3
               3   5
   Player A   −1   6
               4   1
               2   2
              −5   0
14. Use the graphical method to solve the following game:

              Player B
   Player A   2  2  3  −2
              4  3  2   6
Chapter 14
Project Management
14.1 Objectives
The objectives of this chapter are to: • Discuss the importance of PERT and CPM techniques for project management. • Distinguish between PERT and CPM network techniques. • Discuss the different phases of any project and various activities to be done during these phases. • Draw the network diagrams with single and three-time estimates of activities involved in a project. • Determine the critical path and floats associated with non-critical activities and events along with the total project completion time. • Determine the probability of completing a project within the scheduled time. • Establish a time-cost trade-off for completion of a project.
14.2 Introduction
A project is a well-defined set of activities, jobs or tasks, all of which must be completed to finish the project. Construction of a highway or power plant, production and marketing of a new product, and research and development work are examples of projects. Such projects involve a large number of inter-related activities (or tasks) which must be completed in a specified time and in a specified sequence (or order) and require resources such as personnel, money, materials, facilities and/or space. The main objective before starting any project is to schedule the required activities in an efficient manner so as to:

(i) Complete it on or before a specified time limit
(ii) Minimize the total time
(iii) Minimize the time for a prescribed cost
(iv) Minimize the cost for a specified time
(v) Minimize the total cost
(vi) Minimize the idle resources.

© Springer Nature Singapore Pte Ltd. 2019
A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_14
Therefore, before starting any project, it is essential to prepare a plan for scheduling and controlling the various activities involved in it. The techniques of operations research (OR) used for planning, scheduling and controlling large and complex projects are often referred to as network analysis, network planning or network scheduling techniques. In all these techniques, a project is broken down into various activities which are arranged in a logical sequence in the form of a network. This approach helps managers to visualize a project as a number of tasks which can easily be defined in terms of duration, cost and starting time; the sequence of activities is also defined. There are two basic planning and control techniques that utilize a network to complete a predetermined project or schedule: PERT (program evaluation and review technique) and CPM (critical path method). PERT was developed during 1956–1958 by a research team working on the US Navy's Polaris nuclear submarine missile development project. Since 1958, this technique has been used to plan almost all types of projects. At about the same time, but independently, CPM was developed jointly by two companies: E. I. DuPont de Nemours and Company and Remington Rand Corporation. Other network techniques are PEP (performance evaluation programme), LCES (least cost estimating and scheduling), SCANS (scheduling and control by automated network system), etc.
14.3 Phases of Project Management
The work involved in a project can be divided into three phases corresponding to the management functions of planning, scheduling and control.

Planning
This phase involves setting the objectives of the project and the assumptions to be made. It also involves the listing of tasks or jobs that must be performed to complete the project under consideration. In this phase, in addition to estimates of the costs and durations of the various activities, the men, machines and materials required for the project are also determined.

Scheduling
This consists of organizing the activities according to the precedence order and determining
(i) The starting and finishing times for each activity
(ii) The critical path, on which the activities require special attention
(iii) The slack and float for the non-critical paths.

Control
This phase is exercised after the planning and scheduling and involves the following:
(i) Making periodical progress reports
(ii) Reviewing the progress
(iii) Analysing the status of the project
(iv) Management decisions regarding updating, crashing and resource allocation, etc.

14.4 Advantages of Network Analysis
The network analysis:
(i) Shows the inter-relationships of all jobs in the project
(ii) Gives a clearer picture than a typical bar chart of the relationships controlling the order of performance of the various activities
(iii) Helps in the communication of ideas; the pictorial approach helps to clarify verbal instructions
(iv) Provides time schedules containing much more information than other methods such as bar charts
(v) Identifies jobs which are critical for the project completion date
(vi) Permits an accurate forecast of resource requirements
(vii) Provides a method of resource allocation to meet the limiting conditions and to maintain or minimize the overall costs
(viii) Integrates all elements of a programme to whatever detail is desired by management
(ix) Relates time to costs, which allows a money value to be placed on proposed changes.
14.5 Basic Components
There are two basic components in a network. These are: (i) Event/node (ii) Activity.
Event/node
An event/node is a particular instant in time marking the end or beginning of one or more activities. It is a point of accomplishment or decision. The starting and end points of an activity are thus described by two events, usually called the tail event and the head event. An event is generally represented by a circle, rectangle, hexagon or some other geometric shape. These shapes are numbered to distinguish one event from another. The occurrence of an event indicates that the work has been accomplished up to that point (see Fig. 14.1).

Merge and burst event
It is necessary for an event to be the ending event of at least one activity, but it can also be the ending event of two or more activities; it is then defined as a merge event. If an event is the beginning event of two or more activities, it is defined as a burst event. See Fig. 14.2.

Activity
An activity is a task or item of work to be done that consumes time, effort, money or other resources. Activities are represented by arrows and are identified by the numbers of their starting (tail) and ending (head) events. Generally, an ordered pair (i, j) represents an activity, where events i and j represent its starting and ending respectively. It can also be denoted by i − j. Sometimes activities are denoted by capital letters (Fig. 14.3).

The activities can further be classified into different categories:
(i) Predecessor activity: An activity which must be completed before one or more other activities start is known as a predecessor activity.
(ii) Successor activity: An activity which starts immediately after the completion of one or more other activities is known as a successor activity.
(iii) Dummy activity: In connecting events by activities showing their interdependencies, a situation often arises where a certain event j cannot occur until another event i has taken place, but the activity connecting i and j does not involve any time or expenditure of other resources. In such a case, the activity is called a dummy activity. It is depicted by a dotted line in the network diagram.
Fig. 14.1 Tail and head events
Fig. 14.2 a Merge event. b Burst event
Fig. 14.3 Activity (an arrow from the starting/tail event to the ending/head event)
Let us consider an example of a car taken to a garage for cleaning. Both the inside and the outside of the car are to be cleaned before it is taken away from the garage. The events can be stated as follows:

Event 1: Starting of car from house
Event 2: Parking of car in garage
Event 3: Completion of outside cleaning
Event 4: Completion of inside cleaning
Event 5: Taking of car from garage
Event 6: Parking of car in house.
The network diagram for the problem is given in Fig. 14.4. It is assumed that the inside cleaning and outside cleaning can be done concurrently by two garage assistants. Activities B and C represent these cleaning operations. What do activities D and E stand for? Their time consumptions are zero, but they express the condition that events 3 and 4 must occur before event 5 can take place. Activities D and E are called dummy activities.
Fig. 14.4 Project network
Network A network is the graphical representation of logically and sequentially connected arrows and nodes representing the activities and events of a project. Networks are also called arrow diagrams. Path An unbroken chain of activity arrows connecting the initial event to some other event is called a path.
14.6 Common Errors
There are three common errors in a network construction: looping, dangling and redundancy. Looping (cycling) In a network diagram, a looping error is also known as a cycling error. The drawing of an endless loop in a network is known as an error of looping. A looping network is shown in Fig. 14.5. Dangling To disconnect an activity before the completion of all the activities in a network diagram is known as dangling. It should be avoided. In that case, a dummy activity is introduced in order to maintain the continuity of the system (see Fig. 14.6). Redundancy If a dummy activity is the only activity emanating from an event and it can be eliminated, this is known as redundancy (see Fig. 14.7).
Fig. 14.5 Looping in project network
Fig. 14.6 Dangling in project network
Fig. 14.7 Redundancy in project network
14.7 Rules for Network Construction
For the construction of a network, certain rules are generally followed:
(i) Each activity is represented by one and only one arrow.
(ii) Crossing of arrows and curved arrows should be avoided; only straight arrows are to be used.
(iii) Each activity must be identified by its starting and ending nodes.
(iv) No event can occur until every activity preceding it has been completed.
(v) An event cannot occur twice; i.e. there must be no loops.
(vi) An activity succeeding an event cannot be started until that event has occurred.
(vii) Events are numbered to identify an activity uniquely. The number of the tail event (starting event) should be lower than that of the head (ending) event of an activity.
(viii) Between any pair of nodes (events), there should be one and only one activity. However, more than one activity may emanate from a node or terminate at a node.
(ix) Dummy activities should be introduced only if extremely necessary.
(x) The network has only one entry point, called the starting event, and one terminal point, called the end or terminal event.
14.8 Numbering of Events
After the network is drawn in a logical sequence, every event is assigned a number. The number sequence must reflect the flow of the network. In numbering the events, Fulkerson's rules (due to D. R. Fulkerson) are used. These rules are as follows:
(i) The event number should be unique.
(ii) Event numbering should be carried out on a sequential basis from left to right.
(iii) An initial event is one which has only outgoing arrows and no incoming arrow. In any network, there will be one such event. Number it 1 (one).
(iv) Delete all arrows emerging from event 1 (one). This will create at least one more initial event.
(v) Number these initial events 2, 3, …, etc.
(vi) Delete all arrows emerging from these numbered events, which will create new initial events.
(vii) Repeat steps (v) and (vi) until the last event, which has no arrows emerging from it, is obtained. Number this last event.

Example 1 Construct a network of a project whose activities and their precedence relationships are given below. Then number the events (Fig. 14.8).

Activity               A  B  C  D  E  F        G  H  I
Immediate predecessor  –  A  A  –  D  B, C, E  F  D  G, H
Fig. 14.8 Construction of project network
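Fulkerson's peeling procedure is, in modern terms, a topological ordering. A small sketch (the precedence data is taken from Example 1; the function name is ours) repeatedly removes "initial" activities and raises an error if the network contains a loop:

```python
from collections import deque

# Immediate predecessors from Example 1
pred = {'A': [], 'B': ['A'], 'C': ['A'], 'D': [], 'E': ['D'],
        'F': ['B', 'C', 'E'], 'G': ['F'], 'H': ['D'], 'I': ['G', 'H']}

def fulkerson_order(pred):
    """Repeatedly number and delete 'initial' activities, i.e. those
    with no remaining predecessors; raises ValueError on a loop."""
    remaining = {a: set(ps) for a, ps in pred.items()}
    succ = {a: [] for a in pred}
    for a, ps in pred.items():
        for p in ps:
            succ[p].append(a)
    ready = deque(sorted(a for a, ps in remaining.items() if not ps))
    order = []
    while ready:
        a = ready.popleft()
        order.append(a)
        for s in succ[a]:          # delete arrows emerging from a
            remaining[s].discard(a)
            if not remaining[s]:   # s has become an initial activity
                ready.append(s)
    if len(order) != len(pred):
        raise ValueError("looping detected: network is not acyclic")
    return order

order = fulkerson_order(pred)
# every activity appears after all of its predecessors
assert all(order.index(p) < order.index(a) for a, ps in pred.items() for p in ps)
```

The same peeling idea numbers events rather than activities in the text; the bookkeeping is identical.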
14.9 Critical Path Analysis
Once the network of a project is constructed, the time analysis of the network becomes essential for planning various activities of the project. The main objective of the time analysis is to prepare a planning schedule of the project. The planning schedule should include the following factors: (i) Total completion time for the project (ii) Earliest time when each activity can start (iii) Latest time when each activity can be started without delaying the total project (iv) Float for each activity, i.e. the duration of time by which the completion of an activity can be delayed without delaying the total project completion (v) Identification of critical activities and critical path. Notation The following notation is used in this analysis: Ei = Earliest occurrence time of event i, i.e. the earliest time at which the event i can occur without affecting the total project duration Li = Latest allowable occurrence time of event i. It is the latest allowable time at which an event can occur without affecting the total project duration tij = Duration of activity (i, j) ESij = Earliest starting time of activity (i, j) LSij = Latest starting time of activity (i, j) EFij = Earliest finishing time of activity (i, j) LFij = Latest finishing time of activity (i, j). The critical path calculations are done in the following two ways: (a) Forward pass calculations method (b) Backward pass calculations method. Forward pass calculations method In this method, calculation begins from the initial event, proceeds through the events in an increasing order of event numbers and ends at the final event of the network. At each node (event), the earliest starting and finishing times are calculated for each activity. The method may be summarized as follows: Step 1 Set E1 = 0, i = 1. Step 2 Calculate the earliest starting time ESij for each activity that begins at event i, i.e. ESij = Ei for all activities (i, j) that start at node i. 
Step 3 Calculate the earliest finishing time EFij of each activity that begins at event i by adding the earliest starting time of the activity with the duration of the activity. Thus, EFij = ESij + tij = Ei + tij.
Step 4 Go to the next event (node), say event j (j > i), and compute the earliest occurrence time for event j. This is the maximum of the earliest finishing times of all activities ending at that event, i.e.

Ej = max_i {EFij} = max_i {Ei + tij}, taken over all immediate predecessor activities.

Step 5 If j = n (the final event number), then the earliest finishing time for the project is given by

En = max {EFij} = max {Ei + tij}, taken over all terminal activities.

Backward pass calculations method
In this method, calculations begin from the terminal event, proceed through the events in a decreasing order of event numbers and end at the initial event of the network. At each node (event), the latest starting and finishing times are calculated for each activity. The method may be summarized as follows:

Step 1 Set Ln = En, j = n.
Step 2 Calculate the latest finishing time LFij for each activity that ends at event j, i.e. LFij = Lj for all activities (i, j) that end at node j.
Step 3 Calculate the latest starting time LSij of each activity that ends at event j by subtracting the duration of the activity from its latest finishing time. Thus, LSij = LFij − tij = Lj − tij.
Step 4 Proceed backward to the node in the sequence that decreases j by 1. Also compute the latest occurrence time of node i (i < j). This is the minimum of the latest starting times of all activities starting from that event, i.e.

Li = min_j {LSij} = min_j {Lj − tij}, taken over all immediate successor activities.

Step 5 If i = 1 (the initial node), then L1 = min_j {LSij} = min_j {Lj − tij}, taken over all initial activities.
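The two passes can be written compactly. The sketch below (our own helper; it assumes events are numbered 1..n by Fulkerson's rules, so that i < j for every activity (i, j), every event after the first has a predecessor and every event before the last has a successor):

```python
def event_times(n, t):
    """Forward and backward passes over events 1..n.
    t maps each activity (i, j), with i < j, to its duration t_ij."""
    E = {1: 0}
    for j in range(2, n + 1):                    # forward pass
        E[j] = max(E[i] + d for (i, k), d in t.items() if k == j)
    L = {n: E[n]}
    for i in range(n - 1, 0, -1):                # backward pass
        L[i] = min(L[j] - d for (k, j), d in t.items() if k == i)
    return E, L

# tiny illustration on a hypothetical 4-event network
E, L = event_times(4, {(1, 2): 3, (1, 3): 5, (2, 4): 4, (3, 4): 2})
# E == {1: 0, 2: 3, 3: 5, 4: 7} and L == {4: 7, 3: 5, 2: 3, 1: 0}
```

Iterating events in increasing (then decreasing) numerical order is exactly why Fulkerson numbering matters: it guarantees that every E value (respectively L value) needed at a node has already been computed.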
Determination of floats and slack times When the network is completely drawn and properly labelled, and the earliest and latest event times are computed, the next objective is to determine the floats of each activity and the slack time of each event. The float of an activity is the amount of time by which it is possible to delay its completion time without affecting the total project completion time. There are three types of activity floats: (i) Total float (ii) Free float (iii) Independent float.
Total float
The total float of an activity represents the amount of time by which the activity can be delayed without delaying the project completion time. Mathematically, the total float of an activity (i, j) is the difference between its latest and earliest start times (or between its latest and earliest finish times). Hence, the total float of an activity (i, j) is denoted by TFij and is computed by the following formula:

TFij = LSij − ESij = LFij − EFij = Lj − (Ei + tij),

since Lj = LFij and EFij = Ei + tij.

Free float
Sometimes we may need to know by how much an activity's completion time may be delayed without causing any delay in its immediate successor activities. This amount of float is called the free float. Mathematically, the free float of an activity (i, j) is denoted by FFij and is computed by

FFij = Ej − Ei − tij.

Since TFij = Lj − Ei − tij and Lj ≥ Ej, we have TFij ≥ Ej − Ei − tij, i.e. TFij ≥ FFij. Hence, for all activities, the free float can take values from zero up to the total float, but it cannot exceed the total float. Free float is very useful for rescheduling activities with a minimum disruption of earlier plans.

Independent float
In some cases, a delay in the completion of an activity affects neither its predecessor nor its successor activities. This amount of delay is called the independent float. Mathematically, the independent float of an activity (i, j), denoted by IFij, is computed by the following formula:

IFij = Ej − Li − tij.

A negative independent float is always taken as zero.

Event slack or event float
The slack of an event is the difference between its latest time and its earliest time. Hence, for an event i,

slack = Li − Ei.
Critical event: An event is said to be critical if its slack is zero, i.e. Li= Ei for the ith event. Critical activity: An activity is critical if its total float is zero, i.e. LSij = ESij or LFij = EFij for an activity (i, j). Otherwise, an activity is called non-critical. Critical path The critical path is the continuous chain or sequence of critical activities in a network diagram path from starting event to ending event. It is shown by a dark line or double lines to make a distinction from other, non-critical paths. The length of the critical path is the sum of the individual times of all critical activities lying on it and defines the minimum time required to complete the project. The critical path on a network diagram can be identified as follows: (i) For all activities (i, j) lying on the critical path, the E values and L values for tail and head events are equal, i.e. Ej = Lj and Ei = Li. (ii) On the critical path, Ej − Ei = Lj − Li = tij. Main features of the critical path The critical path has two main features: (i) If the project has to be shortened, then some of the activities on that path must be shortened. The application of additional resources on other activities will not give the desired results unless that critical path is shortened first. (ii) The variation in actual performance from the expected activity duration time will be completely reflected in one-to-one fashion in the anticipated completion of the whole project. Example 2 Determine the critical path and minimum time of completion of the project whose network diagram is shown in Fig. 14.9. Also find the different floats.
Fig. 14.9 Project network (activities and durations: C(20): 1→2, A(23): 1→3, B(8): 1→4, G(19): 2→5, D(16): 3→4, E(24): 3→7, F(18): 4→6, H(4): 5→7, I(10): 6→7, with dummy activities 4→5 and 5→6)
Solution
Forward pass calculations
At node 1: set E1 = 0
At node 2: E2 = E1 + t12 = 0 + 20 = 20
At node 3: E3 = E1 + t13 = 0 + 23 = 23
At node 4: E4 = max{E1 + t14, E3 + t34} = max{0 + 8, 23 + 16} = 39
At node 5: E5 = max{E2 + t25, E4 + t45} = max{20 + 19, 39 + 0} = 39
At node 6: E6 = max{E4 + t46, E5 + t56} = max{39 + 18, 39 + 0} = 57
At node 7: E7 = max{E3 + t37, E5 + t57, E6 + t67} = max{23 + 24, 39 + 4, 57 + 10} = 67

Backward pass calculations
At node 7: set L7 = E7 = 67
At node 6: L6 = L7 − t67 = 67 − 10 = 57
At node 5: L5 = min{L6 − t56, L7 − t57} = min{57 − 0, 67 − 4} = 57
At node 4: L4 = min{L5 − t45, L6 − t46} = min{57 − 0, 57 − 18} = 39
At node 3: L3 = min{L4 − t34, L7 − t37} = min{39 − 16, 67 − 24} = 23
At node 2: L2 = L5 − t25 = 57 − 19 = 38
At node 1: L1 = min{L2 − t12, L3 − t13, L4 − t14}
¼ Minf38 20; 23 23; 39 8g ¼ 0 To find the critical activities and different floats, we construct the table (cf. Table 14.1). Table 14.1 Computation of critical activity and different floats Activity
Table 14.1 Computation of critical activities and different floats

| Activity | Duration tij | Earliest start (Ei) | Earliest finish (Ei + tij) | Latest start (Lj − tij) | Latest finish (Lj) | Total float (Lj − Ei − tij) | Free float (Ej − Ei − tij) | Independent float (Ej − Li − tij) | Critical? |
|---|---|---|---|---|---|---|---|---|---|
| (1, 2) | 20 | 0 | 20 | 18 | 38 | 18 | 0 | 0 | |
| (1, 3) | 23 | 0 | 23 | 0 | 23 | 0 | 0 | 0 | Yes |
| (1, 4) | 8 | 0 | 8 | 31 | 39 | 31 | 31 | 31 | |
| (2, 5) | 19 | 20 | 39 | 38 | 57 | 18 | 0 | 0 | |
| (3, 4) | 16 | 23 | 39 | 23 | 39 | 0 | 0 | 0 | Yes |
| (3, 7) | 24 | 23 | 47 | 43 | 67 | 20 | 20 | 20 | |
| (4, 5) | 0 | 39 | 39 | 57 | 57 | 18 | 0 | 0 | |
| (4, 6) | 18 | 39 | 57 | 39 | 57 | 0 | 0 | 0 | Yes |
| (5, 6) | 0 | 39 | 39 | 57 | 57 | 18 | 18 | 0 | |
| (5, 7) | 4 | 39 | 43 | 63 | 67 | 24 | 24 | 6 | |
| (6, 7) | 10 | 57 | 67 | 57 | 67 | 0 | 0 | 0 | Yes |

From the preceding table, it is clear that the critical activities (zero total float) are (1, 3), (3, 4), (4, 6) and (6, 7). Hence, the critical path is 1–3–4–6–7, and the duration of the project is 67 time units (as E7 = L7 = 67).
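The forward pass, backward pass and float computation above can be sketched in a few lines of Python (an illustrative sketch, not the book's code; it assumes the events are numbered topologically, i.e. i < j for every activity (i, j), as in Fig. 14.9):

```python
# CPM forward/backward pass for the Example 2 network.
def cpm(activities, n_events):
    """activities: {(i, j): duration t_ij}; events are numbered 1..n_events."""
    E = {1: 0}  # earliest occurrence times (forward pass)
    for j in range(2, n_events + 1):
        E[j] = max(E[i] + t for (i, k), t in activities.items() if k == j)
    L = {n_events: E[n_events]}  # latest occurrence times (backward pass)
    for i in range(n_events - 1, 0, -1):
        L[i] = min(L[j] - t for (k, j), t in activities.items() if k == i)
    floats = {(i, j): L[j] - E[i] - t for (i, j), t in activities.items()}
    critical = [a for a, f in floats.items() if f == 0]  # zero total float
    return E, L, floats, critical

acts = {(1, 2): 20, (1, 3): 23, (1, 4): 8, (2, 5): 19, (3, 4): 16,
        (3, 7): 24, (4, 5): 0, (4, 6): 18, (5, 6): 0, (5, 7): 4, (6, 7): 10}
E, L, total_float, critical = cpm(acts, 7)
print(E[7])              # project duration: 67
print(sorted(critical))  # [(1, 3), (3, 4), (4, 6), (6, 7)]
```

The zero-total-float activities recovered here are exactly those marked critical in Table 14.1.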
14.10 PERT Analysis
Time estimates: It is very difficult to estimate the time required for the execution of each activity because of various uncertainties. Taking the uncertainties into account, three types of time estimates are generally obtained. The PERT system is based on these three estimates of the performance time of an activity:

(i) Optimistic time (to): The estimate of the shortest possible time in which an activity can be completed under ideal conditions.
(ii) Pessimistic time (tp): The maximum time required to perform the activity under extremely bad conditions. However, such conditions do not include labour strikes or acts of nature (floods, earthquakes, tornados, etc.).
(iii) Most likely time (tm): The estimate of the normal time which an activity would require. This estimate lies between the optimistic and pessimistic time estimates. Statistically, it is the modal value of the duration of the activity.

The range specified by the optimistic (to) and pessimistic (tp) estimates should enclose every possible duration of the activity. The most likely time (tm) need not coincide with the midpoint tmid = (to + tp)/2 and may occur to its left or right, as shown in Fig. 14.10. Keeping the above-described properties in mind, it is justified to assume that the duration of each activity follows a beta (β) distribution with its unimodal point at tm and its end points at to and tp.
Fig. 14.10 Time estimates of an activity (three possible β-distribution shapes, with to, tm, te, tmid and tp marked)
The expected or mean value of an activity duration can be approximated by the weighted average of the three time estimates to, tm and tp, i.e. te = (to + 4tm + tp)/6. Again, to determine the activity duration variance in PERT, the unimodal property of the β-distribution is used, and the standard deviation is expressed as σ = (tp − to)/6, or the variance as σ² = ((tp − to)/6)².
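As a quick illustration (the helper name is ours, not the book's), the two PERT estimates can be computed directly from the three time values:

```python
# PERT expected time and variance from (to, tm, tp).
def pert_estimates(t_o, t_m, t_p):
    t_e = (t_o + 4 * t_m + t_p) / 6   # weighted-average expected time
    var = ((t_p - t_o) / 6) ** 2      # variance under the beta assumption
    return t_e, var

print(pert_estimates(2, 5, 14))  # (6.0, 4.0)
```

For to = 2, tm = 5, tp = 14 this gives te = 6 and σ² = 4, the values used for activity (3, 5) in Example 3 below.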
Note that in PERT analysis the β-distribution is assumed because it is unimodal, has non-negative end points and is approximately symmetric.

Probability of meeting the scheduled time: After identifying the critical path and the occurrence times of all events, a question arises: What is the probability that a particular event will occur on or before the scheduled date? This may be any event in the network. Recall that the expected time of an activity is the weighted average of the three time estimates to, tm and tp, i.e. te = (to + 4tm + tp)/6. The probability that an activity (i, j) will be completed within time te is 0.5; i.e. the chance of completion of that activity is 50%. In the frequency distribution curve for the activity (i, j), the vertical line through te divides the area under the curve into two equal parts, as shown in Fig. 14.11. For completing the activity within any other time tk, the probability is

p = (Area under AEK)/(Area under AEB).
A project consists of a number of activities. The activity durations, as we know, are independent random variables, and hence the length of the project up to a certain event through a certain path is also a random variable. But the point of difference is that
Fig. 14.11 Time distribution curve of an activity (probability density over activity duration; curve AEB, with ordinates at te and tk)
the expected project length Te does not have the same probability distribution as the expected activity time te. While a β-distribution curve approximately represents the activity time distribution, the project completion time approximately follows a normal distribution; its standardized form is the standard normal distribution, whose curve has unit area, unit standard deviation and is symmetrical about the mean, as shown in Fig. 14.12. The probability of completing the project within the scheduled time Ts is given by

prob(Ts) = (Area under ACS)/(Area under ACB).
Probability prob(Ts) depends upon the location of Ts. Taking Te as a reference point, the distance Ts − Te can be expressed in terms of the standard deviation for a network, calculated as
Fig. 14.12 Time distribution of a project network (probability density over project duration; curve ACB, with ordinate at Ts and mean at Te)
Standard deviation for a network: σe = √(sum of the variances along the critical path), i.e. σe = √(Σ σij²), where σij² for an activity (i, j) = ((tp − to)/6)².

Since the standard deviation of the standard normal curve is unity, the σe calculated above is used as a scale factor for calculating the normal deviate:

D = (Ts − Te)/σe.
Hence, the probability of completing the project by the scheduled time Ts is given by prob(Z ≤ D), where D = (Ts − Te)/σe and Z is the standard normal variate. The probabilities corresponding to the different values of the normal deviate are available from the table of the standard normal curve.

Example 3 A small project is composed of seven activities whose time estimates (in weeks) are listed in Table 14.2.

(a) Draw the project network.
(b) Find the expected duration and variance of each activity.
(c) Calculate the earliest and latest occurrence time for each event and the expected project length.
(d) Calculate the variance and standard deviation of the project length.
(e) What is the probability that the project will be completed: (i) at least 4 weeks earlier than expected? (ii) not more than 4 weeks later than expected?
(f) If the project due date is 19 weeks, what is the probability of meeting the due date?
(g) Find also the schedule time at which the project will be completed with a probability of 0.90.

Table 14.2 Activities and their different time estimates

| Activity | to | tm | tp |
|---|---|---|---|
| (1, 2) | 1 | 1 | 7 |
| (1, 3) | 1 | 4 | 7 |
| (1, 4) | 2 | 2 | 8 |
| (2, 5) | 1 | 1 | 1 |
| (3, 5) | 2 | 5 | 14 |
| (4, 6) | 2 | 5 | 8 |
| (5, 6) | 3 | 6 | 15 |
Solution
(a) The project network diagram is shown in Fig. 14.13.
(b) The expected time and variance of each activity are computed and displayed in Table 14.3.

Table 14.3 Computation of expected time and variance of each activity

| Activity | to | tm | tp | te = (to + 4tm + tp)/6 | σ² = ((tp − to)/6)² |
|---|---|---|---|---|---|
| (1, 2) | 1 | 1 | 7 | 2 | 1 |
| (1, 3) | 1 | 4 | 7 | 4 | 1 |
| (1, 4) | 2 | 2 | 8 | 3 | 1 |
| (2, 5) | 1 | 1 | 1 | 1 | 0 |
| (3, 5) | 2 | 5 | 14 | 6 | 4 |
| (4, 6) | 2 | 5 | 8 | 5 | 1 |
| (5, 6) | 3 | 6 | 15 | 7 | 4 |

Let Ei be the earliest occurrence time of event i.
Set E1 = 0
E2 = E1 + t12 = 0 + 2 = 2
E3 = E1 + t13 = 0 + 4 = 4
E4 = E1 + t14 = 0 + 3 = 3
E5 = Max{Ei + ti5 : i = 2, 3} = Max{E2 + t25, E3 + t35} = Max{2 + 1, 4 + 6} = 10
E6 = Max{Ei + ti6 : i = 4, 5} = Max{E4 + t46, E5 + t56} = Max{3 + 5, 10 + 7} = 17

Fig. 14.13 Project network diagram (events 1–6)
(c) Backward pass calculations
Let Lj be the latest occurrence time of event j.
Set L6 = E6 = 17
L5 = L6 − t56 = 17 − 7 = 10
L4 = L6 − t46 = 17 − 5 = 12
L3 = L5 − t35 = 10 − 6 = 4
L2 = L5 − t25 = 10 − 1 = 9
L1 = Min{Lj − t1j : j = 2, 3, 4} = Min{L2 − t12, L3 − t13, L4 − t14} = Min{9 − 2, 4 − 4, 12 − 3} = 0.

From these calculations, it is seen that E1 = L1, E3 = L3, E5 = L5, E6 = L6. Hence, the critical events are 1, 3, 5, 6 and the critical path is 1–3–5–6. Also, the expected project length = E6 = L6 = 17 weeks.

(d) The variance of the project length is given by σe² = 1 + 4 + 4 = 9, or σe = 3.
(e) The standard normal deviate is given by

D = (schedule time − expected time of completion)/(standard deviation).

(i) The probability that the project will be completed at least 4 weeks earlier than expected is given by prob(Z ≤ D), where D = ((17 − 4) − 17)/3 = −4/3 = −1.33 (approx.):

prob(Z ≤ −1.33) = 0.5 − prob(0 < Z ≤ 1.33) = 0.5 − φ(1.33) = 0.5 − 0.4082 = 0.0918
[From the table of the area under the standard normal curve.]
(ii) Again, the probability that the project will be completed not more than 4 weeks later than expected = the probability that the project will be completed within 17 + 4 weeks, i.e. within 21 weeks:

prob(Z ≤ D), where D = (21 − 17)/3 = 4/3 = 1.33 (approx.),
= prob(Z ≤ 1.33) = 0.5 + φ(1.33) = 0.5 + 0.4082 = 0.9082.

(f) When the due date is 19 weeks, D = (19 − 17)/3 = 2/3 = 0.67. Then the probability of meeting the due date is given by

prob(Z ≤ 0.67) = 0.5 + φ(0.67) = 0.5 + 0.2514 = 0.7514.

(g) Since the probability for the completion of the project is 0.90, prob(Z ≤ D) = 0.90, where D = (Ts − Te)/σe and Ts is the schedule time. As prob(Z ≤ 1.29) ≈ 0.90, D = 1.29, which implies (Ts − 17)/3 = 1.29, i.e. Ts = 17 + 3 × 1.29 = 20.87 weeks.
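Parts (e)-(g) can be cross-checked with Python's standard library (a sketch; `statistics.NormalDist` evaluates the normal CDF and its inverse exactly, so the results differ slightly in the third decimal place from the book's four-figure table lookups):

```python
from statistics import NormalDist

# Project length treated as normal with Te = 17 weeks, sigma_e = 3 weeks.
proj = NormalDist(mu=17, sigma=3)

p_early = proj.cdf(17 - 4)    # (e)(i) at least 4 weeks early,    ~0.0912
p_late  = proj.cdf(17 + 4)    # (e)(ii) within 21 weeks,          ~0.9088
p_due   = proj.cdf(19)        # (f) meeting the 19-week due date, ~0.7475
Ts      = proj.inv_cdf(0.90)  # (g) 90%-confidence schedule time, ~20.84
```

The small gap between 20.84 here and 20.87 in the text comes from using the exact deviate 1.2816 rather than the rounded table value 1.29.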
14.11 Difference Between PERT and CPM
The differences between PERT and CPM are listed as follows:

| PERT | CPM |
|---|---|
| 1. This technique was developed in connection with R&D (research and development) work and therefore had to cope with the uncertainty associated with R&D activities. The total project duration is regarded as a random variable, and multiple time estimates are made to calculate the probability of completing the project within the schedule time. It is therefore a probabilistic model. | 1. This technique was developed in connection with construction and maintenance projects, which consist of routine tasks or jobs whose resource requirements and durations are known with certainty. It is therefore basically a deterministic model. |
| 2. It is used for projects involving activities of a non-repetitive nature. | 2. It is used for projects involving activities of a repetitive nature. |
| 3. It is an event-oriented technique: the results of analysis are expressed in terms of events or distinct points in time indicative of progress. | 3. It is an activity-oriented technique: the results of calculations are considered in terms of activities. |
| 4. It incorporates statistical analysis and thereby enables the determination of probabilities concerning the times by which each activity and the entire project would be completed. | 4. It does not incorporate statistical analysis in determining the time estimates, because time is precise and known. |
| 5. It serves as a useful control device, as it assists management in controlling a project by calling attention, through constant review, to delays in activities which might lead to a delay in the project completion date. | 5. It is difficult to use as a controlling device, because one must repeat the entire evaluation of the project each time changes are introduced into the network. |

14.12 Project Time-Cost Trade-Off
In this section, the costs of the resources consumed by activities are taken into consideration. The project completion time can be reduced (crashing) by reducing the normal completion time of critical activities. The reduction in normal completion time will increase the total budget of the project. The decision-maker therefore looks for a trade-off between the total cost of the project and the total time required to complete it.

Project cost: In order to include the cost aspect in project scheduling, we have to find the cost-duration relationships for the various activities in the project. The total cost of any project comprises direct and indirect costs.

Direct cost: This cost is directly dependent upon the amount of resources used in the execution of individual activities, such as manpower loading, material consumed, etc. The direct cost increases if the activity duration is to be reduced.

Indirect cost: This cost is associated with expenditures which cannot be allocated to individual activities of the project; it may include managerial services, loss of revenue, fixed overheads, etc. It is computed on a per-day, per-week or per-month basis. The indirect cost decreases as the project duration is reduced.

The network diagram can be used to identify the activities whose durations should be shortened so that the completion time of the project can be reduced in the most economical manner. The process of reducing an activity duration by putting in extra effort is called crashing the activity. The crash time (Tc) represents the minimum possible activity duration; any attempt to crash further would only raise the activity cost without reducing the time. The activity cost corresponding to the crash time is called the crash cost (Cc), which is the minimum direct cost required to achieve the crash performance time.
Fig. 14.14 Direct cost w.r.t. time (point B: normal time; point A: crash time)

The normal cost (Cn) is equal to the absolute minimum of the direct cost required to perform an activity. The corresponding duration taken by the activity is known as the normal time (Tn). The direct cost curve (the relationship between direct cost and time) is shown in Fig. 14.14. Point B denotes the normal time for completion of an activity, whereas point A denotes the crash time, the least duration in which the activity can be completed. The cost curve is non-linear and asymptotic in nature but, for simplicity, it can be approximated by a straight line whose slope (in magnitude) is given by

Cost slope = (Crash cost − Normal cost)/(Normal time − Crash time) = (Cc − Cn)/(Tn − Tc).
It is also called the crash cost slope or crash rate: the increase in the cost of performing the activity per unit reduction in time, also known as the time-cost trade-off. It varies from activity to activity. After assessing the direct and indirect project costs, the total project cost, which is the sum of the direct and indirect costs, can be determined.

Time-cost optimization algorithm / time-cost trade-off procedure
The following are the steps involved in project crashing.

Step 1 Considering the normal times of all activities, identify the critical activities and find the critical path.

Step 2 Calculate the cost slope for the different activities and rank the activities in ascending order of cost slope.
Step 3 Crash the activities on the critical path as per the ranking of cost slopes; i.e. the activity having the lowest cost slope is crashed first, to the maximum extent possible. (By crashing the lowest-cost-slope activity first, the direct cost of the project increases as slowly as possible per unit reduction in duration.)

Step 4 Owing to the reduction of the critical path duration by crashing in Step 3, other paths may also become critical; i.e. we get parallel critical paths. In such cases, the project duration can be reduced only by crashing activities in the parallel critical paths simultaneously.

Step 5 Repeat the process until all the critical activities are fully crashed or no further crashing is possible. When indirect cost is considered, the process of crashing is repeated until the total cost is at its minimum, beyond which it would increase. The minimum cost is called the optimum project cost and the corresponding time the optimum project time.

Example 4 The following table shows the activities of a project with their normal time and cost and crash time and cost:

| Activity | Normal time (days) | Normal cost (Rs.) | Crash time (days) | Crash cost (Rs.) |
|---|---|---|---|---|
| (1, 2) | 6 | 1400 | 4 | 1900 |
| (1, 3) | 8 | 2000 | 5 | 2800 |
| (2, 3) | 4 | 1100 | 2 | 1500 |
| (2, 4) | 3 | 800 | 2 | 1400 |
| (3, 4) | Dummy | – | – | – |
| (3, 5) | 6 | 900 | 3 | 1600 |
| (4, 6) | 10 | 2500 | 6 | 3500 |
| (5, 6) | 3 | 500 | 2 | 800 |

The indirect cost for the project is Rs. 300 per day.

(i) Draw the network of the project.
(ii) What are the normal duration and associated cost of the project?
(iii) What will be the least project duration and the corresponding cost?
(iv) Find the optimum duration and minimum project cost.
Solution
(i) The network of the project is shown in Fig. 14.15.
(ii) Using the normal time duration of each activity, the earliest and latest occurrence times at the various nodes are computed and displayed on the network in Fig. 14.15.
Fig. 14.15 Project network with time duration 20 days (crash times are displayed in brackets; double lines show the critical path)
Fig. 14.16 Project network with time duration 19 days
Fig. 14.17 Project network with time duration 18 days
Fig. 14.18 Project network with time duration 17 days
Fig. 14.19 Project network with time duration 16 days
Fig. 14.20 Project network with time duration 15 days
From the network, it is seen that the L values and E values at nodes 1, 2, 3, 4 and 6 are the same. This means that the critical path is 1–2–3–4–6 and the normal duration of the project is 20 days.
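The 20-day normal duration can be cross-checked by enumerating all paths from node 1 to node 6 and taking the longest (an illustrative sketch using the Example 4 data; the dummy activity (3, 4) is given duration 0):

```python
# Enumerate 1 -> 6 paths of the Example 4 network; the longest is critical.
normal = {(1, 2): 6, (1, 3): 8, (2, 3): 4, (2, 4): 3, (3, 4): 0,
          (3, 5): 6, (4, 6): 10, (5, 6): 3}

def paths(node, end, acts, prefix=(1,)):
    """Yield every path from `node` to `end` as a tuple of events (DAG assumed)."""
    if node == end:
        yield prefix
    for (i, j) in acts:
        if i == node:
            yield from paths(j, end, acts, prefix + (j,))

lengths = {p: sum(normal[(p[k], p[k + 1])] for k in range(len(p) - 1))
           for p in paths(1, 6, normal)}
best = max(lengths, key=lengths.get)
print(best, lengths[best])   # (1, 2, 3, 4, 6) 20
```

The five paths have lengths 20, 19, 19, 18 and 17, so 1–2–3–4–6 with length 20 is the critical path, matching the network calculation above.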
Fig. 14.21 Project network with time duration 14 days
L1 = 7 E1 = 7
3(2)
2
4 6(6)
4(4)
L1 = 0 E1 = 0
3(2)
1
6 7(5)
E6 = 13 L6 = 13
3(2) L1 = 7 E1 = 7
5
3
L1 = 10 E1 = 10
3(3)
Fig. 14.22 Project network with time duration 13 days
The associated cost of the project is as follows:

Total cost = Direct normal cost + indirect cost for 20 days
= Rs. [(1400 + 2000 + 1100 + 800 + 900 + 2500 + 500) + 20 × 300]
= Rs. [9200 + 6000] = Rs. 15,200.

(iii) The cost slopes of the different activities are computed using the formula Cost slope = (Crash cost − Normal cost)/(Normal time − Crash time), and these are shown in the following table:

| Activity | (1, 2) | (1, 3) | (2, 3) | (2, 4) | (3, 5) | (4, 6) | (5, 6) |
|---|---|---|---|---|---|---|---|
| Slope | 250 | 267 | 200 | 600 | 233 | 250 | 300 |
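The slope table can be reproduced directly from the normal/crash data (a sketch; the dummy activity (3, 4), which has no cost data, is omitted):

```python
# Cost slopes for Example 4: (crash cost - normal cost) / (normal - crash time).
data = {  # activity: (normal_time, normal_cost, crash_time, crash_cost)
    (1, 2): (6, 1400, 4, 1900), (1, 3): (8, 2000, 5, 2800),
    (2, 3): (4, 1100, 2, 1500), (2, 4): (3, 800, 2, 1400),
    (3, 5): (6, 900, 3, 1600), (4, 6): (10, 2500, 6, 3500),
    (5, 6): (3, 500, 2, 800),
}
slope = {a: (cc - nc) / (tn - tc) for a, (tn, nc, tc, cc) in data.items()}
print({a: round(s) for a, s in slope.items()})
```

Rounding to whole rupees gives exactly the tabulated slopes (e.g. 800/3 ≈ 267 for (1, 3) and 700/3 ≈ 233 for (3, 5)).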
Table 14.4 Computation of total cost with duration due to crashing for Example 4

| Project length after crashing (days) | See figure (before crashing) | Critical path(s) | Activities crashed (time) | Normal direct cost (Rs.) (A) | Crashing cost (Rs.) (B) | Indirect cost (Rs. 300/day) (C) | Total cost (Rs.) (A + B + C) |
|---|---|---|---|---|---|---|---|
| 20 | Fig. 14.15 | 1–2–3–4–6 | – | 9200 | – | 300 × 20 = 6000 | 15,200 |
| 19 | Fig. 14.15 | 1–2–3–4–6 | (2, 3)(1) | 9200 | 200 × 1 = 200 | 300 × 19 = 5700 | 15,100 |
| 18 | Fig. 14.16 | 1–2–3–4–6, 1–2–4–6 | (1, 2)(1) | 9200 | 200 + 250 × 1 = 450 | 300 × 18 = 5400 | 15,050 |
| 17 | Fig. 14.17 | 1–2–3–4–6, 1–2–4–6, 1–3–4–6 | (4, 6)(1) | 9200 | 450 + 250 × 1 = 700 | 300 × 17 = 5100 | 15,000 |
| 16 | Fig. 14.18 | 1–2–3–4–6, 1–2–4–6, 1–3–4–6, 1–3–5–6, 1–2–3–5–6 | (3, 5)(1), (4, 6)(1) | 9200 | 700 + 233 × 1 + 250 × 1 = 1183 | 300 × 16 = 4800 | 15,183 |
| 15 | Fig. 14.19 | (same five paths) | (3, 5)(1), (4, 6)(1) | 9200 | 1183 + 233 × 1 + 250 × 1 = 1666 | 300 × 15 = 4500 | 15,366 |
| 14 | Fig. 14.20 | (same five paths) | (3, 5)(1), (4, 6)(1) | 9200 | 1666 + 233 × 1 + 250 × 1 = 2149 | 300 × 14 = 4200 | 15,549 |
| 13 | Fig. 14.21 | (same five paths) | (1, 2)(1), (1, 3)(1) | 9200 | 2149 + 250 × 1 + 267 × 1 = 2666 | 300 × 13 = 3900 | 15,766 |
| 12 | Fig. 14.22 | (same five paths) | (2, 3)(1), (2, 4)(1), (1, 3)(1) | 9200 | 2666 + 200 × 1 + 600 × 1 + 267 × 1 = 3733 | 300 × 12 = 3600 | 16,533 |
Fig. 14.23 Project network with time duration 12 days
(iv) Now we construct the table for finding the optimal cost with duration and the minimum project duration with cost (cf. Table 14.4). From Fig. 14.23, it is seen that no further crashing is possible beyond 12 days. Hence, the least project duration is 12 days, and the corresponding cost of the project will be Rs. 16,533.00. As the minimum cost occurs for a 17-day schedule, the optimum duration of the project is 17 days, and the minimum project cost is Rs. 15,000.00.

Example 5 The following table gives data on normal time and cost and crash time and cost for a project:

| Activity | Normal time (days) | Crash time (days) | Normal cost (Rs.) | Crash cost (Rs.) |
|---|---|---|---|---|
| (1, 2) | 9 | 4 | 1300 | 2400 |
| (1, 3) | 15 | 13 | 1000 | 1380 |
| (2, 3) | 7 | 4 | 700 | 1540 |
| (2, 4) | 7 | 3 | 1200 | 1920 |
| (2, 5) | 12 | 6 | 1700 | 2240 |
| (3, 6) | 12 | 11 | 600 | 700 |
| (4, 5) | 6 | 2 | 1000 | 1600 |
| (5, 6) | 9 | 6 | 900 | 1200 |

The indirect cost for the project is Rs. 400 per day.

(i) Draw the network of the project.
(ii) What are the normal duration and associated cost of the project?
(iii) What are the optimum project duration and corresponding cost?
(iv) Find the optimum cost and corresponding duration.
Fig. 14.24 Project network with time duration 31 days
Fig. 14.24 Project network with time duration 31 days L2 = 9 E2 = 9
(a)
L4 = 16 E4 = 16
7(3)
2
4 12(6) 9(4) L1 = 0 E1 = 0
6(2) 8(6)
7(4)
1
L6 = 30 E6 = 30
E5 = 22 L5 = 22 15(13)
12(11) 3
L3 = 16 E3 = 18 L2 = 9 E2 = 9
(b)
6
5
L4 = 16 E4 = 16
7(3)
2
4 12(6) 9(4) L1 = 0 E1 = 0
7(4)
1
6(2) 7(6) 6
5
L6 = 29 E6 = 29
E5 = 22 L5 = 22 15(13)
12(11) L3 = 16 E3 = 17
3
Fig. 14.25 a Project network with time duration 30 days. b Project network with time duration 29 days
Fig. 14.26 Project network with time duration 28 days
Fig. 14.27 Project network with time duration 27 days
Solution
(i) The network diagram of the project is shown in Fig. 14.24.
(ii) The network has been drawn using the normal time duration of each activity. The earliest and latest occurrence times of each event have then been computed and displayed on the network, along with the normal and crash time durations.
Fig. 14.28 Project network with time duration 26 days
Fig. 14.29 Project network with time duration 25 days
From the network, it is seen that L1 = E1 = 0, L2 = E2 = 9, L4 = E4 = 16, L5 = E5 = 22 and L6 = E6 = 31. Hence, the critical path is 1–2–4–5–6. As E6 = L6 = 31, the normal duration of the project is 31 days.

The associated cost of the project
= Direct normal cost + indirect cost for 31 days
= Rs. [(1300 + 1000 + 700 + 1200 + 1700 + 600 + 1000 + 900) + 31 × 400]
= Rs. [8400 + 12,400] = Rs. 20,800.
(iii) The cost slopes of the different activities are computed using the formula Cost slope = (Crash cost − Normal cost)/(Normal time − Crash time), and these values are shown in the following table:

| Activity | (1, 2) | (1, 3) | (2, 3) | (2, 4) | (2, 5) | (3, 6) | (4, 5) | (5, 6) |
|---|---|---|---|---|---|---|---|---|
| Cost slope | 220 | 190 | 280 | 180 | 90 | 100 | 150 | 100 |
Now we construct the table for finding the optimal cost with duration and the minimum duration with cost (cf. Table 14.5).

Table 14.5 Computation of total cost with duration due to crashing for Example 5

| Project length after crashing (days) | See figure (before crashing) | Critical path(s) | Activities crashed (time) | Normal direct cost (Rs.) (A) | Crashing cost (Rs.) (B) | Indirect cost (Rs. 400/day) (C) | Total cost (Rs.) (A + B + C) |
|---|---|---|---|---|---|---|---|
| 31 | Fig. 14.24 | 1–2–4–5–6 | – | 8400 | – | 400 × 31 = 12,400 | 20,800 |
| 30 | Fig. 14.24 | 1–2–4–5–6 | (5, 6)(1) | 8400 | 100 × 1 = 100 | 400 × 30 = 12,000 | 20,500 |
| 29 | Fig. 14.25a | 1–2–4–5–6 | (5, 6)(1) | 8400 | 100 + 100 × 1 = 200 | 400 × 29 = 11,600 | 20,200 |
| 28 | Fig. 14.25b | 1–2–4–5–6, 1–2–3–6 | (5, 6)(1) | 8400 | 200 + 100 × 1 = 300 | 400 × 28 = 11,200 | 19,900 |
| 27 | Fig. 14.26 | 1–2–4–5–6, 1–2–3–6 | (1, 2)(1) | 8400 | 300 + 220 × 1 = 520 | 400 × 27 = 10,800 | 19,720 |
| 26 | Fig. 14.27 | 1–2–4–5–6, 1–2–3–6, 1–3–6, 1–2–5–6 | (4, 5)(1), (3, 6)(1) | 8400 | 520 + 150 × 1 + 100 × 1 = 770 | 400 × 26 = 10,400 | 19,570 |
| 25 | Fig. 14.28 | 1–2–4–5–6, 1–2–3–6, 1–3–6, 1–2–5–6 | (1, 2)(1), (1, 3)(1) | 8400 | 770 + 220 × 1 + 190 × 1 = 1180 | 400 × 25 = 10,000 | 19,580 |
| 24 | Fig. 14.29 | (same four paths) | (1, 2)(1), (1, 3)(1) | 8400 | 1180 + 220 × 1 + 190 × 1 = 1590 | 400 × 24 = 9600 | 19,590 |
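The total-cost column can be cross-checked by accumulating the crashing-cost increments and adding the Rs. 400/day indirect cost (a sketch; the per-step increments follow the crashing order described above):

```python
# A = 8400 direct cost; B accumulates crashing-cost increments per step;
# C = Rs. 400 per day of project duration.
crash_increments = {30: 100, 29: 100, 28: 100, 27: 220,
                    26: 150 + 100, 25: 220 + 190, 24: 220 + 190}
total = {31: 8400 + 400 * 31}   # no crashing at the normal duration
B = 0
for days in sorted(crash_increments, reverse=True):
    B += crash_increments[days]
    total[days] = 8400 + B + 400 * days
print(total[26], min(total, key=total.get))   # 19570 26
```

The minimum of the resulting totals falls at 26 days (Rs. 19,570), confirming the optimum duration found in (iv).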
From Fig. 14.30, it is seen that no further crashing is possible beyond 24 days. Hence, the optimum project duration, i.e. the least project duration, is 24 days and the corresponding cost of the project will be Rs. 19,590.00.
Fig. 14.30 Project network with time duration 24 days
(iv) From Table 14.5, it is observed that the optimum cost is Rs. 19,570.00 and the corresponding project duration is 26 days.
14.13 Exercises
1. A project consists of a series of tasks labelled A, B, …, H, I with the following relationships (W < X, Y means X and Y cannot start until W is completed; X, Y < W means W cannot start until both X and Y are completed). With this notation, construct the network diagram having the following constraints:

A < D, E; B, D < F; C < G; G < H; F, G < I.

Find also the minimum time of completion of the project, when the time (in days) of completion of each task is as follows:

| Task | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| Time | 23 | 8 | 20 | 16 | 24 | 18 | 19 | 4 | 10 |
2. The following are the details of the estimated times of the activities of a certain project:

| Activity | Immediate predecessor | Estimated time (weeks) |
|---|---|---|
| A | – | 2 |
| B | A | 3 |
| C | A | 4 |
| D | B, C | 6 |
| E | – | 2 |
| F | E | 8 |
(a) Find the critical path and the expected time of the project.
(b) Calculate the earliest start time and earliest finish time for each activity.
(c) Calculate the float for each activity.

3. A project consists of eight activities with the following relevant information (estimated durations in days):

| Activity | Immediate predecessor | Optimistic | Most likely | Pessimistic |
|---|---|---|---|---|
| A | – | 1 | 1 | 7 |
| B | – | 1 | 4 | 7 |
| C | – | 2 | 2 | 8 |
| D | A | 1 | 1 | 1 |
| E | B | 2 | 5 | 14 |
| F | C | 2 | 5 | 8 |
| G | D, E | 3 | 6 | 15 |
| H | F, G | 1 | 2 | 3 |

(ii) Draw the network and find the expected project completion time.
(iii) What duration will have 95% confidence for project completion?
(iv) If the average duration for activity F increases to 14 days, what will be its effect on the expected project completion time which will have 95% confidence?

4. Draw the network for the following project, compute the earliest and latest times for each event, and find the critical path:
| Activity | (1, 2) | (1, 3) | (2, 4) | (3, 4) | (4, 5) | (4, 6) | (5, 7) | (6, 7) | (7, 8) |
|---|---|---|---|---|---|---|---|---|---|
| Immediate predecessor | – | – | (1, 2) | (1, 3) | (2, 4) | (2, 4), (3, 4) | (4, 5) | (4, 6) | (6, 7), (5, 7) |
| Time (days) | 5 | 4 | 6 | 2 | 1 | 7 | 8 | 4 | 3 |
5. A small project consists of seven activities, the details of which are given below:

| Activity | Predecessor | to | tm | tp |
|---|---|---|---|---|
| A | None | 3 | 6 | 9 |
| B | None | 2 | 5 | 8 |
| C | A | 2 | 4 | 6 |
| D | B | 2 | 3 | 10 |
| E | B | 1 | 3 | 11 |
| F | C, D | 4 | 6 | 8 |
| G | E | 1 | 5 | 15 |

Find the critical path. What is the probability that the project will be completed by 18 weeks?

6. The following table gives data on normal time-cost and crash time-cost for a project:

| Activity | Normal time (days) | Normal cost (Rs.) | Crash time (days) | Crash cost (Rs.) |
|---|---|---|---|---|
| (1, 2) | 6 | 650 | 4 | 1000 |
| (1, 3) | 4 | 600 | 2 | 2000 |
| (2, 4) | 5 | 500 | 3 | 1500 |
| (2, 5) | 3 | 450 | 1 | 650 |
| (3, 4) | 6 | 900 | 4 | 2000 |
| (4, 6) | 8 | 800 | 4 | 3000 |
| (5, 6) | 4 | 400 | 2 | 1000 |
| (6, 7) | 3 | 450 | 2 | 800 |
The indirect cost per day is Rs. 100. (i) Draw the network and identify the critical path. (ii) What are the normal project duration and associated cost? (iii) Crash the relevant activities systematically and determine the optimum project completion time and cost.
7. The following table gives the activities in a construction project and other relevant information:

| Activity | Immediate predecessor | Normal time (days) | Crash time (days) | Normal direct cost (Rs.) | Crash direct cost (Rs.) |
|---|---|---|---|---|---|
| A | – | 4 | 3 | 60 | 90 |
| B | – | 6 | 4 | 150 | 250 |
| C | – | 2 | 1 | 38 | 60 |
| D | A | 5 | 3 | 150 | 250 |
| E | C | 2 | 2 | 100 | 100 |
| F | A | 7 | 5 | 115 | 175 |
| G | B, D, E | 4 | 2 | 100 | 240 |

Indirect costs vary as follows:

| Days | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 |
|---|---|---|---|---|---|---|---|---|---|---|
| Cost (Rs.) | 600 | 500 | 400 | 250 | 175 | 100 | 75 | 50 | 35 | 25 |
(i) Draw the network of the project. (ii) Determine the project duration which will result in minimum total project cost.
Chapter 15 Queueing Theory

15.1 Objective
The objective of this chapter is to discuss:

• The different situations which generate queueing problems
• The various factors/components of a queueing system
• A variety of performance measures (operating characteristics) of a queueing system
• The probability distributions of arrivals and departures in a queueing system
• Poisson queueing models with finite and infinite capacity
• The single-server infinite-capacity queueing model with Poisson arrivals and multiphase Erlang service
• The single-server infinite-capacity queueing model with Poisson arrivals and general service
• A mixed queueing model with a random arrival rate which follows the Poisson distribution and a deterministic service rate
• Finding the optimum service level by minimizing the total cost of M/M/1 systems with finite and infinite capacity
15.2 Basics of Queueing Theory
15.2.1 Introduction

A group of customers waiting to receive a service, including those currently receiving it, is known as a waiting line or queue. Queueing theory involves the mathematical study of queues or waiting lines. It analyses production and servicing systems exhibiting random variability in market demand (arrival times) and service times. The subject had its beginning in the research work of the Danish engineer A. K. Erlang, who in 1909 experimented with fluctuating demand in telephone traffic; eight years later he published a report addressing the delays in automatic dialling equipment. At the end of the Second World War, Erlang's earlier work was extended and applied to general problems and to business applications of waiting lines.

Queueing theory tries to answer questions involving the average number of customers in the queue, the average number of customers in the system, the mean waiting time in the queue, the mean waiting time in the system, and so forth. The objective of a queueing model is to determine how to provide service to customers so as to maintain an economic balance between the cost (time) of the service and the cost (time) associated with waiting for the service, by manipulating variables such as the number of servers, the rate of service and the order of service.

© Springer Nature Singapore Pte Ltd. 2019. A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_15
15.2.2 Reason Behind the Queue

The formation of waiting lines (or queues) is a common phenomenon which occurs whenever the current demand for a service exceeds the current capacity to provide that service; i.e. either a customer does not get the service immediately and must wait, or the service facilities stand idle because the number of service facilities exceeds the number of customers requiring service. Waiting lines or queues are a common occurrence both in our daily life and in various business and industrial situations. Most queueing problems involve finding the level of service that a firm should provide. Table 15.1 shows examples of queueing services. For example:

(i) Supermarkets must decide how many cash counters should be opened.
(ii) Gasoline stations must decide how many pumps should be opened and how many attendants should be on duty.
(iii) Manufacturing plants must determine the optimal number of machines to have running in each shift.
(iv) Banks must decide how many teller windows to keep open to serve customers during certain hours of the day.

The person waiting in a queue or receiving the service is called the customer, and the person by whom the customer is serviced is called a server. Queueing problems arise because providing enough service facilities for no customer ever to wait is usually too costly. Customers arrive at a counter according to a certain probabilistic law (Poisson input, Erlang input, etc.). On the other hand, the customers are served by one or more servers following a certain principle (FCFS, i.e. first come, first served; random service; etc.). By providing additional service facilities, the customers' waiting time or the queue can be reduced; conversely, by decreasing the number of service facilities, a long queue
Table 15.1 Examples of queueing services

Situation | Customers/arrivals | Service facility or servers | Service process
Flow of automobile traffic through a road network | Automobiles | Road network | Automatic signalling
Ships entering a port | Ships | Docks | Unloading the items
Telephone exchange | Telephone calls | Switchboard operators or electric switching operators | Computer
Number of runways at airports | Aeroplanes | Runways | Landing or flying the planes
Bank | Customer | Bank clerks | Money deposited or withdrawn, cheques drawn
Sale of railway tickets | Railway passengers | Booking clerks | Selling tickets
Maintenance and repair of machines | Machine breakdown | Mechanics | Repairing
Department store | Customer | Checking clerks and bag packers | Bill payment
Job interviewing | Applicant | Interviewer | Selection of the applicant
will be formed. The service times are random variables governed by a given probabilistic law. After being served, a customer leaves the queue. The objective of the queueing model is to determine how to provide the service to the customers so as to minimize the total of the cost of service and the cost of the customers' waiting time, by manipulating certain factors (variables) such as the number of servers, the rate of service and the order of service.
15.2.3 Characteristics of Queueing Systems

Any queueing system can be characterized by the following set of characteristics:

• Input source
• Service mechanism
• Service discipline
• Capacity of the system.
The general framework of a queueing system is shown below:

    Input source → customers waiting in the queue → service mechanism → service completed (serviced customers leave the system)

Customers outside the system decide whether or not to join the queue; once in the queue, they are served according to the queueing discipline.
Input source
The input source is the device or population that supplies customers to the service facility. An input source is characterized by the following factors:

• Input size
• Arrival pattern
• Behaviour of the arrivals.

Input size
If the total number of customers requiring service is only a few, the input size is said to be finite. If the potential customers requiring service are sufficiently large in number, the input source is considered to be infinite. Also, if the customers arrive at the service facility in batches of fixed or variable size instead of one at a time, the input is said to occur in bulk or batches. If service is not immediately available, arriving customers may be required to join the queue.

Arrival pattern
The pattern by which customers arrive at the service system, or the time between successive arrivals (the inter-arrival time), is generally uncertain, so the arrival pattern is measured by either the mean arrival rate or the mean inter-arrival time. These measures are characterized in the form of the probability distributions associated with these random processes. The most common stochastic queueing model assumes that the number of arrivals per unit time follows a Poisson distribution or, equivalently, that the inter-arrival time follows an exponential distribution.

Customer behaviour
The customers generally behave in different ways as follows:

(a) Balking: A customer may leave the queue because the queue is too long and he/she has no time to wait, or because there is not sufficient waiting space.
(b) Reneging: A customer may become impatient after waiting in the queue for some time and leave the queue.
(c) Priorities: In certain applications some customers are served before others regardless of their order of arrival. These customers have priority over others.
(d) Jockeying: Customers may change position from one queue to another, hoping to receive service more quickly.

Service mechanism
The service mechanism is concerned with the manner in which customers are served. It can be characterized by the following factors:

(i) Availability of service facility: Service may be available only at certain times, not always.
(ii) Capacity of service facility: The capacity is measured in terms of the number of customers who can be served simultaneously. A queueing system may have one or more parallel service channels, or a sequence of channels in series, as shown in Figs. 15.1, 15.2, 15.3 and 15.4.
(iii) Service time or duration: The service time may be constant or a random variable. In general, service time is not constant, since it is common practice to use overtime or extra effort when an excessive queue is present.
Fig. 15.1 Single channel single phase queueing system

Fig. 15.2 Single channel multiple phase queueing system

Fig. 15.3 Multiple channel single phase queueing system
Fig. 15.4 Multiple channel multiple phase queueing system
A variety of probability distributions can be used to characterize the service pattern. The service pattern is assumed to be independent of the arrival of customers. In the most common stochastic queueing model, it is assumed that the service time follows an exponential distribution or, equivalently, that the number of service completions per unit time follows a Poisson distribution.

Service discipline
The service discipline refers to the order or manner in which customers in the queue are served. The most common disciplines are as follows:

(i) First come, first served (FCFS): According to this discipline the customers are served in the order of their arrival. It is also known as FIFO (first in, first out). This service discipline may be seen at a cinema ticket window or a railway ticket window, etc.
(ii) Last come, first served (LCFS): This discipline may be seen in a big godown, where the units (items) which come in last are taken out (served) first.
(iii) Service in random order (SIRO): In this case, the arrivals are served randomly, irrespective of their order of arrival in the system.
(iv) Service on some priority procedure: Some customers are served before others without considering their order of arrival; i.e. some customers are served on a priority basis. There are two types of priority service:

(a) Preemptive priority: A customer with the highest priority enters service immediately on arriving in the system, even if a customer with lower priority is already in service; i.e. the lower priority customer's service is interrupted to start service for the priority customer. The interrupted service is resumed after the highest priority customer has been served.
(b) Non-preemptive priority: The highest priority customer goes to the head of the queue, but his service starts only on completion of the current customer's service.
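The four disciplines differ only in which waiting customer is removed next. A minimal sketch (the function name and data representations are ours, not from the text; priority is modelled non-preemptively, with smaller numbers served first):

```python
import heapq
import random
from collections import deque

def next_customer(queue, discipline, rng=None):
    """Remove and return the next customer under a given service discipline.

    `queue` is a deque for FCFS/LCFS/SIRO, or a heap of
    (priority, arrival_order, customer) tuples for PRIORITY.
    """
    if discipline == "FCFS":       # first come, first served
        return queue.popleft()
    if discipline == "LCFS":       # last come, first served
        return queue.pop()
    if discipline == "SIRO":       # service in random order
        i = (rng or random).randrange(len(queue))
        queue.rotate(-i)           # bring the randomly chosen customer to the front
        return queue.popleft()
    if discipline == "PRIORITY":   # non-preemptive priority
        return heapq.heappop(queue)[-1]
    raise ValueError(discipline)
```

The arrival-order field in the priority tuples breaks ties, so equal-priority customers are still served FCFS.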
Capacity of the system
In certain cases a queueing system cannot accommodate more than a limited number of customers at a time. No further customers are allowed to enter until space becomes available to accommodate a new customer. This type of situation is referred to as a finite source queue.
15.2.4 Important Definitions

Queue length: The number of persons (customers) waiting in the queue at any time.
Average length of queue: The number of customers in the queue per unit time.
Waiting time: The time up to which a unit has to wait in the queue before it is taken into service.
Servicing time: The time taken for servicing of a unit.
Busy period: The busy period of a server is the time during which the server remains busy servicing; i.e. the time between the start of service of the first unit and the end of service of the last unit in the queue.
Idle period: When all the customers in the queue have been served, the idle period of the server begins, and it continues up to the arrival of the next customer. Thus, the idle period of a server is the time during which he remains free because there is no customer in the system.
Mean arrival rate: The expected number of arrivals occurring in a time interval of unit length.
Mean servicing rate: The expected number of services completed in a time interval of unit length, given that servicing goes on throughout the entire time unit.
Traffic intensity: For a simple queue, the traffic intensity is the ratio of the mean arrival rate to the mean servicing rate, i.e.

    Traffic intensity = (mean arrival rate) / (mean servicing rate).
15.2.5 The State of the System

The state of a system involves the study of the system's behaviour over time. The states of a system may be classified as follows:
(i) Transient state: A system is said to be in a transient state when its operating characteristics depend on time. Thus, a queueing system is in a transient state when the probability distributions of arrivals, waiting time and servicing time of the customers are dependent on time. This state occurs at the beginning of the operation of the system.

(ii) Steady state: A system is said to be in steady state when its operating characteristics become independent of time. Thus, a queueing system is in steady state when the probability distributions of arrivals, waiting time and servicing time of the customers are independent of time. Let p_n(t) denote the probability that there are n units in the system at time t. If the probability p_n(t) remains the same as time passes, the system has reached steady state. Mathematically, in steady state,

    \lim_{t \to \infty} p_n(t) = p_n \quad (\text{independent of time}),

which implies

    \lim_{t \to \infty} \frac{d}{dt} p_n(t) = \frac{dp_n}{dt} = 0.

(iii) Explosive state: If the servicing rate is less than the arrival rate, the length of the queue goes on increasing with time and tends to infinity as t → ∞. The system is then said to be in an explosive state.
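The long-run behaviour is governed entirely by the ratio of the two rates. A small helper makes the check explicit (function names are ours, for illustration only):

```python
def traffic_intensity(arrival_rate, service_rate):
    """rho = (mean arrival rate) / (mean servicing rate)."""
    return arrival_rate / service_rate

def long_run_behaviour(arrival_rate, service_rate):
    """rho < 1: the system can settle into steady state.
    rho > 1: the queue length grows without bound (the explosive state).
    rho = 1: no steady state exists either (borderline case)."""
    rho = traffic_intensity(arrival_rate, service_rate)
    if rho < 1:
        return "steady state reachable"
    return "explosive" if rho > 1 else "no steady state"
```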
15.2.6 Probability Distribution in a Queueing System

In a queueing system, it is assumed that customers arrive at random, the number of arrivals following a Poisson distribution or, equivalently, the inter-arrival times obeying an exponential distribution. In most cases, service times are also assumed to be exponentially distributed. The basic assumptions (axioms) are as follows.

Axiom 1 The probability that exactly one arrival occurs during the time interval (t, t + Δt) is λΔt + O(Δt), i.e. p_1(Δt) = λΔt + O(Δt), where λ is a constant independent of the total number of arrivals up to time t, Δt is a small time interval, and O(Δt) denotes a quantity of smaller order of magnitude than Δt, such that

    \lim_{\Delta t \to 0} \frac{O(\Delta t)}{\Delta t} = 0.

Axiom 2 The probability of more than one arrival during the time interval (t, t + Δt) is negligible and is denoted by O(Δt), i.e.

    p_0(\Delta t) + p_1(\Delta t) + O(\Delta t) = 1.

Axiom 3 The numbers of arrivals in non-overlapping intervals are statistically independent.
Distribution of arrivals (pure birth process)
A process in which only arrivals are counted and no departure takes place is called a pure birth process. Stated in terms of queueing, in a birth-death process the system state increases by a birth (an arrival into the system) and decreases by a death (the departure of a serviced customer from the system).

Property If the arrivals are completely random, then the probability distribution of the number of arrivals in a fixed time interval follows a Poisson distribution.

Proof Let p_n(t) denote the probability of n arrivals up to time t, where n is an integer greater than or equal to zero. The derivation uses the three axioms above.

Case 1: n = 0. If there is no unit in the system at time t + Δt, then there was no unit at time t and no arrival during Δt. Therefore, the probability of no unit in the system at time t + Δt is given by

    p_0(t + \Delta t) = p_0(t)(1 - \lambda \Delta t) + O(\Delta t).    (15.1)

Case 2: n > 0. In this case there are the following mutually exclusive ways of having n units at time t + Δt:

(i) n arrivals in the system at time t, and no arrival during the time Δt
(ii) (n − 1) arrivals in the system at time t, and one arrival during the time Δt
(iii) (n − 2) arrivals in the system at time t, and two arrivals during the time Δt, and so on.

Therefore, the probability of n units in the system at time t + Δt is given by

    p_n(t + \Delta t) = p_n(t)(1 - \lambda \Delta t) + p_{n-1}(t)\,\lambda \Delta t + O(\Delta t).    (15.2)

Equations (15.1) and (15.2) can be written as
    \frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda p_0(t) + \frac{O(\Delta t)}{\Delta t}, \qquad n = 0,

    \frac{p_n(t + \Delta t) - p_n(t)}{\Delta t} = -\lambda p_n(t) + \lambda p_{n-1}(t) + \frac{O(\Delta t)}{\Delta t}, \qquad n > 0.

Taking the limit as Δt → 0 on both sides of these equations, we have the following system of differential difference equations:

    p_0'(t) = -\lambda p_0(t), \qquad n = 0,    (15.3)

    p_n'(t) = -\lambda p_n(t) + \lambda p_{n-1}(t), \qquad n > 0.    (15.4)
Solution of the differential difference equations
From (15.3),

    \frac{p_0'(t)}{p_0(t)} = -\lambda.

Integrating, we have log p_0(t) = −λt + A, where A is an arbitrary constant. When t = 0, p_0(0) = 1, which implies A = 0. Hence,

    p_0(t) = e^{-\lambda t}.    (15.5)

Now substituting n = 1 in (15.4), we have

    p_1'(t) + \lambda p_1(t) = \lambda e^{-\lambda t} \qquad (\text{since } p_0(t) = e^{-\lambda t}).

This is a first-order linear differential equation. Solving it and using p_1(0) = 0, we have

    p_1(t) = \lambda t\, e^{-\lambda t}.

Now, substituting n = 2 in (15.4) and using the result for p_1(t), we have

    p_2'(t) + \lambda p_2(t) = \lambda \cdot \lambda t\, e^{-\lambda t}.

Solving this equation and using p_2(0) = 0, we have

    p_2(t) = e^{-\lambda t}\, \frac{(\lambda t)^2}{2!}.    (15.6)

Similarly, for n = 3,
    p_3(t) = e^{-\lambda t}\, \frac{(\lambda t)^3}{3!}.

Let us assume that for n = m,

    p_m(t) = e^{-\lambda t}\, \frac{(\lambda t)^m}{m!}.

Now, we shall prove that this result is true for n = m + 1 also. For this purpose, substituting n = m + 1 in (15.4) and using the result p_m(t) = e^{-λt} (λt)^m / m!, we have

    p_{m+1}'(t) + \lambda p_{m+1}(t) = \lambda\, \frac{(\lambda t)^m}{m!}\, e^{-\lambda t}.

On solving and using p_{m+1}(0) = 0, we have

    p_{m+1}(t) = \frac{(\lambda t)^{m+1}}{(m+1)!}\, e^{-\lambda t}.

Hence, in general,

    p_n(t) = \frac{(\lambda t)^n}{n!}\, e^{-\lambda t}, \qquad n = 0, 1, 2, \ldots,

which is the probability mass function of the Poisson distribution. Therefore, the probability distribution of the number of arrivals in a fixed time interval follows a Poisson distribution.

Distribution of inter-arrival times
The inter-arrival time is defined as the time interval between two successive arrivals.

Property of inter-arrival times If the arrival process in a queueing system follows a Poisson distribution, then the associated random variable defined as the inter-arrival time follows an exponential distribution, and vice versa.

Proof If n is the number of arrivals up to time t, then the probability of n arrivals in time t is given by

    p_n(t) = \frac{(\lambda t)^n\, e^{-\lambda t}}{n!}.    (15.7)
Let t_0 be the instant of an initial arrival, and let T be the random variable of inter-arrival time. If there is no arrival during the intervals (t_0, t_0 + T) and (t_0 + T, t_0 + T + ΔT), then t_0 + T + ΔT is the instant of the next arrival:

    t_0 (first arrival) --- T ---> t_0 + T --- ΔT ---> t_0 + T + ΔT (second arrival)

From (15.7),

    p_0(T) = \text{probability of no arrival in time } T = \frac{(\lambda T)^0\, e^{-\lambda T}}{0!} = e^{-\lambda T},

and

    p_0(T + \Delta T) = \frac{\{\lambda(T + \Delta T)\}^0\, e^{-\lambda(T + \Delta T)}}{0!} = e^{-\lambda T}\, e^{-\lambda \Delta T} = p_0(T)\,[1 - \lambda \Delta T + O(\Delta T)],

where O(ΔT) contains the terms of second and higher powers of ΔT. Therefore,

    p_0(T + \Delta T) - p_0(T) = -\lambda p_0(T)\, \Delta T + O(\Delta T)\, p_0(T),

or

    \frac{p_0(T + \Delta T) - p_0(T)}{\Delta T} = -\lambda p_0(T) + p_0(T)\, \frac{O(\Delta T)}{\Delta T}.

Taking the limit as ΔT → 0, we have

    \frac{d}{dT}\, p_0(T) = -\lambda p_0(T),

and hence, since p_0(T) = e^{−λT},

    \frac{d}{dT}\, p_0(T) = -\lambda e^{-\lambda T}.    (15.8)

According to probability theory, F'(x) = f(x), where F(x) is the distribution function and f(x) is the probability density function. By a similar argument, we may write d/dT [p_0(T)] as f(T), where p_0(T) is the probability distribution
function for no arrival up to time T and f(T) is the corresponding probability density function of T. Since a probability density function is always non-negative, we disregard the negative sign on the right-hand side of (15.8). Hence f(T) = λe^{−λT}, which is the probability density function of the exponential distribution. That is, the random variable T follows an exponential distribution. The converse of the property may be proved in a similar way.

Distribution of departures (pure death process)
Sometimes a situation arises where no additional customer joins the system while service continues for those already in the queue. Let there be N (≥ 1) customers in the system at time t = 0, with service provided at a rate μ. The number of customers in the system at time t ≥ 0 is then N minus the total number of departures up to time t. The distribution of departures can be obtained with the help of the following basic axioms:

(i) The probability of one departure during the time interval (t, t + Δt) is μΔt + O(Δt).
(ii) The probability of more than one departure during the time interval (t, t + Δt) is negligible and is denoted by O(Δt).
(iii) The numbers of departures in non-overlapping intervals are statistically independent.
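Both properties above (Poisson counts, exponential gaps) can be checked empirically: draw exponential inter-arrival times with rate λ and count how many arrivals land in a window of length t. The sample mean and the sample variance of the counts should both be close to λt, as a Poisson distribution requires. A rough simulation sketch (names are ours):

```python
import random

def arrival_counts(lam, t, windows, seed=0):
    """Simulate a Poisson process of rate `lam` by summing exponential
    inter-arrival gaps; return the number of arrivals in each of
    `windows` independent intervals of length t."""
    rng = random.Random(seed)
    counts = []
    for _ in range(windows):
        elapsed, n = 0.0, 0
        while True:
            elapsed += rng.expovariate(lam)  # exponential gap with rate lam
            if elapsed > t:
                break
            n += 1
        counts.append(n)
    return counts
```

For a Poisson(λt) count, mean and variance are both λt, so with λ = 2 and t = 5 both sample statistics should hover around 10.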
15.2.7 Classification of Queueing Models

Generally, a queueing model may be specified in the symbolic form (a/b/c):(d/e/f), where:

a = arrival distribution, i.e. type of arrival process
b = service time distribution, i.e. type of service process
c = number of servers
d = capacity of the system
e = service discipline
f = size of the calling source.

Generally, the last symbol, f, is omitted. The standard notation for a and b includes:

M = Poisson (or Markovian) arrival or departure distribution, or equivalently exponential inter-arrival or service time distribution
E_k = Erlangian (or gamma) inter-arrival or service time distribution with parameter k
G = general service time (or departure) distribution.
Thus, (M/E_k/1):(∞/FIFO/∞) defines a queueing system in which arrivals follow the Poisson distribution; the service time distribution is Erlangian; there is a single server; the capacity of the system is infinite, i.e. the system can hold any number of customers; the service discipline is first in, first out; and finally, the source generating the arriving customers has infinite capacity. Generally, queues with arrivals and departures start under a transient condition and gradually reach steady state after a sufficiently long time has elapsed, provided that the parameters of the system permit reaching steady state. We now define some parameters for a steady state queueing system:

p_n = (steady state) probability of n customers in the system
L_s = expected number of customers in the system
L_q = expected number of customers in the queue
W_s = expected waiting time in the system
W_q = expected waiting time in the queue
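The symbolic form (a/b/c):(d/e/f) can be held in a small record type, which makes model descriptions explicit in code (the class and field names are ours, for illustration):

```python
from dataclasses import dataclass

@dataclass
class KendallModel:
    arrival: str         # a: arrival distribution (M, Ek, G, ...)
    service: str         # b: service time distribution
    servers: int         # c: number of servers
    capacity: str        # d: system capacity ("inf" or a number N)
    discipline: str      # e: service discipline (FCFS, LCFS, SIRO, ...)
    source: str = "inf"  # f: size of the calling source (often omitted)

# (M/M/1):(inf/FCFS/inf), the model studied in Sect. 15.3.2
MM1 = KendallModel("M", "M", 1, "inf", "FCFS")
```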
15.3 Poisson Queueing Models

15.3.1 Introduction

In this section, we shall discuss single as well as multiserver queueing models with Poisson arrival and departure processes. According to the capacity of the queueing system, we can classify these models as follows:

(i) Single server queueing model with infinite system capacity, i.e. (M/M/1):(∞/FCFS/∞)
(ii) Single server queueing model with finite system capacity, i.e. (M/M/1):(N/FCFS/∞)
(iii) Multiserver queueing model with infinite system capacity, i.e. (M/M/c):(∞/FCFS/∞)
(iv) Multiserver queueing model with finite system capacity, i.e. (M/M/c):(N/FCFS/∞).

We shall also discuss the repair problem of machine breakdowns and the power supply problem.
15.3.2 Model (M/M/1):(∞/FCFS/∞)

Assumptions
In this model, we shall discuss a single server queueing system under the following assumptions:

(i) The arrival rate follows a Poisson distribution.
(ii) The servicing rate follows a Poisson distribution.
(iii) The queueing system has only a single service channel.
(iv) The service discipline is first come, first served (FCFS).
(v) The capacity of the system is infinite.
Let λ and μ be the mean arrival rate and mean service rate of units respectively, and let n be the number of customers in the system. Hence, the probability of one arrival during time Δt is λΔt + O(Δt), and the probability of one departure (service completion) during Δt is μΔt + O(Δt). If there is no customer in the system at time t + Δt, there are the following mutually exclusive cases:

(i) No customer in the system at time t, no arrival in time Δt
(ii) One customer in the system at time t, no arrival in time Δt, one service in time Δt
(iii) No customer in the system at time t, one arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of no customer in the system at time t + Δt is given by

    p_0(t + \Delta t) = p_0(t)[1 - \lambda \Delta t + O(\Delta t)] + p_1(t)[1 - \lambda \Delta t + O(\Delta t)][\mu \Delta t + O(\Delta t)] + O(\Delta t)
                     = p_0(t) - \lambda p_0(t)\, \Delta t + \mu p_1(t)\, \Delta t + O(\Delta t),

or

    \frac{p_0(t + \Delta t) - p_0(t)}{\Delta t} = -\lambda p_0(t) + \mu p_1(t) + \frac{O(\Delta t)}{\Delta t}.

Taking the limit as Δt → 0 on both sides of the preceding expression, we have

    p_0'(t) = -\lambda p_0(t) + \mu p_1(t).    (15.9)
To have n (> 0) customers in the system at time t + Δt, there are the following mutually exclusive cases:

(i) n customers in the system at time t, no arrival in time Δt, no service in time Δt
(ii) (n − 1) customers in the system at time t, one arrival in time Δt, no service in time Δt
(iii) (n + 1) customers in the system at time t, no arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of n customers in the system at time t + Δt is given by

    p_n(t + \Delta t) = p_n(t)\{1 - \lambda \Delta t + O(\Delta t)\}\{1 - \mu \Delta t + O(\Delta t)\}
                      + p_{n-1}(t)[\lambda \Delta t + O(\Delta t)][1 - \mu \Delta t + O(\Delta t)]
                      + p_{n+1}(t)[1 - \lambda \Delta t + O(\Delta t)][\mu \Delta t + O(\Delta t)] + O(\Delta t), \qquad n > 0
                     = p_n(t) + [-(\lambda + \mu) p_n(t) + \lambda p_{n-1}(t) + \mu p_{n+1}(t)]\, \Delta t + O(\Delta t).

Therefore,

    \frac{p_n(t + \Delta t) - p_n(t)}{\Delta t} = \lambda p_{n-1}(t) - (\lambda + \mu) p_n(t) + \mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}, \qquad n > 0.

Now taking the limit as Δt → 0 on both sides of this equation, we have

    p_n'(t) = \lambda p_{n-1}(t) - (\lambda + \mu) p_n(t) + \mu p_{n+1}(t), \qquad n > 0.    (15.10)
Under the steady state condition of the system, i.e. \lim_{t \to \infty} p_n(t) = p_n and \lim_{t \to \infty} p_n'(t) = 0, Eqs. (15.9) and (15.10) reduce to

    -\lambda p_0 + \mu p_1 = 0,    (15.11)

    \lambda p_{n-1} - (\lambda + \mu) p_n + \mu p_{n+1} = 0 \quad \text{for } n > 0,    (15.12)

which are the steady state difference equations of the system.

Solution of the model
From (15.11), −λp_0 + μp_1 = 0, which implies

    p_1 = \frac{\lambda}{\mu}\, p_0 = \rho\, p_0 \qquad \Big(\text{where } \rho = \frac{\lambda}{\mu}\Big).

Now setting n = 1 in (15.12), we have

    \lambda p_0 - (\lambda + \mu) p_1 + \mu p_2 = 0,

or

    p_2 = -\frac{\lambda}{\mu}\, p_0 + \Big(\frac{\lambda}{\mu} + 1\Big) p_1 = -\rho p_0 + (\rho + 1)\rho p_0 = \rho^2 p_0.

Again setting n = 2 in (15.12), we have
    \lambda p_1 - (\lambda + \mu) p_2 + \mu p_3 = 0,

or

    p_3 = -\frac{\lambda}{\mu}\, p_1 + \Big(\frac{\lambda}{\mu} + 1\Big) p_2 = -\rho^2 p_0 + (\rho + 1)\rho^2 p_0 = \rho^3 p_0.

Hence,

    p_3 = \rho^3 p_0.    (15.13)

Proceeding in this way, we have

    p_n = \rho^n p_0.

The value of p_0 is obtained from the condition \sum_{n=0}^{\infty} p_n = 1, i.e. p_0(1 + \rho + \rho^2 + \cdots) = 1, or

    p_0 = 1 - \rho, \qquad \text{since } \rho = \lambda/\mu < 1.    (15.14)

Therefore, using (15.14), p_n = ρ^n p_0 becomes

    p_n = (1 - \rho)\rho^n, \qquad n = 0, 1, 2, \ldots    (15.15)

This gives the probability that there are n units in the system at any time. Equations (15.14) and (15.15) give the required probability distribution of the queue length. We note that the probability distribution in (15.15) depends only on the traffic intensity ρ = λ/μ. Obviously, in order for p_0 to exist, it is required that ρ = λ/μ < 1, i.e. λ < μ; otherwise, the series diverges. This is the stability condition for the M/M/1 model.

Characteristics of the model
(i) Probability that the queue size is greater than or equal to N:

    \text{prob(queue size} \geq N) = p_N + p_{N+1} + p_{N+2} + \cdots
                                   = p_0\, \rho^N (1 + \rho + \rho^2 + \cdots)
                                   = p_0\, \frac{\rho^N}{1 - \rho} = \rho^N \qquad (\text{since } p_0 = 1 - \rho).

Therefore, prob(queue size ≥ N) = ρ^N = (λ/μ)^N.

(ii) Expected line length L_s
L_s = expected number of customers in the system, i.e. the expected line length, or average number of customers in the system:

    L_s = \sum_{n=0}^{\infty} n p_n = \sum_{n=0}^{\infty} n \rho^n p_0 = p_0\, \rho\, (1 + 2\rho + 3\rho^2 + 4\rho^3 + \cdots)
        = p_0\, \rho\, (1 - \rho)^{-2} = (1 - \rho)\, \rho\, \frac{1}{(1 - \rho)^2} \qquad (\text{since } p_0 = 1 - \rho)
        = \frac{\rho}{1 - \rho} = \frac{\lambda}{\mu - \lambda} \qquad \Big(\text{since } \rho = \frac{\lambda}{\mu}\Big).
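Equation (15.15) and the closed form for L_s can be checked numerically by truncating the geometric series (assumes ρ < 1; function names are ours):

```python
def p_n(n, rho):
    """Steady-state probability of n customers in M/M/1, Eq. (15.15)."""
    return (1 - rho) * rho ** n

def line_length_by_summation(rho, terms=10000):
    """Sum n * p_n term by term; should approach rho / (1 - rho)."""
    return sum(n * p_n(n, rho) for n in range(terms))
```

The same truncated sum also confirms prob(queue size ≥ N) = ρ^N, since the tail of the geometric series telescopes.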
(iii) Expected queue length L_q

L_q = expected number of customers in the queue (there are (n − 1) customers in the queue, excluding the one in service):

    L_q = \sum_{n=1}^{\infty} (n - 1)\, p_n
        = \sum_{n=1}^{\infty} n p_n - \sum_{n=1}^{\infty} p_n = L_s - \Big(\sum_{n=0}^{\infty} p_n - p_0\Big)
        = L_s - (1 - p_0) = L_s - \rho = \frac{\rho}{1 - \rho} - \rho
        = \frac{\rho^2}{1 - \rho} = \frac{\lambda^2}{\mu(\mu - \lambda)} \qquad \Big(\text{since } \rho = \frac{\lambda}{\mu}\Big).

Note: L_s = L_q + ρ = L_q + λ/μ, since L_q = L_s − ρ.

(iv) Variance of the queue length
From the definition of variance,

    Var(n) = E(n^2) - \{E(n)\}^2 = \sum_{n=1}^{\infty} n^2 p_n - \Big(\sum_{n=1}^{\infty} n p_n\Big)^2
           = \sum_{n=1}^{\infty} n^2 \rho^n (1 - \rho) - \Big(\frac{\rho}{1 - \rho}\Big)^2
           \qquad \Big(\text{since } L_s = \sum_{n=1}^{\infty} n p_n = \frac{\rho}{1 - \rho} \text{ and } p_n = \rho^n (1 - \rho)\Big)
    = \rho(1 - \rho) + 2^2 \rho^2 (1 - \rho) + 3^2 \rho^3 (1 - \rho) + 4^2 \rho^4 (1 - \rho) + \cdots - \frac{\rho^2}{(1 - \rho)^2}
    = \rho(1 - \rho)\,\big(1 + 2^2 \rho + 3^2 \rho^2 + 4^2 \rho^3 + \cdots\big) - \frac{\rho^2}{(1 - \rho)^2}
    = \rho(1 - \rho)\, S - \frac{\rho^2}{(1 - \rho)^2},

where S = 1 + 2^2 ρ + 3^2 ρ^2 + 4^2 ρ^3 + ⋯. Now we calculate a simplified expression for S. Integrating both sides with respect to ρ from 0 to ρ, we have

    \int_0^{\rho} S\, d\rho = \int_0^{\rho} \big(1 + 2^2 \rho + 3^2 \rho^2 + 4^2 \rho^3 + \cdots\big)\, d\rho
                            = \rho + 2\rho^2 + 3\rho^3 + 4\rho^4 + 5\rho^5 + \cdots
                            = \rho (1 - \rho)^{-2} = \frac{\rho}{(1 - \rho)^2}.

Differentiating both sides with respect to ρ, we have

    S = \frac{1}{(1 - \rho)^2} + \rho\, \frac{2}{(1 - \rho)^3} = \frac{1 - \rho + 2\rho}{(1 - \rho)^3} = \frac{1 + \rho}{(1 - \rho)^3}.

Therefore,

    Var(n) = \rho(1 - \rho)\, S - \frac{\rho^2}{(1 - \rho)^2}
           = \rho(1 - \rho)\, \frac{1 + \rho}{(1 - \rho)^3} - \frac{\rho^2}{(1 - \rho)^2}
           = \frac{\rho(1 + \rho) - \rho^2}{(1 - \rho)^2} = \frac{\rho}{(1 - \rho)^2}.
(v) Probability density function of waiting time excluding the service time distribution
In the steady state, each customer has the same waiting time distribution. Let this distribution be continuous with probability density function ψ(w), and let ψ(w) dw denote the probability that a customer begins to be served in the interval (w, w + dw), where w is measured from the time of his arrival. Suppose a customer A arrives at time w = 0 and his service begins in the interval (w, w + dw):

    customer A arrives                A's service starts
    w = 0 ------------------> w --- dw ---> w + dw

There are two possibilities:

(i) There is a finite probability p_0 (the probability that the system is empty) that the waiting time is zero.
(ii) Let there be n customers in the system, (n − 1) waiting and one in service, when customer A arrives. Before the service of A begins, (n − 1) customers must leave in the time interval (0, w), with the nth customer leaving in (w, w + dw). Since the server's mean rate of service is μ per unit time (μw in time w) and the service distribution is Poisson, we have

    \text{prob}[(n-1) \text{ customers waiting are served in time } w] = \frac{(\mu w)^{n-1}\, e^{-\mu w}}{(n-1)!}

and prob[one customer is served in time dw] = μ dw. Therefore,

    \psi_n(w)\, dw = \text{probability that a new arrival (finding } n \text{ in the system) enters service between } w \text{ and } w + dw
                   = \frac{(\mu w)^{n-1}\, e^{-\mu w}}{(n-1)!}\, \mu\, dw.

Let W be the waiting time of a customer who has to wait, with w ≤ W ≤ w + dw. Since the queue length can vary from 1 to ∞, the probability density of the waiting time is given by
    \psi(w)\, dw = p(w \leq W \leq w + dw)
                 = \sum_{n=1}^{\infty} [\text{prob. of } n \text{ customers in the system}]\, \psi_n(w)\, dw
                 = \sum_{n=1}^{\infty} \rho^n (1 - \rho)\, \frac{(\mu w)^{n-1}\, e^{-\mu w}}{(n-1)!}\, \mu\, dw \qquad (\text{since } p_n = \rho^n (1 - \rho))
                 = \rho (1 - \rho)\, \mu\, e^{-\mu w} \sum_{n=1}^{\infty} \frac{(\rho \mu w)^{n-1}}{(n-1)!}\, dw
                 = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) e^{-\mu w}\, e^{\lambda w}\, dw \qquad (\text{since } \rho\mu = \lambda)
                 = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) e^{-(\mu - \lambda) w}\, dw, \qquad w > 0.

Hence,

    \int_0^{\infty} \psi(w)\, dw = \int_0^{\infty} \lambda \Big(1 - \frac{\lambda}{\mu}\Big) e^{-(\mu - \lambda) w}\, dw
                                 = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) \Big[\frac{e^{-(\mu - \lambda) w}}{-(\mu - \lambda)}\Big]_0^{\infty}
                                 = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) \frac{1}{\mu - \lambda} = \frac{\lambda}{\mu} \neq 1.

This happens because the case w = 0 is not included. Thus,

    \text{prob}[w > 0] = \int_0^{\infty} \psi(w)\, dw = \rho,

and prob[w = 0] = prob[no customer in the system] = p_0 = 1 − ρ. Since the sum of these probabilities of waiting time is 1, the complete distribution of the waiting time is:

(a) Continuous, for w ≤ W ≤ w + dw, with probability density function ψ(w) given by
    \psi(w)\, dw = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) e^{-(\mu - \lambda) w}\, dw.

(b) Discrete, for w = 0, with prob(w = 0) = 1 − λ/μ.

Note that the probability that the waiting time exceeds w_1 is

    \int_{w_1}^{\infty} \psi(w)\, dw = \Big[-\frac{\lambda}{\mu}\, e^{-(\mu - \lambda) w}\Big]_{w_1}^{\infty} = \frac{\lambda}{\mu}\, e^{-(\mu - \lambda) w_1},
which does not include the service time.

(vi) Busy period distribution, i.e. the conditional density function of the waiting time, given that a customer has to wait

Here we find the probability density function of the time that a customer who must wait spends waiting. Let ψ(w | w > 0) denote this conditional density. Since prob(w > 0) + prob(w = 0) = 1 and prob(w = 0) = 1 − ρ, we have prob(w > 0) = ρ = λ/μ, so

    \psi(w \mid w > 0) = \frac{\psi(w)}{\text{prob}(w > 0)}
                       = \frac{\lambda (1 - \lambda/\mu)\, e^{-(\mu - \lambda) w}}{\lambda/\mu}
                       = (\mu - \lambda)\, e^{-(\mu - \lambda) w}.    (15.16)

As a check, \int_0^{\infty} \psi(w \mid w > 0)\, dw = \int_0^{\infty} (\mu - \lambda)\, e^{-(\mu - \lambda) w}\, dw = 1. Thus, (15.16) gives the required probability density function for the busy period.

(vii) Mean or expected waiting time in the queue, W_q (average waiting time of an arrival in the queue)

We have
    W_q = \int_0^{\infty} w\, \psi(w)\, dw = \int_0^{\infty} w\, \lambda \Big(1 - \frac{\lambda}{\mu}\Big) e^{-(\mu - \lambda) w}\, dw.

Integrating by parts, taking w as the first function,

    W_q = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) \Big\{\Big[w\, \frac{e^{-(\mu - \lambda) w}}{-(\mu - \lambda)}\Big]_0^{\infty} + \int_0^{\infty} \frac{e^{-(\mu - \lambda) w}}{\mu - \lambda}\, dw\Big\}
        = \lambda \Big(1 - \frac{\lambda}{\mu}\Big) \Big[-\frac{e^{-(\mu - \lambda) w}}{(\mu - \lambda)^2}\Big]_0^{\infty}
        = \frac{\lambda(\mu - \lambda)}{\mu} \cdot \frac{1}{(\mu - \lambda)^2}
        = \frac{\lambda}{\mu(\mu - \lambda)} = \frac{L_q}{\lambda}, \qquad \text{where } L_q = \frac{\lambda^2}{\mu(\mu - \lambda)}.
(viii) Expected waiting time in the system, W_s (average time that an arrival spends in the system)

    W_s = \text{expected waiting time in the queue} + \text{expected service time}
        = W_q + \frac{1}{\mu} \qquad \Big(\text{since the expected service time} = \text{mean service time} = \frac{1}{\mu}\Big)
        = \frac{\lambda}{\mu(\mu - \lambda)} + \frac{1}{\mu} = \frac{1}{\mu - \lambda}.

Since W_s = \frac{1}{\mu - \lambda} and L_s = \frac{\lambda}{\mu - \lambda}, we have L_s = λW_s.

(ix) Expected length of the non-empty queue, or average length of the non-empty queue, (L | L > 0). We have

    (L \mid L > 0) = \frac{L_q}{\text{prob[an arrival has to wait, } L > 0]} = \frac{L_q}{\sum_{n=2}^{\infty} p_n}
                   = \frac{L_q}{1 - (p_0 + p_1)} = \frac{L_q}{1 - p_0 - \rho p_0} = \frac{L_q}{1 - p_0(1 + \rho)}
                   = \frac{1}{1 - \rho} = \frac{\mu}{\mu - \lambda}.
Note: From the preceding characteristics, we can write L_s = L_q + λ/μ, L_s = λW_s, L_q = λW_q and W_s = W_q + 1/μ. These relations are known as Little's formulas.
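Little's formulas tie the four measures together. A helper for the (M/M/1):(∞/FCFS/∞) model (the function name is ours) makes the identities easy to verify:

```python
def mm1_metrics(lam, mu):
    """Steady-state measures of an (M/M/1):(inf/FCFS/inf) system.

    Requires lam < mu (the stability condition rho < 1).
    """
    if lam >= mu:
        raise ValueError("no steady state: lam must be less than mu")
    return {
        "Ls": lam / (mu - lam),              # expected number in the system
        "Lq": lam ** 2 / (mu * (mu - lam)),  # expected number in the queue
        "Ws": 1 / (mu - lam),                # expected time in the system
        "Wq": lam / (mu * (mu - lam)),       # expected time in the queue
    }
```

With the rates of Example 1 below (λ = 1/10, μ = 1/3 per minute), this gives Ls = 3/7 ≈ 0.43, as found in part (d) there.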
Example 1 Arrivals at a telephone booth are considered to have a Poisson distribution with an average time of 10 min between one arrival and the next. The length of a phone call is assumed to be distributed exponentially with mean 3 min. (a) What is the probability that a person arriving at the booth will have to wait? (b) What is the average length of the queues that form from time to time? (c) The telephone company will install a second booth when convinced that an arrival would expect to have to wait at least 3 min for the phone. By how much must the flow of arrivals increase to justify a second booth? (d) Find the average number of units in the system. (e) What is the probability that an arrival has to wait more than 10 min before the phone is free? (f) Estimate the fraction of a day that the phone will be in use (or busy).

Solution This is an (M/M/1):(∞/FCFS/∞) problem. In this problem, the mean arrival rate is $\lambda = \dfrac{1}{10}$ per minute and the mean service rate is $\mu = \dfrac{1}{3}$ per minute (mean service time $1/\mu = 3$ min).

(a) The probability that a person arriving at the telephone booth will have to wait is

$$\text{prob}[w>0] = 1-\text{prob}(w=0) = 1-p_0 = 1-(1-\rho) = \rho = \frac{\lambda}{\mu} = 0.3.$$

(b) The average length of the queues that form from time to time is

$$(L\mid L>0) = \frac{\mu}{\mu-\lambda} = \frac{1/3}{1/3-1/10} = \frac{10}{7} \approx 1.43 \text{ persons.}$$
(c) The installation of a second booth will be justified if, at the increased arrival rate, an arrival expects to wait at least 3 min for the phone; beyond that point the queue keeps growing. Let $\lambda'$ be the increased mean arrival rate. We know that the expected waiting time of an arrival in the queue is $W_q = \dfrac{\lambda'}{\mu(\mu-\lambda')}$. Here $W_q = 3$ and $\mu = \dfrac{1}{3}$, so

$$3 = \frac{\lambda'}{\frac{1}{3}\left(\frac{1}{3}-\lambda'\right)} \quad\Rightarrow\quad \frac{1}{3}-\lambda' = \lambda' \quad\Rightarrow\quad \lambda' = \frac{1}{6}.$$
15.3
Poisson Queueing Models
427
Hence, the increase in the mean arrival rate is $\lambda'-\lambda = \dfrac{1}{6}-\dfrac{1}{10} = \dfrac{1}{15} \approx 0.067$ arrival per minute. Again, the increase in the flow of arrivals is $\dfrac{1}{15}\cdot 60 = 4$ per hour. Therefore, the second booth is justified if the arrival rate increases by 0.067 arrival per minute (or 4 persons per hour).

(d) The average number of units in the system is $L_s = \dfrac{\lambda}{\mu-\lambda} = \dfrac{1/10}{1/3-1/10} = \dfrac{3}{7} \approx 0.43$ persons.

(e) prob[waiting time of an arrival in the queue > 10]

$$= \int_{10}^{\infty}\psi(w)\,dw = \int_{10}^{\infty}\frac{\lambda}{\mu}(\mu-\lambda)e^{-(\mu-\lambda)w}\,dw = \frac{\lambda}{\mu}(\mu-\lambda)\lim_{B\to\infty}\left[\frac{e^{-(\mu-\lambda)w}}{-(\mu-\lambda)}\right]_{10}^{B}$$

$$= \frac{\lambda}{\mu}\left[0+e^{-10(\mu-\lambda)}\right] = 0.3\,e^{-7/3} \approx 0.03 \text{ (approx.)}.$$

(f) The fraction of a day that the phone will be busy = traffic intensity = $\rho = \dfrac{\lambda}{\mu} = 0.3$.

Example 2 In a railway yard, goods trains arrive at a rate of 30 trains per day. Assume that the inter-arrival time follows an exponential distribution, and that the service time (the time taken to hump a train) is also exponentially distributed with an average of 36 min. Calculate the following:

(i) The average number of trains in the system
(ii) The probability that the number of trains in the system exceeds 10
(iii) The expected waiting time in the queue
(iv) The average number of trains in the queue
(v) If the input of trains increases to an average of 33 per day, what will be the change in (i) and (ii)?
Solution This is an (M/M/1):(∞/FCFS/∞) problem. In this problem, the mean arrival rate is $\lambda = 30$ trains/day $= \dfrac{30}{60\cdot 24} = \dfrac{1}{48}$ train/min, and the mean service rate is $\mu = \dfrac{1}{36}$ train/min.

Hence, the traffic intensity is $\rho = \dfrac{\lambda}{\mu} = \dfrac{1/48}{1/36} = \dfrac{36}{48} = \dfrac{3}{4} = 0.75$.

(i) The average number of trains in the system is given by $L_s = \dfrac{\rho}{1-\rho} = \dfrac{0.75}{0.25} = 3$ trains.

(ii) The probability that the number of trains in the system is 10 or more is $\text{prob}(n\ge 10) = \rho^{10} = (0.75)^{10} \approx 0.056$.

(iii) The expected waiting time in the queue is given by $W_q = \dfrac{\lambda}{\mu(\mu-\lambda)} = \dfrac{1/48}{\frac{1}{36}\left(\frac{1}{36}-\frac{1}{48}\right)}$ min $= 108$ min, or 1 h 48 min.
(iv) The average number of trains in the queue is given by $L_q = \dfrac{\lambda^2}{\mu(\mu-\lambda)} = \dfrac{(1/48)^2}{\frac{1}{36}\left(\frac{1}{36}-\frac{1}{48}\right)} = \dfrac{9}{4} = 2.25$, or nearly 2 trains.

(v) If the input of trains increases to 33 trains per day, then $\lambda = \dfrac{33}{60\cdot 24} = \dfrac{11}{480}$ train/min. Hence $\rho = \dfrac{\lambda}{\mu} = \dfrac{11/480}{1/36} = \dfrac{33}{40} = 0.825$.

In this case, $L_s = \dfrac{\rho}{1-\rho} = \dfrac{33/40}{7/40} = \dfrac{33}{7} \approx 5$ trains (approx.), and $\text{prob}(n\ge 10) = \rho^{10} = \left(\dfrac{33}{40}\right)^{10} \approx 0.15$ (approx.).
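The arithmetic of Example 2 can be cross-checked with a short script (a sketch; the variable names are ours, not from the text):

```python
# Example 2 in the (M/M/1) formulas; all rates are per minute.
lam = 30 / (24 * 60)     # 30 trains/day
mu = 1 / 36              # mean humping time 36 min
rho = lam / mu
Ls = rho / (1 - rho)                 # (i)  average number of trains in the system
p_ge_10 = rho ** 10                  # (ii) P(10 or more trains in the system)
Wq = lam / (mu * (mu - lam))         # (iii) expected waiting time in the queue, min
Lq = lam ** 2 / (mu * (mu - lam))    # (iv) average number of trains in the queue

assert abs(Ls - 3) < 1e-6
assert abs(Wq - 108) < 1e-6
assert abs(Lq - 2.25) < 1e-6
assert abs(p_ge_10 - 0.0563) < 1e-3
```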
15.3.3 Model (M/M/1):(N/FCFS/∞)

Assumptions In this model, we shall discuss a single server finite queueing system under the following assumptions:

(i) Customers arrive in a Poisson fashion.
(ii) The service rate follows a Poisson distribution.
(iii) The queueing system has only a single service channel.
(iv) The service discipline is FCFS (first come, first served).
(v) The maximum number of customers in the system is limited to N.
If a new customer arrives when N persons are already in the system, then the customer is restricted from entering and is said to be lost from the system. This is often referred to as blocking.

Model Formulation Let n be the number of customers in the system, and let λ and μ be the mean arrival and service rates of units respectively. In this model,

$$\lambda_n = \begin{cases}\lambda & \text{when } n<N,\ \text{i.e. } n=0,1,2,\ldots,N-1 \\ 0 & \text{when } n\ge N\end{cases}$$

and $\mu_n = \mu$ (independent of n).

Therefore, the probability of one arrival during Δt is $\lambda_n\Delta t + O(\Delta t)$ for $n\le N$, and the probability of one departure or service during Δt is $\mu\Delta t + O(\Delta t)$.

To have 0 (zero) customers in the system at time $t+\Delta t$, there may be the following mutually exclusive cases:

(i) No customer in the system at time t, no arrival in time Δt
(ii) One customer in the system at time t, no arrival in time Δt, one service in time Δt, and so on.
Therefore, the probability of no customer in the system at time $t+\Delta t$ is given by

$$p_0(t+\Delta t) = p_0(t)(1-\lambda\Delta t) + p_1(t)(1-\lambda\Delta t)\mu\Delta t + O(\Delta t) = p_0(t) - \lambda p_0(t)\Delta t + \mu p_1(t)\Delta t + O(\Delta t)$$

or

$$\frac{p_0(t+\Delta t)-p_0(t)}{\Delta t} = -\lambda p_0(t) + \mu p_1(t) + \frac{O(\Delta t)}{\Delta t} \quad \text{for } n=0. \tag{15.17}$$

To have n (>0) customers in the system at time $t+\Delta t$, there may be the following mutually exclusive cases:

(i) n customers in the system at time t, no arrival in time Δt, no service in time Δt
(ii) (n − 1) customers in the system at time t, one arrival in time Δt, no service in time Δt
(iii) (n + 1) customers in the system at time t, no arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of n customers in the system at time $(t+\Delta t)$ is given by

$$p_n(t+\Delta t) = p_{n-1}(t)\lambda_{n-1}\Delta t(1-\mu\Delta t) + p_n(t)(1-\lambda_n\Delta t)(1-\mu\Delta t) + p_{n+1}(t)(1-\lambda_{n+1}\Delta t)\mu\Delta t + O(\Delta t)$$

or

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = \lambda_{n-1}p_{n-1}(t) - (\lambda_n+\mu)p_n(t) + \mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.18}$$

When $n<N$, then $\lambda_{n-1}=\lambda$, $\lambda_n=\lambda$. Hence, from (15.18), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = \lambda p_{n-1}(t) - (\lambda+\mu)p_n(t) + \mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.19}$$

When $n=N$, $p_{N+1}=0$, $\lambda_{N-1}=\lambda$, $\lambda_N=0$. Hence, from (15.18), we have

$$\frac{p_N(t+\Delta t)-p_N(t)}{\Delta t} = \lambda p_{N-1}(t) - \mu p_N(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.20}$$

Taking the limit as $\Delta t\to 0$ in Eqs. (15.17), (15.19) and (15.20), we have

$$p_0'(t) = -\lambda p_0(t) + \mu p_1(t) \quad \text{for } n=0$$
$$p_n'(t) = \lambda p_{n-1}(t) - (\lambda+\mu)p_n(t) + \mu p_{n+1}(t) \quad \text{for } 0<n<N$$
$$p_N'(t) = \lambda p_{N-1}(t) - \mu p_N(t) \quad \text{for } n=N.$$
Under a steady state condition of the system, these equations reduce to

$$-\lambda p_0 + \mu p_1 = 0 \quad \text{for } n=0 \tag{15.21}$$
$$\lambda p_{n-1} - (\lambda+\mu)p_n + \mu p_{n+1} = 0 \quad \text{for } 1\le n<N \tag{15.22}$$
$$\lambda p_{N-1} - \mu p_N = 0 \quad \text{for } n=N. \tag{15.23}$$

Solution of the Model From (15.21), we have $p_1 = \dfrac{\lambda}{\mu}p_0 = \rho p_0$, where $\rho = \dfrac{\lambda}{\mu}$.

Setting n = 1 in (15.22), we have $\lambda p_0 - (\lambda+\mu)p_1 + \mu p_2 = 0$,

or $p_2 = -\dfrac{\lambda}{\mu}p_0 + \left(1+\dfrac{\lambda}{\mu}\right)p_1 = \rho p_1 = \rho^2 p_0$.

Setting n = 2 in (15.22), we have $\lambda p_1 - (\lambda+\mu)p_2 + \mu p_3 = 0$

$$\Rightarrow\ p_3 = -\frac{\lambda}{\mu}p_1 + \left(1+\frac{\lambda}{\mu}\right)p_2 = \rho p_2 = \rho^3 p_0.$$

Proceeding in this way, we have

$$p_n = \left(\frac{\lambda}{\mu}\right)^n p_0 = \rho^n p_0 \quad \text{for } 1\le n<N \tag{15.24}$$

$$\Rightarrow\ p_{N-1} = \left(\frac{\lambda}{\mu}\right)^{N-1} p_0 = \rho^{N-1}p_0.$$

From (15.23), we have $p_N = \dfrac{\lambda}{\mu}p_{N-1} = \rho^N p_0$

$$\Rightarrow\ p_n = \rho^n p_0 \quad \text{for } 0\le n\le N. \tag{15.25}$$

Since the capacity of the system is N,
$$\sum_{n=0}^{N}p_n = 1$$

or

$$p_0\,\frac{1-\rho^{N+1}}{1-\rho} = 1 \ \text{ for } \rho\ne 1 \qquad \text{and} \qquad (N+1)p_0 = 1 \ \text{ for } \rho=1$$

or

$$p_0 = \begin{cases}\dfrac{1-\rho}{1-\rho^{N+1}} & \text{for } \rho\ne 1 \\[6pt] \dfrac{1}{N+1} & \text{for } \rho=1.\end{cases}$$

From (15.25), we have

$$p_n = \rho^n p_0 = \begin{cases}\dfrac{(1-\rho)\rho^n}{1-\rho^{N+1}} & \text{for } 0\le n\le N \text{ and } \rho\ne 1 \\[6pt] \dfrac{1}{N+1} & \text{for } 0\le n\le N \text{ and } \rho=1.\end{cases}$$
Note (i) This solution holds for any value of ρ, as the number of customers allowed in the system is controlled by the finite capacity N (maximum queue length N − 1), not by the relative values of the arrival and departure rates λ and μ. So in this model we do not require the condition ρ < 1, and hence the system reaches a steady state for ρ < 1 as well as for ρ ≥ 1.

(ii) λ_eff is the effective arrival rate in the system, i.e. the mean rate of customers entering the system. It equals the arrival rate λ when all arriving customers can join the system; otherwise, λ_eff < λ. Since the probability that a customer does not join the system is $p_N$, the probability that a customer joins the system is $(1-p_N)$. Hence, $\lambda_{\text{eff}} = \lambda(1-p_N)$.

(iii) In this model, customers arrive at the rate λ, but not all arrivals can join the system. So the equations $L_s = \lambda W_s$ and $L_q = \lambda W_q$ are modified by redefining λ to include only those customers who actually join the system. Hence, for this model, $L_s = \lambda_{\text{eff}}W_s$ and $L_q = \lambda_{\text{eff}}W_q$, where $\lambda_{\text{eff}}$ = effective arrival rate for those who join the system.
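The finite-capacity distribution and the effective arrival rate can be sketched in a few lines of Python (the function name `mm1n_probs` is ours, not from the text):

```python
def mm1n_probs(lam, mu, N):
    """[p_0, ..., p_N] for the (M/M/1):(N/FCFS/inf) model."""
    rho = lam / mu
    if rho == 1:
        return [1 / (N + 1)] * (N + 1)
    p0 = (1 - rho) / (1 - rho ** (N + 1))
    return [p0 * rho ** n for n in range(N + 1)]

# Illustration with lam = 5/h, mu = 6/h, N = 6 (the car-wash data of Example 3 below):
p = mm1n_probs(lam=5, mu=6, N=6)
lam_eff = 5 * (1 - p[-1])           # effective arrival rate lam*(1 - p_N)
assert abs(sum(p) - 1) < 1e-12      # the probabilities sum to one
assert abs(p[-1] - 0.0774) < 5e-4   # blocking probability p_N
```

Note that the function accepts ρ ≥ 1 as well, in line with note (i).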
Characteristics of the Model

(i) Expected line length, i.e. average number of customers in the system The expected line length, i.e. the average number of customers in the system, is given by

$$L_s = E(n) = \sum_{n=0}^{N}np_n = \frac{1-\rho}{1-\rho^{N+1}}\sum_{n=0}^{N}n\rho^n = \frac{1-\rho}{1-\rho^{N+1}}\left(\rho+2\rho^2+3\rho^3+\cdots+N\rho^N\right) \quad \text{when } \rho\ne 1.$$

[Let $S = \rho+2\rho^2+3\rho^3+\cdots+N\rho^N$. Then $\rho S = \rho^2+2\rho^3+\cdots+N\rho^{N+1}$. Subtracting the second equation from the first, we have

$$S(1-\rho) = \rho+\rho^2+\cdots+\rho^N - N\rho^{N+1} = \frac{\rho(1-\rho^N)}{1-\rho} - N\rho^{N+1}$$

or

$$S = \frac{\rho(1-\rho^N)}{(1-\rho)^2} - \frac{N\rho^{N+1}}{1-\rho}.]$$

Hence

$$L_s = \frac{1-\rho}{1-\rho^{N+1}}\,S = \left[\frac{\rho(1-\rho^N)}{1-\rho} - N\rho^{N+1}\right]\bigg/\left(1-\rho^{N+1}\right) = \frac{\rho-\rho^{N+1}-N\rho^{N+1}+N\rho^{N+2}}{(1-\rho)(1-\rho^{N+1})}$$

$$= \frac{\rho\left[1-(N+1)\rho^N+N\rho^{N+1}\right]}{(1-\rho)(1-\rho^{N+1})} \quad \text{when } \rho\ne 1.$$

For $\rho=1$,

$$L_s = \sum_{n=0}^{N}np_n = \sum_{n=0}^{N}\frac{n}{N+1} = \frac{1}{N+1}(1+2+\cdots+N) = \frac{N}{2}$$

$$\left[\text{for } \rho=1,\ p_n = \frac{1}{N+1} \ \text{and}\ \sum_{n=0}^{N}p_n = 1\right].$$
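The closed form for $L_s$ can be checked against the direct sum $\sum n p_n$; here is a minimal sketch with arbitrary illustrative values:

```python
# Verify the closed form Ls against the direct sum for the finite-capacity model.
rho, N = 0.8, 12     # illustrative values with rho != 1
p0 = (1 - rho) / (1 - rho ** (N + 1))
Ls_direct = sum(n * p0 * rho ** n for n in range(N + 1))
Ls_closed = (rho * (1 - (N + 1) * rho ** N + N * rho ** (N + 1))
             / ((1 - rho) * (1 - rho ** (N + 1))))
assert abs(Ls_direct - Ls_closed) < 1e-12
```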
(ii) Average number of customers in the queue The average number of customers in the queue is

$$L_q = \sum_{n=1}^{N}(n-1)p_n = \sum_{n=1}^{N}np_n - \sum_{n=1}^{N}p_n = \sum_{n=0}^{N}np_n - \left(\sum_{n=0}^{N}p_n - p_0\right) = L_s - 1 + p_0 \qquad \left[\because\ L_s = \sum_{n=0}^{N}np_n\right]$$

$$= \frac{\rho\left[1-(N+1)\rho^N+N\rho^{N+1}\right]}{(1-\rho)(1-\rho^{N+1})} - 1 + \frac{1-\rho}{1-\rho^{N+1}} = \frac{\rho^2\left[1-N\rho^{N-1}+(N-1)\rho^N\right]}{(1-\rho)(1-\rho^{N+1})}.$$

(iii) Waiting time in the queue The waiting time in the queue is
$$W_q = \frac{L_q}{\lambda_{\text{eff}}} = \frac{L_q}{\lambda(1-p_N)} \qquad [\because\ \lambda_{\text{eff}} = \lambda(1-p_N)]$$

$$= \frac{L_q}{\lambda\left[1-\dfrac{(1-\rho)\rho^N}{1-\rho^{N+1}}\right]} = \frac{\rho^2\left[1-N\rho^{N-1}+(N-1)\rho^N\right]}{\lambda(1-\rho^N)(1-\rho)}.$$

(iv) Waiting time in the system The waiting time in the system is given by

$$W_s = W_q + \frac{1}{\mu} \quad \text{or} \quad W_s = \frac{L_s}{\lambda_{\text{eff}}}.$$
Special case: If $\rho<1$ and $N\to\infty$, then the steady state solution is $p_n = (1-\rho)\rho^n$, and the resulting system reduces to the model $(M/M/1){:}(\infty/\text{FCFS}/\infty)$.

Example 3 In a car wash service facility, information gathered indicates that cars arrive for service according to a Poisson distribution with mean 5 per hour. The time for washing and cleaning each car varies but is found to follow an exponential distribution with mean 10 min per car. The facility cannot handle more than one car at a time and has a total of 5 parking spaces. If the parking lot is full, newly arriving cars balk and seek service elsewhere.

(a) How many customers is the manager of the facility losing due to the limited parking space?
(b) What is the expected waiting time until a car is washed?

Solution In this system, there may be 5 cars in the 5 parking spots while one is being serviced; i.e. the capacity of the system is N = 5 + 1 = 6. Hence, this problem is of the model (M/M/1):(6/FCFS/∞).

Here λ = 5 per hour and μ = 60/10 = 6 per hour, so $\rho = \dfrac{\lambda}{\mu} = \dfrac{5}{6}$.

Now $p_N = p_6 = \dfrac{1-\rho}{1-\rho^{7}}\,\rho^6 = 0.0774$ $\left[\text{since } p_N = \dfrac{1-\rho}{1-\rho^{N+1}}\,\rho^N\right]$.

(a) Therefore, the rate at which the cars balk is $\lambda-\lambda_{\text{eff}} = \lambda-\lambda(1-p_N) = \lambda p_6 = 5\times 0.0774 = 0.387$ car/h. Assuming 8 working hours per day, the manager will lose $0.387\times 8 = 3.096 \approx 3$ cars a day.
(b) The expected waiting time until a car is washed is given by $W_s = \dfrac{L_s}{\lambda_{\text{eff}}}$.

Here $L_s = \dfrac{\rho\left[1-(N+1)\rho^N+N\rho^{N+1}\right]}{(1-\rho)(1-\rho^{N+1})} = 2.29$ cars, and $\lambda_{\text{eff}} = \lambda(1-p_N) = \lambda(1-p_6) = 4.613$ cars/h. Hence, $W_s = \dfrac{L_s}{\lambda_{\text{eff}}} = 0.496$ h.
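Example 3(b) can be reproduced numerically; this is a sketch (the variable names are ours, not from the text):

```python
# Example 3 with the finite-capacity (M/M/1):(N/FCFS/inf) formulas; rates per hour.
lam, mu, N = 5.0, 6.0, 6
rho = lam / mu
p0 = (1 - rho) / (1 - rho ** (N + 1))
pN = p0 * rho ** N
Ls = (rho * (1 - (N + 1) * rho ** N + N * rho ** (N + 1))
      / ((1 - rho) * (1 - rho ** (N + 1))))
lam_eff = lam * (1 - pN)
Ws = Ls / lam_eff                    # expected time in the system, in hours

assert abs(Ls - 2.29) < 5e-3
assert abs(lam_eff - 4.613) < 5e-4
assert abs(Ws - 0.496) < 5e-3
```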
15.3.4 Model (M/M/c):(∞/FCFS/∞)

Assumptions In this model, we shall discuss a multiserver queueing system under the following assumptions:

(i) The arrivals follow a Poisson distribution.
(ii) The queue is served by parallel service channels, and the service times of the servers are independently and identically exponentially distributed.
(iii) The queueing system has c (>1) service channels.
(iv) The service discipline is first come, first served.
(v) The capacity of the system is infinite.
(vi) The length of the waiting line will depend on the number of occupied channels.

Model Formulation Let n be the number of customers in the system. By assumption (i), the mean arrival rate is $\lambda_n = \lambda$ for all n. For the mean service rate: if there are more than c customers in the system (n > c), all c servers remain busy, each providing service at a mean rate μ, while (n − c) customers wait in the queue; thus, the mean service rate is cμ. If n = c, all service channels are busy, each providing service at mean rate μ, and the mean rate of service is again cμ. If there are fewer than c customers in the system (n < c), only n of the c servers are busy and (c − n) servers remain idle; there is no customer waiting in the queue, and the mean service rate is nμ. Hence, for this model,

$$\lambda_n = \lambda,\quad n=0,1,2,\ldots; \qquad \mu_n = \begin{cases}n\mu, & 0\le n<c \\ c\mu, & n\ge c;\end{cases}$$
i.e. the probability of one arrival during Δt is $\lambda\Delta t + O(\Delta t)$, and the probability of one departure during Δt is $n\mu\Delta t + O(\Delta t)$ if $0<n<c$ and $c\mu\Delta t + O(\Delta t)$ if $n\ge c$.

To have zero customers in the system during time $t+\Delta t$, there may be the following mutually exclusive cases:

(i) No customer in the system at time t, no arrival in time Δt
(ii) One customer in the system at time t, no arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of no customer in the system at time $(t+\Delta t)$ is given by

$$p_0(t+\Delta t) = p_0(t)(1-\lambda\Delta t) + p_1(t)(1-\lambda\Delta t)\mu\Delta t + O(\Delta t)$$

or $p_0(t+\Delta t)-p_0(t) = -\lambda p_0(t)\Delta t + \mu p_1(t)\Delta t + O(\Delta t)$

or

$$\frac{p_0(t+\Delta t)-p_0(t)}{\Delta t} = -\lambda p_0(t) + \mu p_1(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.26}$$

To have n customers in the system at time $t+\Delta t$, there may be the following mutually exclusive cases:

(i) n customers in the system at time t, no arrival in time Δt, no service in time Δt
(ii) (n − 1) customers in the system at time t, one arrival in time Δt, no service in time Δt
(iii) (n + 1) customers in the system at time t, no arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of n customers in the system at time $(t+\Delta t)$ is given by

$$p_n(t+\Delta t) = p_n(t)(1-\lambda\Delta t)(1-\mu_n\Delta t) + p_{n-1}(t)\lambda\Delta t(1-\mu_{n-1}\Delta t) + p_{n+1}(t)(1-\lambda\Delta t)\mu_{n+1}\Delta t + O(\Delta t)$$

or

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = \lambda p_{n-1}(t) - (\lambda+\mu_n)p_n(t) + \mu_{n+1}p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.27}$$

(a) When $0<n<c$, then $\mu_n = n\mu$, $\mu_{n+1} = (n+1)\mu$. Hence, from (15.27), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = \lambda p_{n-1}(t) - (\lambda+n\mu)p_n(t) + (n+1)\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.28}$$
(b) When $n\ge c$, then $\mu_n = c\mu$, $\mu_{n+1} = c\mu$. Hence, from (15.27), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = \lambda p_{n-1}(t) - (\lambda+c\mu)p_n(t) + c\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.29}$$

Now taking the limit as $\Delta t\to 0$, the differential difference equations for this system obtained from (15.26), (15.28) and (15.29) are as follows:

$$p_0'(t) = -\lambda p_0(t) + \mu p_1(t) \quad \text{for } n=0$$
$$p_n'(t) = \lambda p_{n-1}(t) - (\lambda+n\mu)p_n(t) + (n+1)\mu p_{n+1}(t) \quad \text{for } 0<n<c$$
$$p_n'(t) = \lambda p_{n-1}(t) - (\lambda+c\mu)p_n(t) + c\mu p_{n+1}(t) \quad \text{for } n\ge c.$$

Under the steady state condition of the system, i.e. $\lim_{t\to\infty}p_n(t) = p_n$ and $\lim_{t\to\infty}p_n'(t) = 0$, the preceding three equations reduce to

$$-\lambda p_0 + \mu p_1 = 0 \quad \text{for } n=0 \tag{15.30}$$
$$\lambda p_{n-1} - (\lambda+n\mu)p_n + (n+1)\mu p_{n+1} = 0 \quad \text{for } 0<n<c \tag{15.31}$$
$$\lambda p_{n-1} - (\lambda+c\mu)p_n + c\mu p_{n+1} = 0 \quad \text{for } n\ge c, \tag{15.32}$$

which are the steady state difference equations of the system.

Solution From (15.30), $-\lambda p_0 + \mu p_1 = 0$, or $\mu p_1 = \lambda p_0$, or $p_1 = \dfrac{\lambda}{\mu}p_0$.

Now setting n = 1 in (15.31), we have $\lambda p_0 - (\lambda+\mu)p_1 + 2\mu p_2 = 0$

or

$$p_2 = \frac{\lambda+\mu}{2\mu}p_1 - \frac{\lambda}{2\mu}p_0 = \frac{1}{2!}\left(\frac{\lambda}{\mu}\right)^2 p_0 \qquad \left[\because\ p_1 = \frac{\lambda}{\mu}p_0\right].$$

Again setting n = 2 in (15.31), we have
$$\lambda p_1 - (\lambda+2\mu)p_2 + 3\mu p_3 = 0$$

or

$$p_3 = \frac{\lambda+2\mu}{3\mu}p_2 - \frac{\lambda}{3\mu}p_1 = \frac{1}{3!}\left(\frac{\lambda}{\mu}\right)^3 p_0.$$

Hence, we have $p_n = \dfrac{1}{n!}\left(\dfrac{\lambda}{\mu}\right)^n p_0$ for $1\le n<c$

$$\Rightarrow\ p_{c-1} = \frac{1}{(c-1)!}\left(\frac{\lambda}{\mu}\right)^{c-1}p_0 = \frac{1}{c-1}\cdot\frac{\lambda}{\mu}\,p_{c-2}.$$

Now setting n = c − 1 in (15.31), we have

$$\lambda p_{c-2} - [\lambda+(c-1)\mu]p_{c-1} + c\mu p_c = 0$$

or

$$p_c = \frac{\lambda+(c-1)\mu}{c\mu}p_{c-1} - \frac{\lambda}{c\mu}p_{c-2} = \frac{\lambda}{c\mu}p_{c-1} + \frac{c-1}{c}p_{c-1} - \frac{c-1}{c}p_{c-1} \quad [\text{by (15.8)}]$$

$$= \frac{\lambda}{c\mu}p_{c-1} = \frac{1}{c!}\left(\frac{\lambda}{\mu}\right)^c p_0 = \frac{(c\rho)^c}{c!}p_0, \quad \text{where } \rho = \frac{\lambda}{c\mu}.$$

Setting n = c in (15.32), we have

$$\lambda p_{c-1} - (\lambda+c\mu)p_c + c\mu p_{c+1} = 0$$

or

$$p_{c+1} = \frac{\lambda+c\mu}{c\mu}p_c - \frac{\lambda}{c\mu}p_{c-1} = \frac{\lambda}{c\mu}p_c + p_c - p_c \qquad \left[\because\ p_c = \frac{\lambda}{c\mu}p_{c-1}\right]$$

$$= \frac{\lambda}{c\mu}p_c = \frac{1}{c\cdot c!}\left(\frac{\lambda}{\mu}\right)^{c+1}p_0.$$

Similarly, $p_{c+2} = \dfrac{1}{c^2\,c!}\left(\dfrac{\lambda}{\mu}\right)^{c+2}p_0$.

In general,

$$p_n = \frac{1}{c^{\,n-c}\,c!}\left(\frac{\lambda}{\mu}\right)^n p_0 = \frac{c^c}{c^n\,c!}\left(\frac{\lambda}{\mu}\right)^n p_0 \quad \text{for } n\ge c. \tag{15.33}$$
Therefore,

$$p_n = \frac{1}{n!}\left(\frac{\lambda}{\mu}\right)^n p_0 \ \text{ for } 1\le n<c \qquad \text{and} \qquad p_n = \frac{c^c}{c^n\,c!}\left(\frac{\lambda}{\mu}\right)^n p_0 \ \text{ for } n\ge c.$$

Determination of $p_0$: We have $\sum_{n=0}^{\infty}p_n = 1$

or $p_0 + \sum_{n=1}^{c-1}p_n + \sum_{n=c}^{\infty}p_n = 1$

or $\sum_{n=0}^{c-1}\dfrac{(c\rho)^n}{n!}p_0 + \dfrac{c^c}{c!}\sum_{n=c}^{\infty}\rho^n p_0 = 1$, where $\rho = \dfrac{\lambda}{c\mu}$,

or

$$p_0 = \left[\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!}\cdot\frac{1}{1-\rho}\right]^{-1}. \tag{15.34}$$
Hence,

$$p_n = \begin{cases}\dfrac{(c\rho)^n}{n!}\,p_0, & 0\le n<c \\[6pt] \dfrac{c^c\rho^n}{c!}\,p_0, & n\ge c.\end{cases}$$

Characteristics of the Model

(i) Average queue length When n > c, a queue of n customers consists of c customers being served together with a genuine queue of (n − c) waiting customers. Hence,

$$L_q = \sum_{n=c+1}^{\infty}(n-c)p_n = \sum_{n=c+1}^{\infty}(n-c)\frac{c^c}{c!}\rho^n p_0 \qquad \left[\because\ p_n = \frac{c^c}{c!}\rho^n p_0,\ \rho = \frac{\lambda}{c\mu}\right]$$

$$= \frac{c^c\rho^c}{c!}p_0\sum_{n=c+1}^{\infty}(n-c)\rho^{\,n-c} = \frac{(c\rho)^c}{c!}p_0\left(\rho+2\rho^2+3\rho^3+\cdots\right) = p_c\,\frac{\rho}{(1-\rho)^2}$$

$$\Rightarrow\ L_q = \frac{\rho\,p_c}{(1-\rho)^2}.$$

Particular case: When c = 1, i.e. for a system with one channel only,

$$L_q = \frac{\rho}{(1-\rho)^2}\,p_1 = \frac{\rho}{(1-\rho)^2}\,\rho(1-\rho) = \frac{\rho^2}{1-\rho}.$$
(ii) Average number of customers in the system We know that $L_s = L_q + \dfrac{\lambda}{\mu}$.

$$\therefore\ L_s = \frac{\rho\,p_c}{(1-\rho)^2} + \frac{\lambda}{\mu} = \frac{\rho\,p_c}{(1-\rho)^2} + c\rho, \quad \text{where } \rho = \frac{\lambda}{c\mu}.$$

(iii) Average waiting time The average waiting time in the system is

$$W_s = \frac{\text{average number of customers in the system}}{\text{rate of arrival}} = \frac{L_s}{\lambda} = \frac{\rho\,p_c/(1-\rho)^2 + c\rho}{\lambda} = \frac{\rho\,p_c}{\lambda(1-\rho)^2} + \frac{c\rho}{\lambda}.$$
The average waiting time in the queue is

$$W_q = \frac{\text{average number of customers in the queue}}{\text{rate of arrival}} = \frac{L_q}{\lambda} = \frac{\rho\,p_c}{\lambda(1-\rho)^2}.$$

(iv) Expected length of non-empty queue, i.e. to find $L\mid L>0$

$$(L\mid L>0) = \frac{L_q}{\text{prob}[\text{an arrival has to wait}]} = \frac{\rho p_c/(1-\rho)^2}{\text{prob}(n>c)} = \frac{\rho p_c/(1-\rho)^2}{\displaystyle\sum_{n=c+1}^{\infty}\frac{c^c}{c!}\rho^n p_0}$$

$$= \frac{\rho p_c/(1-\rho)^2}{p_c\left(\rho+\rho^2+\rho^3+\cdots\right)} = \frac{\rho/(1-\rho)^2}{\rho/(1-\rho)} = \frac{1}{1-\rho}.$$

(v) Probability that a customer has to wait

$$\text{prob}(n\ge c) = \sum_{n=c}^{\infty}p_n = \frac{c^c}{c!}p_0\sum_{n=c}^{\infty}\rho^n = \frac{(c\rho)^c}{c!}\,p_0\left(1+\rho+\rho^2+\cdots\right) = \frac{(c\rho)^c}{c!}\cdot\frac{p_0}{1-\rho} = \frac{p_c}{1-\rho}$$

$$\left[\text{since } p_c = \frac{(c\rho)^c}{c!}\,p_0\right].$$

(vi) Probability that on arrival a customer will not have to wait

$$= 1 - \text{prob}(n\ge c) = 1 - \frac{p_c}{1-\rho}.$$

Alternatively, we can find this probability using the formula $\text{prob}(n<c) = \sum_{n=0}^{c-1}p_n$.
The probability that all channels will be occupied is $\text{prob}(n\ge c) = \dfrac{p_c}{1-\rho}$.
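These multiserver formulas can be packaged as a small calculator; the following is a sketch (the function name `mmc_metrics` is ours, not from the text), and the checks at the end use the data of Example 4 below (λ = 1/4 per min, μ = 1/5 per min, c = 2):

```python
import math

def mmc_metrics(lam, mu, c):
    """Steady-state measures of an (M/M/c):(inf/FCFS/inf) queue; needs rho < 1."""
    rho = lam / (c * mu)
    assert rho < 1, "steady state requires rho = lam/(c*mu) < 1"
    p0 = 1 / (sum((c * rho) ** n / math.factorial(n) for n in range(c))
              + (c * rho) ** c / (math.factorial(c) * (1 - rho)))
    pc = (c * rho) ** c / math.factorial(c) * p0
    wait_prob = pc / (1 - rho)           # P(an arrival has to wait) = P(n >= c)
    Lq = rho * pc / (1 - rho) ** 2       # average queue length
    Wq = Lq / lam                        # average waiting time in the queue
    Ls = Lq + lam / mu                   # average number in the system
    return p0, wait_prob, Lq, Wq, Ls

p0, wait_prob, Lq, Wq, Ls = mmc_metrics(lam=0.25, mu=0.2, c=2)
assert abs(p0 - 3 / 13) < 1e-12
assert abs(wait_prob - 25 / 52) < 1e-12
```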
Example 4 A telephone exchange has two long-distance operators. The telephone company finds that, during the peak load, long-distance calls arrive in a Poisson fashion at an average rate of 15 per hour. The length of service on these calls is approximately exponentially distributed with mean length 5 min.

(a) What is the probability that a subscriber will have to wait for a long-distance call during the peak hours of the day?
(b) If the subscriber waits and is serviced in turn, what is the expected waiting time?

Solution This problem is of the model (M/M/c):(∞/FCFS/∞).

Here c = 2, $\lambda = 15$ calls/h $= \dfrac{15}{60} = \dfrac{1}{4}$ call/min and $\mu = \dfrac{1}{5}$ call/min, so that

$$\rho = \frac{\lambda}{c\mu} = \frac{1/4}{2\cdot(1/5)} = \frac{5}{8} \qquad \text{and} \qquad c\rho = 2\cdot\frac{5}{8} = \frac{5}{4}.$$

Now,

$$p_0 = \left[\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!\,(1-\rho)}\right]^{-1} = \left[1+\frac{5}{4}+\frac{(5/4)^2}{2!\cdot(3/8)}\right]^{-1} = \frac{3}{13}.$$

(a) The probability that a subscriber will have to wait for a long-distance call is

$$\text{prob}(n\ge c) = \frac{p_c}{1-\rho} = \frac{(c\rho)^c}{c!}\cdot\frac{p_0}{1-\rho} = \frac{(5/4)^2}{2!\cdot(3/8)}\cdot\frac{3}{13} = \frac{25}{52} \approx 0.48.$$

(b) The expected waiting time is

$$W_q = \frac{L_q}{\lambda} = \frac{\rho\,p_c}{\lambda(1-\rho)^2} = \frac{\dfrac{5}{8}\cdot\dfrac{(5/4)^2}{2!}\cdot\dfrac{3}{13}}{\dfrac{1}{4}\cdot\left(\dfrac{3}{8}\right)^2} \approx 3.2 \text{ min.}$$
Example 5 A supermarket has two girls ringing up sales at the counters. If the service time for each customer is exponential with mean 4 min, and if people arrive in a Poisson fashion at the counter at the rate of 10 per hour, then: (a) What is the probability of having to wait for service? (b) What is the expected percentage of idle time for each girl? (c) If a customer has to wait, what is the expected length of his waiting time?
Solution This problem is of the model (M/M/c):(∞/FCFS/∞).

Here c = 2, $\lambda = \dfrac{10}{60} = \dfrac{1}{6}$ customer per minute and $\mu = \dfrac{1}{4}$ customer per minute.

$$\rho = \frac{\lambda}{c\mu} = \frac{1/6}{2\cdot(1/4)} = \frac{1}{3} \quad\Rightarrow\quad c\rho = 2\cdot\frac{1}{3} = \frac{2}{3}.$$

Now,

$$p_0 = \left[\sum_{n=0}^{1}\frac{(2/3)^n}{n!} + \frac{(2/3)^2}{2!\left(1-\frac{1}{3}\right)}\right]^{-1} = \left[1+\frac{2}{3}+\frac{1}{3}\right]^{-1} = \frac{1}{2}.$$

(a) The probability of having to wait for service is

$$\text{prob}(n\ge 2) = \sum_{n=2}^{\infty}p_n = \frac{p_c}{1-\rho} = \frac{(c\rho)^c}{c!}\cdot\frac{p_0}{1-\rho} = \frac{(2/3)^2}{2}\cdot\frac{1/2}{2/3} = \frac{1}{6} \approx 0.167.$$

(b) The expected number of idle girls is $2p_0 + 1\cdot p_1$, where $p_1 = \dfrac{\lambda}{\mu}p_0 = \dfrac{2}{3}\cdot\dfrac{1}{2} = \dfrac{1}{3}$:

$$2\cdot\frac{1}{2} + 1\cdot\frac{1}{3} = 1+\frac{1}{3} = \frac{4}{3}.$$

Now the probability of any one girl being idle is

$$\frac{\text{expected number of idle girls}}{\text{total number of girls}} = \frac{4/3}{2} = \frac{2}{3} \approx 0.67.$$

Hence, the expected percentage of idle time for each girl = 0.67 × 100% = 67%.

(c) The expected length of the customer's waiting time, on the condition that the customer has to wait, is

$$(W\mid W>0) = \frac{1}{\text{mean service rate} - \text{mean arrival rate for } n\ge c} = \frac{1}{c\mu-\lambda} = \frac{1}{\frac{1}{2}-\frac{1}{6}} = 3 \text{ min.}$$
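Example 5 can be verified with a standalone script (a sketch; per-minute rates, and the variable names are ours):

```python
import math

lam, mu, c = 1 / 6, 1 / 4, 2         # 10 customers/h, 4-min mean service, 2 girls
rho = lam / (c * mu)
p0 = 1 / (sum((c * rho) ** n / math.factorial(n) for n in range(c))
          + (c * rho) ** c / (math.factorial(c) * (1 - rho)))
p1 = (lam / mu) * p0
wait_prob = (c * rho) ** c / math.factorial(c) * p0 / (1 - rho)
idle_girls = 2 * p0 + 1 * p1         # expected number of idle girls
cond_wait = 1 / (c * mu - lam)       # (W | W > 0), in minutes

assert abs(p0 - 0.5) < 1e-9
assert abs(wait_prob - 1 / 6) < 1e-9
assert abs(idle_girls - 4 / 3) < 1e-9
assert abs(cond_wait - 3) < 1e-9
```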
15.3.5 Model (M/M/C):(N/FCFS/∞) Assumptions In this model, we shall discuss a multiserver queueing system under the following assumptions:
(i) Customers arrive in a Poisson fashion.
(ii) The service rate follows a Poisson distribution.
(iii) This queueing system has c (>1) service channels.
(iv) The maximum number of customers in the system is limited to N, where N > c; the capacity of the system is finite.
(v) The service discipline is first come, first served.

Model Formulation Let n be the number of customers in the system, and let λ, μ be the mean arrival rate and mean service rate of units respectively. In this model,

$$\lambda_n = \begin{cases}\lambda & \text{when } 0\le n<N \\ 0 & \text{when } n\ge N\end{cases} \qquad \text{and} \qquad \mu_n = \begin{cases}n\mu & \text{when } 0\le n<c \\ c\mu & \text{when } c\le n\le N.\end{cases}$$

Hence, the probability of one arrival during time Δt is $\lambda_n\Delta t + O(\Delta t)$, and the probability of one departure or service during time Δt is $\mu_n\Delta t + O(\Delta t)$.

To have zero customers in the system at time $t+\Delta t$, there may be the following mutually exclusive cases:

(i) No customer in the system at time t, no arrival during time Δt
(ii) One customer in the system at time t, no arrival during time Δt, one service during time Δt, and so on.

Therefore, the probability of no customer in the system at time $t+\Delta t$ is given by

$$p_0(t+\Delta t) = p_0(t)(1-\lambda_0\Delta t) + p_1(t)(1-\lambda_1\Delta t)\mu_1\Delta t + O(\Delta t) \qquad [\text{here } \lambda_0=\lambda_1=\lambda,\ \mu_1=\mu]$$

or

$$\frac{p_0(t+\Delta t)-p_0(t)}{\Delta t} = -\lambda p_0(t) + \mu p_1(t) + \frac{O(\Delta t)}{\Delta t} \quad \text{for } n=0.$$

Taking the limit as $\Delta t\to 0$, we have

$$p_0'(t) = -\lambda p_0(t) + \mu p_1(t) \quad \text{for } n=0. \tag{15.36}$$

To have n customers in the system at time $t+\Delta t$, there may be the following three cases:
(i) n customers in the system at time t, no arrival during time Δt, no service during time Δt
(ii) (n − 1) customers in the system at time t, one arrival during time Δt, no service during time Δt
(iii) (n + 1) customers in the system at time t, no arrival during time Δt, one service during time Δt, and so on.

Therefore, the probability of n customers in the system at time $(t+\Delta t)$ is given by

$$p_n(t+\Delta t) = p_n(t)(1-\lambda_n\Delta t)(1-\mu_n\Delta t) + p_{n-1}(t)\lambda_{n-1}\Delta t(1-\mu_{n-1}\Delta t) + p_{n+1}(t)(1-\lambda_{n+1}\Delta t)\mu_{n+1}\Delta t + O(\Delta t)$$

or

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -(\lambda_n+\mu_n)p_n(t) + \lambda_{n-1}p_{n-1}(t) + \mu_{n+1}p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.37}$$

(a) For $0<n<c$: $\lambda_{n-1} = \lambda_n = \lambda$, $\mu_n = n\mu$, $\mu_{n+1} = (n+1)\mu$. Hence, from (15.37), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -(\lambda+n\mu)p_n(t) + \lambda p_{n-1}(t) + (n+1)\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}.$$

Taking the limit as $\Delta t\to 0$ on both sides, we have

$$p_n'(t) = -(\lambda+n\mu)p_n(t) + \lambda p_{n-1}(t) + (n+1)\mu p_{n+1}(t). \tag{15.38}$$

(b) For $c\le n<N$: $\lambda_{n-1} = \lambda_n = \lambda$, $\mu_n = \mu_{n+1} = c\mu$. Hence, from (15.37), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -(\lambda+c\mu)p_n(t) + \lambda p_{n-1}(t) + c\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}.$$

Taking the limit as $\Delta t\to 0$ on both sides, we have
$$p_n'(t) = -(\lambda+c\mu)p_n(t) + \lambda p_{n-1}(t) + c\mu p_{n+1}(t) \quad \text{for } c\le n<N. \tag{15.39}$$

(c) For $n=N$: $\lambda_N = 0$, $\lambda_{N-1} = \lambda$, $\mu_N = c\mu$, $p_{N+1}(t) = 0$. Hence, from (15.37), we have

$$\frac{p_N(t+\Delta t)-p_N(t)}{\Delta t} = -c\mu p_N(t) + \lambda p_{N-1}(t) + \frac{O(\Delta t)}{\Delta t}.$$

Taking the limit as $\Delta t\to 0$ on both sides, we have

$$p_N'(t) = -c\mu p_N(t) + \lambda p_{N-1}(t) \quad \text{for } n=N. \tag{15.40}$$

Under a steady state condition of the system, i.e. $\lim_{t\to\infty}p_n(t) = p_n$ and $\lim_{t\to\infty}p_n'(t) = 0$, Eqs. (15.36), (15.38)–(15.40) reduce to:

$$-\lambda p_0 + \mu p_1 = 0 \quad \text{for } n=0 \tag{15.41}$$
$$\lambda p_{n-1} - (\lambda+n\mu)p_n + (n+1)\mu p_{n+1} = 0 \quad \text{for } 0<n<c \tag{15.42}$$
$$\lambda p_{n-1} - (\lambda+c\mu)p_n + c\mu p_{n+1} = 0 \quad \text{for } c\le n<N \tag{15.43}$$
$$\lambda p_{N-1} - c\mu p_N = 0 \quad \text{for } n=N. \tag{15.44}$$
Solution of the Model Proceeding as before, the steady state solution is

$$p_n = \begin{cases}\dfrac{(c\rho)^n}{n!}\,p_0, & \text{for } 0\le n<c \\[6pt] \dfrac{c^c\rho^n}{c!}\,p_0, & \text{for } c\le n\le N,\end{cases}$$

where

$$p_0 = \begin{cases}\left[\displaystyle\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!}\cdot\frac{1-\rho^{N-c+1}}{1-\rho}\right]^{-1} & \text{for } \rho = \dfrac{\lambda}{c\mu}\ne 1 \\[12pt] \left[\displaystyle\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!}\,(N-c+1)\right]^{-1} & \text{for } \rho = \dfrac{\lambda}{c\mu} = 1.\end{cases}$$

Characteristics of the Model

(i) Average queue length The average queue length is
$$L_q = \sum_{n=c+1}^{N}(n-c)p_n = \sum_{n=c}^{N}(n-c)p_n = \sum_{n=c}^{N}(n-c)\frac{c^c\rho^n}{c!}p_0 = \frac{c^c\rho^c}{c!}p_0\sum_{n=c}^{N}(n-c)\rho^{\,n-c}$$

$$= \frac{(c\rho)^c}{c!}p_0\sum_{x=0}^{N-c}x\rho^x \qquad [\text{putting } n-c=x]$$

$$= \frac{(c\rho)^c}{c!}p_0\,\rho\sum_{x=0}^{N-c}\frac{d}{d\rho}(\rho^x) = \frac{(c\rho)^c}{c!}p_0\,\rho\,\frac{d}{d\rho}\left\{\sum_{x=0}^{N-c}\rho^x\right\} = \frac{(c\rho)^c}{c!}p_0\,\rho\,\frac{d}{d\rho}\left(\frac{1-\rho^{N-c+1}}{1-\rho}\right)$$

$$= \begin{cases}\dfrac{(c\rho)^c}{c!}\,p_0\,\rho\,\dfrac{1-\rho^{N-c+1}-(1-\rho)(N-c+1)\rho^{N-c}}{(1-\rho)^2} & \text{for } \rho = \dfrac{\lambda}{c\mu}\ne 1 \\[10pt] \dfrac{c^c}{c!}\,p_0\,\dfrac{(N-c)(N-c+1)}{2} & \text{for } \rho = \dfrac{\lambda}{c\mu} = 1.\end{cases}$$

(ii) Average number of customers in the system $L_s$ = average number of customers in the system:

$$L_s = E(n) = \sum_{n=0}^{N}np_n = \sum_{n=0}^{c-1}np_n + \sum_{n=c}^{N}(n-c)p_n + \sum_{n=c}^{N}cp_n$$

$$= \sum_{n=0}^{c-1}np_n + L_q + c\left(\sum_{n=0}^{N}p_n - \sum_{n=0}^{c-1}p_n\right) \qquad \left[\because\ \sum_{n=0}^{N}p_n = 1\right]$$

$$= L_q + c - \sum_{n=0}^{c-1}(c-n)p_n = L_q + c - p_0\sum_{n=0}^{c-1}\frac{(c-n)(\rho c)^n}{n!}.$$

Calculation of $\lambda_{\text{eff}}$: $\lambda_{\text{eff}} = \lambda(1-p_N)$. Alternatively, let $\bar{c}$ denote the expected number of idle servers:

$$\bar{c} = \sum_{n=0}^{c}(c-n)p_n.$$

Then $c-\bar{c}$ is the expected number of busy servers, and $\lambda_{\text{eff}} = \mu(c-\bar{c})$.
Now, $W_s = \dfrac{L_s}{\lambda_{\text{eff}}}$ and $W_q = \dfrac{L_q}{\lambda_{\text{eff}}}$.

Special cases:

(i) Let us take c = 1, N → ∞ and consider ρ < 1. Then we get $p_0 = 1-\rho$ and $p_n = \rho^n p_0$, $n = 0,1,2,\ldots$. This result is exactly the same as that of the model $(M/M/1){:}(\infty/\text{FCFS}/\infty)$.

(ii) Let us take c = 1. Then we get

$$p_0 = \begin{cases}\dfrac{1-\rho}{1-\rho^{N+1}}, & \rho = \dfrac{\lambda}{c\mu}\ne 1 \\[6pt] \dfrac{1}{1+N}, & \rho = \dfrac{\lambda}{c\mu} = 1\end{cases}$$

and $p_n = \rho^n p_0$, $1\le n\le N$. This result corresponds to that of the queueing model $(M/M/1){:}(N/\text{FCFS}/\infty)$.

(iii) Let us take N → ∞ and consider $\rho = \dfrac{\lambda}{c\mu} < 1$. Then we get

$$p_0 = \left[\sum_{n=0}^{c-1}\frac{(c\rho)^n}{n!} + \frac{(c\rho)^c}{c!}\cdot\frac{1}{1-\rho}\right]^{-1} \qquad \text{and} \qquad p_n = \begin{cases}\dfrac{(c\rho)^n}{n!}\,p_0, & 0\le n<c \\[6pt] \dfrac{c^c\rho^n}{c!}\,p_0, & n\ge c,\end{cases}$$

which is the steady state solution of the queueing model $(M/M/c){:}(\infty/\text{FCFS}/\infty)$. Thus, we can consider this model as a general queueing model, as we can get the results of the previous queueing models from it as special cases.
Machine Repairing Problem Whenever a machine breaks down, it results in a great loss to an organization; thus, machine repair problems are very important in queueing theory. When a machine breaks down, its repair starts if a mechanic is free. If another machine breaks down during this time, it is attended to after the repair of the first machine is completed. Thus, the broken-down machines form a queue and wait for their repair. Various situations may arise: there may be one or more than one mechanic. If there is one mechanic, we have a single-channel problem; if there is more than one mechanic, we have a multichannel problem. The machines may be repaired in a single phase or in k phases.
15.3.6 Model (M/M/R):(K/GD), K > R

Assumptions In this model, we shall discuss a queueing system under the following assumptions:

(i) This is a queueing system in which there are R service channels.
(ii) The maximum number of incoming customers (machines), say k, is fixed.
(iii) The arrivals (breakdowns) and services are assumed to follow Poisson distributions with parameters λ and μ respectively.
(iv) This system has a finite calling source.

Model Formulation Let n be the number of machines in a breakdown situation. Hence, the approximate probability of a single service during the time Δt is $\mu_n\Delta t$, where

$$\mu_n = \begin{cases}n\mu & \text{for } n\le R \\ R\mu & \text{for } R\le n\le k \\ 0 & \text{for } n>k.\end{cases}$$

If λ is the rate of breakdown per machine, then the probability of a single arrival during the time Δt (when there are n machines in a breakdown situation) is approximately $\lambda_n\Delta t$ for $n\le k$, where

$$\lambda_n = \begin{cases}(k-n)\lambda & \text{for } 0\le n<k \\ 0 & \text{for } n\ge k.\end{cases}$$

To have 0 customers in the system at time $t+\Delta t$, there may be the following three cases:

(i) No customer in the system at time t, no arrival in time Δt
(ii) One customer in the system at time t, no arrival in time Δt, one service in time Δt
(iii) No customer in the system at time t, one arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of no customer in the system at time $t+\Delta t$ is given by

$$p_0(t+\Delta t) = p_0(t)(1-\lambda_0\Delta t) + p_1(t)(1-\lambda_1\Delta t)\mu_1\Delta t + O(\Delta t) = p_0(t)(1-k\lambda\Delta t) + p_1(t)\{1-(k-1)\lambda\Delta t\}\mu\Delta t + O(\Delta t)$$

or

$$\frac{p_0(t+\Delta t)-p_0(t)}{\Delta t} = -k\lambda p_0(t) + \mu p_1(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.45}$$

To have n (>0) customers in the system at time $t+\Delta t$, there may be the following three cases:

(i) n customers in the system at time t, no arrival in time Δt, no service in time Δt
(ii) (n − 1) customers in the system at time t, one arrival in time Δt, no service in time Δt
(iii) (n + 1) customers in the system at time t, no arrival in time Δt, one service in time Δt, and so on.

Therefore, the probability of n customers in the system at time $(t+\Delta t)$ is given by

$$p_n(t+\Delta t) = p_n(t)(1-\lambda_n\Delta t)(1-\mu_n\Delta t) + p_{n-1}(t)\lambda_{n-1}\Delta t(1-\mu_{n-1}\Delta t) + p_{n+1}(t)(1-\lambda_{n+1}\Delta t)\mu_{n+1}\Delta t + O(\Delta t)$$

or, dividing both sides by Δt,

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -(\lambda_n+\mu_n)p_n(t) + \lambda_{n-1}p_{n-1}(t) + \mu_{n+1}p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.46}$$

When $0<n<R$, then

$$\lambda_{n-1} = \{k-(n-1)\}\lambda,\quad \lambda_n = (k-n)\lambda,\quad \lambda_{n+1} = \{k-(n+1)\}\lambda,\quad \mu_n = n\mu,\quad \mu_{n+1} = (n+1)\mu.$$

Hence, from (15.46), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -\{(k-n)\lambda+n\mu\}p_n(t) + \{k-(n-1)\}\lambda p_{n-1}(t) + (n+1)\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t}. \tag{15.47}$$
For $R\le n<k$: $\lambda_{n-1} = \{k-(n-1)\}\lambda$, $\lambda_n = (k-n)\lambda$, $\mu_n = R\mu$, $\mu_{n+1} = R\mu$. Hence, from (15.46), we have

$$\frac{p_n(t+\Delta t)-p_n(t)}{\Delta t} = -\{(k-n)\lambda+R\mu\}p_n(t) + \{k-(n-1)\}\lambda p_{n-1}(t) + R\mu p_{n+1}(t) + \frac{O(\Delta t)}{\Delta t} \quad \text{for } R\le n<k. \tag{15.48}$$

For $n=k$: $\lambda_{n-1} = \{k-(k-1)\}\lambda = \lambda$, $\lambda_n = (k-n)\lambda = 0$, $\mu_n = R\mu$. Hence, from (15.46), we have

$$\frac{p_k(t+\Delta t)-p_k(t)}{\Delta t} = \lambda p_{k-1}(t) - R\mu p_k(t) + \frac{O(\Delta t)}{\Delta t} \quad \text{for } n=k. \tag{15.49}$$

Now taking the limit as $\Delta t\to 0$ in (15.45), (15.47)–(15.49), we have

$$p_0'(t) = -k\lambda p_0(t) + \mu p_1(t) \quad \text{for } n=0$$
$$p_n'(t) = -\{(k-n)\lambda+n\mu\}p_n(t) + \{k-(n-1)\}\lambda p_{n-1}(t) + (n+1)\mu p_{n+1}(t) \quad \text{for } 0<n<R$$
$$p_n'(t) = -\{(k-n)\lambda+R\mu\}p_n(t) + \{k-(n-1)\}\lambda p_{n-1}(t) + R\mu p_{n+1}(t) \quad \text{for } R\le n<k$$
$$p_k'(t) = \lambda p_{k-1}(t) - R\mu p_k(t) \quad \text{for } n=k.$$

Under a steady state condition of this system, i.e. $\lim_{t\to\infty}p_n(t) = p_n$ and $\lim_{t\to\infty}p_n'(t) = 0$, the preceding four equations reduce to

$$-k\lambda p_0 + \mu p_1 = 0 \quad \text{for } n=0 \tag{15.50}$$
$$\{k-(n-1)\}\lambda p_{n-1} - \{(k-n)\lambda+n\mu\}p_n + (n+1)\mu p_{n+1} = 0 \quad \text{for } 0<n<R \tag{15.51}$$

$$\{k-(n-1)\}\lambda p_{n-1} - \{(k-n)\lambda+R\mu\}p_n + R\mu p_{n+1} = 0 \quad \text{for } R\le n<k \tag{15.52}$$

$$\lambda p_{k-1} - R\mu p_k = 0 \quad \text{for } n=k. \tag{15.53}$$

Solution From (15.50), $\mu p_1 = k\lambda p_0$, or $p_1 = k\dfrac{\lambda}{\mu}p_0 = k\rho p_0$, where $\rho = \dfrac{\lambda}{\mu}$.

Now setting n = 1 in (15.51), we have

$$k\lambda p_0 - \{(k-1)\lambda+\mu\}p_1 + 2\mu p_2 = 0$$

or $2\mu p_2 = \{(k-1)\lambda+\mu\}p_1 - k\lambda p_0$

or

$$2p_2 = \{(k-1)\rho+1\}p_1 - k\rho p_0 = (k-1)\rho p_1 + p_1 - p_1 \qquad [\because\ p_1 = k\rho p_0]$$

$$= (k-1)\rho p_1 = k(k-1)\rho^2 p_0 \quad\Rightarrow\quad p_2 = \frac{k(k-1)}{2}\rho^2 p_0 = \binom{k}{2}\rho^2 p_0.$$

Again, setting n = 2 in (15.51), we have

$$3p_3 = \frac{(k-2)(k-1)k\,\rho^3 p_0}{2} \quad\Rightarrow\quad p_3 = \frac{k(k-1)(k-2)}{3\cdot 2}\rho^3 p_0 = \binom{k}{3}\rho^3 p_0.$$

By induction, $p_n = \dbinom{k}{n}\rho^n p_0$ for $0\le n\le R$.

Setting n = R in (15.52), we have
$$\{k-(R-1)\}\lambda p_{R-1} - \{(k-R)\lambda+R\mu\}p_R + R\mu p_{R+1} = 0$$

or

$$Rp_{R+1} = \{(k-R)\rho+R\}p_R - (k-R+1)\rho p_{R-1} = \{(k-R)\rho+R\}p_R - Rp_R = (k-R)\rho\,p_R$$

$$\left[\text{since } (k-R+1)\rho\,p_{R-1} = (k-R+1)\binom{k}{R-1}\rho^R p_0 = R\binom{k}{R}\rho^R p_0 = Rp_R\right]$$

$$\Rightarrow\ p_{R+1} = \binom{k}{R+1}\frac{R+1}{R}\,\rho^{R+1}p_0.$$

Again, setting n = R + 1 in (15.52), we have

$$(k-R)\rho\,p_R - \{(k-R-1)\rho+R\}p_{R+1} + Rp_{R+2} = 0$$

or

$$Rp_{R+2} = \{(k-R-1)\rho+R\}p_{R+1} - (k-R)\rho\,p_R = (k-R-1)\rho\,p_{R+1} + Rp_{R+1} - (k-R)\rho\,p_R$$

$$\Rightarrow\ p_{R+2} = \binom{k}{R+2}\frac{(R+1)(R+2)}{R^2}\,\rho^{R+2}p_0 = \binom{k}{R+2}\frac{(R+2)!}{R!\,R^2}\,\rho^{R+2}p_0$$

$$\Rightarrow\ p_{R+i} = \binom{k}{R+i}\frac{(R+i)!}{R!\,R^i}\,\rho^{R+i}p_0 \quad \text{for } R\le R+i<k.$$
From (15.53), we have $R\mu p_k = \lambda p_{k-1}$

or

$$Rp_k = \frac{\lambda}{\mu}p_{k-1} = \rho\cdot\frac{k!}{R!\,R^{k-1-R}}\,\rho^{k-1}p_0 \quad\Rightarrow\quad p_k = \frac{k!}{R!\,R^{k-R}}\,\rho^k p_0.$$

Thus,

$$p_n = \begin{cases}\dbinom{k}{n}\rho^n p_0 & \text{for } 0\le n\le R \\[8pt] \dbinom{k}{n}\dfrac{n!}{R!\,R^{n-R}}\,\rho^n p_0 & \text{for } R\le n\le k.\end{cases}$$

Now, $\sum_{n=0}^{k}p_n = 1$, i.e.

$$\sum_{n=0}^{R-1}\binom{k}{n}\rho^n p_0 + \sum_{n=R}^{k}\binom{k}{n}\frac{n!}{R!\,R^{n-R}}\,\rho^n p_0 = 1$$

or

$$p_0 = \left[\sum_{n=0}^{R-1}\binom{k}{n}\rho^n + \sum_{n=R}^{k}\binom{k}{n}\frac{n!}{R!\,R^{n-R}}\,\rho^n\right]^{-1}.$$
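The machine-repair distribution just derived can be sketched as follows (the function and variable names are ours; the sample data anticipate Example 6 below, with k = 5 machines, R = 2 servicemen and ρ = 1/6):

```python
from math import comb, factorial

def repair_probs(lam, mu, R, k):
    """p_n for n broken machines out of k, with R repairmen; rho = lam/mu."""
    rho = lam / mu
    def coef(n):
        c = comb(k, n) * rho ** n
        if n > R:
            c *= factorial(n) / (factorial(R) * R ** (n - R))
        return c
    total = sum(coef(n) for n in range(k + 1))   # normalizing constant 1/p_0
    return [coef(n) / total for n in range(k + 1)]

p = repair_probs(lam=2, mu=12, R=2, k=5)
Ls = sum(n * p[n] for n in range(len(p)))        # expected number of broken machines
assert abs(sum(p) - 1) < 1e-12
```

The two branches of the formula agree at n = R, so a single unnormalized coefficient function suffices.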
Characteristics of the Model

(i) Average number of customers in the system The average number of customers in the system is given by

$$L_s = E(n) = \sum_{n=0}^{k}np_n = p_0\left[\sum_{n=0}^{R-1}n\binom{k}{n}\rho^n + \sum_{n=R}^{k}n\binom{k}{n}\frac{n!}{R!\,R^{n-R}}\,\rho^n\right].$$

(ii) Expected queue length The expected queue length is given by

$$L_q = \sum_{n=R+1}^{k}(n-R)p_n = \sum_{n=R+1}^{k}np_n - R\sum_{n=R+1}^{k}p_n$$

$$= \left(L_s - \sum_{n=0}^{R}np_n\right) - R\left(1-\sum_{n=0}^{R}p_n\right) \qquad \left[\because\ \sum_{n=0}^{k}p_n = 1\right]$$

$$= L_s - R + \sum_{n=0}^{R}(R-n)p_n = L_s - (R-\bar{R}),$$

where $\bar{R} = \sum_{n=0}^{R}(R-n)p_n$ = expected number of idle repairmen.
(iii) Effective arrival rate (keff) In this queueing system, the arrivals occur with a rate k, but all arrivals do not join the system. This situation occurs when the maximum allowable queue length is reached. In that case, no new arrivals are allowed to join the queue. So, we shall define k considering those arrivals which join the system. This arrival rate is known as the effective arrival rate, and it is denoted by keff. Hence, the effective arrival rate is given by keff ¼ lðR RÞ or keff ¼
k X
kn pn ¼ E½kn ¼ E½kðk nÞ½Since kn ¼ kðk nÞ
n¼0
¼ kE ½ðk nÞ ¼ k½k E ðnÞ ¼ k½k Ls : (iv) Expected waiting time The expected waiting time of a customer in the system is given by
15.3
Poisson Queueing Models
455
Ws ¼
Ls ; keff
whereas the expected waiting time of a customer in the queue is given by Wq ¼
Lq : keff
Example 6 There are 5 machines, each of which, when running, suffers breakdowns at an average rate of 2 per hour. There are two servicemen, and only one man can work on a machine at a time. If n machines are out of order when n > 2, then n – 2 of them wait until a serviceman is free. Once a serviceman starts work on a machine, the time to complete the repair has an exponential distribution with mean 5 min. Find the distribution of the number of machines out of action at a given time. Also, find the average time an out-of-action machine has to spend waiting for the repairs to start. Solution Here k = total number of machines in the system = 5 R = number of servicemen = 2 k = 2 per hour, l = 1 per 5 min per hour. 2 Hence, q = k/ l = 12 ¼ 16. Let n be the number of machines out of order. Hence,
8 5 1n > > > < n 6 p0 pn ¼ > 5 jn 1n > > p0 : n 2n2 j2 6 8 5 1n > > > < n 6 p0 ¼ 1 n > 5 > > p0 2jn 12 : n
for 0 n 2 for 2 n 5 for 0 n 2 ; for 2 n 5
where " p0 ¼
21 n X 1 5 n¼0
n
6
þ
5 X 5 n¼2
1 2j n n 12
n #1
The average number of machines out of action is given by
¼
648 : 1493
456
15
Lq ¼
5 X
ðn 2Þpn ¼
n¼2 þ 1
5 X
ðn 2Þpn ¼p3 þ 2p4 þ 3p5 ¼
n¼3
Queueing Theory
165 : 1493
The average time an out-of-action machine has to spend waiting for repairs to start is Wq = Lq/keff. But keff ¼
k X
kn pn ¼
n¼0
k X
kðk nÞpn ¼ k
n¼0
5 X
ð5 nÞpn ¼
n¼0
6 2050 : 1493
Hence, Wq = 55/4100, h = 33/41 min.
15.3.7 Power Supply Model Model Formulation The power supply problem can be analysed as a queueing problem. The objective of this problem is to ensure the better utilization of a power supply to the customers. Let there be an electrical circuit supplying power to customers. Further, let the requirement and supply schedules follow a Poisson distribution with parameters k and l respectively. Let there be n customers in the queue at any time t and let C be the total number of customers. Then, kn ¼ ðc nÞk and ln ¼ nl
) for 0 n c:
Proceeding similarly as in other models, the probabilities that (i) there is no customer and (ii) there are n customers in the system at time t þ Dt are given by p0 ðt þ DtÞ ¼ p0 ðtÞð1 k0 DtÞ þ p1 ðtÞð1 k1 DtÞl1 Dt þ OðDtÞ
ð15:54Þ
pn ðt þ DtÞ ¼ pn1 ðtÞkn1 Dtð1 ln1 DtÞ þ pn ðtÞð1 kn DtÞð1 ln DtÞ þ pn þ 1 ðtÞð1 kn þ 1 DtÞln þ 1 Dt þ OðDtÞ:
ð15:55Þ
and
From (15.1), we have
15.3
Poisson Queueing Models
457
p0 ðt þ DtÞ ¼ p0 ðtÞð1 ckDtÞ þ p1 ðtÞf1 ðc 1ÞkDtglDt þ OðDtÞ or
p0 ðt þ DtÞ p0 ðtÞ OðDtÞ ¼ ckp0 ðtÞ þ lp1 ðtÞ þ : Dt Dt
ð15:56Þ
From (15.55), we have pn ðt þ DtÞ pn ðtÞ OðDtÞ ¼ kn1 pn1 ðtÞ ðkn þ ln Þpn ðtÞ þ ln þ 1 pn þ 1 ðtÞ þ : Dt Dt ð15:57Þ When 0\n\c, kn1 ¼ fc ðn 1Þgk;
kn ¼ ðc nÞk; ln ¼ nl; ln þ 1 ¼ ðn þ 1Þl:
Hence, from (15.57), we have Pn ðt þ DtÞ pn ðtÞ ¼ ða n þ 1Þkpn1 ðtÞ fðc nÞk þ nlgpn ðtÞ Dt OðDtÞ ; 0\n\c: þ ðn þ 1Þlpn þ 1 ðtÞ þ Dt
ð15:58Þ
When n = c, kn1 ¼ fc ðc 1Þgk ¼ k, kn ¼ ðc cÞk ¼ 0 ln ¼ nl ¼ cl; pn þ 1 ðtÞ ¼ 0: Hence, from (15.57), we have pc ðt þ DtÞ pc ðtÞ OðDtÞ ¼ kpc1 ðtÞ clpc ðtÞ þ : Dt Dt
ð15:59Þ
Taking the limit as Dt ! 0, from (15.56), (15.58) and (15.59), we have p00 ðtÞ ¼ ckp0 ðtÞ þ lp1 ðtÞ for n ¼ 0
ð15:60Þ
p0n ðtÞ ¼ ðc n þ 1Þkpn1 ðtÞ fðc nÞk þ nlgpn ðtÞ þ ðn þ 1Þlpn þ 1 ðtÞ ð15:61Þ for 0\n\c p0c ðtÞ ¼ kpc1 ðtÞ clpc ðtÞ for n ¼ c:
ð15:62Þ
Under a steady state condition of this system, i.e. lim pn ðtÞ ¼ pn and lim p0n ðtÞ ¼ 0, the preceding three equations reduce to
t!1
t!1
458
15
ckp0 þ lp1 ¼ 0
Queueing Theory
for n ¼ 0
ðc n þ 1Þkpn1 fðc nÞk þ nlgpn þ ðn þ 1Þlpn þ 1 ¼ 0;
ð15:63Þ 0\n\c ð15:64Þ
kpc1 clpc ¼ 0 for n ¼ c: Equations (15.63)–(15.65) are the steady state equations of the system. Solution of the Model From (15.63), we have p1 ¼ c
k l
p0 .
Now, setting n = 1 in (15.64), we have ckp0 fðc 1Þk þ lgp1 þ 2lp2 ¼ 0 k p0 or 2lp2 ¼ ckp0 þ fðc 1Þk þ lgc l c ð c 1Þ k 2 or p2 ¼ p0 : l j2 Again, for n = 2, c ð c 1Þ ð c 2Þ p3 ¼ j3
3 k p0 : l
In general, pn ¼
c ð c 1Þ ð c n þ 1Þ k n p0 l jn
From (15.65), we have 1k pc1 cl
pc ¼
Þ2 or pc ¼ 1c lk : cðc1 jc 1 But
c1 k l
p0 ¼
c k l
c X n¼0
p0 :
pn ¼ 1
for 0\n\c:
ð15:65Þ
15.3
Poisson Queueing Models
or or or or
459
p0 þ p1 þ p2 þ . . . þ pc ¼ 1 " c # ck cðc 1Þ k 2 k þ þ þ p0 1 þ ¼1 l l l j2 k c ¼1 p0 1 þ l c l p0 ¼ : kþl
Hence, c c ð c 1Þ ð c n þ 1Þ k n l l kþl jn ! n cn c k l ¼ ; kþl kþl n
pn ¼
which is a binomial distribution.
15.4
Non-poisson Queueing Models
15.4.1 Introduction When the arrivals and/or departures of a queueing system do not follow a Poisson distribution, they are called non-Poisson queueing systems. The development of these systems is more complicated, as the Poisson axioms do not hold. In studying non-Poisson queues, usually the following techniques are employed: (i) Phase technique (ii) Imbedded Markov chain technique (iii) Supplementary variable technique. In the phase technique, service is provided to a customer in several phases, say k in number. The technique by which non-Markovian queues are reduced to Markovian queues is termed the imbedded Markov chain technique. In the supplementary variable technique, one or more random variables are added to convert a non-Markovian process into a Markovian one. These techniques have been applied in studying queueing systems like M/Ek/1, M/G/1, G1/G/c, etc. In this chapter, we shall discuss M/Ek/1 and M/G/1 queueing systems only.
460
15
Queueing Theory
15.4.2 Model (M/Ek/1):(∞/FCFS/∞) Assumptions In this model, we shall discuss a single server non-Poisson queueing system under the following assumptions: (i) The arrivals follow a Poisson distribution. (ii) The service time has an Erlang type k distribution. Even though the service may not actually consist of k phases, it is convenient in analysing this system to consider the Erlang as being made up of k exponential phases, 1 each with mean kl , where l is the number of customers served per unit of time. (iii) The service discipline is first come, first served. (iv) The capacity of the system is infinite. (v) This system consists of a single service channel in which there are k phases in the system (waiting or in service). (vi) A new arrival creates k phases of service, and departure of one customer reduces k phases of service. (vii) The distribution of the total service time of a customer in the system will be some combined distribution of time in all these phases. (viii) Each customer is served in k phases one by one, and a new service does not start until all k phases have been completed; i.e. each arrival increases the number of phases by k in the system.
Model Formulation If at any time there are m customers waiting in the queue with one customer in service which has to still pass through s phases, then the total number of phases n in the system (waiting and in service) will be n = mk + s. Figure 15.5 shows the multiphase service-based model. As l is the number of customers served per unit time, then kl will be the number 1 of phases served per unit time and kl will be the average time taken by the server at each phase. Therefore, kn ¼ k; constant arrival per unit time with ln ¼ kl phases served per unit time.
Queue
1
s k phases of services
Fig. 15.5 Multiphase service-based queueing system
depurture k
15.4
Non-poisson Queueing Models
461
Hence, the probability of one arrival during time Dt is kDt þ OðDtÞ and the probability of one departure (of k phases) during time Dt is klDt þ OðDtÞ. Let Pn ðtÞ be the probability that there are n phases in the system at time t. This model consists of a single service channel in which there are n phases in the system (waiting or in service). According to the assumption of the model, a new arrival creates k phases of service, and departure of one customer reduces k phases of service. To have zero phase in the system at time t þ Dt, there may be the following cases: (i) No phase in the system at time t, no arrival in time Dt (ii) One phase in the system at time t, no arrival in time Dt, service of one unit (of k phase) in time Dt, and so on. Therefore, the probability of no phase in the system at time t þ Dt is given by p0 ðt þ DtÞ ¼ p0 ðtÞð1 kDtÞ þ p1 ðtÞð1 kDtÞklDt þ OðDtÞ p0 ðt þ DtÞ p0 ðtÞ OðDtÞ ¼ kp0 ðtÞ þ klp1 ðtÞ þ : or Dt Dt
ð15:66Þ
To have n (>0) phases in the system at time t þ Dt, there may be the following cases: (i) (n – k) phases in the system at time t, one arrival in time Dt, no service (of any phase) in time Dt (ii) n phases in the system at time t, no arrival in time Dt, no service (of any phase) in time Dt (iii) (n + 1) phases in the system at time t, no arrival in time Dt, service of one unit (of k phase) in time Dt, and so on. Therefore, the probability Pn ðt þ DtÞ that there are n (>0) phases in the system at time t þ Dt is given by pn ðt þ DtÞ ¼ pnk ðtÞkDtð1 klDtÞ þ pn ðtÞð1 kDtÞð1 klDtÞ þ pn þ 1 ðtÞð1 kDtÞklDt þ OðDtÞor pn ðt þ DtÞ pn ðtÞ OðDtÞ ¼ kpnk ðtÞ ðk þ klÞpn ðtÞ þ klpn þ 1 ðtÞ þ : Dt Dt
ð15:67Þ
Now taking the limit as Dt ! 0 in (15.1) and (15.67), we get the differential difference equations for the system as follows: p0n ðtÞ ¼ kpnk ðtÞ ðk þ klÞpn ðtÞ þ klpn þ 1 ðtÞ and
p00 ðtÞ ¼ kp0 ðtÞ þ klp1 ðtÞ:
ð15:68Þ ð15:69Þ
462
15
Queueing Theory
Under the steady state condition of the system, i.e. lim pn ðtÞ ¼ pn and t!1
lim p0n ðtÞ ¼ 0, these two equations reduce to
t!1
kpnk ðk þ klÞpn þ klpn þ 1 ¼ 0;
n1
ð15:70Þ
and kp0 þ klp1 ¼ 0;
n ¼ 0:
ð15:71Þ
k Let q ¼ kl ; then Eqs. (15.70) and (15.71) reduce to
p1 ¼ qp0 ;
n¼0
ð15:72Þ
and ð1 þ qÞpn ¼ qpnk þ pn þ 1 ; n 1:
ð15:73Þ
Now, to solve Eq. (15.73), we shall use the generating function technique. Solution of the Model Let the generating function be G(x), which is defined by Gð xÞ ¼
1 X
pn x n ;
j xj 1:
ð15:74Þ
x¼0
Multiplying both sides of (15.73) by xn and then summing over from 1 to ∞ (since the equation holds for n 1), we have ð 1 þ qÞ
1 X
pn x n ¼ q
n¼1
or
ð1 þ qÞ
1 X
or or
or
ð 1 þ qÞ ð 1 þ qÞ
pnk xn þ
n¼1
1 X
pn þ 1 x n
1 X
pnk xn þ
n¼1
#
pn x þ p0 p0 ¼ q n
n¼1 1 X
1 X
n¼0
n¼k
pn xn p0 ¼ q
1 X n¼1
pn xn þ qp0 ¼ p1 þ q
n¼1
"
1 X
1 X
1 X
pn þ 1 x n
½* qp0 ¼ p1
n¼1
"
pnk x þ p1 þ n
n¼1
pnk xn þ
1 X n¼1
1 1X pn þ 1 x n þ 1 x n¼0
½since pnk ¼ 0; for n k\0; i.e. for n\k 1 1 1 X X 1X pn x n p0 ¼ q pj x j þ k þ pi x i ð 1 þ qÞ x n¼0 j¼0 i¼1
# pn þ 1 x
n
15.4
or or or
Non-poisson Queueing Models
463
½substituting n k ¼ j; i:e. ¼ j þ k and n þ 1 ¼ i; i:e.n ¼ i 1 " # 1 1 1 X X 1 X n k j i pn x p0 ¼ qx pj x þ pi x p0 ð1 þ q Þ x i¼0 n¼0 j¼0 1 ð1 þ qÞGð xÞ p0 ¼ qxk Gð xÞ þ ½Gð xÞ p0 x p0 ð 1 x Þ G ð xÞ ¼ j xj 1 ð1 xÞ qxð1 xk Þ 1 p0 1 xk k ¼ p0 1 qx ¼ 1x 1 qx 1x 1x 1 k m X 1x ¼ p0 ðqxÞm 1x m¼0
ð15:75Þ
½by binomial expansion: 1 X m ðxqÞm 1 þ x þ x2 þ þ xk1 ) G ð x Þ ¼ p0 m¼0
or
1 xk ¼ 1 þ x þ x2 þ þ xk1 Since 1x 1 X m G ð x Þ ¼ p0 qm x þ x2 þ þ xk : m¼0
Now we shall find the value of p0. From (15.74) and (15.75), we have 1 X
pn x n ¼ p0
n¼0
1 X
m qm x þ x2 þ þ xk :
m¼0
This is an identity. Now setting x =1 on both sides, we have 1 X
pn ¼ p0
n¼0
1 X
qm ð1 þ 1 þ þ k timesÞm
m¼0 1 X
or
1 ¼ p0
or
h i 1 ¼ p0 1 þ qk þ ðqk Þ2 þ ðqk Þ3 þ
qm k m
m¼0
or or
1 k ; if qk\1 i:e.; if \1 i:e:; k\l 1 qk l p0 ¼ 1 kq: 1 ¼ p0
Substituting p0 ¼ 1 kq in (15.76), we have
ð15:76Þ
464
15 1 X
pn xn ¼ ð1 kqÞ
n¼0
or
1 X
1 X
ðqxÞm
m¼0
pn x ¼ ð1 kqÞ n
n¼0
1 X
ðqxÞ
m 1 xk 1x
m
1x
k m
ð 1 xÞ
Queueing Theory
ð15:77Þ m
m¼0
m rk Now 1 x ¼ ð1Þ x r r¼0 1 P mþs 1 s and ð1 xÞm ¼ x. s s¼0 Hence, from (15.77), we have
k m
1 X
or
m P
r
pn x ¼ ð1 kqÞ n
1 X
n¼0
n¼0
1 X
1 X
pn xn ¼ ð1 kqÞ
n¼0
" m m
q x " qm
m¼0
m X
ð1Þ
r¼0 m X 1 X
ð1Þr
r¼0 s¼0
!
# 1 X mþs 1 s x x s r s¼0 ! # m m þ s 1 m þ rk þ s x : s r
m
r
rk
Comparing the coefficients of xn from both sides, we have pn ¼ ð1 kqÞ
X
qm ð1Þr
m;r;s
m r
mþs 1 : s
[The summation is taken over all such values of m, r, s ð0 m\1; 0 r m; 0 s\1Þ for which m þ rk þ s ¼ n.] Characteristics of the Model (i) Average (expected) number of phases in the system The average number of phases in the system is given by Lp ¼
1 X
npn
n¼0
From (15.7), we have ð1 þ qÞpn ¼ qpnk þ qn þ 1 ;
n 1:
Now multiplying both sides by n2 and then summing from 1 to ∞, we have
15.4
Non-poisson Queueing Models
ð1 þ qÞ ð1 þ q Þ
or
1 X
465
n2 pn ¼ q
1 X
1 X
n2 pnk þ
n¼1
n¼1
n¼1
1 X
1 X
1 X
n2 pn ¼ q
n¼1
n2 pnk þ
n2 pn þ 1 n2 pn þ 1
n¼0
n¼k
½since pnk ¼ 0 for n k\0; i:e: for n\k: Now taking n + 1 = m and n − k = m in the first and second summations on the right-hand side respectively, we have ð1 þ q Þ
1 X
n2 pn ¼ q
n¼0
¼q
1 X
1 X
1 X
ð m þ k Þ 2 pm þ
m¼0
ðn þ kÞ2 pn þ
n¼0
ðm 1Þ2 pm
m¼1 1 X
ð n 1Þ 2 pn p0
n¼0
1 X ¼ q n2 þ 2nk þ k 2 þ n2 2n þ 1 pn p0 n¼0
¼ ð1 þ q Þ
1 X
n2 pn 2ð1 qk Þ
n¼0
or or
2ð1 qkÞ 2ð1 qkÞ
1 X n¼0 1 X
1 X
1 X npn þ 1 þ qk2 pn p0
n¼0
npn ¼ 1 þ qk
2
"
p0 *
1 X
#
n¼0
pn ¼ 1
n¼0
npn ¼ 1 þ qk 2 ð1 qkÞ
½* p0 ¼ 1 qk
n¼0
or
1 X
npn ¼
n¼0
qkðk þ 1Þ 2ð1 qkÞ
2
qkðk þ 1Þ l ð k þ 1Þ
¼ 2ð1 qkÞ 2 1 k k
or
Lp ¼
l
k 6 * q ¼ lk 6 6 4 i:e.:; qk ¼
3 7 7 7 k5 l
kþ1 k ¼ : : 2 lk (ii) Average (expected) number of units in the queue We know that the average number of phases of one unit in service ¼ k þ2 1. Therefore, the time taken for their service ¼ k þ2 1 l1, as l1 is the mean service time per unit.
466
15
Hence, the expected number of phases in service = expected number of phases arrived during the time
Queueing Theory
kþ1 2l
þ1 ¼ k 2l k. Therefore, the expected number of units in the queue is given by
1 Lp expected number of phases in service k 1 kþ1 k kþ1 ¼ k k 2 lk 2l
Lq ¼
¼
kþ1 k2 : : 2k lðl kÞ
(iii) Average (expected) number of units in the system From the relation between the expected number of units in the system and in the queue, we have Ls ¼ Lq þ or
Ls ¼
k l
kþ1 k2 k þ : 2k lðl kÞ l
(iv) Expected waiting time in the system The expected waiting time in the system is given by 1 Ws ¼ L s k kþ1 k 1 þ : ¼ 2k lðl kÞ l (v) Expected waiting time in the queue The expected waiting time in the queue is given by 1 Wq ¼ Lq k kþ1 k : ¼ 2k lðl kÞ Example 7 In a factory cafeteria, the customers have to pass through three counters. The customers buy coupons at the first counter, select and collect the snacks at the
15.4
Non-poisson Queueing Models
467
second counter and collect tea at the third. The server at each counter takes on average 1.5 min, although the distribution of service time is approximately exponential. If the arrival of customers to the cafeteria is approximately Poisson at an average rate of 6 per hour, calculate: (i) The average time that a customer spends waiting in the cafeteria (ii) The average time of getting the service (iii) The most probable time getting the service. Solution This is a queueing problem of the form ðM=Ek =1Þ:ð1=FCFS=1Þ. Here k = number of phases = 3 Service time per phase = 1.5 min ∴ Service time per customer = 1.5 3 = 4.5 min 1 ) l ¼ 4:5 customers/minute = 40 3 customers/h. Here k ¼ 6 customers/h. (i) The average waiting time spent by a customer in the cafeteria is given by kþ1 k 3þ1 6 40 h : ¼ 40 2k lðl kÞ 2 3 3 3 6 9 h ¼ 2:45 min: ¼ 220
Wq ¼
(ii) The average time of getting the service ¼
1 3 ¼ h ¼ 4:5 min l 40
(iii) The most probable time spent in getting the service = the modal value of service time t = the modal value of the Erlang distribution. 31 31 1 Hence, the most probable time spent ¼ k1 lk ¼ 403 ¼ 403 ¼ 20 h = 3 min. 3
3
[The probability density function of the Erlang distribution with parameters l and k is given by f ðt; l; k Þ ¼
ðlkÞk k1 klt t e ; ðk 1Þ!
0 t\1; k ¼ 1; 2; . . .: 2
l 1 The mode of the distribution is at k1 kl and the mean = l, with variance ¼ k .]
468
15
Queueing Theory
15.4.3 Model (M/G/1):(1/GD) A precise description of this model is as follows: (i) Customers arrive according to a Poisson fashion at an average rate of k. (ii) The service time t is represented by any probability distribution with mean E ðtÞ ¼ l1 and variance vðtÞ ¼ r2 . (iii) This system consists of a single server. (iv) The service discipline is a general service discipline such as FCFS, LCFS, SIRO, etc. (v) An arriving customer who finds the servers idle begins to take the service immediately. (vi) All blocked customers wait until served. (vii) The server cannot be idle when there is a waiting customer. Here the probability that a customer leaves the system in ðt; t þ DtÞ is a quantity dependent on when the service of that particular customer begins. That is, it would involve not only t, but also another time variable, say t0 , measured from the beginning of the current service. Let q0 denote the queue length when the service of the nth customer terminates and let q1 be the queue length when the service of the ðn þ 1Þ th customer terminates. Further, let kðk ¼ 0; 1; 2; . . .Þ denote the number of customers who arrive during the service of the ðn þ 1Þth customer. Then the relation between q0 ; q1 and k is given by 8 > >
> : i:e. the queue has q0 units after serving n-th customer: This relation can be written as q1 ¼ q0 d þ k;
ð15:78Þ
0 if q0 ¼ 0 . 1 if q0 [ 0 Now, taking expectation on both sides of (15.78), we have
where d ¼
E ðq1 Þ ¼ E ðq0 Þ EðdÞ þ EðkÞ: In a steady state system, E ðq1 Þ ¼ E ðq0 Þ. Hence, from (15.79), we have
ð15:79Þ
15.4
Non-poisson Queueing Models
469
E ðdÞ ¼ Eðk Þ:
ð15:80Þ
Now, squaring both sides of (15.78), we have q21 ¼ q20 þ k2 þ d2 þ 2kq0 2dq0 2dk or
q21 ¼ q20 þ k2 þ d þ 2kq0 2q0 2kd " # since d can take values 0 and 1 only, so d2 ¼ d and dq0 ¼ q0
ð15:81Þ :
Now taking expectation on both sides, we have E q21 ¼ E q20 þ E k 2 þ E ðdÞ þ 2E ðkq0 Þ 2Eðq0 Þ 2E ðkdÞ:
ð15:82Þ
Again, since E q21 ¼ E q20 is in a steady state system, from (15.82) we have 2Eðq0 Þ 2Eðkq0 Þ ¼ E k 2 þ E ðdÞ 2E ðkdÞ or 2E ðq0 Þ 2Eðk ÞE ðq0 Þ ¼ E k2 þ E ðk Þ 2fE ðkÞg2 " # From (3), E ðdÞ ¼ E ðkÞ and q0 ; k and d are independent or
E ðq0 Þ ¼
ð15:83Þ
E ð k 2 Þ þ E ð k Þ 2fE ð k Þ g2 : 2f1 E ð k Þ g
Now, in order to determine E ðq0 Þ, the values of E ðkÞ and E ðk2 Þ must be computed. Since the arrivals in time t follow a Poisson distribution, Eðk=tÞ ¼ kt; E k 2 =t ¼ ðktÞ2 ; V ðk=tÞ ¼ E k 2 =t fE ðk=tÞg2 ; where k is the mean arrival rate. Hence, Z1 Z Eðk Þ ¼ Eðk=tÞf ðtÞdt ¼
Z
1
kt f ðtÞdt ¼ k
0
0
1
t f ðtÞdt ¼ kEðtÞ
0
Also, E k2 ¼
Z1 Z1h i 2 E k =t f ðtÞdt ¼ ðktÞ2 þ kt f ðtÞdt 0
0
¼ k E t2 þ kE ðtÞ ¼ k2 V ðtÞ þ k2 fE ðtÞg2 þ kEðtÞ: 2
470
15
Queueing Theory
Substituting the values of E ðkÞ and E ðk2 Þ in (15.5), we have E ðq0 Þ ¼
k2 V ðtÞ þ k2 fE ðtÞg2 þ kE ðtÞ þ kE ðtÞ 2k2 fE ðtÞg2 2f1 kE ðtÞg
k2 V ðtÞ þ k2 fE ðtÞg2 þ 2kE ðtÞ 2k2 fE ðtÞg2 2f1 kE ðtÞg h i 2 k V ðtÞ þ fE ðtÞg2 þ kE ðtÞ ¼ 2f1 kE ðtÞg 2 2 1 k2 r2 þ l1 1 6 As E ðtÞ ¼ l where l is the mean o þk 4 ¼ n l 2 1 k l1 service time and V ðtÞ ¼ r2 k2 r2 þ q2 k þ q As q ¼ ¼ l 2ð1 qÞ ¼
or
Ls ¼ q þ
ð15:84Þ
k2 r 2 þ q 2 : 2ð1 qÞ
This is known as the Pollaczek-Khintchine (P-K) formula. Alternative form of Pollaczek formula If the (n + 1)th customer has to wait for the time w units before taking the service, the length of the queue must be q1 during the time w + t. Hence, E ðq1 Þ ¼ kE ðw þ tÞ ¼ kE ðwÞ þ kE ðtÞ or kE ðwÞ ¼ E ðq1 Þ kE ðtÞ, h i i.e. the average waiting time ¼ EðwÞ ¼ 1k E ðq1 Þ lk where E ðtÞ ¼ l1 or
# " since Eðq1 Þ ¼ E ðq0 Þ 1 k2 r2 þ q2 q E ðw Þ ¼ P þ k 2ð1 qÞ in the steady state
k2 r2 þ q2 : 2ð1 qÞk E ð w Þ l k2 r 2 þ q 2 ¼ ) E ðtÞ 2ð1 qÞk E ðwÞ q ¼ 1 þ l2 r2 : or E ðt Þ 2ð1 qÞ ¼
This formula was also suggested by Pollaczek. Hence, the average number of customers in the system is given by
ð15:85Þ
15.4
Non-poisson Queueing Models
471
Ls ¼ q þ
k2 r2 þ q2 : 2ð1 qÞ
The average number of customers in the queue is given by Lq ¼ Ls
k k 2 r 2 þ q2 ¼ Ls q ¼ : l 2ð1 qÞ
Example 8 In a heavy machine shop, the overhead crane is 75% utilized. Time study observations gave the average slinging time as 10.5 min with a standard deviation of 8.8 min. What is the average calling rate for the service of the crane, and what is the average delay in getting service? If the average service time is cut to 8.0 min with a standard deviation of 6.0 min, how much reduction will occur, on average, in the delay of getting service? Solution This is an (M/G/1):(1/FCFS/1) queueing system. Since the overhead crane is utilized 75%, q ¼ 0:75. Again, 60 ¼ 5:71 per h 10:5 k ¼ ql ¼ 0:75 5:71 ¼ 4:29 per h 8:8 r ¼ 8:8 min ¼ h: 60
l¼
Now. the average delay in getting the service is given by 2 2 þ ð0:75Þ2 k2 r2 þ q2 ð4:29Þ 8:8 60 ¼ ¼ 0:4468 h ¼ 26:8 min: Lq ¼ 2kð1 qÞ 2 4:29 ð1 0:75Þ
If the service time is cut to 8 min, then l¼
60 4:29 ¼ 7:5 per hour and q ¼ ¼ 0:571: 8 7:5
6 In this case, the standard deviation = 6.0 min ¼ 60 ¼ 0:1 h.
Then Lq ¼
ð4:29Þ2 ð0:1Þ2 þ ð0:571Þ2 h ¼ 0:1386 h ¼ 8:3 min: 2 4:29 ð1 0:571Þ
472
15
Queueing Theory
In the second case, a reduction of (26.8–8.3), i.e. 18.5 min, will occur on average in the delay of getting service. The average waiting time of a customer in queue is given by Lq k2 r2 þ q2 ¼ : k 2kð1 qÞ
Wq ¼
The average waiting time that a customer spends in the system is given by Ws ¼ Wq þ service time ¼
k2 r2 þ q2 1 þ : 2kð1 qÞ l
Efficiency of this queueing system The efficiency of this queueing system is measured by the ratio
Wq E ðt Þ
as follows:
Wq average waiting time in the queue : ¼ average service time E ðt Þ From this ratio, we conclude that the system will be better for a smaller ratio. Note: In particular, let G ¼ M. Then 1 1 ; E ðt Þ ¼ : 2 l l h i 2 1 1 k k l2 þ l2 q2 q
¼ qþ ; Ls ¼ þ ¼ k l 1 q 1 q 2 1l VarðtÞ ¼
which is the same result as obtained earlier for the queueing model (M/M/1):(1/ FCFS/1). Example 9 In a queueing system with a single server, arrivals are Poisson distributed with mean 5 per hour. Find the expected waiting time in the system if the service time distribution is normal with mean 3 min and standard deviation 2 min. Hints: Here k = 5 per hour, l ¼ 60 3 ¼ 20 per hour. r ¼ 2 min ¼ The expected waiting time in the system
2 h: 60
15.4
Non-poisson Queueing Models
473
¼ Ls ¼
15.5
k2 r2 þ q2 1 þ : 2kð1 qÞ l
Mixed Queueing Model and Cost
15.5.1 Introduction In this section, we shall discuss a queueing model with random arrival rate and deterministic service rate. This type of queueing model is known as a mixed queueing model. We shall also discuss the cost models of an M/M/1 system with finite and infinite capacity to find the optimum service level by minimizing the cost of the system.
15.5.2 Mixed Queueing Model For a queueing model, if either the arrival rate or the service rate is fixed and the other one is random, then the model is known as a mixed queueing model. Model: (M/D/1):(∞/FCFS/∞) This queueing system involves a single server, Poisson arrivals and a deterministic distribution. The capacity of the system is infinite. Here the service time is not a random variable but a constant. Without loss of generality, the service time can be taken as unity, l ¼ 1. It is also assumed that a steady state condition prevails, with q ¼ k\1 as l ¼ 1. In this scheme, the system is observed just after the completion of service of a customer. Hence, if n customers are in the system at the observation time, then at the time of the next observation, the number n0 in the system is given by n0 ¼
k if n ¼ 0 ; n 1 þ k; if n [ 0
where k ¼ 0; 1; 2; . . . is the number of arrivals during the service time which is taken as unity. Let qk be the probability that k persons arrive during the service time. Since the arrival rate follows a Poisson distribution, qk ¼
ek ðkÞk ; k!
k ¼ 0; 1; 2; . . .
474
15
Table 15.2 Transaction matrix
Queueing Theory
n0 ! n
0
1
2
3
4
…
…
0 1 2 3 4 .. . .. .
q0 q0 0 0 0 .. . .. .
q1 q1 q0 0 0 .. . .. .
q2 q2 q1 q0 0 .. . .. .
q3 q3 q2 q1 q0 .. . .. .
q4 q4 q3 q2 q1 .. . .. .
… … … … … .. . .. .
… … … … … .. . .. .
where k is the mean arrival rate. For different values of n and n0 , the transaction matrix between the end and the beginning of the service is given by Table 15.2. In the transaction matrix, q0 is the probability of no arrival during the service time. Let pn ðn ¼ 0; 1; 2; . . .Þ be the steady state probability that there are n customers in the system. Then the probabilities are given by p0 ¼ p0 q0 þ p1 q0 p1 ¼ p0 q1 þ p1 q1 þ p2 q0 p2 ¼ p0 q2 þ p1 q2 þ p2 q1 þ p3 q0
The ðn þ 1Þth equation of the above is given by pn ¼ p0 qn þ p1 qn þ p2 qn1 þ p3 qn3 þ þ pn þ 1 q0 ; or
pn ¼ p 0 q n þ
nX þ1
pj qn þ 1j ;
n ¼ 0; 1; 2; . . .
n ¼ 0; 1; 2; . . .:
j¼1
ð15:86Þ Applying a Z-transform to Eq. (15.86), we have P n [Z-transform of pn is defined by Z ðpn Þ ¼ 1 n¼0 z pn ¼ PðzÞ ðsayÞ] 1 X n¼0
z n pn ¼
1 X n¼0
z n p0 qn þ
1 X n¼0
( zn
nX þ1 j¼1
) pj qn þ 1j
15.5
Mixed Queueing Model and Cost
¼
1 X
z pn qn þ n
n¼0
or
1 X
475
( z
n þ 11
n¼0
nX þ1
) pj qn þ 1j
j¼0
( ) 1 m X 1X m z pn ¼ p0 z qn þ z pj qmj p0 qm ; where n ¼ m þ 1: z m¼1 n¼0 n¼0 j¼0
1 X
1 X
n
n
ð15:87Þ Let us consider rm ¼
m X
pj qmj
ð15:88Þ
j¼0
or
rm ¼ p0 qm þ p1 qm1 þ p2 qm2 þ þ pm q0 ;
i.e. rm is the convolution of pm and qm . Hence, Z ðrm Þ ¼ Z ðqm ÞZ ðpm Þ or
RðzÞ ¼ QðzÞPðzÞ; where RðzÞ ¼ Z ðrm Þ and QðzÞ ¼ Z ðqm Þ:
ð15:89Þ
Then, from (15.87), we have PðzÞ ¼ p0 QðzÞ þ
1 1 1X p0 X zm rm z m qm z m¼1 z m¼1
1 p0 fRðzÞ r0 g fQðzÞ q0 g z z 1 p0 ¼ p0 QðzÞ þ RðzÞ QðzÞ ½as r0 ¼ p0 q0 z z 1 p0 ¼ p0 QðzÞ þ PðzÞQðzÞ QðzÞ ½as RðzÞ ¼ PðzÞQðzÞ z z ðz 1ÞQðzÞ PðzÞ ¼ p0 : z QðzÞ ¼ p0 QðzÞ þ
or Now,
ðz 1ÞQðzÞ 0 Pð1Þ ¼ lim PðzÞ ¼ p0 lim ¼ form z!1 z!1 z Q ðzÞ 0 ðz 1ÞQ0 ðzÞ þ QðzÞ ¼ p0 lim ½Using L'Hospital's rule z!1 1 Q0 ðzÞ Q ð 1Þ ¼ p0 1 Q 0 ð 1Þ p0 ½as Qð1Þ ¼ 1: ¼ 1 Q 0 ð 1Þ
ð15:90Þ
476
15
Hence,
Queueing Theory
¼1
p0 1Q0 ð1Þ
or p0 ¼ 1 Q0 ð1Þ: Now we have to compute Q0 ð1Þ. )QðzÞ ¼
1 X
qn zn ; where qn is the probability of new arrivals:
n¼0
)QðzÞ ¼
1 k n X e k
n!
n¼0
¼ ek
zn ½Since the distribution of arrivals is Poisson; qn ¼
ek kn n!
1 X
ðkzÞn n! n¼0
¼ ekðz1Þ : Hence, Q0 ð1Þ ¼ k i:e:; p0 ¼ 1 k. From (15.90), we have PðzÞ ¼
ð1 kÞðz 1ÞQðzÞ ð1 kÞðz 1Þekðz1Þ ð1 kÞð1 zÞ ¼ ¼ : z Q ðzÞ 1 zekðz1Þ z ekðz1Þ
Determination of pn In obtaining pn we observe that a general expression cannot be easily obtained in this case because of the complexity of PðzÞ. However, pn can be obtained by using the formula Pn ð0Þ ¼ n!pn ; where Pn ð0Þ ¼
h
i
@ n PðzÞ @zn z¼0 1 n n! P ð0Þ.
as PðzÞ ¼
P1 n¼0
pn z n .
Therefore, pn ¼ Again, the expected number of customers in the system is given by 2
1 X
3
pn z 7 6 since PðzÞ ¼ 6 7 n¼0 7 n pn ¼ P ð 1Þ 6 Ls ¼ 6 7 1 X 4 0 n¼0 n1 5 )P ðzÞ ¼ npn z 1 X
or
Ls ¼
n
0
00
k þ Q ð1Þ kþk ¼ 2ð 1 kÞ 2ð 1 kÞ 2
n¼0
As PðzÞ ¼ p0
ðz 1ÞQðzÞ Z QðzÞ
:
15.6
15.6
Cost Models in Queueing System
477
Cost Models in Queueing System
In a queueing system, there are two types of conflicting costs: (i) Cost of offering the service to the customers (ii) Cost incurred due to delay in offering the service to the customers. These two types of costs conflict in nature. An increase in the existing service facility would reduce the customers’ waiting time; on the other hand, a decrease in the level of service would increase the customers’ waiting time. This means that a higher cost of offering the service would reduce the cost of waiting to the customers and vice versa. Therefore, the service level in a queueing system is a function of the service rate l, which balances the two conflicting costs. The objective is to find the optimum service level such that the sum of these costs is minimized. Hence, the total cost TC of the queueing system is given by TC ¼ \cost for providing service facility[ þ \cost for waiting of customers[ TC ¼ C1 l þ C2 Ls where C1 = cost of service per customer per unit time C2 = cost for waiting of customers per unit time Ls = average number of customers in the queueing system. Now, for the queueing model (M/M/1):(1 /FCFS/1), Ls ¼
k lk
and TC ¼ C1 l þ C2
k : lk
Here TC is a function of the continuous variable l. Then, the necessary condition for TC to be optimum is given by dðTCÞ ¼ 0; dl C2 k which implies C1 ðlk ¼0 Þ2
ð15:91Þ
478
15
) l ¼ kþ
Queueing Theory
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi kC2 =C1 :
ð15:92Þ
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 2kC2 ¼ ðlk kC2 =C1 . Then ddlTC 3 , which is positive for l ¼ k þ 2 Þ Hence, the optimum value of l is l ¼ k þ
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi kC2 =C1 :
ð15:93Þ
This result shows that the optimum value of l is dependent on C1 ; C2 and also on the arrival rate k. Again, for the queueing model (M/M/1):(N/FCFS/1), the capacity of the system is finite, and its value is N when the system is fulfilled; then newly arrived customers cannot join in the queue. As a result, the larger the value of N, the smaller will be the number of lost customers. In this case, the total cost TC of the system is given by TC ¼ hcost of service ratei þ hcost of waiting customer i or
þ hcost of servicing customersi þ hcost of lost customersi TC ¼ C1 l þ C2 Ls þ C3 N þ kpN C4 ; ð15:94Þ
where C3 = cost of servicing to each customer per unit time C4 = cost per lost customer and kpN = number of lost customers per unit time. Case 1 When q 6¼ 1 Ls ¼
q½1 ðN þ 1ÞqN þ NqN þ 1 ð1 qÞð1 qN þ 1 Þ
ð15:95Þ
ð1 qÞqN : 1 qN þ 1
ð15:96Þ
and pN ¼
In this case, TC ¼ C1 l þ
C2 q½1 ðN þ 1ÞqN þ NqN þ 1 þ C3 N þ kC4 ð1 qÞqN = 1 qN þ 1 : N þ 1 ð1 qÞð1 q Þ ð15:97Þ
15.6
Cost Models in Queueing System
479
Case 2 When q = 1 Ls ¼
N 1 and pN ¼ : 2 N þ1
In this case, TC ¼ C1 l þ C2 N=2 þ C3 N þ kC4 =ðN þ 1Þ:
ð15:98Þ
As N is given, TC is a function of the continuous variable l only. The optimum value of l can be obtained easily by minimizing the total cost TC given by either (15.97) or (15.98).
15.7
Exercises
1. Arrivals at a telephone booth are considered to be Poisson, with an average time of 10 min between one arrival and the next. The length of a phone call is assumed to be distributed exponentially with mean 3 min. (i) What is the probability that a person arriving at the booth will have to wait? (ii) Find the average number of units in the system. (iii) Estimate the fraction of a day that the phone will be in use. (iv) What is the probability that it will take a person more than 10 min altogether to wait for the phone and complete his call? 2. At a certain airport it takes exactly 5 min to land an aeroplane, once it is given the signal to land. Although incoming planes have scheduled arrival times, the wide variability in arrival times produces an effect which makes the incoming planes appear to arrive in a Poisson fashion at an average rate of 6 per hour. This produces occasional congestion in the airport, which can be dangerous and costly. Under these circumstances, how much time will a pilot expect to spend circling the field waiting to land? 3. Patients arrive at a clinic according to a Poisson distribution at a rate of 30 patients per hour. The waiting room does not accommodate more than 14 patients. The examination time per patient is exponential with a mean rate of 20 per hour. (i) Find the effective arrival rate at the clinic. (ii) What is the probability that an arriving patient will not wait? (iii) What is the expected waiting time until a patient is discharged from the clinic?
4. A barber with a one-man operation takes exactly 25 min to complete one haircut. If customers arrive in a Poisson fashion at an average rate of one every 40 min, how long on average must a customer wait for service? Also find the average time that a customer spends in the barbershop.
5. A road transport company has one reservation clerk on duty at a time. He handles information on bus schedules and makes reservations. Customers arrive at a rate of 8 per hour, and the clerk can service 12 customers on average per hour. (i) What is the average number of customers waiting for the service of the clerk? (ii) What is the average time a customer has to wait before getting service? (iii) The management is contemplating installing a computer system to handle the information and reservations. This is expected to reduce the service time from 5 to 3 min. The additional cost of having the new system works out to Rs. 400.00 per day. If the cost of goodwill of having to wait is estimated to be Re. 1.00 per minute spent waiting to be served, should the company install the computer system? Assume that the working time is 8 h per day.
6. At a one-man barbershop, customers arrive according to a Poisson distribution with a mean arrival rate of 5 per hour, and the hair-cutting time is exponentially distributed, with an average haircut taking 10 min. It is assumed that because of his excellent reputation, customers are always willing to wait for this barber. Calculate: (i) the average number of customers in the shop and the average number of customers waiting for a haircut, (ii) the percentage of time an arrival can walk right in without having to wait, (iii) the percentage of customers who have to wait prior to getting into the barber's chair.
7. At a certain filling station, customers arrive in a Poisson process at an average rate of 12 per hour. The time intervals between services follow an exponential distribution, and as such the mean time taken to service a unit is 2 min.
Evaluate:
(i) The probability that there is no customer at the counter
(ii) The probability that there are more than two customers at the counter
(iii) The probability that there is no customer to be served
(iv) The probability that a customer is being served, but nobody is waiting
(v) The expected number of customers in the waiting line
(vi) The expected time a customer spends in the system.
8. A supermarket has a single cashier. During peak hours, customers arrive at a rate of 20 customers per hour. The average number of customers who can be processed by the cashier is 24 per hour. Calculate:
(i) The probability that the cashier is idle
(ii) The average number of customers in the queueing system
(iii) The average time a customer spends in the system
(iv) The average number of customers in the queue
(v) The average time a customer spends in the queue waiting for service.
9. An airlines organization has one reservation clerk on duty in its local branch at any given time. The clerk handles information regarding passenger reservations and flight schedules. Assume that the number of customers arriving during any given period is Poisson distributed with an arrival rate of 8 per hour and that the reservation clerk can serve a customer in 6 min on average, with an exponentially distributed service time. (i) What is the probability that the system is busy? (ii) What is the average time that a customer spends in the system? (iii) What is the average length of the queue, and what is the number of customers in the system?
10. At a public telephone booth in a post office, arrivals are considered to be Poisson with an average inter-arrival time of 12 min. The length of a phone call may be assumed to be distributed exponentially with an average of 4 min. (i) What is the probability that a fresh arrival will not have to wait for the phone? (ii) What is the probability that an arrival will have to wait more than 10 min before the phone is free? (iii) What is the average length of the queues that form from time to time?
11. A bank plans to open a single-server drive-in banking facility at a particular centre. It is estimated that 20 customers will arrive each hour on average. If, on average, it requires 2 min to process a customer's transaction, determine: (i) the proportion of time that the system will be idle, (ii) how long, on average, a customer will have to wait before reaching the server, (iii) the fraction of customers who will have to wait.
12. A warehouse has only one loading dock manned by a three-person crew. Trucks arrive at the loading dock at an average rate of 4 trucks per hour, and the arrival rate is Poisson distributed. The loading of a truck takes 10 min on average and can be assumed to be exponentially distributed. The operating cost of a truck is Rs. 50 per hour, and the members of the loading crew are paid Rs. 15 each per hour.
Would you advise the truck owner to add another crew of three persons?
13. In a factory, machines break down at an average rate of 10 machines per hour. The idle time cost of a machine is estimated to be Rs. 100 per hour. The factory works 8 h a day. The factory manager is considering two mechanics for repairing the machines. The first mechanic A takes about 5 min on average to
repair a machine and demands wages of Rs. 50 per hour. The second mechanic B takes about 4 min to repair a machine and demands wages at the rate of Rs. 75 per hour. Assuming that the rate of machine breakdown is Poisson distributed and the repair rate is exponentially distributed, which of the two mechanics should be engaged?
14. In the production shop of a company, the breakdown of the machines is found to be Poisson distributed with an average rate of 3 machines per hour. The breakdown time of one machine costs the company Rs. 200 per hour. The company is considering two repairmen for hire: one slow but cheap, the other fast but expensive. The slow, cheap repairman demands Rs. 100 per hour and will repair broken-down machines exponentially at the rate of 4 per hour. The fast, expensive repairman demands Rs. 150 per hour and will repair machines exponentially at an average rate of 6 per hour. Which repairman should be hired?
15. At a railway station, only one train is handled at a time. The railway yard is sufficient for only two trains to wait; beyond that, a newly arriving train is given the signal to leave the station. Trains arrive at the station at an average rate of 6 per hour, and the railway station can handle them at an average of 12 per hour. Assuming Poisson arrivals and an exponential service distribution, find the steady state probabilities for the various numbers of trains in the system. Also find the average waiting time of a new train coming into the yard.
16. Assume that goods trains arrive at a yard at the rate of 30 trains per day, and suppose that the inter-arrival times follow an exponential distribution. The service time for each train is assumed to be exponential with an average of 36 min. If the yard can admit 9 trains at a time (there being 10 lines, one of which is reserved for shunting purposes), calculate the probability that the yard is empty and find the average queue length.
17. If for a period of 2 h in the day (8 to 10 a.m.) trains arrive at the yard every 20 min but the service time continues to remain 36 min, then calculate for this period: (a) the probability that the yard is empty, (b) the average number of trains in the system, on the assumption that the line capacity of the yard is limited to 4 trains only.
18. A petrol station has a single pump and space for not more than 3 cars (2 waiting, 1 being served). A car arriving when the space is filled to capacity goes elsewhere for petrol. Cars arrive according to a Poisson distribution at a mean rate of one every 8 min. Their service time has an exponential distribution with a mean of 4 min. The proprietor has the opportunity of renting an adjacent piece of land, which would provide space for an additional car to wait (he cannot build another pump). The rent would be Rs. 10 per week. The expected net profit from each customer is Re. 0.50, and the station is open 10 h every day. Would it be profitable to rent the additional space?
19. A barbershop has two barbers and three chairs for waiting customers. Assume that customers arrive in a Poisson fashion at a rate of 5 per hour and that each
barber services customers according to an exponential distribution with a mean of 15 min. Further, if a customer arrives and there are no empty chairs in the shop, he will leave. Find the steady state probabilities. What is the probability that the shop is empty? What is the expected number of customers in the shop?
20. A car servicing station has two bays where service can be offered simultaneously. Due to space limitations, only 4 cars are accepted for servicing. The arrival pattern is Poisson with 12 cars per day. The service time in both bays is exponentially distributed with μ = 8 cars per day per bay. Find the average number of cars in the service station, the average number of cars waiting to be serviced and the average time a car spends in the system.
21. A mechanic repairs four machines. The mean time between service requirements is 5 h for each machine and follows an exponential distribution. The mean repair time is 1 h and also follows the same distribution pattern. Machine downtime costs Rs. 25 per hour, and the mechanic costs Rs. 55 per day. Determine the following:
(a) The probability that the service facility will be idle
(b) The probability of various numbers of machines (0–4) waiting to be repaired, or being repaired
(c) The expected number of machines waiting to be repaired, and being repaired
(d) The expected downtime cost per day.
Would it be economical to engage two mechanics, each repairing only two machines?
22. A repairman is to be hired to repair machines which break down at an average rate of 6 per hour. The breakdown follows a Poisson distribution. The non-productive time of a machine is considered to cost Rs. 20 per hour. Two repairmen, Mr. X and Mr. Y, have been interviewed for this purpose. Mr. X charges Rs. 10 per hour, and he services broken-down machines at the rate of 8 per hour. Mr. Y demands Rs. 14 per hour, and he services at an average rate of 12 per hour. Which repairman should be hired? (Assume an 8-h shift per day.)
23. A TV repairman finds that the time spent on his jobs has an exponential distribution with mean 30 min. If he repairs sets in the order in which they come in, and if the arrival of sets is approximately Poisson with an average of 10 per 8-h day, what is the repairman's expected idle time each day? How many jobs are ahead of the average set just brought in?
24. An insurance company has three claim adjusters in its branch office. People with claims against the company are found to arrive in a Poisson fashion at an average rate of 20 per 8-h day. The amount of time that an adjuster spends with a claimant is found to have an exponential distribution with a mean service time of 40 min. Claimants are processed in the order of their appearance. How many hours a week can an adjuster expect to spend with claimants?
25. At a port, let there be six unloading berths and four unloading crews. When all the berths are full, arriving ships are diverted to an overflow capacity 20 km
down the river. Tankers arrive according to a Poisson process with a mean of one every 2 h. It takes an unloading crew, on average, 10 h to unload a tanker, the unloading time following an exponential distribution. (a) On average, how long does a tanker spend at the port? (b) What is the average arrival rate at the overflow capacity?
26. There is congestion on the platform of a railway station. Trains arrive at the rate of 30 trains per day. The waiting time for any train to hump is exponentially distributed with an average of 36 min. Calculate: (a) the mean queue size, (b) the probability that the queue size exceeds 9.
27. A colliery working one shift per day uses a large number of locomotives which break down at random intervals; on average one fails per 8-h shift. The fitter carries out a standard maintenance schedule on each faulty locomotive. Each of the five main parts of this schedule takes on average 1/2 h, but the time varies widely. How much time will the fitter have for the other tasks, and what is the average time a locomotive is out of service?
28. If, for a period of 2 h in a day, trains arrive at the yard every 24 min and the service time is 36 min, then calculate for this period: (a) the probability that the yard is empty, (b) the average queue length, on the assumption that the line capacity of the yard is limited to 4 trains only.
29. At what average rate must a clerk at a supermarket work in order to ensure a probability of 0.90 that a customer will not have to wait longer than 10 min? It is assumed that there is only one counter, at which customers arrive in a Poisson fashion at an average of 15 per hour. The length of service by the clerk has an exponential distribution.
30. Customers arrive at a one-window drive-in bank according to a Poisson distribution with mean 10 per hour. The service time per customer is exponential with mean 5 min.
The space in front of the window, including that for the serviced car, can accommodate a maximum of three cars. Other cars can wait outside the space. (a) What is the probability that an arriving customer can drive directly to the space in front of the window? (b) What is the probability that an arriving customer will have to wait outside the indicated space? (c) How long is an arriving customer expected to wait before starting service? (d) How many spaces should be provided in front of the window so that all the arriving customers can wait in front of the window at least 90% of the time?
31. In a self-service facility, arrivals occur according to a Poisson distribution with mean 50 per hour. The service time per customer is exponentially distributed with mean 5 min. (a) Find the expected number of customers in service. (b) What is the percentage of time the facility is idle?
32. An oil refinery receives crude oil at an average rate of one tanker per day. The unloading facilities, which operate 24 h per day, can handle one tanker at a time, but can unload tankers at an average rate of 2 per day. Under the usual assumptions of Poisson arrivals and exponential service times, determine:
(a) The average number of tankers in the system
(b) The average time spent by a tanker in the system
(c) The average waiting time of a tanker in the queue
(d) The percentage of time in which exactly two tankers are in the system.
33. A group of engineers has two terminals available to aid in their calculations. The average computing job requires 20 min of terminal time, and each engineer requires some computation about once every 0.5 h; that is, the mean time between calls for service is 0.5 h. Assume these times follow an exponential distribution. If there are six engineers in the group, find: (i) the expected number of engineers waiting to use one of the terminals, (ii) the total lost time per day.
34. In a factory cafeteria, the customers (employees) have to pass through three counters. The customers buy coupons at the first counter, select and collect their snacks at the second counter and collect tea at the third. The server at each counter takes 1.5 min on average, the service time being approximately exponential, and customers arrive in a Poisson fashion at an average rate of 6 per hour. Calculate: (a) the average time of getting the service, (b) the average time a customer spends waiting in the cafeteria, (c) the most probable time in getting the service.
35. At a certain airport it takes exactly 5 min to land an aeroplane, once it is given the signal to land. Although incoming planes have scheduled arrival times, the wide variability in arrival times produces an effect which makes the incoming planes appear to arrive in a Poisson fashion at an average rate of 6 per hour. This produces occasional congestion at the airport, which can be dangerous and costly. Under these circumstances, how much time will a pilot expect to spend circling the field waiting to land?
36. In a car manufacturing plant, a loading crane takes exactly 10 min to load a car into a wagon and come back into position to load another car. If the arrival of cars is a Poisson stream at an average of one every 20 min, calculate the average waiting time of a car.
37. A barber runs his own saloon. It takes him exactly 25 min to complete one haircut. Customers arrive in a Poisson fashion at an average rate of one every 35 min. (a) For what percentage of time would the barber be idle? (b) What is the average time that a customer spends in the shop?
38. The repair of a lathe requires four steps to be completed one after another in a certain order. The time taken to perform each step follows an exponential distribution with a mean of 5 min and is independent of the other steps. The machine breakdown follows a Poisson process with a mean rate of 2 breakdowns per hour. Answer the following: (i) What is the expected idle time of the machine, assuming there is only one repairman available in the workshop? (ii) What is the average waiting time of a broken-down machine in the queue? (iii) What is the expected number of broken-down machines in the queue?
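Many of the exercises above are built on the basic M/M/1 queue. The standard steady-state measures can be sketched as a small helper; here it is applied, by way of illustration, to the data of Exercise 1 (λ = 1/10 per min, μ = 1/3 per min).

```python
import math

def mm1(lam, mu):
    """Standard M/M/1 measures (single server, infinite capacity, lam < mu)."""
    rho = lam / mu
    assert rho < 1, "steady state requires lam < mu"
    return {
        "P(wait)": rho,            # probability an arrival must wait (server busy)
        "Ls": rho / (1 - rho),     # mean number in system
        "Lq": rho**2 / (1 - rho),  # mean queue length
        "Ws": 1 / (mu - lam),      # mean time in system
        "Wq": rho / (mu - lam),    # mean waiting time in queue
    }

def p_time_in_system_exceeds(lam, mu, t):
    # The sojourn time in an M/M/1 queue is exponential with rate (mu - lam).
    return math.exp(-(mu - lam) * t)

m = mm1(1 / 10, 1 / 3)                              # Exercise 1 data, per minute
print(m["P(wait)"], m["Ls"])                        # 0.3 and 3/7
print(p_time_in_system_exceeds(1 / 10, 1 / 3, 10))  # e^(-7/3)
```

The same helper answers Exercises 5, 6, 8, 9 and similar single-server problems by changing λ and μ; the finite-capacity and multi-server exercises need the corresponding M/M/1/N and M/M/c formulas instead.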
Chapter 16
Flow in Networks
16.1 Objectives
The objectives of this chapter are to:
• Present ideas on graphs and their properties
• Study flow and potential in networks
• Discuss the solution procedure of a minimum path problem when all arc lengths are either non-negative or unrestricted in sign
• Present ideas on network flow, maximum flow problems and generalized maximum flow problems
• Discuss the source, the sink of a flow, the cut and its capacity in a network flow problem, and also the proof of the max-flow min-cut theorem
• Discuss the solution procedure of maximum flow problems
• Derive the LPP formulation of a network.
16.2 Introduction
Network is an important topic in engineering as well as in applied mathematics. It arises in different sectors of our daily life, such as transportation, electrical circuits and communication. Its representation is also widely used for problems in diverse areas like production, distribution, project planning, facilities location, resource management and financial planning. In fact, a network representation provides such a powerful visual and conceptual aid for portraying the relationships between the components of a system that it is used in virtually every field of scientific, social and economic endeavour. One of the most exciting developments in operations research in recent years has been the rapid advancement in both the methodology and applications of network optimization models.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_16
A number of algorithmic breakthroughs have had a major impact, employing ideas from computer science concerning data structure and efficient data manipulation. Consequently, algorithms and software are now available and are being used to solve numerous problems on a routine basis that would have been completely intractable two or three decades ago. A network, in its more generalized and abstract sense, is called a graph. In recent years, graph theory has been a subject of much study and research by mathematicians and has found ever-increasing applications in diverse areas. In the field of operations research, graph theory plays an important role in finding the optimal solution of the problem of choosing the best sequence of operations from a finite number of alternatives.
16.3 Graphs
A graph is a set of points (commonly called vertices or nodes) in space that are interconnected by a set of lines (called edges or arcs). For a graph G, the vertex set is denoted by V and the edge set by E. Common nomenclature denotes the number of vertices |V| by n and the number of edges |E| by m. In Fig. 16.1, a graph G with V = {v1, v2, v3, v4, v5}, E = {e1, e2, …, e7}, n = 5 and m = 7 is shown. The elements of E are denoted either by ek, k = 1, 2, …, m, or by (vi, vj). If the values of both n and m are finite, then the graph G is said to be finite. The degree of a vertex v, denoted by d(v), is the number of edges that have v as an end point. For example, in the graph of Fig. 16.1, there are two vertices v1 and v2 of odd degree (both have degree 3). A self-loop is an edge (vi, vj) where vi = vj. Two edges (vi, vj) and (vr, vs) are parallel edges if vi = vr and vj = vs. An arc with an arrow denoting the direction from vi to vj is called a directed arc. A graph with directed arcs is called a directed graph or digraph (Fig. 16.2). A subgraph of G(V, E) is defined as a graph G1(V1, E1) with V1 ⊆ V and E1 containing all those arcs of G which connect the vertices of G1. For example, in Fig. 16.1, if V1 = {v1, v2, v4} and E1 = {e1, e4, e6}, then G1(V1, E1) is a subgraph of G.
Fig. 16.1 Graph
Fig. 16.2 a Undirected graph, b directed graph
A partial graph of G(V, E) is a graph G2(V2, E2) which contains all the vertices of G and some of its arcs (E2 ⊆ E). An arc (directed or undirected) is said to be incident with each of the vertices which it joins; thus, it connects two vertices. The directed arc ek = (vi, vj) is said to be incident from or going from vi and incident to or going to vj. We call vi the initial vertex and vj the terminal vertex of the arc (vi, vj). Let V1 and V2 be two subsets of V such that they have no common vertex, and let ek = (vi, vj) be an arc such that vi ∈ V1, vj ∈ V2. Then ek is said to be incident from or going from V1 and incident to or going to V2. It is incident with both V1 and V2 and is said to connect them. In Fig. 16.3, if V1 = {v2, v3} and V2 = {v6, v7}, then e4 connects V1 and V2. It runs from V1 to V2 and is incident with both. Now we denote by X(Vk) the set of arcs of G(V, E) incident with a subset Vk of V, by X+(Vk) the set of arcs incident to Vk and by X−(Vk) the set of arcs incident from Vk. Hence,
Fig. 16.3 Directed graph
X+(V1) = {e1, e5}, X−(V1) = {e2, e4, e6}, X(V1) = {e1, e2, e4, e5, e6}.
Similarly, X+(V2) = {e4, e8}, X−(V2) = {e5}, X(V2) = {e4, e5, e8}. For a sequence of arcs (e1, e2, …, ek, ek+1, …, em) of a graph, if every intermediate arc ek has one vertex in common with the arc ek−1 and the other in common with ek+1, then this sequence is called a chain. For example, the sequence (e1, e6, e7) in Fig. 16.3 is a chain. We may also denote a chain by the vertices which it connects; for example, this chain may also be written as (v1, v2, v4, v5). A chain becomes a cycle if, in the sequence of arcs, no arc is used twice, and the first arc has a common vertex with the last arc, this vertex being common with no intermediate arc. For example, the chain (e3, e4, e9, e5) in Fig. 16.3 is a cycle. A path is a chain in which all the arcs are directed in the same sense, such that the terminal vertex of the preceding arc is the initial vertex of the succeeding arc. In Fig. 16.3, the sequence of arcs (e1, e6, e7, e8) is a path. We may also denote the path in terms of the vertices as (v1, v2, v4, v5, v7). Note that every path is a chain, but not every chain is a path. A circuit is a cycle in which all the arcs are directed in the same sense. The cycle (e1, e3, e2) is a circuit. A graph is said to be connected if for every pair of vertices there is a chain connecting the two. A graph is strongly connected if there is a path connecting every pair of vertices in it. A tree is defined as a connected graph with at least two vertices and no cycles. It can be proved that a tree with n vertices has (n − 1) arcs and that every pair of vertices is joined by one and only one chain. If we delete an arc from a tree, the resulting graph is not connected, and if we add an arc, a cycle is formed. As the name indicates, a natural tree is the best example of a graphical tree, with the branches forming the arcs and the extremities of the branches forming the vertices (Fig. 16.4).
A vertex which is connected to every other vertex of the graph by a path is called a centre of the graph. A graph may have no centre, or it may have many centres. Every vertex of a strongly connected graph is a centre. A tree with a centre is called an arborescence. In Fig. 16.5, the centre is marked. In an arborescence, all the arcs incident with the centre go away from it, and all other arcs are directed in the same sense.
Fig. 16.4 Tree
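The incidence sets X(Vk), X+(Vk) and X−(Vk) defined above are easy to compute from an arc list. The digraph below is a small hypothetical example (it is not the book's Fig. 16.3); arcs with both end points inside Vk are treated as internal and excluded, matching the case where Vk and its complement are disjoint vertex sets.

```python
# Incidence sets for a directed graph stored as an arc list.
arcs = {                       # arc name -> (initial vertex, terminal vertex)
    "e1": ("v1", "v2"), "e2": ("v2", "v3"),
    "e3": ("v3", "v1"), "e4": ("v2", "v4"),
}

def X_plus(Vk):    # arcs incident to Vk: terminal vertex inside, initial outside
    return {e for e, (u, v) in arcs.items() if v in Vk and u not in Vk}

def X_minus(Vk):   # arcs incident from Vk: initial vertex inside, terminal outside
    return {e for e, (u, v) in arcs.items() if u in Vk and v not in Vk}

def X(Vk):         # arcs incident with Vk (connecting Vk with its complement)
    return X_plus(Vk) | X_minus(Vk)

V1 = {"v1", "v2"}
print(sorted(X_plus(V1)), sorted(X_minus(V1)), sorted(X(V1)))
```

For the subset V1 = {v1, v2} here, e3 enters V1, while e2 and e4 leave it, and e1 is internal.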
Fig. 16.5 Graph with centre
16.4 Minimum Path Problem
Let us consider a real number xij associated with each arc (vi, vj) of a graph G(V, E), and let va and vb be two vertices of the graph. There may be a number of paths from va to vb. For each path, the length of the path is defined as Σ xij, where the summation is taken over the sequence of arcs forming the path. The objective of the problem is to find the path of smallest length. The term 'length' is used here in the general sense of any real number associated with an arc and should not be regarded as a geometrical distance; it may be distance, time, cost, etc. There may be some abstract situations in which the length is not even non-negative. In general, xij is a real number, unrestricted in sign. Several techniques/algorithms have been suggested for solving the minimum path problem. Here, we shall describe two different techniques for solving the minimum path problem when (i) xij ≥ 0 and (ii) xij is unrestricted in sign.
Case I: When xij ≥ 0, i.e. when all arc lengths are non-negative.
Let fi denote the minimum length of a path from the vertex va to vi. We have to find fb. Obviously, fa = 0. Let Vp be a subset of V such that va ∈ Vp and vb ∉ Vp. First, we determine fi for every vi in Vp, and then fi + xij for every vi in Vp and vj not in Vp such that (vi, vj) is an arc incident from Vp. Let
fr + xrs = Min (fi + xij),

where the minimum is taken over all such pairs, with vr ∈ Vp and vs ∉ Vp. Then the length of the minimum path from va to vs is given by fs = fr + xrs. Now, from the enlarged subset Vp+1 = Vp ∪ {vs} of V, we repeat the process.
Suppose we start with p = 0, with V0 consisting of the single vertex va and fa = 0. Following the procedure, the sets V1, V2, …, Vp, Vp+1, … are formed. As soon as we arrive at a set in this sequence which includes vb, fb has been obtained. If no such set can be found, there is no path connecting va to vb. We note that for solving these types of problems, drawing the graph is essential neither to describe the problem nor to find its solution. The problem is completely enunciated if all the vertices, arcs and arc lengths are specified. In fact, in a large problem with many vertices and arcs, drawing a graph is neither practicable nor necessary.
Case II: When xij is unrestricted in sign, i.e. when arc lengths are unrestricted in sign.
Let va, vb be two vertices in the graph G(V, E) whose arc lengths are real numbers: positive, negative or zero. We have to find the minimum path from va to vb. Let us assume that there is no circuit in the graph whose arc length is negative; if there were any such circuit, one could go around and around it and decrease the length of the path without limit, getting an unbounded solution. For solving the problem, we have to construct an arborescence A1(V1, E1), V1 ⊆ V, E1 ⊆ E, with centre va, where V1 contains all the vertices of V which can be reached from va along a path and E1 contains some of the arcs of E which are necessary to construct the arborescence. If V1 contains vb, a path connects va to vb. In a particular arborescence, this path is unique. There may be several arborescences; therefore, several paths may exist. If in any problem only one arborescence is possible, there is only one path from va to vb, and that is the solution. If V1 does not contain vb, there is no path from va to vb, and the problem has no solution. We now discuss the method of construction of an arborescence. Mark out the arcs going from va.
Suppose we start with p = 0 with V0 consisting of a single vertex va and fa = 0. Following the procedure, the sets V1 ; V2 ; . . .; Vp ; Vp þ 1 ; . . . are formed. As soon as we arrive at a set in this sequence which includes vb and fb has been obtained. If no such set can be found, there is no path connecting va to vb. We note that for solving these types of problems, drawing of the graph is not essential either to describe the problem or to find its solution. The problem is completely enunciated if all the vertices, arcs and arc lengths are specified. In fact, in a large problem with vertices and arcs, drawing of a graph is neither practicable nor necessary. Case II: When xij is unrestricted in sign, i.e. when arc lengths are unrestricted in sign. Let va, vb be two vertices in the graph G(V, E) whose arc lengths are real numbers, positive, negative or zero. We have to find the minimum path from va to vb. Let us assume that there is no circuit in the graph whose arc length is negative. If there is any such circuit, one can go around and around it and decrease the length of the path without limit, getting an unbounded solution. For solving the problem, we have to construct an arborescence A1 ðV1 ; E1 Þ; V1 V; E1 E, with centre va and V1 containing all the vertices of V which can be reached from va along a path and E1 containing from va along a path and some of the arcs of E which are necessary to construct the arborescence. If V1 contains vb, a path connects va to vb. In a particular arborescence, this path is unique. There may be several arborescences; therefore, several paths exist. If in any problem only one arborescence is possible, there is only one path from va to vb, and that is the solution. If V1 does not contain vb, there is no path from va to vb, and the problem has no solution. We now discuss the method of construction of an arborescence. Mark out the arcs going from vb. 
From the vertices so reached, mark out the arcs (not necessarily all of them) going out to other vertices. No vertex should be reached by more than one arc; that is, not more than one arc should be incident to any vertex. If there is a vertex to which no arc is incident, it cannot be reached from va and so is left out. No arc incident to va should be drawn (Fig. 16.6). Let fi denote the length of the path from va to any vertex vi in the arborescence. The arborescence determines fi uniquely for each vi in V, but fi is not necessarily minimum. Let vj ; vi be an arc in G but not in arborescence A1. Then consider the length fj þ xji and compare it with fi. If fi fj þ xji , make no change. If fi [ fj þ xji , delete the arc incident to vi in A1 and include the arc vj ; vi . This modifies the arborescence from A1 to A2 and reduces fi to its new value fj þ xji , the reduction in the value of fi being fi fj xji . The lengths of the paths to the vertices going through vi are also reduced by the same amount. These adjustments are made, and thus the new values of fi for all vi in A2 are calculated. Now repeat the operations in A2; i.e. select a vertex and check whether any alternative arc gives a smaller path to it. If yes, modify A2 to A3 and adjust fi
16.4
Minimum Path Problem
Fig. 16.6 Graph with centre
493
(a) 6
7
3
5
4
2
8
0
1
6
7
3
5
4
2
0
1
(b)
(c) 6
7
3
5
4
2
0
1
accordingly. Finally, we reach an arborescence Ar which cannot further be changed by the described procedure. Theorem-1 The arborescence Ar marks out the minimum path to each vi from va, and fb in this arborescence is the minimum path to vb. Proof Let ðva ; v1 ; v2 ; . . .; vb Þ be any path in G from va to vb. Its length is xa1 þ x12 þ þ xpb . The vertices in this path are in Ar because Ar contains all those vertices of G which can be reached from va. By the property of Ar, for every vertex vi in Ar and for every arc vj ; vi in G,
fi ≤ fj + xji, or fi − fj ≤ xji,

because otherwise Ar could have been modified further. Writing these inequalities for all vertices of the above path, we have

f1 − fa ≤ xa1,
f2 − f1 ≤ x12,
…
fb − fp ≤ xpb.

Adding, we get fb − fa ≤ xa1 + x12 + ⋯ + xpb, or

fb ≤ xa1 + x12 + ⋯ + xpb  [∵ fa = 0].

Thus, it is proved that no path from va to vb in G can be smaller than fb. Since a path of length fb also exists in G (the one marked out by Ar), this path is the minimum.
Note The path of maximum length can be found either by changing the signs of the lengths of all the arcs and then finding the minimum, or by reversing the test fi > fj + xji to fi < fj + xji.
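Algorithmically, the Case I set-growing procedure coincides with Dijkstra's method, and the repeated arborescence improvement of Case II is the relaxation scheme usually called Bellman–Ford. The sketches below, on hypothetical arc data, illustrate both; they are not the book's tabular workings.

```python
import heapq

def min_path_nonneg(arcs, va, vb):
    """Case I (all x_ij >= 0): Dijkstra. arcs: {(vi, vj): x_ij}.
    Returns f_b, or None if vb is unreachable."""
    adj = {}
    for (i, j), x in arcs.items():
        adj.setdefault(i, []).append((j, x))
    f = {va: 0}
    heap = [(0, va)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == vb:
            return d
        if d > f.get(u, float("inf")):
            continue                      # stale heap entry
        for v, x in adj.get(u, []):
            if d + x < f.get(v, float("inf")):
                f[v] = d + x
                heapq.heappush(heap, (d + x, v))
    return None

def min_path_general(vertices, arcs, va, vb):
    """Case II (x_ij unrestricted, no negative circuit): repeated
    improvement of the labels f_i using the test f_i > f_j + x_ji."""
    f = {v: float("inf") for v in vertices}
    f[va] = 0
    for _ in range(len(vertices) - 1):    # n-1 rounds suffice
        for (i, j), x in arcs.items():
            if f[i] + x < f[j]:
                f[j] = f[i] + x
    return None if f[vb] == float("inf") else f[vb]

# Hypothetical data with one negative arc length.
arcs = {("a", "c"): 2, ("c", "d"): -3, ("a", "d"): 1, ("d", "b"): 4}
print(min_path_general({"a", "b", "c", "d"}, arcs, "a", "b"))
```

With the data shown, the minimum path is a → c → d → b of length 2 − 3 + 4 = 3, shorter than the direct a → d → b of length 5.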
16.5 Problem of Potential Difference
Potential plays an important role in electricity, fluid flow and other branches of mechanics. In graph theory, the potential in a network is defined as follows. With each vertex vi in a graph G(V, E) there is associated a real number fi called its potential. With an arc of the graph there is an associated potential difference, denoted by xij for an arc ek = (vi, vj) and defined by xij = fj − fi. Hence, the potential difference along a chain through the vertices v1, v2, …, vp is
x12 + x23 + ⋯ + x(p−1)p = fp − f1.
It follows immediately that the potential difference around a cycle is zero. Again, it is obvious that xij = −xji. The problem of maximum potential difference in a network is stated as follows: the potential difference xab of two vertices va and vb of a graph G(V, E) is to be maximized subject to the conditions xij ≤ cij for all the arcs (vi, vj), where the cij are constants.
In terms of potential, these constraints may be written as

fj − fi ≤ cij for all arcs (vi, vj).

Hence, the problem reduces to the following: maximize fb − fa subject to the constraints fj − fi ≤ cij for all arcs (vi, vj) of the graph G(V, E). Now, if we take the potential of the vertex va as zero, the problem becomes: maximize fb subject to the constraints fj − fi ≤ cij for all arcs (vi, vj), where fi is a real-valued function associated with the vertex vi and fa = 0.
This problem is identical to the minimum path problem, as its inequalities are the same as those of the minimum path problem; the only difference is that xij of the minimum path problem is replaced by cij. Therefore, to solve the problem of potential difference, we solve the problem of minimum path from va to vb with cij as the length of the arc (vi, vj).
A more general problem of maximum potential difference in a network has constraints of the type

bij ≤ fj − fi ≤ cij for all arcs (vi, vj).

The solution methodology remains the same, as each inequality of the above type can be written as the pair

fj − fi ≤ cij and fi − fj ≤ −bij.
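The reduction can be exercised directly: each constraint fj − fi ≤ cij becomes an arc (vi, vj) of length cij, and the maximum of fb − fa is the minimum-path length from va to vb in that derived graph. A small sketch on a made-up system of difference constraints (the numbers are illustrative, not from the text):

```python
# each constraint f_j - f_i <= c_ij becomes an arc (i, j, c_ij)
constraints = [(0, 1, 3), (1, 2, 2), (0, 2, 4)]   # f1-f0<=3, f2-f1<=2, f2-f0<=4

def max_potential_difference(n, constraints, a, b):
    """Max f_b - f_a subject to difference constraints, computed as
    the minimum-path length from a to b (f_a fixed at zero)."""
    INF = float("inf")
    f = [INF] * n
    f[a] = 0
    for _ in range(n - 1):
        for i, j, c in constraints:
            if f[i] + c < f[j]:
                f[j] = f[i] + c
    return f[b]

print(max_potential_difference(3, constraints, 0, 2))  # -> 4
```

Here f2 − f0 cannot exceed min(4, 3 + 2) = 4, the shorter of the direct bound and the chained bound.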
16.6 Network Flow Problem
Like potential, flow in a network is also a familiar concept in electrical theory, flows of traffic through a network of roads, flows of liquid through a network of pipelines, etc. The basic principle of flow in a network is that, for every vertex, the total flow in is equal to the total flow out. To extend this idea in a more abstract situation, it is required to give a precise definition of flow in a graph.
Let a real number xk be associated with every arc ek, k = 1, 2, …, m, of a graph G(V, E) such that for every vertex vj,

Σ xk = Σ1 xk,

where the left-hand summation Σ is taken over all arcs going to vj and the right-hand summation Σ1 over all arcs going from vj. Then xk is said to be a flow in the arc ek, and the set {xk}, k = 1, 2, …, m, is said to be a flow in the graph G.
Now we define a graph on which to state the problem of maximum flow in a network. Let G(V, E) be a graph with V as the set of (n + 2) vertices va, v1, v2, v3, …, vn, vb and E as the set of (m + 1) arcs e0, e1, e2, …, em. The vertices va and vb and the arc e0 play a special role in this graph: va is called the source, vb the sink, and the arc e0 connects vb to va. It is the only arc going from vb. With every arc ek, k = 1, 2, …, m (except e0), there is associated a real number ck (≥ 0) called the capacity of the arc. Let {xk} be a flow in the graph G such that 0 ≤ xk ≤ ck (k = 1, 2, …, m), where ck is the capacity of the arc ek. In this context, note that the flow x0 in the arc e0 is defined, but the capacity of e0 is not, so there is no constraint on x0. Since x0 is the only flow out at vb and the only flow in at va, by the definition of flow we have

total flow in at vb = total flow out at vb = x0 = total flow in at va = total flow out at va.

Since all the flow out at va equals the flow in at vb, va and vb are called the source and sink respectively. The arc e0 plays an important role in the flow network and is called the return arc. Hence, our problem is to determine the maximum flow out at the source (= maximum flow in at the sink). More precisely, our problem is to determine the flow {xk} such that x0 is maximum subject to the constraints 0 ≤ xk ≤ ck, k = 1, 2, …, m. An important question now arises: how can we solve the maximum flow problem? The procedure is given in the following algorithm.
Algorithm Step 1 Set an initial feasible flow by guessing. If this is not possible, initially set xk = 0 for all k.
Step 2 Divide the set V of vertices into two subsets W1 and W2 such that each vertex is either in W1 or in W2 but not in both. Initially, take W1 = {va} and all other vertices in W2.
Step 3 Transfer a vertex from W2 to W1 by the following procedure. For vi ∈ W1, vj ∈ W2:
(a) Transfer vj to W1 if (vi, vj) is an arc ek with xk < ck;
(b) Transfer vj to W1 if (vj, vi) is an arc ek with xk > 0;
(c) Do not transfer vj to W1 otherwise.
Repeat the process of transferring vertices from W2 to W1. If vb is eventually transferred to W1 by this procedure, the flow is not optimal; go to Step 4.
Step 4 Increase the value of xk in the arcs of category (a) above (in which xk < ck) and decrease the value of xk in the arcs of category (b) (in which xk > 0) so that the flow remains feasible and at least one arc reaches its capacity flow. Then go to Step 2.
After repeated application of Steps 2, 3 and 4, a situation arises in which vb cannot be transferred to W1 by the operation of Step 3. In this situation, the flow is optimal.
Definition If, in the graph G(V, E) of the maximum flow problem, W2 is a subset of V such that vb ∈ W2 and va ∉ W2, then the set X+(W2) of arcs incident to W2 is said to be a cut. The capacity of the cut is the sum of the capacities of the arcs contained in the cut.
Theorem 1 For any feasible flow {xk}, k = 1, 2, …, m, in the graph, the flow x0 in the return arc is not greater than the capacity of any cut in the graph.
Proof Let X+(W2) be any cut. Consider the flow in the arcs going to and going from W2. By the definition of flow, the flow in equals the flow out. Therefore,

Σ xk = x0 + Σ1 xk,

where Σ and Σ1 denote the summations over the arcs going to and going from W2 (except e0). Since xk ≥ 0 for all k, Σ xk ≥ x0. Also, xk ≤ ck for all k. Therefore Σ ck ≥ x0, where Σ ck is the capacity of the cut X+(W2). This proves the theorem.
Theorem 2 The preceding algorithm solves the problem of the maximum flow.
Proof Suppose that, by successive applications of the algorithm, a situation arises in which no vertex of W2 can be transferred to W1 by the prescribed
procedure and vb ∈ W2. The set of arcs X+(W2) is then a cut. Let ek ∈ X+(W2); ek is an arc (vr, vs) with vr ∈ W1, vs ∈ W2. The flow in this arc must be saturated, i.e. xk = ck; otherwise it would have been possible to transfer vs from W2 to W1 by the criterion of Step 3(a), contrary to the hypothesis. Again, let ei, i ≠ 0, be an arc (vp, vq) going from W2, i.e. vp ∈ W2, vq ∈ W1. The flow in this arc must be zero; otherwise it would have been possible to transfer vp from W2 to W1 by the criterion of Step 3(b), again contrary to the hypothesis.
Therefore, the flow into W2 is Σ ck, the summation being over all ek ∈ X+(W2), and the flow out of W2 is only in the return arc e0, as it is the only arc going from W2 carrying a non-zero flow. Let the flow in e0 be y0. Then, from the definition of flow,

Σ ck = y0,

where Σ ck is the total capacity of the cut obtained by the application of the algorithm. But by Theorem 1, for any flow x0 in e0,

x0 ≤ Σ ck, i.e. x0 ≤ y0,

where Σ ck is the capacity of any cut. It follows that y0 = max x0. Hence the maximal flow problem can be solved by the preceding algorithm.
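Steps 1-4 are, in modern terminology, the Ford-Fulkerson labeling method: the category (a) arcs are forward arcs with residual capacity, and the category (b) arcs are backward arcs carrying positive flow. A compact sketch using a residual-capacity dictionary and breadth-first search to grow W1 (an implementation choice, not the book's tabular layout; the graph at the bottom is illustrative):

```python
from collections import deque

def max_flow(capacity, source, sink):
    """capacity: dict {(u, v): c}. Returns the maximum flow value.
    The BFS plays the role of Steps 2-3 (growing W1); the augmentation
    along the chain found is Step 4."""
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)    # backward arcs start empty
    flow = 0
    while True:
        # grow W1 from the source, remembering how each vertex was reached
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for (x, v), r in residual.items():
                if x == u and r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:            # vb cannot enter W1: optimal
            return flow
        # find the least residual capacity on the chain, then augment
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        flow += delta

caps = {("a", "1"): 4, ("a", "2"): 3, ("1", "b"): 3,
        ("2", "b"): 5, ("1", "2"): 2}
print(max_flow(caps, "a", "b"))  # -> 7
```

When the while-loop exits, the vertices reachable in the residual graph form W1 and the saturated arcs crossing into W2 form the cut of Theorem 2.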
Max-Flow Min-Cut Theorem
Theorem 3 The maximum flow in a graph is equal to the minimum of the capacities of all possible cuts in it.
Proof Let va and vb be the source and sink of the graph G(V, E) of a flow problem. Also, let W2 be a subset of V such that vb ∈ W2, va ∉ W2, and let X+(W2) be any cut. Consider the flow in the arcs going to and going from W2. From the definition of flow, the flow in equals the flow out, i.e.

Σ xk = x0 + Σ1 xk,

where Σ and Σ1 denote the summations over the arcs going to and going from W2 (except e0).
Since xk ≥ 0 for all k, Σ xk ≥ x0. Again, xk ≤ ck for all k. Therefore Σ ck ≥ x0, where Σ ck is the total capacity of the cut X+(W2); that is,

x0 ≤ Σ ck, and therefore max x0 ≤ min Σ ck.

Again, we know that there is a cut for which the flow in e0 equals the cut capacity. Necessarily this flow is maximum, and the corresponding cut capacity is the least of all cut capacities. This proves the theorem.
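For a small graph the theorem can be checked exhaustively: enumerate every subset W2 containing vb but not va, sum the capacities of the arcs entering W2, and compare the minimum with the known maximum flow. A sketch (graph, capacities, and the flow value 7 are illustrative assumptions, not data from the text):

```python
from itertools import combinations

# illustrative graph: capacities on arcs (u, v); its maximum flow is 7
caps = {("a", "1"): 4, ("a", "2"): 3, ("1", "b"): 3,
        ("2", "b"): 5, ("1", "2"): 2}
nodes = {"a", "1", "2", "b"}

def min_cut(caps, nodes, source, sink):
    """Brute-force minimum cut: try every W2 with sink in W2, source not."""
    inner = [n for n in nodes if n not in (source, sink)]
    best = float("inf")
    for r in range(len(inner) + 1):
        for extra in combinations(inner, r):
            w2 = {sink} | set(extra)
            cut = sum(c for (u, v), c in caps.items()
                      if u not in w2 and v in w2)   # arcs entering W2
            best = min(best, cut)
    return best

print(min_cut(caps, nodes, "a", "b"))  # -> 7, equal to the maximum flow
```

The minimizing subset here is W2 = {1, 2, b}, whose entering arcs (a, 1) and (a, 2) have total capacity 4 + 3 = 7.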
16.7 Generalized Problem of Maximum Flow
The general form of the maximum flow problem is as follows. Let G(V, E) be a graph consisting of a source va, a sink vb, vertices vi, i = 1, 2, …, n, arcs ek, k = 1, 2, …, m, and a return arc e0. Find a flow {xk} in G such that x0 is maximum subject to

bk ≤ xk ≤ ck.

Here xk can vary between an upper bound ck and a lower bound bk, both of which are real numbers and not necessarily non-negative. Thus, a negative flow xk in an arc is admissible and is interpreted as a flow −xk in the reverse direction. The constraints on xk may be rewritten as

0 ≤ xk − bk ≤ ck − bk, or 0 ≤ yk ≤ ck − bk, where yk = xk − bk.

In terms of the new variables yk, the constraints are similar to those of the earlier maximum flow problem. However, if {xk} is a flow, {yk} is not necessarily a flow. From the definition of flow, {xk} is a flow if

Σ xk = Σ1 xk,

i.e. if Σ yk + Σ bk = Σ1 yk + Σ1 bk for every vertex vj, where Σ and Σ1 are summations over arcs going to and going from vj. From this equation, it is clear that for {yk} to be a flow, the relation Σ bk = Σ1 bk must hold. But there is no reason why this relation should hold in general.
To overcome this difficulty, a new vertex v0, called a fictitious vertex, is introduced in G, and the necessary number of arcs, called fictitious arcs, are drawn according to the following rules.
Rule 1: For an arc ek = (vi, vj) with bk > 0, form a cycle (v0, vj, vi, v0) by drawing the arcs (v0, vj) and (vi, v0), and assume a flow bk in each of the arcs (v0, vj), (vj, vi), (vi, v0).
Rule 2: For an arc ek = (vi, vj) with bk < 0, form a cycle (v0, vi, vj, v0) by drawing the arcs (v0, vi) and (vj, v0), and assume a flow −bk in each of the arcs (v0, vi), (vi, vj), (vj, v0).
With the fictitious vertex and arcs included in G(V, E), we get a new graph G1(V1, E1) with constant flows in the cycles so formed. These flows together form a flow in G1, being a linear combination of flows in cycles. We shall refer to this flow as the fictitious flow in G1.
Let {xk} be any flow in G. It is also a flow in G1. Let this flow be superimposed on the fictitious flow. The result is again a flow, being a linear combination of two flows. Denoting this flow by {yk}, we have

yk = xk − bk for ek in G;  yk = the fictitious flow for all fictitious arcs.

Let us now determine the flow {yk} in G1 such that y0 is maximum subject to

0 ≤ yk ≤ ck − bk for ek ∈ G, and yk = the fictitious flow for fictitious arcs.

This amounts to keeping the flow constant in some arcs of G1 and varying it in the others until y0 is maximum, which can be done by our previous algorithm. Having determined the optimal flow {yk}, we calculate xk = yk + bk and finally obtain the optimal flow {xk} in G.
16.8 Formulation of an LPP of a Flow Network
Let us consider the flow network shown in Fig. 16.7. Let x0 be the flow of the return arc from vb to va. For the different arcs, the flow variables are given in the following table:

Arc       (a,1)  (1,2)  (1,b)  (2,1)  (2,3)  (2,4)  (2,b)  (3,a)  (3,4)  (4,b)  (b,2)
Variable   x1     x2     x3     x4     x5     x6     x7     x8     x9     x10    x11
Fig. 16.7 Representation of network flow
Here the arc (vi, vj) is denoted by (i, j). From the definition of flow, we know that the flow into a vertex is equal to the flow out of that vertex. Hence, for the different vertices, the constraints are as follows:
For vertex v1: x1 + x4 = x2 + x3, i.e. x1 − x2 − x3 + x4 = 0.
For vertex v2: x2 + x11 = x4 + x5 + x6 + x7, i.e. x2 − x4 − x5 − x6 − x7 + x11 = 0.
For vertex v3: x5 = x8 + x9, i.e. x5 − x8 − x9 = 0.
For vertex v4: x6 + x9 = x10, i.e. x6 + x9 − x10 = 0.
For vertex va: x0 + x8 = x1, i.e. x0 − x1 + x8 = 0.
For vertex vb: x3 + x7 + x10 = x0 + x11, i.e. x0 − x3 − x7 − x10 + x11 = 0.
For the network, the capacity constraints are x1 ≤ 5, x2 ≤ 6, x3 ≤ 4, x4 ≤ 2, x5 ≤ 2, x6 ≤ 5, x7 ≤ 4, x8 ≤ 5, x9 ≤ 5, x10 ≤ 3, x11 ≤ 3 and xi ≥ 0, i = 1, 2, …, 11.
Hence, the LPP of the given network is as follows:

Maximize f = x0
subject to
x1 − x2 − x3 + x4 = 0
x2 − x4 − x5 − x6 − x7 + x11 = 0
x5 − x8 − x9 = 0
x6 + x9 − x10 = 0
x0 − x1 + x8 = 0
x0 − x3 − x7 − x10 + x11 = 0
x1 ≤ 5, x2 ≤ 6, x3 ≤ 4, x4 ≤ 2, x5 ≤ 2, x6 ≤ 5, x7 ≤ 4, x8 ≤ 5, x9 ≤ 5, x10 ≤ 3, x11 ≤ 3,
xi ≥ 0, i = 1, 2, …, 11, and x0 unrestricted in sign.

Example 1 Find the minimum path from v0 to v8 in the graph (Fig. 16.8) in which the number along a directed arc denotes its length.
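Returning to the LPP of the network in Fig. 16.7: its vertex equations can be checked mechanically for any candidate flow. The sketch below encodes the six conservation rows and verifies an assumed feasible (not necessarily optimal) flow of value 5, routing 4 units along a→1→b and 1 unit along a→1→2→b:

```python
# columns: x0 (return arc), then x1..x11 as in the table above
# rows: conservation at v1, v2, v3, v4, va, vb
A = [
    [0,  1, -1, -1,  1,  0,  0,  0,  0,  0,  0,  0],  # v1
    [0,  0,  1,  0, -1, -1, -1, -1,  0,  0,  0,  1],  # v2
    [0,  0,  0,  0,  0,  1,  0,  0, -1, -1,  0,  0],  # v3
    [0,  0,  0,  0,  0,  0,  1,  0,  0,  1, -1,  0],  # v4
    [1, -1,  0,  0,  0,  0,  0,  0,  1,  0,  0,  0],  # va
    [1,  0,  0, -1,  0,  0,  0, -1,  0,  0, -1,  1],  # vb
]
caps = [None, 5, 6, 4, 2, 2, 5, 4, 5, 5, 3, 3]        # x0 unrestricted

x = [5, 5, 1, 4, 0, 0, 0, 1, 0, 0, 0, 0]              # assumed feasible flow

# every conservation equation and every capacity bound must hold
assert all(sum(a * v for a, v in zip(row, x)) == 0 for row in A)
assert all(c is None or 0 <= v <= c for v, c in zip(x, caps))
print("feasible, flow value =", x[0])  # -> feasible, flow value = 5
```

Any LP solver applied to "maximize x0 subject to Ax = 0 and the bounds" would then search over exactly this feasible set.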
Fig. 16.8 Representation of directed flow network
Solution Let V be the set of all vertices of the given graph and Vp a subset of V. Let xij be the length of an arc (vi, vj) and fi the minimum length of the path from the vertex v0 to a vertex vi of Vp. Also, let fs be the minimum length of a path from Vp to another vertex vs not in Vp through an arc incident from Vp, i.e.

fs = min over i { fi + xij } for all vi ∈ Vp and (vi, vj) ∈ X(Vp),

where X(Vp) is the set of arcs incident from Vp. We denote (vi, vj) by (i, j). To find the minimum path, we construct Table 16.1. From Table 16.1, it is seen that the minimum path is either 0–1–2–3–6–8 or 0–1–2–5–6–8, and the length of each path is 17.
Example 2 Find the minimum path from v0 to v7 in the graph of Fig. 16.9.
Solution From the graph, it is clear that there is no circuit of negative length. We draw an arborescence A1 (see Fig. 16.10a) with centre v0 consisting of all those vertices of the graph which can be reached from v0 and the necessary number of arcs. A number of such arborescences can be drawn; A1 is one of them. The lengths fi of the paths from v0 to the different vertices vi of A1 are as follows:

f0 = 0, f1 = 1, f2 = −4, f3 = −4 − 1 = −5, f4 = 3, f5 = −2, f6 = −2 + 2 = 0, f7 = 3 + 2 = 5.

Let us consider the vertex v2. There is an arc (v1, v2) in G which is not in A1, and f2 = −4 < f1 + x12 = 1 + 2 = 3; so A1 is unchanged. Similarly, for the vertex v3, the arc (v1, v3) is in G but not in A1, and f3 = −5 < f1 + x13 = 1 + 2 = 3; so A1 is again unchanged. Now, for v4 in A1, the arc (v3, v4) is in G but not in A1, and f4 = 3 > f3 + x34 = −5 − 4 = −9. So we delete the arc (v0, v4), include (v3, v4), and get another arborescence A2 (see Fig. 16.10b) with f4 = −9 and consequently f7 = −9 + 2 = −7. For v5 in A2, the arc (v4, v5) is in G but not in A2, and
Table 16.1 Calculation of minimum path (the vertices are added to Vp in order of increasing fs)

fs    vs    Minimum path up to vs
 2     1    0–1
 5     2    0–1–2
 6     3    0–1–2–3
 6     5    0–1–2–5
 7     4    0–1–2–5–4
10     6    0–1–2–3–6, 0–1–2–5–6
10     7    0–1–2–5–4–7
17     8    0–1–2–3–6–8, 0–1–2–5–6–8
Fig. 16.9 Graph with known edge lengths
Fig. 16.10 a Arborescence A1, b arborescence A2, c arborescence A3, d arborescence A4
f5 = −2 > f4 + x45 = −9 − 2 = −11. So we delete (v0, v5), include (v4, v5), and get another arborescence A3 (see Fig. 16.10c) with f5 = −11 and consequently f6 = −11 + 2 = −9. For v6 in A3, the arc (v4, v6) is in G but not in A3, and
Fig. 16.11 Arborescence A5
f6 = −9 > f4 + x46 = −9 − 2 = −11. So we delete (v5, v6), include (v4, v6) and get another arborescence A4 (see Fig. 16.10d) with f6 = −11. For v7 in A4, the arc (v6, v7) is in G but not in A4, and f7 = −7 > f6 + x67 = −11 − 1 = −12. So we delete (v4, v7), include (v6, v7) and get another arborescence A5 (see Fig. 16.11) with f7 = −12. Again, for v7 in A5, the arc (v3, v7) is in G but not in A5, and f7 = −12 < f3 + x37 = −5 − 5 = −10, so we leave A5 unchanged.
The arborescence A5 cannot be modified further: no alternative arc decreases the length of the path from v0 to any vertex. Hence, the minimum path from v0 to v7 contains the vertices v0, v2, v3, v4, v6, v7 and has length −12.
Example 3 Find the minimum path from v1 to v8 in the graph with arc lengths as follows [here (i, j) denotes the arc (vi, vj)]:

Arc      (1,2)  (1,3)  (1,4)  (2,3)  (2,6)  (2,5)  (3,5)  (3,4)
Length    −1     4     −11     2     −8      7     −3      7
Arc      (4,7)  (5,6)  (5,8)  (6,3)  (6,4)  (6,7)  (6,8)  (7,3)  (7,8)
Length     3     1      12     4      2      6    −10     −2      2
Solution For the given problem, the corresponding graph G is given in Fig. 16.12. From the graph, it is clear that there is no circuit whose length is negative. Now, we draw an arborescence A1 (see Fig. 16.13) with centre v1 consisting of all those vertices of the graph which can be reached from v1 and the necessary arcs. Now the lengths fi of the paths from v1 to different vertices vi of A1 are as follows:
Fig. 16.12 Graph
Fig. 16.13 Arborescence A1
f1 = 0, f2 = −1, f3 = 4, f4 = −11, f5 = 1, f6 = −9, f7 = −8, f8 = −19.

Let us consider the vertex v3. This vertex is connected by the arcs (v1, v3), (v2, v3), (v6, v3) and (v7, v3). Among these, the arcs (v2, v3), (v6, v3) and (v7, v3) are not in the arborescence A1, and

f3 = 4 > f2 + x23 = −1 + 2 = 1 for the arc (v2, v3),
f3 = 4 > f6 + x63 = −9 + 4 = −5 for the arc (v6, v3),
f3 = 4 > f7 + x73 = −8 − 2 = −10 for the arc (v7, v3).
Fig. 16.14 Arborescence A2
Hence, we delete the arc (v1, v3), include (v7, v3), and get another arborescence A2 (see Fig. 16.14) with f3 = −10 and consequently f5 = −10 − 3 = −13. Again, for the vertex v4 in A2, the arcs (v3, v4) and (v6, v4) are in G but not in A2, and

f4 = −11 < f3 + x34 = −10 + 7 = −3 for the arc (v3, v4),
f4 = −11 < f6 + x64 = −9 + 2 = −7 for the arc (v6, v4).

So A2 is not changed. Now, for v5 in A2, the arc (v2, v5) is in G but not in A2, and f5 = −13 < f2 + x25 = −1 + 7 = 6; so A2 is again not changed. For v6 in A2, the arc (v5, v6) is in G but not in A2, and f6 = −9 > f5 + x56 = −13 + 1 = −12. So we delete the arc (v2, v6), include (v5, v6), and get another arborescence A3 (see Fig. 16.15) with f6 = −12 and consequently f8 = −12 − 10 = −22. Now, for vertex v7 in A3, the arc (v6, v7) is in G but not in A3, and f7 = −8 < f6 + x67 = −12 + 6 = −6.
Fig. 16.15 Arborescence A3
So we leave A3 unchanged. Again, for vertex v8 in A3, the arcs (v5, v8) and (v7, v8) are in G but not in A3, and

f8 = −22 < f5 + x58 = −13 + 12 = −1 and f8 = −22 < f7 + x78 = −8 + 2 = −6.

So we leave A3 unchanged. The arborescence A3 cannot be modified further: no alternative arc decreases the length of the path from v1 to any other vertex. Hence the minimum path from v1 to v8 is 1–4–7–3–5–6–8, with length −22.
Example 4 Find the maximum potential difference x14 between v1 and v4 of the graph with the following data, subject to the condition that xjk ≤ cjk for each arc:
V     1      2      3      4
E    (1,2)  (1,3)  (2,3)  (2,4)  (3,4)  (1,4)
cjk    3      2     −2      1      4     −1
Solution Treating the values of cjk as the arc lengths of a graph, we construct the graph in Fig. 16.16. Now we draw an arborescence A1 with centre v1 consisting of all those vertices of the graph which can be reached from v1 and the necessary number of arcs (Fig. 16.17). Now the lengths fj of the paths from v1 to different vertices vj of A1 are as follows:
Fig. 16.16 Graph of the problem of Example 4
Fig. 16.17 Arborescence A1

f1 = 0, f2 = 3, f3 = 2, f4 = −1.
Now, for the vertex v4 in A1, the arcs (v2, v4) and (v3, v4) are in G but not in A1, and

f4 = −1 < f2 + c24 = 3 + 1 = 4 and f4 = −1 < f3 + c34 = 2 + 4 = 6.
So we leave A1 unchanged. Hence, the maximum potential difference x14 = −1 with the optimal path (v1, v4). Example 5 Find the maximum potential difference between v1 and v4 in the graph G(V, E), where V E
V: 1, 2, 3, 4;  E: (1, 2), (1, 3), (2, 3), (3, 4), (4, 2), (1, 4),

subject to the constraints
Fig. 16.18 Graph of the problem of Example 5

−2 ≤ f2 − f1 ≤ 3,  −6 ≤ f3 − f2 ≤ 10,  f4 − f3 ≤ 2,
−2 ≤ f2 − f4,  −1 ≤ f4 − f1 ≤ 6,  f3 − f1 ≤ 7.

Solution The given constraints can be written as follows:

f2 − f1 ≤ 3, f1 − f2 ≤ 2, f3 − f2 ≤ 10, f2 − f3 ≤ 6,
f4 − f3 ≤ 2, f4 − f2 ≤ 2, f4 − f1 ≤ 6, f1 − f4 ≤ 1, f3 − f1 ≤ 7.
For the given problem, the graph is given in Fig. 16.18. The student may easily solve the problem; the maximum potential difference between v1 and v4 is 3.
Example 6 In the graph of Fig. 16.19, the numbers along the arcs are the values of ci. Find the maximum flow in the graph.
Solution Let us assume that the initial flow is zero in each arc. Let W1 = {va} and let W2 be the set of the other vertices. For the arc (va, v1), va ∈ W1, v1 ∈ W2, the flow is zero, which is less than its capacity; so we transfer v1 to W1. Again, for the arc (v1, v5), v1 ∈ W1, v5 ∈ W2, the flow is less than its capacity 4; so we transfer v5 to W1. For the arc (v5, vb), v5 ∈ W1, vb ∈ W2, the flow is zero, which is less than its capacity; so vb is transferred to W1. As a result, the flow is not optimal. In this process, the corresponding chain is (va, v1, v5, vb), and the least capacity in this chain is 3. So, in each arc of this chain and also in the return arc (vb, va), we increase the flow in such a way that the flow of at least one arc coincides with its capacity, keeping the flows the same in all other arcs. The modified flow is feasible, as in each arc it is less than or equal to its capacity, and at every vertex the total flow in equals the total flow out.
The above process is repeated for every modified flow until it is no longer possible to transfer vb to W1. The flows of all iterations are shown in Table 16.2. In each feasible flow, the number in parentheses indicates the corresponding arc of the chain by which vb is transferred to W1. An asterisk indicates that the flow in the corresponding arc equals its capacity and will not be increased further.
Fig. 16.19 Flow in graph
Table 16.2 Solution of flow problem of Example 6

Arc     Capacity (ci)    I      II     III    IV     V      VI
(a,1)        3          (0)    (1)     3*     3*     3*     3*
(a,2)        2           0      0     (0)    (1)     2*     2*
(a,3)        1           0      0      0      0     (0)     1*
(1,4)        1          (0)     1*     1*     1*    (1*)    0
(1,5)        4           0     (0)     2      2     (2)     3
(1,6)        2           0      0      0      0      0      0
(2,4)        2           0      0      0     (0)     1      1
(2,6)        1           0      0     (0)     1*     1*     1*
(3,5)        1          (0)     1*     1*     1*     1*     1*
(3,6)        1           0      0      0     (0)     1*     1*
(4,3)        2          (0)     1      1     (1)    (2)     1
(4,b)        0           0*     0*     0*     0*     0*     0*
(5,2)        2           0      0      0      0      0      0
(5,b)        5          (0)    (1)     3      3     (3)     4
(6,b)        2           0      0     (0)    (1)     2*     2*
(b,a)        –           0      1      3      4      5      6
From Table 16.2, it is seen that the maximum flow in the graph is 6.
Note The preceding problem can be solved with different orders of vertex transfer from W2 to W1. Two other ways of solving the problem are shown in Tables 16.3 and 16.4.
Example 7 Find the maximum flow in the graph with the following arcs and flow capacities. The arc (vj, vk) is denoted as (j, k); va is the source and vb the sink.

Arc       (a,1)  (a,2)  (a,3)  (1,4)  (1,5)  (1,6)  (2,4)  (2,5)
Capacity    2      2      2      1      1      1      1      1
Arc       (2,6)  (3,4)  (3,5)  (3,6)  (4,b)  (5,b)  (6,b)
Capacity    1      1      1      1      2      2      2
For the given problem, we first draw the graph, shown in Fig. 16.20. Let us assume that the initial flow is zero in each arc. Let W1 = {va} and let W2 be the set of the other vertices. For the arc (va, v1), va ∈ W1, v1 ∈ W2, the flow is zero, which is less than its flow capacity; so we transfer v1 to W1. Again, for the arc (v1, v4), v1 ∈ W1, v4 ∈ W2, the flow is less than its flow capacity; so we transfer v4 to W1. For the arc (v4, vb), v4 ∈ W1, vb ∈ W2, the flow is zero, which is less than its flow capacity; so vb is transferred to W1. As a
Table 16.3 Alternative solution of flow problem of Example 6

Arc     Capacity (ci)    I      II     III    IV     V
(a,1)        3          (0)     3*     3*     3*     3*
(a,2)        2           0     (0)    (1)     2*     2*
(a,3)        1           0      0      0     (0)     1*
(1,4)        1           0      0      0      0      0
(1,5)        4          (0)     3      3      3      3
(1,6)        2           0      0      0      0      0
(2,4)        2           0      0     (0)     1      1
(2,6)        1           0     (0)     1*     1*     1*
(3,5)        1           0      0     (0)     1*     1*
(3,6)        1           0      0      0     (0)     1*
(4,3)        2           0      0     (0)     1      1
(4,b)        0           0*     0*     0*     0*     0*
(5,2)        2           0      0      0      0      0
(5,b)        5          (0)     3     (3)     4      4
(6,b)        2           0     (0)     1     (1)     2*
(b,a)        –           0      3      4      5      6
Table 16.4 Alternative solution of flow problem of Example 6

Arc     Capacity (ci)    I      II     III    IV     V
(a,1)        3           0      0      0     (0)     3*
(a,2)        2          (0)    (1)     2*     2*     2*
(a,3)        1           0      0     (0)     1*     1*
(1,4)        1           0      0      0      0      0
(1,5)        4           0      0      0     (0)     3
(1,6)        2           0      0      0      0      0
(2,4)        2           0     (0)     1      1      1
(2,6)        1          (0)     1*     1*     1*     1*
(3,5)        1           0      0     (0)     1*     1*
(3,6)        1           0     (0)     1*     1*     1*
(4,3)        2           0     (0)     1      1      1
(4,b)        0           0*     0*     0*     0*     0*
(5,2)        2           0      0      0      0      0
(5,b)        5           0      0     (0)    (1)     4
(6,b)        2          (0)    (1)     2*     2*     2*
(b,a)        –           0      1      2      3      6
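The common conclusion of Tables 16.2-16.4 — a maximum flow of 6 reached by different augmentation orders — can be cross-checked with a compact breadth-first augmentation sketch; the arc capacities below are those of Example 6:

```python
from collections import deque, defaultdict

caps = {("a","1"): 3, ("a","2"): 2, ("a","3"): 1, ("1","4"): 1,
        ("1","5"): 4, ("1","6"): 2, ("2","4"): 2, ("2","6"): 1,
        ("3","5"): 1, ("3","6"): 1, ("4","3"): 2, ("4","b"): 0,
        ("5","2"): 2, ("5","b"): 5, ("6","b"): 2}

def max_flow(caps, s, t):
    """Breadth-first augmenting-chain method on a residual graph."""
    res = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in caps.items():
        res[(u, v)] += c
        adj[u].add(v); adj[v].add(u)    # backward arcs allow flow reduction
    total = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v in adj[u]:
                if res[(u, v)] > 0 and v not in parent:
                    parent[v] = u; q.append(v)
        if t not in parent:              # sink unreachable: flow is optimal
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        d = min(res[e] for e in path)    # least residual capacity on the chain
        for (u, v) in path:
            res[(u, v)] -= d; res[(v, u)] += d
        total += d

print(max_flow(caps, "a", "b"))  # -> 6
```

The value 6 is forced here by the cut at the source: the arcs leaving va have total capacity 3 + 2 + 1 = 6.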
Fig. 16.20 Maximum flow and capacity
result, the solution is not optimal. In this process, the corresponding chain is (va, v1, v4, vb). The least flow capacity in this chain is 1. So, in each arc of this chain and also in the return arc (vb, va), we increase the flow in such a way that the flow of at
Table 16.5 Solution of flow problem of Example 7

Arc     Capacity    I      II     III    IV     V      VI     VII
(a,1)      2       (0)    (1)     2*     2*     2*     2*     2*
(a,2)      2        0      0     (0)    (1)     2*     2*     2*
(a,3)      2        0      0      0      0     (0)    (1)     2*
(1,4)      1       (0)     1*     1*     1*     1*     1*     1*
(1,5)      1        0     (0)     1*     1*     1*     1*     1*
(1,6)      1        0      0      0      0      0      0      0
(2,4)      1        0      0      0      0      0      0      0
(2,5)      1        0      0     (0)     1*     1*     1*     1*
(2,6)      1        0      0      0     (0)     1*     1*     1*
(3,4)      1        0      0      0      0      0     (0)     1*
(3,5)      1        0      0      0      0      0      0      0
(3,6)      1        0      0      0      0     (0)     1*     1*
(4,b)      2       (0)     1      1      1      1     (1)     2*
(5,b)      2        0     (0)    (1)     2*     2*     2*     2*
(6,b)      2        0      0      0     (0)    (1)     2*     2*
(b,a)      –        0      1      2      3      4      5      6
least one arc coincides with its flow capacity, keeping the flows the same in all other arcs. The modified flow is feasible, as in each arc it is less than or equal to its flow capacity, and at every vertex the total flow in equals the total flow out. The above process is repeated for every modified flow until it is no longer possible to transfer vb to W1. The flows of all iterations are shown in Table 16.5. In each feasible flow, the number in parentheses indicates the corresponding arc of the chain by which vb is transferred to W1. An asterisk indicates that the flow in the corresponding arc equals its capacity and will not be increased further. From Table 16.5, it is seen that the maximum flow in the graph is 6.
Example 8 Find the maximal flow in the graph of Fig. 16.21 with the constraints

2 ≤ x1 ≤ 10, 4 ≤ x2 ≤ 12, −2 ≤ x3 ≤ 4, 0 ≤ x4 ≤ 5, 0 ≤ x5 ≤ 10.
Solution Since the lower bounds of the first three variables x1, x2, x3 are non-zero real numbers, we introduce a fictitious vertex v0. Now, for e1, b1 = 2. Therefore, we draw the two fictitious arcs (v0, v1) and (va, v0) and assume a flow 2 in each of the arcs of the cycle (v0, v1, va, v0); see Fig. 16.22. Similarly, we draw the arcs (v0, v2) and (va, v0) and assume a flow 4 in each of the arcs of the cycle (v0, v2, va, v0). Since b3 = −2, we draw the arcs (v0, v2) and (v1, v0) and assume a flow 2 in each of the arcs of the cycle (v0, v2, v1, v0). In the other arcs, e4 and e5, we assume the initial flow to be zero.
Fig. 16.21 Graph with constraints
Fig. 16.22 Maximum flow in modified graph
Then the initial flow is as follows:

Arc     (va,v0)  (v0,v2)  (v0,v1)  (va,v2)  (va,v1)  (v2,v1)  (v1,vb)  (v2,vb)
Flow       6        6        0       −4       −2        2        0        0
Now, expressing the constraints in terms of yi (where yi = xi − bi), we have

0 ≤ y1 ≤ 8, 0 ≤ y2 ≤ 8, 0 ≤ y3 ≤ 6, 0 ≤ y4 ≤ 5, 0 ≤ y5 ≤ 10.
Table 16.6 Solution of flow problem of Example 8

Arc     Capacity (ci)    I       II      III
(a,0)        –           6       6       6
(a,1)        8         (−2)      3       3
(a,2)        8          −4     (−4)      6
(0,1)        –           0       0       0
(0,2)        –           6       6       6
(1,b)        5          (0)      5*      5*
(2,1)        6           2       2       2
(2,b)       10           0      (0)     10*
(b,a)        –           0       5      15*
Keeping the flow in the fictitious arcs fixed, we apply the algorithm for finding the maximum flow in the modified graph. The iterations for determining the modified flow are shown in Table 16.6. From Table 16.6, we get the optimal solution for the modified graph as

y1 = 3, y2 = 6, y3 = 2, y4 = 5, y5 = 10.
Hence, x1 = y1 + 2 = 5, x2 = y2 + 4 = 10, x3 = −2 + y3 = 0, x4 = y4 = 5, x5 = y5 = 10 and the maximum flow = 15, satisfying the constraints of the original graph. Note For this problem, there is also an alternative solution. This solution is y1 = 5, y2 = 4, y3 = 0, y4 = 5, y5 = 10; i.e. x1 = 7, x2 = 8, x3 = −2, x4 = 5, x5 = 10. The negative flow −2 in the arc (v2, v1) can be interpreted as the positive flow 2 in the arc (v1, v2).
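Both solutions of Example 8 can be checked against the original bounds and the conservation equations. Reading the network of Fig. 16.21 as e1 = (va, v1), e2 = (va, v2), e3 = (v2, v1), e4 = (v1, vb), e5 = (v2, vb) — an assumption inferred from the figure and from the note on the arc (v2, v1):

```python
def check(x):
    """Verify a candidate flow (x1..x5) for Example 8; return its value."""
    x1, x2, x3, x4, x5 = x
    # the bound constraints of Example 8
    assert 2 <= x1 <= 10 and 4 <= x2 <= 12 and -2 <= x3 <= 4
    assert 0 <= x4 <= 5 and 0 <= x5 <= 10
    # conservation: v1 receives x1 and x3 and sends x4;
    # v2 receives x2 and sends x3 and x5
    assert x1 + x3 == x4 and x2 == x3 + x5
    return x4 + x5          # total flow into the sink vb

print(check([5, 10, 0, 5, 10]))   # -> 15  (the solution found above)
print(check([7, 8, -2, 5, 10]))   # -> 15  (the alternative solution)
```

Both flows attain the same maximum value 15, confirming that the alternative solution, with its negative flow in e3, is equally optimal.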
16.9 Exercises
1. Formulate the problem of minimal-cost flow in a network. Hence, find the minimal cost of the network shown in Fig. 16.23, where, for the pair (cij, dij), cij represents the flow capacity from vertex i to vertex j and dij represents the cost per unit flow from vertex i to vertex j.
2. Formulate the LPP of the network G shown in Fig. 16.24.
3. Find the maximum flow in the graph with the following arcs and arc capacities, where the flow in each arc is non-negative. The arc (vj, vk) is denoted as (j, k); va is the source and vb the sink.
Fig. 16.23 Flow network (the arcs carry labels (cij, dij): (15, 4), (14, 1), (13, 6), (11, 2), (8, 2), (16, 1), (17, 3))
Fig. 16.24 Flow network
Arc       (a,1)  (a,2)  (a,3)  (1,4)  (1,5)  (1,6)  (2,4)  (2,6)
Capacity    3      2      1      1      4      2      2      1
Arc       (3,5)  (3,6)  (4,3)  (4,b)  (5,2)  (5,b)  (6,b)
Capacity    1      1      2      0      2      5      2
4. Find the maximum flow in the graph with the following arcs and arc capacities, where the flow in each arc is non-negative. The arc (vj, vk) is denoted as (j, k); va is the source and vb is the sink.

Arc       (a,1)  (a,2)  (a,3)  (1,4)  (1,5)  (1,6)  (2,4)  (2,5)  (2,6)
Capacity    2      2      2      1      1      1      1      1      1
Arc       (3,4)  (3,5)  (3,6)  (4,b)  (5,b)  (6,b)
Capacity    1      1      1      2      2      2
5. Find the maximum non-negative flow in the network described below. The arc (vj, vk) is denoted as (j, k); va is the source and vb is the sink.

Arc       (a,1)  (a,2)  (1,2)  (1,3)  (1,4)  (2,4)  (3,2)  (3,4)  (4,3)
Capacity    8     10      3      4      2      8      3      4      2
Arc       (3,b)  (4,b)
Capacity   10      9
6. Find the maximum flow in the network with the following data, where the flow in the arcs is not necessarily non-negative. The arc (vj, vk) is denoted as (j, k), and the flow limits (bi, ci) mean that the constraint on the flow xi is bi ≤ xi ≤ ci. va is the source and vb is the sink.

Arc       (a,1)    (a,2)   (1,2)    (1,3)    (2,4)    (3,4)    (3,b)   (4,b)
(bi,ci)  (0,10)   (0,5)   (−2,3)   (7,10)   (−3,5)   (−1,1)   (0,8)   (0,4)
Chapter 17
Inventory Control Theory
17.1 Objectives
The objectives of this chapter are to: • Discuss the concepts of inventory as well as the various forms of inventory and reasons for maintaining inventories • Define the different terminologies of inventory like demand, replenishment, constraints, shortages, lead time, planning/time horizon and various types of inventory costs • Derive the single item purchasing and manufacturing inventory models with and without shortages • Determine the optimal order quantity for single item price break inventory models and multi-item purchasing model with different constraints like investment, average inventory and space constraints • Introduce the probabilistic models for discrete (the well-known newspaper boy problem) and continuous cases • Discuss the fuzzy inventory model.
17.2 Introduction
In a broad sense, inventory is defined as an idle resource of an enterprise/company/manufacturing firm. It can be defined as a stock of physical goods, commodities or other economic resources which are used to meet customer demand or production requirements. This means that the inventory acts as a buffer stock between a supplier and a customer.
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_17
The inventory or stock of goods may be kept in any one of the following forms:

(i) Raw materials
(ii) Semi-finished goods (work-in-process inventory)
(iii) Finished (or produced) goods
(iv) Maintenance, repair and operating (MRO) supply items.
In any sector of an economy, the control and maintenance of inventory is a common problem for all organizations. Inventories of physical goods are maintained in government and non-government establishments, e.g. agriculture, industry, military, business, etc. Some reasons for maintaining inventories are as follows:

(i) To ensure the smooth and efficient running of the business
(ii) To provide customer service by meeting demands from stock without delay
(iii) To avail of price discounts for bulk purchasing
(iv) To maintain more stable operations and/or workforce
(v) To take financial advantage of transporting/shipping economies
(vi) To plan an overall operating strategy through decoupling of successive stages in the chain of acquiring goods, preparing products, shipping to branch warehouses and finally serving customers
(vii) To motivate customers to purchase more by displaying a large number of goods in the showroom/shop
(viii) To take advantage in purchasing some raw materials and some commonly used physical goods (such as paddy, wheat, etc.) whose prices fluctuate seasonally. In this context, it is more profitable to procure a sufficient quantity of these goods when their prices are low, for use later during the high-price season or when the need arises.

Production/inventory planning and control is essentially concerned with the design, operation and control of an inventory system in any sector of a given economy. The problem of inventory control is primarily concerned with the following fundamental questions:

(i) Which items should be carried in stock? Which items should be produced?
(ii) How many of each of these items should be ordered/produced?
(iii) When should an order be placed? Or when should an item be produced?
(iv) What type of inventory control system should be used?
In practice, it is a formidable task to determine a suitable inventory policy. Regarding these questions, an inventory problem is a problem of making optimal decisions. In other words, an inventory problem deals with decisions that optimize either the cost function (total or average cost) or the profit function (total or average profit) of the inventory system. However, there are certain types of problems, such as those relating to the storage of water in a dam, in which one has no control over
the replenishment of inventory. The supply of inventory of water in a dam depends on rainfall; the organization operating the dam has no control over it. This chapter presents mathematical models of different inventory systems. However, this type of model is based on different assumptions and approximations. It is difficult both to devise and operate with an exact/accurate model as nobody knows what the real situation is. Therefore, it is almost impossible to construct a realistic model with accuracy. For this reason, some approximations and simplifications must be considered during the model formulation. The solution of the inventory problem is a set of specific values of variables that minimizes the total (or average) cost of the system or maximizes the total (or average) profit of the system.
17.2.1 Types of Inventories

There are different types of inventories, namely: (i) transportation inventories, (ii) fluctuation inventories, (iii) anticipation inventories, (iv) decoupling inventories and (v) lot-size inventories.

Transportation inventories
These arise due to the transportation of inventory items to various distribution centres and customers from the various production centres. When the transportation time is long, the items under transport cannot be served to customers. These inventories exist solely because of the transportation time.

Fluctuation inventories
These have to be carried because sales and production times cannot be predicted accurately. In real-life problems, there are fluctuations in the demand and lead times that affect the production of items.

Anticipation inventories
These are built up in advance in anticipation of future demand, e.g. for a season of large sales, a promotion programme or a plant shutdown period.

Decoupling inventories
These inventories are used to reduce the interdependence of the various stages of the production system.

Lot-size inventories
Generally, the rate of consumption differs from the rate of production or purchasing. Therefore, items are produced in larger quantities, which results in lot-size inventories, also called cycle inventories.
17.3 Basic Terminologies in Inventory
The inventory system depends on several system factors and parameters, such as demand, replenishment rate, shortages, constraints, various types of costs, etc.

Demand
Demand is defined as the number of units of an item required by customers per unit time and has the dimension of quantity. It may be known exactly or in terms of probabilities, or it may be completely unknown. The demand pattern of items may be either deterministic or probabilistic. Problems in which demand is known and fixed are called deterministic problems. Problems in which the demand is assumed to be a random variable are called stochastic or probabilistic problems. In the case of deterministic demand, it is assumed that the quantities needed over subsequent periods of time are known exactly. Further, the known demand may be fixed or may vary with time, stock level or the selling price of the item, etc. Probabilistic demand occurs when the requirements over a certain period of time are not known with certainty, but their pattern can be described by a known probability distribution. In some cases, demand may also be represented by uncertain data in a non-stochastic sense, i.e. by vague/imprecise data. This type of demand is termed fuzzy demand, and the system is called a fuzzy system.

Replenishment
Replenishment refers to the quantities that are scheduled to be put into inventory, either at the time when decisions are made about ordering these quantities or at the time when they are actually added to stock. Replenishment can be categorized according to size, pattern and lead time. The replenishment size may be constant or variable, depending upon the type of inventory system. It may depend on time, demand and/or the on-hand inventory level. The replenishment patterns are usually instantaneous, uniform or in batches. The replenishment quantity again may be probabilistic or fuzzy in nature.

Constraints
Constraints are the limitations imposed on the inventory system.
They may be imposed on the amount of investment, the available space, the amount of inventory held, the average instantaneous expenditure, the number of orders, etc.

Fully backlogged/partially backlogged shortages
During a stock-out period, sales or goodwill may be affected either by a delay or by a complete refusal in meeting the demand. If the unfulfilled demand for the goods is satisfied completely at a later date, then it is a case of a fully backlogged shortage. That is, it is assumed that no customer balks during this period, and the demand of all the waiting customers is met gradually at the beginning of the next period, after the commencement of the next production.
Generally, it is observed that during the stock-out period some of the customers wait for the product while others balk. When this happens, the phenomenon is called a partially backlogged shortage.

Lead time
The time gap between the placing of an order (or the start of production) and the arrival of the goods in stock is called the lead time. It may be a constant or a variable. Again, a variable lead time may be probabilistic or imprecise.

Planning/time horizon
The time period over which the inventory level is controlled is called the time/planning horizon. It may be finite or infinite, depending upon the nature of the inventory system for the commodity.

Deterioration/damageability/perishability
Deterioration is defined as decay, evaporation, obsolescence or loss of utility or marginal value of a commodity, resulting in decreased usefulness compared with the original condition. Vegetables, foodgrains and semiconductor chips are examples of products that can deteriorate. Damageability refers to the damage when items are broken or lose their utility due to accumulated stress, bad handling, a hostile environment, etc. The amount of damage by stress varies with the size of the stock and the duration for which the stress is applied. Items made of glass, china, clay, ceramic, mud, etc. are examples of damageable products. Perishable items are those which have a finite lifetime (fixed or random). Fixed-lifetime products (e.g. human blood) have a deterministic shelf life, while the random-lifetime scenario is closely related to the case of an inventory which experiences continuous physical depletion due to deterioration or decay.

Various types of inventory costs
Inventory costs are the costs associated with the operation of an inventory system and result from action or lack of action on the part of management in establishing the system. They are the basic economic parameters of any inventory decision model.
Purchase or unit cost
The purchase or unit cost of an item is the unit purchase price for obtaining the item from an external source or the unit production cost for internal production. It may also depend upon the demand. When production is done in large quantities, the production cost per unit is reduced. Also, when quantity discounts are allowed for bulk orders, the unit price is reduced and depends on the quantity purchased or ordered.
Ordering/setup cost
The ordering or setup cost originates from the expense of issuing a purchase order to an outside supplier or from internal production setup costs. The ordering cost includes clerical and administrative costs, telephone charges, telegrams, transportation costs, loading and unloading costs, etc. Generally, this cost is assumed to be independent of the quantity ordered or produced. In some cases, it may depend on the quantity of goods purchased because of price breaks (quantity discounts) or transportation costs.

Holding or carrying cost
The holding or carrying cost is the cost associated with the storage of the inventory until its use or sale. It is directly proportional to the quantity in inventory and the time for which the stock is held. This cost generally includes costs related to insurance, taxes, obsolescence, deterioration, renting of the warehouse, light, heat, maintenance and interest on the money locked up.

Shortage cost or stock-out cost
The shortage cost or stock-out cost is the penalty incurred for being unable to meet a demand when it occurs. This cost arises due to a shortage of goods, lost sales from delay in meeting the demand or a total inability to meet the demand. In the case where the unfulfilled demand can be satisfied at a later date (the backlogging case), this cost depends on both the shortage quantity and the delay time. On the other hand, if the unfulfilled demand is lost (the no-backlogging case), the shortage cost becomes proportional to the shortage quantity only. In both cases, there is a loss of goodwill, which cannot be quantified in the development of the mathematical model.

Disposal cost
When some units of an item remain in excess at the end of the inventory cycle, and this amount is sold at a lower price in the next cycle to derive some advantage, like clearing the stock or winding up the business, the revenue earned through such a process is called the disposal cost.
Salvage values
During storage, some units are partially spoiled or damaged; i.e. some units lose their utility partially. In a developing country, it is normally observed that some of these units are sold at a reduced price (less than the purchase price) to a section of customers, and this gives some revenue to management. This revenue is called the salvage value.
17.4 Classification of Inventory Models
Inventory problems (models) may be classified into two categories.
(i) Deterministic inventory models
These are the inventory models in which demand is assumed to be known, either constant or variable (dependent on time, stock level, selling price of the item, etc.). Here, we shall consider deterministic inventory models with known constant demand. Such models are usually referred to as economic lot-size models or economic order quantity (EOQ) models. There are different types of models under this category, namely:

(a) Purchasing inventory model with no shortage
(b) Manufacturing inventory model with no shortage
(c) Purchasing inventory model with shortages
(d) Manufacturing model with shortages
(e) Multi-item inventory model
(f) Price break inventory model.
(ii) Probabilistic inventory models
These are the inventory models in which the demand is a random variable with a known probability distribution. Here, the future demand is determined by collecting data from past experience.
17.5 Purchasing Inventory Model with No Shortage (Model 1)
In this model, we want to derive the formula for the optimum order quantity per cycle of a single product so as to minimize the total average cost under the following assumptions and notation:

(i) Demand is deterministic and uniform at a rate of D units of quantity per unit time.
(ii) Production is instantaneous (i.e. the production rate is infinite).
(iii) Shortages are not allowed.
(iv) The lead time is zero.
(v) The inventory planning horizon is infinite, and the inventory system involves only one item and one stocking point.
(vi) Only a single order is placed at the beginning of each cycle, and the entire lot is delivered in one batch.
(vii) The inventory carrying cost C1 per unit quantity per unit time and the ordering cost C3 per order are known and constant.
(viii) T is the cycle length and Q is the ordering quantity per cycle.

Let us assume that an enterprise purchases an amount of Q units of an item at time t = 0. This amount is depleted to meet customer demand. Ultimately, the stock level reaches zero at time t = T. The inventory situation is shown in Fig. 17.1.
Fig. 17.1 Pictorial representation of purchasing inventory model with no shortage
Clearly,

\[ Q = DT. \tag{17.1} \]

Now, the inventory carrying cost for the entire cycle T is C1 (area of ΔAOB) = (1/2)C1QT, and the ordering cost for the said cycle T is C3. Hence, the total cost for time T is given by

\[ X = C_3 + \frac{1}{2} C_1 Q T. \]

Therefore, the total average cost is given by

\[ C(Q) = \frac{X}{T} = \frac{C_3}{T} + \frac{1}{2} C_1 Q = \frac{C_3 D}{Q} + \frac{1}{2} C_1 Q \quad \left(\because\ Q = DT \Rightarrow T = \frac{Q}{D}\right). \tag{17.2} \]

The optimum value of Q which minimizes C(Q) is obtained by equating the first derivative of C(Q) with respect to Q to zero:

\[ \frac{dC}{dQ} = 0, \ \text{i.e.}\ \frac{1}{2} C_1 - \frac{C_3 D}{Q^2} = 0, \quad \text{or} \quad Q = \sqrt{\frac{2 C_3 D}{C_1}}. \]

Now, \( \frac{d^2 C(Q)}{dQ^2} = \frac{2 C_3 D}{Q^3} \), which is positive for \( Q = \sqrt{2 C_3 D / C_1} \).
Hence, C(Q) is minimum, and the optimum value of Q is

\[ Q^* = \sqrt{\frac{2 C_3 D}{C_1}}. \tag{17.3} \]
This is known as either the economic lot-size formula or the EOQ formula. The corresponding optimum time interval is

\[ T^* = \frac{Q^*}{D} = \sqrt{\frac{2 C_3}{C_1 D}}, \]

and the minimum cost per unit time is given by

\[ C_{\min} = \frac{C_3 D}{Q^*} + \frac{1}{2} C_1 Q^* = \sqrt{2 C_1 C_3 D}. \]

This model was first developed by Ford Harris of the Westinghouse Corporation, USA, in the year 1915. He derived the well-known classical lot-size formula (17.3). However, according to Erlenkotter (1989), the earliest model was formulated by Harris in the year 1913. This formula was also developed independently by R. H. Wilson a few years later; thus, it has been called the Harris-Wilson formula.

Remark
(i) The total inventory held over the entire cycle T is (1/2)QT (in quantity × time units), so the average inventory at any time is (1/2)QT/T = (1/2)Q.
(ii) Since C1 > 0, from f(Q) = (1/2)C1Q it is obvious that the inventory carrying cost is a linear function of Q with a positive slope; i.e. the smaller the average inventory, the lower the carrying cost. In contrast, g(Q) = C3D/Q; i.e. the ordering cost increases as Q decreases.
(iii) In the preceding model, if we always keep an inventory B on hand as buffer stock, then the average inventory at any time is (1/2)Q + B. Therefore, the total cost per unit time is

\[ C(Q) = \left( \frac{1}{2} Q + B \right) C_1 + \frac{C_3 D}{Q}. \]

As before, we obtain the optimal values of Q and T as follows:

\[ Q = Q^* = \sqrt{\frac{2 C_3 D}{C_1}} \quad \text{and} \quad T = T^* = \sqrt{\frac{2 C_3}{D C_1}}. \]

(iv) In the preceding model, if the ordering cost is taken as C3 + bQ (where b is the purchase cost per unit quantity) instead of a fixed ordering cost, then there is no change in the optimum order quantity.

Proof In this case, the average cost is given by

\[ C(Q) = \frac{1}{2} C_1 Q + \frac{D}{Q} (C_3 + b Q). \tag{17.4} \]

As the necessary condition for the optimum of C(Q) in (17.4), we have
\[ C'(Q) = 0, \ \text{which implies} \ Q = \sqrt{\frac{2 C_3 D}{C_1}}, \quad \text{and} \quad C''(Q) > 0. \]

Hence, \( Q^* = \sqrt{2 C_3 D / C_1} \).
This shows that there is no change in Q* in spite of the change in the ordering cost.

Example 1 An engineering factory consumes 5000 units of a component per year. The ordering, receiving and handling costs are Rs. 300 per order, the trucking cost is Rs. 1200 per order, the interest cost is Rs. 0.06 per unit per year, the deterioration and obsolescence costs are Rs. 0.004 per unit per year and the storage cost is Rs. 1000 per year for 5000 units. Calculate the economic order quantity and the minimum average cost.

Solution In the given problem, we have demand (D) = 5000 units per year.

Ordering cost = replenishment cost = ordering, receiving and handling costs plus trucking cost = Rs. (300 + 1200) = Rs. 1500 per order.

Inventory carrying cost = interest cost + deterioration and obsolescence costs + storage cost = (0.06 + 0.004 + 1000/5000) rupees per unit per year = Rs. 0.264 per unit per year.

Hence, the economic order quantity is given by

\[ Q^* = \sqrt{\frac{2 C_3 D}{C_1}} = \sqrt{\frac{2 \times 1500 \times 5000}{0.264}} = 7537.8 \ \text{units (approx.)}. \]

Also, the minimum average cost is

\[ \sqrt{2 C_1 C_3 D} = \sqrt{2 \times 0.264 \times 1500 \times 5000} = \text{Rs. } 1989.97 \ \text{(approx.)}. \]
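The computation in Example 1 is easy to reproduce. The following is a small sketch of formula (17.3) (the function name is mine, not from the book):

```python
from math import sqrt

def eoq(D, C3, C1):
    """Classical Harris-Wilson lot size and its minimum average cost."""
    Q = sqrt(2 * C3 * D / C1)          # optimum order quantity, Eq. (17.3)
    cost = C3 * D / Q + 0.5 * C1 * Q   # ordering cost + carrying cost per unit time
    return Q, cost

# Data of Example 1: D = 5000 units/yr, C3 = Rs. 1500/order, C1 = Rs. 0.264/unit/yr
Q, cost = eoq(5000, 1500, 0.264)
print(round(Q, 1), round(cost, 2))   # 7537.8 units, Rs. 1989.97
```

Note that at Q = Q* the ordering cost C3D/Q and the carrying cost C1Q/2 are equal, which is why the minimum cost also equals √(2C1C3D).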
17.6 Manufacturing Model with No Shortage (Model 2)
(Economic lot-size model with finite rate of replenishment and without shortage) In this model, we shall derive the formula for the optimum production quantity per cycle of a single product so as to minimize the total average cost under the following assumptions and notation:
(i) Demand is deterministic and uniform at a rate of D units of quantity per unit time.
(ii) Shortages are not allowed.
(iii) The lead time is zero.
(iv) The production rate or replenishment rate is finite, say, K units per unit time (K > D).
(v) The production-inventory planning horizon is infinite, and the production system involves only one item and one stocking point.
(vi) The inventory carrying cost C1 per unit quantity per unit time and the setup cost C3 per production cycle are known and constant.
(vii) T is the cycle length and Q is the economic lot size.

In this model, each production cycle time T consists of two parts t1 and t2, where:

(i) t1 is the period during which the stock is increasing at a constant rate of K − D units per unit time;
(ii) t2 is the period during which there is no replenishment (or production), but the inventory is decreasing at the rate of D units per unit time.

Further, it is assumed that S is the stock available at the end of time t1, which is expected to be consumed during the remaining period t2 at the consumption rate D. The pictorial representation of the model is shown in Fig. 17.2. Therefore,

\[ (K - D) t_1 = S \quad \text{or} \quad t_1 = \frac{S}{K - D}. \tag{17.5} \]

Since the total quantity produced during the production period t1 is Q,

\[ Q = K t_1 = \frac{K S}{K - D}, \quad \text{which implies} \quad S = \frac{K - D}{K} Q. \tag{17.6} \]

Again, Q = DT, i.e. T = Q/D.
Fig. 17.2 Pictorial representation of manufacturing inventory model with no shortage
Now, the inventory carrying cost for the entire cycle T is (area of ΔOAB)C1 = (1/2)TSC1, and the setup cost for the time period T is C3. Therefore, the total cost for the entire cycle T is given by X = C3 + (1/2)C1ST, and the total average cost is

\[ C(Q) = \frac{X}{T} = \frac{C_3}{T} + \frac{1}{2} C_1 S = \frac{C_3 D}{Q} + \frac{1}{2} C_1 \frac{K - D}{K} Q \quad \left(\because\ Q = DT \ \text{and}\ S = \frac{K - D}{K} Q\right). \tag{17.7} \]

The optimum of Q which minimizes C(Q) is obtained by equating the first derivative of C(Q) with respect to Q to zero:

\[ \frac{dC}{dQ} = 0, \ \text{i.e.}\ -\frac{C_3 D}{Q^2} + \frac{1}{2} C_1 \frac{K - D}{K} = 0, \quad \text{or} \quad Q = \sqrt{\frac{2 C_3 D}{C_1} \cdot \frac{K}{K - D}}. \tag{17.8} \]

Again, \( \frac{d^2 C}{dQ^2} = \frac{2 C_3 D}{Q^3} \), which is positive for \( Q = \sqrt{\frac{2 C_3 D K}{C_1 (K - D)}} \).

Hence, C(Q) is minimum, and the optimum value of Q is

\[ Q^* = \sqrt{\frac{2 C_3 D}{C_1} \cdot \frac{K}{K - D}}. \tag{17.9} \]

The corresponding time interval is

\[ T^* = \frac{Q^*}{D} = \sqrt{\frac{2 C_3}{C_1} \cdot \frac{K}{D (K - D)}}, \tag{17.10} \]

and the minimum average cost is given by

\[ C_{\min} = \frac{1}{2} C_1 \frac{K - D}{K} Q^* + \frac{C_3 D}{Q^*} = \sqrt{2 C_1 C_3 D \cdot \frac{K - D}{K}}. \tag{17.11} \]
Remark (i) For this model, Q*, T* and Cmin can be written in the following form:

\[ Q^* = \sqrt{\frac{2 C_3 D}{C_1} \cdot \frac{1}{1 - D/K}}, \qquad T^* = \sqrt{\frac{2 C_3}{D C_1} \cdot \frac{1}{1 - D/K}}. \tag{17.12} \]
If K → ∞, i.e. the production rate is infinite, this model reduces to Model 1. Therefore, when K → ∞, the expressions for Q*, T* and Cmin reduce to those of Model 1.
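Formula (17.9) can be sketched numerically as follows. The data below are hypothetical, chosen only to illustrate the dependence on the production rate K (the function name is mine, not from the book):

```python
from math import sqrt

def eoq_finite_rate(D, K, C3, C1):
    """Economic lot size with a finite production rate K (Model 2, Eq. 17.9)."""
    return sqrt(2 * C3 * D / C1 * K / (K - D))

# Hypothetical data: D = 5000/yr, C3 = Rs. 1500, C1 = Rs. 0.264,
# and a production rate K = 10000/yr, i.e. twice the demand rate
Q = eoq_finite_rate(5000, 10000, 1500, 0.264)
print(round(Q, 1))   # ≈ 10660.0, i.e. sqrt(2) times the Model 1 lot size

# Consistency check: as K grows large, Eq. (17.9) approaches sqrt(2*C3*D/C1)
print(round(eoq_finite_rate(5000, 10**9, 1500, 0.264), 1))   # ≈ 7537.8
```

With K = 2D the factor K/(K − D) equals 2, so the finite-rate lot size is exactly √2 times the instantaneous-replenishment lot size of Model 1.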
17.7 Purchasing Inventory Model with Shortage (Model 3)
In this model, we shall derive the optimal order level and the minimum average cost under the following assumptions and notation:

(i) Demand is deterministic and uniform at a rate of D units of quantity per unit time.
(ii) Production is instantaneous (i.e. the production rate is infinite).
(iii) Shortages are allowed and fully backlogged.
(iv) The lead time is zero.
(v) The inventory planning horizon is infinite, and the inventory system involves only one item and one stocking point.
(vi) Only a single order will be placed at the beginning of each cycle, and the entire lot is delivered in one batch.
(vii) The inventory carrying cost C1 per unit quantity per unit time, the shortage cost C2 per unit quantity per unit time and the ordering cost C3 per order are known and constant.
(viii) Q is the lot size per cycle, S1 is the initial inventory level after fulfilling the backlogged quantity of the previous cycle and Q − S1 is the maximum shortage level.
(ix) T is the cycle length or scheduling period, whereas t1 is the no-shortage period.

The inventory situation is shown in Fig. 17.3. According to assumptions (viii) and (ix), we have Q = DT. Regarding the cycle length or scheduling period of the inventory system, two cases may arise:

Case 1: The cycle length or scheduling period T is constant.
Case 2: The cycle length or scheduling period T is a variable.

For Case 1, T is constant; i.e. the inventory is to be replenished after every time period T. As t1 is the no-shortage period, S1 = Dt1, or t1 = S1/D.
Now, the inventory carrying cost during the period 0 to t1 is

\[ C_1 (\text{area of } \Delta OAB) = \frac{1}{2} C_1 S_1 t_1 = \frac{1}{2} C_1 S_1^2 / D. \]

Again, the shortage cost during the interval (t1, T) is

\[ C_2 (\text{area of } \Delta ACD) = \frac{1}{2} C_2 (Q - S_1)(T - t_1) = \frac{1}{2} C_2 (Q - S_1)^2 / D \quad \left(\because\ T - t_1 = \frac{Q - S_1}{D}\right). \]

Hence, the total average cost of the system is given by (see Fig. 17.3)

\[ C = \left[ \frac{1}{2} \frac{C_1 S_1^2}{D} + \frac{1}{2} \frac{C_2 (Q - S_1)^2}{D} \right] \Big/ T. \tag{17.13} \]

Since the setup cost C3 and the time period T are constant, the average setup cost C3/T, also being constant, is not considered in the cost expression. Since T is constant, Q = DT is also constant. Hence, the preceding expression for the average cost is a function of the single variable S1. Thus, we can easily minimize expression (17.13) with respect to S1, as in Model 1. In this case,

\[ S_1^* = \frac{C_2 Q}{C_1 + C_2} = \frac{C_2 D T}{C_1 + C_2} \quad \text{and} \quad C_{\min} = \frac{C_1 C_2 Q}{2 (C_1 + C_2)} = \frac{C_1 C_2 D T}{2 (C_1 + C_2)}. \tag{17.14} \]
Fig. 17.3 Pictorial representation of purchasing inventory model with shortage
For Case 2, the cycle length or scheduling period T is a variable. As in Case 1, the average cost of the inventory system will be

\[ C = \left[ C_3 + \frac{1}{2} \frac{C_1 S_1^2}{D} + \frac{1}{2} \frac{C_2 (Q - S_1)^2}{D} \right] \Big/ T, \tag{17.15} \]

where Q = DT. Here, the average cost C is a function of two independent variables, T and S1. Now, for the optimal value of C, we have

\[ \frac{\partial C}{\partial S_1} = 0 \quad \text{and} \quad \frac{\partial C}{\partial T} = 0. \]

Thus, \( \frac{\partial C}{\partial S_1} = 0 \) gives

\[ S_1 = \frac{C_2 D T}{C_1 + C_2}. \tag{17.16} \]

Again, \( \frac{\partial C}{\partial T} = 0 \) gives

\[ -\frac{C_1 S_1^2}{2 D T^2} + \frac{C_2 (D T - S_1)}{T} - \frac{C_2 (D T - S_1)^2}{2 D T^2} - \frac{C_3}{T^2} = 0. \tag{17.17} \]

Setting \( S_1 = \frac{C_2 D T}{C_1 + C_2} \) in this expression and simplifying, we have

\[ T = T^* = \sqrt{\frac{2 C_3 (C_1 + C_2)}{C_1 C_2 D}}. \tag{17.18a} \]

Then,

\[ S_1 = S_1^* = \sqrt{\frac{2 C_2 C_3 D}{C_1 (C_1 + C_2)}}. \tag{17.18b} \]

Obviously, for the values of T and S1 given by (17.18a) and (17.18b),

\[ \frac{\partial^2 C}{\partial T^2} > 0, \qquad \frac{\partial^2 C}{\partial S_1^2} > 0 \qquad \text{and} \qquad \frac{\partial^2 C}{\partial T^2} \, \frac{\partial^2 C}{\partial S_1^2} - \left( \frac{\partial^2 C}{\partial S_1 \, \partial T} \right)^2 > 0. \]

Hence, C is minimum for the values of T and S1 given by (17.18a) and (17.18b).
17
Inventory Control Theory
Therefore, the optimum order quantity for minimum cost is given by sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C3 ðC1 þ C2 Þ 2C3 ðC1 þ C2 ÞD ¼ Q ¼ DT ¼ D C1 C2 D C1 C2 and Cmin
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C1 C2 C3 D ¼C ¼ : ðC1 þ C2 Þ
ð17:19Þ
ð17:20Þ
Remark (i) If C1 ! ∞ and C2 > 0, inventories are prohibited. In this case S1 ¼ 0 and qffiffiffiffiffiffiffiffi each lot size Q ¼ 2CC32D is used to fill the backorders. (ii) If C2 ! ∞ and C1 > 0, then shortages are prohibited. In this case, S1 ¼ qffiffiffiffiffiffiffiffi Q ¼ 2CC31D and each batch Q* is used entirely for inventory. (iii) If shortage costs are negligible, then C1 > 0 and C2 ! 0. In this case, S1 ! 0 and Q* ! ∞. (iv) If the inventory carrying costs are negligible, then C1 ! 0 and C2 > 0. In this case, Q* ! ∞ and S1 ! ∞; i.e. S1 ! Q*. Thus, due to very small inventory carrying costs, a large lot size should be ordered and used to meet the future demand. (v) When the inventory carrying costs and shortage costs are equal, i.e. when C1 = C2, C1 Cþ1 C2 ¼ 12. pffiffiffiqffiffiffiffiffiffiffiffi In this case, Q ¼ 2 2CC31D, which shows that the lot size is √2 times the lot size of Model 1. Example 2 The demand for an item is 18,000 units per year. The inventory carrying cost is Rs. 1.20 per unit time, and the cost of shortages is Rs. 5.00. The ordering cost is Rs. 400.00. Assuming that the replenishment rate is instantaneous, determine the optimum order quantity, shortage quantity and cycle length. Solution For the problem, it is given that demand (D) = 1800 units per year, the carrying cost (C1) = Rs. 1.20 per unit, the shortage cost (C2) = Rs. 5.00 and the ordering cost (C3) = Rs. 400 per order. The optimum order quantity Q* is given by sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffirffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C3 ðC1 þ C2 ÞD 2 400 ð1:2 þ 5Þ 18000 ¼ 3857 units: ¼ Q ¼ C1 C2 1:2 5
17.7
Purchasing Inventory Model with Shortage (Model 3)
537
sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C2 C3 D The optimum shortage quantity Q S1 ¼ 3857 C1 ðC1 þ C2 Þ sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2 5 400 18; 000 ¼ 3857 ¼ 746 units ðapprox:Þ 1:2 ð1:2 þ 5Þ
3857 ¼ 0:214 year (approx.) The optimal cycle length T ¼ QD ¼ 18000
Example 3 The demand for an item is deterministic and constant over time and is equal to 600 units per year. The unit cost of the item is Rs. 50.00, while the cost of placing an order is Rs. 100.00. The inventory carrying cost is 20% of the unit cost of the item, and the shortage cost per month is Re. 1. Find the optimal ordering quantity. If shortages are not allowed, what would be the loss of the company? Solution It is given that D = 600 units/year. C1 = 20% of Rs. 50.00 = Rs. 10.00 C2 = Re 1.00 per month, i.e. Rs. 12.00 per year C3 = Rs. 100.00 per order. When shortages are allowed, the optimal ordering quantity Q* is given by sffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C3 ðC1 þ C2 ÞD ¼ 148 units Q ¼ C1 C2
and the minimum cost per year is CðQ Þ ¼
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 2C1 C2 C3 D=ðC1 þ C2 Þ ¼ Rs: 809:04.
If shortages are not allowed, then the optimal order quantity is Q ¼
rffiffiffiffiffiffiffiffiffiffiffi 2C3 D ¼ 109:5 units C1
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi and the relevant average cost is given by CðQ Þ ¼ Rs: 2C1 C3 D ¼ Rs: 1095:44. Therefore, if shortages are not allowed, the loss of the company will be Rs. (1095.44–809.04), i.e. Rs. 286.40.
538
17.8
17
Inventory Control Theory
Manufacturing Model with Shortage
(Economic lot-size model with finite rate of replenishment and shortages) In this model, we shall derive the formula for the optimum production quantity, shortage quantity and cycle length of a single product by minimizing the average cost of the production system under the following assumptions and notation: (i) The production rate or replenishment rate is finite, say K units per unit time (K > D). (ii) The production-inventory planning horizon is infinite, and the production system involves only one item and one stocking point. (iii) Demand of the item is deterministic and uniform at a rate D units of quantity per unit time. (iv) Shortages are allowed. (v) The lead time is zero. (vi) The inventory carrying cost C1 per unit quantity per unit time, the shortage cost C2 per unit quantity per unit time and the setup cost C3 per setup are known and constant. (vii) T is the cycle length of the system; i.e. T is the interval between production cycles. (viii) Q is the economic lot size. Let us assume that each production cycle of length T consists of two parts t12 and t34 which are further subdivided into t1 and t2 , t3 and t4 , where (i) inventory is building up at a constant rate K − D units per unit time during the interval ½0; t1 ; (ii) at time t ¼ t1 ; the production is stopped and the stock level decreases due to meeting customer demand only up to the time t12 ¼ t1 þ t2 ; (iii) shortages are accumulated at a constant rate of D units per unit time during the time t3 , i.e. during the interval ½t12 ; t12 þ t3 : (iv) Shortages are being filled immediately at a constant rate K – D units per unit time during the time t4 , i.e., during the interval ½t12 þ t3 ; T: (v) The production cycle then repeats itself after the time T ¼ t1 þ t2 þ t3 þ t4 : Again, let the inventory level be S1 at t ¼ t1 and at the end of time t ¼ t1 þ t2 , the stock level reaches zero. 
Now shortages start, and we assume that shortages build up to the quantity S_2 at time t = t_1 + t_2 + t_3; these shortages are then filled up to the time t = t_1 + t_2 + t_3 + t_4. The pictorial representation of the inventory situation is given in Fig. 17.4. Our objective is now to find the optimal values of Q, S_1, S_2, t_1, t_2, t_3, t_4 and T with the minimum average total cost. The inventory carrying cost over the time period T is given by

C_h = C_1 \cdot \text{area of } \triangle OAC = C_1 \cdot \tfrac{1}{2}\, OC \cdot AB = \tfrac{1}{2} C_1 (t_1 + t_2) S_1,
Fig. 17.4 Pictorial representation of manufacturing inventory model with shortage
and the shortage cost over time T is given by

C_s = C_2 \cdot \text{area of } \triangle CEF = C_2 \cdot \tfrac{1}{2}\, CF \cdot EH = \tfrac{1}{2} C_2 (t_3 + t_4) S_2.

Hence, the total average cost of the production system is given by

C = [C_3 + C_h + C_s]/T.   (17.21)
From Fig. 17.4, it is clear that

S_1 = (K − D) t_1, \quad \text{or} \quad t_1 = \frac{S_1}{K − D}.   (17.22)

Again,

S_1 = D t_2, \quad \text{or} \quad t_2 = \frac{S_1}{D}.   (17.23)

Now, in a stock-out situation,

S_2 = D t_3, \quad \text{or} \quad t_3 = \frac{S_2}{D},
and

S_2 = (K − D) t_4, \quad \text{or} \quad t_4 = \frac{S_2}{K − D}.   (17.24)

Since the total quantity produced over the time period T is Q,

Q = DT, \text{ where } D \text{ is the demand rate,}   (17.25)

or D(t_1 + t_2 + t_3 + t_4) = Q, i.e.

D\left(\frac{S_1}{K − D} + \frac{S_1}{D} + \frac{S_2}{D} + \frac{S_2}{K − D}\right) = Q.

After simplification, we have

S_1 + S_2 = \frac{K − D}{K}\, Q.   (17.26)

Again,

t_1 + t_2 = \frac{K S_1}{D(K − D)} \quad \text{and} \quad t_3 + t_4 = \frac{K S_2}{D(K − D)}.   (17.27)
Now, substituting the values of t_1 + t_2, t_3 + t_4 and T = Q/D in (17.21), we have

C(Q, S_1, S_2) = \frac{1}{2Q}\,\frac{K}{K − D}\left(C_1 S_1^2 + C_2 S_2^2\right) + \frac{DC_3}{Q}.   (17.28)

Using (17.26), this equation reduces to

C(Q, S_2) = \frac{1}{2Q}\,\frac{K}{K − D}\left[C_1\left(\frac{K − D}{K}\,Q − S_2\right)^2 + C_2 S_2^2\right] + \frac{DC_3}{Q}.   (17.29)
Now, for the extreme values of C(Q, S_2), we must have

\frac{\partial C}{\partial Q} = 0, \quad \frac{\partial C}{\partial S_2} = 0.

The condition \partial C/\partial S_2 = 0 implies

S_2 = C_1\,\frac{K − D}{K}\,\frac{Q}{C_1 + C_2}.   (17.30)

Again, \partial C/\partial Q = 0, together with (17.30), gives

Q = \sqrt{\frac{2C_3(C_1 + C_2)}{C_1 C_2}}\,\sqrt{\frac{KD}{K − D}}.   (17.31)

For the values of Q and S_2 given in (17.31) and (17.30), it can easily be verified that

\frac{\partial^2 C}{\partial Q^2} > 0, \quad \frac{\partial^2 C}{\partial S_2^2} > 0 \quad \text{and} \quad \frac{\partial^2 C}{\partial Q^2}\,\frac{\partial^2 C}{\partial S_2^2} − \left(\frac{\partial^2 C}{\partial Q\, \partial S_2}\right)^2 > 0.

Hence, C(Q, S_2) is minimum, and the optimal values of Q and S_2 are given by

Q^* = \sqrt{\frac{2C_3(C_1 + C_2)}{C_1 C_2}}\,\sqrt{\frac{KD}{K − D}}   (17.32)
and

S_2^* = \sqrt{\frac{2C_1C_3D}{C_2(C_1 + C_2)}}\,\sqrt{\frac{K − D}{K}}.   (17.33)

Also,

T^* = \frac{Q^*}{D} = \sqrt{\frac{2C_3(C_1 + C_2)}{C_1 C_2}}\,\sqrt{\frac{K}{D(K − D)}},   (17.34)

S_1^* = \frac{K − D}{K}\,Q^* − S_2^* = \sqrt{\frac{2C_2C_3D}{C_1(C_1 + C_2)}}\,\sqrt{\frac{K − D}{K}}.   (17.35)

Now,

C_{min} = C(Q^*, S_2^*) = \sqrt{\frac{2C_1C_2C_3D}{C_1 + C_2}}\,\sqrt{\frac{K − D}{K}}.   (17.36)
Remarks (i) In this model, if we assume that the production rate is infinite, i.e. K → ∞, then the optimal quantities obtained by taking K → ∞ in (17.32), (17.34) and (17.36) are

Q^* = \sqrt{\frac{2C_3(C_1 + C_2)D}{C_1 C_2}}, \quad T^* = \sqrt{\frac{2C_3(C_1 + C_2)}{C_1 C_2 D}} \quad \text{and} \quad C_{min} = \sqrt{\frac{2C_1C_2C_3D}{C_1 + C_2}}.
This means that Model 4 reduces to Model 3 if K → ∞.

(ii) If shortages are not allowed in Model 4, then it reduces to Model 3. In this case, taking C_2 → ∞ in (17.32), (17.34) and (17.36), we obtain the required expressions of Model 3, which are as follows:

Q^* = \sqrt{\frac{2C_3 KD}{C_1(K − D)}}, \quad T^* = \sqrt{\frac{2C_3 K}{C_1 D(K − D)}} \quad \text{and} \quad C_{min} = \sqrt{\frac{2C_1C_3D(K − D)}{K}}.
Example 4 The demand for an item in a company is 18,000 units per year. The company can produce the item at a rate of 3000 units per month. The cost of one setup is Rs. 500, and the holding cost of one unit per month is Rs. 0.15. The shortage cost of one unit is Rs. 20 per month. Determine the optimum manufacturing quantity and the shortage quantity. Also determine the manufacturing time and the time between setups.

Solution For this problem, it is given that C_1 = Rs. 0.15 per unit per month, C_2 = Rs. 20 per unit per month, C_3 = Rs. 500.00 per setup, K = 3000 units per month and D = 18,000 units per year, i.e. 1500 units per month. The optimum manufacturing quantity Q^* is given by

Q^* = \sqrt{\frac{2C_3(C_1 + C_2)}{C_1 C_2}}\,\sqrt{\frac{KD}{K − D}} = \sqrt{\frac{2 \times 500 \times (0.15 + 20)}{0.15 \times 20}}\,\sqrt{\frac{3000 \times 1500}{3000 − 1500}} = 4489 \text{ units (approx.)}.

The optimum shortage quantity is given by

S_2^* = C_1\,\frac{K − D}{K}\,\frac{Q^*}{C_1 + C_2} = 17 \text{ units (approx.)}.

The manufacturing time is Q^*/K = 4489/3000 ≈ 1.5 months, and the time between setups is T^* = Q^*/D = 4489/1500 ≈ 3 months.
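The computations of Example 4 can be scripted directly from Eqs. (17.30), (17.32) and (17.36). Below is a minimal sketch in Python; the function and variable names are our own:

```python
from math import sqrt

def epq_with_shortage(C1, C2, C3, K, D):
    """Economic lot size with finite replenishment rate K and shortages.

    Returns the optimum lot size Q* (Eq. 17.32), the optimum shortage
    level S2* (Eq. 17.30) and the minimum average cost (Eq. 17.36).
    """
    Q = sqrt(2 * C3 * (C1 + C2) / (C1 * C2)) * sqrt(K * D / (K - D))
    S2 = C1 * ((K - D) / K) * Q / (C1 + C2)
    Cmin = sqrt(2 * C1 * C2 * C3 * D / (C1 + C2)) * sqrt((K - D) / K)
    return Q, S2, Cmin

# Example 4 data: all costs in Rs., all rates per month
Q, S2, Cmin = epq_with_shortage(C1=0.15, C2=20.0, C3=500.0, K=3000.0, D=1500.0)
print(round(Q), round(S2))                      # 4489 17
print(round(Q / 3000, 1), round(Q / 1500, 1))   # 1.5 3.0
```

Running it reproduces the values above: Q* ≈ 4489 units, S2* ≈ 17 units, a manufacturing time of about 1.5 months and a cycle of about 3 months.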
17.9 Multi-item Inventory Model
In the earlier sections, we discussed inventory models for a single item, or for each item separately; but if a relationship exists among the items through some shared limitation, it is no longer possible to consider them separately. Thus, after constructing the average cost expression in these models, we shall use the method of Lagrange multipliers to minimize the average cost. In all such problems, we first solve the problem ignoring the limitation and then consider the effect of the limitation. We now develop a multi-item inventory model under the following assumptions and notation:

(i) There are n items with instantaneous production; i.e. the production rate of each item is infinite.
(ii) Shortages are not allowed.

For the ith (i = 1, 2, …, n) item:

(iii) D_i is the uniform demand rate.
(iv) The inventory carrying cost C_{1i} per unit quantity per unit time and the ordering cost C_{3i} per order are known and constant.
(v) T_i is the cycle length.
Let Q_i be the ordering quantity of the ith item. Then

Q_i = D_i T_i, \quad \text{or} \quad T_i = Q_i/D_i.

Now, the total inventory (in quantity-time units) for the ith item over a cycle is \tfrac{1}{2} Q_i T_i. Hence, the inventory carrying cost for the ith item over the inventory cycle is \tfrac{1}{2} C_{1i} Q_i T_i. Therefore, the average cost for the ith item is

C_i = \left[C_{3i} + \tfrac{1}{2} C_{1i} Q_i T_i\right]/T_i, \quad \text{or} \quad C_i = C_{3i} D_i/Q_i + \tfrac{1}{2} C_{1i} Q_i.

Hence, the total average cost for n items is given by

C = \sum_{i=1}^{n} C_i = \sum_{i=1}^{n}\left(C_{3i} D_i/Q_i + \tfrac{1}{2} C_{1i} Q_i\right).

Here C is a function of Q_1, Q_2, …, Q_n.
For optimum values of Q_i (i = 1, 2, …, n), we must have

\frac{\partial C}{\partial Q_i} = 0, \quad \text{i.e.} \quad \tfrac{1}{2} C_{1i} − C_{3i} D_i/Q_i^2 = 0, \quad \text{or} \quad Q_i = \sqrt{\frac{2C_{3i}D_i}{C_{1i}}}.

Hence, the optimum value of Q_i is

Q_i^* = \sqrt{\frac{2C_{3i}D_i}{C_{1i}}}.   (17.37)
17.9.1 Limitation on Inventories

If there is a limitation on inventories requiring that the average number of units of all types in inventory should not exceed k, then the problem is to minimize the cost C subject to the condition

\tfrac{1}{2}\sum_{i=1}^{n} Q_i \le k \quad [\text{since the average inventory at any time for the ith item is } \tfrac{1}{2}Q_i],

or

\sum_{i=1}^{n} Q_i − 2k \le 0.   (17.38)

Here two cases may arise:

Case I: When \tfrac{1}{2}\sum_{i=1}^{n} Q_i^* \le k. In this case, the optimal values Q_i^* (i = 1, 2, …, n) given by Q_i^* = \sqrt{2C_{3i}D_i/C_{1i}} satisfy the constraint directly.

Case II: When \tfrac{1}{2}\sum_{i=1}^{n} Q_i^* > k. In this case, we have to solve the following problem:
Minimize

C = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right)

subject to the constraint (17.38). To solve this problem, we shall use the Lagrange multiplier method. The corresponding Lagrangian function is

L = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right) + \lambda\left[\sum_{i=1}^{n} Q_i − 2k\right],

where \lambda (> 0) is the Lagrange multiplier. The necessary conditions for L to be minimum are

\frac{\partial L}{\partial Q_i} = 0, \; i = 1, 2, …, n, \quad \text{and} \quad \frac{\partial L}{\partial \lambda} = 0.

Now, from \partial L/\partial Q_i = 0 we have

\tfrac{1}{2} C_{1i} − \frac{C_{3i} D_i}{Q_i^2} + \lambda = 0, \quad i = 1, 2, …, n, \quad \text{or} \quad Q_i = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda}\right)^{1/2}.

Again, from \partial L/\partial \lambda = 0 we have

\sum_{i=1}^{n} Q_i − 2k = 0, \quad \text{or} \quad \sum_{i=1}^{n} Q_i = 2k.

Hence, the optimum value of Q_i is

Q_i^* = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda^*}\right)^{1/2}   (17.39)

and

\sum_{i=1}^{n} Q_i^* = 2k.   (17.40)

To obtain the values of Q_i^* from (17.39), we find the optimal value \lambda^* of \lambda by the successive trial and error method or the linear interpolation method, subject to the condition given by (17.40). Equation (17.40) implies that the Q_i^* must satisfy the inventory constraint in an equality sense.
17.9.2 Limitation of Floor Space (or Warehouse Capacity)

Here we shall discuss the multi-item inventory model with a limitation on warehouse floor space. Let A be the maximum storage area available for the n different items, a_i the storage area required per unit of the ith item and Q_i the amount ordered of the ith item. Thus, the storage requirement constraint becomes

\sum_{i=1}^{n} a_i Q_i \le A, \quad Q_i > 0, \quad \text{or} \quad \sum_{i=1}^{n} a_i Q_i − A \le 0.   (17.41)

Now two possibilities may arise:

Case I: When \sum_{i=1}^{n} a_i Q_i^* \le A. In this case, the optimal values Q_i^* (i = 1, 2, …, n) given by Q_i^* = \sqrt{2C_{3i}D_i/C_{1i}} satisfy the constraint (17.41) directly. Hence, these optimal values Q_i^* are the required values.

Case II: When \sum_{i=1}^{n} a_i Q_i^* > A. In this case, we have to solve the following problem:

Minimize C = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right)

subject to the constraint (17.41). To solve it, we shall use the Lagrange multiplier method; the corresponding Lagrangian function is

L = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right) + \lambda\left[\sum_{i=1}^{n} a_i Q_i − A\right],

where \lambda (> 0) is the Lagrange multiplier. The necessary conditions for L to be minimum are

\frac{\partial L}{\partial Q_i} = 0, \; i = 1, 2, …, n, \quad \text{and} \quad \frac{\partial L}{\partial \lambda} = 0.

Now, from \partial L/\partial Q_i = 0, we have

\tfrac{1}{2} C_{1i} − \frac{C_{3i} D_i}{Q_i^2} + \lambda a_i = 0, \quad i = 1, 2, …, n.   (17.42a)

Again, from \partial L/\partial \lambda = 0, we have

\sum_{i=1}^{n} a_i Q_i − A = 0.   (17.42b)

Solving (17.42a) and (17.42b), we have the optimal values of Q_i as

Q_i^* = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda^* a_i}\right)^{1/2}, \quad i = 1, 2, …, n,   (17.43)

and

\sum_{i=1}^{n} a_i Q_i^* = A.   (17.44)

To obtain the values of Q_i^* from (17.43), we find the optimal value \lambda^* of \lambda by the successive trial and error method or the linear interpolation method, subject to the condition given by (17.44). Equation (17.44) implies that the Q_i^* must satisfy the storage constraint in an equality sense.
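The trial-and-error search for λ* can be automated: each Q_i in (17.43) decreases as λ grows, so the total floor space used is monotone in λ and bisection applies. Below is a sketch (function and variable names are our own; with all a_i = 1 and A = 2k it also covers the inventory-count limit of Sect. 17.9.1), run here on the data of Example 5 below:

```python
from math import sqrt

def constrained_lot_sizes(C1, C3, D, a, A, lam_hi=1e6, tol=1e-9):
    """Multi-item EOQ under a linear constraint sum(a_i * Q_i) <= A.

    Implements Q_i = sqrt(2*C3_i*D_i / (C1_i + 2*lambda*a_i)) (Eq. 17.43),
    with lambda found by bisection instead of manual trial and error.
    """
    def lots(lam):
        return [sqrt(2 * c3 * d / (c1 + 2 * lam * ai))
                for c1, c3, d, ai in zip(C1, C3, D, a)]

    def used(lam):
        return sum(ai * qi for ai, qi in zip(a, lots(lam)))

    if used(0.0) <= A:                 # Case I: constraint inactive
        return lots(0.0), 0.0
    lo, hi = 0.0, lam_hi               # Case II: used() decreases in lambda
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if used(mid) > A else (lo, mid)
    return lots(lo), lo

# Example 5 data, floor-space limit A = 640 m^2
Q, lam = constrained_lot_sizes([2.0, 3.0, 1.0], [100.0, 200.0, 75.0],
                               [5000.0, 2000.0, 10000.0],
                               [0.60, 0.80, 0.45], 640.0)
```

Bisection pins down λ* ≈ 5.47, marginally sharper than the linearly interpolated λ* = 5.5 used in Example 5, so the lot sizes come out a unit or two different (about 342, 261 and 503) while consuming the full 640 m².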
17.9.3 Limitation on Investment

In this case, there is an upper limit M on the amount to be invested in inventory. Let C_{4i} be the unit price of the ith item. Then

\sum_{i=1}^{n} C_{4i} Q_i \le M.   (17.45)

Now two possibilities may arise:

Case I: When \sum_{i=1}^{n} C_{4i} Q_i^* \le M. In this case, the constraint is satisfied automatically, and the optimal values Q_i^* (i = 1, 2, …, n) are given by

Q_i^* = \sqrt{\frac{2C_{3i}D_i}{C_{1i}}}.

Case II: When \sum_{i=1}^{n} C_{4i} Q_i^* > M. In this case, our problem is as follows:

Minimize C = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right)

subject to the constraint (17.45). To solve it, we shall use the Lagrange multiplier method; the corresponding Lagrangian function is

L = \sum_{i=1}^{n}\left(\tfrac{1}{2} C_{1i} Q_i + C_{3i} D_i/Q_i\right) + \lambda\left[\sum_{i=1}^{n} C_{4i} Q_i − M\right],

where \lambda (> 0) is the Lagrange multiplier. The necessary conditions for L to be minimum are

\frac{\partial L}{\partial Q_i} = 0, \; i = 1, 2, …, n, \quad \text{and} \quad \frac{\partial L}{\partial \lambda} = 0.

Now, from \partial L/\partial Q_i = 0, we have

Q_i = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda C_{4i}}\right)^{1/2}, \quad i = 1, 2, …, n.

Again, from \partial L/\partial \lambda = 0, we have

\sum_{i=1}^{n} C_{4i} Q_i − M = 0, \quad \text{or} \quad \sum_{i=1}^{n} C_{4i} Q_i = M.

Hence, the optimum value of Q_i is given by

Q_i^* = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda^* C_{4i}}\right)^{1/2}   (17.46)

and

\sum_{i=1}^{n} C_{4i} Q_i^* = M.   (17.47)
Thus, the values of Q_i^* are obtained from (17.46) subject to the condition given by (17.47), where the optimal value \lambda^* of \lambda can be found by either the successive trial and error method or the linear interpolation method.

Example 5 A workshop produces three machine parts A, B, C, and the total storage space available is 640 m². Obtain the optimal lot size for each item from the following data:

                                     A       B       C
Cost per unit (Rs.)                  10      15      5
Storage space required (m²/unit)     0.60    0.80    0.45
Ordering cost (Rs.) (C_3)            100     200     75
No. of units required/year           5000    2000    10,000

The carrying charge on each item is 20% of its unit cost.

Solution Considering one year as one unit of time, we have:

carrying cost of A: C_{11} = 20% of Rs. 10 = Rs. 2.00
carrying cost of B: C_{12} = 20% of Rs. 15 = Rs. 3.00
carrying cost of C: C_{13} = 20% of Rs. 5 = Re. 1.00.

Now, without considering the effect of the restriction on storage space, the optimal value Q_i^* of the ith item is given by Q_i^* = \sqrt{2C_{3i}D_i/C_{1i}}, i = 1, 2, 3. Hence

Q_1^* = \sqrt{\frac{2 \times 5000 \times 100}{2}} = 707, \quad Q_2^* = \sqrt{\frac{2 \times 2000 \times 200}{3}} = 516, \quad Q_3^* = \sqrt{\frac{2 \times 10{,}000 \times 75}{1}} = 1225.

Then the total storage space required for these values of Q_i^* (i = 1, 2, 3) is

\sum_{i=1}^{3} a_i Q_i^* = a_1 Q_1^* + a_2 Q_2^* + a_3 Q_3^* = (0.60 \times 707 + 0.80 \times 516 + 0.45 \times 1225) \text{ m}^2 = 1388.25 \text{ m}^2.
This storage space is greater than the available storage space of 640 m². Therefore, we shall find a suitable value of \lambda^* by the trial and error method for computing the Q_i^*, using

Q_i^* = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda^* a_i}\right)^{1/2} \quad \text{and} \quad \sum_{i=1}^{3} a_i Q_i^* = 640.

If we take \lambda^* = 5,

Q_1^* = \sqrt{\frac{2 \times 5000 \times 100}{2 + 2 \times 5 \times 0.60}} = 354, \quad Q_2^* = \sqrt{\frac{2 \times 2000 \times 200}{3 + 2 \times 5 \times 0.80}} = 270, \quad Q_3^* = \sqrt{\frac{2 \times 10{,}000 \times 75}{1 + 2 \times 5 \times 0.45}} = 522,

and the corresponding storage space is 0.60 × 354 + 0.80 × 270 + 0.45 × 522 = 663.30 m². This is greater than the available storage space of 640 m².

Again, if we take \lambda^* = 6, then

Q_1^* = \sqrt{\frac{2 \times 5000 \times 100}{2 + 2 \times 6 \times 0.60}} = 330, \quad Q_2^* = \sqrt{\frac{2 \times 2000 \times 200}{3 + 2 \times 6 \times 0.80}} = 252, \quad Q_3^* = \sqrt{\frac{2 \times 10{,}000 \times 75}{1 + 2 \times 6 \times 0.45}} = 484,

and the corresponding storage space is 0.60 × 330 + 0.80 × 252 + 0.45 × 484 = 617.40 m², which is less than the available storage space of 640 m². Hence, it is clear that the most suitable value of \lambda^* lies between 5 and 6.

Let us assume that the required storage space will be 640 m² for \lambda^* = x. Now, considering a linear relationship between the value of \lambda^* and the required storage space, we have

\frac{x − 6}{640 − 617.4} = \frac{5 − 6}{663.3 − 617.4}, \quad \text{or} \quad x − 6 = −\frac{22.6}{45.9}, \quad \text{or} \quad x = 5.5 \text{ (approx.)}, \quad \text{i.e.} \quad \lambda^* = 5.5.

For this value of \lambda^* (note that the carrying cost of item B is C_{12} = 3),

Q_1^* = \sqrt{\frac{2 \times 5000 \times 100}{2 + 2 \times 5.5 \times 0.60}} = 341, \quad Q_2^* = \sqrt{\frac{2 \times 2000 \times 200}{3 + 2 \times 5.5 \times 0.80}} = 260, \quad Q_3^* = \sqrt{\frac{2 \times 10{,}000 \times 75}{1 + 2 \times 5.5 \times 0.45}} = 502.
Hence, the optimal lot sizes of the three machine parts A, B, C are Q_1^* = 341 units, Q_2^* = 260 units and Q_3^* = 502 units.

Example 6 A company producing three items has a limited inventory capacity of, on average, 750 items of all types. Determine the optimal production quantity for each item when the following information is given:

Product               1      2      3
Holding cost (Rs.)    0.05   0.02   0.04
Ordering cost (Rs.)   50     40     60
Demand                100    120    75
Solution Neglecting the restriction on the total inventory level, the optimal value Q_i^* for the ith item is given by Q_i^* = \sqrt{2C_{3i}D_i/C_{1i}}, i = 1, 2, 3. Hence

Q_1^* = \sqrt{\frac{2 \times 50 \times 100}{0.05}} = 447, \quad Q_2^* = \sqrt{\frac{2 \times 40 \times 120}{0.02}} = 693, \quad Q_3^* = \sqrt{\frac{2 \times 60 \times 75}{0.04}} = 474.

Therefore, the total average inventory is (447 + 693 + 474)/2 = 807 units. But the permitted average inventory is 750 units. Therefore, we have to determine the value of the parameter \lambda^* by the trial and error method for computing the Q_i^*, using

Q_i^* = \left(\frac{2C_{3i}D_i}{C_{1i} + 2\lambda^*}\right)^{1/2} \quad \text{and} \quad \tfrac{1}{2}\sum_{i=1}^{3} Q_i^* = 750.
Now, for \lambda^* = 0.005,

Q_1^* = \sqrt{\frac{2 \times 50 \times 100}{0.05 + 2 \times 0.005}} = 408, \quad Q_2^* = 566, \quad Q_3^* = 424.

Therefore, the total average inventory is (408 + 566 + 424)/2 = 699 units, which is less than the permitted average inventory.

Again, for \lambda^* = 0.003, Q_1^* = 423, Q_2^* = 608, Q_3^* = 442, and the average inventory is (423 + 608 + 442)/2 = 737 units, which is less than 750 units.

Again, for \lambda^* = 0.002, we have Q_1^* = 430, Q_2^* = 632, Q_3^* = 452, and the average inventory is (430 + 632 + 452)/2 = 757 units, which is greater than 750 units. Therefore, the most suitable value of \lambda^* lies between 0.002 and 0.003.

Let us assume that for \lambda^* = x the average inventory will be 750 units. Now, considering a linear relationship between \lambda^* and the average inventory, we have

\frac{x − 0.003}{0.002 − 0.003} = \frac{750 − 737}{757 − 737}, \quad \text{or} \quad x − 0.003 = −0.001 \times \frac{13}{20}, \quad \text{or} \quad x = 0.00235, \quad \text{i.e.} \quad \lambda^* = 0.00235.

For \lambda^* = 0.00235, Q_1^* = 428, Q_2^* = 623, Q_3^* = 449. Hence, the optimal production quantities for products 1, 2 and 3 are 428, 623 and 449 units respectively.
17.10 Inventory Models with Price Breaks

In the earlier discussion, we assumed that the unit production cost or unit purchase cost is constant, so we did not need to consider this cost in the analysis. In reality, however, the unit cost of an item is not always independent of the quantity procured or produced: discounts are offered by the supplier, wholesaler or manufacturer for the purchase of large quantities. Such discounts are referred to as quantity discounts or price breaks. In this section, we shall consider a class of inventory models in which the unit cost is a variable. When items are purchased in bulk, a discounted price is usually offered by the supplier. Let us assume that the unit purchase cost of an item is p_j when the purchased quantity lies between b_{j−1} and b_j (j = 1, 2, …, m). Explicitly, we have:
Range of quantity to be purchased    Unit purchase cost
b_0 < Q < b_1                        p_1
b_1 \le Q < b_2                      p_2
b_2 \le Q < b_3                      p_3
…                                    …
b_{j−1} \le Q < b_j                  p_j
…                                    …
b_{m−1} \le Q < b_m                  p_m

In general, b_0 = 0, b_m = ∞ and p_1 > p_2 > ⋯ > p_j > ⋯ > p_m. The values b_1, b_2, b_3, …, b_{m−1} are termed price breaks, as the unit price changes at these values. The problem is to determine an economic order quantity Q which minimizes the total cost. In this model, the assumptions are as follows:

(i) The demand rate is known and uniform.
(ii) Shortages are not permitted.
(iii) Production for the supply of commodities is instantaneous.
(iv) The lead time is zero.
17.10.1 Purchasing Inventory Model with Single Price Break

Let D be the demand rate, C_1 the holding cost per unit quantity per unit time and C_3 the fixed ordering cost per order. Also, let p_1 be the purchasing cost per unit quantity if the ordered quantity is less than b, and p_2 (p_2 < p_1) the purchasing cost per unit quantity if the ordered quantity is greater than or equal to b, i.e.

Range of quantity to be purchased    Unit purchase cost
0 < Q < b                            p_1
b \le Q                              p_2

Working rule:

Step 1. Evaluate Q^* from the formula Q^* = \sqrt{2C_3D/C_1} for the case Q \ge b (i.e. with unit cost p_2).
(i) If Q^* \ge b, then the optimum lot size is Q^*.
(ii) If Q^* < b, then go to Step 2.

Step 2. Evaluate Q^* from the formula Q^* = \sqrt{2C_3D/C_1} for the case Q < b (i.e. with unit cost p_1), and evaluate C'(Q^*) and C''(b), where C' and C'' denote the total cost with unit costs p_1 and p_2 respectively.
(i) If C'(Q^*) < C''(b), then Q^* is the optimum lot size.
(ii) Otherwise, b is the optimal lot size.
17.10.2 Purchase Inventory Model with Two Price Breaks

Range of quantity to be purchased    Unit purchase cost
0 < Q < b_1                          p_1
b_1 \le Q < b_2                      p_2
b_2 \le Q                            p_3
The procedure used for one price break is extended to the case with two price breaks.

Working rule:

Step 1. Compute Q^* for Q \ge b_2 (say Q_3^*). If Q_3^* \ge b_2, then the optimal lot size is Q_3^*; otherwise, go to Step 2.

Step 2. Compute Q^* for b_1 \le Q < b_2 (say Q_2^*). Since Q_3^* < b_2, Q_2^* is also less than b_2. There are two possibilities: either Q_2^* \ge b_1 or Q_2^* < b_1. If Q_2^* \ge b_1, then compare the costs C(Q_2^*) and C(b_2) to obtain the optimum lot size; the quantity with the lower cost is the optimum one. If Q_2^* < b_1, then go to Step 3.

Step 3. Compute Q_1^* for the case 0 < Q < b_1 and compare the costs C(Q_1^*), C(b_1) and C(b_2) to determine the optimal lot size; the quantity with the lowest cost is the optimum one.

Example 7 Find the optimum order quantity for a product for which the price breaks are as follows:

Range of quantity to be purchased    Purchase cost per unit
0 < Q < 100                          Rs. 20.00
100 \le Q < 200                      Rs. 18.00
200 \le Q                            Rs. 16.00

The monthly demand for the product is 400 units. The storage cost is 20% of the unit cost of the product, and the cost of ordering is Rs. 25.00 per order.
Solution Here D = 400 units/month, C_3 = Rs. 25.00 per order and C_1 = 20% of the purchase cost per unit, i.e. 0.2 times the purchase cost per unit. Let

Q_3^* = \sqrt{\frac{2C_3D}{C_1}} \text{ for } Q \ge 200, \quad \text{i.e.} \quad Q_3^* = \sqrt{\frac{2 \times 25 \times 400}{0.2 \times 16}} = 79.

Since Q_3^* < 200, Q_3^* is not the optimum order quantity, and we proceed to calculate Q_2^*:

Q_2^* = \sqrt{\frac{2C_3D}{C_1}} \text{ for } 100 \le Q < 200, \quad \text{i.e.} \quad Q_2^* = \sqrt{\frac{2 \times 25 \times 400}{0.2 \times 18}} = 75.

Again, since Q_2^* < 100, Q_2^* is not the optimum order quantity, and we proceed to calculate Q_1^*:

Q_1^* = \sqrt{\frac{2C_3D}{C_1}} \text{ for } 0 < Q < 100, \quad \text{i.e.} \quad Q_1^* = \sqrt{\frac{2 \times 25 \times 400}{0.2 \times 20}} = 71.

Since 0 < Q_1^* < 100, we have to compute C(Q_1^*), C(100) and C(200). Now

C(Q_1^*) = C'(Q_1^*) = p_1 D + \tfrac{1}{2} C_1 Q_1^* + \frac{C_3 D}{Q_1^*} = 20 \times 400 + \tfrac{1}{2} \times 0.2 \times 20 \times 71 + \frac{25 \times 400}{71} = \text{Rs. } 8282.85,

C(100) = C''(100) = p_2 D + \tfrac{1}{2} C_1 \times 100 + \frac{C_3 D}{100} = 18 \times 400 + \tfrac{1}{2} \times 0.2 \times 18 \times 100 + \frac{25 \times 400}{100} = \text{Rs. } 7480.00

and

C(200) = C'''(200) = p_3 D + \tfrac{1}{2} C_1 \times 200 + \frac{C_3 D}{200} = 16 \times 400 + \tfrac{1}{2} \times 0.2 \times 16 \times 200 + \frac{25 \times 400}{200} = \text{Rs. } 6770.00.

Since C(200) < C(100) < C(Q_1^*), the optimal order quantity is 200 units, i.e. Q^* = 200.
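The two working rules generalize to any number of all-units price breaks. In each price band: take the unconstrained EOQ if it lies in the band; clamp it up to the band's break point if it falls below; and skip the band entirely if its EOQ already reaches a cheaper band, since the next band undercuts it at the same quantity. A sketch in Python (names are our own), reproducing Example 7:

```python
from math import sqrt

def price_break_eoq(D, C3, I, breaks):
    """All-units quantity-discount EOQ.

    breaks: list of (minimum quantity, unit price) with ascending quantities
    and descending prices. Holding cost per unit per period is I * price.
    Returns (Q*, minimum total cost per period).
    """
    best = None
    for j, (b, p) in enumerate(breaks):
        upper = breaks[j + 1][0] if j + 1 < len(breaks) else float("inf")
        Q = sqrt(2 * C3 * D / (I * p))   # unconstrained EOQ at price p
        if Q >= upper:                   # a cheaper band is reachable at this
            continue                     # quantity, so this band is dominated
        Q = max(Q, b)                    # clamp up to the band's break point
        cost = p * D + 0.5 * I * p * Q + C3 * D / Q
        if best is None or cost < best[1]:
            best = (Q, cost)
    return best

# Example 7: D = 400 units/month, C3 = Rs. 25, carrying charge I = 20%
Q, cost = price_break_eoq(400, 25.0, 0.20,
                          [(0, 20.0), (100, 18.0), (200, 16.0)])
print(round(Q), round(cost, 2))   # 200 6770.0
```

Clamping up to the break point is valid because the per-band cost is convex in Q, so the constrained minimum sits at the nearest feasible quantity.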
17.11 Probabilistic Inventory Model

Here we consider situations in which the demand is not known exactly, but its probability distribution is known. The control variable in such cases is taken to be the scheduling period, the order level, or both. The optimum order level is then derived by minimizing the total expected cost rather than an actual (deterministic) cost.
17.11.1 Single Period Inventory Model with Continuous Demand (No Replenishment Cost Model)

In this model, we have to find the optimum order quantity so as to minimize the total expected cost under the following assumptions:

(i) The scheduling period T is fixed and known. Hence, we do not include the setup cost in our derivation, as it is a prescribed constant.
(ii) Production is instantaneous.
(iii) The lead time is zero.
(iv) The demand is uniformly distributed over the period.
(v) Shortages are allowed and fully backlogged.
(vi) The holding cost C_1 per unit quantity per unit time and the shortage cost C_2 per unit quantity per unit time are known and constant.

Model formulation: Let x be the amount on hand before an order is placed. Also, let Q be the level of inventory at the beginning of each period, and r units the demand per time period. Depending on the amount r, two cases may arise:

Case I: r \le Q
Case II: r > Q
Fig. 17.5 Pictorial representation of single period model with continuous demand (Case I)

Fig. 17.6 Pictorial representation of single period model with continuous demand (Case II)
The inventory situation for the case r \le Q is shown in Fig. 17.5, and that for the case r > Q in Fig. 17.6. In the first case (r \le Q), no shortage occurs. In the second case (r > Q), both holding and shortage costs are involved.

Discrete case: when r is a discrete random variable. Let the demand D = r units occur at a discontinuous rate with probability p(r), r = 0, 1, 2, …; that is, we may expect a demand of one unit with probability p(1), of 2 units with probability p(2), and so on. Since all the possibilities are to be considered, we must have \sum_{r=0}^{\infty} p(r) = 1 and p(r) \ge 0. We also assume that r takes only non-negative integer values.

Case I: In this case, r \le Q. Thus, there is no shortage, and the total inventory is represented by the area OAMB = \tfrac{1}{2}(Q + Q − r)T = (Q − r/2)T (see Fig. 17.5).
Hence, the holding cost for the time period T is C_1(Q − r/2)T when r (\le Q) units are demanded in one period. Since the probability of a demand of r units is p(r), the expected value of this cost is C_1(Q − r/2)T\,p(r). Summing over all r \le Q, the total expected cost when r \le Q is

\sum_{r=0}^{Q} C_1\left(Q − \frac{r}{2}\right) T\, p(r).

Case II: In this case, r > Q, and both the holding and shortage costs are involved. Here, the holding cost is \tfrac{1}{2}C_1 Q t_1 and the shortage cost is \tfrac{1}{2}C_2 (r − Q) t_2, where t_1 and t_2 represent the no-shortage and shortage durations and t_1 + t_2 = T. Now, from the similar triangles OBC and ACM in Fig. 17.6, we have

\frac{t_1}{Q} = \frac{t_2}{r − Q} = \frac{t_1 + t_2}{r} = \frac{T}{r}, \quad \text{so} \quad t_1 = \frac{QT}{r} \quad \text{and} \quad t_2 = \frac{(r − Q)T}{r}.

Hence, the expected cost is

\sum_{r=Q+1}^{\infty}\left[\frac{1}{2}C_1 Q t_1\, p(r) + \frac{1}{2}C_2 (r − Q) t_2\, p(r)\right] = \sum_{r=Q+1}^{\infty}\left[C_1 \frac{Q^2}{2r}\, T\, p(r) + C_2 \frac{(r − Q)^2}{2r}\, T\, p(r)\right].

Therefore, the average expected cost is given by

TEC(Q) = C_1 \sum_{r=0}^{Q}\left(Q − \frac{r}{2}\right) p(r) + C_1 \sum_{r=Q+1}^{\infty} \frac{Q^2}{2r}\, p(r) + C_2 \sum_{r=Q+1}^{\infty} \frac{(r − Q)^2}{2r}\, p(r).   (17.49)

The problem is now to find Q so as to minimize TEC(Q). Let an amount Q + 1 instead of Q be produced. Then the average total expected cost is

TEC(Q + 1) = C_1 \sum_{r=0}^{Q+1}\left(Q + 1 − \frac{r}{2}\right) p(r) + C_1 \sum_{r=Q+2}^{\infty} \frac{(Q + 1)^2}{2r}\, p(r) + C_2 \sum_{r=Q+2}^{\infty} \frac{(r − Q − 1)^2}{2r}\, p(r).
But

C_1 \sum_{r=0}^{Q+1}\left(Q + 1 − \frac{r}{2}\right) p(r) = C_1 \sum_{r=0}^{Q}\left(Q − \frac{r}{2}\right) p(r) + C_1 \sum_{r=0}^{Q} p(r) + C_1\, \frac{Q + 1}{2}\, p(Q + 1).

Again,

C_1 \sum_{r=Q+2}^{\infty} \frac{(Q + 1)^2}{2r}\, p(r) = C_1 \sum_{r=Q+1}^{\infty} \frac{Q^2}{2r}\, p(r) + C_1 Q \sum_{r=Q+1}^{\infty} \frac{p(r)}{r} + \frac{C_1}{2} \sum_{r=Q+1}^{\infty} \frac{p(r)}{r} − \frac{C_1}{2}(Q + 1)\, p(Q + 1).

Similarly,

C_2 \sum_{r=Q+2}^{\infty} \frac{(r − Q − 1)^2}{2r}\, p(r) = C_2 \sum_{r=Q+1}^{\infty} \frac{(r − Q)^2}{2r}\, p(r) − C_2 \sum_{r=Q+1}^{\infty} p(r) + C_2 Q \sum_{r=Q+1}^{\infty} \frac{p(r)}{r} + \frac{C_2}{2} \sum_{r=Q+1}^{\infty} \frac{p(r)}{r}.

Substituting these values in TEC(Q + 1) and then simplifying, we get

TEC(Q + 1) = TEC(Q) + (C_1 + C_2)\left[\sum_{r=0}^{Q} p(r) + \left(Q + \frac{1}{2}\right) \sum_{r=Q+1}^{\infty} \frac{p(r)}{r}\right] − C_2.

If we set

L(Q) = \sum_{r=0}^{Q} p(r) + \left(Q + \frac{1}{2}\right) \sum_{r=Q+1}^{\infty} \frac{p(r)}{r},   (17.50)
then

TEC(Q + 1) = TEC(Q) + (C_1 + C_2) L(Q) − C_2.   (17.51)

Similarly, substituting Q − 1 in place of Q in (17.49), we have

TEC(Q − 1) = TEC(Q) − (C_1 + C_2) L(Q − 1) + C_2.   (17.52)

For optimal Q, we must have TEC(Q + 1) − TEC(Q) > 0, i.e. (C_1 + C_2) L(Q) − C_2 > 0 [from (17.51)], or

L(Q) > \frac{C_2}{C_1 + C_2}.   (17.53)

Again, for optimal Q, we have TEC(Q − 1) − TEC(Q) > 0, or −(C_1 + C_2) L(Q − 1) + C_2 > 0, or

L(Q − 1) < \frac{C_2}{C_1 + C_2}.   (17.54)

Combining (17.53) and (17.54), the optimal value Q^* satisfies

L(Q^* − 1) < \frac{C_2}{C_1 + C_2} < L(Q^*),   (17.55)

where L(Q) = \sum_{r=0}^{Q} p(r) + (Q + \tfrac{1}{2}) \sum_{r=Q+1}^{\infty} p(r)/r. Using the relation (17.55), we find the range of optimum values of Q. In these cases, Q^* need not be unique: if C_2/(C_1 + C_2) = L(Q^*), then both Q^* and Q^* + 1 are optimal; similarly, if C_2/(C_1 + C_2) = L(Q^* − 1), then both Q^* and Q^* − 1 are optimal.
Continuous case: when r is a continuous random variable. When the uncertain demand is estimated as a continuous random variable, the expressions for the holding and shortage costs involve integrals instead of summation signs.
Let f(r) be the known probability density function of the demand r. The discrete point probabilities p(r) are replaced by the probability differential f(r)dr for a small interval, say (r − dr/2, r + dr/2). In this case, we have

\int_0^{\infty} f(r)\,dr = 1 \quad \text{and} \quad f(r) \ge 0.
Case 1: When r \le Q. Proceeding as before, the holding cost is C_1(Q − r/2)T and there is no shortage cost.

Case 2: When r > Q. Proceeding as before, the holding cost is C_1 \frac{Q^2}{2r}\, T and the shortage cost is C_2 \frac{(r − Q)^2}{2r}\, T.

Therefore, the average expected cost is

TEC(Q) = \int_0^{Q} C_1\left(Q − \frac{r}{2}\right) f(r)\,dr + \int_Q^{\infty}\left[C_1 \frac{Q^2}{2r} + C_2 \frac{(r − Q)^2}{2r}\right] f(r)\,dr.

Differentiating with respect to Q (using Leibniz's rule for the variable limits of integration) and simplifying, we have

\frac{d\,TEC(Q)}{dQ} = (C_1 + C_2) \int_0^{Q} f(r)\,dr + (C_1 + C_2) \int_Q^{\infty} \frac{Q\, f(r)}{r}\,dr − C_2.   (17.56)

The necessary condition for TEC(Q) to be optimum is d\,TEC(Q)/dQ = 0 for Q = Q^*, i.e.

\int_0^{Q^*} f(r)\,dr + \int_{Q^*}^{\infty} \frac{Q^* f(r)}{r}\,dr = \frac{C_2}{C_1 + C_2}.   (17.57)
Again, from (17.56),

\frac{d^2\,TEC(Q)}{dQ^2} = (C_1 + C_2) f(Q) + (C_1 + C_2) \int_Q^{\infty} \frac{f(r)}{r}\,dr − (C_1 + C_2) f(Q) = (C_1 + C_2) \int_Q^{\infty} \frac{f(r)}{r}\,dr > 0.
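Equation (17.57) rarely has a closed-form solution, but because its left-hand side is increasing in Q (this is exactly the positive second derivative just shown), it can be solved by bisection once f(r) is specified. A sketch for demand uniform on [a, b], an illustrative choice of density; the cost parameters in the usage line are made up:

```python
from math import log

def optimal_q_uniform(C1, C2, a, b, tol=1e-10):
    """Solve Eq. (17.57) for demand uniformly distributed on [a, b].

    For a <= Q <= b the left-hand side of (17.57) is
        G(Q) = ((Q - a) + Q * log(b / Q)) / (b - a),
    which increases in Q, so we bisect on G(Q) = C2 / (C1 + C2).
    Assumes G(a) <= C2/(C1+C2), i.e. the root lies inside [a, b].
    """
    target = C2 / (C1 + C2)

    def G(Q):
        return ((Q - a) + Q * log(b / Q)) / (b - a)

    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if G(mid) < target else (lo, mid)
    return 0.5 * (lo + hi)

# e.g. C1 = 1, C2 = 9 (critical ratio 0.9), demand uniform on [100, 200]
Q = optimal_q_uniform(1.0, 9.0, 100.0, 200.0)
```

For these parameters the optimum falls a little above 140 units; substituting the returned Q back into G confirms the critical ratio 0.9.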
Hence, Eq. (17.57) gives the optimum value of Q for the minimum expected average cost.

Example 8 A contractor of second-hand motor trucks maintains a stock of trucks every month. The demand for the trucks occurs at a relatively constant rate but not in a constant amount, and follows the probability distribution:

Demand (r)           0      1      2      3      4      5      6 or more
Probability p(r)     0.40   0.24   0.20   0.10   0.05   0.01   0.00

The holding cost of an old truck in stock for one month is Rs. 100.00, and the penalty for a truck not being supplied on demand is Rs. 1000.00. Determine the optimal size of the stock for the contractor.

Solution
Q = r   p(r)   p(r)/r   \sum_{r=Q+1}^{\infty} p(r)/r   Q + 1/2   (Q + 1/2)\sum_{r=Q+1}^{\infty} p(r)/r   \sum_{r=0}^{Q} p(r)   L(Q)
0       0.40   —        0.3878                          0.5       0.1939                                  0.40                  0.5939
1       0.24   0.2400   0.1478                          1.5       0.2217                                  0.64                  0.8617
2       0.20   0.1000   0.0478                          2.5       0.1195                                  0.84                  0.9595
3       0.10   0.0333   0.0145                          3.5       0.0508                                  0.94                  0.9908
4       0.05   0.0125   0.0020                          4.5       0.0090                                  0.99                  0.9990
5       0.01   0.0020   0.0000                          5.5       0.0000                                  1.00                  1.0000
≥6      0.00   0.0000   0.0000                          6.5       0.0000                                  1.00                  1.0000

Here C_1 = Rs. 100.00 and C_2 = Rs. 1000.00. Therefore,

\frac{C_2}{C_1 + C_2} = \frac{1000}{100 + 1000} = \frac{10}{11} = 0.9090.
Now, for the optimal value Q^* of Q, we must have

L(Q^* − 1) < \frac{C_2}{C_1 + C_2} < L(Q^*),

where L(Q) = \sum_{r=0}^{Q} p(r) + (Q + \tfrac{1}{2})\sum_{r=Q+1}^{\infty} p(r)/r. From the table, it is clear that the preceding inequality is satisfied for Q = 2; i.e. L(1) < 0.9090 < L(2). Hence, Q^* = 2; i.e. the optimum stock level is 2 trucks.
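The tabular search of Example 8 amounts to scanning for the first Q at which L(Q) crosses the critical ratio, which is valid because L(Q) is nondecreasing in Q. A sketch in Python (names are our own):

```python
def optimal_stock(p, C1, C2):
    """Smallest Q with L(Q) > C2/(C1+C2), i.e. the Q* of Eq. (17.55).

    p is a dict {demand r: probability}; L(Q) is as in Eq. (17.50).
    """
    ratio = C2 / (C1 + C2)

    def L(Q):
        return (sum(pr for r, pr in p.items() if r <= Q)
                + (Q + 0.5) * sum(pr / r for r, pr in p.items() if r > Q))

    Q = 0
    while L(Q) <= ratio:   # L is nondecreasing, so stop at the first crossing
        Q += 1
    return Q

# Example 8: holding cost Rs. 100/month, shortage penalty Rs. 1000/month
p = {0: 0.40, 1: 0.24, 2: 0.20, 3: 0.10, 4: 0.05, 5: 0.01}
print(optimal_stock(p, 100.0, 1000.0))   # 2
```

Because the shortage penalty dwarfs the holding cost here, the critical ratio 10/11 sits high up the L(Q) curve; swapping the two costs drives the optimal stock down to zero.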
17.11.2 Single Period Inventory Model with Instantaneous Demand (No Replenishment Cost Model)

In this model, we have to find the optimum order quantity which minimizes the total expected cost under the following assumptions:

(i) T is the constant interval between orders (T may also be taken as unity, e.g. a day, a week or a month).
(ii) Q is the stock level at the beginning of each period T.
(iii) The lead time is zero.
(iv) The holding cost C1 per unit quantity per unit time and the shortage cost C2 per unit quantity per unit time are known and constant.
(v) r is the demand in each interval T.

Model formulation: In this model it is assumed that the total demand is filled at the beginning of the period. Thus, depending on the amount r demanded, the inventory position just after the demand occurs may be either positive (surplus) or negative (shortage); i.e. there are two cases:

Case 1: r <= Q
Case 2: r > Q

The corresponding inventory situations are shown in Figs. 17.7 and 17.8.

Discrete case (r discrete): Let r be the estimated demand at an instantaneous rate, with probabilities p(r). In the first case (r <= Q) there is only a holding cost and no shortage cost; the holding cost is C1(Q - r).
[Fig. 17.7 Pictorial representation of the single period model with no shortage (r <= Q): the inventory falls from Q to Q - r over the period T.]

[Fig. 17.8 Pictorial representation of the single period model with no inventory (r > Q): a shortage of r - Q builds up over the period T.]
In the second case (r > Q), the demand r is filled at the beginning of the period, and there is only a shortage cost, no holding cost; the shortage cost is C2(r - Q). Therefore, the total expected cost for this model is

TEC(Q) = C1 \sum_{r=0}^{Q} (Q - r) p(r) + C2 \sum_{r=Q+1}^{\infty} (r - Q) p(r).   (17.58)
Our problem is now to find Q so that TEC(Q) is minimum. Let an amount Q + 1 instead of Q be ordered. Then the total expected cost given in (17.58) becomes

TEC(Q+1) = C1 \sum_{r=0}^{Q+1} (Q + 1 - r) p(r) + C2 \sum_{r=Q+2}^{\infty} (r - Q - 1) p(r).

On simplification, we have

TEC(Q+1) = C1 \sum_{r=0}^{Q} (Q - r) p(r) + C2 \sum_{r=Q+1}^{\infty} (r - Q) p(r) + C1 \sum_{r=0}^{Q} p(r) - C2 \sum_{r=Q+1}^{\infty} p(r)
         = TEC(Q) + (C1 + C2) \sum_{r=0}^{Q} p(r) - C2   [since \sum_{r=0}^{\infty} p(r) = 1],

so

TEC(Q+1) - TEC(Q) = (C1 + C2) \sum_{r=0}^{Q} p(r) - C2.   (17.59)

Similarly, when an amount Q - 1 instead of Q is ordered,

TEC(Q-1) = TEC(Q) - (C1 + C2) \sum_{r=0}^{Q-1} p(r) + C2,

or

TEC(Q-1) - TEC(Q) = -(C1 + C2) \sum_{r=0}^{Q-1} p(r) + C2.   (17.60)

For the optimal Q*, we must have TEC(Q* + 1) - TEC(Q*) > 0. From (17.59), we have

(C1 + C2) \sum_{r=0}^{Q*} p(r) - C2 > 0,

or

\sum_{r=0}^{Q*} p(r) > C2 / (C1 + C2).   (17.61)

Similarly, for the optimal Q*, TEC(Q* - 1) - TEC(Q*) > 0.
From (17.60), we have

-(C1 + C2) \sum_{r=0}^{Q*-1} p(r) + C2 > 0,

or

\sum_{r=0}^{Q*-1} p(r) < C2 / (C1 + C2).   (17.62)

Thus, combining (17.61) and (17.62), we have

\sum_{r=0}^{Q*-1} p(r) < C2 / (C1 + C2) < \sum_{r=0}^{Q*} p(r),   (17.63)

or P(r <= Q* - 1) < C2/(C1 + C2) < P(r <= Q*), where P(r <= Q*) denotes the probability that the demand does not exceed Q*.

Continuous case (r a continuous variable): Let the demand r be a continuous variable with probability density function f(r). Proceeding as before, the total expected cost for this model is

TEC(Q) = C1 \int_0^Q (Q - r) f(r) dr + C2 \int_Q^\infty (r - Q) f(r) dr.
Hence

dTEC(Q)/dQ = C1 \int_0^Q f(r) dr - C2 \int_Q^\infty f(r) dr   [using Leibniz's rule for differentiation under the integral sign]
           = C1 \int_0^Q f(r) dr - C2 [ \int_0^\infty f(r) dr - \int_0^Q f(r) dr ]
           = -C2 + (C1 + C2) \int_0^Q f(r) dr.

For the optimum value of Q, we must have dTEC(Q)/dQ = 0,
i.e.

(C1 + C2) \int_0^Q f(r) dr = C2,

or

\int_0^{Q*} f(r) dr = C2 / (C1 + C2).   (17.64)
Moreover, it can be proved that

d^2 TEC(Q)/dQ^2 = (C1 + C2) f(Q) > 0.

Therefore, the optimum value of Q, i.e. Q*, is given by (17.64).

Example 9 A newspaper boy buys papers for Rs. 1.40 each and sells them for Rs. 2.00 each. He cannot return unsold newspapers. The daily demand has the following distribution:

No. of customers (r):  23    24    25    26    27    28    29    30    31    32
Probability p(r):      0.01  0.03  0.06  0.10  0.20  0.25  0.15  0.10  0.05  0.05
If each day's demand is independent of the previous day's demand, how many papers should he order each day?

Solution Let Q be the number of newspapers ordered per day and r the demand, i.e. the number actually sold per day. In this problem, the holding cost per paper per day is C1 = Rs. 1.40, and the shortage cost per paper per day is C2 = Rs. (2.00 - 1.40) = Rs. 0.60. To obtain the optimal solution, we compute the cumulative probability of the daily demand:

r:                      23    24    25    26    27    28    29    30    31    32
p(r):                   0.01  0.03  0.06  0.10  0.20  0.25  0.15  0.10  0.05  0.05
\sum_{r=23}^{Q} p(r):   0.01  0.04  0.10  0.20  0.40  0.65  0.80  0.90  0.95  1.00
The required optimum value Q* is determined by

\sum_{r=23}^{Q*-1} p(r) < C2 / (C1 + C2) < \sum_{r=23}^{Q*} p(r),

or, since C2/(C1 + C2) = 0.60/(1.40 + 0.60) = 3/10 = 0.3,

\sum_{r=23}^{Q*-1} p(r) < 0.3 < \sum_{r=23}^{Q*} p(r).

From the table, we have

\sum_{r=23}^{27} p(r) = 0.40 > 0.3  and  \sum_{r=23}^{26} p(r) = 0.20 < 0.3,

so

\sum_{r=23}^{26} p(r) < 0.3 < \sum_{r=23}^{27} p(r).
Hence Q* = 27; i.e. the optimum number of newspapers to be ordered is 27.

Example 10 The demand for a certain product has a rectangular (uniform) distribution between 4000 and 5000. Find the optimal order quantity if the storage cost is Re. 1.00 per unit and the shortage cost is Rs. 7.00 per unit.

Solution Here C1 = Re. 1.00, C2 = Rs. 7.00 and C3 = 0. The required optimum value Q* is determined by

\int_0^{Q*} f(r) dr = C2 / (C1 + C2).   (17.65)
Since the distribution is rectangular, the probability density function is f(r) = 1/(b - a) for a <= r <= b, with a = 4000 and b = 5000 (and f(r) = 0 for r < 4000). Hence, from (17.65),

\int_{4000}^{Q} (1/(5000 - 4000)) dr = 7/(1 + 7),  i.e.  \int_{4000}^{Q} (1/1000) dr = 7/8,

or

(Q - 4000)/1000 = 7/8,  so  Q = 4875.

Hence, the optimal order quantity is Q* = 4875 units.

Example 11 An ice cream company sells one type of its ice cream by weight. If the product is not sold on the day it is prepared, it can be sold at a loss of Rs. 0.50 per pound, and there is an unlimited market for one-day-old ice cream. On the other hand, the company makes a profit of Rs. 3.20 on every pound of ice cream sold on the day it is prepared. If daily orders follow a distribution with density f(x) = 0.02 - 0.0002x, 0 <= x <= 100, how many pounds of ice cream should the company prepare every day?
Solution It is given that C1 = Rs. 0.50 and C2 = Rs. 3.20. Let Q be the amount of ice cream prepared every day. The required optimum Q* is determined by

\int_0^{Q} f(x) dx = C2 / (C1 + C2),

or

\int_0^{Q} (0.02 - 0.0002x) dx = 3.2 / (0.5 + 3.2),

or

0.0002 Q^2 - 0.04 Q + 1.730 = 0.

On solving, Q* = 136.7 or 63.5. But Q* = 136.7 is not possible, since 0 <= x <= 100. Hence Q* = 63.5 lb is the optimum quantity; i.e. the company should prepare 63.5 lb of ice cream per day.
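Example 11's quadratic can be reproduced numerically. The sketch below integrates f(x) = 0.02 - 0.0002x analytically and applies the standard quadratic formula; the admissible root it finds (about 63.2) agrees with the text's 63.5 up to the rounding used there:

```python
# Numerical check of Example 11 (single period, continuous demand).
import math

C1, C2 = 0.50, 3.20
target = C2 / (C1 + C2)                 # approx 0.8649

# Integral of f: 0.02*Q - 0.0001*Q**2 = target
# =>  0.0001*Q**2 - 0.02*Q + target = 0
a, b, c = 0.0001, -0.02, target
disc = b * b - 4 * a * c
roots = [(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)]

# Only a root inside the demand support [0, 100] is admissible.
Q_star = min(r for r in roots if 0 <= r <= 100)
print(round(Q_star, 1))
```

The second root (about 136.8) lies outside the support [0, 100] and is discarded, exactly as in the hand solution.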
17.12 Fuzzy Inventory Model
In this section, we shall discuss the solution procedure of a fuzzy non-linear programming problem. A deterministic (crisp) non-linear programming problem may be defined as follows:

Minimize g_0(x, C_0)
subject to g_i(x, C_i) <= b_i,  i = 1, 2, ..., m,  and x >= 0,   (17.66)

where x = (x_1, x_2, ..., x_n)^T is the variable vector and g_0, g_1, ..., g_m are algebraic expressions in x with coefficient vectors C_0 = (C_{01}, C_{02}, ..., C_{0 l_0}) and C_i = (C_{i1}, C_{i2}, ..., C_{i l_i}) respectively. Introducing fuzziness in the crisp parameters, Problem (17.66) reduces to

Minimize g_0(x, ~C_0)
subject to g_i(x, ~C_i) <~ ~b_i,  i = 1, 2, ..., m,  and x >= 0,   (17.67)
where the wavy bar (~) represents the fuzziness of the parameters. According to fuzzy set theory, the fuzzy objective, coefficients and constraints are defined by their membership functions, which may be either linear or non-linear. Following Bellman and Zadeh (1970) and the techniques of Carlsson and Korhonen (1986) and Trappy et al. (1988), Problem (17.67) is transformed into the optimization problem

Maximize alpha
subject to g_i(x, mu_{C_i}^{-1}(alpha)) <= mu_{b_i}^{-1}(alpha),  i = 0, 1, 2, ..., m,  and x >= 0,   (17.68)

where mu_{C_i}(x) = { mu_{C_{i1}}(x), mu_{C_{i2}}(x), ..., mu_{C_{i l_i}}(x) } are the membership functions of the fuzzy coefficients and mu_{b_i}(x), i = 1, 2, ..., m, are the membership functions of the fuzzy objective and fuzzy constraints. The additional variable alpha is known as the aspiration level. Therefore, the Lagrangian function L(alpha, x, lambda) is given by

L(alpha, x, lambda) = alpha - \sum_{i=0}^{m} lambda_i { g_i(x, mu_{C_i}^{-1}(alpha)) - mu_{b_i}^{-1}(alpha) },

where lambda = (lambda_0, lambda_1, ..., lambda_m)^T is the Lagrange multiplier vector. Now the Kuhn-Tucker necessary conditions are

dL/dx_j = 0,  j = 1, 2, ..., n,
dL/d(alpha) = 0,
lambda_i { g_i(x, mu_{C_i}^{-1}(alpha)) - mu_{b_i}^{-1}(alpha) } = 0,   (17.69)
g_i(x, mu_{C_i}^{-1}(alpha)) - mu_{b_i}^{-1}(alpha) <= 0,  i = 0, 1, 2, ..., m,
lambda_i >= 0.

Solving Problem (17.69), the corresponding optimal solution can be obtained.

In a deterministic EOQ model, the problem is to determine the order level Q (> 0) which minimizes the average total cost C(Q), i.e.
Min C(Q) = C1 Q / 2 + C3 D / Q
subject to AQ <= B,  Q > 0,   (17.70)

where C3 = ordering cost per order, C1 = holding cost per unit quantity per unit time, D = demand per unit time, B = maximum available warehouse space (in square feet), and A = the space required by each unit (in square units).

When the objective goal, the costs and the available storage area become fuzzy, Problem (17.70) is transformed to

Min ~C(Q) = ~C1 Q / 2 + ~C3 D / Q
subject to AQ <~ ~B,  Q > 0.   (17.71)

So the corresponding fuzzy non-linear programming problem is

Max alpha
subject to mu_{C1}^{-1}(alpha) Q / 2 + mu_{C3}^{-1}(alpha) D / Q <= mu_{C0}^{-1}(alpha),
AQ <= mu_B^{-1}(alpha),  Q > 0,  alpha in [0, 1],

where mu_{C3}(x), mu_{C1}(x), mu_{C0}(x) and mu_B(x) are the membership functions for the fuzzy ordering cost, holding cost, objective goal and storage area respectively. Here the Lagrangian function is

L(alpha, Q, lambda_1, lambda_2) = alpha - lambda_1 { mu_{C1}^{-1}(alpha) Q / 2 + mu_{C3}^{-1}(alpha) D / Q - mu_{C0}^{-1}(alpha) } - lambda_2 { AQ - mu_B^{-1}(alpha) },   (17.72)

where lambda_1 and lambda_2 are Lagrange multipliers. Now we consider three combinations of different types of membership functions to represent the fuzzy goal, costs and warehouse space.

Fuzzy goal, costs and storage area represented by linear membership functions

In this case, mu_{C_i} (i = 1, 3) is given by
mu_{C_i}(u) = 1                        for u > C_i,
            = 1 - (C_i - u)/P_i        for C_i - P_i <= u <= C_i   (i = 1, 3),
            = 0                        for u < C_i - P_i.
Graphically, this function is represented in Fig. 17.9. Again,

mu_{C0}(C(Q)) = 1                            for C(Q) < C0,
              = 1 - (C(Q) - C0)/P0           for C0 <= C(Q) <= C0 + P0,
              = 0                            for C(Q) > C0 + P0.

Its pictorial representation is given in Fig. 17.10. The linear membership function for the storage area is given by

mu_B(AQ) = 1                     for AQ < B,
         = 1 - (AQ - B)/P        for B <= AQ <= B + P,
         = 0                     for AQ > B + P.

[Fig. 17.9 Linear membership function of C_i]

[Fig. 17.10 Linear membership function of ~C(Q)]
[Fig. 17.11 Linear membership function of AQ]
This membership function is depicted in Fig. 17.11. Here P0, P and the P_i's (i = 1, 3) are the maximally acceptable violations of the aspiration levels C0, B and the C_i's (i = 1, 3). Considering the nature of these parameters, we assume the membership functions to be non-decreasing for the fuzzy inventory costs and non-increasing for the fuzzy goal and storage area. Hence,

mu_{C_i}^{-1}(alpha) = C_i - (1 - alpha) P_i   (i = 1, 3),
mu_{C0}^{-1}(alpha) = C0 + (1 - alpha) P0,
mu_B^{-1}(alpha) = B + (1 - alpha) P.

From (17.72), we have

L(alpha, Q, lambda_1, lambda_2) = alpha - lambda_1 [ {C1 - (1 - alpha)P1} Q/2 + {C3 - (1 - alpha)P3} D/Q - C0 - (1 - alpha)P0 ] - lambda_2 { AQ - B - (1 - alpha)P }.

Now, the Kuhn-Tucker necessary conditions are

dL/d(alpha) = 0,
dL/dQ = 0,
{C1 - (1 - alpha)P1} Q/2 + {C3 - (1 - alpha)P3} D/Q - C0 - (1 - alpha)P0 = 0,
AQ - B - (1 - alpha)P = 0.
Solving these equations, the expression for the optimal order quantity is

Q* = { B + (1 - alpha*) P } / A,

where alpha* is a root of

P1 P^2 (1 - alpha)^3 - { C1 P^2 - 2BPP1 - 2AP0 P } (1 - alpha)^2 - { 2BC1 P - P1 B^2 - 2DA^2 P3 - 2APC0 - 2AP0 B } (1 - alpha) - { C1 B^2 + 2DA^2 C3 - 2AC0 B } = 0.
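Rather than manipulating the cubic directly, alpha* can be found numerically from the two binding constraints. The sketch below is a hedged illustration: every parameter value in it (C1, C3, D, A, B, the tolerances P1, P3, P0, P, and the aspiration level C0) is invented for demonstration and is not taken from the text:

```python
# Bisection on the binding fuzzy-cost constraint, with AQ = B + (1-alpha)P
# substituted from the binding storage constraint. All parameters are
# assumed values for illustration only.
C1, C3, D = 2.0, 100.0, 1000.0          # holding cost, ordering cost, demand
A, B = 1.0, 300.0                       # space per unit, warehouse space
P1, P3, P0, P = 0.5, 10.0, 30.0, 50.0   # maximally acceptable violations
C0 = 630.0                              # cost aspiration level

def g(alpha):
    """Left side minus right side of the binding cost constraint."""
    beta = 1.0 - alpha
    Q = (B + beta * P) / A
    cost = (C1 - beta * P1) * Q / 2 + (C3 - beta * P3) * D / Q
    return cost - C0 - beta * P0

# With these numbers g(0) < 0 < g(1), so bisection brackets a root.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > 0:
        hi = mid
    else:
        lo = mid
alpha_star = (lo + hi) / 2
Q_star = (B + (1 - alpha_star) * P) / A
```

The same alpha* is, by construction, a root of the cubic displayed above, since the cubic is exactly g(alpha) = 0 with Q eliminated and the fractions cleared.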
Fuzzy goal and storage area represented by parabolic concave membership functions and costs by linear membership functions

In this case, the membership functions mu_{C_i}(u), i = 1, 3, for the fuzzy inventory costs are the same as defined in the preceding section. The membership function mu_{C0}(C(Q)) for the fuzzy goal is defined as follows:

mu_{C0}(C(Q)) = 1                                  for C(Q) < C0,
              = 1 - [ (C(Q) - C0)/P0 ]^2           for C0 <= C(Q) <= C0 + P0,
              = 0                                  for C(Q) > C0 + P0.

It is graphically represented in Fig. 17.12. Again, the membership function for the storage area is defined as

mu_B(AQ) = 1                           for AQ < B,
         = 1 - [ (AQ - B)/P ]^2        for B <= AQ <= B + P,
         = 0                           for AQ > B + P.

[Fig. 17.12 Parabolic membership function of ~C(Q)]
Proceeding exactly as before, the optimal order quantity is given by

Q* = { B + sqrt(1 - alpha*) P } / A,

where alpha* is a root of

P1 P^2 (1 - alpha)^2 + 2BPP1 (1 - alpha)^{3/2} - { C1 P^2 - P1 B^2 - 2A^2 D P3 - 2AP0 P } (1 - alpha) - { 2BPC1 - 2AC0 P - 2AP0 B } (1 - alpha)^{1/2} - { C1 B^2 + 2A^2 D C3 - 2AC0 B } = 0.
17.13 Exercises
1. The demand rate of a particular item is 12,000 units per year. The ordering cost is Rs. 350.00 per order, and the holding cost is Rs. 0.20 per unit per month. If no shortage is allowed and replenishment is instantaneous, determine (i) the optimal lot size, (ii) the optimum scheduling period, and (iii) the minimum total average cost.

2. The annual requirement for a product is 3000 units. The ordering cost is Rs. 100.00 per order. The cost per unit is Rs. 10.00. The carrying cost per unit per year is 30% of the unit cost. (a) Find the EOQ. (b) If a new EOQ is found by using an ordering cost of Rs. 80.00, what would be the further savings in cost?

3. The demand rate for an item of a company is 18,000 units per year. The company can produce at the rate of 3000 units per month. The setup cost is Rs. 500.00 per order, and the holding cost is Rs. 0.15 per unit per month. Calculate (i) the optimum manufacturing quantity, (ii) the time of manufacture, (iii) the cycle length, and (iv) the optimum annual cost if the cost of an item is Rs. 2.00 per unit.

4. A contractor has to supply 10,000 bearings per day to an automobile manufacturer. He finds that, when he starts a production run, he can produce 25,000 bearings per day. The holding cost of a bearing in stock is Rs. 0.02 per year, and the setup cost of a production run is Rs. 18.00. How frequently should production runs be made?

5. The demand for an item is 18,000 units per year. The holding cost is Rs. 1.20 per unit per unit time, and the shortage cost is Rs. 5.00. The ordering cost is Rs. 400.00. Assuming instantaneous replenishment, determine the optimum order quantity along with the total minimum cost of the system.

6. Find the optimum order level which minimizes the total expected cost under the following assumptions:
   (i) t is the constant interval between orders.
   (ii) Q is the stock (in discrete units) at the beginning.
   (iii) d is the estimated random instantaneous demand at a discontinuous rate.
   (iv) C1 and C2 are the holding and shortage costs per item per t time unit.
   (v) The lead time is zero.
7. Let F(x) be the probability density function of the demand x in a prescribed scheduling period of T weeks. The demand is assumed to occur with a uniform pattern, and the probability distribution is continuous. The unit carrying cost and the shortage cost are C1 and C2 money units respectively, both per unit per week. There is no setup cost for the system. Obtain the expression for the optimal order level.

8. Let the probability density of demand of a certain item during a week be

   f(x) = 0.1 for 0 <= x <= 10, and 0 otherwise.

   This demand is assumed to occur with a uniform pattern over the week. Let the unit carrying cost of the item in inventory be Rs. 2.00 per week and the unit shortage cost be Rs. 8.00 per week. Determine the optimal order level of the inventory and the total minimum cost of the inventory system.

9. The annual demand for a product is 500 units. The cost of storage per unit per year is 10% of the unit cost. The ordering cost is Rs. 180.00 for each order. The unit cost depends upon the amount ordered:

   Range of amount ordered      Unit cost (Rs.)
   1 <= q < 500                 25.00
   500 <= q < 1500              24.80
   1500 <= q < 3000             24.60
   3000 <= q                    24.40
   Find the optimal order quantity.

10. Show that, when determining the optimal inventory level which minimizes the total expected cost in the case of continuous (non-discrete) quantities, the condition to be satisfied is

    F(S_0) = C2 / (C1 + C2),

    where F(S_0) = \int_0^{S_0} f(r) dr,
    f(r) = the probability density function of the requirement of quantity r,
    C2 = shortage cost per unit per unit time,
    C1 = holding cost per unit per unit time.

11. An ice cream company sells one of its types of ice cream by weight. If the product is not sold on the day it is prepared, it can be sold at a loss of 50 paise per pound. There is, however, an unlimited market for one-day-old ice cream. On the other hand, the company makes a profit of Rs. 3.20 on every pound of ice cream sold on the day it is prepared. Past daily orders form a distribution with

    f(x) = 0.02 - 0.0002x,  0 <= x <= 100.
    How many pounds of ice cream should the company prepare every day?

12. The following data describe three inventory items. Determine the economic order quantity for each of the three items to be accommodated within a total available storage area of 25 square feet.

    Item   Setup cost (Rs.)   Demand (units/day)   Unit holding cost per unit time (Rs.)   Storage area per unit (ft^2)
    1      10                 2                    0.3                                     1
    2      5                  4                    0.1                                     1
    3      15                 4                    0.2                                     1
    Find the economic lot size for the inventory model with finite replenishment rate, shortages allowed but fully backlogged, uniform finite demand and zero lead time so that the total average cost is minimum.

13. A shop produces three items in lots. The demand rate for each item is constant and can be assumed to be deterministic. No backorders are allowed. The pertinent data for the items are given in the following table:

    Item                         I        II       III
    Carrying cost (Rs.)          20       20       20
    Setup cost (Rs.)             50       40       60
    Cost per unit (Rs.)          6        7        5
    Yearly demand rate (units)   10,000   12,000   7500
    Determine approximately the economic order quantity for the three items subject to the condition that the total value of the average inventory levels of these items does not exceed Rs. 1000.00.

14. Find the optimum order quantity for a product for which the price breaks are as follows:

    Quantity           Purchasing cost per unit (Rs.)
    0 <= Q < 100       20
    100 <= Q < 200     18
    200 <= Q           16
    The monthly demand for the product is 400 units. The storage cost is 20% of the unit cost of the product, and the cost of ordering is Rs. 25.00 per month.

15. A newspaper boy buys papers for Rs. 1.70 each and sells them for Rs. 2.00 each. He cannot return unsold newspapers. The daily demand has the following distribution:

    No. of customers:  23    24    25    26    27    28    29    30    31    32
    Probability:       0.01  0.03  0.06  0.10  0.20  0.25  0.15  0.10  0.05  0.05
    If each day's demand is independent of the previous day's demand, how many papers should he order each day?

16. A small shop produces three machine parts 1, 2 and 3 in lots. The shop has only 700 m^2 of storage space. The appropriate data for the three items are presented in the following table:

    Item                              1      2      3
    Demand (units per year)           2000   5000   10,000
    Setup cost (Rs.)                  100    200    75
    Cost per unit (Rs.)               10     20     5
    Floor space required (m^2/unit)   0.50   0.60   0.30

    The shop uses an inventory carrying charge of 20% of the average inventory valuation per annum. If no stock-outs are allowed, determine the optimal lot size for each item.
Chapter 18 Mathematical Preliminaries

18.1 Objective
The objective of this chapter is to discuss the mathematical background of several topics (matrices, determinants, vectors, probability and Markov processes) that will be very helpful in studying the different subjects treated in this book.
18.2 Matrices

A rectangular array of mn elements (m, n being positive integers) arranged in m rows and n columns,

A = [ a11  a12  ...  a1n
      a21  a22  ...  a2n
      ...
      am1  am2  ...  amn ],

is called a matrix of order m x n (an m by n matrix); the array may equally be written with round brackets. The numbers a_ij are called the elements of the matrix. A useful, compact way of writing the matrix is

A = [a_ij]_{m x n} or (a_ij)_{m x n},  i = 1, 2, ..., m;  j = 1, 2, ..., n.
In general, matrices are denoted by boldface uppercase letters A, B, C, . . ..
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2_18
Square matrix: If m = n, the matrix A is called a square matrix of order n.

Row matrix: If m = 1, i.e. the matrix A has only one row, then A is called a row matrix (or an n-dimensional row vector).

Column matrix: If n = 1, i.e. the matrix A has only one column, then A is called a column matrix (or an m-dimensional column vector).

Equality of matrices: Two matrices A and B are said to be equal if they are of the same order and each element of A equals the corresponding element of B; i.e. if A = [a_ij]_{m x n} and B = [b_ij]_{m x n}, then A = B only when a_ij = b_ij for each i = 1, 2, ..., m and each j = 1, 2, ..., n. Note that the question of equality between two matrices of different orders does not arise.

Null (zero) or void matrix: A matrix A = [a_ij]_{m x n} is called a null or void matrix of order m x n if every element of A is 0, i.e. a_ij = 0 for each i and j. The null matrix is denoted by O_{m x n} or simply O.

Diagonal elements of a matrix: Let A = [a_ij]_{n x n} be a square matrix of order n. The diagonal through the element at the top left corner of A is called the principal (or leading) diagonal, or simply the diagonal, of A, and its elements a11, a22, ..., ann are called the diagonal elements of A.

Diagonal matrix: If all the elements of a square matrix A other than the diagonal elements are zero, i.e. a_ij = 0 for all i, j with i != j, then A is called a diagonal matrix and is denoted by diag(a11, a22, a33, ..., ann).

Scalar matrix: If A is a diagonal matrix in which all the diagonal elements are the same, i.e. a11 = a22 = ... = ann and a_ij = 0 for i != j, then A is said to be a scalar matrix.

Identity (unit) matrix: If A is a scalar matrix with each diagonal element equal to unity, then A is said to be a unit matrix and is denoted by I_n.

Upper triangular matrix: If all the elements below the diagonal are zero, i.e. a_ij = 0 for i > j, then the matrix is called an upper triangular matrix.
Lower triangular matrix: If all the elements above the diagonal are zero, i.e. a_ij = 0 for i < j, then the matrix is called a lower triangular matrix.

Triangular matrix: A matrix A is called triangular if it is either upper triangular or lower triangular. Note that a matrix is diagonal if and only if it is both upper triangular and lower triangular.

Addition of matrices: Two matrices A and B are conformable for addition only when they are of the same order. If A = [a_ij]_{m x n} and B = [b_ij]_{m x n}, then C = A + B is a matrix of order m x n whose elements are c_ij = a_ij + b_ij for all i and j. Thus, if

A = [ 2  1  2        B = [ 3  4  7
      5  3  4 ],           1  2  4 ],

then

C = A + B = [ 5  5  9
              6  5  8 ].

On the other hand, if A = [ 1  2  3 ; 2  3  4 ] is of order 2 x 3 and B = [ 1  4 ; 1  5 ; 3  2 ] is of order 3 x 2, then these two matrices are not conformable for addition.

Negative of a matrix: Let A = [a_ij]_{m x n} be a matrix. The matrix B of order m x n is called the negative of A if A + B = O_{m x n}. Clearly B = [-a_ij]_{m x n}, and we denote the negative of A by -A.

If A = [a_ij]_{m x n} and B = [b_ij]_{m x n} are two matrices of the same order, the difference A - B is the matrix of order m x n given by A - B = A + (-B) = [a_ij - b_ij]_{m x n}.

Multiplication by a scalar: Let A = [a_ij]_{m x n} be a matrix and c a scalar. Then the scalar multiple cA is the matrix of order m x n each element of which is c times the corresponding element of A,
i.e. cA = [c a_ij]_{m x n}. For example, if

c = 5,  A = [ 3  6        then  cA = [ 15  30
              9  1                     45   5
              2  0 ],                  10   0 ].
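Addition, subtraction and scalar multiplication as defined above can be sketched in a few lines (using the matrices from the worked examples; `mat_add` and `scalar_mul` are just illustrative helper names):

```python
# Elementwise addition (with a conformability check) and scalar multiplication.
def mat_add(A, B):
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must be of the same order")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(c, A):
    return [[c * a for a in row] for row in A]

A = [[2, 1, 2], [5, 3, 4]]
B = [[3, 4, 7], [1, 2, 4]]
print(mat_add(A, B))                            # [[5, 5, 9], [6, 5, 8]]
print(scalar_mul(5, [[3, 6], [9, 1], [2, 0]]))  # [[15, 30], [45, 5], [10, 0]]
```

Attempting `mat_add` on matrices of different orders raises an error, mirroring the non-conformable example in the text.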
Some algebraic properties of matrices:

(i) A + B = B + A (matrix addition is commutative)
(ii) A + (B + C) = (A + B) + C (matrix addition is associative)
(iii) (c1 + c2)A = c1 A + c2 A, where c1 and c2 are scalars
(iv) c(A + B) = cA + cB, where c is a scalar
(v) (-1)A = -A
(vi) A - B = A + (-B)
(vii) c(-A) = (-c)A, where c is a scalar
(viii) -(-A) = A = 1 A
(ix) c(A - B) = cA + c(-B) = cA + (-c)B = cA - cB
(x) (c1 - c2)A = c1 A - c2 A
Multiplication of matrices: Let A and B be two matrices. The product AB is defined if and only if the number of columns of A equals the number of rows of B. Thus, if A = [a_ij]_{m x p} and B = [b_ij]_{p x n}, the product C = AB = [c_ij] exists and is an m x n matrix whose elements are

c_ij = \sum_{k=1}^{p} a_ik b_kj,   i = 1, 2, ..., m;  j = 1, 2, ..., n.

For example, let

A = [ 1  0  -1       B = [  1  0   0
      2  1   3 ],          -1  2   3
                            2  1  -2 ].

Then

AB = [ 1(1) + 0(-1) + (-1)(2)   1(0) + 0(2) + (-1)(1)   1(0) + 0(3) + (-1)(-2)
       2(1) + 1(-1) + 3(2)      2(0) + 1(2) + 3(1)      2(0) + 1(3) + 3(-2)   ]

   = [ -1  -1   2
        7   5  -3 ].
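The row-by-column rule c_ij = sum_k a_ik b_kj translates directly into a triple loop; this short sketch reproduces the product computed above:

```python
# Matrix multiplication by the definition, applied to the worked example.
def mat_mul(A, B):
    m, p, n = len(A), len(B), len(B[0])
    assert all(len(row) == p for row in A), "columns of A must equal rows of B"
    C = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(p))
    return C

A = [[1, 0, -1], [2, 1, 3]]
B = [[1, 0, 0], [-1, 2, 3], [2, 1, -2]]
print(mat_mul(A, B))   # [[-1, -1, 2], [7, 5, -3]]
```

Multiplying by the identity matrix leaves a matrix unchanged, in line with the note below on A I_n = I_n A = A.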
Note
(i) Matrix multiplication satisfies the associative and distributive laws: for matrices A, B, C,

(AB)C = A(BC),  (A + B)C = AC + BC  and  A(B + C) = AB + AC,

provided that the products and sums involved are defined.

(ii) For two matrices A and B, neither AB nor BA may be defined; AB may be defined while BA is not, and vice versa; AB and BA are both defined if the number of rows of A equals the number of columns of B and the number of columns of A equals the number of rows of B.

(iii) Matrix multiplication is not, in general, commutative; i.e. AB = BA need not hold even when both products are defined. For example, let

A = [ 1  2        B = [ 5  6
      3  4 ],           1  2 ].

Then

AB = [ 5+2    6+4     = [  7  10        BA = [ 5+18  10+24    = [ 23  34
       15+4   18+8 ]      19  26 ],            1+6   2+8   ]      7  10 ],

so AB != BA.

(iv) A square matrix A of order n commutes with the identity matrix I_n, and for the null matrix O_n,

A I_n = I_n A = A,   O_n A = A O_n = O_n.

Transpose of a matrix: Let A be a matrix of order m x n. The matrix of order n x m obtained by interchanging the rows and columns of A is called the transpose of A, denoted A^T or A'. So for A = [a_ij]_{m x n}, A^T = [a_ji]_{n x m}; in this process, the ith row of A becomes the ith column of A^T. For example,

A = [ 1  2  3        A^T = [ 1  4
      4  5  6 ],             2  5
                             3  6 ].
Some properties:

(i) (A^T)^T = A
(ii) (cA)^T = c A^T, where c is a scalar
(iii) (A + B)^T = A^T + B^T
(iv) (AB)^T = B^T A^T
a12
a41
a42
a22 a32
.. . .. .
.. . .. .
3 a13
a14
a23
a24
a33
a34
a15 7 7 a25 7 7 7 7 7 a35 5
a43
a44
a45
Now, if we write
A11 A21
a ¼ 11 a21 a ¼ 31 a41
then A can be written as
a12 a A12 ¼ 13 a22 a23 a a32 A22 ¼ 33 a42 a43
a14 a24 a34 a44
a15 a25 a35 a45
A = [ A11  A12
      A21  A22 ].

Therefore, the matrix A has been partitioned into four submatrices A_ij, i = 1, 2; j = 1, 2.
18.3 Determinant
There is a number associated with every square matrix A, called the determinant of A and usually denoted by |A| or det A. The determinant of A of order n is defined as

|A| = \sum (+/-) a_{1 j_1} a_{2 j_2} ... a_{n j_n},

the sum being taken over all permutations (j_1, j_2, ..., j_n) of the column subscripts, with sign + or - according as the permutation is even or odd.

Note If A, B are square matrices of the same order n, then

(i) |A| = |A^T|
(ii) |AB| = |A| |B|
(iii) |kA| = k^n |A|, for any scalar k.

Minor: The minor M_ij of the element a_ij of a matrix A is the determinant obtained by striking out the ith row and jth column of A. For

A = [ a11  a12  a13
      a21  a22  a23
      a31  a32  a33 ],

the minor of a11 is M11 = | a22  a23 ; a32  a33 | and the minor of a23 is M23 = | a11  a12 ; a31  a32 |.
Cofactor: The cofactor of the element a_ij of the matrix A, denoted by A_ij, is the coefficient of a_ij in the expansion of |A|, i.e.

A_ij = (-1)^{i+j} M_ij.

Note The determinant of a square matrix A of order n can be expanded as

|A| = \sum_{j=1}^{n} a_ij A_ij,   i = 1, 2, ..., n,
where A_ij is the cofactor of a_ij. Equally,

|A| = \sum_{i=1}^{n} a_ij A_ij,   j = 1, 2, ..., n.

Properties of determinants:

(i) The value of a determinant remains unaltered if its rows are changed into columns and its columns into rows.
(ii) If any two rows (or columns) of a determinant are interchanged, the determinant retains its absolute value but changes its sign.
(iii) If two rows (or columns) of a determinant are identical, then the determinant vanishes.
(iv) If all the elements of any row (or column) are multiplied by the same quantity, the determinant is multiplied by that quantity.

Adjoint or adjugate of a square matrix: Let A = [a_ij]_{n x n} be a square matrix, and let A_ij denote the cofactor of a_ij in |A| (for i, j = 1, 2, ..., n). Let B = [A_ij]_{n x n}. Then the transpose B^T of B is called the adjoint or adjugate of A, denoted Adj(A) or Adj A.

Properties:

(i) A (Adj A) = (Adj A) A = |A| I_n
(ii) Adj(A^T) = (Adj A)^T
(iii) Adj(kA) = k^{n-1} Adj(A), where k is a scalar
(iv) |Adj A| = |A|^{n-1}, if |A| != 0
Inverse of a matrix: Let A be a square matrix of order n. A square matrix B (if it exists) is said to be an inverse of A if AB = BA = I_n. If such an inverse B of A exists, then A is said to be invertible.

Singular matrix: A square matrix A is said to be singular if |A| = 0.

Non-singular matrix: A square matrix A is said to be non-singular if |A| != 0.

Note
(i) A square matrix is invertible if it is non-singular.
(ii) An invertible matrix has a unique inverse.
(iii) A^{-1} = Adj A / |A|, if |A| != 0.

Some properties of the inverse of a matrix: if A, B are two non-singular matrices of the same order, then

(i) (A^{-1})^{-1} = A
(ii) (AB)^{-1} = B^{-1} A^{-1}
(iii) (A^T)^{-1} = (A^{-1})^T
(iv) (kA)^{-1} = k^{-1} A^{-1}, for any scalar k != 0.

Note If A_1, A_2, ..., A_m are m invertible matrices, then the product A_1 A_2 ... A_m is also invertible, and (A_1 A_2 ... A_m)^{-1} = A_m^{-1} A_{m-1}^{-1} ... A_2^{-1} A_1^{-1}.

Integral powers of a square matrix: Let A be a square matrix of order n. Then the integral powers of A are defined as follows:

(i) If A is non-singular, then A^0 = I_n
(ii) A^1 = A
(iii) A^m = A^{m-1} A
(iv) If A is non-singular, A^{-m} = (A^{-1})^m = (A^m)^{-1}, where m is a positive integer.
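The adjugate construction and the relation A^{-1} = Adj(A)/|A| can be checked with a short cofactor-expansion sketch (exact arithmetic via fractions; fine for small matrices, though row reduction is preferred in practice):

```python
# Inverse via the adjugate: A^{-1} = Adj(A) / |A|.
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j struck out."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def inverse(M):
    d = det(M)
    if d == 0:
        raise ValueError("singular matrix has no inverse")
    n = len(M)
    # Adj(A) is the transpose of the cofactor matrix.
    adj = [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)] for i in range(n)]
    return [[Fraction(adj[i][j], d) for j in range(n)] for i in range(n)]

A = [[3, 1], [2, 4]]
A_inv = inverse(A)   # values 2/5, -1/10, -1/5, 3/10
```

For the 2 x 2 matrix used here, |A| = 10 and Adj(A) = [[4, -1], [-2, 3]], so the result matches a hand computation; a singular matrix raises an error, consistent with the note above.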
18.4 Rank of a Matrix
Let A be an m x n non-null matrix. The rank of A is k (<= min(m, n)) if the determinant of every square submatrix of A of order k + 1 vanishes while there exists at least one square submatrix of order k whose determinant does not vanish. That is, the rank of A is the order of the largest square submatrix of A whose determinant does not vanish. The rank of A is denoted by r(A).
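In practice the rank is computed by row reduction rather than by enumerating minors; a floating-point sketch of that approach:

```python
# Rank by Gaussian elimination with partial pivoting.
def rank(M, eps=1e-9):
    M = [row[:] for row in M]           # work on a copy
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # Choose the largest remaining entry in this column as the pivot.
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < eps:
            continue                     # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank([[1, 2, 3], [2, 4, 6], [3, 6, 9]]))  # 1
print(rank([[3, 1], [2, 4]]))                   # 2
```

The two test matrices are exactly those of the examples that follow: the first has proportional rows (rank 1), while the second is non-singular (rank 2).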
1 If A ¼ @ 2 3
1
1 2 3 2
4 6 A; then r ð AÞ ¼ 1; since j Aj ¼
2 4
3 6 6 9
3
6
¼ 0 9
and all square submatrices of order 2 also vanish. Hence, r(A) = 1. If A ¼
3 1 ; then rðAÞ ¼ 2; since j Aj 6¼ 0: 2 4
Properties:
(i) The rank of a matrix A equals the rank of $A^T$.
(ii) The rank of a non-singular matrix of order n is n.
(iii) The rank of a singular square matrix of order n is less than n.
(iv) The rank of a null matrix is defined to be zero.
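The two rank examples above can be checked with NumPy's `matrix_rank`, which counts the non-negligible singular values of the matrix; a short sketch (not part of the text):

```python
import numpy as np

# The rank-1 example: every row is a multiple of (1, 2, 3).
A = np.array([[1, 2, 3],
              [2, 4, 6],
              [3, 6, 9]])
print(np.linalg.matrix_rank(A))      # 1

# The full-rank example: |B| = 12 - 2 = 10 != 0.
B = np.array([[3, 1],
              [2, 4]])
print(np.linalg.matrix_rank(B))      # 2

# Property (i): rank(A) = rank(A^T)
assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T)
```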
18.5 Solution of a System of Linear Equations
The system of linear equations
$$\left.\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\ &\ \,\vdots\\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= b_n \end{aligned}\right\} \quad (18.1)$$
can be written in matrix form as $Ax = b$, where $A = (a_{ij})_{n \times n}$ is the coefficient matrix, $x = (x_1, x_2, \ldots, x_n)^T$ and $b = (b_1, b_2, \ldots, b_n)^T$. The system has a unique solution $x = A^{-1}b$, provided $|A| \neq 0$, i.e. the coefficient matrix A is non-singular.
Note Let us consider the matrix
$$\bar{A} = [A\ \ b] = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} & b_n \end{pmatrix}.$$
The matrix $\bar{A}$ is called the augmented matrix of the system of Eq. (18.1). When $|A| \neq 0$, it is obvious that $r(\bar{A}) = n$, since there does not exist a square submatrix of $\bar{A}$ of order n + 1. Thus, it follows that the system (18.1) has a unique solution if $r(A) = r(\bar{A}) = n$. If $r(A) < n$, i.e. if A is singular, then two cases arise:
(i) If $r(A) = r(\bar{A}) < n$, then the system (18.1) has an infinite number of solutions.
(ii) If $r(A) < r(\bar{A}) \le n$, then the system (18.1) does not have any solution; in this case, the system (18.1) is inconsistent.
The system (18.1) is consistent if and only if $r(A) = r(\bar{A})$, and the necessary and sufficient condition for a unique solution of the system (18.1) is $r(A) = r(\bar{A}) = n$.
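The rank test for consistency translates directly into code. A sketch, with an illustrative non-singular system (the numbers are not from the text):

```python
import numpy as np

# Compare r(A) with r([A b]) before solving Ax = b.
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])

r_A = np.linalg.matrix_rank(A)
r_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))  # augmented matrix [A b]

if r_A == r_Ab == A.shape[0]:
    x = np.linalg.solve(A, b)        # unique solution x = A^{-1} b
    print("unique solution:", x)
    assert np.allclose(A @ x, b)
elif r_A == r_Ab:
    print("consistent, infinitely many solutions")
else:
    print("inconsistent system")
```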
18.6 Vectors
Row vector: A row matrix with n elements $a_1, a_2, \ldots, a_n$ is known as an n-tuple row vector. It is usually denoted by $a = (a_1, a_2, \ldots, a_n)$.
Column vector: A column matrix with elements $b_1, b_2, \ldots, b_n$ is known as an n-tuple column vector b, and $b = (b_1, b_2, \ldots, b_n)^T$. Analytically, both row and column vectors may be considered as points in the n-dimensional space, which is known as a vector space.
Unit vector: A vector is said to be a unit vector if all its components are zero except one, which has unit value; $e_i$ is the unit vector in the n-dimensional vector space whose ith component is 1. Clearly, there are n such n-component unit vectors:
$$e_1 = (1, 0, 0, \ldots, 0),\quad e_2 = (0, 1, 0, \ldots, 0),\quad \ldots,\quad e_n = (0, 0, 0, \ldots, 1).$$
Null vector: A vector is said to be a null vector if all its components are equal to zero. The null vector is usually denoted by O; an n-component null vector is written as O = (0, 0, …, 0) with n zeros.
Linear combination of vectors: A linear combination of a set of k n-component vectors $a_1, a_2, \ldots, a_k$ is a vector a with n components given by the relation
$$a = \lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_k a_k = \sum_{i=1}^{k} \lambda_i a_i,$$
where the $\lambda_i$ are all scalar quantities.
Linear dependence of vectors: A set of k n-component vectors $a_1, a_2, \ldots, a_k$ is said to be linearly dependent if there exist scalars $\lambda_i$ $(i = 1, 2, \ldots, k)$, with at least one $\lambda_i \neq 0$, such that
$$\lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_k a_k = O$$
is satisfied, where O is an n-component null vector.
Linearly independent vectors: A set of k n-component vectors $(k \le n)$ $a_1, a_2, \ldots, a_k$ is said to be linearly independent if the only linear combination of the vectors equal to the n-component null vector is the one in which all the scalars are zero. Mathematically, the set of vectors $a_1, a_2, \ldots, a_k$ is said to be linearly independent if the condition
$$\sum_{i=1}^{k} \lambda_i a_i = O$$
is satisfied only when all the scalar quantities $\lambda_i$ are zero.
Spanning set: A set of k n-component vectors $a_1, a_2, \ldots, a_k$ in a vector space is said to be a spanning set or a generating set if every vector in the space can be expressed as a linear combination of the vectors $a_1, a_2, \ldots, a_k$.
Basis set: A spanning set of vectors, all of which are linearly independent, is known as a basis set. The set of all n n-component unit vectors $e_1, e_2, \ldots, e_n$ forms a basis set in n-dimensional Euclidean space, and this basis set is known as the standard basis set.
Properties:
(i) A set of vectors is either linearly independent or linearly dependent.
(ii) A set of vectors containing a null vector is always linearly dependent.
(iii) A non-empty subset of a linearly independent set of vectors is always linearly independent, and a superset of a linearly dependent set of vectors is always linearly dependent.
Replacement theorem: If $a_1, a_2, \ldots, a_n$ is any basis set in $R^n$ and b is any non-null vector in $R^n$ such that
$$b = \lambda_1 a_1 + \lambda_2 a_2 + \cdots + \lambda_n a_n,$$
where $\lambda_i \neq 0$ for some i, then any vector $a_k$ for which $\lambda_k \neq 0$ can be replaced by the vector b: the new set of n vectors $a_1, \ldots, a_{k-1}, b, a_{k+1}, \ldots, a_n$ is also a basis set in $R^n$.
Example 1 Show that the vectors (4, 5), (6, 2), (8, 10) are linearly dependent.
Solution Let $a_1 = (4, 5)$, $a_2 = (6, 2)$, $a_3 = (8, 10)$, and let $\lambda_1, \lambda_2, \lambda_3$ be three scalars such that $\lambda_1 a_1 + \lambda_2 a_2 + \lambda_3 a_3 = O$. This implies that $\lambda_1(4, 5) + \lambda_2(6, 2) + \lambda_3(8, 10) = (0, 0)$. Equating the corresponding components, we have
$$4\lambda_1 + 6\lambda_2 + 8\lambda_3 = 0, \qquad 5\lambda_1 + 2\lambda_2 + 10\lambda_3 = 0. \quad (18.2)$$
From the above, by cross-multiplication,
$$\frac{\lambda_1}{60 - 16} = \frac{\lambda_2}{40 - 40} = \frac{\lambda_3}{8 - 30} = k = 1\ \text{(say)}.$$
Therefore $\lambda_1 = 44$, $\lambda_2 = 0$, $\lambda_3 = -22$; i.e. Eq. (18.2) is satisfied if $\lambda_1 = 44$, $\lambda_2 = 0$ and $\lambda_3 = -22$, which are not all zero. Hence $44a_1 - 22a_3 = O$, i.e. the given vectors are linearly dependent.
Example 2 Show that the vectors (4, 3, 2), (2, 1, 4), (2, 3, −8) are linearly dependent.
Solution The matrix formed with the components of the given three vectors is
$$A = \begin{pmatrix} 4 & 3 & 2 \\ 2 & 1 & 4 \\ 2 & 3 & -8 \end{pmatrix}. \quad \text{Now, } |A| = \begin{vmatrix} 4 & 3 & 2 \\ 2 & 1 & 4 \\ 2 & 3 & -8 \end{vmatrix} = 0.$$
Hence, the rank of the matrix A is less than 3. Therefore, the given three vectors are linearly dependent.
Alternative method: Let $a_1 = (4, 3, 2)$, $a_2 = (2, 1, 4)$, $a_3 = (2, 3, -8)$, and let $\lambda_1, \lambda_2, \lambda_3$ be three scalars such that $\lambda_1 a_1 + \lambda_2 a_2 + \lambda_3 a_3 = O$, or
$$\lambda_1(4, 3, 2) + \lambda_2(2, 1, 4) + \lambda_3(2, 3, -8) = (0, 0, 0). \quad (18.3)$$
Now equating the corresponding components, we have
$$4\lambda_1 + 2\lambda_2 + 2\lambda_3 = 0 \quad (18.4)$$
$$3\lambda_1 + \lambda_2 + 3\lambda_3 = 0 \quad (18.5)$$
$$2\lambda_1 + 4\lambda_2 - 8\lambda_3 = 0. \quad (18.6)$$
From (18.4) and (18.5), by cross-multiplication, we have
$$\frac{\lambda_1}{6 - 2} = \frac{\lambda_2}{6 - 12} = \frac{\lambda_3}{4 - 6}, \quad \text{or} \quad \frac{\lambda_1}{4} = \frac{\lambda_2}{-6} = \frac{\lambda_3}{-2} = k = 1\ \text{(say)},$$
or $\lambda_1 = 4$, $\lambda_2 = -6$, $\lambda_3 = -2$ (which also satisfy (18.6)). Substituting these values of $\lambda_1, \lambda_2, \lambda_3$ in (18.3), we have
$$4a_1 - 6a_2 - 2a_3 = O,$$
which implies that the given three vectors are linearly dependent.
Example 3 Show that the vectors (1, 2, 3), (2, 4, 1) and (3, 2, 9) are linearly independent.
Solution The matrix formed with the components of the given three vectors is
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 3 & 2 & 9 \end{pmatrix}. \quad \text{Now, } |A| = \begin{vmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 3 & 2 & 9 \end{vmatrix} = -20 \neq 0.$$
Hence, the rank of the square matrix A is 3. Therefore, the given three vectors are linearly independent.
Alternative method: Let us assume that the given three vectors $a_1 = (1, 2, 3)$, $a_2 = (2, 4, 1)$, $a_3 = (3, 2, 9)$ are linearly dependent. Then there exist three scalars $\lambda_1, \lambda_2, \lambda_3$, not all zero, such that $\lambda_1 a_1 + \lambda_2 a_2 + \lambda_3 a_3 = O$, i.e.
$$\lambda_1(1, 2, 3) + \lambda_2(2, 4, 1) + \lambda_3(3, 2, 9) = (0, 0, 0). \quad (18.7)$$
Now equating the corresponding components, we have
$$\lambda_1 + 2\lambda_2 + 3\lambda_3 = 0 \quad (18.8)$$
$$2\lambda_1 + 4\lambda_2 + 2\lambda_3 = 0 \quad (18.9)$$
$$3\lambda_1 + \lambda_2 + 9\lambda_3 = 0 \quad (18.10)$$
From (18.8) and (18.9), by cross-multiplication, we have
$$\frac{\lambda_1}{4 - 12} = \frac{\lambda_2}{6 - 2} = \frac{\lambda_3}{4 - 4} = k = 1\ \text{(say)},$$
i.e. $\lambda_1 = -8$, $\lambda_2 = 4$, $\lambda_3 = 0$. With these values of $\lambda_1, \lambda_2, \lambda_3$, Eq. (18.10) is not satisfied. Hence Eqs. (18.8)–(18.10), and therefore Eq. (18.7), cannot be satisfied with at least one of $\lambda_1, \lambda_2, \lambda_3$ non-zero; Eq. (18.7) is satisfied only for $\lambda_1 = \lambda_2 = \lambda_3 = 0$. Hence, the given three vectors are linearly independent.
Example 4 Show that the vectors (2, −1, 0), (3, 5, 1), (1, 1, 2) form a basis for $E^3$.
Solution The matrix formed with the components of these three vectors is
$$A = \begin{pmatrix} 2 & -1 & 0 \\ 3 & 5 & 1 \\ 1 & 1 & 2 \end{pmatrix}. \quad \text{Now, } |A| = \begin{vmatrix} 2 & -1 & 0 \\ 3 & 5 & 1 \\ 1 & 1 & 2 \end{vmatrix} = 23 \neq 0.$$
Thus, the matrix formed by the elements of the given vectors is non-singular and its rank is 3. Hence, the given vectors are linearly independent. If the given vectors span $E^3$, then for any vector $(a_1, a_2, a_3)$ in $E^3$, there exist $k_1, k_2, k_3$ such that
$$(a_1, a_2, a_3) = k_1(2, -1, 0) + k_2(3, 5, 1) + k_3(1, 1, 2). \quad (18.11)$$
Now equating the corresponding components, we have
$$2k_1 + 3k_2 + k_3 = a_1, \qquad -k_1 + 5k_2 + k_3 = a_2, \qquad k_2 + 2k_3 = a_3. \quad (18.12)$$
The coefficient matrix of these equations is
$$\begin{pmatrix} 2 & 3 & 1 \\ -1 & 5 & 1 \\ 0 & 1 & 2 \end{pmatrix} = A^T. \quad \text{Now, } |A^T| = |A| = 23 \neq 0.$$
Hence, there exists a unique solution for $k_1, k_2, k_3$ whatever the values of $a_1, a_2, a_3$, and the vector $(a_1, a_2, a_3)$ can be expressed as a linear combination of the vectors (2, −1, 0), (3, 5, 1) and (1, 1, 2). Hence, the given three vectors form a basis for $E^3$.
Example 5 Find a basis for $E^3$ that contains the vectors (1, 2, 2) and (2, 0, 1).
Solution We know that the standard basis for $E^3$ is $\{e_1, e_2, e_3\}$, where $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$ and $e_3 = (0, 0, 1)$. Let $a_1 = (1, 2, 2)$ and $a_2 = (2, 0, 1)$. Now $a_1 = (1, 2, 2) = e_1 + 2e_2 + 2e_3$. Hence, $a_1$ can replace any one vector from $\{e_1, e_2, e_3\}$ to form a new basis; let it be $\{a_1, e_2, e_3\}$. Again, let $a_2 = k_1 a_1 + k_2 e_2 + k_3 e_3$, i.e.
$$(2, 0, 1) = k_1(1, 2, 2) + k_2(0, 1, 0) + k_3(0, 0, 1). \quad (18.13)$$
Equating the corresponding components, we have $k_1 = 2$, $2k_1 + k_2 = 0$, $2k_1 + k_3 = 1$. Solving, we have $k_1 = 2$, $k_2 = -4$, $k_3 = -3$. Hence, from (18.13), we have
$$a_2 = 2a_1 - 4e_2 - 3e_3.$$
Therefore, $a_2$ can replace $e_2$ or $e_3$ in $\{a_1, e_2, e_3\}$ to form a new basis. Hence, the new basis is $\{a_1, a_2, e_3\}$.
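Examples 2 and 3 can be re-checked mechanically: a set of n n-component vectors is linearly independent exactly when the matrix they form has full rank. A short sketch with the vectors from those examples:

```python
import numpy as np

dep = np.array([[4, 3, 2], [2, 1, 4], [2, 3, -8]])   # Example 2
ind = np.array([[1, 2, 3], [2, 4, 1], [3, 2, 9]])    # Example 3

print(np.linalg.matrix_rank(dep))   # 2 -> linearly dependent
print(np.linalg.matrix_rank(ind))   # 3 -> linearly independent

# Example 2's dependence relation: 4a1 - 6a2 - 2a3 = O
a1, a2, a3 = dep
assert np.array_equal(4 * a1 - 6 * a2 - 2 * a3, np.zeros(3))
```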
18.7 Probability
Random experiment: An experiment performed without a definite prediction about the result is known as a random experiment.
Example: Whenever a coin is tossed or a die is thrown, it cannot be predicted whether a 'head' or a 'tail' will appear or which face of the die will come up on top. These are examples of random experiments.
Trial and event: If a number of experiments are performed under essentially the same physical conditions, then each experiment is called a trial, and the results (outcomes) are known as events.
Example: If a coin is tossed several times, then each tossing is a trial and the results ('head' or 'tail') are the events. Events are classified as (i) elementary events and (ii) composite events. Elementary events are those which are single in nature; 'head' and 'tail' are examples of elementary events in the case of tossing a coin. Similarly, in the case of throwing a die, 'face 3' and 'face 4' are examples of elementary events. On the other hand, composite events are those which are formed through the combination of two or more elementary events. The event 'even-numbered face' is an example of a composite event in the case of throwing a die.
Mutually exclusive cases (or events): If two or more events cannot happen simultaneously, then these events are said to be mutually exclusive.
Equally likely cases (or events): Two events are said to be equally likely if neither of them can be expected in preference to the other during a trial.
Example: In the case of tossing an unbiased coin, the events 'head' and 'tail' are equally likely.
Exhaustive cases (or events): All the possible cases that may happen during a trial are together said to form the exhaustive cases.
Example: In the case of throwing a die, there are six possibilities: 'face one', 'face two', 'face three', 'face four', 'face five', 'face six'. So there are six exhaustive cases.
Favourable cases (or events): The cases which favour the occurrence of a particular event are said to be favourable to that event.
Example: In the case of throwing a die, the event 'multiple of 3' covers the two cases 'face 3' and 'face 6'. So these two cases are favourable to the event.
Classical definition of probability: If a random experiment has n exhaustive, pairwise mutually exclusive and equally likely outcomes, and if m of these are favourable to a particular event A, then the probability of the happening of A is denoted by P(A) and is defined by
$$P(A) = \frac{m}{n}.$$
Properties: (i) An event which is bound to happen is known as a certain event. The probability of a certain event is unity, i.e. P(S) = 1, where S is the certain event.
(ii) An event which will never happen is known as an impossible event. The probability of an impossible event is zero, i.e. P(X) = 0, where X is the impossible event.
(iii) For any event A, $P(A) \ge 0$.
(iv) The probability of an event lies between 0 and 1, i.e. $0 \le P(A) \le 1$.
(v) $P(\bar{A}) = 1 - P(A)$, where $\bar{A}$ is the complementary event of A.
Sample space or event space: The set of all possible outcomes of a random experiment is called the sample space associated with the experiment. The possible outcomes (elements of the sample space) are called sample points.
Example For the random experiment of tossing a coin, there are two outcomes: (i) head, (ii) tail. If we denote the occurrence of a head by H, the occurrence of a tail by T and the sample space by S, then S = {H, T}. If the random experiment consists of tossing a coin twice, the associated sample space is S = {HH, HT, TH, TT}, where HT denotes the occurrence of H in the first toss and the occurrence of T in the second one.
Discrete and continuous sample spaces: A sample space that consists of a finite (or an infinite but countable) number of sample points is called a discrete sample space. A sample space which is not discrete is called a continuous sample space.
Axiomatic definition of probability Let E be a random experiment described by the event space S, and let A be any event connected with E. The probability of A is a number P(A) associated with A such that the following axioms are satisfied:
(i) $P(A) \ge 0$.
(ii) For the certain event S, P(S) = 1.
(iii) For any number of pairwise mutually exclusive events $A_1, A_2, A_3, \ldots$,
$$P(A_1 + A_2 + A_3 + \cdots) = P(A_1) + P(A_2) + P(A_3) + \cdots$$
Conditional probability For two events A and B, if A occurs when B has already occurred, then the probability of A with respect to B is called the conditional probability of A with respect to B. It is denoted by P(A/B) and defined by
$$P\!\left(\frac{A}{B}\right) = \frac{P(AB)}{P(B)}, \quad \text{provided } P(B) \neq 0.$$
Properties:
(i) P(AB) = P(A) P(B/A) = P(B) P(A/B)
(ii) P(ABC) = P(A) P(B/A) P(C/AB) for the events A, B, C.
Independent events Two events A and B are said to be independent if the occurrence of one remains unaffected by the occurrence of the other. Hence $P(A/B) = P(A)$, i.e. $\frac{P(AB)}{P(B)} = P(A)$, i.e. $P(AB) = P(A)P(B)$. This result can be generalized in the following form:
$$P(A_1 A_2 \cdots A_n) = P(A_1) P(A_2) \cdots P(A_n).$$
Random variable If S is the sample space associated with a random experiment, then a function $X : S \to \mathbb{R}$ which assigns one and only one real number to each element $x \in S$ is called a random variable.
Example The sample space associated with the random experiment of tossing a coin twice is S = {HH, HT, TH, TT}. If a random variable is defined by X = the number of heads obtained, then we have X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.
Discrete random variable: A random variable X is said to be discrete if X takes values which are either finite or countably infinite in number.
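The classical definition P(A) = m/n and the product rule for independent events can be illustrated by enumerating a finite sample space. A sketch (the events on two fair dice are chosen here for illustration, not taken from the text):

```python
from fractions import Fraction
from itertools import product

space = list(product(range(1, 7), repeat=2))      # 36 equally likely outcomes

def prob(event):
    favourable = [s for s in space if event(s)]
    return Fraction(len(favourable), len(space))  # classical P = m / n

A = lambda s: s[0] % 2 == 0        # first die shows an even face
B = lambda s: s[0] + s[1] == 7     # the two faces sum to 7

print(prob(A))                     # 1/2
print(prob(B))                     # 1/6

# A and B are independent: P(AB) = P(A) P(B)
assert prob(lambda s: A(s) and B(s)) == prob(A) * prob(B)

# Conditional probability: P(A/B) = P(AB) / P(B)
assert prob(lambda s: A(s) and B(s)) / prob(B) == prob(A)
```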
Discrete distribution If the random variable X takes a discrete set of values $\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots$, where $\cdots < x_{-2} < x_{-1} < x_0 < x_1 < x_2 < \cdots$, with probabilities
$$P(X = x_i) = f_i \quad (i = 0, \pm 1, \pm 2, \ldots),$$
then the distribution function is obtained as follows: in $x_i \le x < x_{i+1}$,
$$F(x) = P(-\infty < X \le x) = P\Big\{\sum_{k=-\infty}^{i} (X = x_k)\Big\} = \sum_{k=-\infty}^{i} P(X = x_k) = \sum_{k=-\infty}^{i} f_k \quad (i = 0, \pm 1, \pm 2, \ldots).$$
Here $f_i$ is called the probability mass function of X. Properties of the probability mass function are:
(i) $f_i \ge 0$ for all i
(ii) $\sum_{i=-\infty}^{\infty} f_i = 1$
(iii) $P(a < X \le b) = \sum_{a < x_i \le b} f_i.$
Some important discrete distributions
(i) Binomial distribution A discrete random variable X having the set $\{0, 1, 2, \ldots, n\}$ as its spectrum is said to have a binomial distribution with parameters n, p if the probability mass function of X is given by
$$f_i = \binom{n}{i} p^i q^{n-i}, \quad \text{for } i = 0, 1, 2, \ldots, n,$$
where n is a positive integer and $0 < p < 1$, $q = 1 - p$.
(ii) Poisson distribution A discrete random variable X having the set $\{0, 1, 2, \ldots\}$ as its spectrum is said to have a Poisson distribution with parameter $\mu\,(> 0)$ if the probability mass function of X is given by
$$f_i = e^{-\mu}\, \frac{\mu^i}{i!}, \quad \text{for } i = 0, 1, 2, \ldots$$
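Both probability mass functions are simple to implement with the standard library; the sketch below (parameter values chosen here for illustration) also checks pmf property (ii), that the probabilities sum to 1:

```python
from math import comb, exp, factorial, isclose

def binomial_pmf(i, n, p):
    q = 1 - p
    return comb(n, i) * p**i * q**(n - i)       # f_i = C(n,i) p^i q^(n-i)

def poisson_pmf(i, mu):
    return exp(-mu) * mu**i / factorial(i)      # f_i = e^(-mu) mu^i / i!

# pmf property (ii): the probabilities sum to 1 over the spectrum.
assert isclose(sum(binomial_pmf(i, 10, 0.3) for i in range(11)), 1.0)
# The Poisson spectrum is infinite; truncating far into the tail suffices.
assert isclose(sum(poisson_pmf(i, 4.0) for i in range(100)), 1.0)

print(binomial_pmf(2, 10, 0.3))    # P(X = 2) for n = 10, p = 0.3
```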
Continuous random variable A random variable X is said to be continuous if X takes values which are uncountably infinite.
Continuous distribution The distribution of a random variable X is said to be continuous if the distribution function F(x) is continuous everywhere and its derivative F′(x) is piecewise continuous. The derivative F′(x) of the distribution function of X is called the probability density function (PDF) of the random variable X and is denoted by f(x). The necessary conditions for a probability density function are as follows:
(i) $f(x) \ge 0$ everywhere
(ii) $\int_{-\infty}^{\infty} f(x)\,dx = 1.$
Properties:
(i) $P(X = a) = F(a) - F(a - 0) = 0$, since F(x) is continuous at any point a (a being constant)
(ii) $P(a < X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx$
(iii) $F(x) = \int_{-\infty}^{x} f(u)\,du$
(iv) $P(x < X \le x + dx) = F(x + dx) - F(x) = dF(x) = F'(x)\,dx = f(x)\,dx.$
Density curve: The curve y = f(x) is called the density curve. It gives a useful graphical representation in the continuous case.
Some important continuous distributions:
(i) Uniform or rectangular distribution A continuous random variable X having the interval (a, b) as its spectrum is said to have a uniform distribution if the probability density function of X is given by
$$f(x) = \begin{cases} \dfrac{1}{b - a}, & a < x < b \\ 0, & \text{otherwise,} \end{cases}$$
where a and b are the two parameters of the distribution.
(ii) Normal $(m, \sigma)$ distribution A continuous random variable X having the interval $(-\infty, \infty)$ as its spectrum is said to have a normal distribution if the probability density function of X is given by
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - m)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$
It is a probability distribution with two parameters m, σ and is denoted by N(m, σ). If m = 0 and σ = 1, then
$$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \quad -\infty < x < \infty.$$
The corresponding distribution is called the standard normal distribution and is denoted by N(0, 1).
(iii) Gamma distribution A continuous random variable X having $(0, \infty)$ as its spectrum is said to have a gamma distribution if the probability density function of X is given by
$$f(x) = \begin{cases} \dfrac{e^{-x}\, x^{\mu - 1}}{\Gamma(\mu)}, & 0 < x < \infty,\ \mu > 0 \\ 0, & \text{otherwise.} \end{cases}$$
Here μ is the only parameter of the distribution.
(iv) Beta distribution of the first kind A continuous random variable X having (0, 1) as its spectrum is said to have a beta distribution of the first kind if the probability density function of X is given by
$$f(x) = \begin{cases} \dfrac{x^{m - 1}(1 - x)^{n - 1}}{\beta(m, n)}, & 0 < x < 1,\ m, n > 0 \\ 0, & \text{otherwise.} \end{cases}$$
(v) Beta distribution of the second kind A continuous random variable X having $(0, \infty)$ as its spectrum is said to have a beta distribution of the second kind if the probability density function of X is given by
$$f(x) = \begin{cases} \dfrac{1}{\beta(m, n)}\, \dfrac{x^{m - 1}}{(1 + x)^{m + n}}, & 0 < x < \infty \\ 0, & \text{otherwise,} \end{cases}$$
where m and n (> 0) are the parameters of the distribution.
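Condition (ii) for a density, and the rule $P(a < X \le b) = \int_a^b f(x)\,dx$, can be verified numerically for the normal density. A sketch using a simple midpoint rule (the grid size and integration bounds are illustrative choices, not from the text):

```python
from math import exp, pi, sqrt

def normal_pdf(x, m=0.0, sigma=1.0):
    # The N(m, sigma) density defined above.
    return exp(-(x - m)**2 / (2 * sigma**2)) / (sqrt(2 * pi) * sigma)

def integrate(f, a, b, steps=100_000):
    # Midpoint-rule quadrature of f over (a, b).
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

total = integrate(normal_pdf, -10, 10)   # (-10, 10) carries almost all the mass
print(round(total, 6))                   # 1.0

# For N(0, 1), P(-1 < X <= 1) is about 0.6827 (the "one sigma" rule).
print(round(integrate(normal_pdf, -1, 1), 4))
```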
18.8 Markov Process
A Markov process (chain) is a stochastic process used to analyse decision-making problems in which the occurrence of a specific event is dependent on the occurrence of the event immediately prior to the current event. Basically, this process is used to identify: (i) A specific state of the system being studied (ii) The state transition relationship. The occurrence of an event at a specified point in time (say, in the nth period) puts the system in a given state, say Sn. If, after the passage of one time unit,
another event occurs (during the (n + 1)th time period), the system has moved from state $S_n$ to state $S_{n+1}$. For example, $S_n$ may represent the number of persons in the queue at a railway window at time t = n. As the time changes to t = n + 1, the state of the queue also changes to $S_{n+1}$. The probability of moving from one state to another, or of remaining in the same state, in a single time period is called the transition probability. Also, since the probability of moving from one state to another depends on the probability of the preceding state, the transition probability is a conditional probability. The transition probability $p_{ij}$ is the probability that the system presently in state $E_i$ will be in state $E_j$ at some later step. A finite Markov process must satisfy the following conditions:
(i) The process consists of a finite number of states.
(ii) The probability of moving from one state to another depends only on the immediately preceding state.
(iii) Transition probabilities are stationary.
(iv) The process has a set of initial probabilities, which may be given or determined.
State transition matrix: A state transition matrix is a rectangular array whose elements represent the transition probabilities for a Markov process. Let $S_i$ represent the ith (i = 1, 2, …, m) state of a stochastic process and $p_{ij}$ the transition probability of moving from state $S_i$ to state $S_j$ in one step. Then the one-stage state transition matrix P, with rows and columns indexed by the states $S_1, S_2, \ldots, S_m$, can be described as follows:
$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{pmatrix}.$$
In the transition matrix of the Markov process, $p_{ij} = 0$ when no transition occurs from state i to state j, and $p_{ij} = 1$ when the system, at the next transition, moves from the ith state to the jth state only. Each row of the transition matrix represents a one-step transition probability distribution over all states. This means that
$$p_{i1} + p_{i2} + \cdots + p_{im} = 1 \text{ for all } i, \quad \text{and} \quad 0 \le p_{ij} \le 1.$$
Transition diagram: In a transition diagram, the transition probabilities which can occur in any situation are shown (see Fig. 18.1).
Fig. 18.1 Transition diagram
The arrows from each state indicate the possible states to which a process can move from the given state. The transition matrix corresponding to Fig. 18.1, with rows and columns indexed by the states $S_1, S_2, S_3$, is as follows:
$$P = \begin{pmatrix} p_{11} & p_{12} & 0 \\ 0 & 0 & p_{23} \\ p_{31} & 0 & p_{33} \end{pmatrix}.$$
n-step transition probability: Here we shall discuss the probabilistic behaviour of a system over a period of time.
Let the system occupy state $S_i$ initially (i.e. at time 0). Then the probability that the system moves to state $S_j$ at time n (i.e. after n steps) is an n-step transition probability. It is denoted $p_{ij}^{(n)}$. We note that $p_{ij}^{(1)} = p_{ij}$ for all i and j.
Let $p_i^{(n)}$ denote the probability that the system is in state $S_i$ at time n. Then the probability $p_j^{(n+1)}$ that the system will be in state $S_j$ at time n + 1 is related to the $p_i^{(n)}$ by the following equation:
$$p_j^{(n+1)} = \sum_{i=1}^{m} p_i^{(n)} p_{ij}, \quad n = 0, 1, 2, \ldots,$$
where $p_{ij}$ is the transition probability of moving from $S_i$ to $S_j$ in one step.
For j = 1, 2, …, m, the preceding relation gives a system of m equations that can be written in matrix form as follows:
$$\begin{pmatrix} p_1^{(n+1)} \\ p_2^{(n+1)} \\ \vdots \\ p_m^{(n+1)} \end{pmatrix} = \begin{pmatrix} p_{11} & p_{21} & \cdots & p_{m1} \\ p_{12} & p_{22} & \cdots & p_{m2} \\ \vdots & \vdots & & \vdots \\ p_{1m} & p_{2m} & \cdots & p_{mm} \end{pmatrix} \begin{pmatrix} p_1^{(n)} \\ p_2^{(n)} \\ \vdots \\ p_m^{(n)} \end{pmatrix}, \quad \text{or} \quad p^{(n+1)} = A p^{(n)},$$
where $p^{(n+1)}, p^{(n)}$ are the state probability vectors at times n + 1 and n respectively, and $A = P^T$ is the transpose of the one-stage state transition matrix (the transpose appears because the state probabilities are written as column vectors). Therefore, if we know the initial state probabilities (n = 0), we can easily compute the probabilities at any time successively as follows:
$$p^{(1)} = A p^{(0)}, \quad p^{(2)} = A p^{(1)} = A^2 p^{(0)}, \quad \ldots, \quad p^{(n)} = A p^{(n-1)} = A^n p^{(0)}.$$
From these results, we can easily calculate the probability that the system occupies each of its states after n steps by premultiplying the initial state probability vector by the nth power of A.
Steady state system: When it is observed that there is no further change in the state probabilities, the system has reached a point of equilibrium. There is a limiting probability that the system, in a finite (but large) number of transitions, will reach steady-state equilibrium. This means that when n becomes very large, each $p_{ij}^{(n)}$ tends to a fixed limit, and the state probability vector $p^{(n)}$ approaches a constant value, i.e.
$$p^{(n+1)} = p^{(n)} = p, \quad \text{independent of } n.$$
Thus, in the limiting case, $\lim_{n\to\infty} p^{(n+1)} = \lim_{n\to\infty} A p^{(n)}$ becomes $p = Ap$. Therefore, when n approaches infinity, $p^{(n)}$ becomes constant (i.e. independent of time), and the system is said to be a steady state system.
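The recursion $p^{(n+1)} = A p^{(n)}$ and the steady-state condition $p = Ap$ can be illustrated with a small chain. In the sketch below (the transition probabilities are chosen here for illustration, not taken from the text), the state probabilities form a column vector, so A is taken as the transpose of the row-stochastic one-step matrix P:

```python
import numpy as np

P = np.array([[0.7, 0.3, 0.0],     # rows: from-state, columns: to-state
              [0.0, 0.4, 0.6],     # each row sums to 1
              [0.5, 0.0, 0.5]])
A = P.T                            # so that p^(n+1) = A p^(n)

p = np.array([1.0, 0.0, 0.0])      # initially in state S1
for _ in range(200):               # iterate p^(n+1) = A p^(n)
    p = A @ p

# After many steps, p is (numerically) the steady state: p = A p.
assert np.allclose(p, A @ p)
assert np.isclose(p.sum(), 1.0)    # still a probability vector
print(p)
```

This chain is irreducible and aperiodic, so the iteration converges regardless of the initial state probabilities.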
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2
Index
A
Active constraint, 254
Activity, 369
Admissible variation, 226
Algebraic method, 330
Anticipation inventories, 523
Artificial variables, 53

B
Basic feasible solution, 33
Basic variables, 34
Basis, 592
Beale’s method, 292
Big M method, 29
Bordered Hessian matrix, 230
Bounded variable technique, 127
Branch and bound method, 171
Buffer stock, 521
Burst event, 370

C
Canonical form, 31
Capacity, 497
Capacity of the cut, 497
Characteristics of queueing system, 405
Complementary slackness conditions, 274
Concave function, 13
Consistent, 590
Constrained optimization problems, 211
Constrained variation method, 211
Constraints, 212
Controlling of a project, 387
Convex function, 13
Convex set, 14
Cost slope of an activity, 388
Crash cost, 392
Crash time, 387
Critical activity, 378
Critical path, 388
Critical path analysis, 375
Critical Path Method (CPM), 367
Cutting plane method, 172

D
Decision-making problem, 3
Degenerate solution, 34
Demand, 523
Deterioration, 525
Deterministic inventory models, 527
Direct cost of a project, 387
Direct substitution method, 211
Disposal cost, 526
Dual simplex algorithm, 114
Dual simplex method, 109
Dummy activity, 370

E
Event, 596
Event float, 376
Event slack, 376
Expected time, 382
Extreme point, 201

F
Feasible direction, 251
Feasible region, 251
Feasible solution, 33
Float time, 376
Fluctuation inventories, 523
Forward pass, 379
© Springer Nature Singapore Pte Ltd. 2019 A. K. Bhunia et al., Advanced Optimization and Operations Research, Springer Optimization and Its Applications 153, https://doi.org/10.1007/978-981-32-9967-2
Free float, 376
Fully backlogged shortages, 524
Fuzzy inventory model, 570
Fuzzy non-linear programming, 570

G
Games with saddle point, 308
Game theory, 303
Gomory’s cutting plane method, 172
Graph, 497
Graphical method, 330, 331

H
Hancock determinantal equation, 230
Hessian matrix, 202
Holding cost, 526

I
Iconic model, 8
Identity matrix, 582
Idle period, 409
Inactive constraint, 250
Independence of events, 599
Independent float, 376
Indirect cost of a project, 387
Initial basic feasible solution, 48
Integer programming problem, 171
Inter-arrival time, 410
Inventory, 521
Inventory control system, 521
Inventory models with price breaks, 552

K
Kuhn-Tucker necessary conditions, 253
Kuhn-Tucker sufficient conditions, 255

L
Lagrange multiplier method, 231
Lagrange multipliers, 241
Lead time, 521
Limitation on inventory, 544
Limitation on investment, 547
Limitation on storage space, 546
Linear combination of vectors, 591
Linear dependence of vectors, 591
Little’s formula, 426
Lower envelope, 333

M
Manufacturing model with no shortages, 527
Manufacturing model with shortages, 527
Mixed integer programming, 171
Mixed strategies, 313
Monte Carlo simulation, 9
Monte Carlo technique, 9
Most likely time in PERT, 380
Multi-item inventory model, 527

N
Net evaluation, 45
Network, 372
Network construction, 372
Nodes, 373
Normal cost, 388
Normal distribution, 602
Normal time, 388
Numbering of events, 374

O
Objective function, 47
Operations Research, 11
Optimal solution, 49
Optimistic time, 380
Optimum strategies, 333
Ordering cost, 527

P
Parametric programming, 138
Partially backlogged shortages, 524
Personal management, 4
Planning horizon, 525
Planning of a project, 368
Poisson’s axiom, 459
Probabilistic inventory models, 527
Production management, 527
Production problem, 527
Project scheduling, 369
Purchase cost, 529
Purchasing model, 527
Pure birth process, 411
Pure strategies, 341

Q
Queueing capacity, 415
Queueing problem, 403
Queueing service discipline, 416
Queueing system, 416
Queueing theory, 403
Queues, 416

R
Rectangular games, 308
Replenishment, 530
Research and development, 4
Revised simplex method, 79
Role of computer in OR, 11
S
Saddle point, 311
Salvage value, 526
Scope of OR, 6
Set-up cost, 534
Shortage cost, 526
Simplex algorithm, 42
Simplex method, 52
Slack variables, 50
Standard form of L.P.P., 31
Stationary point, 225
Steady state system, 416
Stock-out cost, 526
Surplus variable, 57

T
Time-cost trade-off, 387
Time estimates, 380
Total float, 377
Traffic intensity, 419
Transaction matrix, 474
Transient state, 410
Transition probability, 611
Transportation inventories, 523
Two-person zero-sum game, 307
Two-phase simplex method, 67

U
Unconstrained optimization problem, 197
Upper envelope, 334

V
Value of the game, 317

W
Waiting line, 403
Waiting time distribution, 422
Wolfe’s modified simplex method, 275

Z
Zero-sum game, 307